vwhere

 SYNOPSIS

   A graphical version of the S-Lang where command, providing a simple
   yet capable mechanism for exploring and filtering datasets.  This
   visual, interactive approach to the construction of complex,
   multidimensional filters offers a fluid and intuitive alternative to
   the classic approach of file-based filtering with command line tools
   (as used, e.g., within astronomy data analysis), and can be
   considerably faster, cleaner, and more powerful.

   Filtering is performed upon data vectors generated either in memory
   or from disk files, with no filter syntax required and the result
   instantly visualized for inspection.  File-based filtering tools, by
   comparison, require explicit syntax (often conflicting with the
   syntax employed by other tools or systems) and require that the
   resulting file(s) be re-loaded into separate programs for
   verification (e.g. a plot or file dump).

   Unlike the all-at-once style mandated by file-based filtering, VWhere
   filters may be applied incrementally (or not) to arbitrary axes of
   your input dataset.  This avoids the creation of numerous "file
   litter" products while one experiments with filter ranges or axis
   combinations, as well as the performance penalties of multiple I/O
   iterations over files.

   Moreover, most file-based filtering tools are static in function:
   they cannot be augmented at runtime by dynamic loading of modules.
   VWhere filters, however, may employ not only any built-in S-Lang
   arithmetic operator or function, but also essentially arbitrary C,
   C++, or FORTRAN codes loaded from external modules.

 USAGE

	Array_Type = vwhere( structure | 1Darray, 1Darray, ...)

   Input to vwhere should contain at least 2 numeric vectors, all of equal
   length.  These vectors should be passed in the form of either a comma-
   separated list of 1-D arrays or a single structure containing two or
   more fields.  Structure fields will be ignored if they are non-numeric
   in type, or of unequal length, or have names prefixed with an underscore.
   Upon invocation VWhere launches its Axis Expression Window, which
   provides a means of generating plots, fabricating new data vectors,
   and issuing arbitrary commands through an interactive S-Lang prompt.
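
   As a minimal sketch of the structure form (the field names below are
   purely illustrative, and vwhere is assumed to be already loaded into
   the interpreter), only the x and y fields of this structure would be
   offered as data vectors:

		variable s = struct { x, y, label, _scratch };
		s.x = [1:100];
		s.y = s.x^2;
		s.label = "demo";	  % non-numeric: ignored
		s._scratch = s.x * 0.5;	  % leading underscore: ignored
		variable idx = vwhere(s);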

 PLOTTING

   Filtering in VWhere amounts to manipulating regions of interest on
   plots.  The number of plots that may be created or overplotted, and
   the number of region filters applied to each, is effectively unlimited.
   Plots may also be deleted, panned, and zoomed -- providing a rapid
   means of data exploration -- as well as customized through a number
   of graphical user interface preferences.
   
   Plots are specified in the Axis Expression Window via two editable
   text fields, one for each of the X and Y axes.  The content of each
   field defaults to the name of the first and second input vectors,
   respectively, and may be changed either by typing new expressions
   for each axis or by selecting from the Choose dropdown menu.  In
   general each axis expression may contain any valid S-Lang statement,
   even calls to C, C++, or FORTRAN functions imported from external
   modules.  The chief constraints upon an axis expression are that it
   be less than 256 characters long and that it generate a numeric vector.
  
   Two kinds of plots may be visualized, filter plots and overplots, by
   pressing the Plot or OPlot button, respectively.  The main distinction
   between the two is that overplotted X/Y vector pairs may be of
   arbitrary length, while filter plots require vectors exactly equal in
   length to those within the input dataset.  The latter constraint stems
   from the fact that, logically, array expressions given to the
   underlying S-Lang where command [e.g. where(A < 5 and B > 11)] can
   operate only upon vectors of equal length.  In addition, because
   overplotted vectors are not subject to
   where filtering they are always drawn in their entirety; thus they
   provide additional means for qualitative, visual comparison, but have
   no quantitative effect on the result returned by VWhere.  Finally,
   when a filter plot is created each unique axis expression -- and the
   resulting vector that it generates -- is "remembered" in the Choose
   dropdown menu.  This provides for easy re-selection later, and is a
   fast and simple mechanism for fabricating new data on the fly, of
   essentially arbitrary complexity, thanks to the extensibility and
   generality of axis expressions.
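
   To illustrate the equal-length constraint outside of the GUI, the
   following sketch (with made-up vectors A and B) evaluates the where
   expression quoted above by hand.  The parentheses are added only for
   readability:

		variable A = [1:10];
		variable B = [11:20];
		% A < 5 holds at indices 0..3, B > 11 at indices 1..9
		variable i = where((A < 5) and (B > 11));   % i = [1, 2, 3]

   Because the two comparisons are combined element by element, A and B
   must have the same number of elements, just as the X and Y vectors
   of a filter plot must match the length of the input dataset.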

 REGION FILTERS

   The following region filters may be applied after visualizing a plot:

	rectangle		click MouseButton1, then drag mouse to
				define bounding box

	ellipse  		same as rectangle

	polygon  		click MouseButton1 to add vertices
				click MouseButton2 to close polygon
				click MouseButton3 to cancel

   Filters may be deleted (by hitting the BACKSPACE or DELETE key),
   moved, or resized after initial placement.
   
   Points on or within the boundary of at least one region filter are
   considered "selected," and will be drawn in the foreground line
   style and symbol color.  All other points are considered "filtered"
   or "excluded," and will be drawn, when requested, in the background
   line style and color.  Line styles and symbol colors may be adjusted
   from within the preferences dialogs.
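
   Conceptually, each region filter corresponds to a where expression
   over the two plotted vectors.  The sketch below is not VWhere's
   internal implementation; it only illustrates "on or within the
   boundary" for an axis-aligned rectangle and ellipse, and every name
   and coordinate in it is hypothetical:

		variable x = [0.0, 1.0, 2.0, 3.0, 4.0];
		variable y = [0.0, 1.0, 4.0, 9.0, 16.0];

		% rectangle with corners (1,1) and (3,9), edges included
		variable in_rect = (x >= 1.0) and (x <= 3.0)
		               and (y >= 1.0) and (y <= 9.0);

		% ellipse centered at (cx,cy) with semi-axes a and b
		variable cx = 2.0, cy = 4.0, a = 1.0, b = 5.0;
		variable in_ell = (((x - cx)/a)^2 + ((y - cy)/b)^2) <= 1.0;

		% a point is selected if at least one region contains it
		variable selected = where(in_rect or in_ell);   % [1, 2, 3]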

 INCREMENTAL FILTERING

   One of the more useful features of vwhere is the incremental manner in
   which the dataset may be filtered.  In contrast with the file-in/file-out
   filtering method offered by command line tools, which applies the entire
   set of filters to the entire input dataset -- conceptually in just one
   pass -- vwhere provides the option of filtering some axes of the dataset,
   by applying region filters to currently displayed plots, prior to
   filtering other axes.
   
   This provides a powerful mechanism for exploring relationships within
   your data, and can also speed up subsequent plotting and filtering.
   When incremental filtering is on (the default) only points selected by
   the current filters will be colored in subsequent plots.  Filtered points
   will either be drawn grayed out on subsequent plots (the default) or not
   drawn at all (a faster option for large datasets), per the current
   preferences.  The next section describes how filters are incrementally
   combined.

 RETURN VALUE

   The vwhere guilet return value matches that of the native S-Lang where
   function: an array of numbers, each representing an index into the
   vector(s) given to the comparison operator(s) of the where expression.
   These indices may then be applied to related datasets, or used to create
   filtered output files, etcetera.

   Filters applied to a single plot are unioned to form the set of points
   selected by that plot.  If only one plot is visualized then this set
   completely specifies the indices returned by vwhere.  When multiple plots
   are visualized the incremental selections from each are either intersected
   (the default) or unioned (when chosen in the preferences dialog) to
   generate the aggregate set of selected points.
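
   The combination rules can be sketched with plain S-Lang arrays.  The
   masks below are invented for a six-point dataset (1 marks a point
   inside a region); they stand in for whatever VWhere computes
   internally, which this sketch does not claim to reproduce:

		variable r1 = [1, 1, 0, 0, 0, 0];   % plot 1, first region
		variable r2 = [0, 1, 1, 0, 0, 0];   % plot 1, second region
		variable r3 = [0, 0, 1, 1, 1, 0];   % plot 2, only region

		variable plot1 = (r1 or r2);	    % union within one plot
		variable plot2 = r3;

		% default: intersect the per-plot selections, giving [2]
		variable idx_and = where(plot1 and plot2);
		% preference: union them instead, giving [0, 1, 2, 3, 4]
		variable idx_or = where(plot1 or plot2);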

   If zero region filters have been applied the entire input dataset will be
   returned.  Dismissing the guilet by any means other than pressing "Done"
   in the plots window will return the empty dataset.
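
   A caller may therefore want to check the result before using it.  The
   following is a minimal sketch, assuming that the "empty dataset" is a
   zero-length index array and that vwhere is already loaded; note that
   an empty result alone may not distinguish a dismissed guilet from
   filters that happened to select no points:

		variable x = [1:100];
		variable y = x^2;
		variable idx = vwhere(x, y);

		if (length(idx) == 0)
		  message("no points returned");
		else
		  vmessage("%d of %d points selected", length(idx), length(x));

		% apply the indices to the input (or any related) vectors
		variable xsel = x[idx];
		variable ysel = y[idx];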

 EXAMPLES

   The following explores the curves y = x^2 and z = x^3 over [1,100] :

		result = vwhere([1:100], [1:100]^2, [1:100]^3);

   The following explores a hypothetical binary table read from disk:

		tab = your_favorite_FITS_file_reader ("table.fits");
		result = vwhere(tab);

   If the tab structure contained CCD_ID, PHA, and TIME fields, then valid
   expressions by which two plots could be generated from this table
   might be:

	PLOT 1:
			X :	  ccd_id
			Y :	  pha
	PLOT 2:
			X :	  time
			Y :	  log10(pha)

   The log10(pha) expression for the Y axis of the second plot creates a
   new data vector, which will also be selectable from the Choose dropdown
   menu for use in subsequent plots.

 BUGS

   The GtkPlot widget atop which VWhere is built is not robust in the
   face of Inf/NaN values.  VWhere attempts to compensate for this, but
   for performance reasons does not execute both isnan() and isinf()
   on all X/Y plot vectors.  To avoid undefined behavior and potential
   data loss, Inf/NaN values should thus be culled before calling vwhere.
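
   One way to do this, sketched here with made-up vectors and the
   standard S-Lang isnan() and isinf() functions, is to keep only those
   points whose X and Y values are both finite:

		variable x = [1.0, _NaN, 3.0, _Inf, 5.0];
		variable y = [2.0, 4.0, _NaN, 8.0, 10.0];

		% flag any point with a NaN or Inf in either vector
		variable bad = isnan(x) or isinf(x) or isnan(y) or isinf(y);
		variable good = where(bad == 0);    % good = [0, 4]

		x = x[good];
		y = y[good];

   Indices returned by a subsequent vwhere(x, y) call then refer to the
   culled vectors; good[idx] maps them back to positions in the original
   data.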

 SEE ALSO

   where
   http://arxiv.org/abs/astro-ph/0412003	(ADASS XIV Proceedings)
