EXTRACT
Applies to PredictiveInsight only.
Syntax
EXTRACT(predicate_col, data)
Parameters
predicate_col
A column of boolean values or an expression evaluating to a single column of boolean values. Boolean values are interpreted as zero or non-zero. This column should contain at least as many rows as the data range from which data is being extracted. Otherwise, predicate_col will be a limit to the number of rows processed by the EXTRACT macro function (see "Description" below).
data
The values to extract. This can be a constant value, a column, a cell range, or an expression evaluating to any of the above. For the format definition of data, see the "Macro Function Parameters" section in the chapter in this guide for your IBM® product.
Description
EXTRACT returns the rows in the specified data range that have a value of one in the corresponding row of the predicate column. This macro function reduces data by "throwing out" all rows where the corresponding value in predicate_col is zero. EXTRACT returns one new column for each input column, each containing the values in the corresponding input column for which the corresponding value in predicate_col is one. The extracted rows of data will occupy the first n cells of the output columns where n is the number of ones in predicate_col.
Since EXTRACT operates on a row-by-row basis, it produces a result for each row up to the last value of the shortest column (that is, the shortest column out of predicate_col and the columns in the data range data). All non-zero values in predicate_col are evaluated as one.
*
Generally, you will want to create a predicate column using one of the comparison macro functions (for example, ==, >, <, ISEVEN, ISODD, etc.). You can then extract the rows of interest from a specified data range using the EXTRACT macro function. This can be useful to "weed out" bad data points (for example, when a particular value exceeds the maximum or minimum value for a data variable). It also can be used to consolidate all examples of a particular class (for example, if the column V3 contains ones and zeros for one of the output classes, use V4=EXTRACT(V3, V1:V2) to extract the inputs V1 and V2). Since EXTRACT condenses all the extracted rows as a block of data (that is, it fills the cell range VX[1:n}:VY), where n is the number of extracted rows, it is a useful function for copying a range of cells from their current row locations to rows 1- n of the spreadsheet.
Examples
Creates a new column named TEMP containing a copy of column V1.
Creates two new columns named TEMP and VX with values in the first 51 cells. The values in the TEMP column are the cells 50-100 of column V1, and the values in the VX column are the cells 50-100 of column V2.
Creates two new columns named TEMP and VX. For each row where the value in column V3 is one, the corresponding row across columns V1 and V2 are extracted into columns TEMP and VX, respectively.
Creates a new column named TEMP containing all the values in column V1 that were greater than the corresponding values in column V2.
Creates two new columns named TEMP and VX. For rows 10-20 where the value in column V3 is one, the corresponding row across columns V1 and V2 is extracted into columns TEMP and VX, respectively.
Related Functions