NORM_SIGMOID
Applies to PredictiveInsight only.
Syntax
NORM_SIGMOID(data [, keyword])
NORM_SIGMOID(data, mean, std [, keyword])
NORM_SIGMOID(data, base_data [, keyword])
Parameters
data
The values to normalize. This can be a constant value, a column, a cell range, or an expression evaluating to any of the above. For the format definition of data, see the "Macro Function Parameters" section in the chapter in this guide for your IBM® product.
mean, std
These two parameters provide the mean and the standard deviation to use for normalization. They must be constants, except with the ROW keyword, where they can be constants or columns.
base_data
This parameter specifies a data range to use for computing the mean and standard deviation to use for normalizing data.
keyword
This optional keyword determines how the computation is performed over the input data range. Select one of the following:
ALL - Performs the computation on all cells in data (default)
COL - Performs the computation separately for each column of data
ROW - Performs the computation separately for each row of data
For more details on using keywords in IBM® Campaign, see Format Specifications.
For more details on using keywords in IBM® PredictiveInsight, see Format Specifications.
Description
NORM_SIGMOID calculates the normalized values of the specified data range. A sigmoidal normalization redistributes data along a sigmoid curve, returning values between -1.0 and +1.0, inclusive. Data within one standard deviation of the mean falls in the middle range of the sigmoid, where the curve is approximately linear. Outliers are compressed into the tails of the sigmoid. This allows you to keep very large outlier data points without sacrificing the ability to discriminate among points near the mean.
The sigmoidal normalization is performed as follows:
new_value = (1 - e^(-a)) / (1 + e^(-a))
where
a = (data - mean) / std
and mean and std are determined as follows:
* If mean and std are provided, these values are used for the mean and standard deviation, respectively. If these parameters are provided with the ROW keyword, mean and std can be columns, specifying a mean and standard deviation for each row of data. If mean and std are columns, each column must either be the same length as data or be scalar (that is, contain a single value that is used as a constant applied to all values in the corresponding column of data).
* If base_data is provided, the mean and standard deviation of this data range are calculated and used to normalize data. The columns in base_data must contain two or more cell values.
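The two ways of supplying the mean and standard deviation can be sketched in Python. This is an illustrative sketch, not the product's implementation. It assumes the sigmoid has the form (1 - e^(-a)) / (1 + e^(-a)) with a = (data - mean) / std, and that the standard deviation is the population form (dividing by n); both assumptions reproduce the values quoted in the Examples section. The function names sigmoid_norm, mean_std, and norm_sigmoid are hypothetical.

```python
import math


def sigmoid_norm(x, mean, std):
    """Map x into the range -1..1 (assumed sigmoid form, see lead-in)."""
    a = (x - mean) / std
    # tanh(a / 2) is algebraically equal to (1 - e^-a) / (1 + e^-a)
    # and avoids overflow in exp() for extreme outliers.
    return math.tanh(a / 2.0)


def mean_std(values):
    """Population mean and standard deviation (assumption: divide by n)."""
    n = len(values)
    m = sum(values) / n
    return m, math.sqrt(sum((v - m) ** 2 for v in values) / n)


def norm_sigmoid(data, mean=None, std=None, base_data=None):
    """Normalize one column, choosing mean/std as the bullets describe."""
    if mean is not None and std is not None:
        m, s = mean, std                 # explicit constants
    elif base_data is not None:
        if len(base_data) < 2:
            raise ValueError("base_data needs two or more cell values")
        m, s = mean_std(base_data)       # statistics from the base range
    else:
        m, s = mean_std(data)            # statistics from the data itself
    return [sigmoid_norm(x, m, s) for x in data]
```

For example, norm_sigmoid([3, 4, 5], mean=3.5, std=1.2) uses the supplied constants, while norm_sigmoid([3, 4, 5], base_data=[1, 2, 3, 4]) derives them from the base range.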
NORM_SIGMOID always returns a data range with the same dimensions as the input data range. With the ALL keyword, the mean and standard deviation are computed over the entire input data range. With the COL keyword, a mean and standard deviation are computed for each input column and used to normalize that column. With the ROW keyword, a mean and standard deviation are computed for each row in the specified data range and used to normalize that row.
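The difference between the three keywords can be sketched as follows. This is a rough Python illustration under stated assumptions, not the product's code: the data range is modeled as a list of equal-length columns, the sigmoid is assumed to be (1 - e^(-a)) / (1 + e^(-a)), and the standard deviation is assumed to be the population form. The name norm_sigmoid_range is hypothetical.

```python
import math


def _mean_std(values):
    # Population mean and standard deviation (assumption: divide by n).
    n = len(values)
    m = sum(values) / n
    return m, math.sqrt(sum((v - m) ** 2 for v in values) / n)


def _sig(x, m, s):
    # tanh(a / 2) equals (1 - e^-a) / (1 + e^-a) for a = (x - m) / s.
    return math.tanh((x - m) / s / 2.0)


def norm_sigmoid_range(columns, keyword="ALL"):
    """Normalize a list of equal-length columns; output has the same shape."""
    if keyword == "ALL":
        # One mean/std over every cell in the range.
        m, s = _mean_std([x for col in columns for x in col])
        return [[_sig(x, m, s) for x in col] for col in columns]
    if keyword == "COL":
        # A separate mean/std for each column.
        out = []
        for col in columns:
            m, s = _mean_std(col)
            out.append([_sig(x, m, s) for x in col])
        return out
    if keyword == "ROW":
        # A separate mean/std for each row across the columns.
        stats = [_mean_std(row) for row in zip(*columns)]
        return [[_sig(x, *stats[i]) for i, x in enumerate(col)]
                for col in columns]
    raise ValueError("keyword must be ALL, COL, or ROW")
```

With columns [[1, 2, 3], [10, 20, 30]], COL maps the middle cell of each column to 0 because each column is centered on its own mean, whereas ALL centers both columns on the shared mean of 11.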
To normalize data using the same base_data range (for example, in wrapped user functions), make mean and std constants (this can be done using the CONSTANT macro function).
Examples
Creates a new column named TEMP containing the values -0.55, 0, and 0.55. (The mean and standard deviation [4 and 0.816] are calculated from the data range automatically.)
Creates a new column named TEMP containing the values -0.21, 0.21, and 0.55. (This time the mean and standard deviation [3.5 and 1.2] are provided as arguments.)
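The figures in the two examples above can be checked with a short Python calculation. The input values 3, 4, and 5 are an assumption: the source gives only the results, but these inputs match the stated mean of 4 and standard deviation of 0.816. The sigmoid form (1 - e^(-a)) / (1 + e^(-a)) and the population standard deviation are likewise assumptions that happen to reproduce the quoted results.

```python
import math

data = [3, 4, 5]  # assumed input column; not stated in the source

# Population mean and standard deviation of the data range.
mean = sum(data) / len(data)
std = math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))


def sig(x, m, s):
    # Assumed sigmoid form: (1 - e^-a) / (1 + e^-a), a = (x - m) / s.
    a = (x - m) / s
    return (1 - math.exp(-a)) / (1 + math.exp(-a))


auto = [round(sig(x, mean, std), 2) for x in data]   # statistics from data
given = [round(sig(x, 3.5, 1.2), 2) for x in data]   # statistics supplied

print(round(std, 3))  # 0.816
print(auto)           # [-0.55, 0.0, 0.55]
print(given)          # [-0.21, 0.21, 0.55]
```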
TEMP = NORM_SIGMOID(V1) or TEMP = NORM_SIGMOID(V1,ALL)
Creates a new column named TEMP containing normalized values of the contents of column V1. The mean and standard deviation used for normalization are calculated over column V1.
Creates three new columns named TEMP, VX, and VY. Each contains the normalized values of the contents of columns V1, V2, and V3, respectively. The mean and standard deviation used for normalization are calculated over columns V1, V2, and V3.
Creates three new columns named TEMP, VX, and VY, each with values in rows 1-41. The contents of column TEMP are the normalized values of rows 10-50 of column V1, the contents of column VX are the normalized values of rows 10-50 of column V2, and the contents of column VY are the normalized values of rows 10-50 of column V3. The mean and standard deviation for normalization purposes are calculated over rows 10-50 of columns V1-V3.
Creates three new columns named TEMP, VX, and VY. Each contains the normalized values of the contents of columns V1, V2, and V3, respectively. The mean and standard deviation used for normalization are calculated over column V4.
Creates three new columns named TEMP, VX, and VY. Each contains the normalized values of the contents of columns V1, V2, and V3, respectively. The mean and standard deviation used for normalization are calculated over columns V4-V8.
Creates three new columns named TEMP, VX, and VY. Each contains the normalized values of the contents of columns V1, V2, and V3, respectively. The mean and standard deviation used for normalization are calculated for each column independently (that is, a mean and standard deviation is calculated for column V1, a separate mean and standard deviation is calculated for column V2, etc.).
Creates three new columns named TEMP, VX, and VY, each with values in rows 1-41. The contents of column TEMP are the normalized values of rows 10-50 of column V1, the contents of column VX are the normalized values of rows 10-50 of column V2, and the contents of column VY are the normalized values of rows 10-50 of column V3. The mean and standard deviation for normalization purposes are calculated independently for each column over rows 10-50.
Creates three new columns named TEMP, VX, and VY. Each contains the normalized values of the contents of columns V1, V2, and V3, respectively. The mean and standard deviation used for normalization are calculated for each column independently using columns V4-V6 (that is, a mean and standard deviation is calculated over column V4 for normalizing column V1, a separate mean and standard deviation is calculated over column V5 for normalizing column V2, etc.).
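The column-by-column pairing of data and base_data described above can be sketched in Python. This is a hedged illustration only: the function name norm_sigmoid_col_base is hypothetical, and the sigmoid form and population standard deviation are assumptions consistent with the earlier worked examples.

```python
import math


def _mean_std(values):
    # Population mean and standard deviation (assumption: divide by n).
    n = len(values)
    m = sum(values) / n
    return m, math.sqrt(sum((v - m) ** 2 for v in values) / n)


def norm_sigmoid_col_base(data_cols, base_cols):
    """Normalize each data column with statistics from its paired base column."""
    if len(data_cols) != len(base_cols):
        raise ValueError("need one base column per data column")
    out = []
    for col, base in zip(data_cols, base_cols):
        if len(base) < 2:
            raise ValueError("base columns need two or more cell values")
        m, s = _mean_std(base)
        # tanh(a / 2) equals (1 - e^-a) / (1 + e^-a) for a = (x - m) / s.
        out.append([math.tanh((x - m) / (2.0 * s)) for x in col])
    return out
```

Here the first base column supplies the statistics for the first data column, the second for the second, and so on, mirroring how V4 is used for V1 and V5 for V2.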
Creates three new columns named TEMP, VX, and VY. Each contains the normalized values of the contents of columns V1, V2, and V3, respectively. The mean and standard deviation used for normalization are calculated independently for each row across columns V1, V2, and V3.
Creates three new columns named TEMP, VX, and VY, each with values in rows 1-41. The contents of column TEMP are the normalized values of rows 10-50 of column V1, the contents of column VX are the normalized values of rows 10-50 of column V2, and the contents of column VY are the normalized values of rows 10-50 of column V3. The mean and standard deviation for normalization purposes are calculated independently for each of rows 10-50 across columns V1-V3.
Creates three new columns named TEMP, VX, and VY. Each contains the normalized values of the contents of columns V1, V2, and V3, respectively. The mean and standard deviation used for normalization are calculated independently for each row across columns V4-V10.
Related Functions