NORM_ZSCORE

Syntax

NORM_ZSCORE(data [, keyword])

NORM_ZSCORE(data, mean, std [, keyword])

NORM_ZSCORE(data, base_data [, keyword])

Parameters

data

The numerical values to normalize. This can be a constant value, a column, a cell range, or an expression evaluating to any of the above. For the format definition of data, see the "Macro Function Parameters" section in the chapter in this guide for your IBM® product.

mean, std

These two parameters provide the mean and the standard deviation to use for normalization. They must be constants, except with the ROW keyword, where they can be constants or columns.

base_data

This parameter specifies a data range to use for computing the mean and standard deviation to use for normalizing data.

keyword

This optional keyword determines how the computation is performed over the input data range. Select one of the following:

ALL - Performs the computation on all cells in data (default)

COL - Performs the computation separately for each column of data

ROW - Performs the computation separately for each row of data

For more details on using keywords in IBM® Campaign, see Format Specifications.

For more details on using keywords in IBM® PredictiveInsight, see Format Specifications.
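The difference between the three keywords is the set of cells that share one mean and standard deviation. The Python sketch below illustrates only that grouping, based on the keyword wording above. It is not product code: the function name zscore_by_keyword, the list-of-columns layout, and the use of a population standard deviation are assumptions made for illustration.

```python
from statistics import fmean, pstdev


def zscore_by_keyword(columns, keyword="ALL"):
    """Hypothetical helper: z-score a list of equal-length columns.

    keyword selects which cells share one mean and standard deviation:
    ALL = every cell, COL = each column, ROW = each row.
    """
    n_cols = len(columns)
    n_rows = len(columns[0])
    if keyword == "ALL":
        groups = [[(c, r) for c in range(n_cols) for r in range(n_rows)]]
    elif keyword == "COL":
        groups = [[(c, r) for r in range(n_rows)] for c in range(n_cols)]
    elif keyword == "ROW":
        groups = [[(c, r) for c in range(n_cols)] for r in range(n_rows)]
    else:
        raise ValueError("keyword must be ALL, COL, or ROW")

    out = [[0.0] * n_rows for _ in range(n_cols)]
    for group in groups:
        cells = [columns[c][r] for c, r in group]
        mean = fmean(cells)
        std = pstdev(cells)  # assumption: population standard deviation
        for c, r in group:
            # A zero standard deviation yields zeros for the whole group.
            out[c][r] = 0.0 if std == 0 else (columns[c][r] - mean) / std
    return out


cols = [[1, 2], [3, 4]]
by_col = zscore_by_keyword(cols, "COL")
by_row = zscore_by_keyword(cols, "ROW")
by_all = zscore_by_keyword(cols, "ALL")
```

With this input, COL normalizes 1 and 2 against each other, ROW normalizes 1 against 3 and 2 against 4, and ALL normalizes all four cells against one shared mean and standard deviation.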

Description

NORM_ZSCORE calculates the normalized values of the specified data range. The z-score normalization is performed as follows:

normalized value = (value - mean) / std

where mean and std are determined as follows:

If mean and std are provided, these values are used for the mean and standard deviation, respectively. If these parameters are provided with the ROW keyword, mean and std can be columns, specifying a mean and standard deviation for each row of data. If mean and std are columns, the columns must be either the same length as data or scalar (that is, contain a single value which is used as a constant applied to all values in the corresponding column of data).

If base_data is provided, the mean and standard deviation of this data range are calculated and used to normalize data. The columns in base_data must contain two or more cell values.

If neither of the above mutually exclusive options is provided, the mean and standard deviation are automatically computed from data.

NORM_ZSCORE always returns a data range with the same dimensions as the input data range. It computes a mean and standard deviation for each input column and uses those values for normalizing that column.

If the standard deviation is zero, all zeros are returned.
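The rules above for choosing the mean and standard deviation can be sketched for a single column in Python. This is a minimal illustration, not the product's implementation: the function name norm_zscore_column and its keyword arguments are hypothetical, and the population standard deviation is an assumption inferred from the value 0.816 quoted in the first example.

```python
from statistics import fmean, pstdev


def norm_zscore_column(values, mean=None, std=None, base=None):
    """Hypothetical single-column sketch of the three calling forms."""
    if (mean is None) != (std is None):
        raise ValueError("mean and std must be given together")
    if mean is not None and base is not None:
        raise ValueError("mean/std and base data are mutually exclusive")

    if mean is None:
        # No explicit mean/std: compute them from the base data if
        # given, otherwise from the data itself.
        source = base if base is not None else values
        mean = fmean(source)
        std = pstdev(source)  # assumption: population standard deviation

    if std == 0:
        # Zero standard deviation: return all zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


from_data = norm_zscore_column([3, 4, 5])
from_args = norm_zscore_column([3, 4, 5], mean=3.5, std=1.2)
from_base = norm_zscore_column([3, 4, 5], base=[1, 3])
constant = norm_zscore_column([7, 7, 7])
```

The output list always has the same length as the input, matching the statement that the result has the same dimensions as the input data range.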

Examples

TEMP = NORM_ZSCORE(COLUMN(3, 4, 5))

Creates a new column named TEMP containing the values -1.22, 0, and 1.22. (The mean and standard deviation [4 and 0.816] are calculated from the data range automatically.)

TEMP = NORM_ZSCORE(COLUMN(3, 4, 5), 3.5, 1.2)

Creates a new column named TEMP containing the values -0.42, 0.42, and 1.25. (This time the mean and standard deviation [3.5 and 1.2] are provided as arguments.)
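The figures in the two examples above can be checked with a few lines of Python. The check also suggests that the quoted standard deviation of 0.816 is a population standard deviation (dividing by n), since the sample standard deviation of 3, 4, 5 is 1. The variable names are illustrative only.

```python
from statistics import fmean, pstdev, stdev

data = [3, 4, 5]

mean = fmean(data)        # 4.0
pop_std = pstdev(data)    # divides by n
sample_std = stdev(data)  # divides by n - 1

# First example: mean and standard deviation taken from the data.
auto = [round((x - mean) / pop_std, 2) for x in data]

# Second example: mean 3.5 and standard deviation 1.2 supplied.
explicit = [round((x - 3.5) / 1.2, 2) for x in data]

print(round(pop_std, 3), sample_std)
print(auto)
print(explicit)
```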

TEMP = NORM_ZSCORE(V1)

Creates a new column named TEMP containing normalized values of the contents of column V1. The mean and standard deviation used for normalization are calculated over the column V1.

TEMP = NORM_ZSCORE(V1:V3)

Creates three new columns named TEMP, VX, and VY. Each contains the normalized values of the contents of columns V1, V2, and V3, respectively. The mean and standard deviation used for normalization are calculated for each column independently (that is, a mean and standard deviation is calculated for column V1, a separate mean and standard deviation is calculated for column V2, etc.).

TEMP = NORM_ZSCORE(V1[10:50]:V3)

Creates three new columns named TEMP, VX, and VY, each with values in rows 1-41. The contents of column TEMP are the normalized values of rows 10-50 of column V1, the contents of column VX are the normalized values of rows 10-50 of column V2, and the contents of column VY are the normalized values of rows 10-50 of column V3. The mean and standard deviation used for normalization are calculated independently for each column, over rows 10-50 of that column.

TEMP = NORM_ZSCORE(V1:V3, V4:V6)

Creates three new columns named TEMP, VX, and VY. Each contains the normalized values of the contents of columns V1, V2, and V3, respectively. The mean and standard deviation used for normalization are calculated for each column independently using columns V4-V6 (that is, a mean and standard deviation is calculated over column V4 for normalizing column V1, a separate mean and standard deviation is calculated over column V5 for normalizing column V2, etc.).
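The pairing of data columns with base columns described above (V4 for V1, V5 for V2, and so on) can be sketched in Python as follows. The function name normalize_with_base is hypothetical, the requirement that both ranges have the same number of columns is an assumption, and the population standard deviation is assumed as well. The check for at least two values per base column follows the rule stated in the Description.

```python
from statistics import fmean, pstdev


def normalize_with_base(data_cols, base_cols):
    """Hypothetical sketch: normalize each data column using the mean
    and standard deviation of the base column in the same position."""
    if len(data_cols) != len(base_cols):
        # Assumption: one base column per data column.
        raise ValueError("data and base must have the same column count")

    result = []
    for col, base in zip(data_cols, base_cols):
        if len(base) < 2:
            raise ValueError("each base column needs two or more values")
        mean = fmean(base)
        std = pstdev(base)  # assumption: population standard deviation
        if std == 0:
            result.append([0.0] * len(col))
        else:
            result.append([(x - mean) / std for x in col])
    return result


normalized = normalize_with_base(
    [[3, 4, 5], [10, 20]],  # stand-ins for two data columns
    [[1, 3], [0, 20]],      # stand-ins for the matching base columns
)
```

Here the first column is scaled with mean 2 and standard deviation 1 from its base column, and the second with mean 10 and standard deviation 10.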

TEMP = NORM_ZSCORE(V1:V3, COL)

TEMP = NORM_ZSCORE(V1[10:50]:V3, COL)

Both expressions create three new columns named TEMP, VX, and VY, containing the normalized values of columns V1, V2, and V3, respectively (the second expression uses only rows 10-50 of each column). With the COL keyword, the mean and standard deviation used for normalization are calculated separately for each column of the input data.

TEMP = NORM_ZSCORE(V1:V3, ROW)

TEMP = NORM_ZSCORE(V1[10:50]:V3, ROW)

TEMP = NORM_ZSCORE(V1:V3, V4:V10, ROW)

Related Functions

Function	Description
NORM_MINMAX	Computes the min/max normalization of a data range
NORM_SIGMOID	Computes the sigmoidal normalization of a data range