|
The data range (data) is automatically normalized using the zero-mean, unit-variance method. In other words, each column is shifted so that its mean is zero and scaled so that its variance is one.
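The normalization step can be illustrated with a minimal Python sketch. This is not the macro engine's implementation: the function name normalize_columns is hypothetical, NumPy is assumed to be available, and the use of the population standard deviation and the handling of constant columns are assumptions that the description above does not specify.

```python
import numpy as np

def normalize_columns(data):
    """Scale each column to zero mean and unit variance (z-scores)."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    # Assumption: population standard deviation (ddof=0).
    std = data.std(axis=0)
    # Assumption: constant columns map to zeros instead of dividing by zero.
    std[std == 0] = 1.0
    return (data - mean) / std

# Three rows, two columns with very different scales.
sample = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]])
normalized = normalize_columns(sample)
```

After normalization, both columns are on the same scale, so a column with large raw values does not dominate the analysis that follows.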
|
|
Principal component analysis is performed on the normalized data range to generate its eigenvectors (see the description of the PCA macro function for details). If base_data is not provided, this analysis is performed automatically on data. If base_data is provided, it is performed by the explicit call to the PCA macro function.
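As a rough illustration of what generating ranked eigenvectors involves, the following Python sketch decomposes the covariance matrix of already-normalized data. The function name ranked_eigenvectors is hypothetical, and the use of the covariance matrix with NumPy's eigh routine is an assumption; the PCA macro function may compute its result differently.

```python
import numpy as np

def ranked_eigenvectors(normalized):
    """Return eigenvalues and eigenvectors sorted from largest to smallest."""
    normalized = np.asarray(normalized, dtype=float)
    # Rows are observations, columns are variables.
    cov = np.cov(normalized, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1]
    # Column j of the result is the eigenvector for the j-th largest eigenvalue.
    return eigenvalues[order], eigenvectors[:, order]

raw = np.array([[2.0, 1.0, 0.5],
                [4.0, 3.0, 0.1],
                [6.0, 5.5, 0.9],
                [8.0, 7.0, 0.3]])
normalized = (raw - raw.mean(axis=0)) / raw.std(axis=0)
values, vectors = ranked_eigenvectors(normalized)
```

The sign of each eigenvector is arbitrary, so two correct implementations can return columns that differ by a factor of -1.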
|
|
Each row of the data range (data) is transformed into a new coordinate system based on the top num_features (m) ranked eigenvectors, which compose the transformation matrix.
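The whole sequence (normalize, find the ranked eigenvectors, project each row) can be sketched in Python as follows. This is an illustration under stated assumptions, not the macro's actual code: the function pca_features is a hypothetical stand-in, the eigenvectors are taken from the covariance matrix, and data is assumed to be normalized with its own column statistics even when base_data is supplied.

```python
import numpy as np

def _normalize(x):
    # Zero mean, unit variance per column (assumption: population std).
    std = x.std(axis=0)
    std[std == 0] = 1.0
    return (x - x.mean(axis=0)) / std

def pca_features(num_features, data, base_data=None):
    """Project each row of data onto the top num_features eigenvectors."""
    data = np.asarray(data, dtype=float)
    # Without base_data, the eigenvectors come from data itself.
    base = data if base_data is None else np.asarray(base_data, dtype=float)
    eigenvalues, eigenvectors = np.linalg.eigh(np.cov(_normalize(base), rowvar=False))
    # Keep the eigenvectors with the largest eigenvalues, best first.
    top = eigenvectors[:, np.argsort(eigenvalues)[::-1][:num_features]]
    # Each output row holds the coordinates of one input row in the new system.
    return _normalize(data) @ top

data = np.array([[1.0, 2.0, 0.5, 3.0],
                 [2.0, 1.5, 0.7, 2.0],
                 [3.0, 3.5, 0.2, 5.0],
                 [4.0, 3.0, 0.9, 1.0],
                 [5.0, 5.0, 0.4, 4.0]])
features = pca_features(3, data)
```

Here features has one row per input row and num_features columns, ordered so that the first column carries the most variance.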
|
|
|
Because calculating PCA on a data range can be compute-intensive, wrapping the PCA calculation in the BUFFER macro function is much more efficient, because the principal components are calculated once and then reused. For example: PCA_FEATURES(num_features, range, BUFFER(PCA(base_data)))
|
Creates five new columns named TEMP, VW, VX, VY, and VZ, containing the top five features of the data range V1:V7. The data range V1:V7 is used as the basis for the transformation.
|
Creates three new columns named TEMP, VX, and VY, containing the top three features of the data range V1:V4. The data range V10:V13 is used as the basis for the transformation.
|
Creates three new columns named TEMP, VX, and VY, containing the top three features of the data range V1:V4. The data range V10:V13 is used as the basis for the transformation. Once the principal components of the data range V10:V13 are calculated, those values are stored as constants. If the data values in columns V10:V13 change later, they will not affect this function definition.
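The effect of storing the principal components as constants can be mimicked in Python by computing the basis once and reusing it. This is only an analogy for the behavior described above, not how the BUFFER macro function is implemented; pca_basis and project are hypothetical helper names, and the covariance-based decomposition is an assumption.

```python
import numpy as np

def _normalize(x):
    # Zero mean, unit variance per column (assumption: population std).
    std = x.std(axis=0)
    std[std == 0] = 1.0
    return (x - x.mean(axis=0)) / std

def pca_basis(base_data):
    """Compute ranked eigenvectors once; the result is independent of base_data."""
    base = np.array(base_data, dtype=float)
    eigenvalues, eigenvectors = np.linalg.eigh(np.cov(_normalize(base), rowvar=False))
    return eigenvectors[:, np.argsort(eigenvalues)[::-1]]

def project(num_features, data, basis):
    """Project normalized rows of data onto the first num_features basis vectors."""
    return _normalize(np.asarray(data, dtype=float)) @ basis[:, :num_features]

base = np.array([[1.0, 4.0, 2.0, 0.3],
                 [2.0, 3.0, 5.0, 0.8],
                 [3.0, 6.0, 1.0, 0.1],
                 [4.0, 5.0, 7.0, 0.6],
                 [5.0, 9.0, 3.0, 0.4]])
data = np.array([[0.5, 1.0, 2.0, 3.0],
                 [1.5, 0.0, 4.0, 1.0],
                 [2.5, 2.0, 1.0, 5.0],
                 [3.5, 3.0, 6.0, 2.0]])

stored_basis = pca_basis(base)         # expensive step, done once
before = project(3, data, stored_basis)
base[0, 0] = 50.0                      # later change to the base range
after = project(3, data, stored_basis) # stored basis is unaffected
```

Because the stored basis is a separate array, before and after are identical even though the base range was modified between the two projections.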
|
Copyright IBM Corporation 2013. All Rights Reserved.
|