To configure a Sample process
1.
In a flowchart in Edit mode, connect at least one configured process (such as a Select process) to the Sample process box.
2.
Double-click the Sample process box in the flowchart workspace. The process configuration dialog appears.
3.
Use the Input drop-down list to select the cells that you want to sample. The list includes all output cells from any process connected to the Sample process. To use more than one source cell, select the Multiple Cells option. If more than one source cell is selected, the same sampling is performed on each source cell.
4.
Use the # of Samples/Output Cells field to specify how many samples to create for each input cell. By default, three samples are created for each input cell, with default names Sample1, Sample2 and Sample3.
5.
To change the default sample names, double-click a sample in the Output Name column, then type a new name. You can use any combination of letters, numbers, and spaces. Do not use periods (.) or slashes (/ or \).
6.
Define the sample size using one of the following methods:
*
To define the sample size by percentages: Select Specify Size By %, then double-click the Size field to indicate the percentage of records to use for each sample. Use the Max Size field if you want to limit the size of the sample. The default is Unlimited. Repeat for each sample listed in the Output Name column, or use the All Remaining check box to assign all remaining records to that sample. You can select All Remaining for only one output cell.
*
To define the sample size by number of records: Select Specify Size By # Records, then double-click the Max Size field to specify the maximum number of records to allocate to the first sample group. Specify the Max Size for the next sample in the Output Name column, or use the All Remaining check box to assign all remaining records to that sample. You can select All Remaining for only one output cell.
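The percentage option can be illustrated with a short sketch. It is not the product's implementation: the function name `allocate_by_percent`, the tuple layout, and the choice to round down are assumptions made for illustration only.

```python
def allocate_by_percent(total_records, samples):
    """Return {sample_name: record_count} for samples sized by percentage.

    samples is a list of (name, percent, max_size) tuples (hypothetical layout).
    percent=None marks the single "All Remaining" sample.
    max_size=None stands for the default of Unlimited.
    """
    remaining = [name for name, pct, _ in samples if pct is None]
    if len(remaining) > 1:
        raise ValueError("All Remaining can be selected for only one output cell")

    sizes = {}
    for name, pct, max_size in samples:
        if pct is None:
            continue
        count = int(total_records * pct / 100)  # assumption: round down
        if max_size is not None:
            count = min(count, max_size)  # Max Size caps the sample
        sizes[name] = count

    # The All Remaining sample receives whatever the other samples did not use.
    for name in remaining:
        sizes[name] = total_records - sum(sizes.values())
    return sizes


example = allocate_by_percent(
    1000,
    [("Sample1", 10, None), ("Sample2", 50, 300), ("Sample3", None, None)],
)
print(example)
```

In this example Sample2 asks for 50% of 1000 records but is capped at 300 by its Max Size, so the All Remaining sample receives the 600 records left over.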
7.
Ensure that each sample in the Output Name list has a Size defined or has All Remaining checked.
8.
(Optional) Click Sample Size Calculator to use the calculator to help you understand the statistical significance of sample sizes in evaluating campaign results. You can specify a level of accuracy by entering an error bound and computing the sample size needed, or you can enter a sample size and compute the error bound that will result. Results are reported at the 95% confidence level.
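The relationship between error bound and sample size can be sketched as follows. This documentation does not state which formula the calculator uses, so the sketch assumes the standard normal approximation for a response rate (a proportion) at the 95% confidence level. The function names and the default response rate of 0.5 (the most conservative case) are illustrative assumptions.

```python
import math

Z_95 = 1.96  # z-value for a two-sided 95% confidence level


def error_bound(sample_size, response_rate=0.5):
    """Error bound (margin of error) that results from a given sample size."""
    return Z_95 * math.sqrt(response_rate * (1 - response_rate) / sample_size)


def required_sample_size(error, response_rate=0.5):
    """Smallest sample size whose error bound does not exceed `error`."""
    return math.ceil(Z_95 ** 2 * response_rate * (1 - response_rate) / error ** 2)


n = required_sample_size(0.05)  # sample size needed for a +/-5% error bound
print(n, round(error_bound(n), 4))
```

Under these assumptions, a smaller error bound requires a much larger sample, because the required size grows with the inverse square of the error bound.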
9.
In the Sampling Method section, specify how to build the samples:
*
Random Sample: Use this option to create statistically valid control groups or test sets. This option randomly assigns records to sample groups using a random number generator based on the specified seed. Seeds are explained later in these steps.
*
Every Other X: This option puts the first record into the first sample, the second record into the second sample, and so on, up to the number of samples specified. The process then repeats until all records have been allocated to a sample group. To use this option, you must specify the Ordered By options to determine how records are sorted into groups. The Ordered By options are explained later in these steps.
*
Sequential Portions: This option allocates the first N records to the first sample, the next set of records to the second sample, and so on. This option is useful for creating groups such as the top decile (or some other portion) of a sorted field (for example, cumulative purchases or model scores). To use this option, you must specify the Ordered By options to determine how records are sorted into groups. The Ordered By options are explained later in these steps.
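The three sampling methods can be compared in a minimal sketch. The function names are hypothetical, Python's `random` module stands in for the product's random number generator, and the random method is simplified to produce groups of roughly equal size.

```python
import random


def random_sample(records, n_samples, seed):
    """Random Sample: shuffle with a seeded generator, then deal into groups."""
    rng = random.Random(seed)  # same seed and same input order -> same groups
    shuffled = list(records)
    rng.shuffle(shuffled)
    return [shuffled[i::n_samples] for i in range(n_samples)]


def every_other_x(ordered_records, n_samples):
    """Every Other X: record 1 -> sample 1, record 2 -> sample 2, then repeat."""
    return [ordered_records[i::n_samples] for i in range(n_samples)]


def sequential_portions(ordered_records, sizes):
    """Sequential Portions: first N records -> sample 1, next block -> sample 2."""
    groups, start = [], 0
    for size in sizes:
        groups.append(ordered_records[start:start + size])
        start += size
    return groups


records = list(range(1, 11))  # stand-in for ten records that are already sorted
print(every_other_x(records, 3))
print(sequential_portions(records, [3, 3, 4]))
print(random_sample(records, 3, seed=42) == random_sample(records, 3, seed=42))
```

The last line illustrates why a seed matters in this sketch: repeating the run with the same seed and the same input reproduces the same groups.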
10.
If you selected Random Sample, in most cases you can simply accept the default seed.
In rare cases, you may want to click Pick to randomly generate a new seed value, or enter a numeric value in the Seed field. Examples of when you might need to use a new seed value are:
*
*
11.
If you selected Every Other X or Sequential Portions, you must specify how records will be sorted. The sort order determines how records will be allocated to sample groups:
a.
Select an Ordered By field from the drop-down list or use a derived field by clicking Derived Fields.
b.
Select Ascending to sort numeric fields in increasing order (low to high) and sort alphabetic fields in alphabetical order. If you choose Descending, the sort order is reversed.
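The effect of the sort direction can be seen in a small sketch. The record layout, the field name `score`, and both function names are hypothetical. The example assumes that a top portion is taken from records sorted in descending order.

```python
from operator import itemgetter


def order_records(records, field, descending=False):
    """Sort records by one field, low to high unless descending is set."""
    return sorted(records, key=itemgetter(field), reverse=descending)


def top_portion(records, field, percent):
    """Return the top `percent` of records by `field` (at least one record)."""
    ordered = order_records(records, field, descending=True)
    cutoff = max(1, len(ordered) * percent // 100)
    return ordered[:cutoff]


# Hypothetical records: ten customers with model scores 1 through 10.
customers = [{"id": f"C{i}", "score": i} for i in range(1, 11)]

print(order_records(customers, "score")[0]["id"])                   # lowest score first
print(order_records(customers, "score", descending=True)[0]["id"])  # highest score first
print([c["id"] for c in top_portion(customers, "score", 10)])       # top decile
```

With an ascending sort, the same slice would return the lowest-scoring records instead, which is why the sort order determines which records land in the first sample group.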
12.
Click the General tab if you want to modify the default Process Name and Output Cell Names. By default, output cell names consist of the process name followed by the sample name and a digit. You can accept the default Cell Codes or uncheck the Auto Generate Cell Code box and assign codes manually. Enter a Note to clearly describe the purpose of the Sample process.
13.
Click OK. The process is configured and enabled in the flowchart. You can test run the process to verify that it returns the results you expect.