Troubleshooting Sample-related Provisioning Problems

To process large volumes of data in an acceptable amount of time without sacrificing the quality of the results, Optimize places certain requirements on the makeup of the session’s proposed contacts. One of the strategies Optimize uses is to break the proposed contact data into random subsets, each containing approximately the same number of customers; it then optimizes the proposed contacts of each of these customer samples independently. If multiple threads are configured and supported by your hardware, the customer samples are processed concurrently.

A side effect of the customer sample approach is a class of problems that can result in errors or suboptimal results. The number of customer samples used for a session run is determined by dividing the number of customers in the proposed contacts table (PCT) by the value of the configuration parameter Optimize|AlgorithmTuning|CustomerSampleSize. It is important that enough proposed contacts match each capacity rule so that the random customer samples are statistically similar with respect to every feature used by a capacity rule.

For example, suppose there are 1 million customers and the configured customer sample size is 1000. This means there will be 1000 customer samples. Now suppose a capacity rule is set up as: minimum 1 email, maximum 5000 emails. Optimize takes the rule constraints and modifies them to spread the rule across the customer samples. In this example, the maximum 5000 emails constraint is divided by the number of samples, so that each sample is processed with a maximum 5 emails constraint. But what happens to the minimum 1 email constraint? Each sample cannot be required to contain a minimum of 1/1000 of an email!
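The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the division described above, not Optimize's actual implementation: the function name split_capacity_rule and its parameters are hypothetical, and the handling of a minimum that does not divide evenly (handing the leftover units to randomly chosen samples) is an assumption generalized from the single-email case.

```python
import random

def split_capacity_rule(num_customers, sample_size, rule_min, rule_max, seed=None):
    """Spread one capacity rule across customer samples (illustrative only).

    Returns a list of (min, max) pairs, one per customer sample.
    """
    rng = random.Random(seed)
    # Number of samples = customers in the PCT / CustomerSampleSize.
    num_samples = max(1, num_customers // sample_size)
    # The maximum is divided by the number of samples.
    per_sample_max = rule_max // num_samples
    # Assumption: a minimum that cannot be divided evenly is handed out
    # one unit at a time to randomly chosen samples.
    base_min, leftover = divmod(rule_min, num_samples)
    mins = [base_min] * num_samples
    for i in rng.sample(range(num_samples), leftover):
        mins[i] += 1
    return [(lo, per_sample_max) for lo in mins]

rules = split_capacity_rule(1_000_000, 1000, rule_min=1, rule_max=5000, seed=0)
print("samples:", len(rules))
print("samples with a minimum of 1:", sum(1 for lo, _ in rules if lo == 1))
print("maximum per sample:", rules[0][1])
```

With the numbers from the example, every sample gets a maximum of 5 emails, and exactly one sample carries the minimum of 1.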

Instead, Optimize randomly picks one sample to process with a minimum 1 email constraint, while the other 999 samples are processed with no minimum email constraint. This works well unless there are too few proposed contacts using email to ensure that all 1000 samples get at least one email. If your proposed contacts contain only 500 contacts using email, the chance that a particular sample contains an email is less than 50%. That means there is a greater than 50% chance that the session will be aborted with an error because the minimum cannot be satisfied in the chosen sample, even though the proposed contacts contained 500 times that minimum. To avoid this situation, any feature used in a capacity rule should be well represented relative to the number of samples.
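To get a rough feel for what "well represented" means, the chance that a given sample contains the feature can be estimated as follows. This is a back-of-the-envelope sketch, not something Optimize reports: it assumes each contact with the feature belongs to a different customer and that customers land in samples independently and uniformly at random. The function names are hypothetical.

```python
import math

def chance_sample_has_feature(num_samples, num_feature_customers):
    """Approximate chance that one particular sample contains at least one
    customer with the feature, assuming independent uniform assignment."""
    return 1.0 - (1.0 - 1.0 / num_samples) ** num_feature_customers

def feature_customers_needed(num_samples, target_chance):
    """Smallest number of feature customers giving a sample at least
    target_chance of containing the feature, under the same assumption."""
    return math.ceil(math.log(1.0 - target_chance) / math.log(1.0 - 1.0 / num_samples))

p_hit = chance_sample_has_feature(num_samples=1000, num_feature_customers=500)
p_abort = 1.0 - p_hit  # the sample carrying the minimum has no email
print(f"chance a sample contains an email: {p_hit:.3f}")
print(f"chance the minimum cannot be met:  {p_abort:.3f}")

needed = feature_customers_needed(num_samples=1000, target_chance=0.99)
print("email customers needed for a 99% chance:", needed)
```

Under these assumptions the 500-email case gives a per-sample chance a little under 40%, consistent with the "less than 50%" figure above, and the second function shows how quickly the required number of contacts grows as the number of samples increases.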