Properly configuring CustomerSampleSize for best Optimize session run time while preserving optimality takes some consideration.
Optimize works by breaking up the proposed contacts into random sub-samples of customers called "chunks." All proposed contacts and contact history belonging to a single customer are processed together with that customer, in the chunk to which that customer belongs (a customer can belong to only a single chunk). The accuracy of the optimization algorithm depends on these chunks of customers being statistically similar to each other, and a larger chunk size makes this similarity more likely. Cross-customer capacity constraints are evenly distributed across the chunks. For example, if your Optimize session contains a constraint specifying a maximum of 1,000 offer A's, and the session is run with 10 chunks, each chunk receives a capacity rule allowing a maximum of 100 offer A's.
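The even split of a capacity constraint described above can be sketched in a few lines of Python. This is purely illustrative: `split_capacity` is a hypothetical helper, not part of Optimize, and the real internal distribution logic is not public.

```python
def split_capacity(max_offers, n_chunks):
    """Distribute a cross-customer capacity cap evenly across chunks.

    Illustrative sketch only; Optimize's actual internals are not public.
    When the cap does not divide evenly, the first `remainder` chunks get
    one extra unit so the overall total is preserved.
    """
    base, remainder = divmod(max_offers, n_chunks)
    return [base + (1 if i < remainder else 0) for i in range(n_chunks)]

# The example from the text: a 1,000-offer cap split over 10 chunks
print(split_capacity(1000, 10))  # -> ten per-chunk caps of 100 each
```

Note that the sum of the per-chunk caps always equals the original cap, which is what makes the per-chunk rules collectively equivalent to the session-wide constraint.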
The algorithm tuning variable CustomerSampleSize allows you to set the maximum chunk size. Larger chunks yield more accurate results, but the session run time and the memory required also increase. Do not use chunk sizes significantly greater than 10,000 without careful planning: many systems do not have enough memory to process more than 10,000 customers at a time, and exceeding available memory causes the Optimize session run to fail with an out-of-memory error. In many cases a larger chunk size may not measurably improve the optimality of the solution (measured as the sum of the scores of the surviving transactions in the Optimized Contacts Table), yet it still takes more time and memory to run. Tune CustomerSampleSize based on your specific optimization problem and performance needs.
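The run-time side of this tradeoff is easiest to see in terms of chunk count: for a fixed customer population, raising CustomerSampleSize shrinks the number of chunks. The helper below is a hypothetical sketch (`chunk_count` is an illustrative name, not an Optimize API) assuming chunks are filled up to the maximum size.

```python
def chunk_count(total_customers, customer_sample_size):
    """Approximate number of chunks for a given maximum chunk size.

    Illustrative only; actual chunking details are internal to Optimize.
    Uses ceiling division: a partial final chunk still counts as a chunk.
    """
    return -(-total_customers // customer_sample_size)

# One million customers at several candidate CustomerSampleSize values
for size in (1_000, 5_000, 10_000):
    print(size, chunk_count(1_000_000, size))  # 1000, 200, and 100 chunks
```

Fewer, larger chunks improve statistical similarity between chunks but raise the per-chunk memory footprint, which is why sizes well beyond 10,000 need careful planning.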
A smaller maximum chunk size may cause more chunks to be created. This makes it more likely that a rule depends on some element (such as the Email channel) that is less numerous than the number of chunks. For example, consider a session with 100,000 customers and a rule specifying a minimum of 20 offers on the Email channel. If the chunk size were reduced to 100, there would be 1,000 chunks. Now the rule's minimum (20) is less than the number of chunks, which makes the per-chunk minimum 0.02 (20 divided by 1,000). Because a fractional minimum is not meaningful, 2% of the chunks use a rule with a minimum of 1, and the other 98% of the chunks use a minimum of 0. As long as each chunk is statistically similar with regard to the Email channel, Optimize processes the rule as expected. A problem occurs when there are fewer customers offered emails than there are chunks. If only 500 customers are offered emails, each chunk has only about a 50% chance of containing a customer offered an email, and the odds that a particular chunk has both a customer offered an email and a minimum-1 rule are only about 1%. Instead of meeting the specified minimum of 20, Optimize would on average return only about 10 (roughly 1% of the 1,000 chunks satisfy both conditions).
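The shortfall described above can be checked with a small Monte Carlo sketch. This is a hypothetical simulation of the scenario in the text, not Optimize code: it assumes the minimum-1 rules land on a fixed 2% of the chunks and that customers are assigned to chunks uniformly at random.

```python
import random

def simulate_min_rule(total_min=20, n_chunks=1000, email_customers=500,
                      trials=2000, seed=7):
    """Hypothetical simulation (not Optimize code) of a diluted minimum rule.

    With total_min < n_chunks, only total_min of the chunks carry a
    minimum of 1; the rest carry a minimum of 0.  An email offer can only
    be forced through by a minimum-1 chunk that actually contains an
    email customer.
    """
    min_one_chunks = total_min % n_chunks  # 20 of the 1,000 chunks
    rng = random.Random(seed)
    delivered = 0
    for _ in range(trials):
        # Count the email customers that randomly land in a minimum-1 chunk.
        delivered += sum(1 for _ in range(email_customers)
                         if rng.randrange(n_chunks) < min_one_chunks)
    return delivered / trials

print(simulate_min_rule())  # averages near 10, well below the minimum of 20
```

The average lands near 10 because each of the 20 minimum-1 chunks contains, on average, only 500/1,000 = 0.5 email customers, illustrating why minimums smaller than the chunk count are unreliable.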
Copyright IBM Corporation 2012. All Rights Reserved.