Understanding Interact learning
The Interact learning module monitors visitor responses to offers, along with visitor attributes. The learning module has two general modes:
Exploration: the learning module serves offers in order to gather enough response data to optimize the estimates used later during exploitation. Offers served during exploration do not necessarily reflect the optimal choice.
Exploitation: after the exploration phase has collected enough data, the learning module uses the probabilities it has calculated to help select which offers to present.
The learning module alternates between exploration and exploitation based on two properties: a confidence level, which you configure with the confidenceLevel property, and the probability that the learning module presents a random offer, which you configure with the percentRandomSelection property.
You set the confidenceLevel to a percentage which represents how sure (or confident) the learning module must be before its scores for an offer are used in arbitration. At first, when the learning module has no data to work from, the learning module relies entirely upon the marketing score. After every offer has been presented as many times as defined by the minPresentCountThreshold, the learning module enters the exploration mode. Without a lot of data to work with, the learning module is not confident that the percentages it calculates are correct. Therefore, it stays in the exploration mode.
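As a rough illustration of the threshold check described above, the following Python sketch tests whether every offer has been presented at least minPresentCountThreshold times. The function name and the dictionary of presentation counts are hypothetical and are not part of Interact; the sketch only mirrors the rule as stated here.

```python
def all_offers_past_threshold(present_counts, min_present_count_threshold):
    """Return True once every offer has been presented at least
    min_present_count_threshold times.

    present_counts is a hypothetical mapping of offer name to the number
    of times that offer has been presented so far.
    """
    if not present_counts:
        # With no offers tracked there is no data at all.
        return False
    return all(
        count >= min_present_count_threshold
        for count in present_counts.values()
    )


# Offer B has not yet reached the threshold, so under the rule above
# the marketing score alone would still drive offer selection.
counts = {"offer_a": 120, "offer_b": 40}
print(all_offers_past_threshold(counts, 50))
```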
The learning module assigns weights to each offer. To calculate the weights, the learning module uses a formula that takes in as input the configured confidence level as well as historical acceptance data and the current session data. The formula inherently balances between exploration and exploitation, and returns the appropriate weight.
To ensure that the system is not biased toward the offers that perform best during the early stages, Interact presents a random offer a fixed percentage of the time, which you set with the percentRandomSelection property. This forces the learning module to recommend offers other than the most successful ones, to determine whether other offers would be more successful if they had greater exposure. For example, if you set percentRandomSelection to 5, the learning module presents a random offer 5% of the time and adds the resulting response data to its calculations.
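The effect of percentRandomSelection can be sketched in a few lines of Python. This is a minimal illustration of the idea, not Interact's implementation: the function name, the dictionary of scores, and the uniform choice among offers are all assumptions.

```python
import random


def choose_offer(scored_offers, percent_random_selection, rng=random):
    """Pick one offer from a hypothetical mapping of offer name to score.

    With probability percent_random_selection / 100 a uniformly random
    offer is returned; otherwise the highest-scoring offer is returned.
    """
    if not scored_offers:
        raise ValueError("scored_offers must not be empty")
    # rng.random() is in [0, 1), so 0 never explores and 100 always does.
    if rng.random() * 100 < percent_random_selection:
        return rng.choice(sorted(scored_offers))
    return max(scored_offers, key=scored_offers.get)


offers = {"A": 0.2, "B": 0.9, "C": 0.5}
rng = random.Random(42)  # seeded only so that the run is repeatable
picks = [choose_offer(offers, 5, rng) for _ in range(10_000)]
# Most picks are the top-scoring offer; the rest come from random draws.
print({name: picks.count(name) for name in offers})
```

With the property set to 5, the lower-scoring offers still receive a small, steady share of impressions, which is what lets their response data keep accumulating.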
The learning module determines which offers are presented in the following way.
For example, the learning module determines that a visitor is 30% likely to accept offer A and 70% likely to accept offer B, and that it should exploit this information. From the treatment rules, the marketing score is 75 for offer A and 55 for offer B. However, the calculations in step 3 make the final score for offer B higher than that of offer A, so the runtime environment recommends offer B.
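The numbers in this example can be worked through in a short Python sketch. The formula that Interact uses to combine the marketing score with the offer weight is not given here, so the sketch assumes the simplest possible combination: the offer weight equals the acceptance probability, and the final score is the marketing score multiplied by that weight. Both assumptions, and the function name, are illustrative only.

```python
def final_score(marketing_score, offer_weight):
    """Assumed combination: marketing score scaled by the offer weight.
    This is an illustrative stand-in, not Interact's actual formula."""
    return marketing_score * offer_weight


# (marketing score, acceptance probability used as the assumed weight)
offers = {"A": (75, 0.30), "B": (55, 0.70)}

scores = {
    name: final_score(marketing_score, weight)
    for name, (marketing_score, weight) in offers.items()
}
recommended = max(scores, key=scores.get)
print(scores, recommended)
```

Under these assumptions offer A scores about 22.5 and offer B about 38.5, so offer B is recommended even though its marketing score is lower.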
Learning is also based on the recencyWeightingFactor property and the recencyWeightingPeriod property. These properties enable you to give recent data more weight than older data. The recencyWeightingFactor is the share of the total weight that the recent data should have. The recencyWeightingPeriod is the length of time that counts as recent. For example, you set the recencyWeightingFactor to 0.30 and the recencyWeightingPeriod to 24. This means that the previous 24 hours of data account for 30% of all data considered. If you have a week's worth of data, all of the data averaged across the first six days accounts for 70% of the total, and the last day accounts for 30%.
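One way to picture the weighting is as a linear blend of two acceptance rates, one from the recent period and one from everything older. The Python sketch below assumes exactly that blend; the function name and the sample acceptance rates are hypothetical, and the way Interact combines the two sets of data internally may differ.

```python
def recency_weighted_rate(older_rate, recent_rate, recency_weighting_factor):
    """Blend two acceptance rates so that the recent period contributes
    recency_weighting_factor of the total and older data the remainder.

    Assumes a simple linear blend for illustration.
    """
    if not 0.0 <= recency_weighting_factor <= 1.0:
        raise ValueError("recency_weighting_factor must be between 0 and 1")
    older_share = 1.0 - recency_weighting_factor
    return older_share * older_rate + recency_weighting_factor * recent_rate


# Hypothetical rates: 10% acceptance averaged over the first six days,
# 20% acceptance over the last 24 hours, with the factor set to 0.30.
blended = recency_weighted_rate(older_rate=0.10, recent_rate=0.20,
                                recency_weighting_factor=0.30)
print(round(blended, 4))
```

Here the blended rate is 0.70 × 0.10 + 0.30 × 0.20 = 0.13, noticeably higher than a plain seven-day average, in which the last day would carry only one seventh of the weight.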
Every session writes the following data to a learning staging table:
At a configurable interval, an aggregator reads the data from the staging table, compiles it, and writes it to a table. The learning module reads this aggregated data and uses it in calculations.