Using artificial intelligence to filter bulk email

The period of a client's inactivity is often used as the filter that decides whether to send that client an email with offers or information. For example, if a person showed some activity during the last 90 days (say, opened an email), we keep sending them the next series of emails. Once 90 days have passed since the last activity, the mailing stops and the user has to be reactivated. The period can be shorter or longer, but the essence of the filtering process stays the same.
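The "N days" rule described above can be sketched in a few lines of Python. This is only an illustration: the function name is_active, the sample users, and the dates are hypothetical and do not come from the platform.

```python
from datetime import date, timedelta
from typing import Optional


def is_active(last_activity: Optional[date], today: date, window_days: int = 90) -> bool:
    """Return True if the user showed any activity within the last `window_days` days."""
    if last_activity is None:
        # A user with no recorded activity is treated as inactive (an assumption).
        return False
    return (today - last_activity) <= timedelta(days=window_days)


# Hypothetical data: date of the last open or click for each user.
today = date(2024, 6, 1)
last_seen = {
    "alice": date(2024, 5, 20),
    "bob": date(2024, 1, 15),
    "carol": None,
}

# Only users inside the 90-day window stay on the mailing list.
recipients = [user for user, seen in last_seen.items() if is_active(seen, today)]
print(recipients)  # ['alice']
```

Changing window_days gives the shorter or longer variants of the same rule.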

Now imagine a user who is not interested in receiving emails from a particular company, has not opened any for a long time, and then happens to open one. What then? The filter "the user was active within the last N days" applies, and the emails from the company start arriving again. What is left for the user to do? There are two options: tolerate it or unsubscribe from the mailing list. Either way, the situation is problematic and does not help bring the client back for further purchases.

Thus, there is a need for an additional filter that can determine whether we should send an email to a particular client and with what content. This is exactly the situation where artificial intelligence (namely, machine learning) can be useful.


What kinds of filters do we use in the platform?

To decide whether to send an email to a user, we use the whole history of the user's previous actions. To that end, we developed several algorithms based on the ideas of the widely known RFM (recency, frequency, monetary value) analysis. Our algorithms work on the following sets of metrics:

General quantitative indicators: total number of emails received by a user within the email campaign, lifetime value, etc.
Open rates. To capture the dynamics of a user's behavior over their client history, we distinguish several rates that reflect both averaged indicators and behavior patterns in different periods of the client's life cycle. For example, we use the following metrics:

overall open rate of the user's emails within the email campaign;
open rate for the last N received emails (for example, the last 30 or 10);
open rates of emails received on weekdays and on weekends, etc.

Click rates. Analogous to the open rates.
Recent activity of a user. For instance, the number of days since the user's last open (or click).
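To make the metrics above more concrete, here is a minimal sketch of how they could be computed from a user's delivery history. The Delivery record and the user_features function are hypothetical names, and the sketch assumes that the send date is an acceptable stand-in for the open date; the platform's actual feature set is richer than this.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class Delivery:
    """One email delivered to a user (hypothetical record layout)."""
    sent_on: date
    opened: bool


def open_rate(deliveries: List[Delivery]) -> float:
    """Share of opened emails; 0.0 when nothing was delivered."""
    if not deliveries:
        return 0.0
    return sum(d.opened for d in deliveries) / len(deliveries)


def user_features(history: List[Delivery], today: date, last_n: int = 10) -> Dict[str, Optional[float]]:
    """Compute a few open-rate and recency metrics for one user."""
    history = sorted(history, key=lambda d: d.sent_on)
    weekday = [d for d in history if d.sent_on.weekday() < 5]   # Monday to Friday
    weekend = [d for d in history if d.sent_on.weekday() >= 5]  # Saturday, Sunday
    opened_dates = [d.sent_on for d in history if d.opened]
    # Assumption: the send date approximates the date of the open.
    days_since = (today - max(opened_dates)).days if opened_dates else None
    return {
        "total_received": len(history),
        "open_rate": open_rate(history),
        "open_rate_last_n": open_rate(history[-last_n:]),
        "open_rate_weekday": open_rate(weekday),
        "open_rate_weekend": open_rate(weekend),
        "days_since_last_open": days_since,
    }


# Hypothetical history for one user.
history = [
    Delivery(date(2024, 1, 1), True),   # Monday
    Delivery(date(2024, 1, 3), False),  # Wednesday
    Delivery(date(2024, 1, 6), True),   # Saturday
    Delivery(date(2024, 1, 8), False),  # Monday
]
features = user_features(history, today=date(2024, 1, 16), last_n=2)
print(features)
```

Click rates could be computed in the same way by adding a clicked field to the record.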

Based on these parameters, it is possible to build rather effective algorithms that classify user activity and predict, before sending, whether a user is going to open a particular email. Bulk email campaigns by nature usually have fairly low open rates: far more users leave an email unread than open it, let alone follow a link in it. Therefore, it is much easier and more efficient to single out the group of users who, with high probability, are not going to open the email. Here it is worth defining what we mean by "high probability".

The choice of the probability threshold determines both the size of the group of users that the algorithm recommends excluding from a specific email campaign and sending to the reactivation procedure, and the error that occurs. By error we mean the situation where the algorithm recommends not sending an email to a user because the probability of opening is low, yet the user would in fact have opened it. Levels of 95% and 99% are the ones commonly used in mathematical statistics. We are going to see how the results of our filters depend on the choice of a threshold in the range from 95% to 99%. When we say that the threshold is, say, 98%, we mean that we filter out all users for whom the probability of not opening a particular email within the email campaign exceeds 98%.
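The thresholding step itself is simple once a model has produced, for each user, the probability of not opening the email. The sketch below assumes such probabilities are already available; the function split_by_threshold and the sample values are hypothetical.

```python
from typing import Dict, List, Tuple


def split_by_threshold(p_not_open: Dict[str, float], threshold: float) -> Tuple[List[str], List[str]]:
    """Split users into (send, reactivate) by the predicted probability of not opening."""
    send: List[str] = []
    reactivate: List[str] = []
    for user, p in p_not_open.items():
        # Users whose probability of not opening exceeds the threshold are filtered out.
        if p > threshold:
            reactivate.append(user)
        else:
            send.append(user)
    return send, reactivate


# Hypothetical model output: probability that each user will NOT open the email.
p_not_open = {"u1": 0.999, "u2": 0.97, "u3": 0.60, "u4": 0.985}

send_95, reactivate_95 = split_by_threshold(p_not_open, 0.95)
send_99, reactivate_99 = split_by_threshold(p_not_open, 0.99)
print(reactivate_95)  # ['u1', 'u2', 'u4']
print(reactivate_99)  # ['u1']
```

A higher threshold moves fewer users to reactivation, which is the trade-off discussed in the rest of the article.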

What results do we get?

We are going to answer three main questions:

What errors do the filters we developed make?
What share of clients can be filtered out in addition to the traditionally used "N days" rule?
What share of potential opens do we still lose when using our algorithms?

What errors do the filters we developed make?

The answer to the first question can be found in the following diagram.

[Figure "Precision_3": classification accuracy as a function of the probability threshold]

We see that the accuracy of classification depends on the choice of the probability threshold and increases with it. That is logical: the higher the probability with which we want to identify users who will not read the next email, the fewer such users the algorithm finds. Therefore, the number of misclassified users also decreases.

Here, by classification error we mean the case where our algorithm labels a user as unlikely to open, whereas in fact the user opens the email (possibly to unsubscribe). For example, with a threshold of 95%, the classification error is 1.2%. This means that among the clients we recommend not sending an email to, 1.2% would open it. In other words, you can expect an open rate of about 1.2% in this group of users.

As we can see from the plot, the classification error for higher thresholds (98% and above) is small, less than half a percent.
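The classification error defined in this section is the open rate inside the excluded group. A minimal sketch of measuring it follows, assuming a hold-out campaign in which the emails were actually sent to everyone so that real opens are known. The function exclusion_error and the tiny data set are hypothetical and are not the figures behind the plot.

```python
from typing import Dict


def exclusion_error(p_not_open: Dict[str, float], opened: Dict[str, bool], threshold: float) -> float:
    """Share of users above the threshold who nevertheless opened the email."""
    excluded = [user for user, p in p_not_open.items() if p > threshold]
    if not excluded:
        return 0.0
    return sum(opened[user] for user in excluded) / len(excluded)


# Hypothetical hold-out data: predicted probabilities and what actually happened.
p_not_open = {"a": 0.99, "b": 0.97, "c": 0.96, "d": 0.90, "e": 0.50}
opened = {"a": False, "b": False, "c": True, "d": False, "e": True}

error_95 = exclusion_error(p_not_open, opened, 0.95)  # excluded: a, b, c
error_98 = exclusion_error(p_not_open, opened, 0.98)  # excluded: a
print(error_95, error_98)
```

On real data the excluded groups are large, so the error is a small percentage rather than the coarse fractions this toy example produces.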

What share of users can be further filtered out using our methods?

It depends on the specifics of the customer base and other factors, and can fluctuate significantly. For example, at a threshold of 95%, you can get a recommendation to exclude from the email campaign 30% to 50% of the users who showed some activity within the last 90 days. Similarly, at a threshold of 99%, this figure can range from 5% to 10% or higher.

What share of potential opens do we still lose when using our algorithms?

The answer to this question depends directly on the result we get when answering the second one. In general, the loss in opens falls as the threshold rises. In other words, if we choose a threshold of 99%, we lose about 1% of opens, but if we lower the threshold to 95%, we lose at least 5%, and possibly more (7-8%).
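The second and third questions can be examined together by computing, for each threshold, the share of users excluded and the share of all opens that would be lost. The sketch below again assumes hold-out data with known opens; the function threshold_tradeoff and the sample values are hypothetical and only illustrate the calculation.

```python
from typing import Dict, List, Tuple


def threshold_tradeoff(
    p_not_open: Dict[str, float],
    opened: Dict[str, bool],
    thresholds: List[float],
) -> List[Tuple[float, float, float]]:
    """For each threshold return (threshold, share of users excluded, share of opens lost)."""
    total_users = len(p_not_open)
    total_opens = sum(opened.values())
    rows = []
    for t in thresholds:
        excluded = [user for user, p in p_not_open.items() if p > t]
        lost_opens = sum(opened[user] for user in excluded)
        share_excluded = len(excluded) / total_users if total_users else 0.0
        share_lost = lost_opens / total_opens if total_opens else 0.0
        rows.append((t, share_excluded, share_lost))
    return rows


# Hypothetical hold-out data.
p_not_open = {"a": 0.99, "b": 0.97, "c": 0.96, "d": 0.90, "e": 0.50}
opened = {"a": False, "b": False, "c": True, "d": False, "e": True}

table = threshold_tradeoff(p_not_open, opened, [0.95, 0.98])
for threshold, share_excluded, share_lost in table:
    print(f"threshold={threshold:.2f} excluded={share_excluded:.0%} opens lost={share_lost:.0%}")
```

Running such a table over the 95% to 99% range on your own customer base is one way to pick a threshold that fits your tolerance for lost opens.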

Thus, the higher the threshold, the smaller the losses in opens and clicks, but also the smaller the share of users we discard, and vice versa. Which strategy to choose is an individual question that can be answered only with all the factors in mind. The classifier technology described here, based on artificial intelligence algorithms, at least gives a general picture of what you can expect when using such technologies for your own purposes.


If you want to learn more about the technical side of our algorithms, read the article "Forecasting user activity using Machine Learning and R language" on our blog.
