For a deeper understanding of the segmentation algorithms used within the Evelean AI Engine

Evelean is a Content Intelligence solution for the digital environment, offered as a SaaS platform.

In Evelean, artificial intelligence allows you to classify visitors, personalize content, and optimize engagement and lead generation.

The Evelean Dashboard gives access to multiple analysis features, including traffic classification, adherence between segments and content, heat map view per segment, lead qualification, etc.

It also provides regular CMS functionality for creating and managing content on your web site.

Segmentation algorithms

Three types of algorithms are used within the Evelean AI engine. All three aim to analyse visitor behaviour and identify distinct audience groups, also called clusters or segments. Within a group, all visitors should behave similarly, in terms of interest (the key metric is the number of views) or conversion potential (the key metric is the number of leads). Between two groups, on the contrary, behaviours should be distinct, meaning that visitors are interested in different content or different products.

Score Cards principles

This is the first method, available by default with any Evelean subscription. The algorithm analyses very specific session parameters, each with a different weight, to decide which cluster a visitor belongs to. This algorithm is linear and requires no pre-processing of data.
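As an illustration of the weighted, linear principle described above, here is a minimal sketch in Python. The parameter names and weights are hypothetical and chosen only for the example; they are not the parameters actually used by Evelean. The segment names are those of the four groups presented further down.

```python
from typing import Dict

# Hypothetical session parameters and weights, for illustration only.
# Each segment has its own weight per parameter.
SEGMENT_WEIGHTS: Dict[str, Dict[str, float]] = {
    "Economy":    {"referrer_is_price_comparator": 3.0, "landed_on_promotion": 2.0},
    "Caring":     {"landed_on_services": 3.0, "visit_in_evening": 1.0},
    "Brand":      {"referrer_is_brand_search": 3.0, "direct_access": 1.5},
    "Innovation": {"landed_on_new_products": 3.0, "device_is_recent": 2.0},
}


def score_session(session: Dict[str, float],
                  weights: Dict[str, Dict[str, float]] = SEGMENT_WEIGHTS) -> Dict[str, float]:
    """Linear score per segment: sum of weight * parameter value."""
    return {
        segment: sum(w * session.get(name, 0.0) for name, w in params.items())
        for segment, params in weights.items()
    }


def assign_segment(session: Dict[str, float]) -> str:
    """Assign the session to the segment with the highest score."""
    scores = score_session(session)
    return max(scores, key=scores.get)


# Example session: parameters are read directly, with no pre-processing.
session = {
    "referrer_is_price_comparator": 1.0,
    "landed_on_promotion": 1.0,
    "device_is_recent": 1.0,
}
print(score_session(session))
print(assign_segment(session))
```

Because each score is a plain weighted sum, the reason a session falls into a given segment can be read directly from the weights.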

The main value of this approach is to organise visitors into different groups, on the hypothesis that each group will behave differently. Then, based on an accurate measure of each group’s interest in various content, one of two outcomes will emerge:

  1. Each group does behave differently, which means that we have built a robust model that can predict the behaviour of future visitors based on their segment.

  2. There are no significant differences from one group to another, which means that the segmentation method needs to be modified.
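One possible way to tell these two outcomes apart is to compare how each segment distributes its views across content. The sketch below is an assumption made for illustration, not Evelean's actual measure: it uses the total variation distance between per-segment view shares, and the content identifiers are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def view_shares(views: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, float]]:
    """Turn (segment, content_id) view records into per-segment view shares."""
    counts: Dict[str, Counter] = {}
    for segment, content in views:
        counts.setdefault(segment, Counter())[content] += 1
    return {
        segment: {content: n / sum(counter.values()) for content, n in counter.items()}
        for segment, counter in counts.items()
    }


def distance(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Total variation distance: 0 means identical interests, 1 means no overlap."""
    contents = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in contents)


# Hypothetical view log: (segment, content viewed).
views = (
    [("Economy", "promo")] * 3
    + [("Economy", "services")]
    + [("Brand", "references")] * 4
)
shares = view_shares(views)
print(distance(shares["Economy"], shares["Brand"]))
```

A distance close to 1 corresponds to the first outcome (the groups behave differently), while a distance close to 0 corresponds to the second. Where to draw the line between the two is a judgment call that depends on the site and the traffic volume.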

The Evelean Score Cards algorithm has been tested in many different conditions and sectors. It is accurate for most standard B2C cases.

Unlike pure Deep Learning algorithms (see below), this approach gives each group a marketing meaning that is easy to understand. We have classified the groups into four segments:

SEGMENT    | EXPECTATIONS                        | MARKET RESPONSE
-----------|-------------------------------------|------------------------------------
Economy    | Purchase behavior is based on price | Emphasize promotions for this group
Caring     | Long term commitment matters        | Underline associated services
Brand      | Reputation is key                   | Emphasize reputation and references
Innovation | Highest standards of the market     | Focus on quality

Deep Learning (supervised)

We use a standard DL approach, based on feedforward neural networks. A quick introduction to this algorithm is provided in the Appendix, along with references to publications for further reading. This method requires pre-processing: the segmentation model must be trained before it can be applied to future predictions.

Such an algorithm is very efficient at determining whether a given visitor, compared with all previous visitors, is likely to exhibit a specific behavior, typically converting into a lead.
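To make the idea of a feedforward network more concrete, here is a minimal sketch of a forward pass in plain Python: session features go through one hidden layer, and the output is read as a conversion probability. The layer sizes and weight values are hypothetical and hand-written for the example; in practice they would be learned from previous visitors during training, and nothing here reflects the actual Evelean model.

```python
import math
from typing import Callable, List


def relu(x: float) -> float:
    return max(0.0, x)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs: List[float], weights: List[List[float]], biases: List[float],
          activation: Callable[[float], float]) -> List[float]:
    """One fully connected layer: activation(weights . inputs + bias) per unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical weights: 3 input features -> 2 hidden units -> 1 output.
W1 = [[1.0, -1.0, 0.5], [-0.5, 2.0, 0.0]]
B1 = [0.0, -1.0]
W2 = [[1.5, 1.0]]
B2 = [-1.0]


def conversion_probability(features: List[float]) -> float:
    """Information flows one way only: input -> hidden layer -> output."""
    hidden = dense(features, W1, B1, relu)
    return dense(hidden, W2, B2, sigmoid)[0]


print(conversion_probability([0.0, 0.0, 0.0]))
print(conversion_probability([1.0, 1.0, 1.0]))
```

The output always lies between 0 and 1, so it can be compared with a threshold to decide whether a visitor is flagged as a likely lead.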

As opposed to the Score Cards method, this approach does not look directly at the usual marketing criteria (capacity to spend, education, age, location, etc.), but focuses solely on the end result: based on previous measurements and on many different parameters, whatever those parameters might be, we can tell with high accuracy and a measurable success rate (98%, for instance) whether a new visitor is likely to convert on a given product.

The “supervised” aspect of this algorithm, however, means that not all parameters carry the same weight, some being more important than others. It assumes manual intervention to guide the training process.

In some cases, this method can provide more accurate results than the Score Cards algorithm, in terms of success rate and/or group size, but it requires data pre-processing and analysis before a prediction model can be deployed.

Unsupervised Deep Learning

This is the most advanced algorithm: it requires no manual intervention during the training process, but it does require extensive data processing. The objective remains the same as before: analyse the end result in terms of behavior, taking into account as many parameters as possible. This approach can be very useful in very specific B2C cases, or even B2B models. It works efficiently with a high volume of reference data, typically data coming from a DMP.

Why is Evelean different?

Since it uses only session information (as carried by standard HTTP), Evelean differs from all existing cookie-based applications, and even from proprietary Google segmentation methods. Moreover, it naturally complies with GDPR constraints since it works on purely anonymous traffic.

For most business cases, Evelean comes with a plug-and-play segmentation module (Score Cards algorithm) that does not require pre-processing or any complex data analysis.

However, for more advanced users or business needs, the full possibilities of AI are also available through our other deep learning algorithms (supervised or unsupervised).

Appendix

Feedforward networks

www.deeplearningbook.org

"Deep feedforward networks, also called feedforward neural networks, or multilayer perceptrons (MLPs), are the quintessential deep learning models.

(...)

Feedforward networks are of extreme importance to machine learning practitioners. They form the basis of many important commercial applications. For example, the convolutional networks used for object recognition from photos are a specialized kind of feedforward network. Feedforward networks are a conceptual stepping stone on the path to recurrent networks, which power many natural language applications."

Rectified Linear Unit

machinelearningmastery.com

"In a neural network, the activation function is responsible for transforming the summed weighted input from the node into the activation of the node or output for that input.

The rectified linear activation function is a piecewise linear function that will output the input directly if it is positive, otherwise, it will output zero. It has become the default activation function for many types of neural networks because a model that uses it is easier to train and often achieves better performance."
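The definition quoted above fits in a couple of lines of Python. The sketch also includes the derivative, which is either 0 or 1; this simple gradient is one commonly given reason why such networks are easier to train. The value used at exactly 0 is a convention, assumed here to be 0.

```python
def relu(x: float) -> float:
    """Output the input directly if it is positive, otherwise zero."""
    return x if x > 0 else 0.0


def relu_grad(x: float) -> float:
    """Slope of ReLU: 1 for positive inputs, 0 otherwise (0 at x == 0 by convention)."""
    return 1.0 if x > 0 else 0.0


for value in (-2.0, 0.0, 3.5):
    print(value, relu(value), relu_grad(value))
```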

Dropout

arxiv.org/pdf/1207.0580.pdf

"A feedforward, artificial neural network uses layers of non-linear “hidden” units between its inputs and its outputs. By adapting the weights on the incoming connections of these hidden units it learns feature detectors that enable it to predict the correct output when given an input vector. If the relationship between the input and the correct output is complicated and the network has enough hidden units to model it accurately, there will typically be many different settings of the weights that can model the training set almost perfectly, especially if there is only a limited amount of labeled training data. Each of these weight vectors will make different predictions on held-out test data and almost all of them will do worse on the test data than on the training data because the feature detectors have been tuned to work well together on the training data but not on the test data. Overfitting can be reduced by using “dropout” to prevent complex co-adaptations on the training data. On each presentation of each training case, each hidden unit is randomly omitted from the network with a probability of 0.5, so a hidden unit cannot rely on other hidden units being present. Another way to view the dropout procedure is as a very efficient way of performing model averaging with neural networks. A good way to reduce the error on the test set is to average the predictions produced by a very large number of different networks. The standard way to do this is to train many separate networks and then to apply each of these networks to the test data, but this is computationally expensive during both training and testing. Random dropout makes it possible to train a huge number of different networks in a reasonable time. There is almost certainly a different network for each presentation of each training case but all of these networks share the same weights for the hidden units that are present."
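The random omission of hidden units described in this passage can be sketched as follows. This is a minimal illustration of the training-time step only, written in plain Python with hypothetical activation values; it is not taken from the paper's code or from Evelean.

```python
import random
from typing import List, Optional


def dropout(activations: List[float], p_drop: float = 0.5,
            rng: Optional[random.Random] = None) -> List[float]:
    """Omit each hidden unit independently with probability p_drop (set it to zero)."""
    rng = rng or random.Random()
    return [0.0 if rng.random() < p_drop else a for a in activations]


# Hypothetical hidden-layer activations for one training case.
hidden = [0.8, 0.1, 0.5, 0.9, 0.3, 0.7]

# Each presentation of a training case draws a fresh random mask,
# so a different "thinned" network is effectively used each time.
rng = random.Random(0)
print(dropout(hidden, rng=rng))
print(dropout(hidden, rng=rng))
```

Since every unit may disappear at any presentation, no unit can rely on a specific neighbour being present, which is the co-adaptation the passage refers to.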

 
