Power Of Personalisation: Driving Customer Growth With Advanced Recommendation Engines

Wojtek Krok, Head of QuantumBlack South East Asia; Lukasz Gaweda, Engagement Manager, QuantumBlack; Piotr Roszkowski, Engagement Manager, QuantumBlack; Tomasz Zamacinski, Junior Principal, QuantumBlack

Recommendation engines are commonplace today. Product suggestions tailored to a user’s preferences can be found across a range of industries, from video streaming to online fashion retail. However, adoption remains a challenge in digital services industries such as finance and telco, where many products are interchangeable.

For example, many data packages for mobile plans serve the same customer needs and are substitutable. Determining which recommendations will be attractive to customers while adding incremental value for the company is complex — tailoring mobile package suggestions poses a different challenge from recommending one video over another.

Example of the different levels of sophistication of recommenders used in the telco and banking industries.

In a recent project, QuantumBlack took an experimental-design approach to creating an end-to-end recommendation engine for a telco operator. By tying A/B testing to advanced business objectives, the final system went beyond traditional segment-based offerings to deliver a more sophisticated solution: personalised recommendations suited to each customer’s behavioural patterns and needs. The goal was to provide every customer with a unique offer, maximising the expected business value of the resulting product take-up.

This article explores the key elements critical to the success of this project — and therefore key drivers of success in creating recommendation engines which can navigate and drive success in complex sectors.

Treatments in our context could be campaigns with a specific product at a defined price, delivered in a specified form to a selected audience.

Before examining the specific steps of the project, it is useful to consider general best practice for designing effective A/B-testing-based products and campaigns. The guiding principles below have been acquired and refined across a number of use cases, and offer helpful considerations both before starting a new project and throughout the development process.

  • An experimental mindset — no matter how experienced your team is, approach the ideation of new campaigns from a fresh standpoint, focusing on what the end customer needs.
  • Validate your designs based on incremental effects — A/B tests by definition require a control group, which allows you to capture the difference between a new campaign and business as usual.
  • Be precise with the problem — identify the key metric and the actual goal of the campaign before product and campaign ideation begins.
  • Every test ends with a clear next step — to capture value from experimenting with your campaigns, you need to be able to translate conclusions into actions.
  • Acknowledge that your customers differ — by focusing on the Conditional Average Treatment Effect (CATE), you recognise that a campaign’s effect depends on the unique characteristics of each person exposed to the tested treatment.

The key reason for running A/B pilots is the ability to derive a CATE for each customer and each treatment.
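As a concrete illustration, a segment-level CATE can be read straight off A/B test data as the difference between the mean KPI of the target and control groups within each segment. The segment names and KPI values below are hypothetical; a minimal sketch:

```python
from statistics import mean

# Hypothetical A/B test records: (segment, in_target_group, kpi_value)
records = [
    ("heavy_data", True, 12.0), ("heavy_data", True, 14.0),
    ("heavy_data", False, 9.0), ("heavy_data", False, 11.0),
    ("light_data", True, 5.0), ("light_data", True, 5.0),
    ("light_data", False, 5.0), ("light_data", False, 3.0),
]

def segment_cate(records):
    """Per-segment CATE: mean(KPI | treated) - mean(KPI | control)."""
    cates = {}
    for seg in sorted({seg for seg, _, _ in records}):
        treated = [y for s, t, y in records if s == seg and t]
        control = [y for s, t, y in records if s == seg and not t]
        cates[seg] = mean(treated) - mean(control)
    return cates

print(segment_cate(records))  # -> {'heavy_data': 3.0, 'light_data': 1.0}
```

In practice the conditioning is on rich customer features rather than a single segment label, which is what the ML models described below provide.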

With this best practice front of mind, we can begin exploring our project.

To estimate the Conditional Average Treatment Effects, a machine learning (ML) model was trained for each treatment and each KPI of interest on the corresponding target and control populations. At the customer level, each model predicted the expected net change in the KPI caused by being targeted with the treatment.

Our use case involved large training set sizes, a large number of available features and a large variance in the observed KPIs. We found that the best performing models were meta-learners. Performance was further improved by customising the meta-learners and accounting for the customer’s propensity to activate the campaign.
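The article does not name the exact meta-learner used, so the sketch below shows one common variant, the T-learner: fit separate outcome models on the treated and control populations, then take the difference of their predictions as the CATE estimate. The segment-mean base learner and the toy data are illustrative stand-ins for the real features and models:

```python
from collections import defaultdict
from statistics import mean

class SegmentMeanModel:
    """Toy base learner: predicts the mean KPI of a customer's segment."""
    def fit(self, X, y):
        buckets = defaultdict(list)
        for segment, kpi in zip(X, y):
            buckets[segment].append(kpi)
        self.means = {seg: mean(vals) for seg, vals in buckets.items()}
        return self

    def predict(self, X):
        return [self.means[segment] for segment in X]

class TLearner:
    """T-learner: one outcome model per arm; CATE(x) = mu_1(x) - mu_0(x)."""
    def __init__(self, base_model_cls):
        self.base_model_cls = base_model_cls

    def fit(self, X, treated, y):
        X1 = [x for x, t in zip(X, treated) if t]
        y1 = [v for v, t in zip(y, treated) if t]
        X0 = [x for x, t in zip(X, treated) if not t]
        y0 = [v for v, t in zip(y, treated) if not t]
        self.mu1 = self.base_model_cls().fit(X1, y1)  # treated-arm model
        self.mu0 = self.base_model_cls().fit(X0, y0)  # control-arm model
        return self

    def predict_cate(self, X):
        return [a - b for a, b in zip(self.mu1.predict(X), self.mu0.predict(X))]

# Hypothetical data: segment label, treatment flag, observed KPI
X = ["A", "A", "B", "B", "A", "A", "B", "B"]
treated = [True, True, True, True, False, False, False, False]
y = [10.0, 12.0, 4.0, 6.0, 8.0, 8.0, 5.0, 5.0]
learner = TLearner(SegmentMeanModel).fit(X, treated, y)
print(learner.predict_cate(["A", "B"]))  # -> [3.0, 0.0]
```

With a real base learner (e.g. gradient-boosted trees) and customer-level features, the same structure yields per-customer CATE estimates.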

Cross-validation was used to check that each model’s predictions were genuinely useful: customers are grouped and ranked by predicted uplift, and the difference in KPI between target and control is computed for each group.
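A minimal sketch of that validation idea, using hypothetical scored customers: rank by predicted uplift, split into buckets, and compare observed target and control KPIs per bucket. A useful model shows larger observed uplift in the top buckets:

```python
from statistics import mean

def uplift_by_bucket(scored, n_buckets=2):
    """Rank customers by predicted uplift, split into equal buckets, and
    compute the observed uplift (target mean - control mean) per bucket.
    scored: list of (predicted_uplift, in_target, observed_kpi)."""
    ranked = sorted(scored, key=lambda r: r[0], reverse=True)
    size = len(ranked) // n_buckets
    observed = []
    for i in range(n_buckets):
        bucket = ranked[i * size:(i + 1) * size]
        target = [kpi for _, tgt, kpi in bucket if tgt]
        control = [kpi for _, tgt, kpi in bucket if not tgt]
        observed.append(mean(target) - mean(control))
    return observed

# Hypothetical holdout scores and outcomes
scored = [
    (0.9, True, 10.0), (0.8, False, 6.0), (0.7, True, 9.0), (0.6, False, 7.0),
    (0.3, True, 5.0), (0.2, False, 5.0), (0.1, True, 4.0), (0.0, False, 4.0),
]
print(uplift_by_bucket(scored))  # -> [3.0, 0.0]
```

The decreasing sequence indicates the model’s ranking separates high-uplift customers from low-uplift ones.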

A common, configurable Kedro pipeline minimised effort and the risk of error: its modular design allowed us to handle a large number of campaigns and to easily add or remove the models supporting each campaign.
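Kedro provides pipeline and node abstractions for this; the library-free sketch below only illustrates the config-driven idea, where adding or removing a campaign amounts to editing a list. The campaign names and step functions are hypothetical:

```python
def make_campaign_steps(campaign):
    """Build the per-campaign steps (train + score) from a campaign name.
    Each step is a (name, callable) pair, mimicking a modular sub-pipeline."""
    return [
        (f"train_{campaign}", lambda data: f"model_{campaign}"),
        (f"score_{campaign}", lambda model: f"scores_{campaign}"),
    ]

# Hypothetical campaign list: the only place edited to add/remove a campaign
campaigns = ["data_booster", "roaming_pack"]
pipeline = [step for c in campaigns for step in make_campaign_steps(c)]
print([name for name, _ in pipeline])
```

In the real project the same pattern is expressed with Kedro’s modular pipelines, so each campaign’s models plug into shared data-engineering and scoring stages.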

We’ve trained our ML models to identify and prioritise customers with the highest CATE.

With customer-level models and KPI-level granularity, selecting campaigns for each customer was not as simple as matching scored customers to their best predictions.

Firstly, due to business requirements, certain A/B tests were run only on subsegments of the population. To avoid unjustified extrapolation, the scoring pipeline tracked eligibility rules for each campaign.
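A minimal sketch of such eligibility tracking; the campaign names and rules are hypothetical:

```python
# Hypothetical eligibility rules per campaign: only score customers who match
# the subsegment the campaign's A/B test actually covered.
ELIGIBILITY = {
    "data_booster": lambda c: c["plan"] == "prepaid",
    "roaming_pack": lambda c: c["tenure_months"] >= 6,
}

def eligible_campaigns(customer):
    """Campaigns this customer may be scored for; predictions for any other
    campaign would be extrapolation beyond the tested population."""
    return [name for name, rule in ELIGIBILITY.items() if rule(customer)]

customer = {"plan": "prepaid", "tenure_months": 3}
print(eligible_campaigns(customer))  # -> ['data_booster']
```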

Secondly, each campaign score was produced by a different model. A calibration procedure accounted for the differing quality of the models: a model’s prediction was conservatively replaced by the observed out-of-sample net uplift of the corresponding prediction group, averaged over many bootstrap samples of the population.
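One way such a calibration can be sketched. The binning scheme, bootstrap count and holdout data below are illustrative assumptions, not the project’s exact procedure:

```python
import random
from statistics import mean

def bin_index(pred, edges):
    """Index of the prediction bin; edges are ascending cut points."""
    for i, edge in enumerate(edges):
        if pred <= edge:
            return i
    return len(edges)

def calibrate(predictions, holdout, edges, n_bootstrap=200, seed=0):
    """Replace each raw prediction with the observed out-of-sample net uplift
    of its prediction bin, averaged over bootstrap resamples of a holdout.
    holdout: (prediction, in_target, observed_kpi) tuples."""
    rng = random.Random(seed)
    n_bins = len(edges) + 1
    per_bin = [[] for _ in range(n_bins)]
    for _ in range(n_bootstrap):
        sample = [rng.choice(holdout) for _ in holdout]
        for b in range(n_bins):
            t = [y for p, tgt, y in sample if bin_index(p, edges) == b and tgt]
            c = [y for p, tgt, y in sample if bin_index(p, edges) == b and not tgt]
            if t and c:  # skip resamples where a bin lacks one of the arms
                per_bin[b].append(mean(t) - mean(c))
    bin_value = [mean(v) if v else 0.0 for v in per_bin]
    return [bin_value[bin_index(p, edges)] for p in predictions]

# Hypothetical holdout: low-score bin has ~0 true uplift, high-score bin ~3
holdout = [(0.2, True, 5.0), (0.2, False, 5.0), (0.3, True, 4.0), (0.3, False, 4.0),
           (0.8, True, 10.0), (0.8, False, 7.0), (0.9, True, 9.0), (0.9, False, 6.0)]
calibrated = calibrate([0.1, 0.95], holdout, edges=[0.5])
print(calibrated)
```

Because the calibrated value comes from observed outcomes rather than the raw model output, scores from models of different quality become comparable on a common scale.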

Finally, because there is a limit on the number of treatments that can be executed at the same time, it was important for the telco to provide strict prioritisation rules across two levels:

  • Prioritisation between KPIs: which business objectives take priority in campaigning (e.g. customer spend vs specific service/product adoption)
  • Prioritisation within KPIs: the direct output of the models. A thresholding mechanism was added to deprioritise treatments with low expected uplift
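The two-level prioritisation can be sketched as follows; the KPI ordering, threshold value and campaign names are hypothetical:

```python
KPI_PRIORITY = {"churn_reduction": 0, "revenue": 1}  # lower = higher priority
UPLIFT_THRESHOLD = 0.5  # deprioritise treatments with low expected uplift

def select_campaign(candidates):
    """candidates: (campaign, kpi, expected_uplift) tuples for one customer.
    Apply the threshold, then rank by KPI priority first, uplift second."""
    viable = [c for c in candidates if c[2] >= UPLIFT_THRESHOLD]
    if not viable:
        return None
    return min(viable, key=lambda c: (KPI_PRIORITY[c[1]], -c[2]))[0]

candidates = [
    ("roaming_pack", "revenue", 2.0),
    ("loyalty_offer", "churn_reduction", 0.8),
    ("data_booster", "churn_reduction", 0.3),  # dropped: below threshold
]
print(select_campaign(candidates))  # -> loyalty_offer
```

Note that the churn-reduction offer wins despite its smaller uplift, because the between-KPI ordering dominates the within-KPI ranking.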

The resulting pipeline included whitelisting and prioritisation rules. It also automatically created control groups for measuring the impact of the entire system, plus further control groups suitable for model re-training. To apply the custom ML models in a timely manner despite the large customer base, a PySpark cluster was deployed to run the locally trained models at scale.
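Automatic control-group creation can be sketched with deterministic hash-based assignment, so each customer’s group is stable across scoring runs. The group shares and the salt (a hypothetical versioning knob to rotate groups) are illustrative assumptions:

```python
import hashlib

def assign_group(customer_id, control_share=0.1, retrain_share=0.1, salt="v1"):
    """Deterministically carve out a system-level control group (to measure
    the engine's overall impact) and a retraining holdout from the base.
    Hashing makes the split stable and reproducible across runs."""
    digest = hashlib.sha256(f"{salt}:{customer_id}".encode()).hexdigest()
    u = int(digest[:8], 16) / 0xFFFFFFFF  # pseudo-uniform value in [0, 1]
    if u < control_share:
        return "system_control"
    if u < control_share + retrain_share:
        return "retrain_holdout"
    return "targeted"

print(assign_group("customer_001"))
```

In a distributed setting the same function can be applied per row, e.g. inside a PySpark UDF, without any shared state.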

Maintaining experimentation-driven campaigning across the use case significantly increased the campaigns’ net effect. An initial 15% uplift was effectively doubled by the model-based prioritisation. Depending on the priority KPIs, such engines can increase revenue per user, decrease churn, boost satisfaction and increase user engagement.

Impact seen at the client organisation — the effect identified in A/B testing is amplified through better AI-driven customer prioritisation.

This new tool sparked a sea change in how the telco’s marketing team operates across all campaigns. Any new product idea is now first tested in a small-scale pilot, using statistically significant target and control groups. The results of the experiment are then fed to the AI model, which assesses the treatment impact for each customer, driving a department culture of consistently validating and improving initiatives before they are put into the field.

The traditional approach to modelling (ideation → experimental pilot → model training) has become the focal point of the overall process.

The adoption of this experimental culture is arguably a far more significant result of the project than the initial impressive performance figures. The most experienced client marketing team now approaches every project with the belief that even the highest-performing campaign can be improved — and this mindset is sure to spread throughout the organisation, generating a constant cycle of AI-driven optimisation.

An advanced analytics firm operating at the intersection of strategy, technology and design. www.quantumblack.com @quantumblack