How to Model Your Customers' Lifetime Value

It is often claimed that recruiting a new customer costs 5 to 10 times more than retaining an existing one. The annual churn rate in the telecommunications industry is estimated at 30 to 35 percent on average, and in many other industries customer retention has become even more important than customer acquisition. Some researchers have found that a 5% increase in customer retention can increase company profitability by 25% to 85%.

In marketing, Customer Lifetime Value (CLV), also called Lifetime Value (LTV), is a key metric: a prediction of the net profit from the entire future relationship with a customer. Knowing the lifetime value of your customers helps you:

  • Segment your customers and develop and deliver segment-specific marketing treatments.
  • Define your return on investment.
  • Forecast customer satisfaction.
  • Innovate and optimize marketing tools, tactics and channels.
  • Adjust communication campaigns and messages.
  • Conduct profitable loyalty programs.
  • Cross-sell and up-sell based on individual patterns of buying.

There are many ways to calculate the lifetime value of customers, such as naive approaches, RFM, Markov chains, hazard functions, survival regressions, machine learning approaches and distribution-based approaches. In this post, we briefly explain the Recency Frequency Monetary (RFM) approach as well as distribution-based approaches.

RFM Method 

The most popular approach for measuring customers' lifetime value is RFM. RFM refers to a modelling technique that uses the following three factors from client records:

  • Recency: Period since last purchase.
  • Frequency: How many purchases an individual made during the observation period.
  • Monetary: Cumulative total spent by client during observation period.
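
As a rough illustration of the three factors above, the following sketch derives them from a raw transaction log using only the Python standard library. The function name `rfm_table`, the tuple layout of the log and the sample transactions are all assumptions made for this example; they do not come from Bruce Hardie's spreadsheet or any particular library.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, observation_end):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer_id, purchase_date, amount) tuples.
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, purchase_date, amount in transactions:
        if purchase_date > observation_end:
            continue  # ignore purchases outside the observation period
        frequency[customer_id] += 1
        monetary[customer_id] += amount
        if customer_id not in last_purchase or purchase_date > last_purchase[customer_id]:
            last_purchase[customer_id] = purchase_date
    return {
        cid: ((observation_end - last_purchase[cid]).days, frequency[cid], monetary[cid])
        for cid in last_purchase
    }


# Hypothetical transaction log, for illustration only
transactions = [
    ("A", date(2017, 1, 5), 20.0),
    ("A", date(2017, 3, 1), 35.0),
    ("B", date(2017, 2, 10), 50.0),
]
rfm = rfm_table(transactions, observation_end=date(2017, 3, 31))
for cid, (recency, freq, money) in sorted(rfm.items()):
    print(cid, recency, freq, money)
```

Here recency is measured in days up to the end of the observation period; any other time unit would work equally well as long as it is applied consistently.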

Bruce Hardie developed an easy-to-use spreadsheet that implements RFM.

Customers can be grouped by their RFM values with clustering algorithms, which are a form of unsupervised machine learning. A popular way to choose the number of clusters is Hartigan’s Rule, which “essentially compares the ratio of the within-cluster sum of squares for a clustering with k clusters and one with k + 1 clusters, accounting for the number of rows and clusters. If that number is greater than 10, then it is worth using k + 1 clusters.” In the example below, we have two columns holding the frequency and recency of our customers and cluster them on these two factors using R:

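library(ggplot2)
# mydata: a data frame with numeric columns `frequency` and `recency`
set.seed(42)  # make the random cluster starts reproducible
mydataCluster <- kmeans(mydata[, c("frequency", "recency")], centers = 3, nstart = 20)
mydata$cluster <- as.factor(mydataCluster$cluster)
ggplot(mydata, aes(frequency, recency, color = cluster)) + geom_point()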

This code clusters our data into three groups and plots them.
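
To make Hartigan's Rule concrete, here is a minimal Python sketch of the statistic it compares against 10, written as (W_k / W_{k+1} - 1) × (n - k - 1), where W_k is the within-cluster sum of squares with k clusters and n is the number of rows. The helper names, the toy points and the hand-assigned labels are assumptions for illustration; in practice the labels would come from a clustering algorithm such as k-means.

```python
def within_cluster_ss(points, labels):
    """Total squared distance from each point to its cluster's centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        dims = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dims)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dims))
    return total


def hartigan_index(wss_k, wss_k_plus_1, n_rows, k):
    """Hartigan's statistic for moving from k to k + 1 clusters."""
    return (wss_k / wss_k_plus_1 - 1.0) * (n_rows - k - 1)


# Hypothetical data: three well-separated unit squares (12 points in total)
corners = [(0, 0), (0, 1), (1, 0), (1, 1)]
points = [(x + dx, y + dy) for dx, dy in [(0, 0), (10, 0), (0, 10)] for x, y in corners]

# Hand-assigned labels standing in for the output of a clustering algorithm
labels_2 = [0] * 4 + [1] * 8
labels_3 = [0] * 4 + [1] * 4 + [2] * 4
labels_4 = [0, 0, 3, 3] + [1] * 4 + [2] * 4

n = len(points)
w2 = within_cluster_ss(points, labels_2)
w3 = within_cluster_ss(points, labels_3)
w4 = within_cluster_ss(points, labels_4)

h2 = hartigan_index(w2, w3, n, 2)  # is a third cluster worth adding?
h3 = hartigan_index(w3, w4, n, 3)  # is a fourth cluster worth adding?
print("2 -> 3 clusters:", h2, "worth it" if h2 > 10 else "not worth it")
print("3 -> 4 clusters:", h3, "worth it" if h3 > 10 else "not worth it")
```

On this toy data the statistic is far above 10 when moving from two to three clusters and well below 10 when moving from three to four, so the rule would settle on three clusters.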


Distribution based Non-Contractual Models

An alternative to RFM is a more sophisticated approach: distribution-based non-contractual models. The three most popular stochastic models for calculating CLV are BG/NBD, BG/BB and Pareto/NBD. A comparative summary of these models is in the following table:


Bruce Hardie has a number of Excel spreadsheets and explanations for these models:

-BG/NBD Model

-BG/BB Model

-Pareto/NBD Model

An R package called BTYD can be used to fit these models.

In addition, Lifetimes is a Python library for calculating CLV. The example below shows how this package can be used to measure CLV, using the cdnow_customers.csv file located in the library's datasets/ directory.

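from lifetimes.datasets import load_cdnow
# Note: newer Lifetimes releases appear to expose this loader as load_cdnow_summary
data = load_cdnow(index_col=[0])  # index the rows by customer ID
print(data.head())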
     frequency   recency      T
1    2           30.43       38.86
2    1            1.71       38.86
3    0            0.00       38.86
4    0            0.00       38.86
5    0            0.00       38.86

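T represents the age of the customer in whatever time unit is chosen; in this example, the time unit is weeks. Here frequency counts repeat purchases only, and recency is the customer's age at their most recent purchase.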
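
Note that these columns are defined differently from the RFM factors earlier in this post. The following standard-library sketch shows how the three values could be derived for a single customer from a list of purchase dates. The function name `summarise_customer` and the sample dates are assumptions for illustration; the Lifetimes library ships its own helper for this step (`summary_data_from_transaction_data` in `lifetimes.utils`), which should be preferred in practice.

```python
from datetime import date


def summarise_customer(purchase_dates, observation_end, days_per_unit=7):
    """Return (frequency, recency, T) for one customer, in weeks by default."""
    # Several purchases on the same day count as a single purchase occasion
    dates = sorted(set(d for d in purchase_dates if d <= observation_end))
    first, last = dates[0], dates[-1]
    frequency = len(dates) - 1                          # repeat purchases only
    recency = (last - first).days / days_per_unit       # age at last purchase
    T = (observation_end - first).days / days_per_unit  # age at end of observation
    return frequency, round(recency, 2), round(T, 2)


# Hypothetical purchase histories, for illustration only
end = date(2017, 3, 12)
repeat_buyer = summarise_customer(
    [date(2017, 1, 1), date(2017, 1, 15), date(2017, 2, 5)], end
)
one_time_buyer = summarise_customer([date(2017, 2, 19)], end)
print(repeat_buyer)
print(one_time_buyer)
```

A customer with a single purchase therefore has a frequency and recency of zero, which matches customers 3 to 5 in the table above.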

This example uses the BG/NBD model; however, you can try different models by importing one of these: 'BetaGeoFitter', 'ParetoNBDFitter', 'GammaGammaFitter', 'ModifiedBetaGeoFitter', 'BetaGeoBetaBinomFitter'. (GammaGammaFitter models the monetary value of purchases rather than their number.)

We build the model:

from lifetimes import BetaGeoFitter
from lifetimes.datasets import load_cdnow

data = load_cdnow(index_col=[0])

# Fit the BG/NBD model to the frequency, recency and T columns
bgf = BetaGeoFitter()
bgf.fit(data['frequency'], data['recency'], data['T'])
print(bgf)

Assuming all is fine so far, we should have this output:

<lifetimes.BetaGeoFitter: fitted with 2357 subjects, r: 0.24, alpha: 4.41, a: 0.79, b: 2.43>

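To give a feel for what the four fitted parameters do, the sketch below evaluates the BG/NBD expression for the probability that a customer is still "alive", as given in Fader, Hardie and Lee's note on computing P(alive). It is a plain-Python illustration and not a call into the Lifetimes API; the function name `probability_alive` is an assumption, and the parameter values are simply copied from the output above.

```python
def probability_alive(frequency, recency, T, r, alpha, a, b):
    """P(alive) under BG/NBD for a customer with the given summary values."""
    if frequency == 0:
        # In BG/NBD a customer can only drop out right after a purchase,
        # so someone with no repeat purchases is still alive by assumption.
        return 1.0
    ratio = (a / (b + frequency - 1)) * ((alpha + T) / (alpha + recency)) ** (r + frequency)
    return 1.0 / (1.0 + ratio)


# Parameter values taken from the fitted model printed above
params = dict(r=0.24, alpha=4.41, a=0.79, b=2.43)

p_customer_1 = probability_alive(2, 30.43, 38.86, **params)  # customer 1 in the table
p_customer_3 = probability_alive(0, 0.00, 38.86, **params)   # customer 3 in the table
print(p_customer_1, p_customer_3)
```

The longer a repeat customer has gone without buying (T well beyond recency), the larger the ratio becomes and the lower the estimated probability that they are still active.
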
Now, we plot our model:

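import matplotlib.pyplot as plt
from lifetimes.plotting import plot_frequency_recency_matrix, plot_period_transactions

# Expected number of purchases for each customer over the next t weeks
t = 1
data['predicted_purchases'] = data.apply(lambda r: bgf.conditional_expected_number_of_purchases_up_to_time(t, r['frequency'], r['recency'], r['T']), axis=1)
plot_frequency_recency_matrix(bgf)  # expected purchases by frequency and recency
plot_period_transactions(bgf)       # observed vs. model-simulated repeat purchases
plt.show()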





In digital marketing, in addition to attribution modelling and analysing the effect of different strategies on your customers, you need to model your customers' lifetime value. In this post we have explained the most popular approaches for calculating CLV. Measuring lifetime value is easier said than done, and just about everyone is doing it wrong. This blog post can be a good start for getting on the right track. In upcoming posts, we will explain a number of problems that might arise with each model and ways to overcome them.

Getting ahead in analytics can be a challenge. Internetrix have a range of experienced experts to solve your analytics issues and optimise results and ROI. We are ready to help with Lifetimes in Python and more.
