Customer Lifecycle Effects for a Better CLV Model (Series: Part 2)

This blog on the use of customer lifecycle effects in CLV models is the second in a series of articles in which I share a little “inside baseball” on how our data science team here at Theta approaches CLV calculations. Our approach goes beyond what the open-source software models offer.

Written by Ethan Anderson, Data Scientist, Theta | March 2023

Introduction

While customer lifetime value (CLV) models predict how much a business’s customers will be worth over their lifetimes, they’re also part of a storytelling exercise. They’re used to convey the customer journey from a customer’s first use of your business to the last. It’s a story for which you control most of the narrative. 

After all, a lot happens throughout the course of a customer’s interactions with your company. The experience you provide, as well as things like promotions, discounts, anniversary specials and more, all can affect those interactions and, ultimately, customer churn and revenue, both important components of CLV.

But there are also things that can affect customers’ behaviors that you can’t control. Among them are customer lifecycle effects: behaviors associated with specific points in customers’ lifecycles. CLV models with customer lifecycle covariates will outperform the off-the-shelf models every time.

The CLV Model Matters

Modeling a customer’s journey to accurately convey the customer’s story depends largely on the CLV model you use. There are many of them, and as noted in a previous blog, not all CLV models are the same. They can have varying levels of complexity and accuracy, ranging from a crude heuristic model to one that employs sophisticated predictive analytics and/or machine learning. 

Those that lean towards the complex side and allow for incorporating a variety of factors (what we would call covariates) enable you to tell a more accurate and compelling story. That’s particularly the case when those covariates include customer lifecycle effects, enabling the models to capture more granular customer behavior influences.

Why CLV Models Need Covariates

A lot of things can happen throughout the customer journey. Customer behavior can change for any number of reasons, and distinct patterns can often be identified. 

For example, purchasing patterns for an individual customer may consistently repeat over a period of time around major life events or annual occasions such as birthdays, anniversaries or vacations. Purchase behaviors may also increase or decrease at certain times of the year, often coinciding with specific holidays or seasons. 

Then there are the external shocks that can affect buying behavior, which could range from interest rate hikes to natural disasters. The type of business can make a difference as well: a contractual or subscription-based business behaves differently from a non-contractual or non-subscription business.

Over the years, various approaches and models have been developed to handle some of these issues. In addition to numerical optimizations that will be discussed in future articles, Theta addresses them through the careful introduction of covariates to our models. The custom covariates we have developed and used over the years get right to the root of customer behavior. They ensure our forecasts are built on the business’s true nature rather than on naive adjustments fitted to tracking curves.

Types of Covariates

Covariates introduce information about a customer’s behavior to a model, as well as the business-level forces that may influence it. They allow a model to better capture the systematic variability of customers and the effects they have on CLV. There are two basic types of covariates:

  • Time-invariant covariates, which include customer characteristics that don’t change over time, such as demographics.
  • Time-varying covariates, which include influences on customers that change over time, such as marketing efforts or seasonal patterns.

Time-varying covariates are particularly important for many businesses because there are often observable changes in purchase behavior at certain times of year. An obvious example is the increase in purchases made prior to Christmas or the increase in purchases of certain types of outdoor equipment at the beginning of summer.
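To make the distinction concrete, here is a minimal Python sketch of how the two covariate types might be laid out, with one row per customer per period. The function names, the “paid” acquisition channel and the November/December holiday window are illustrative assumptions, not Theta’s actual implementation.

```python
from datetime import date


def is_holiday_season(period_start: date) -> int:
    """Time-varying covariate: 1 for November/December periods, else 0.

    The two-month window is an assumed definition for illustration.
    """
    return 1 if period_start.month in (11, 12) else 0


def covariate_rows(customer_id, acquisition_channel, period_starts):
    """Build one covariate row per period for a single customer."""
    return [
        {
            "customer_id": customer_id,
            "period_start": p,
            # Time-invariant: the same value in every row for this customer.
            "channel_is_paid": 1 if acquisition_channel == "paid" else 0,
            # Time-varying: depends on the calendar period of the row.
            "holiday_season": is_holiday_season(p),
        }
        for p in period_starts
    ]


periods = [date(2022, 10, 1), date(2022, 11, 1), date(2022, 12, 1), date(2023, 1, 1)]
rows = covariate_rows("c1", "paid", periods)
```

The time-invariant column repeats down the rows, while the time-varying column switches on only for the holiday periods.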

However, there are other time-related influences on customer behavior that don’t get captured by these time-varying covariates. These include customer lifecycle effects such as tenure and the anniversary date of becoming a customer.

The Tenure Effect

There are different stages in the customer lifecycle. The variability in a particular customer’s purchase and spend behavior while “alive” (that is, still an active customer) is often associated with where that customer is in the lifecycle. These effects happen in parallel with, but separately from, how customer churn affects retention over a customer’s lifecycle.

Before introducing covariates to CLV models to account for effects such as changes in customers’ behavior with their tenure, it’s important to have business-level insight from a client. At Theta, we want to understand the nature of the business so that we can carefully identify the underlying behavior of the customer base.

Shortly after acquisition, for example, customers may just dip their toes in the water with a business, spending less (e.g., by using a discount) as they get to know the company and how well its products or services cater to their needs. As they continue their relationship with the business, however, they may be willing to spend more as it becomes clearer that the company’s products or services are a good match for their needs. Tenure could also be associated with a customer being open to cross-selling or up-selling efforts, or becoming an “influencer” or brand ambassador.
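As a rough illustration, tenure can be turned into a covariate by counting whole months since acquisition. The sketch below is one simple way to do that; the whole-month convention and the log transform (so that early months move the covariate more than later ones) are assumptions made for the example, not a description of Theta’s models.

```python
import math
from datetime import date


def tenure_months(acquired: date, as_of: date) -> int:
    """Whole months elapsed between acquisition and the as-of date."""
    months = (as_of.year - acquired.year) * 12 + (as_of.month - acquired.month)
    if as_of.day < acquired.day:
        months -= 1  # the current month is not yet complete
    return max(months, 0)  # dates before acquisition count as zero tenure


def tenure_covariate(acquired: date, as_of: date) -> float:
    """Assumed transform: log(1 + tenure), which equals 0.0 at acquisition."""
    return math.log1p(tenure_months(acquired, as_of))


acquired = date(2022, 1, 15)
example_tenure = tenure_months(acquired, date(2022, 4, 14))  # one day short of three months
```

Because tenure is measured from each customer’s own acquisition date, two customers observed in the same calendar month can carry very different values of this covariate.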

The Anniversary Effect

Anniversary dates, such as when an individual first became a customer with a company, can also affect purchase behavior. Case in point: looking at customer data may show a pattern in which customers are likely to increase purchases at or shortly after their anniversary date due to special promotions or marketing efforts. 

This effect is tied not to a specific calendar season but to when the customer was acquired. An excellent example can be pulled from subscription-based businesses. When grouping customers by similar acquisition dates, you’ll see regular upward bumps in repeat purchases at intervals equal to the subscription length.
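An anniversary effect of this kind can be encoded as an indicator that switches on at each multiple of the subscription length. The following Python sketch assumes tenure is already measured in whole months; the function name, the default 12-month cycle and the optional trailing window are hypothetical choices for illustration.

```python
def anniversary_flag(tenure_months: int, cycle_months: int = 12, window: int = 0) -> int:
    """Return 1 at (or within `window` months after) each anniversary, else 0.

    `cycle_months` is the assumed subscription length; `window` widens the
    flag to cover purchases made shortly after the anniversary.
    """
    if tenure_months < cycle_months:
        return 0  # the first anniversary has not been reached yet
    return 1 if tenure_months % cycle_months <= window else 0


# A quarterly subscription observed over the first seven months of tenure.
quarterly_flags = [anniversary_flag(t, cycle_months=3) for t in range(7)]
```

Unlike a seasonal flag, this indicator lines up across customers only when they share an acquisition date, which is why the bumps show up most clearly in cohort views.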

Where Many Models Fail

Say you’re analyzing a company and notice that its customers exhibit a pattern of steadily increasing spend following their acquisition. After a certain period of time post-acquisition, the behavior gradually falls back to the baseline. Using the best publicly available software packages, you could fit steadily increasing time-varying covariates to your spend model that end as the actuals fall off, or introduce a machine learning approach filled with covariates that will likely be more of a black box.

The problem is that this seemingly temporary customer behavior would be incredibly expensive and slow to estimate in this way with common, open-source probabilistic spend models. Both the probabilistic and the machine learning (ML) approaches described above are naive to the true underlying behavior of the customers of interest.

Additionally, using heavy-handed coercion could mislead the model as it tries to disentangle temporary behaviors from the true nature of more long-term behaviors of customers. This makes it difficult to find a trustworthy baseline to project into the future.
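For contrast, the rise-and-fall pattern described above can be expressed as a single covariate indexed by tenure rather than by calendar time. The sketch below is purely illustrative: the bump shape, the six-month peak, the log-linear link to spend and all names are assumptions chosen for the example, not the method used in any particular package or in Theta’s models.

```python
import math


def lifecycle_bump(tenure_months: float, peak_month: float = 6.0) -> float:
    """Assumed shape: 0 at acquisition, 1.0 at `peak_month`, decaying toward 0."""
    if tenure_months <= 0:
        return 0.0
    x = tenure_months / peak_month
    return x * math.exp(1.0 - x)


def expected_spend(baseline: float, beta: float, tenure_months: float) -> float:
    """Log-linear link: baseline spend scaled by exp(beta * covariate)."""
    return baseline * math.exp(beta * lifecycle_bump(tenure_months))


# With beta > 0, spend rises above the baseline, peaks, and then returns to it.
curve = [expected_spend(50.0, 0.2, t) for t in (0, 3, 6, 12, 600)]
```

In this setup a single coefficient, beta, governs the size of the temporary lift, and the long-run baseline stays a separate quantity that can be projected forward.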

Advances in CLV Models

When covariates are added to a CLV model, they add both mathematical and computational complexity. With complexity comes computational expense. Until work completed within the last two years, even the theory for introducing time-varying covariates into “buy till you die” (BTYD) models had not been fully worked out. That’s largely why most software packages used for CLV calculations still don’t incorporate time-invariant and time-varying covariates, much less customer lifecycle effects.

Theta co-founders, Peter Fader and Daniel McCarthy, along with our team of data scientists, have been at the forefront of developing more robust CLV modeling methods. Our approach at Theta takes into account the intricacies of customers’ behavior patterns and the businesses they’re interacting with over the course of their customer lifetimes. 

In addition to incorporating time-varying and time-invariant covariates and numerical optimizations to greatly increase the speed of our models, we work with clients to identify relevant business-level data related to customer lifecycle effects. We know what to look for, and pay attention to patterns or instances of changing behavior over customers’ lifetimes.

Our CLV models are built to incorporate customer lifecycle covariates. This helps us more fully understand the potential variations in behaviors over customers’ lifetimes and predict how they will ultimately affect CLV. 

This is important. As noted in our previous blog, if we neglect any of the customer-centric details, we risk producing CLV estimates that misrepresent individual customers or entire populations. That’s why night and day doesn’t even begin to describe the difference in what Theta is doing now with CLV compared to most of its competitors. We’re able to get into all the nitty-gritty details that really provide insight into how a customer behaves, and how that customer is likely to behave in the future.

In future blogs in this series, we’ll discuss other covariate types, including demand shocks and seasonality.