Customer insights

This research helped retail banks to gain customer insights and generate revenue using statistical modelling.


Retail banks sell a range of complex financial products and services, from mortgages and loans to insurance, to a large number of customers whose profiles vary widely. 

A key challenge for banks is to ensure that they market their products and services efficiently, targeting the appropriate type of client with the right product at the right time. The banks also need to be able to forecast the revenues that might be generated by a customer.

Marketing uses the concept of ‘customer lifetime value’ (CLV): an estimate of which products and services a customer might buy and how much revenue that customer might generate over a period of time. The most widely used statistical CLV models are unsuitable for retail banks because they are not versatile enough to handle the heterogeneity and multidimensionality of the banks' datasets.

A retail bank’s multi-service financial environment is characterised by considerable diversity in customer purchasing behaviour, by interdependence between purchasing decisions, and by the fact that many product-level relationships are contractual and long-term. Retail banks also want to predict not only future purchase decisions but also their volumes.

Research impact

Research at Leeds resulted in the development of a new statistical model to forecast customer behaviour in the retail banking sector. 

The model has been incorporated into the decision-making processes of Yorkshire Bank, which uses its forecasts to direct appropriate products and services to its customers.

Yorkshire Bank estimates that incorporation of the new forecasting tool into its customer relationship management system resulted in an increase in profits of more than £20 million over the period 2009–12.

Underpinning research

To address the unique challenges that the complexity of the products and the customer base poses for the retail banking sector, Leeds researchers developed an adaptive segmentation approach to estimate CLV.

For a given customer, a group of other customers with similar characteristics (e.g. current and past customer revenues, age, tenure with the company, credit score, and current and past product holdings) is identified. Using historical data for this group, the customer's future behaviour and profitability are predicted, typically for the next quarter.
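The idea of predicting from a group of similar customers can be illustrated with a minimal sketch. It assumes features already scaled to comparable ranges, a Euclidean distance, and a fixed group size `k`; none of these choices, nor the function names and toy data, come from the Leeds model itself.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors (an assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def predict_next_quarter(target, history, k=3):
    """Average the observed next-quarter revenue of the k most similar customers.

    history is a list of (features, next_quarter_revenue) pairs.
    """
    ranked = sorted(history, key=lambda rec: distance(target, rec[0]))
    neighbours = ranked[:k]
    return sum(revenue for _, revenue in neighbours) / len(neighbours)

# Hypothetical, pre-scaled features: (past revenue, tenure, credit score).
history = [
    ([0.20, 0.50, 0.10], 120.0),
    ([0.30, 0.40, 0.20], 150.0),
    ([0.90, 0.90, 0.80], 900.0),
    ([0.25, 0.45, 0.15], 135.0),
]
forecast = predict_next_quarter([0.25, 0.50, 0.15], history, k=3)
print(forecast)
```

The dissimilar high-revenue customer is left out of the group, so it does not distort the forecast for the target customer.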

Longer-term predictions of customer behaviour and potential for generating revenue are complicated by the fact that a customer’s profile changes over time: the customer gets older, can move house, change job, buy new products, and so forth.

To account for this fluidity, longer-term forecasts are derived by combining this estimation technique with forecasts of how the customer's characteristics will change as a result of product purchases or changes in financial circumstances. Identification of suitable 'neighbourhoods' is a key aspect of this approach.
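One way to picture the combination of short-term estimation with forecasts of profile change is a simple roll-forward loop, sketched below. The two callables, `predict_quarter` and `evolve_profile`, are hypothetical stand-ins for the neighbourhood-based estimator and the profile-change forecast; the toy rules used here are illustrative assumptions only.

```python
def forecast_horizon(profile, predict_quarter, evolve_profile, quarters=4):
    """Alternate a one-quarter revenue forecast with an update of the customer profile."""
    revenues = []
    for _ in range(quarters):
        revenues.append(predict_quarter(profile))
        profile = evolve_profile(profile)
    return revenues

# Toy stand-ins (assumptions): revenue depends on product count and age,
# and the only profile change is that the customer ages by a quarter of a year.
def toy_predict(profile):
    return 100.0 * profile["products"] + profile["age"]

def toy_evolve(profile):
    return {"age": profile["age"] + 0.25, "products": profile["products"]}

revenues = forecast_horizon({"age": 40.0, "products": 1}, toy_predict, toy_evolve)
print(revenues)
```

Each quarter's forecast is made from the profile as it is expected to stand at that point, rather than from the profile observed today.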

Customers are segmented into neighbourhoods populated by people with statistically similar characteristics. These neighbourhoods have to be small enough to ensure customer homogeneity within a local segment yet sufficiently large to ensure the robustness of the forecasts of future behaviour. 

This segmentation is achieved by applying a similarity measure, which is defined over the predictive variable space.
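The trade-off between homogeneity and robustness can be sketched as a neighbourhood whose radius adapts to the local density of customers. The sketch assumes a Euclidean similarity measure and a simple grow-until-large-enough rule; the parameter names (`min_size`, `start_radius`, `growth`, `max_radius`) are hypothetical and not taken from the research.

```python
import math

def distance(a, b):
    """Euclidean distance over the predictive variables (an assumed similarity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def adaptive_neighbourhood(target, customers, min_size=3,
                           start_radius=0.1, growth=1.5, max_radius=2.0):
    """Grow the radius (growth must exceed 1) until the neighbourhood holds
    min_size customers or the radius reaches max_radius."""
    radius = start_radius
    while True:
        members = [c for c in customers if distance(target, c) <= radius]
        if len(members) >= min_size or radius >= max_radius:
            return members, radius
        radius = min(radius * growth, max_radius)

customers = [[0.0, 0.0], [0.05, 0.0], [0.3, 0.0], [1.0, 1.0]]
members, radius = adaptive_neighbourhood([0.0, 0.0], customers)
print(len(members), radius)
```

In a densely populated region the loop stops at a small radius, keeping the segment homogeneous. In a sparse region the radius widens until enough customers are included to support a forecast.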

The adaptive-segmentation approach has several advantages over existing methods in a retail-banking context. First, it is independent of the variable distributions and works with different correlation structures, which matters because most variables in this commercial context are not normally distributed and exhibit strong non-linear correlation and autoregressive effects. Second, all information contained in the distribution of customers' future behaviour is preserved. Third, the model works with partial information and missing variable values, which is especially important given imperfections in company data-collection processes and when predicting future revenue for new-to-bank customers, for whom only partial information is initially available.
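Two of these properties, working with missing values and preserving the distribution of outcomes, can be illustrated with a short sketch. It assumes missing values are marked with `None`, compares customers only on the features known for both, and returns the neighbours' outcomes in full rather than a single average; these are illustrative choices, not the published method.

```python
import math

def partial_distance(a, b):
    """Root-mean-square difference over the features known for both customers."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf  # nothing to compare on
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def neighbour_outcomes(target, history, k=3):
    """Return the sorted next-quarter revenues of the k most similar customers."""
    ranked = sorted(history, key=lambda rec: partial_distance(target, rec[0]))
    return sorted(revenue for _, revenue in ranked[:k])

# Hypothetical data; None marks a value the bank does not hold.
history = [
    ([0.20, 0.50, 0.10], 120.0),
    ([0.35, None, 0.20], 150.0),
    ([0.90, 0.90, 0.80], 900.0),
    ([0.30, 0.40, None], 135.0),
]
new_customer = [0.30, None, None]  # new to the bank: only one feature known
outcomes = neighbour_outcomes(new_customer, history, k=3)
print(outcomes)
```

Because the full set of neighbour outcomes is kept, quantiles or other summaries of the spread can be read off later instead of being collapsed into one point estimate.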