Volume 35, Issue 3 e12250
ORIGINAL ARTICLE

Clustering short temporal behaviour sequences for customer segmentation using LDA

Jobin Wilson

Corresponding Author

R&D Department, Flytxt, Trivandrum, India

Correspondence

Jobin Wilson, R&D Department, Flytxt, Trivandrum, India.

Santanu Chaudhury

Department of Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, India

Brejesh Lall

Department of Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, India

First published: 07 November 2017
Citations: 7

Abstract

Customer segmentation based on the temporal variation of subscriber preferences is useful for communication service providers (CSPs) in applications such as targeted campaign design, churn prediction, and fraud detection. Traditional clustering algorithms are inadequate in this context because a multidimensional feature vector represents a subscriber profile only at a single instant of time, whereas grouping subscribers requires considering how their preferences vary across time. Clustering in this case usually requires complex models based on multivariate time series analysis. Because conventional time series clustering models are limited in scalability and in their ability to accurately represent the temporal behaviour sequences (TBS) of users, which may be short, noisy, and non-stationary, we propose a latent Dirichlet allocation (LDA) based model that represents the temporal behaviour of mobile subscribers as compact and interpretable profiles. Our model exploits the structural regularity within the observable data corresponding to a large number of user profiles and relaxes the strict temporal ordering of user preferences in TBS clustering. We use mean-shift clustering to segment subscribers based on their discovered profiles. Further, we mine segment-specific association rules from the discovered TBS clusters to aid marketers in designing intelligent campaigns that match segment preferences. Our experiments on real-world data collected from a popular Asian communication service provider gave encouraging results.
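To make two of the ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows (a) how relaxing the strict temporal ordering turns a TBS into an order-free profile, and (b) how mean-shift groups subscribers with similar profiles without fixing the number of segments in advance. Everything named here is an assumption for illustration: the behaviour vocabulary `VOCAB`, the toy `sequences`, the `bandwidth` value, and the use of a flat kernel. In particular, plain normalised behaviour counts stand in for the LDA-derived profiles, which a topic-model implementation would supply in practice.

```python
from collections import Counter
from math import dist

# Hypothetical behaviour vocabulary and toy sequences (not from the paper).
VOCAB = ["voice", "sms", "data"]
sequences = {
    "u1": ["voice", "voice", "sms", "voice"],
    "u2": ["voice", "sms", "voice", "voice"],  # same behaviours as u1, reordered
    "u3": ["data", "data", "data", "sms"],
    "u4": ["data", "sms", "data", "data"],
    "u5": ["voice", "voice", "voice", "voice"],
}

def profile(sequence):
    """Order-free profile: relative frequency of each behaviour.

    Stand-in for an LDA topic mixture; ordering within the sequence is ignored.
    """
    counts = Counter(sequence)
    return tuple(counts[b] / len(sequence) for b in VOCAB)

def mean_shift(points, bandwidth, max_iter=50, tol=1e-6):
    """Flat-kernel mean shift. Returns (label per point, list of cluster centres)."""
    modes = []
    for start in points:
        x = start
        for _ in range(max_iter):
            # All points inside the window of radius `bandwidth` around x.
            neighbours = [q for q in points if dist(x, q) <= bandwidth]
            if not neighbours:
                break
            # Shift x to the mean of its neighbours.
            shifted = tuple(sum(c) / len(neighbours) for c in zip(*neighbours))
            if dist(shifted, x) < tol:
                break
            x = shifted
        modes.append(x)

    # Points whose modes lie within one bandwidth of each other share a cluster.
    centres, labels = [], []
    for m in modes:
        for i, c in enumerate(centres):
            if dist(m, c) <= bandwidth:
                labels.append(i)
                break
        else:
            centres.append(m)
            labels.append(len(centres) - 1)
    return labels, centres

profiles = [profile(seq) for seq in sequences.values()]
labels, centres = mean_shift(profiles, bandwidth=0.5)
print(dict(zip(sequences, labels)))
# {'u1': 0, 'u2': 0, 'u3': 1, 'u4': 1, 'u5': 0}
```

In this toy run, `u1` and `u2` receive identical profiles because only the mix of behaviours matters, not their order, and the voice-heavy and data-heavy subscribers fall into separate segments. The bandwidth controls how coarse the segmentation is and would need tuning on real profiles.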
