A multivariate Poisson mixture model for marketing applications
Tom Brijs
Department of Economics, Limburgs Universitair Centrum, Universitaire Campus, B-3590 Diepenbeek, Belgium
Dimitris Karlis
Department of Statistics, Athens University of Economics, 76 Patision Str., 10434 Athens, Greece
Gilbert Swinnen
Department of Economics, Limburgs Universitair Centrum, Universitaire Campus, B-3590 Diepenbeek, Belgium
Koen Vanhoof
Department of Economics, Limburgs Universitair Centrum, Universitaire Campus, B-3590 Diepenbeek, Belgium
Geert Wets
Department of Economics, Limburgs Universitair Centrum, Universitaire Campus, B-3590 Diepenbeek, Belgium
Puneet Manchanda
Graduate School of Business, University of Chicago, 1101 East 58th Street, Chicago, IL 60637, USA
Abstract
This paper describes a multivariate Poisson mixture model for clustering supermarket shoppers based on their purchase frequency in a set of product categories. The multivariate nature of the model accounts for cross-selling effects between the purchases made in different product categories. However, for computational reasons, most multivariate approaches limit the covariance structure by including just one common interaction term, or by not including any covariance at all. Although this reduces the number of parameters significantly, it is often too simplistic as typically multiple interactions exist on different levels. This paper proposes a theoretically more complete variance/covariance structure of the multivariate Poisson model, based on domain knowledge or preliminary statistical analysis of significant purchase interaction effects in the data. Consequently, the model does not contain more parameters than necessary, whilst still accounting for the existing covariance in the data. Practically, retail category managers can use the model to devise customized merchandising strategies.
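The restricted covariance idea can be illustrated with a minimal sketch. It assumes the standard common-term construction of the multivariate Poisson, in which each category count is an independent Poisson term plus shared Poisson terms for the category pairs that interact. The function names, rates, and the choice of interacting pair are hypothetical and picked only for illustration; the sketch covers a single segment and does not implement the mixture or its estimation.

```python
import math
import random


def poisson_draw(rate, rng):
    """Draw one Poisson(rate) variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_purchases(rates, pair_rates, n_customers, seed=0):
    """Simulate counts X_i = Y_i + sum of shared terms Y_ij involving category i.

    rates      -- independent Poisson rate per category (illustrative values)
    pair_rates -- {(i, j): rate} only for pairs assumed to interact
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_customers):
        counts = [poisson_draw(r, rng) for r in rates]
        for (i, j), r in pair_rates.items():
            shared = poisson_draw(r, rng)  # common term adds to both categories
            counts[i] += shared
            counts[j] += shared
        rows.append(counts)
    return rows


def covariance(rows, i, j):
    """Sample covariance between the counts of categories i and j."""
    n = len(rows)
    mean_i = sum(r[i] for r in rows) / n
    mean_j = sum(r[j] for r in rows) / n
    return sum((r[i] - mean_i) * (r[j] - mean_j) for r in rows) / (n - 1)


# Three categories; only categories 0 and 1 are assumed to interact.
rows = simulate_purchases([1.0, 2.0, 1.5], {(0, 1): 0.8}, n_customers=20000)
cov_01 = covariance(rows, 0, 1)  # should be close to the shared rate 0.8
cov_02 = covariance(rows, 0, 2)  # should be close to 0 (no shared term)
```

Under this construction the covariance between two categories equals the rate of their shared term, so leaving a pair out of `pair_rates` fixes its covariance at zero and saves one parameter.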