Improvement of expectation–maximization algorithm for phase-type distributions with grouped and truncated data
Corresponding Author
Hiroyuki Okamura
Department of Information Engineering, Graduate School of Engineering, Hiroshima University, 1-4-1 Kagamiyama, Higashi-Hiroshima, 739-8527 Japan
Correspondence to: Hiroyuki Okamura, Department of Information Engineering, Graduate School of Engineering, Hiroshima University, 1-4-1 Kagamiyama, Higashi-Hiroshima, 739-8527 Japan.
E-mail: [email protected]
Search for more papers by this authorTadashi Dohi
Department of Information Engineering, Graduate School of Engineering, Hiroshima University, 1-4-1 Kagamiyama, Higashi-Hiroshima, 739-8527 Japan
Search for more papers by this authorKishor S. Trivedi
Department of Electrical and Computer Engineering, Duke University, Hudson Hall, Durham, NC, 27707 USA
Search for more papers by this authorCorresponding Author
Hiroyuki Okamura
Department of Information Engineering, Graduate School of Engineering, Hiroshima University, 1-4-1 Kagamiyama, Higashi-Hiroshima, 739-8527 Japan
Correspondence to: Hiroyuki Okamura, Department of Information Engineering, Graduate School of Engineering, Hiroshima University, 1-4-1 Kagamiyama, Higashi-Hiroshima, 739-8527 Japan.
E-mail: [email protected]
Search for more papers by this authorTadashi Dohi
Department of Information Engineering, Graduate School of Engineering, Hiroshima University, 1-4-1 Kagamiyama, Higashi-Hiroshima, 739-8527 Japan
Search for more papers by this authorKishor S. Trivedi
Department of Electrical and Computer Engineering, Duke University, Hudson Hall, Durham, NC, 27707 USA
Search for more papers by this authorAbstract
This paper proposes an improved expectation–maximization (EM) algorithm for phase-type (PH) distributions with grouped and truncated data. Olsson (1996) derived an EM algorithm for PH distributions under censored data, and the similar technique can be utilized to the PH fitting even under grouped and truncated data. However, it should be noted that Olsson's algorithm has a drawback in terms of computation speed. Because the time complexity of the algorithm is a cube of number of phases, it does not work well in the case where the number of phases is large. This paper proposes an improvement of the EM algorithm under grouped and truncated observations. By applying a uniformization-based technique for continuous-time Markov chains, it is shown that the time complexity of our algorithm can be reduced to the square of number of phases. In particular, when we consider the PH fitting using a canonical form of PH distributions, the time complexity is linear in the number of phases. Copyright © 2012 John Wiley & Sons, Ltd.
References
- 1
Bobbio A,
Telek M. A benchmark for PH estimation algorithms: result for acyclic-PH. Stochastic Models 1994; 10: 661–677.
10.1080/15326349408807315 Google Scholar
- 2 Osogami T, Harchol-Balter M. Closed form solutions for mapping general distributions to minimal PH distributions. Performance Evaluation 2006; 63(6): 524–552.
- 3 Bobbio A, Horváth A, Telek M. Matching three moments with minimal acyclic phase type distributions. Stochastic Models 2005; 21(2/3): 303–326.
- 4 Bobbio A, Horváth A, Scarpa M, Telek M. Acyclic discrete phase type distributions: properties and a parameter estimation algorithm. Performance Evaluation 2003; 54: 1–32.
- 5 Telek M, Heindl A. Matching moments for acyclic discrete and continuous phase-type distributions of second order. Int'l Journal of Simulation Systems, Science & Technology 2002; 3(3–4): 47–57.
- 6
Bosch PMV,
Dietz C,
Pohl EA. Moment matching using a family of phase-type distributions. Stochastic Models 2000; 16: 391–398.
10.1080/15326340008807595 Google Scholar
- 7
Johnson MA. Selecting parameters of phase distributions: combining nonlinear programming, heuristics, and Erlang distributions. ORSA Journal on Computing 1993; 5: 69–83.
10.1287/ijoc.5.1.69 Google Scholar
- 8
Johnson MA,
Taaffe MR. Matching moments to phase distributions: density function shapes. Stochastic Models 1990; 6: 283–306.
10.1080/15326349908807148 Google Scholar
- 9
Johnson MA,
Taaffe MR. Matching moments to phase distributions: mixtures of Erlang distribution of common order. Stochastic Models 1989; 5: 711–743.
10.1080/15326348908807131 Google Scholar
- 10
van der Heijden MC. On the three-moment approximation of a general distribution by a Coxian distribution. Probability in the Engineering and Informational Sciences 1988; 2: 257–261.
10.1017/S0269964800000772 Google Scholar
- 11 van de Liefvoort A. The moment problem for continuous distributions. Technical report, University of Missouri, WP-CM-1990-02, Kansas City, 1990.
- 12 Telek M, Horváth G. A minimal representation of Markov arrival processes and a moments matching method. Performance Evaluation 2007; 64(9–12): 1153–1168.
- 13 Hansen LP. Large sample properties of generalized method of moments estimators. Econometrica 1982; 50(4): 1029–1054.
- 14 Panchenko A, Thümmler A. Efficient phase-type fitting with aggregated traffic traces. Performance Evaluation 2007; 64: 629–645.
- 15 Thümmler A, Buchholz P, Telek M. A novel approach for phase-type fitting with the EM algorithm. IEEE Transactions on Dependable and Secure Computing 2006; 3(3): 245–258.
- 16 Bobbio A, Cumani A. ML estimation of the parameters of a PH distribution in triangular canonical form. In Computer Performance Evaluation, G Balbo, G Serazzi (eds). Elsevier Science Publishers: Amsterdam, 1992; 33–46.
- 17 Asmussen S, Nerman O, Olsson M. Fitting phase-type distributions via the EM algorithm. Scandinavian Journal of Statistics 1996; 23(4): 419–441.
- 18
Dempster AP,
Laird NM,
Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B 1977; 39: 1–38.
10.1111/j.2517-6161.1977.tb01600.x Google Scholar
- 19 Okamura H, Dohi T, Trivedi KS. A refined EM algorithm for PH distributions. Performance Evaluation 2011; 68: 938–954.
- 20 Olsson M. Estimation of phase-type distributions from censored data. Scandinavian Journal of Statistics 1996; 23: 443–460.
- 21 Neuts MF. Matrix-Geometric Solutions in Stochastic Models. Johns Hopkins University Press: Baltimore, 1981.
- 22 Cumani A. On the canonical representation of homogeneous Markov processes modelling failure-time distributions. Microelectronics and Reliability 1982; 22: 583–602.
- 23 He Q-H, Zhang H. An algorithm for computing minimal Coxian representations. INFORMS Journal on Computing 2008; 20(2): 179–190.
- 24 Wu CFJ. On the convergence properties of the EM algorithm. Annals of Statistics 1983; 11: 95–103.
- 25 McLachlan GJ, Jones PN. Fitting mixture models to grouped and truncated data via the EM algorithm. Biometrics 1988; 44(2): 571–578.
- 26 Reibman AL, Trivedi KS. Numerical transient analysis of Markov models. Computers and Operations Research 1988; 15: 19–36.
- 27 Malhotra M, Muppala J, Trivedi KS. Stiffness-tolerant methods for transient analysis of stiff Markov chains. Microelectronics and Reliability 1994; 34(11): 1825–1841.
- 28 Okamura H, Dohi T. Faster maximum likelihood estimation algorithms for Markovian arrival processes. Proceedings of 6th International Conference on Quantitative Evaluation of Systems (QEST2009), Budapest, Hungary, 2009; 73–82.
- 29 Okamura H, Dohi T, Trivedi KS. Markovian arrival process parameter estimation with group data. IEEE/ACM Transactions on Networking 2009; 17(4): 1326–1339.
- 30 Scott DW. On optimal and data-based histograms. Biometrika 1979; 66(3): 605–610.
- 31 Kullback S. Information Theory and Statistics. John Wiley & Sons: New York, 1959.
- 32
Bladt M,
Gonzalez A,
Lauritzen SL. The estimation of phase-type related functionals using Markov chain Monte Carlo methods. Scandinavian Actuarial Journal 2003; 4: 280–300.
10.1080/03461230110106435 Google Scholar
- 33 Ausín MC, Lillo RE, Ruggeri F, Wiper MP. Bayesian modelling of hospital bed occupancy times using a mixed generalised Erlang distribution. In Bayesian Statistics, Vol. 7, JM Bernardo, MJ Bayarri, JO Berger, AP David, D Heckerman, AFM Smith, M West (eds). Oxford University Press, Oxford, 2003; 443–451.
- 34 Ausín MC, Wiper MP, Lillo RE. Bayesian estimation for the M/G/1 queue using a phase type approximation. Journal of Statistical Planning and Inference 2004; 118: 83–101.
- 35 Ausín MC, Wiper MP, Lillo RE. Bayesian prediction of the transient behaviour and busy period in short- and long-tailed GI/G/1 queueing systems. Computational Statistics & Data Analysis 2008; 52: 1615–1635.
- 36 McGrory CA, Pettitt AN, Faddy MJ. A fully Bayesian approach to inference for Coxian phase-type distributions with covariate dependent mean. Computational Statistics & Data Analysis 2009; 53: 4311–4321.
Citing Literature
Special Issue:Advanced Reliability and Maintenance Modeling (APARM 2010)
March/April 2013
Pages 141-156