MARKOV DECISION PROCESSES WITH UNCERTAIN TRANSITION RATES: SENSITIVITY AND MAX HYPHEN MIN CONTROL
Suresh Kalyanasundaram
Motorola India Electronics Limited, No. 66/1, Plot 5, Bagmane Techpark, C. V. Raman Nagar Post, Bangalore 560 093, India.
Suresh Kalyanasundaram received his Bachelors degree in Electrical and Electronics Engineering and Masters degree in Physics from Birla Institute of Technology and Science, Pilani, India in 1996. He received his Ph.D. from the School of Electrical and Computer Engineering, Purdue University, in May 2000. Since then he has been with Motorola, working in the area of performance analysis of wireless networks.
Search for more papers by this authorEdwin K. P. Chong
Department of Electrical and Computer Engineering, Colorado State University, Fort Collins, CO 80523-1373, USA.
Professor Edwin K. P. Chong received the B.E.(Hons.) degree with First Class Honors from the University of Adelaide, South Australia, in 1987, graduating top of his class; and the M.A. and Ph.D. degrees in 1989 and 1991, respectively, both from Princeton University, where he held an IBM Fellowship. He joined the School of Electrical and Computer Engineering at Purdue University in 1991, where he was named a University Faculty Scholar in 1999, and promoted to Full Professor in 2001. Since August 2001, he has been a Professor of Electrical and Computer Engineering, and Professor of Mathematics, at Colorado State University. His current interests are in communication networks and optimization methods. He coauthored the best-selling book, An Introduction to Optimization, 2nd Edition, Wiley-Interscience, 2001. He received the NSF CAREER Award in 1995 and the ASEE Frederick Emmons Terman Award in 1998. He coauthored a paper that was awarded Best Paper in the journal Computer Networks, 2003. Professor Chong is a Fellow of the IEEE. He was founding chairman of the IEEE Control Systems Society Technical Committee on Discrete Event Systems, and, until recently, served as an IEEE Control Systems Society Distinguished Lecturer. He has been on the editorial board of the IEEE Transactions on Automatic Control. He is currently on the editorial board of the journal Computer Networks. He has also served on the organizing committees of several international conferences. He has been on the program committees for the IEEE Conference on Decision and Control, the American Control Conference, the IEEE International Symposium on Intelligent Control, IEEE Symposium on Computers and Communications, and the IEEE Global Telecommunications Conference. He has also served in the executive committees for the IEEE Conference on Decision and Control, the American Control Conference, the IEEE Annual Computer Communications Workshop, the International Conference on Industrial Electronics, Technology & Automation, and the IEEE International Conference on Communications. He was the Conference (General) Chair for the Conference on Modeling and Design of Wireless Networks, part of SPIE ITCom 2001.
Search for more papers by this authorNess B. Shroff
School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907-1285, USA.
Ness B. Shroff received his Ph.D. degree from Columbia University, NY in 1994. He is currently an Associate Professor in the School of Electrical and Computer Engineering at Purdue University. His research interests span the areas of wireless and wireline communication networks. He is especially interested in fundamental problems in the design, performance, scheduling, reliability, capacity, pricing, and control of these networks. He has been invited to give tutorials and technical talks at several conferences, workshops, and university and industrial seminars. His work on wireless resource allocation has also received attention from the national and international media. His research is funded by various companies such as Intel, Hewlett Packard, Nortel, AT&T, and L.G. Electronics; and government agencies such as the National Science Foundation, DARPA, Indiana dept. of Transportation, and the Indiana 21st Century fund. Dr. Shroff is an editor for the IEEE/ACM Trans, on Networking and the Computer Networks Journal, and past editor of IEEE Communications Letters. He was the Technical Program co-chair for IEEE INFOCOM'03 (San Francisco, CA), the panel co-chair for ACM Mobicom'02 (Atlanta, GA), program co-chair for the symposium on high-speed networks, Globecom 2001 (San Francisco, CA), and conference chair for the 14th Annual IEEE Computer Communications Workshop (Estes Park, CO). He was also the co-organizer of the NSF Workshop on “Fundamental Research in Networking,” in April 2003. He received the NSF Career award in 1996 and the best paper of the year (2003) award from the Computer Networks journal.
Search for more papers by this authorSuresh Kalyanasundaram
Motorola India Electronics Limited, No. 66/1, Plot 5, Bagmane Techpark, C. V. Raman Nagar Post, Bangalore 560 093, India.
Suresh Kalyanasundaram received his Bachelors degree in Electrical and Electronics Engineering and Masters degree in Physics from Birla Institute of Technology and Science, Pilani, India in 1996. He received his Ph.D. from the School of Electrical and Computer Engineering, Purdue University, in May 2000. Since then he has been with Motorola, working in the area of performance analysis of wireless networks.
Search for more papers by this authorEdwin K. P. Chong
Department of Electrical and Computer Engineering, Colorado State University, Fort Collins, CO 80523-1373, USA.
Professor Edwin K. P. Chong received the B.E.(Hons.) degree with First Class Honors from the University of Adelaide, South Australia, in 1987, graduating top of his class; and the M.A. and Ph.D. degrees in 1989 and 1991, respectively, both from Princeton University, where he held an IBM Fellowship. He joined the School of Electrical and Computer Engineering at Purdue University in 1991, where he was named a University Faculty Scholar in 1999, and promoted to Full Professor in 2001. Since August 2001, he has been a Professor of Electrical and Computer Engineering, and Professor of Mathematics, at Colorado State University. His current interests are in communication networks and optimization methods. He coauthored the best-selling book, An Introduction to Optimization, 2nd Edition, Wiley-Interscience, 2001. He received the NSF CAREER Award in 1995 and the ASEE Frederick Emmons Terman Award in 1998. He coauthored a paper that was awarded Best Paper in the journal Computer Networks, 2003. Professor Chong is a Fellow of the IEEE. He was founding chairman of the IEEE Control Systems Society Technical Committee on Discrete Event Systems, and, until recently, served as an IEEE Control Systems Society Distinguished Lecturer. He has been on the editorial board of the IEEE Transactions on Automatic Control. He is currently on the editorial board of the journal Computer Networks. He has also served on the organizing committees of several international conferences. He has been on the program committees for the IEEE Conference on Decision and Control, the American Control Conference, the IEEE International Symposium on Intelligent Control, IEEE Symposium on Computers and Communications, and the IEEE Global Telecommunications Conference. He has also served in the executive committees for the IEEE Conference on Decision and Control, the American Control Conference, the IEEE Annual Computer Communications Workshop, the International Conference on Industrial Electronics, Technology & Automation, and the IEEE International Conference on Communications. He was the Conference (General) Chair for the Conference on Modeling and Design of Wireless Networks, part of SPIE ITCom 2001.
Search for more papers by this authorNess B. Shroff
School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907-1285, USA.
Ness B. Shroff received his Ph.D. degree from Columbia University, NY in 1994. He is currently an Associate Professor in the School of Electrical and Computer Engineering at Purdue University. His research interests span the areas of wireless and wireline communication networks. He is especially interested in fundamental problems in the design, performance, scheduling, reliability, capacity, pricing, and control of these networks. He has been invited to give tutorials and technical talks at several conferences, workshops, and university and industrial seminars. His work on wireless resource allocation has also received attention from the national and international media. His research is funded by various companies such as Intel, Hewlett Packard, Nortel, AT&T, and L.G. Electronics; and government agencies such as the National Science Foundation, DARPA, Indiana dept. of Transportation, and the Indiana 21st Century fund. Dr. Shroff is an editor for the IEEE/ACM Trans, on Networking and the Computer Networks Journal, and past editor of IEEE Communications Letters. He was the Technical Program co-chair for IEEE INFOCOM'03 (San Francisco, CA), the panel co-chair for ACM Mobicom'02 (Atlanta, GA), program co-chair for the symposium on high-speed networks, Globecom 2001 (San Francisco, CA), and conference chair for the 14th Annual IEEE Computer Communications Workshop (Estes Park, CO). He was also the co-organizer of the NSF Workshop on “Fundamental Research in Networking,” in April 2003. He received the NSF Career award in 1996 and the best paper of the year (2003) award from the Computer Networks journal.
Search for more papers by this authorABSTRACT
Solution techniques for Markov decision problems rely on exact knowledge of the transition rates, which may be difficult or impossible to obtain. In this paper, we consider Markov decision problems with uncertain transition rates represented as compact sets. We first consider the problem of sensitivity analysis where the aim is to quantify the range of uncertainty of the average per-unit-time reward given the range of uncertainty of the transition rates. We then develop solution techniques for the problem of obtaining the max-min optimal policy, which maximizes the worst-case average per-unit-time reward. In each of these problems, we distinguish between systems that can have their transition rates chosen independently and those where the transition rates depend on each other. Our solution techniques are applicable to Markov decision processes with fixed but unknown transition rates and to those with time-varying transition rates.
REFERENCES
- 1 Mine, H. and S. Osaki, Markovian Decision Processes, American Elsevier Publishing Company Inc., Amsterdam (1970).
- 2 Ross, S. M., Applied Probability Models with Optimization Applications, Holden-Day, San Francisco (1970).
- 3
Puterman, M. L., Markov Decision Processes: Discrete Stochastic Dynamic Programming, John Wiley and Sons, New York (1994).
10.1002/9780470316887 Google Scholar
- 4 Bertsekas, D. P. and J. N. Tsitsiklis, Neuro-dynamic Programming, Athena Scientific, Nashua, NH (1996).
- 5 Ross, K. W. and D. H. K. Tsang, “Optimal Circuit Access Policies in An ISDN Environment: A Markov Decision Approach,” IEEE Trans. Commun., Vol. 37, No. 9, pp. 934–939 (1989).
- 6 Hyman, J. M., A. A. Lazar, and G. Pacifici, “A Separation Principle Between Scheduling and Admission Control for Broadband Switching,” IEEE J. Selected Areas Commun., Vol. 11, No. 4, pp. 605–616 (1993).
- 7 Ramjee, R., R. Nagarajan, and D. Towsley, “On Optimal Call Admission Control in Cellular Networks,” Wireless Networks, Vol. 3, No. 1, pp. 29–41 (1997).
- 8 Satia, J. K. and R. E. Lave, Jr., “Markovian Decision Processes with Uncertain Transition Probabilities,” Oper. Res., Vol. 21, pp. 728–740 (1973).
- 9 White, C. C. and H. K. Eldeib, “Markov Decision Processes with Imprecise Transition Probabilities,” Oper. Res., Vol. 43, pp. 739–749 (1994).
- 10 Givan, R. S. Leach, and T. Dean, “Bounded Parameter Markov Decision Processes,” Artif. Intell., Vol. 122, No. 1–2, pp. 71–109 (2000). Also available from http:dynamo.ecn.purdue.edu-givan.
- 11 Derman, C., Finite State Markovian Decision Processes, Academic Press, New York (1970).
- 12 Bertsekas, D. P., Dynamic Programming and Optimal Control, Vol. I and II, Athena Scientific, Nashua, NH (1995).
- 13 Sato, M., K. Abe, and H. Takeda, “An Asymptotically Optimal Learning Controller for Finite Markov Chains with Unknown Transition Probabilities,” IEEE Trans. Automat. Contr., Vol. 30, No. 11, pp. 1147–1149 (1985).
- 14 Sato, M., K. Abe, and H. Takeda, “Learning Control of Finite Markov Chains with Unknown Transition Probabilities,” IEEE Trans. Automat. Contr., Vol. 27, pp. 502–505 (1982).
- 15 Ross, K. W., “Randomized and Past-Dependent Policies for Markov Decision Processes with Multiple Constraints,” Oper. Res., Vol. 37, No. 3, pp. 474–477 (1989).
- 16 Cao, X.-R. and H.-T. Fang, “ Gradient-Based Policy Iteration: An Example,” in Proc. 41st Int. Conf. Decis. Contr., Las Vegas, Nevada, pp. 3367–3371 (2002).
- 17 Chong, E. K. P. and S. H. Zak, An Introduction to Optimization, John Wiley and Sons, New York (1996).
- 18 Cinlar, E., Introduction to Stochastic Processes, Prentice-Hall, New Jersey (1974).
- 19 Yasuda, M., “Semi-Markov Decision Processes with Countable State Space and Compact Action Space,” Bull. Math. Statistics, Vol. 18, pp. 35–54 (1978).
- 20 Schweitzer, P. J., “On Undiscounted Markovian Decision Process with Compact Action Spaces,” Oper. Res. (RAIRO), Vol. 19, No. 1, pp. 71–86 (1985).