Volume 20, Issue 17-18, pp. 2723-2739
Research Article

Computational tools for exact conditional logistic regression

Chris Corcoran (Corresponding Author)

Department of Mathematics and Statistics, Utah State University, 3900 Old Main Hill, Logan, Utah 84322-3900, U.S.A.

Cyrus Mehta

Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, Massachusetts 02115, U.S.A.

Cytel Software Corporation, 675 Massachusetts Avenue, Cambridge, MA 02319, U.S.A.

Nitin Patel

Cytel Software Corporation, 675 Massachusetts Avenue, Cambridge, MA 02319, U.S.A.

Pralay Senchaudhuri

Cytel Software Corporation, 675 Massachusetts Avenue, Cambridge, MA 02319, U.S.A.
First published: 14 August 2001

Abstract

Logistic regression analyses are often challenged by the inability of unconditional likelihood-based approximations to yield consistent, valid estimates and p-values for model parameters. This can be due to sparseness or separability in the data. Conditional logistic regression, though useful in such situations, can also be computationally infeasible when the sample size or the number of explanatory covariates is large. We review recent developments that allow efficient approximate conditional inference, including Monte Carlo sampling and saddlepoint approximations. We demonstrate through real examples that these methods enable the analysis of substantially larger and more complex data sets. We find that, for these moderately large data sets, Monte Carlo sampling appears to be the better alternative, as it provides unbiased estimates of the exact results and can be executed in less CPU time than the single saddlepoint approximation. Moreover, the double saddlepoint approximation, while computationally the easiest to obtain, offers little practical advantage: it produces unreliable results and cannot be computed when a maximum likelihood solution does not exist. Copyright © 2001 John Wiley & Sons, Ltd.
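To give a sense of the Monte Carlo idea discussed above, the sketch below estimates an exact conditional p-value for a single binary covariate by sampling from the permutation distribution of its sufficient statistic, conditioning on the total number of successes (the sufficient statistic for the intercept). This is only a minimal illustration on made-up toy data; it is not the sampling algorithm or software evaluated in the article, and the variable names and data are assumptions introduced here for demonstration.

```python
# Minimal sketch: Monte Carlo estimate of an exact conditional p-value for a
# single binary covariate in a logistic model. Conditioning on the total
# number of successes (sufficient statistic for the intercept) means that,
# under the null hypothesis, permuting the response vector samples uniformly
# from the conditional reference set. Toy data only; not from the article.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sparse data: binary covariate x and binary response y.
x = np.array([1, 1, 1, 0, 0, 0, 0, 0, 1, 0])
y = np.array([1, 1, 0, 0, 0, 1, 0, 0, 1, 0])

# Observed sufficient statistic for the regression coefficient of x.
t_obs = int(np.sum(x * y))

# Sample the conditional null distribution by permuting the responses.
n_sims = 50_000
t_sim = np.array([np.sum(x * rng.permutation(y)) for _ in range(n_sims)])

# One-sided Monte Carlo p-value (add-one correction keeps the test valid).
p_value = (1 + np.sum(t_sim >= t_obs)) / (1 + n_sims)
print(f"observed T = {t_obs}, Monte Carlo exact p-value ~ {p_value:.4f}")
```

In a stratified (matched-set) analysis the same idea applies, except that responses would be permuted within strata so that each stratum's success total, the sufficient statistic for its nuisance parameter, is held fixed.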
