Diagnostic Tests, Evaluation of
Colin B. Begg
Memorial Sloan Kettering Cancer Center, New York, NY, USA
First published: 15 July 2005
Abstract
This entry reviews issues in the evaluation of individual diagnostic tests, or the comparison of two alternative tests. Measures of test accuracy are described, and the receiver operating characteristic (ROC) curve method of analysis is mentioned. Various biases that may affect the evaluation of a diagnostic test are reviewed. Parametric and nonparametric methods to compare two tests are indicated, and issues concerning the design of studies to evaluate diagnostic tests are summarized.