The effect of misclassification on the estimation of association: a review

Michael Höfler

Corresponding Author

[email protected].

Max Planck Institute of Psychiatry, Clinical Psychology and Epidemiology, Munich, Germany

Clinical Psychology and Epidemiology, Max Planck Institute of Psychiatry, Kraepelinstr. 2-10, 80804 München, Germany.

First published: 06 January 2006

https://doi.org/10.1002/mpr.20

Citations: 77

Abstract

Misclassification, the erroneous measurement of one or several categorical variables, is a major concern in many scientific fields and particularly in psychiatric research. Even in rather simple scenarios, unless the misclassification probabilities are very small, a major bias can arise in estimating the degree of association assessed with common measures like the risk ratio and the odds ratio. Only in very special cases — for example, if misclassification takes place solely in one of two binary variables and is independent of the other variable (‘non-differential misclassification’) — is it guaranteed that the estimates are biased towards the null value (which is 1 for the risk ratio and the odds ratio). Furthermore, misclassification, if ignored, usually leads to confidence intervals that are too narrow. This paper reviews consequences of misclassification. A numerical example demonstrates the problem's magnitude for the estimation of the risk ratio in the easy case where misclassification takes place in the exposure variable, but not in the outcome. Moreover, uncertainty about misclassification can broaden the confidence intervals dramatically. The best way to overcome misclassification is to avoid it by design, but some statistical methods are useful for reducing bias if misclassification cannot be avoided. Copyright © 2005 Whurr Publishers Ltd.
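The attenuation described in the abstract can be illustrated with a minimal sketch. It assumes a single binary exposure that is misclassified non-differentially (with the same sensitivity and specificity whatever the outcome), an outcome measured without error, and a classification that is better than chance (sensitivity + specificity > 1). The function name and all input values are hypothetical choices for illustration; they are not the numerical example from the paper.

```python
def observed_risk_ratio(prevalence, risk_exposed, risk_unexposed,
                        sensitivity, specificity):
    """Expected risk ratio when exposure is misclassified non-differentially.

    prevalence      : true probability of exposure
    risk_exposed    : outcome risk among the truly exposed
    risk_unexposed  : outcome risk among the truly unexposed
    sensitivity     : P(classified exposed | truly exposed)
    specificity     : P(classified unexposed | truly unexposed)
    """
    # Probability of being classified as exposed: true positives + false positives.
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    p_classified_exposed = true_pos + false_pos

    # Probability of being classified as unexposed: false negatives + true negatives.
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    p_classified_unexposed = false_neg + true_neg

    # Outcome risk within each *observed* exposure group is a mixture of the
    # two true risks, weighted by who actually ends up in that group.
    risk_obs_exposed = (true_pos * risk_exposed
                        + false_pos * risk_unexposed) / p_classified_exposed
    risk_obs_unexposed = (false_neg * risk_exposed
                          + true_neg * risk_unexposed) / p_classified_unexposed

    return risk_obs_exposed / risk_obs_unexposed


# Hypothetical inputs: true risk ratio = 0.2 / 0.1 = 2.
true_rr = 0.2 / 0.1
rr_obs = observed_risk_ratio(prevalence=0.2, risk_exposed=0.2,
                             risk_unexposed=0.1,
                             sensitivity=0.8, specificity=0.9)
print(f"true RR = {true_rr:.2f}, expected observed RR = {rr_obs:.2f}")
```

With these assumed inputs the expected observed risk ratio lies between the null value 1 and the true value 2, which is the bias towards the null that the abstract describes. The sketch covers only this special case; it says nothing about differential misclassification or about errors in both variables, where the direction of the bias is not guaranteed.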


Volume 14, Issue 2

June 2005

Pages 92-101
