Impact of Population Substructure on Trend Tests for Genetic Case–Control Association Studies
Gang Zheng (Corresponding Author)
Office of Biostatistics Research, DPPS, National Heart, Lung and Blood Institute, 6701 Rockledge Drive, MSC 7913, Bethesda, Maryland 20892-7913, U.S.A.
Zhaohai Li
Department of Statistics, George Washington University, Washington DC 20052, U.S.A.
Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, 6120 Executive Blvd, EPS, Bethesda, Maryland 20892, U.S.A.
Mitchell H. Gail
Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, 6120 Executive Blvd, EPS, Bethesda, Maryland 20892, U.S.A.
Joseph L. Gastwirth
Department of Statistics, George Washington University, Washington DC 20052, U.S.A.
Abstract
Summary: Hidden population substructure in case–control data has the potential to distort the performance of Cochran–Armitage trend tests (CATTs) for genetic associations. Three possible scenarios that may arise are investigated here: (i) heterogeneity of genotype frequencies across unidentified subpopulations (PSI), (ii) heterogeneity of genotype frequencies and disease risk across unidentified subpopulations (PSII), and (iii) cryptic correlations within unidentified subpopulations. A unified approach is presented for deriving the bias and variance distortion under the three scenarios for any CATT in a general family. Using these analytical formulas, we evaluate the excess type I errors of the CATTs numerically in the presence of population substructure. Our results provide insight into the properties of some proposed corrections for bias and variance distortion and show why they may not fully correct for the effects of population substructure.
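For readers unfamiliar with the statistic under study, the following is a minimal sketch of a standard Cochran–Armitage trend test for a single biallelic marker in a homogeneous sample, that is, with no correction for substructure. It is not taken from the paper: the function name `catt_z` and the example counts are hypothetical, and the score vector (0, x, 1) is assumed to follow the usual convention in which x = 0, 0.5, and 1 correspond to recessive, additive, and dominant models.

```python
import math

def catt_z(cases, controls, x=0.5):
    """Cochran-Armitage trend statistic Z with genotype scores (0, x, 1).

    cases, controls: counts for genotypes with 0, 1, and 2 risk alleles.
    Hypothetical helper for illustration; assumes a homogeneous population.
    """
    scores = (0.0, x, 1.0)
    R, S = sum(cases), sum(controls)          # total cases, total controls
    N = R + S
    n = [r + s for r, s in zip(cases, controls)]  # genotype totals

    sum_xr = sum(w * r for w, r in zip(scores, cases))
    sum_xn = sum(w * m for w, m in zip(scores, n))
    sum_x2n = sum(w * w * m for w, m in zip(scores, n))

    # Numerator: T = sum_i x_i (S r_i - R s_i) = N * sum(x r) - R * sum(x n)
    T = N * sum_xr - R * sum_xn
    # Null variance of T, conditioning on the table margins
    var_T = (R * S / N) * (N * sum_x2n - sum_xn ** 2)
    return T / math.sqrt(var_T)

def two_sided_p(z):
    """Two-sided p-value from the standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Made-up counts, for illustration only
z = catt_z(cases=(10, 20, 30), controls=(30, 20, 10), x=0.5)
p = two_sided_p(z)
```

Under the null hypothesis and without substructure, Z is approximately standard normal. The scenarios described in the summary concern how hidden substructure can shift the mean of this statistic (bias) or inflate its variance, so that the nominal p-value computed here may no longer be valid.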