Volume 72, Issue 2 pp. 629-638
BIOMETRIC PRACTICE

A unified set-based test with adaptive filtering for gene–environment interaction analyses

Qianying Liu

Sanofi, Cambridge, Massachusetts 02139, U.S.A.

Lin S. Chen

Corresponding Author

Department of Public Health Sciences, The University of Chicago, Chicago, Illinois 60637, U.S.A.

Dan L. Nicolae

Department of Medicine, Statistics, and Human Genetics, The University of Chicago, Chicago, Illinois 60637, U.S.A.

Brandon L. Pierce

Department of Public Health Sciences, The University of Chicago, Chicago, Illinois 60637, U.S.A.

Comprehensive Cancer Center, The University of Chicago, Chicago, Illinois 60637, U.S.A.

First published: 23 October 2015

Summary

In genome-wide gene–environment interaction (G×E) studies, a common strategy to improve power is to first conduct a filtering test and retain only the SNPs that pass the filter for the subsequent G×E analyses. Inspired by two-stage tests and gene-based tests in G×E analysis, we consider the general problem of jointly testing a set of parameters when only a few truly come from the alternative hypothesis and when filtering information is available. We propose a unified set-based test that simultaneously considers filtering on individual parameters and testing on the set. We derive the exact distribution and approximate the power function of the proposed unified statistic in simplified settings, and use them to adaptively calculate the optimal filtering threshold for each set. In the context of gene-based G×E analysis, we show that although the empirical power function may be affected by many factors, the optimal filtering threshold corresponding to the peak of the power curve primarily depends on the size of the gene. We further propose a resampling algorithm to calculate P-values for each gene given the estimated optimal filtering threshold. The performance of the method is evaluated in simulation studies and illustrated via a genome-wide gene–gender interaction analysis using pancreatic cancer genome-wide association data.
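
To make the idea of filtering individual parameters and then testing the set more concrete, the following is a minimal illustrative sketch and not the unified statistic or resampling algorithm of the paper, whose exact forms are not given in this summary. It assumes per-SNP interaction z-scores (`z_int`) and filtering z-scores (`z_filt`) are already available, that SNPs within a gene are independent, that filtering and interaction statistics are independent under the null, and that the filtering threshold is supplied rather than adaptively optimized. All function and variable names are hypothetical.

```python
import numpy as np


def filtered_set_statistic(z_int, z_filt, threshold):
    """Sum of squared interaction z-scores over SNPs that pass the filter.

    A SNP passes when the absolute value of its filtering z-score
    exceeds `threshold` (an illustrative choice of filter).
    """
    z_int = np.asarray(z_int, dtype=float)
    z_filt = np.asarray(z_filt, dtype=float)
    keep = np.abs(z_filt) > threshold
    return float(np.sum(z_int[keep] ** 2))


def resampling_pvalue(z_int, z_filt, threshold, n_resamples=10000, seed=0):
    """Monte Carlo P-value for the filtered set statistic.

    Assumption: under the null, interaction z-scores are i.i.d. N(0, 1)
    and independent of the filtering statistics, which are held fixed.
    """
    z_int = np.asarray(z_int, dtype=float)
    z_filt = np.asarray(z_filt, dtype=float)
    rng = np.random.default_rng(seed)

    observed = filtered_set_statistic(z_int, z_filt, threshold)
    keep = np.abs(z_filt) > threshold

    # Draw null interaction z-scores and recompute the statistic per draw.
    null_z = rng.standard_normal((n_resamples, z_int.size))
    null_stats = np.sum(null_z[:, keep] ** 2, axis=1)

    # Add-one correction keeps the estimated P-value strictly positive.
    return (1 + np.sum(null_stats >= observed)) / (1 + n_resamples)


# Toy gene with four SNPs; the values are made up for illustration only.
z_int_example = np.array([3.0, 0.1, -0.2, 2.5])
z_filt_example = np.array([2.8, 0.3, 0.1, 3.1])
p_example = resampling_pvalue(z_int_example, z_filt_example, threshold=2.0)
```

In this toy setting, raising the threshold shrinks the set of SNPs entering the statistic, which is the trade-off the adaptive threshold in the summary is meant to balance. If no SNP passes the filter, the sketch returns a P-value of 1.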
