Volume 75, Issue 1 pp. 13-23

BIOMETRIC METHODOLOGY

Adaptive elastic net for group testing

Karl B. Gregory (corresponding author), Department of Statistics, University of South Carolina, Columbia, South Carolina. ORCID: orcid.org/0000-0002-7611-0906

Dewei Wang (corresponding author), Department of Statistics, University of South Carolina, Columbia, South Carolina. ORCID: orcid.org/0000-0003-0822-8563

Christopher S. McMahan (corresponding author), Department of Mathematical Sciences, Clemson University, Clemson, South Carolina. ORCID: orcid.org/0000-0001-5056-9615

First published: 29 September 2018

https://doi.org/10.1111/biom.12973

Abstract

For disease screening, group (pooled) testing can be a cost-saving alternative to one-at-a-time testing, with savings realized by assaying pooled biospecimens (e.g., urine, blood, saliva). In many group testing settings, practitioners are also tasked with disease surveillance; that is, it is often of interest to relate individuals’ true disease statuses to covariate information via binary regression. Several authors have developed regression methods for group testing data, a task made challenging by imperfect testing: all testing outcomes (on pools and individuals) are subject to misclassification, and individuals’ true statuses are never observed. To further complicate matters, an individual may contribute to several testing outcomes. For analyzing such data, we provide a novel regression methodology that generalizes the aforementioned regression techniques and incorporates regularization. Specifically, for model fitting and variable selection, we propose an adaptive elastic net estimator under the logistic regression model that can be used to analyze data from any group testing strategy. We provide an efficient algorithm for computing the estimator, along with guidance on tuning parameter selection. Moreover, we establish the asymptotic properties of the proposed estimator and show that it possesses “oracle” properties. We evaluate the performance of the estimator through Monte Carlo studies and illustrate the methodology on a chlamydia data set from the State Hygienic Laboratory in Iowa City.
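
To make the two ingredients named in the abstract concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes the simplest setting (each pool is tested once, individual statuses are independent, and the assay's sensitivity and specificity are known and unaffected by pool size) and uses the textbook form of the adaptive elastic net penalty. The function names, the pilot estimate `beta_init`, and the tuning parameters `lam1`, `lam2`, and `gamma` are illustrative assumptions and do not come from the paper.

```python
import math


def logistic(x):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_positive_prob(individual_probs, sensitivity, specificity):
    """Probability that a pooled specimen tests positive under an imperfect assay.

    Assumes independent individual statuses and no dilution effect.
    """
    # The pool is truly negative only if every member is truly negative.
    p_all_negative = math.prod(1.0 - p for p in individual_probs)
    # A truly positive pool is detected with probability `sensitivity`;
    # a truly negative pool is falsely flagged with probability 1 - specificity.
    return sensitivity * (1.0 - p_all_negative) + (1.0 - specificity) * p_all_negative


def adaptive_elastic_net_penalty(beta, beta_init, lam1, lam2, gamma=1.0):
    """Weighted L1 penalty plus ridge penalty on the slope coefficients.

    `beta_init` is a hypothetical pilot estimate with non-zero entries;
    coefficients with small pilot estimates receive larger L1 weights.
    """
    weights = [abs(b0) ** (-gamma) for b0 in beta_init]
    l1_part = sum(w * abs(b) for w, b in zip(weights, beta))
    l2_part = sum(b * b for b in beta)
    return lam1 * l1_part + lam2 * l2_part


# Illustrative use: a pool of three individuals with one covariate each.
intercept, slope = -3.0, 0.8
covariates = [0.5, 1.2, -0.3]
probs = [logistic(intercept + slope * x) for x in covariates]
print(pool_positive_prob(probs, sensitivity=0.95, specificity=0.98))
```

A penalized fit would minimize the negative log-likelihood of the observed testing outcomes plus this penalty. Strategies in which individuals appear in several tests require a more elaborate likelihood than the single-pool probability shown here.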

Volume 75, Issue 1, March 2019, pages 13-23

Supporting Information

Filename	Size	Description
biom12973-sup-0001-SuppData-S1.pdf	1.2 MB	Supplementary Data S1.
biom12973-sup-0002-SuppData-S2.zip	154.6 KB	Supplementary Data S2.
