Volume 38, Issue 9 pp. 1039-1041
OVERVIEW

Reports from CAGI: The Critical Assessment of Genome Interpretation

Roger A Hoskins

Department of Plant and Microbial Biology, University of California, Berkeley, California

Susanna Repo

Department of Plant and Microbial Biology, University of California, Berkeley, California

Present address: Physics, California State University East Bay, Hayward, California.

Daniel Barsky

Department of Plant and Microbial Biology, University of California, Berkeley, California

Present address: ELIXIR, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK

Gaia Andreoletti

Department of Plant and Microbial Biology, University of California, Berkeley, California

John Moult

Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland

Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland

Steven E. Brenner

Corresponding Author

Department of Plant and Microbial Biology, University of California, Berkeley, California

Correspondence

John Moult, Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland

Email: [email protected]

Steven E. Brenner

Department of Plant and Microbial Biology, University of California, Berkeley, California 94720

Email: [email protected]

First published: 12 July 2017

Contract grant sponsors: NIH (U41 HG007346, R13 HG006650); Marie Curie International Outgoing Fellowship (PIOF-GA-2009-237751); Tata Consultancy Services.

For the CAGI Special Issue

Special Issue Editors: John Moult, Steven E. Brenner; Lead Guest Editor: Rachel Karchin; Organizing Guest Editor: Gaia Andreoletti; Consulting Guest Editors: Atul Butte, Scott Kahn, Sean D. Mooney, Robert Nussbaum, and Predrag Radivojac

New genomic and large-scale data hold the promise of revolutionizing our understanding and treatment of human disease, and are already influencing clinical practice. Multiple barriers stand between the acquisition of the data and fully realizing these and other benefits. In particular, we need powerful and well-characterized computational methods for deducing the phenotypic impact of genomic and system-level perturbations. Many such methods have been developed, but currently, even though some are already deployed in clinical settings, we often remain ignorant of how they actually perform, as well as how and when they should be applied. Further, it is already clear that new and more sophisticated approaches must be developed to fully meet these challenges.

The Critical Assessment of Genome Interpretation (CAGI, \ˈkā-jē\) conducts community experiments to objectively assess computational methods for determining the phenotypic impacts of genomic variation. The primary goals are to establish the state of the art, to show where future progress may best be made, to highlight innovations and progress, and to build a strong collaborative community. In the CAGI experiments, participants are typically provided genetic variants and make blind predictions of resulting phenotypes. These predictions are evaluated against gold-standard experimental or clinical data by independent assessors. Four CAGI experiments have been conducted to date—a pilot in 2010, and three full-scale events in 2011, 2013, and 2016. Each edition of CAGI involves about 10 challenges. The experiment is conducted over a period of about a year, starting with the identification and development of suitable challenges, followed by a period during which participants are invited to submit their predictions, then a term in which the independent assessors evaluate the results, and concluding with a meeting to discuss the outcomes.

CAGI challenges span a wide range of relationships between genetic variation and disease. For single-base variants, there are challenges that address the problem of interpreting the impact of missense mutations on protein activity using a variety of molecular and cellular phenotypes, challenges that test the ability to predict the effect of mutations in cancer driver genes on cell growth, and challenges on the effect of single-base variants on RNA expression levels and splicing (including Beer, 2017; Capriotti, Martelli, Fariselli, & Casadio, 2017; Carraro et al., 2017; Katsonis & Lichtarge, 2017; Kreimer et al., 2017; Niroula & Vihinen, 2017; Pejaver et al., 2017; Tang et al., 2017; Tang & Fenton, 2017; Xu et al., 2017; Yin et al., 2017; Zeng, Edwards, Guo, & Gifford, 2017; Zhang et al., 2017). At the level of full exome and genome sequence, there are challenges that assess methods for assigning complex trait phenotypes and that evaluate the ability to associate genome sequence with an extensive profile of phenotypic traits (including Cai et al., 2017; Daneshjou et al., 2014; Daneshjou et al., 2017; Giollo et al., 2017; Laksshman, Bhat, Viswanath, & Li, 2017; Pal, Kundu, Yin, & Moult, 2017a; Wang et al., 2017). CAGI has also included challenges in which participants were asked to identify causative variants for rare diseases in gene panel, exome, and whole-genome sequence data (including Chandonia et al., 2017; Kundu, Pal, Yin, & Moult, 2017; Pal, Kundu, Yin, & Moult, 2017b). Many challenges have focused on cancer, given its prevalence and the impact of genetics.

This special issue of Human Mutation contains a selection of papers reporting the assessments of challenge results, as well as papers from some individual participating teams describing their methods and the results obtained. Most papers report on the recent challenges from CAGI4, held in 2016. As CAGI best helps further development when challenges recur year after year, some manuscripts discuss results from the earlier editions of CAGI and their development over time.

Together, these results from CAGI offer powerful insights into the appropriate level of confidence to place in variant annotations and interpretation methods, and which classes of approaches are most suitable for a particular application. They reveal limitations of current data collection and analysis approaches and point to areas for future research and new approaches.

The fifth CAGI edition is presently underway. Full information about this and the previous CAGI editions is available at http://www.genomeinterpretation.org.

ACKNOWLEDGMENTS

We are most grateful to all CAGI participants. The primary work assessed in CAGI is that of the predictors: Allison Abad, Ogun Adebali, Ivan Adzhubey, Talal Amin, Johnathan R. Azaria, Giulia Babbi, Eraan Bachar, Benjamin Bachman, Minkyung Baek, Greet De Baets, Michael Beer, Violeta Beleva-Guthrie, Bonnie Berger, Brady Bernard, Rajendra Bhat, Rohit Bhattacharya, Samuele Bovo, Marcus Breese, Aharon S. Brodie, Yana Bromberg, Binghuang Cai, Colin Campbell, Chen Cao, Emidio Capriotti, Marco Carraro, Rita Casadio, Hannah Carter, Billy H. W. Chang, Shann-Ching Chen, Yun-Ching Chen, Chien-Yuan Chen, Melissa Cline, Andrea Corredor, Chen Cui, Carla Davis, Mark Diekhans, Rezarta I. Dogan, Christopher Douville, Ian Driver, Roland Dunbrack, Joost van Durme, Andrea Eakin, Matthew Edwards, Gokcen Eraslan, Hai Fang, Carlo Ferrari, Anna Flynn, Lukas Folkman, Colby T. Ford, Adam Frankish, Zaneta Franklin, Yao Fu, Alessandra Gasparini, Tom Gaunt, David Gifford, Manuel Giollo, Nina Gonzaludo, Valer Gotea, Julian Gough, Yuchun Guo, Jennifer Harrow, Marcia Hasenahuer, Lim Heo, Ramin Homayouni, Raghavendra Hosur, Cheng L. V. Huang, Peter Huwe, Sohyun Hwang, Tadashi Imanishi, Jules Jacobsen, Chan-Seok Jeong, Yuxiang Jiang, David T. Jones, Daniel Jordan, Beomchang Kang, Rachel Karchin, Panagiotis Katsonis, Sunduz Keles, Manolis Kellis, Nikki Kiga, Dongsup Kim, Eiru Kim, Jack F. Kirsch, Michael Kleyman, Andreas Kraemer, Anshul Kundaje, Kunal Kundu, Pui-Yan Kwok, Ernest Lam, Dae Lee, Gyu Rie Lee, Insuk Lee, Pietro Di Lena, Emanuela Leonardi, Andy Li, Mulin Jun Li, Yue Li, Biao Li, Olivier Lichtarge, Chiao-Feng Lin, Rhonald C. Lua, Angel Mak, Pier L. Martelli, David Masica, Zev Medoff, Aziz M. Mezlini, Rahul Mohan, Alexander M. Monzon, Sean D. Mooney, Matthew Mort, John Moult, Steve Mount, Eliseos Mucaki, Jonathan Mudge, Nikola Mueller, Chris Mungall, Katsuhiko Murakami, Yoko Nagai, Noushin Niknafs, Abhishek Niroula, Conor M. L. Nodzak, Yanay Ofran, Ayodeji Olatubosun, Kymberleigh Pagel, Lipika R. Pal, Taeyong Park, Nathaniel Pearson, Vikas Pejaver, Jian Peng, Alexandra Piryatinska, Catherine Plotts, Predrag Radivojac, Aditya R. Rao, Aliz Rao, Graham Ritchie, Peter Rogan, Frederic Rousseau, Jana M. Schwarz, Joost Schymkowitz, Chaok Seok, George Shackelford, Sohela Shah, Maxim Shatsky, Ron Shigeta, Hashem A. Shihab, Jung E. Shim, Junha Shin, Sunyoung Shin, Ilya Shmulevich, Bradford R. Silver, Nasa Sinnott-Armstrong, Ben Smithers, Yesim A. Son, Mario Stanke, Nathan Stitziel, Andrew Su, Laksshman Sundaram, Paul Tang, Nuttinee Teerakulkittipong, Natalie Thurlby, Janita Thusberg, Kevin Tian, Collin Tokheim, Silvio C. E. Tosatto, Yemliha Tuncel, Tychele Turner, Ron S. Unger, Aneeta Uppal, Gurkan Ustunkar, Jouni Valiaho, Mauno Vihinen, Mary Wahl, Michael Wainberg, Meng Wang, Maggie Wang, Yanran Wang, Xinyuan Wang, Li-San Wang, Liping Wei, Qiong Wei, Rene Welch, Stephen Wilson, Chunlei Wu, Lijing Xu, Qifang Xu, Yuedong Yang, Christopher Yates, Yizhou Yin, Chen-Hsin Yu, Dejian Yuan, Jan Zaucha, Haoyang Zeng, and Maya Zuhl.

We are deeply grateful to the many researchers who shared their data, typically before publication and often requiring extensive permissions and review, to create the CAGI challenges: Russ Altman, Adam P. Arkin, Madeleine P. Ball, Jason Bobe, Paolo Bonvini, Bethany Buckley, George Church, Garry R. Cutting, Emma D'Andrea, Lisa Elefanti, Aron W. Fenton, Andre Franke, Nina Gonzaludo, Joe W. Gray, Linnea Jannson, John P. Kane, Pui-Yan Kwok, Rick Lathrop, Jonathan H. LeBowitz, Federica Lovisa, Angel C. Y. Mak, Mary J. Malloy, Richard McCombie, Chiara Menin, M. Stephen Meyn, John Moult, Robert Nussbaum, Lipika R. Pal, Britt-Sabina Petersen, Mehdi Pirooznia, James B. Potash, Clive R. Pullinger, Jasper Rine, Frederick Roth, Pardis Sabeti, Jeremy Sanford, Maria C. Scaini, Nicole Schmitt, Jay Shendure, Molly Sheridan, Michael Snyder, Tim Sterne-Weiler, Paul L. F. Tang, Sean Tavtigian, Ryan Tewhey, Silvio C. E. Tosatto, Jochen Weile, G. Karen Yu, and Peter Zandi.

The CAGI experiment also depended upon the assessors who evaluated each challenge: Aashish Adhikari, Marco Carraro, John-Marc Chandonia, Rui Chen, Wyatt T. Clark, Roxana Daneshjou, Roland Dunbrack, Iddo Friedberg, Gad Getz, Manuel Giollo, Nick Grishin, Rachel Karchin, Anat Kreimer, M. Stephen Meyn, Sean D. Mooney, Alexander A. Morgan, John Moult, Robert Nussbaum, Jeremy Sanford, David B. Searls, Artem Sokolov, Josh Stuart, Shamil Sunyaev, Sean Tavtigian, Silvio C. E. Tosatto, Qifang Xu, and Nir Yosef.

We have been the beneficiaries of many who offered insights and guidance essential to CAGI's success, including our advisory board members over the years: Russ Altman, George Church, Tim Hubbard, Scott Kahn, Sean D. Mooney, Pauline Ng, Susanna Repo, and John Shon; our Scientific Council: Patricia Babbitt, Atul Butte, Garry R. Cutting, Laura Elnitski, Reece Hart, Ryan Hernandez, Rachel Karchin, Robert Nussbaum, Michael Snyder, Shamil Sunyaev, Joris Veltman, and Liping Wei; and the CAGI Ethics Forum: Wylie Burke, Lawrence R Carr, Flavia Chen, Julie Harris-Wai, Kirsten Isgro, Barbara A. Koenig, Selena Martinez, Robert Nussbaum, and Mark Yarborough.

We also acknowledge those who helped organize the CAGI experiment and helped with its technology: John-Marc Chandonia, Ajithavalli Chellappan, Flavia Chen, Navya Dabbiru, Reece Hart (who coined the term ‘CAGI’), Melissa K. Ly, Andrew J. Neumann, Gaurav Pandey, Sadhna Rana, Rajgopal Srinivasan, Stephen Yee, Sri Jyothsna Yeleswarapu, and Maya Zuhl. We especially recognize Tata Consultancy Services, which has been a generous collaborator in organizing the CAGI experiment. We also acknowledge the efforts of Brenner and Moult lab researchers who contributed to CAGI.

We greatly appreciate the efforts of the anonymous peer reviewers, of Lead Guest Editor Rachel Karchin, and of Atul Butte, Scott Kahn, Sean D. Mooney, Robert Nussbaum, and Predrag Radivojac, who served as Consulting Guest Editors and oversaw the peer-review process. Steven E. Brenner and John Moult were the Special Issue Editors, and Gaia Andreoletti was the Organizing Guest Editor, for this special issue of Human Mutation. We sincerely thank Christine Murray, Stephanie Serraon, and Sean Yaftali for their efforts coordinating the editorial and production operations, respectively, at Wiley.

Finally, we also wish to acknowledge our profound debt to those many individuals who shared their private genetic and phenotypic or clinical information as participants in the research studies and clinical datasets that comprise the CAGI challenges.
