A Bayesian integrative approach for multi-platform genomic data: A kidney cancer case study
Thierry Chekouo
Department of Mathematics and Statistics, University of Minnesota Duluth, Duluth, MN 55812, USA
Francesco C. Stingo (Corresponding Author)
Dipartimento di Statistica, Informatica, Applicazioni “G. Parenti”, University of Florence, 50134 Florence, Italy
email: [email protected]
James D. Doecke
CSIRO Health and Biosecurity/Australian e-Health Research Center, Level 5, Queensland 4029, Australia
Kim-Anh Do
Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
Summary
Integrating genomic data from multiple platforms can increase precision, accuracy, and statistical power in the identification of prognostic biomarkers. A fundamental problem in many multi-platform studies is unbalanced sample sizes, which arise because measurements cannot be obtained from every platform for every patient in the study. We have developed a novel Bayesian approach that integrates multi-regression models to identify a small set of biomarkers that can accurately predict time-to-event outcomes. This method fully exploits the information available across platforms and does not exclude any subject from the analysis. Through simulations, we demonstrate the utility of our method and compare its performance with that of methods that do not borrow information across regression models. Applied to The Cancer Genome Atlas kidney renal cell carcinoma dataset, which motivated this work, our methodology provides novel insights missed by non-integrative models.
Supporting Information
Additional Supporting Information may be found in the online version of this article.
Filename | Description |
---|---|
biom12587-sup-0001-SuppData.pdf (7.6 MB) | Supplementary Materials. |
biom12587-sup-0002-SuppData_Code.zip (4.3 MB) | Supplementary Materials. |
Please note: The publisher is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.