Bayesian additive regression trees and the General BART model
Corresponding Author
Yaoyuan Vincent Tan
Department of Biostatistics and Epidemiology, Rutgers School of Public Health, Piscataway, New Jersey
Yaoyuan Vincent Tan, Department of Biostatistics and Epidemiology, Rutgers School of Public Health, 683 Hoes Lane West Suite 213-A, Piscataway, NJ 08854.
Email: [email protected]
Jason Roy
Department of Biostatistics and Epidemiology, Rutgers School of Public Health, Piscataway, New Jersey
Abstract
Bayesian additive regression trees (BART) is a flexible prediction model and machine learning approach that has gained widespread popularity in recent years. As BART becomes more mainstream, there is an increased need for a paper that walks readers through the details of BART, from what it is to why it works. This tutorial aims to provide such a resource. In addition to explaining the different components of BART using simple examples, we also discuss a framework, the General BART model, that unifies some of the recent BART extensions, including semiparametric models, correlated outcomes, statistical matching problems in surveys, and models with weaker distributional assumptions. By showing how these models fit into a single framework, we hope to demonstrate a simple way of applying BART to research problems that go beyond the original framework of independent continuous or binary outcomes.
Supporting Information
Filename | Description |
---|---|
SIM_8347-Supp-0001-Codes for SIM.R (application/R, 9.3 KB) | SIM_8347-Supp-0001-Codes for SIM.R |