Volume 38, Issue 25, pp. 5048-5069
TUTORIAL IN BIOSTATISTICS

Bayesian additive regression trees and the General BART model

Yaoyuan Vincent Tan (Corresponding Author)

Department of Biostatistics and Epidemiology, Rutgers School of Public Health, Piscataway, New Jersey

Correspondence: Yaoyuan Vincent Tan, Department of Biostatistics and Epidemiology, Rutgers School of Public Health, 683 Hoes Lane West Suite 213-A, Piscataway, NJ 08854. Email: [email protected]

Jason Roy

Department of Biostatistics and Epidemiology, Rutgers School of Public Health, Piscataway, New Jersey

First published: 28 August 2019

Abstract

Bayesian additive regression trees (BART) is a flexible prediction model/machine learning approach that has gained widespread popularity in recent years. As BART becomes more mainstream, there is an increased need for a paper that walks readers through the details of BART, from what it is to why it works. This tutorial is aimed at providing such a resource. In addition to explaining the different components of BART using simple examples, we also discuss a framework, the General BART model, that unifies some of the recent BART extensions, including semiparametric models, correlated outcomes, statistical matching problems in surveys, and models with weaker distributional assumptions. By showing how these models fit into a single framework, we hope to demonstrate a simple way of applying BART to research problems that go beyond the original independent continuous or binary outcomes framework.
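As a point of reference for readers new to BART, the short sketch below fits a BART model to a simulated continuous outcome. It is a minimal illustration only, assuming the CRAN package BART (its wbart function) and simulated data; neither the package choice nor the data-generating model is prescribed by this abstract.

## Minimal sketch, assuming the CRAN package "BART" and simulated data
library(BART)

set.seed(1)
n <- 200
x <- matrix(runif(n * 5), n, 5)                            # five covariates
y <- sin(pi * x[, 1]) + 2 * x[, 2]^2 + rnorm(n, sd = 0.3)  # nonlinear signal plus noise

fit <- wbart(x.train = x, y.train = y)  # sum-of-trees fit with default priors
head(fit$yhat.train.mean)               # posterior mean prediction at the training points

The fitted object also carries posterior draws of the residual standard deviation (fit$sigma), which is one simple way to check convergence of the Markov chain before using the predictions.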
