Hierarchical Models for Causal Effects†
† For Emerging Trends in the Social and Behavioral Sciences, ed. Robert Scott and Stephen Kosslyn. We thank Jennifer Hill and Shira Mitchell for helpful comments and the National Science Foundation and the Institute of Education Sciences for partial support of this work.
Abstract
Hierarchical models play three important roles in modeling causal effects: (i) accounting for how the data were collected, such as in stratified and split-plot experimental designs; (ii) adjusting for unmeasured covariates, such as in panel studies; and (iii) capturing treatment effect variation, such as in subgroup analyses. In all three areas, hierarchical models, especially Bayesian hierarchical models, offer substantial benefits over classical, non-hierarchical approaches. After discussing each of these topics, we explore some recent developments in the use of hierarchical models for causal inference and conclude with some thoughts on new directions for this research area.
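To make role (iii) concrete, the following is a minimal sketch of partial pooling for subgroup treatment effects under a normal-normal hierarchical model. It is an illustration only, not a method taken from the chapter: the function name `partial_pool`, its arguments, and the choice to treat the between-group standard deviation `tau` as known are all assumptions made for brevity (a full Bayesian analysis would estimate `tau` from the data).

```python
import numpy as np


def partial_pool(estimates, std_errors, tau):
    """Shrink subgroup effect estimates toward their common mean.

    Assumed model (hypothetical, for illustration):
        estimate_j ~ Normal(theta_j, std_error_j^2)
        theta_j    ~ Normal(mu, tau^2), with tau treated as known.
    Returns the precision-weighted estimate of mu and the
    partially pooled estimates of each theta_j given that mu.
    """
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)

    # Precision-weighted estimate of the overall mean effect mu.
    weights = 1.0 / (se**2 + tau**2)
    mu = np.sum(weights * est) / np.sum(weights)

    # Shrinkage factor: share of weight placed on the overall mean.
    # Noisier subgroups (larger std error) are pulled further toward mu.
    shrink = se**2 / (se**2 + tau**2)
    pooled = shrink * mu + (1.0 - shrink) * est
    return mu, pooled


# Two subgroups with equal precision and tau equal to the std error:
# each raw estimate moves halfway toward the overall mean.
mu_hat, pooled_effects = partial_pool([2.0, 0.0], [1.0, 1.0], tau=1.0)
```

Compared with analyzing each subgroup separately (no pooling) or ignoring subgroups entirely (complete pooling), the hierarchical estimate sits between the two, with the degree of compromise governed by `tau` relative to each subgroup's standard error.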