Volume 75, Issue 3 pp. 768-777
BIOMETRIC METHODOLOGY

Causal inference when counterfactuals depend on the proportion of all subjects exposed

Caleb H. Miles

Corresponding Author

Department of Biostatistics, Columbia Mailman School of Public Health, New York, New York

Correspondence

Caleb H. Miles, Department of Biostatistics, Columbia Mailman School of Public Health, New York, NY

Email: [email protected]

Maya Petersen

Division of Biostatistics, University of California at Berkeley, Berkeley, California

Division of Epidemiology, University of California at Berkeley, Berkeley, California

Mark J. van der Laan

Division of Biostatistics, University of California at Berkeley, Berkeley, California

Department of Statistics, University of California at Berkeley, Berkeley, California

First published: 04 February 2019
Citations: 6

Abstract

The assumption that no subject's exposure affects another subject's outcome, known as the no-interference assumption, has long held a foundational position in the study of causal inference. However, this assumption may be violated in many settings, and in recent years it has been relaxed considerably. Often this has been achieved with the aid of either a known underlying network or the assumption that the population can be partitioned into separate groups, between which there is no interference, and within which each subject's outcome may be affected by all the other subjects in the group via the proportion exposed (the stratified interference assumption). In this article, we instead consider a complete interference setting, in which each subject affects every other subject's outcome. In particular, we make the stratified interference assumption for a single group consisting of the entire sample. We show that a targeted maximum likelihood estimator for the i.i.d. setting can be used to estimate a class of causal parameters that includes direct effects and overall effects under certain interventions. Under our model, this estimator remains doubly robust and semiparametric efficient, and it continues to allow for the incorporation of machine learning. We conduct a simulation study and present results from a data application in which we study the effect of a nurse-based triage system on the outcomes of patients receiving HIV care in Kenyan health clinics.
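
As a rough illustration of the setting described in the abstract, the following Python sketch simulates outcomes that depend on each subject's own exposure and on the proportion of the whole sample exposed. Everything in it is an assumption made for illustration: the linear outcome model, its coefficients, and the function names `outcome`, `true_direct_effect`, `simulate`, and `diff_in_means` do not come from the article. It also uses a simple difference in means under randomized exposure, not the targeted maximum likelihood estimator the article studies.

```python
import random


def outcome(a, p, w, noise):
    """Hypothetical outcome model; the coefficients are illustrative assumptions.

    a: own exposure (0 or 1), p: proportion of all subjects exposed,
    w: baseline covariate, noise: independent error term.
    """
    return 1.0 + 2.0 * a + 3.0 * p + 0.5 * a * p + w + noise


def true_direct_effect(p):
    """Y(1, p) - Y(0, p) under the hypothetical model above."""
    return 2.0 + 0.5 * p


def simulate(n, p_target, seed=0):
    """Expose round(n * p_target) subjects at random and generate outcomes."""
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    k = round(n * p_target)
    exposed = set(rng.sample(range(n), k))
    a = [1 if i in exposed else 0 for i in range(n)]
    p = sum(a) / n  # realized proportion exposed, shared by every subject
    y = [outcome(a[i], p, w[i], noise[i]) for i in range(n)]
    return a, p, y


def diff_in_means(a, y):
    """Exposed-minus-unexposed mean outcome at the realized proportion."""
    treated = [yi for ai, yi in zip(a, y) if ai == 1]
    control = [yi for ai, yi in zip(a, y) if ai == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


if __name__ == "__main__":
    a, p, y = simulate(2000, 0.5, seed=1)
    print("proportion exposed:", p)
    print("true direct effect:", true_direct_effect(p))
    print("difference in means:", diff_in_means(a, y))
```

In this toy model, the direct effect compares exposed and unexposed outcomes at a fixed proportion exposed, while an overall effect would compare mean outcomes across two different proportions (for example, calling `simulate` with `p_target=0.8` and `p_target=0.2`).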
