Targeted Learning

Causal Inference for Observational and Experimental Data

Targeted learning is a framework for causal and statistical inference that incorporates machine learning.

The book Targeted Learning: Causal Inference for Observational and Experimental Data, by Mark J. van der Laan and Sherri Rose, was published in 2011. This text focuses largely on cross-sectional studies.

The second book by van der Laan and Rose, Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies, was released by Springer in March 2018. This sequel covers the complicated research questions that arise in longitudinal and dependent data structures.


Tutorial Papers: Super Learning for Prediction & TMLE for Causal Effects


S. Rose, S. Bergquist, T. Layton. Computational health economics for identification of unprofitable health care enrollees. Biostatistics. [Link]

A. Luedtke, M. van der Laan. Parametric-rate inference for one-sided differentiable parameters. Journal of the American Statistical Association. [Link]

L. Acion et al. Use of a machine learning framework to predict substance use disorder treatment success. PLoS One. [Link]

W. Zheng, L. Balzer, M. van der Laan, M. Petersen. Constrained binary classification using ensemble learning: An application to cost-efficient targeted PrEP strategies. Statistics in Medicine. [Link]

M. Schuler, S. Rose. Targeted maximum likelihood estimation for causal inference in observational studies. American Journal of Epidemiology. [PDF]


J. Spertus, S. Normand, R. Wolf, M. Cioffi, A. Lovett, S. Rose. Assessing hospital performance after percutaneous coronary intervention using big data. Circulation:CVQO. [PDF]

A. Luedtke, M. van der Laan. Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy. Annals of Statistics. [Link]

I. Diaz, M. Carone, M. van der Laan. Second-order inference of the mean of a variable missing at random. IJB. [Link]

J. Ahern, D. Karasek, A. Luedtke, T. Bruckner, M. van der Laan. Racial/ethnic differences in the role of childhood adversities for mental disorders among a nationally representative sample of adolescents. Epidemiology. [Link]

K. Colson, K. Rudolph, S. Zimmerman, D. Goin, E. Stuart, M. van der Laan, J. Ahern. Optimizing matching and analysis combinations for estimating causal effects. Scientific Reports. [Link]

M. van der Laan, S. Gruber. One-step targeted minimum loss-based estimation based on universal least favorable one-dimensional submodels. IJB. [Link]

L. Balzer, M. van der Laan, M. Petersen, SEARCH Collaboration. Adaptive pre-specification in randomized trials with and without pair-matching. Statistics in Medicine. [Link]

A. Luedtke, M. van der Laan. Super learning of an optimal dynamic treatment rule. IJB. [Link]

R. Pirracchio, J. Yue, G. Manley, M. van der Laan, A. Hubbard, TRACK-TBI Investigators. Collaborative targeted maximum likelihood estimation for variable importance measure: Illustration for functional outcome prediction in mild traumatic brain injuries. Statistical Methods in Medical Research. [Link]

A. Luedtke, M. van der Laan. Optimal individualized treatments in resource-limited settings. IJB. [Link]

L. Balzer, M. Petersen, M. van der Laan, SEARCH Collaboration. Targeted estimation and inference for the sample average treatment effect in trials with and without pair-matching. Statistics in Medicine. [Link]

M. Schnitzer, J. Lok, S. Gruber. Variable selection for confounder control, flexible modeling and collaborative targeted minimum loss-based estimation in causal inference. IJB. [Link]

S. Rose. A machine learning framework for plan payment risk adjustment. Health Services Research. [Link]

W. Zheng, M. Petersen, M. van der Laan. Doubly robust and efficient estimation of marginal structural models for the hazard function. IJB. [Link]

L. Cain, M. Saag, M. Petersen, et al. Using observational data to emulate a randomized trial of dynamic treatment-switching strategies: An application to antiretroviral therapy. Int J Epidemiology. [Link]

E. LeDell, M. van der Laan, M. Petersen. AUC-maximizing ensembles through metalearning. IJB. [Link]

M. Schnitzer, J. Lok, R. Bosch. Double robust and efficient estimation of a prognostic model for events in the presence of dependent censoring. Biostatistics. [Link]

R. Neugebauer, J. Schmittdiel, M. van der Laan. A case study of the impact of data-adaptive versus model-based estimation of the propensity scores on causal inferences from three inverse probability weighting estimators. IJB. [Link]

A. Mirelman, S. Rose, J. Khan, S. Ahmed, D. Peters, L. Niessen, A. Trujillo. The relationship between non-communicable disease occurrence and poverty: Evidence from demographic surveillance in Matlab, Bangladesh. Health Policy & Planning. [Link]

M. Davies, M. van der Laan. Optimal spatial prediction using ensemble machine learning. IJB. [Link]


A. Chambaz, P. Neuvial. tmle.npvi: Targeted, integrative search of associations between DNA copy number and gene expression, accounting for DNA methylation. Bioinformatics, 31(18):3054-6. [Link]

J. Ahern, L. Balzer, S. Galea. The roles of outlet density and norms in alcohol use disorder. Drug Alcohol Depend, 151:144-50. [Link]

D. Brown, M. Petersen, S. Costello, et al. Occupational exposure to PM2.5 and incidence of ischemic heart disease: Longitudinal targeted minimum loss-based estimation. Epidemiology, 26(6):806-14. [Link]

A. Weber, M.J. van der Laan, M. Petersen. Assumption trade-offs when choosing identification strategies for pre-post treatment effect estimation: An illustration of a community-based intervention in Madagascar. JCI, 3(1):109-130. [Link]

S. Gruber. Targeted learning in healthcare research. Big Data, 3(4):211-18. [PDF]

M.J. van der Laan, A. Luedtke. Targeted learning of the mean outcome under an optimal dynamic treatment rule. JCI, 3(1):61-95. [Link]

E. LeDell, M. Petersen, M.J. van der Laan. Computationally efficient confidence intervals for cross-validated area under the ROC curve estimates. Electron J Stat, 9(1):1583-1607. [Link]

S. Lendle, B. Fireman, M.J. van der Laan. Balancing score adjustment targeted minimum loss-based estimation. JCI, 3(2):139-55. [Link]

S. Rose. Targeted learning for pre-analysis plans in public health and health policy research. Observational Studies, 1:294-306. [PDF]

I. Díaz, A. Hubbard, A. Decker, M. Cohen. Variable importance and prediction methods for longitudinal problems with missing variables. PLoS One, 10(3):e0120031. [Link]

M. Petersen, E. LeDell, J. Schwab, et al. Super learner analysis of electronic adherence data improves viral prediction and may provide strategies for selective HIV RNA monitoring. J Acquir Immune Defic Syndr, 69(1):109-18. [Link]

S. Gruber. A causal perspective on OSIM2 data generation, with implications for simulation study design and interpretation. JCI, 3(2):177-87. [Link]

R. Pirracchio, M. Petersen, M. Carone, M. Rigon, S. Chevret, M.J. van der Laan. Mortality prediction in intensive care units with the Super ICU Learner Algorithm (SICULA): A population-based study. Lancet Respir Med, 3(1):42-52. [Link]

R. Pirracchio, M. Petersen, M. van der Laan. Improving propensity score estimators' robustness to model misspecification using super learner. Am J Epidemiol, 181(2):108-19. [Link]

L. Balzer, M. Petersen, M.J. van der Laan and the SEARCH Consortium. Adaptive pair-matching in randomized trials with unbiased and efficient effect estimation. Stat Med, e-pub before print doi: 10.1002/sim.6380. [Link]

R. Kessler, C. Warner, C. Ivany, et al. Predicting suicides after psychiatric hospitalization in US Army soldiers. JAMA Psychiatry, 72(1):49-57. [Link]


H. Wang, Z. Zhang, S. Rose, M.J. van der Laan. A novel targeted learning method for quantitative trait loci mapping. Genetics, 198(4):1369-76. [PDF]

M. Petersen, J. Schwab, S. Gruber, N. Blaser, M. Schomaker, M. van der Laan. Targeted maximum likelihood estimation for dynamic and static longitudinal marginal structural working models. J Causal Inference, 2(2):147-185. [PDF]

H. Leslie, D. Karasek, L. Harris, E. Chang, N. Abdulrahim, M. Maloba, M. Huchko. Cervical cancer precursors and hormonal contraceptive use in HIV-positive women: An application of a causal model and semi-parametric methods. PLoS ONE, 9(6): e101090. [PDF]

M.J. van der Laan, R. Starmans. Entering the era of data science: Targeted learning and the integration of statistics and computational data analysis. Advances in Statistics, 2014: 502678. [PDF]

K. Rudolph, I. Díaz, M. Rosenblum, E. Stuart. Estimating population treatment effects from a survey subsample. Am J Epidemiol, 180(7):737-48. [Link]

R. Thomas, A. Hubbard, C. McHale, L. Zhang, S. Rappaport, et al. Characterization of changes in gene expression and biochemical pathways at low levels of benzene exposure. PLoS ONE, 9(5): e91828. [PDF]

P. Kotwani, L. Balzer, D. Kwarisiima, T. Clark, K. Kabami, et al. Evaluating linkage to care for hypertension after community-based screening in rural Uganda. Trop Med Int Health, 19(4):459-68. [Link]

M. Schnitzer, E. Moodie, M.J. van der Laan, R. Platt, M. Klein. Modeling the impact of hepatitis C viral clearance and end-stage liver disease in an HIV co-infected cohort with targeted maximum likelihood estimation. Biometrics, 70(1):144-52. [PDF]

R. Neugebauer, J. Schmittdiel, M.J. van der Laan. Targeted learning in real-world comparative effectiveness research with time-varying interventions. Stat Med, 33(14):2480-2520. [Link]

R. Kessler, S. Rose, K. Koenen, E. Karam, et al. How well can post-traumatic stress disorder be predicted from pre-trauma risk factors? An exploratory study in the WHO World Mental Health Surveys. World Psychiatry, 13(3):265-74. [PDF]

M. Schnitzer, M.J. van der Laan, E. Moodie, R. Platt. Effect of breastfeeding on gastrointestinal infection in infants: A targeted maximum likelihood approach for clustered longitudinal data. Ann Appl Stat, 8(2):703-25. [PDF]

M.J. van der Laan. Targeted estimation of nuisance parameters to obtain valid statistical inference. Int J Biostat, 10(1):29-57. [Link]

S. Sapp, M.J. van der Laan. Subsemble: An ensemble method for combining subset-specific algorithm fits. J Appl Statist, 41(6). [Link]

M. Petersen, M.J. van der Laan. Causal models and learning from data: Integrating causal modeling and statistical estimation. Epidemiology, 25(3):418-426. [Link]

M.J. van der Laan. Causal inference for a population of causally connected units. JCI, 2(1). [Link]

S. Sapp, M.J. van der Laan. Targeted estimation of binary variable importance measures with interval-censored outcomes. Int J Biostat, 10(1). [Link]

S. Rose, M.J. van der Laan. Rose and van der Laan respond to "Some advantages of RERI." Am J Epidemiol, 179(6):172-6. [PDF]

S. Rose, M.J. van der Laan. A double robust approach to causal effects in case-control studies. Am J Epidemiol, 179(6):663-9. [PDF]

A. Chambaz, D. Choudat, C. Huber, J-C. Pairon, M.J. van der Laan. Analysis of the effect of occupational exposure to asbestos based on threshold regression modeling of case-control data. Biostatistics, 15(2):327-340. [Link]


I. Díaz, M.J. van der Laan. Targeted data adaptive estimation of the causal dose-response curve. JCI, 2(1):171-192. [Link]

M. Legrand, R. Pirracchio, A. Rosa, M. Petersen, M.J. van der Laan, J. Fabiani et al. Incidence, risk factors and prediction of post-operative acute kidney injury following cardiac surgery for active infective endocarditis: An observational study. Crit Care, 17(5):R220. [PDF]

M. Subbaraman, S. Lendle, M.J. van der Laan, L. Kaskutas, J. Ahern. Cravings as a mediator and moderator of drinking outcomes in the COMBINE study. Addiction, 108(10):1737-44. [Link]

S. Lendle, B. Fireman, M.J. van der Laan. Targeted maximum likelihood estimation in safety analysis. J Clin Epidemiol, 66(8):S91-8. [PDF]

S. Lendle, M. Subbaraman, M.J. van der Laan. Identification and efficient estimation of the natural direct effect among the untreated. Biometrics, 69(2):310-17. [Link]

S. Rose. Mortality risk score prediction in an elderly population using machine learning. Am J Epidemiol, 177(5):443-452. [PDF] ["Editor's Choice" article]

J. Brooks, M.J. van der Laan, D. Singer, A. Go. Targeted minimum loss-based estimation of causal effect in right-censored survival data with time-dependent covariates: Warfarin, stroke, and death in atrial fibrillation. JCI, 1(2):235-54. [Link]

T. Haight, M.J. van der Laan, T. Manini, I. Tager. Direct effects of leisure-time physical activity on walking speed. J Nutr Health Aging, 17(8):666-73. [Link]

I. Díaz, M.J. van der Laan. Assessing the causal effect of policies: An example using stochastic interventions. Int J Biostat, 9(2):161-74. [Link]

S. Gruber, M.J. van der Laan. An application of targeted maximum likelihood estimation to the meta-analysis of safety data. Biometrics, 69(1):254-62. [Link]

M.J. van der Laan, M. Petersen, W. Zheng. Estimating the effect of a community-based intervention with two communities. JCI, 1(1):83-106. [Link]

I. Díaz, M.J. van der Laan. Sensitivity analysis for causal inference under unmeasured confounding and measurement error problems. Int J Biostat, 9(2):149-60. [Link]


K. Moore, R. Neugebauer, M.J. van der Laan, I. Tager. Causal inference in epidemiological studies with strong confounding. Stat Med, 31(13):1380-1404. [PDF]

P. Chaffee, M.J. van der Laan. Targeted maximum likelihood estimation for dynamic treatment regimes in sequentially randomized controlled trials. Int J Biostat, 8(1). [Link]

S. Gruber, M.J. van der Laan. Consistent causal effect estimation under dual misspecification and implications for confounder selection procedures. Stat Methods Med Res. [PDF]

A. Chambaz, P. Neuvial, M.J. van der Laan. Estimation of a nonparametric variable importance measure of a continuous exposure. Electron J Stat, 6:1059-99. [PDF]

D. Rubin, M.J. van der Laan. Statistical issues and limitations in personalized medicine research with clinical trials. Int J Biostat, 8(1). [Link]

P. Chaffee, M.J. van der Laan. Discussion of "Evaluation of viable dynamic treatment regimes in a sequentially randomized trial of advanced prostate cancer," by Wang et al. JASA, 107(498):513-517. [PDF]

C.W. Wester, O. Stitelman, V. deGruttola, H. Bussmann, R. Marlink, M.J. van der Laan. Effect modification by sex and baseline CD4+ cell count among adults receiving combination antiretroviral therapy in Botswana: Results from a clinical trial. AIDS Res Hum Retroviruses, 28(9):981-8. [PDF]

M.J. van der Laan, S. Gruber. Targeted minimum loss based estimation of causal effects of multiple time point interventions. Int J Biostat, 8(1). [Link]

I. Díaz Muñoz, M.J. van der Laan. Population intervention causal effects based on stochastic interventions. Biometrics, 68(2):541-549. [PDF]

G. Geeven, M.J. van der Laan, M. de Gunst. Comparison of targeted maximum likelihood and shrinkage estimators of parameters in gene networks. SAGMB, 11(5):Article 2. [Link]

M.J. van der Laan, S. Gruber. Targeted minimum loss based estimator that outperforms a given estimator. Int J Biostat, 8(1). [Link]

A. Malani, O. Bembom, M.J. van der Laan. Accounting for heterogeneous treatment effects in the FDA approval process. Food Drug Law J, 67(1):23-50. [Link]

W. Zheng, M.J. van der Laan. Targeted maximum likelihood estimation of natural direct effects. Int J Biostat, 8(1):1-40. [Link]


M. McCulloch, M. Broffman, M. van der Laan, A. Hubbard, et al. Colon cancer survival with herbal medicine and vitamins combined with standard therapy in a whole-systems approach: Ten-year follow-up data analyzed with marginal structural models and propensity score methods. Integr Cancer Ther, 10(3):240-59. [PDF]

K. Moore, R. Neugebauer, T. Valappil, M.J. van der Laan. Robust extraction of covariate information to improve estimation efficiency in randomized trials. Stat Med, 30(19):2389-2408. [PDF]

H. Wang, M.J. van der Laan. Dimension reduction with gene expression data using targeted variable importance measurement. BMC Bioinformatics, 12:312. [PDF]

M. Odden, I. Tager, M.J. van der Laan, J. Delaney, C. Peralta, R. Katz, M. Sarnak, B. Psaty, M. Shlipak. Antihypertensive medication use and change in kidney function in elderly adults: A marginal structural model analysis. Int J Biostat, 7(1). [Link]

C. Tuglus, M.J. van der Laan. Repeated measures semiparametric regression using targeted maximum likelihood methodology with application to transcription factor activity discovery. SAGMB, 10(1):2. [PDF]

I. Díaz, M.J. van der Laan. Super learner based conditional density estimation with application to marginal structural models. Int J Biostat, 7(1). [Link]

H. Wang, S. Rose, M.J. van der Laan. Finding quantitative trait loci genes with collaborative targeted maximum likelihood learning. Stat Probabil Lett, 81(7):792–796. [PDF] [Featured in issue editorial]

O. Stitelman, C.W. Wester, V. De Gruttola, M.J. van der Laan. Targeted maximum likelihood estimation of effect modification parameters in survival analysis. Int J Biostat, 7(1). [Link]

S. Rose, J. Snowden, K.M. Mortimer. Rose et al. respond to “G-computation and standardization in epidemiology.” Am J Epidemiol, 173(7):743–744. [PDF]

J. Snowden, S. Rose, K.M. Mortimer. Implementation of G-Computation on a simulated data set: demonstration of a causal inference technique. Am J Epidemiol, 173(7):731–738. [PDF] [Evaluated by Faculty of 1000]

A. Chambaz, M.J. van der Laan. Targeting the optimal design in randomized clinical trials with binary outcomes and no covariate. Int J Biostat, 7(1):10. [PDF]

S. Rose, M.J. van der Laan. A targeted maximum likelihood estimator for two-stage designs. Int J Biostat, 7(1):17. [PDF]

K. Porter, S. Gruber, M.J. van der Laan, J. Sekhon. The relative performance of targeted maximum likelihood estimators. Int J Biostat, 7(1):31. [PDF]


S. Gruber, M.J. van der Laan. A targeted maximum likelihood estimation of a causal effect on a bounded continuous outcome. Int J Biostat, 6(1):26. [PDF]

M.J. van der Laan. Targeted maximum likelihood based causal inference: Part II. Int J Biostat, 6(2):3. [PDF]

M.J. van der Laan. Targeted maximum likelihood based causal inference: Part I. Int J Biostat, 6(2):2. [PDF]

M. Rosenblum, M.J. van der Laan. Targeted maximum likelihood estimation of the parameter of a marginal structural model. Int J Biostat, 6(2):19. [PDF]

O. Stitelman, M.J. van der Laan. Collaborative targeted maximum likelihood for time to event data. Int J Biostat, 6(1). [Link]

S. Gruber, M.J. van der Laan. An application of collaborative targeted maximum likelihood estimation in causal inference and genomics. Int J Biostat, 6(1):18. [PDF]

M.J. van der Laan, S. Gruber. Collaborative double robust targeted maximum likelihood estimation. Int J Biostat, 6(1):17. [PDF]


K. Moore, M.J. van der Laan. Increasing power in randomized trials with right censored outcomes through covariate adjustment. J Biopharm Stat, 19(6):1099-1131. [PDF]

K. Moore, M.J. van der Laan. Covariate adjustment in randomized trials with binary outcomes: Targeted maximum likelihood estimation. Stat Med, 28(1):39-64. [PDF]

S. Rose, M.J. van der Laan. Why match? Investigating matched case-control study designs with causal effect estimation. Int J Biostat, 5(1):1. [PDF]

O. Bembom, M. Petersen, S-Y. Rhee, W. Fessel, S. Sinisi, R. Shafer, M.J. van der Laan. Biomarker discovery using targeted maximum likelihood estimation: Application to the treatment of antiviral-resistant HIV infection. Stat Med, 28(1):152-172. [Link]


S. Rose, M.J. van der Laan. Simple optimal weighting of cases and controls in case-control studies. Int J Biostat, 4(1):19. [PDF]

M.J. van der Laan. Estimation based on case-control designs with known prevalence probability. Int J Biostat, 4(1):18.  [Link]


S. Sinisi, E. Polley, M. Petersen, S-Y. Rhee, M.J. van der Laan. Super learning: An application to the prediction of HIV-1 drug resistance. SAGMB, 6(1):7. [PDF]

O. Bembom, M.J. van der Laan. A practical illustration of the importance of realistic individualized treatment rules in causal inference. Electron J Stat, 1:574-596. [PDF]

M.J. van der Laan, E. Polley, A. Hubbard. Super learner. SAGMB, 6(1). [Link]


M.J. van der Laan. Targeted maximum likelihood learning. Int J Biostat, 2(1). [Link]

M.J. van der Laan. Statistical inference for variable importance. Int J Biostat, 2(1). [Link]


C. Rudin, D. Dunson, R. Irizarry, H. Ji, E. Laber, J. Leek, T. McCormick, S. Rose, C. Schafer, M.J. van der Laan, L. Wasserman, L. Xue; A Working Group of the American Statistical Association (2014). Discovery with Data: Leveraging Statistics and Computer Science to Transform Science and Society.  [PDF] [Press Release] [Amstat News Article]


For the most up-to-date list of technical reports (i.e., preprints) please visit the bepress site.