Targeted Learning

Causal Inference for Observational and Experimental Data

Targeted learning is a methodological framework for causal and statistical inference that incorporates machine learning. This site contains research updates and resources. The book Targeted Learning: Causal Inference for Observational and Experimental Data, by Mark J. van der Laan and Sherri Rose, was published in 2011.

COMING 2017: Targeted Learning in Data Science by van der Laan and Rose, published by Springer.

NEW: Visit for additional targeted learning code.

NEW: Read this guest post on Revolution Analytics summarizing current R packages for targeted learning.

SuperLearner Package

CRAN Description:
"This package implements the super learner prediction method and contains a library of prediction algorithms to be used in the super learner."
[Download from CRAN]
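
As a rough illustration of the description above, the sketch below fits a small super learner to simulated data. The data-generating model, the variable names (x1, x2), and the choice of the two candidate algorithms SL.mean and SL.glm are assumptions made for this example only; a real analysis would typically use a richer library.

```r
library(SuperLearner)

# Simulated data (hypothetical): two covariates and a binary outcome.
set.seed(1)
n <- 200
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
Y <- rbinom(n, 1, plogis(0.5 * X$x1 - 0.3 * X$x2))

# Combine two simple candidate learners from the package's library.
sl_fit <- SuperLearner(Y = Y, X = X, family = binomial(),
                       SL.library = c("SL.mean", "SL.glm"))

# Weight given to each candidate in the ensemble.
sl_fit$coef

# Ensemble predictions for the training observations.
head(sl_fit$SL.predict)
```

The weights in sl_fit$coef show how much each candidate contributes to the combined prediction; with the default settings they are chosen by cross-validation.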

tmle Package

CRAN Description:
"tmle implements targeted maximum likelihood estimation, first described in van der Laan and Rubin, 2006 (Targeted Maximum Likelihood Learning, The International Journal of Biostatistics, 2(1), 2006). This version adds the tmleMSM function to the package, for estimating the parameters of a marginal structural model (MSM) for a binary point treatment effect. The tmle function calculates the adjusted marginal difference in mean outcome associated with a binary point treatment, for continuous or binary outcomes. Relative risk and odds ratio estimates are also reported for binary outcomes. Missingness in the outcome is allowed, but not in treatment assignment or baseline covariate values. Effect estimation stratified by a binary mediating variable is also available. The population mean is calculated when there is missingness, and no variation in the treatment assignment. An ID argument can be used to identify repeated measures. Default settings call SuperLearner to estimate the Q and g portions of the likelihood, unless values or a user-supplied regression function are passed in as arguments."
[Download from CRAN]
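
A minimal sketch of the tmle function described above, applied to simulated data with a binary treatment and a binary outcome. The simulated variables (W1, W2, A, Y) and their coefficients are assumptions for illustration. The example also passes small SuperLearner libraries explicitly rather than relying on the defaults, which is a choice made here to keep the example light, not a recommendation.

```r
library(tmle)

# Simulated data (hypothetical): baseline covariates W, binary treatment A,
# binary outcome Y. No missingness in Y, A, or W.
set.seed(1)
n <- 500
W <- data.frame(W1 = rnorm(n), W2 = rbinom(n, 1, 0.5))
A <- rbinom(n, 1, plogis(0.4 * W$W1 - 0.2 * W$W2))
Y <- rbinom(n, 1, plogis(-0.5 + A + 0.5 * W$W1))

# Estimate the Q (outcome) and g (treatment) portions with small
# SuperLearner libraries chosen for this example.
tmle_fit <- tmle(Y = Y, A = A, W = W, family = "binomial",
                 Q.SL.library = c("SL.glm", "SL.mean"),
                 g.SL.library = c("SL.glm", "SL.mean"))

# Adjusted marginal difference in mean outcome, with confidence interval.
tmle_fit$estimates$ATE$psi
tmle_fit$estimates$ATE$CI

# Relative risk and odds ratio, reported because the outcome is binary.
tmle_fit$estimates$RR$psi
tmle_fit$estimates$OR$psi
```

Calling summary(tmle_fit) prints the same estimates together with their variances and p-values.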