Targeted Learning

Causal Inference for Observational and Experimental Data

Targeted learning is a framework for causal and statistical inference that incorporates machine learning.

The book Targeted Learning: Causal Inference for Observational and Experimental Data, by Mark J. van der Laan and Sherri Rose, was published in 2011. This text focuses largely on cross-sectional studies.

The second book by van der Laan and Rose, Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies, was released by Springer in March 2018. This sequel covers the complicated research questions that arise in longitudinal and dependent data structures.


The statistics profession is at a unique point in history. The need for valid statistical tools is greater than ever; data sets are massive, often containing hundreds of thousands of measurements for a single subject. The field is ready to move towards clear objective benchmarks under which tools can be evaluated. Targeted learning allows (1) the full generalization and utilization of cross-validation as an estimator selection tool so that the subjective choices made by humans are now made by the machine, and (2) targeting the fitting of the probability distribution of the data toward the target parameter representing the scientific question of interest.
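As a rough illustration of point (1), the sketch below uses V-fold cross-validation to choose among candidate estimators by their cross-validated squared-error risk. It is not taken from the book: the three candidate learners, the simulated data, and all function names are assumptions made for this example. It selects the single best candidate (often called a discrete super learner), whereas a full super learner would form a weighted combination of the candidates.

```python
import random

# Hypothetical candidate learners: each takes training data and returns
# a prediction function. None of these names come from the book.
def fit_mean(xs, ys):
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_linear(xs, ys):
    # Simple least-squares line in one covariate.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def fit_knn(xs, ys, k=5):
    # Average outcome of the k nearest training points.
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

def cv_risk(fit, xs, ys, n_folds=5):
    # Cross-validated mean squared error: fit on the training folds,
    # evaluate on the held-out fold, and average over all observations.
    n = len(xs)
    total = 0.0
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        valid = [i for i in range(n) if i % n_folds == fold]
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        total += sum((ys[i] - predict(xs[i])) ** 2 for i in valid)
    return total / n

def select_estimator(candidates, xs, ys, n_folds=5):
    # The candidate with the smallest cross-validated risk is chosen
    # by the data rather than by the analyst.
    risks = {name: cv_risk(fit, xs, ys, n_folds)
             for name, fit in candidates.items()}
    best = min(risks, key=risks.get)
    return best, risks

# Simulated data with a nonlinear signal (an assumption for the demo).
rng = random.Random(0)
xs = [rng.uniform(-2, 2) for _ in range(200)]
ys = [x * x + rng.gauss(0, 0.1) for x in xs]

candidates = {"mean": fit_mean, "linear": fit_linear, "knn": fit_knn}
best, risks = select_estimator(candidates, xs, ys)
print(best, risks)
```

Because the simulated signal is quadratic, the constant and linear candidates cannot capture it, and the cross-validated risk steers the choice toward the flexible learner without any manual model selection.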

Our first book, Targeted Learning, is aimed at both statisticians and applied researchers interested in causal inference and general effect estimation for observational and experimental data. Part I is an accessible introduction to super learning and the targeted maximum likelihood estimator, including related concepts necessary to understand and apply these methods. Parts II-IX handle complex data structures and topics applied researchers will immediately recognize from their own research, including time-to-event outcomes, direct and indirect effects, positivity violations, case-control studies, censored data, and genomic studies. 
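To make the idea of a targeted maximum likelihood estimator more concrete, here is a minimal sketch for the simplest point-treatment case: the average treatment effect of a binary treatment A on a binary outcome Y, adjusting for one binary covariate W. It is an illustration under stated assumptions, not the book's code: the simulated data, the deliberately crude initial outcome fit, the stratum-wise treatment mechanism, and every function name are made up for this example, and inference (standard errors) is omitted.

```python
import math
import random

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    return math.log(p / (1.0 - p))

def mean(values):
    return sum(values) / len(values)

def tmle_ate(W, A, Y, newton_steps=25):
    # Step 1: initial outcome fit Q(A, W). Deliberately misspecified here:
    # the mean of Y within each treatment arm, ignoring W.
    q_arm = {a: mean([y for ai, y in zip(A, Y) if ai == a]) for a in (0, 1)}
    # Step 2: treatment mechanism g(W) = P(A = 1 | W), estimated by the
    # empirical proportion treated within each stratum of W.
    g_w = {w: mean([a for wi, a in zip(W, A) if wi == w]) for w in set(W)}
    # Clever covariate H(A, W) = A / g(W) - (1 - A) / (1 - g(W)).
    H = [a / g_w[w] - (1 - a) / (1 - g_w[w]) for w, a in zip(W, A)]
    offset = [logit(q_arm[a]) for a in A]
    # Step 3: targeting step. Fit the one-dimensional logistic fluctuation
    # logit Q*(A, W) = logit Q(A, W) + eps * H(A, W) by Newton's method.
    eps = 0.0
    for _ in range(newton_steps):
        p = [expit(o + eps * h) for o, h in zip(offset, H)]
        score = sum(h * (y - pi) for h, y, pi in zip(H, Y, p))
        info = sum(h * h * pi * (1 - pi) for h, pi in zip(H, p))
        eps += score / info
    # Step 4: plug the updated fit into the target parameter.
    q1 = [expit(logit(q_arm[1]) + eps / g_w[w]) for w in W]
    q0 = [expit(logit(q_arm[0]) - eps / (1 - g_w[w])) for w in W]
    targeted = mean(q1) - mean(q0)
    initial = q_arm[1] - q_arm[0]  # plug-in from the untargeted fit
    return targeted, initial

def stratified_ate(W, A, Y):
    # Nonparametric adjustment: difference in arm means within each
    # stratum of W, weighted by stratum size. Used only as a check.
    total = 0.0
    for w in set(W):
        y1 = [y for wi, a, y in zip(W, A, Y) if wi == w and a == 1]
        y0 = [y for wi, a, y in zip(W, A, Y) if wi == w and a == 0]
        total += (len(y1) + len(y0)) / len(Y) * (mean(y1) - mean(y0))
    return total

# Simulated confounded data (assumed for the demo); the true effect is 0.2.
rng = random.Random(1)
W = [1 if rng.random() < 0.5 else 0 for _ in range(5000)]
A = [1 if rng.random() < (0.7 if w else 0.3) else 0 for w in W]
Y = [1 if rng.random() < 0.2 + 0.2 * a + 0.4 * w else 0
     for w, a in zip(W, A)]

targeted, initial = tmle_ate(W, A, Y)
print(initial, targeted, stratified_ate(W, A, Y))
```

In this toy setting the initial fit ignores the confounder and is biased, while the targeting step pulls the estimate back to the stratified, confounder-adjusted answer. In practice the initial fits would typically come from a flexible learner such as a super learner rather than from these hand-built stand-ins.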

From the Back Cover of Targeted Learning

"Targeted Learning, by Mark J. van der Laan and Sherri Rose, fills a much needed gap in statistical and causal inference. It protects us from wasting computational, analytical, and data resources on irrelevant aspects of a problem and teaches us how to focus on what is relevant – answering questions that researchers truly care about."  -Judea Pearl, author of "Causality" and professor of computer science at UCLA.

"In summary, this book should be on the shelf of every investigator who conducts observational research and randomized controlled trials. The concepts and methodology are foundational for causal inference and at the same time stay true to what the data at hand can say about the questions that motivate their collection." -Ira B. Tager, professor emeritus of epidemiology at UC Berkeley  

Our second book, Targeted Learning in Data Science, is a sequel that builds on our first book. Targeted learning methods are critical tools within data science for answering complex statistical questions, including estimands in networks and longitudinal data with time-dependent confounding. We present a scientific roadmap to translate these real-world data science applications into formal statistical estimation problems. This is accomplished using the general template of targeted maximum likelihood estimators to construct algorithms that incorporate the state of the art in machine learning for estimation, while still providing valid inference. Standard tools are not currently equipped for these challenges. We include demonstrations with software packages and real data sets, as well as new methodological advances since the publication of the first targeted learning book.