This article introduces six carefully selected books for systematically learning about causality from beginner to advanced levels. Each book has distinct strengths in theory, real-world examples, hands-on practice, and coding, and three of them are available for free. The content is organized as a comprehensive roadmap for beginners, practitioners, and researchers in the field of causal inference.


1. Introduction: For Those Looking to Enter Causal Inference

In recent years, interest in causal inference has grown significantly in both research and industry. However, newcomers to the field often experience confusion due to the lack of unified resources and terminology. This is because causal research has developed across various sub-fields over the past decades. Many beginners feel that "it's so vast and I don't know where to start."

The author faced the same difficulty at first. Speaking from personal experience, and having since written a causal inference book, the author credits these six books as instrumental in that journey.

"The six books I introduce here played a major role in systematically organizing and accelerating my causal inference journey. I hope they help you in the same way!"

Here's a clear overview of each book's features, key learning points, and where to get them (including free options).


2. First Step: "The Book of Why" -- Understanding the Power of Why

The Book of Why

This book is an introductory text co-authored by Judea Pearl, the "godfather of modern causal inference," and Dana Mackenzie. With its harmonious blend of history, theory, storytelling, mathematics, and real-world applications, it is widely regarded as the perfect introduction to causal inference.

At its core, the book emphasizes the importance of asking "why" and clearly explains foundational concepts such as do-calculus and the Ladder of Causation. You might think, "Wait, do-calculus? That sounds unfamiliar..." but the book makes it accessible through easy examples and comparisons.

Additionally, readers can naturally grasp the differences and applications of the two major frameworks of causal inference: the do-calculus framework and the potential outcomes framework.

"This book is really well structured and enjoyable to read. It covers the history and core theories of causal inference as well as real-world applications."

Key Topics

  • History of causal inference
  • Ladder of Causation
  • Fundamentals of do-calculus
  • Concepts of the potential outcomes framework
  • Applications of causal reasoning in everyday life and science
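The gap between the first two rungs of the Ladder of Causation, seeing versus doing, can be made concrete with a tiny model. The sketch below is not taken from the book. It assumes a hypothetical three-variable system in which Z confounds both X and Y, and every probability in it is invented for illustration. It computes P(Y=1 | X=x) by conditioning and P(Y=1 | do(X=x)) by cutting the Z -> X arrow, using exact enumeration rather than simulation.

```python
# Hypothetical structural model (all numbers are illustrative assumptions):
#   Z ~ Bernoulli(0.5)                 confounder
#   X ~ Bernoulli(0.8 if Z else 0.2)   treatment, influenced by Z
#   Y ~ Bernoulli(0.1 + 0.2*X + 0.6*Z) outcome, influenced by X and Z
P_Z = {0: 0.5, 1: 0.5}


def p_x_given_z(x, z):
    """P(X=x | Z=z) under the assumed model."""
    p_x1 = 0.8 if z == 1 else 0.2
    return p_x1 if x == 1 else 1.0 - p_x1


def p_y1_given_xz(x, z):
    """P(Y=1 | X=x, Z=z) under the assumed model."""
    return 0.1 + 0.2 * x + 0.6 * z


def p_y1_seeing(x):
    """Rung 1: P(Y=1 | X=x), obtained by ordinary conditioning."""
    joint = sum(P_Z[z] * p_x_given_z(x, z) * p_y1_given_xz(x, z) for z in (0, 1))
    marginal = sum(P_Z[z] * p_x_given_z(x, z) for z in (0, 1))
    return joint / marginal


def p_y1_doing(x):
    """Rung 2: P(Y=1 | do(X=x)); X is set by hand, so Z keeps its own distribution."""
    return sum(P_Z[z] * p_y1_given_xz(x, z) for z in (0, 1))


if __name__ == "__main__":
    print("seeing:", p_y1_seeing(1) - p_y1_seeing(0))
    print("doing: ", p_y1_doing(1) - p_y1_doing(0))
```

In this toy setup the observed difference overstates the effect of X, because units with X=1 are also more likely to have Z=1. Intervening removes that dependence and recovers the 0.2 that was built into the model.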

Where to Get It

  • Available in paperback, Kindle, and audiobook (English)

3. Going Deeper: "Causal Inference in Statistics -- A Primer"

Causal Inference in Statistics: A Primer

After reading "The Book of Why," if you're looking for a more in-depth, practice-oriented book, this one by Pearl, Glymour, and Jewell is the perfect fit. It was also directly recommended by the author on Twitter.

At just over 120 pages, it's slim but packed with core concepts and exercises. The book is divided into four main parts:

  1. Review of basic probability and statistics
  2. Introduction to graphical models
  3. Working with interventions
  4. Exploring counterfactuals

It helps readers focus on the essence of causal inference without complex mathematics or estimation methods. Furthermore, it broadly covers relatively advanced concepts including mediation, direct/indirect effects, and probabilities of necessity and sufficiency.

"This book is structured so you can work through graphical models, various intervention criteria, and counterfactuals by hand. Following along with the exercises builds a solid foundation in causal inference!"

Key Topics

  • Graphical models
  • Interpreting interventions through graphs
  • Back-door/front-door criteria, inverse probability weighting
  • Counterfactual reasoning
  • Mediation effects and probabilities of necessity and sufficiency
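Two of the topics above, back-door adjustment and inverse probability weighting, can be worked by hand on a small table. The sketch below is not from the book. It uses hypothetical counts for a binary confounder Z, treatment X, and outcome Y, and it assumes that Z alone satisfies the back-door criterion. With a fully stratified (nonparametric) estimate, both methods should give the same answer, while the naive comparison does not.

```python
# Hypothetical toy data: (z, x, y) -> number of records. Illustrative only.
COUNTS = {
    (0, 0, 0): 72, (0, 0, 1): 8,
    (0, 1, 0): 14, (0, 1, 1): 6,
    (1, 0, 0): 6,  (1, 0, 1): 14,
    (1, 1, 0): 8,  (1, 1, 1): 72,
}
# Expand the counts into one (z, x, y) row per record.
ROWS = [key for key, n in COUNTS.items() for _ in range(n)]
N = len(ROWS)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive(x):
    """E[Y | X=x], ignoring the confounder."""
    return mean(y for _, xi, y in ROWS if xi == x)


def backdoor(x):
    """Back-door adjustment: sum_z P(z) * E[Y | X=x, Z=z]."""
    total = 0.0
    for z in (0, 1):
        p_z = sum(1 for zi, _, _ in ROWS if zi == z) / N
        total += p_z * mean(y for zi, xi, y in ROWS if zi == z and xi == x)
    return total


def propensity(x, z):
    """Empirical P(X=x | Z=z)."""
    stratum = [xi for zi, xi, _ in ROWS if zi == z]
    return sum(1 for xi in stratum if xi == x) / len(stratum)


def ipw(x):
    """Inverse probability weighting: (1/N) * sum of Y / P(X=x | Z) over rows with X=x."""
    return sum(y / propensity(x, z) for z, xi, y in ROWS if xi == x) / N


if __name__ == "__main__":
    print("naive:    ", naive(1) - naive(0))
    print("back-door:", backdoor(1) - backdoor(0))
    print("IPW:      ", ipw(1) - ipw(0))
```

The agreement between the two adjusted estimates here follows from estimating the propensities within the same strata used for adjustment. With modeled propensities or continuous covariates, the two would generally differ.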

Where to Get It

  • Available in paperback and Kindle

4. Expanding Horizons: "Elements of Causal Inference"

Elements of Causal Inference

This book goes a step further by covering causal discovery. Rather than only asking "Does A affect B?", it teaches methods for recovering from data how a set of variables is causally connected.

The authors use the concept of causal inference in a broad sense, encompassing diverse topics such as:

  • Statistical models vs. causal models, and their differences
  • Causal inference/discovery theory in bivariate and multivariate models
  • Extended applications including semi-supervised learning, reinforcement learning, domain adaptation, and causal discovery in time series

This book focuses on theoretical principles and connections to machine learning rather than practical examples. While the mathematical depth increases, each part helps readers fundamentally understand "why things work this way."

"As the authors themselves note, this is not a comprehensive handbook but a distinctive textbook reflecting the authors' 'personal preferences.' Recommended for those curious about causal structure discovery as opposed to classical causal inference!"

Key Topics

  • Causal discovery (structure learning) theory
  • Bivariate and multivariate causal structures
  • Causality and reinforcement learning
  • Causality and time series

Where to Get It

  • Available in paperback, Kindle, and free PDF (officially provided)!

5. Depth and Rigor: "Causality -- Models, Reasoning and Inference"

Causality: Models, Reasoning and Inference

At over 400 pages, Judea Pearl's textbook comes close to being the definitive compendium of causal inference theory. It comprehensively covers:

  • Graphical models and d-separation principles
  • Causal Bayesian networks and structural causal models
  • Structural equation models (SEM), model testing, and prerequisites for causal inference
  • The complete set of do-calculus rules
  • Advanced analysis of interventions and counterfactuals, probabilities of causation, and more

Parts of the book briefly cover causal structure learning algorithms, and comparisons between the graphical approach and the "potential outcomes" approach help build understanding. The final 60 or so pages contain in-depth insights through dialogues and discussions with readers.

"It's hard to find causal inference material this comprehensive and deep. It's an excellent reference textbook for causal inference!"

Key Topics

  • Prerequisites for causal inference
  • Advanced do-calculus analysis
  • Causal discovery (structure learning, partial coverage)
  • Probabilities of causation, advanced interventions/counterfactuals

Where to Get It

  • Available in paperback and Kindle

6. Practice and Code: "Causal Inference -- The Mixtape"

Causal Inference: The Mixtape

If you want something more practical and focused on real-world cases, this book is ideal. It provides extensive real-world causal inference examples and practice code (Stata, R, Python) from diverse fields including economics, social policy, and epidemiology. The author frequently includes hip-hop quotes, making it fun to read as well.

The main content centers on popular causal inference methods in modern econometrics. For example, it explains concretely how natural experiments and related designs (regression discontinuity, instrumental variables, difference-in-differences, synthetic control, and others) are applied in practice. It strikes a good balance between math and practice: not too hard, not too simple.

"Quoting Chance the Rapper's lyrics to explain how to find instrumental variables -- you can experience learning and fun at the same time!"

Key Topics

  • Natural experiments
  • Potential outcomes framework
  • Regression discontinuity, instrumental variables, difference-in-differences, synthetic control methods
  • Practice with real data and code
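As a flavor of the methods listed above, the simplest version of difference-in-differences fits in a few lines. This sketch is not code from the book. The outcome values are hypothetical, and the estimate is only meaningful under the parallel-trends assumption, namely that the treated group would have changed by the same amount as the control group had it not been treated.

```python
from statistics import mean

# Hypothetical outcomes for a two-group, two-period design. Illustrative only.
OUTCOMES = {
    ("treated", "before"): [10.0, 12.0, 11.0],
    ("treated", "after"): [15.0, 17.0, 16.0],
    ("control", "before"): [8.0, 9.0, 10.0],
    ("control", "after"): [10.0, 11.0, 12.0],
}


def did_estimate(outcomes):
    """Two-by-two difference-in-differences on group means.

    (treated after - treated before) - (control after - control before)
    """
    change_treated = mean(outcomes[("treated", "after")]) - mean(
        outcomes[("treated", "before")]
    )
    change_control = mean(outcomes[("control", "after")]) - mean(
        outcomes[("control", "before")]
    )
    return change_treated - change_control


if __name__ == "__main__":
    print("DiD estimate:", did_estimate(OUTCOMES))
```

Subtracting the control group's change removes the shared time trend, so what remains is attributed to the treatment. In practice the same quantity is usually estimated with a regression that includes a group-by-period interaction, which also yields standard errors.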

Where to Get It

  • Paperback, Kindle, and free online book (fully open on the web)

7. An Integrated Framework: "What If?"

Causal Inference: What If?

Finally, this book by Harvard's Miguel Hernan and James Robins achieves the best balance between the two major pillars of causal inference (graph-based and potential outcomes-based).

It is structured in three parts:

  • Causal inference without models
  • Causal inference with models
  • Causal inference in complex longitudinal (time series) data

It particularly covers interaction, selection bias, structural nested models, causal survival analysis, time-varying treatment, and more with real data examples.

Most importantly, it comes with code and examples in multiple languages including SAS, Stata, R, Python, and Julia, making it very helpful for practitioners.

"A well-balanced textbook where graphical and potential outcomes frameworks come together harmoniously, and the best data science primer that lets you practice in multiple languages."

Key Topics

  • Interaction, selection bias
  • Structural nested models
  • Survival analysis
  • Causal effects of time-varying treatments

Where to Get It

  • Free PDF; as of 2024, the printed book is expected to be released soon

Closing Thoughts

We've reviewed six books that guide you through the entire causal inference journey from beginner to expert, arranged in reading order and by increasing difficulty. Don't forget that three of these books are available for free, with unrestricted access online or as PDFs.

Wishing you the best on your causal inference journey! If you have questions or thoughts, feel free to reach out to the author via comments or LinkedIn.

"For more resources on causal inference, causal machine learning, and Python practice, visit causalpython.io. Sign up to receive new book updates and free content by email!"


