5.2.1 dags
TRANSCRIPT
Causal Inference in Epidemiology Part 2
PH 250B Fall 2016
Jade Benjamin-Chung, PhD MPH
Colford-Hubbard Research Group
Adapted from Professor Jen Ahern’s 250B slides
Outline
1. What does causal inference entail?
2. Using directed acyclic graphs
   a. DAG basics
   b. Identifying confounding
   c. Understanding selection bias
3. Causal perspective on effect modification
   a. Brief recap of effect modification (EM)
   b. Linking EM in our studies to reality
   c. Types of interaction
   d. Causal interaction / EM
      1. Sufficient cause model (“causal pies”)
      2. Potential outcomes model (“causal types”)
   e. Choosing which measure of interaction to estimate and report
4. Integrating causal concepts into your research
Reality of study design
• We often don’t have ideal data on our population of interest
• The data we collect are incomplete
• Statistics can help us understand correlations or associations between exposures and outcomes
• Typically what we really want to know is if the exposure causes the outcome
What does causal inference entail?
• Careful definition of our estimation goals
• A set of assumptions that allow us to link our observed data to ideal data that would be used to reach our goals
• Causal inference techniques help us
   – Express assumptions about our data in a transparent, mathematical form
   – Provide us with mathematical steps to translate assumptions into quantities that can be estimated with observed data
Pearl, Glymour & Jewell, 2016
Causal inference in your research
1. Define research hypothesis
   – Your hypothesis can include possible effect modification
   – Determine to what extent you aim to make causal inferences using your data
2. Determine study design (trial, cohort, etc.)
3. Draw a DAG
   a. Identify potential confounders
   b. Choose which variables to measure
4. Analyze your data
   a. Control for confounders identified in step 3
   b. Assess effect modification on the additive or multiplicative scale
   c. Make statistical inferences
5. Make scientific inferences about your hypothesis
Causal diagrams as mathematical language
“Graphical methods now provide a powerful symbolic machinery for deriving the consequences of causal assumptions when such assumptions are combined with statistical data.”
Pearl J, 2009, Causality
Directed Acyclic Graphs (DAGs)
• Visually depict assumptions about causal relationships between exposures, outcomes, and other variables
   – Depicts the “data generating process”
• DAGs depict our knowledge (or beliefs) about the “data generating process”
• DAGs are informed by subject matter knowledge, prior research, and a priori hypotheses
• Learning curve on terminology and approach: practice helps! DAGs can be a very useful tool once you are comfortable with them
How can we use DAGs?
Generally
• Document assumptions about cause-effect relationships
• Explore implications of those assumptions
• Assess how to make causal inferences from both one’s data and one’s assumptions
Today
• To understand selection bias
• To identify confounding
Pearl J, 2009, Causality
DAG construction
• Direct causal relationships between variables are represented by arrows
   – Directed
   – All causal relationships have a direction
   – A given variable cannot be simultaneously a cause and an effect
[Figure: example DAG with nodes SES and Prenatal Care]
DAG construction
• There are no feedback loops
   – Acyclic
   – Causes always precede their effects
   – To avoid feedback loops, extend graph over time
[Figure: Malnutrition (M) and Infection (I), extended over time as nodes M (t=0), M (t=1), I (t=0), I (t=1)]
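The "no feedback loops" rule can be checked mechanically. The sketch below is a minimal illustration, not part of the original slides: it uses a depth-first search to detect cycles and compares a Malnutrition/Infection feedback loop with a time-extended version. The specific arrows in the time-extended graph are assumptions, since the slide figure is not recoverable from the transcript.

```python
from typing import Dict, List

Graph = Dict[str, List[str]]  # node -> list of its children (direct effects)


def has_cycle(graph: Graph) -> bool:
    """Return True if the directed graph contains a feedback loop."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    nodes = set(graph) | {c for kids in graph.values() for c in kids}
    color = {node: WHITE for node in nodes}

    def visit(node: str) -> bool:
        color[node] = GRAY
        for child in graph.get(node, []):
            if color[child] == GRAY:  # arrow leads back onto the current path
                return True
            if color[child] == WHITE and visit(child):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in nodes)


# Malnutrition and infection causing each other: not a valid DAG.
feedback = {"M": ["I"], "I": ["M"]}

# Time-extended version (assumed arrows): each variable at t=0 affects
# both variables at t=1, so causes always precede their effects.
unrolled = {"M_t0": ["M_t1", "I_t1"], "I_t0": ["I_t1", "M_t1"]}

if __name__ == "__main__":
    print("feedback graph has a cycle:", has_cycle(feedback))
    print("time-extended graph has a cycle:", has_cycle(unrolled))
```

Indexing each variable by time removes the loop because every arrow now points forward in time.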
DAG terminology
• Parent & Child:
   – Directly connected by an arrow
   – Prenatal care is a “parent” of birth defects
   – Birth defects is a “child” of prenatal care
[Figure: DAG with nodes Vitamins, Birth Defects, Prenatal Care, Difficulty conceiving, SES, Maternal genetics]
DAG terminology
• Ancestor & Descendant:
   – Connected by a directed path of a series of arrows
   – SES is an “ancestor” of Birth Defects
   – Birth Defects is a “descendant” of SES
[Figure: DAG with nodes Vitamins, Birth Defects, Prenatal Care, Difficulty conceiving, SES, Maternal genetics]
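The parent/child and ancestor/descendant vocabulary maps directly onto simple graph lookups. The following is a rough sketch under stated assumptions: the slides only confirm that Prenatal Care is a parent of Birth Defects and that a directed path runs from SES to Birth Defects, so the remaining arrows in `EDGES` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set

# node -> children. Only "Prenatal Care -> Birth Defects" and a directed path
# from SES to Birth Defects come from the slides; other arrows are assumed.
EDGES: Dict[str, List[str]] = {
    "SES": ["Prenatal Care"],
    "Difficulty conceiving": ["Prenatal Care"],
    "Maternal genetics": ["Difficulty conceiving", "Birth Defects"],
    "Prenatal Care": ["Vitamins", "Birth Defects"],
    "Vitamins": ["Birth Defects"],
    "Birth Defects": [],
}


def children(graph: Dict[str, List[str]], node: str) -> Set[str]:
    """Nodes that `node` points to directly."""
    return set(graph.get(node, []))


def parents(graph: Dict[str, List[str]], node: str) -> Set[str]:
    """Nodes with an arrow pointing directly into `node`."""
    return {p for p, kids in graph.items() if node in kids}


def descendants(graph: Dict[str, List[str]], node: str) -> Set[str]:
    """Nodes reachable from `node` by following arrows forward."""
    found: Set[str] = set()
    stack = list(graph.get(node, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(graph.get(current, []))
    return found


def ancestors(graph: Dict[str, List[str]], node: str) -> Set[str]:
    """Nodes from which `node` is reachable along a directed path."""
    return {other for other in graph if node in descendants(graph, other)}


if __name__ == "__main__":
    print("parents of Birth Defects:", sorted(parents(EDGES, "Birth Defects")))
    print("ancestors of Birth Defects:", sorted(ancestors(EDGES, "Birth Defects")))
```

In this assumed graph, SES is an ancestor but not a parent of Birth Defects, because its effect passes through Prenatal Care.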
DAG assumptions
• Absence of a directed path from X to Y implies X has no effect on Y
   – Directed paths not in the graph are as important as those in the graph
• Note: Not all intermediate steps between two variables need to be represented
   – Depends on level of detail of the model
[Figure: the same effect at two levels of detail, Smoking → Cancer and Smoking → Tar → Mutations → Cancer]
DAG assumptions
• DAGs assume that all common causes of exposure and disease are included
   – Common causes that are not observed should still be included
   – These are often denoted with a “U” to indicate they were unmeasured
[Figure: DAG with nodes U (religious beliefs, culture, lifestyle, etc.), Alcohol Use, Smoking, Heart Disease]
Example
[Figure: a DAG for bicycle falls built up over several slides. Nodes, in order of appearance: Speed, Bicycle Fall; Bicycle Characteristics, Road/Lane/Path Surface, Bicycle Traffic, Road Grade, Car Traffic; Rider Skill/Experience; Populace Bicycle Awareness, Bicycle Lane/Path]
What are some assumptions we are making?
• Bicycle lane/path only has an effect on bicycle falls through its effect on bicycle traffic
• Road surface does not affect bicycle traffic
• All common causes of speed and bicycle fall are included (even those unmeasured)
Statistical underpinnings of DAGs
• Multiple possible causal models for this DAG:
[Figure: chain DAG X → Y → Z, with error terms U_X → X, U_Y → Y, U_Z → Z]

X = School funding
Y = SAT Scores
Z = College Acceptance
X = U_X
Y = (x/3) + U_Y
Z = (y/16) + U_Z

X = Number of hours worked per week
Y = Number of training hours per week
Z = Race completion time
X = U_X
Y = 84 - x + U_Y
Z = (100/y) + U_Z

Pearl, Glymour & Jewell, 2016
Statistical underpinnings of DAGs
• Both models share the same statistical relationships:
   • Z and Y are dependent
   • Y and X are dependent
   • Z and X are likely dependent
   • Z and X are independent conditional on Y
[Figure: chain DAG X → Y → Z, with error terms U_X → X, U_Y → Y, U_Z → Z]
X = School funding
Y = SAT Scores
Z = College Acceptance
X = U_X
Y = (x/3) + U_Y
Z = (y/16) + U_Z
Pearl, Glymour & Jewell, 2016
Statistical underpinnings of DAGs
• Both models share the same statistical relationships. For specific values of these variables (lower case x, y, z):
   • Z and Y are dependent
   • Y and X are dependent
   • Z and X are likely dependent
   • Z and X are independent conditional on Y = y
[Figure: chain DAG X → Y → Z, with error terms U_X → X, U_Y → Y, U_Z → Z]
Pearl, Glymour & Jewell, 2016
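These dependence statements can be checked by simulation. The sketch below is illustrative only: it draws data from the school funding model (X = U_X, Y = x/3 + U_Y, Z = y/16 + U_Z) with Gaussian error terms whose standard deviations (30, 5, 0.5) are assumptions made here, not values from the slides. Because this model is linear with Gaussian errors, a partial correlation near zero is used as a stand-in for "Z and X are independent conditional on Y".

```python
import random
from typing import List, Tuple


def simulate_chain(n: int, seed: int = 0) -> Tuple[List[float], List[float], List[float]]:
    """Draw n samples from X -> Y -> Z using the school funding equations."""
    rng = random.Random(seed)
    xs, ys, zs = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 30)           # X = U_X        (error sd assumed)
        y = x / 3 + rng.gauss(0, 5)    # Y = x/3 + U_Y  (error sd assumed)
        z = y / 16 + rng.gauss(0, 0.5)  # Z = y/16 + U_Z (error sd assumed)
        xs.append(x)
        ys.append(y)
        zs.append(z)
    return xs, ys, zs


def corr(a: List[float], b: List[float]) -> float:
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


def partial_corr(a: List[float], b: List[float], given: List[float]) -> float:
    """Correlation of a and b after linearly adjusting both for `given`."""
    r_ab, r_ag, r_bg = corr(a, b), corr(a, given), corr(b, given)
    return (r_ab - r_ag * r_bg) / ((1 - r_ag ** 2) * (1 - r_bg ** 2)) ** 0.5


if __name__ == "__main__":
    xs, ys, zs = simulate_chain(20000)
    print("corr(Y, X):", round(corr(ys, xs), 3))
    print("corr(Z, Y):", round(corr(zs, ys), 3))
    print("corr(Z, X):", round(corr(zs, xs), 3))
    print("corr(Z, X | Y):", round(partial_corr(zs, xs, ys), 3))
```

The race completion model is left out of this sketch because Z = 100/y is undefined at y = 0, so it would need bounded error terms.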
Conditioning on a variable in a DAG
• “Conditioning” on a variable means filtering the data into groups based on the value of a variable
• A box is often used around a variable to denote that it is being conditioned on (e.g., in this DAG we condition on Y)
• This is equivalent to stratifying the data or controlling for a variable in a statistical model
[Figure: chain DAG X → Y → Z with error terms U_X, U_Y, U_Z; Y is boxed]
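Conditioning as "filtering the data into groups" can be shown with a small stratified analysis. This is a sketch under assumed numbers: a binary chain X → Y → Z is simulated with made-up probabilities, and the risk difference for Z by X is computed first in the whole data set and then within each stratum of Y.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[int, int, int]  # (x, y, z)


def simulate_binary_chain(n: int, seed: int = 1) -> List[Row]:
    """Binary chain X -> Y -> Z; all probabilities are illustrative assumptions."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = int(rng.random() < 0.5)
        y = int(rng.random() < (0.8 if x else 0.2))  # Y depends only on X
        z = int(rng.random() < (0.7 if y else 0.1))  # Z depends only on Y
        rows.append((x, y, z))
    return rows


def risk_difference(rows: List[Row]) -> float:
    """P(Z=1 | X=1) - P(Z=1 | X=0) among the given rows."""
    totals: Dict[int, int] = defaultdict(int)
    events: Dict[int, int] = defaultdict(int)
    for x, _, z in rows:
        totals[x] += 1
        events[x] += z
    return events[1] / totals[1] - events[0] / totals[0]


def stratified_risk_differences(rows: List[Row]) -> Dict[int, float]:
    """Condition on Y: compute the risk difference within each level of Y."""
    return {y: risk_difference([r for r in rows if r[1] == y]) for y in (0, 1)}


if __name__ == "__main__":
    data = simulate_binary_chain(40000)
    print("crude risk difference:", round(risk_difference(data), 3))
    for level, rd in stratified_risk_differences(data).items():
        print(f"risk difference within Y={level}:", round(rd, 3))
```

In this chain, X and Z are associated in the full data, but the association should be close to zero inside each Y stratum.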
DAG configurations
• Chain
• Fork
• Collider*
* Has special considerations and challenges
[Figure: three-node DAGs over X, Y, Z illustrating each configuration]
Pearl, Glymour & Jewell, 2016
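The three configurations differ only in how the arrows meet at the middle node. As a small sketch (the function name `classify_triple` and the edge representation are made up for illustration), the standard definitions can be written as: chain A → B → C, fork A ← B → C, collider A → B ← C.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]  # (cause, effect)


def classify_triple(edges: Set[Edge], a: str, b: str, c: str) -> str:
    """Classify the path a - b - c by the arrows that meet at the middle node b."""
    a_to_b, b_to_a = (a, b) in edges, (b, a) in edges
    b_to_c, c_to_b = (b, c) in edges, (c, b) in edges
    if a_to_b and c_to_b:
        return "collider"  # both arrows point into b
    if b_to_a and b_to_c:
        return "fork"      # b is a common cause of a and c
    if (a_to_b and b_to_c) or (c_to_b and b_to_a):
        return "chain"     # b lies on a directed path between a and c
    return "not a path"


if __name__ == "__main__":
    print(classify_triple({("X", "Y"), ("Y", "Z")}, "X", "Y", "Z"))
    print(classify_triple({("Y", "X"), ("Y", "Z")}, "X", "Y", "Z"))
    print(classify_triple({("X", "Y"), ("Z", "Y")}, "X", "Y", "Z"))
```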
Colliders
[Figure: DAG Diet → BMI ← Cancer]
In this example, BMI is a collider
Colliders
Conditioning on BMI induces an association between diet and cancer
[Figure: DAG Diet → BMI ← Cancer]
Among those who have had a BMI decrease there will be larger numbers of dieters and larger numbers of people with cancer than in the total population
Colliders
[Figure: collider DAG X → Z ← Y]
Why does conditioning on a collider induce an association between its parents?
Example: Z = X + Y
Do not condition on Z:
• X = 3 → know nothing about Y
Pearl, Glymour & Jewell, 2016
Colliders
[Figure: collider DAG X → Z ← Y]
Why does conditioning on a collider induce an association between its parents?
Example: Z = X + Y
Do not condition on Z:
• X = 3 → know nothing about Y
Condition on Z:
• Z = 10, X = 3 → know Y = 7
Thus, X and Y are dependent given that (“conditional on”) Z = 10
Pearl, Glymour & Jewell, 2016
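The Z = X + Y example can be reproduced with a few lines of simulation. In this sketch, X and Y are drawn independently as integers from 1 to 9 (an assumption made for illustration), and the correlation between them is compared in the full data and among rows where Z = 10.

```python
import random
from typing import List, Tuple


def simulate_collider(n: int, seed: int = 0) -> List[Tuple[int, int, int]]:
    """Independent X and Y (uniform on 1..9, assumed) with collider Z = X + Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.randint(1, 9)
        y = rng.randint(1, 9)
        rows.append((x, y, x + y))
    return rows


def corr(a: List[float], b: List[float]) -> float:
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


def xy_correlation(rows: List[Tuple[int, int, int]]) -> float:
    """Correlation between X and Y among the given rows."""
    return corr([r[0] for r in rows], [r[1] for r in rows])


if __name__ == "__main__":
    rows = simulate_collider(20000)
    conditioned = [r for r in rows if r[2] == 10]  # condition on Z = 10
    print("corr(X, Y), not conditioning on Z:", round(xy_correlation(rows), 3))
    print("corr(X, Y) given Z = 10:", round(xy_correlation(conditioned), 3))
```

Within the Z = 10 stratum, every row satisfies Y = 10 - X, so knowing X pins down Y exactly even though the two were generated independently.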
Strengths of DAGs
• Can determine which variables depend on each other in our causal model without knowing the specific functions (e.g., Z=X+Y in the previous slide) connecting them (Pearl, Glymour & Jewell, 2016)
• Allow us to link our causal model to our statistical relationships in our data
• DAGs can incorporate measurement error as well (Hernan & Cole, 2009)
Limitations of DAGs
• Cannot display effect modification easily (example of road surface)
• Arrows in graphs do not provide specific definitions of effects (contrast with counterfactuals)
• Can become extremely complicated when representing real data structures
• Are not designed to capture effects of infectious disease interventions that may impact not only intervention recipients but also non-recipients (e.g., herd effects of vaccines)
DAG limitations
[Figure: example of an extremely complicated DAG]
DAGs
• DAG itself is not used to analyze data from the study you’ve conducted
   – Informs how study is designed/data are collected
   – Informs how data are analyzed
   – Helps identify which research questions are answerable in a given data set
• Utility of DAGs dependent on accuracy/correctness of associations we represent in the diagram
Non-parametric structural equation models
• Non-parametric structural equation models (NPSEMs) provide a link between DAGs and counterfactuals and are a way to analyze data
• They encode relationships between variables that can include many possible equations and functional forms
• Non-parametric estimation used to avoid assumptions of typical SEMs (e.g., linearity)
• Learn more about this topic in PH252D
Example of NPSEM
[Figure: chain DAG X → Y → Z, with error terms U_X → X, U_Y → Y, U_Z → Z]
Previous example:
X = School funding
Y = SAT Scores
Z = College Acceptance
X = U_X
Y = (x/3) + U_Y
Z = (y/16) + U_Z
Pearl, Glymour & Jewell, 2016
NPSEM:
X = School funding
Y = SAT Scores
Z = College Acceptance
X = f_X(U_X)
Y = f_Y(X, U_Y)
Z = f_Z(Y, U_Z)
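One way to read the NPSEM is as a template in which only the arguments of each function are fixed, not its form. The sketch below is an illustration, not a method from the course: `evaluate_chain`, `school_model`, and `race_model` are hypothetical names, and the two parametric models from the earlier slides are plugged in as particular choices of f_X, f_Y, and f_Z.

```python
from typing import Callable, Dict


def evaluate_chain(
    f_x: Callable[[float], float],
    f_y: Callable[[float, float], float],
    f_z: Callable[[float, float], float],
    u_x: float,
    u_y: float,
    u_z: float,
) -> Dict[str, float]:
    """Evaluate X = f_X(U_X), Y = f_Y(X, U_Y), Z = f_Z(Y, U_Z) for given errors."""
    x = f_x(u_x)
    y = f_y(x, u_y)
    z = f_z(y, u_z)
    return {"X": x, "Y": y, "Z": z}


# Two specific models that fit the same NPSEM (equations from the slides).
school_model = dict(
    f_x=lambda u: u,
    f_y=lambda x, u: x / 3 + u,
    f_z=lambda y, u: y / 16 + u,
)
race_model = dict(
    f_x=lambda u: u,
    f_y=lambda x, u: 84 - x + u,
    f_z=lambda y, u: 100 / y + u,
)

if __name__ == "__main__":
    # Error-term values below are arbitrary inputs chosen for illustration.
    print(evaluate_chain(**school_model, u_x=30, u_y=2, u_z=0.5))
    print(evaluate_chain(**race_model, u_x=40, u_y=1, u_z=0.5))
```

Both models share the structure encoded by the DAG (which variables feed into which), while the functional forms are left open.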