02: Causal Graphs & DAGs

“A picture is worth a thousand correlations.” - Judea Pearl

Welcome to the visual language of causality! Directed Acyclic Graphs (DAGs) provide a powerful framework for representing and reasoning about causal relationships. This notebook introduces the graphical approach to causal inference.

🎯 Learning Objectives

By the end of this notebook, you'll be able to:

  • Draw and interpret causal graphs (DAGs)

  • Apply d-separation to identify conditional independence

  • Use the backdoor criterion for causal identification

  • Understand the frontdoor criterion

  • Apply graphical criteria to real-world examples

📊 What is a Causal Graph?

Directed Acyclic Graph (DAG): A graph where:

  • Nodes represent variables

  • Directed edges (arrows) represent causal relationships

  • No cycles (no feedback loops)
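The acyclicity requirement can be checked mechanically. A minimal sketch using networkx (the same library used in the code cells below), with the ice-cream graph from this notebook as the example:

```python
import networkx as nx

# Nodes are variables; directed edges are causal claims
G = nx.DiGraph()
G.add_edges_from([('Temperature', 'Ice Cream Sales'),
                  ('Temperature', 'Drowning Deaths')])

# A valid causal graph must contain no directed cycles
print(nx.is_directed_acyclic_graph(G))       # True

# Adding a feedback edge would violate the DAG property
G_loop = G.copy()
G_loop.add_edge('Ice Cream Sales', 'Temperature')
print(nx.is_directed_acyclic_graph(G_loop))  # False
```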

Basic Elements

Notation:

  • \(A \rightarrow B\): A causes B (direct causation)

  • \(A \leftarrow B\): B causes A (reverse causation)

  • \(A \leftrightarrow B\): Bidirectional relationship

  • \(A \dashrightarrow B\): Indirect causation (through other variables)

Path Types:

  • Causal Path: Follows arrow directions

  • Back-door Path: Goes against at least one arrow

  • Confounding Path: Connects cause and effect through confounders

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import networkx as nx
from scipy import stats
import warnings
warnings.filterwarnings('ignore')

# Set random seeds
np.random.seed(42)

# Set up plotting
plt.style.use('default')
sns.set_palette("husl")
plt.rcParams['figure.figsize'] = [12, 8]
plt.rcParams['font.size'] = 12

print("Causal Graphs & Directed Acyclic Graphs (DAGs)")
print("===========================================")
def create_and_visualize_dag():
    """Create and visualize a simple causal DAG"""
    
    print("=== Creating and Visualizing a Causal DAG ===\n")
    
    # Create a DAG for the ice cream example
    # Temperature → Ice Cream Sales
    # Temperature → Drowning Deaths
    
    G = nx.DiGraph()
    
    # Add nodes
    nodes = ['Temperature', 'Ice Cream Sales', 'Drowning Deaths']
    G.add_nodes_from(nodes)
    
    # Add edges (causal relationships)
    edges = [('Temperature', 'Ice Cream Sales'), 
             ('Temperature', 'Drowning Deaths')]
    G.add_edges_from(edges)
    
    # Visualize the DAG
    plt.figure(figsize=(10, 6))
    
    # Position nodes
    pos = {'Temperature': (0, 0), 
           'Ice Cream Sales': (-1, -1), 
           'Drowning Deaths': (1, -1)}
    
    # Draw the graph
    nx.draw(G, pos, with_labels=True, node_color='lightblue', 
            node_size=3000, font_size=12, font_weight='bold',
            arrows=True, arrowstyle='->', arrowsize=20,
            edge_color='gray', width=2)
    
    plt.title('Causal DAG: Ice Cream Sales & Drowning Deaths\n(Temperature is the Confounder)')
    plt.axis('off')
    plt.show()
    
    print("DAG Analysis:")
    print("- Temperature causes both Ice Cream Sales and Drowning Deaths")
    print("- Ice Cream Sales and Drowning Deaths are correlated but not causally related")
    print("- Temperature is a 'confounding variable' or 'common cause'")
    print()
    
    # Analyze paths
    print("Path Analysis:")
    
    # Check if there's a direct path from Ice Cream to Drowning
    direct_path = nx.has_path(G, 'Ice Cream Sales', 'Drowning Deaths')
    reverse_path = nx.has_path(G, 'Drowning Deaths', 'Ice Cream Sales')
    
    print(f"Direct path Ice Cream → Drowning: {direct_path}")
    print(f"Direct path Drowning → Ice Cream: {reverse_path}")
    print()
    
    # Show all simple directed paths (there are none: the only connection
    # runs through the common cause, Temperature)
    paths = list(nx.all_simple_paths(G, 'Ice Cream Sales', 'Drowning Deaths'))
    if paths:
        print(f"All paths from Ice Cream Sales to Drowning Deaths: {paths}")
    else:
        print("No directed paths from Ice Cream Sales to Drowning Deaths")
    
    return G

# Create and visualize a simple DAG
simple_dag = create_and_visualize_dag()

πŸ” D-Separation: Reading Independence from GraphsΒΆ

D-Separation (Directional Separation): A criterion for determining whether two variables are independent given a set of conditioning variables.

Rules of D-Separation

Two variables A and B are d-separated (independent) given Z if all paths between A and B are blocked by Z.

Path Blocking Rules:

  1. Chain: A → M → B (blocked by conditioning on M)

  2. Fork: A ← M → B (blocked by conditioning on M)

  3. Collider: A → M ← B (blocked unless M or one of its descendants is conditioned on)

Path Types

  • Open Path: Not blocked (variables are dependent)

  • Closed Path: Blocked (variables are independent)

  • Back-door Path: Path from cause to effect that goes through confounders
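These blocking rules can be turned into an algorithm. The sketch below checks d-separation via the classic moralization construction (take the ancestral graph, "marry" co-parents, drop edge directions, delete the conditioning set, test connectivity). Recent networkx releases also ship a built-in check, but writing it out makes the three rules concrete:

```python
import networkx as nx

def d_separated(G, x, y, z):
    """Check whether x and y are d-separated given set z in DAG G,
    using the moralization construction."""
    # 1. Restrict to x, y, z and all of their ancestors
    relevant = {x, y} | set(z)
    for node in list(relevant):
        relevant |= nx.ancestors(G, node)
    sub = G.subgraph(relevant)
    # 2. Moralize: connect ("marry") co-parents, then drop directions
    moral = nx.Graph(sub.edges())
    moral.add_nodes_from(sub.nodes())
    for node in sub.nodes():
        parents = list(sub.predecessors(node))
        for i in range(len(parents)):
            for j in range(i + 1, len(parents)):
                moral.add_edge(parents[i], parents[j])
    # 3. Remove the conditioning set; x, y are d-separated iff disconnected
    moral.remove_nodes_from(z)
    return not nx.has_path(moral, x, y)

chain = nx.DiGraph([('A', 'M'), ('M', 'B')])
fork = nx.DiGraph([('M', 'A'), ('M', 'B')])
collider = nx.DiGraph([('A', 'M'), ('B', 'M')])

print(d_separated(chain, 'A', 'B', {'M'}))     # True: chain blocked by M
print(d_separated(fork, 'A', 'B', {'M'}))      # True: fork blocked by M
print(d_separated(collider, 'A', 'B', set()))  # True: collider blocked by default
print(d_separated(collider, 'A', 'B', {'M'}))  # False: conditioning opens it
```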

def demonstrate_d_separation():
    """Demonstrate d-separation with different graph structures"""
    
    print("=== D-Separation: Reading Independence from Graphs ===\n")
    
    # Create three classic graph structures
    fig, axes = plt.subplots(1, 3, figsize=(18, 6))
    
    structures = [
        {
            'name': 'Chain (A → M → B)',
            'edges': [('A', 'M'), ('M', 'B')],
            'pos': {'A': (-1, 0), 'M': (0, 0), 'B': (1, 0)},
            'description': 'A and B are independent given M'
        },
        {
            'name': 'Fork (A ← M → B)',
            'edges': [('M', 'A'), ('M', 'B')],
            'pos': {'A': (-1, 0), 'M': (0, 1), 'B': (1, 0)},
            'description': 'A and B are independent given M'
        },
        {
            'name': 'Collider (A → M ← B)',
            'edges': [('A', 'M'), ('B', 'M')],
            'pos': {'A': (-1, 0), 'M': (0, -1), 'B': (1, 0)},
            'description': 'A and B are independent, but dependent given M'
        }
    ]
    
    for i, struct in enumerate(structures):
        G = nx.DiGraph()
        G.add_edges_from(struct['edges'])
        
        nx.draw(G, struct['pos'], ax=axes[i], with_labels=True, 
                node_color='lightcoral', node_size=800, font_size=14,
                arrows=True, arrowstyle='->', arrowsize=20)
        
        axes[i].set_title(f"{struct['name']}\n{struct['description']}")
        axes[i].axis('off')
    
    plt.tight_layout()
    plt.show()
    
    # Demonstrate with synthetic data
    print("\n📊 Demonstrating D-Separation with Data:\n")
    
    n_samples = 1000
    
    # Chain: Education → Income → Happiness
    education = np.random.normal(50, 10, n_samples)
    income = education * 0.8 + np.random.normal(0, 5, n_samples)
    happiness = income * 0.6 + np.random.normal(0, 3, n_samples)
    
    chain_data = pd.DataFrame({
        'Education': education,
        'Income': income,
        'Happiness': happiness
    })
    
    # Marginal correlation, and partial correlation controlling for Income
    corr_ed_hap = chain_data['Education'].corr(chain_data['Happiness'])
    
    # Partial correlation: residualize both variables on Income, then correlate
    fit_ed = np.polyfit(chain_data['Income'], chain_data['Education'], 1)
    fit_hap = np.polyfit(chain_data['Income'], chain_data['Happiness'], 1)
    resid_ed = chain_data['Education'] - np.polyval(fit_ed, chain_data['Income'])
    resid_hap = chain_data['Happiness'] - np.polyval(fit_hap, chain_data['Income'])
    corr_ed_hap_given_income = resid_ed.corr(resid_hap)
    
    print("Chain Structure (Education → Income → Happiness):")
    print(f"Correlation(Education, Happiness): {corr_ed_hap:.3f}")
    print(f"Partial correlation given Income: {corr_ed_hap_given_income:.3f} (~0: d-separated)")
    print()
    
    # Fork: Common cause (Socioeconomic Status → Education and → Income)
    ses = np.random.normal(0, 1, n_samples)
    education_fork = ses * 2 + np.random.normal(0, 3, n_samples)
    income_fork = ses * 3 + np.random.normal(0, 4, n_samples)
    
    fork_data = pd.DataFrame({
        'SES': ses,
        'Education': education_fork,
        'Income': income_fork
    })
    
    corr_ed_inc_fork = fork_data['Education'].corr(fork_data['Income'])
    
    print("Fork Structure (SES → Education, SES → Income):")
    print(f"Correlation(Education, Income): {corr_ed_inc_fork:.3f}")
    print("After conditioning on SES: Should be ~0 (d-separated)")
    print()
    
    # Collider: Two causes meeting at an effect
    talent = np.random.normal(0, 1, n_samples)
    hard_work = np.random.normal(0, 1, n_samples)
    success = talent * 0.7 + hard_work * 0.8 + np.random.normal(0, 2, n_samples)
    
    collider_data = pd.DataFrame({
        'Talent': talent,
        'Hard_Work': hard_work,
        'Success': success
    })
    
    corr_talent_work = collider_data['Talent'].corr(collider_data['Hard_Work'])
    
    print("Collider Structure (Talent → Success ← Hard_Work):")
    print(f"Correlation(Talent, Hard_Work): {corr_talent_work:.3f}")
    print("After conditioning on Success: Becomes dependent (collider opens)")
    print()
    
    # Demonstrate collider bias
    # People who succeed might be talented OR hard workers, but not necessarily both
    successful_people = collider_data[collider_data['Success'] > collider_data['Success'].quantile(0.8)]
    corr_collider = successful_people['Talent'].corr(successful_people['Hard_Work'])
    
    print(f"Among successful people, correlation(Talent, Hard_Work): {corr_collider:.3f}")
    print("(This shows collider bias - conditioning on success creates spurious correlation)")
    
    return chain_data, fork_data, collider_data

# Demonstrate d-separation
chain_data, fork_data, collider_data = demonstrate_d_separation()

🚪 The Backdoor Criterion

Backdoor Criterion: A set of variables Z satisfies the backdoor criterion for estimating the causal effect of X on Y if:

  1. No descendant of X: Z contains no descendants of the treatment variable X

  2. Blocks all back-door paths: Z blocks all paths from X to Y that contain an arrow pointing into X

Why “Backdoor”?

  • Front door: The direct causal path from X to Y

  • Back door: Paths that “sneak around” through confounders

Backdoor Adjustment

If Z satisfies the backdoor criterion, then:

\[P(Y|do(X)) = \sum_z P(Y|X, Z=z) \cdot P(Z=z)\]

This allows us to estimate causal effects from observational data!
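As a quick numerical illustration (all numbers hypothetical), here is the adjustment formula evaluated on a tiny discrete model with a binary confounder, next to naive conditioning:

```python
# Z is a binary confounder; X a binary treatment; Y a binary outcome
p_z = {0: 0.5, 1: 0.5}                        # P(Z)
p_x_given_z = {0: 0.2, 1: 0.8}                # P(X=1 | Z)
p_y_given_xz = {(0, 0): 0.1, (0, 1): 0.5,     # P(Y=1 | X, Z)
                (1, 0): 0.3, (1, 1): 0.7}

# Backdoor adjustment: P(Y=1|do(X=1)) = sum_z P(Y=1|X=1,Z=z) P(Z=z)
p_do = sum(p_y_given_xz[(1, z)] * p_z[z] for z in (0, 1))

# Naive conditioning: P(Y=1|X=1) = sum_z P(Y=1|X=1,z) P(z|X=1)
p_x1 = sum(p_x_given_z[z] * p_z[z] for z in (0, 1))
p_naive = sum(p_y_given_xz[(1, z)] * p_x_given_z[z] * p_z[z] for z in (0, 1)) / p_x1

print(p_do)     # 0.5  -> weights the Z strata by P(Z)
print(p_naive)  # 0.62 -> overweights the high-Z stratum, where Y is likelier anyway
```

The gap between the two numbers is exactly the confounding bias that the adjustment formula removes.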

def demonstrate_backdoor_criterion():
    """Demonstrate the backdoor criterion with examples"""
    
    print("=== The Backdoor Criterion ===\n")
    
    # Example 1: Simple confounding
    print("Example 1: Simple Confounding")
    print("-" * 30)
    
    # Create DAG: Confounder → Treatment, Confounder → Outcome, Treatment → Outcome
    G1 = nx.DiGraph()
    G1.add_edges_from([('Confounder', 'Treatment'), ('Confounder', 'Outcome'), ('Treatment', 'Outcome')])
    
    plt.figure(figsize=(12, 5))
    
    plt.subplot(1, 2, 1)
    pos1 = {'Confounder': (0, 1), 'Treatment': (-1, 0), 'Outcome': (1, 0)}
    nx.draw(G1, pos1, with_labels=True, node_color='lightgreen', 
            node_size=2000, font_size=10, arrows=True, arrowstyle='->', arrowsize=15)
    plt.title('Simple Confounding\nBackdoor path: Treatment ← Confounder → Outcome')
    plt.axis('off')
    
    # Show backdoor adjustment
    print("DAG: Confounder → Treatment, Confounder → Outcome, Treatment → Outcome")
    print("Backdoor path: Treatment ← Confounder → Outcome")
    print("Backdoor adjustment: Condition on Confounder")
    print("Formula: P(Outcome|do(Treatment)) = ∑_c P(Outcome|Treatment,c) * P(c)")
    print()
    
    # Example 2: M-bias (more complex)
    print("Example 2: M-bias")
    print("-" * 15)
    
    G2 = nx.DiGraph()
    # Canonical M-structure: Treatment ← U → M ← Confounder → Outcome,
    # plus the causal edge Treatment → Outcome. Here M is a collider,
    # so conditioning on M would open a backdoor path rather than block one.
    G2.add_edges_from([
        ('U', 'Treatment'), ('U', 'M'),
        ('Confounder', 'M'), ('Confounder', 'Outcome'),
        ('Treatment', 'Outcome')
    ])
    
    plt.subplot(1, 2, 2)
    pos2 = {'U': (0, 1.5), 'Confounder': (0, 0), 'Treatment': (-1, -0.5), 
            'M': (1, -0.5), 'Outcome': (0, -1.5)}
    nx.draw(G2, pos2, with_labels=True, node_color='lightcoral', 
            node_size=1500, font_size=9, arrows=True, arrowstyle='->', arrowsize=12)
    plt.title('M-bias\nComplex confounding structure')
    plt.axis('off')
    
    plt.tight_layout()
    plt.show()
    
    print("DAG: Complex confounding with M-bias")
    print("Backdoor paths are more complex here")
    print("Need to carefully identify valid adjustment sets")
    print()
    
    # Demonstrate backdoor adjustment with data
    print("📊 Backdoor Adjustment with Synthetic Data:\n")
    
    n_samples = 2000
    
    # Generate data with confounding
    # Confounder affects both treatment and outcome
    confounder = np.random.normal(50, 10, n_samples)
    
    # Treatment (affected by confounder)
    treatment_prob = 1 / (1 + np.exp(-(confounder - 50) / 5))  # Logistic
    treatment = np.random.binomial(1, treatment_prob, n_samples)
    
    # Outcome (affected by both treatment and confounder)
    outcome = (10 * treatment + 0.5 * confounder + 
               np.random.normal(0, 5, n_samples))
    
    backdoor_data = pd.DataFrame({
        'Confounder': confounder,
        'Treatment': treatment,
        'Outcome': outcome
    })
    
    # Naive estimate (biased)
    treated_outcome = backdoor_data[backdoor_data['Treatment'] == 1]['Outcome'].mean()
    control_outcome = backdoor_data[backdoor_data['Treatment'] == 0]['Outcome'].mean()
    naive_effect = treated_outcome - control_outcome
    
    # Backdoor adjustment: stratify by confounder
    # Group by confounder bins
    backdoor_data['Confounder_Bin'] = pd.cut(backdoor_data['Confounder'], bins=5)
    
    adjusted_effects = []
    for bin_name, group in backdoor_data.groupby('Confounder_Bin'):
        treated_in_bin = group[group['Treatment'] == 1]['Outcome'].mean()
        control_in_bin = group[group['Treatment'] == 0]['Outcome'].mean()
        if not (np.isnan(treated_in_bin) or np.isnan(control_in_bin)):
            effect_in_bin = treated_in_bin - control_in_bin
            weight = len(group) / len(backdoor_data)
            adjusted_effects.append(effect_in_bin * weight)
    
    adjusted_effect = sum(adjusted_effects)
    
    print(f"Naive estimate (biased): {naive_effect:.2f}")
    print(f"Backdoor adjustment: {adjusted_effect:.2f}")
    print("True effect (from data generation): 10.00")
    print()
    
    # Visualize the adjustment
    plt.figure(figsize=(12, 5))
    
    plt.subplot(1, 2, 1)
    sns.scatterplot(data=backdoor_data, x='Confounder', y='Outcome', 
                   hue='Treatment', alpha=0.6)
    plt.title('Raw Data: Confounding is Visible')
    plt.xlabel('Confounder')
    plt.ylabel('Outcome')
    
    plt.subplot(1, 2, 2)
    # Show stratified results
    stratified_results = []
    for bin_name, group in backdoor_data.groupby('Confounder_Bin'):
        bin_center = group['Confounder'].mean()
        treated_mean = group[group['Treatment'] == 1]['Outcome'].mean()
        control_mean = group[group['Treatment'] == 0]['Outcome'].mean()
        stratified_results.append({
            'confounder': bin_center,
            'treated': treated_mean,
            'control': control_mean
        })
    
    stratified_df = pd.DataFrame(stratified_results)
    plt.scatter(stratified_df['confounder'], stratified_df['treated'], 
               color='blue', label='Treated', s=100)
    plt.scatter(stratified_df['confounder'], stratified_df['control'], 
               color='red', label='Control', s=100)
    
    # Connect pairs
    for _, row in stratified_df.iterrows():
        plt.plot([row['confounder'], row['confounder']], 
                [row['treated'], row['control']], 'k--', alpha=0.7)
    
    plt.title('Stratified by Confounder\nBackdoor Adjustment')
    plt.xlabel('Confounder')
    plt.ylabel('Outcome')
    plt.legend()
    
    plt.tight_layout()
    plt.show()
    
    print("🔑 Key Insight:")
    print("- Naive comparison is biased due to confounding")
    print("- Backdoor adjustment removes bias by conditioning on confounders")
    print("- This gives us an unbiased estimate of the causal effect")
    
    return backdoor_data

# Demonstrate backdoor criterion
backdoor_data = demonstrate_backdoor_criterion()
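Since the outcome above is generated linearly, the stratified adjustment can be cross-checked with a regression that includes the confounder; both implement the same backdoor adjustment. A sketch regenerating data from the same process (fresh RNG, so numbers differ slightly from the cell above):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000
confounder = rng.normal(50, 10, n)
p_treat = 1 / (1 + np.exp(-(confounder - 50) / 5))  # confounded assignment
treatment = rng.binomial(1, p_treat)
outcome = 10 * treatment + 0.5 * confounder + rng.normal(0, 5, n)

# Naive difference in means: biased upward, since treated units have
# systematically higher confounder values
naive = outcome[treatment == 1].mean() - outcome[treatment == 0].mean()

# OLS on [1, treatment, confounder]: the treatment coefficient is the
# backdoor-adjusted effect under the linear model
X = np.column_stack([np.ones(n), treatment, confounder])
beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
print(f"naive: {naive:.2f}, regression-adjusted: {beta[1]:.2f} (true effect: 10)")
```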

🚪 The Frontdoor Criterion

Frontdoor Criterion: An alternative identification strategy when backdoor adjustment isn't possible.

A set of variables M satisfies the frontdoor criterion if:

  1. M intercepts all directed paths from X to Y

  2. There are no confounding paths from X to M

  3. All backdoor paths from M to Y are blocked by X

Frontdoor Adjustment Formula

\[P(Y|do(X)) = \sum_m P(M=m|X) \cdot \sum_{x'} P(Y|X=x', M=m) \cdot P(X=x')\]

When to Use Frontdoor

  • When confounders of X→Y are unobserved

  • When you have a mediator that captures the causal effect

  • When backdoor paths can't be blocked
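To see that the formula really recovers an interventional quantity from purely observational ones, here it is evaluated on a tiny hand-specified binary model (all numbers hypothetical) with the frontdoor structure, alongside the ground truth computed directly from the structural model:

```python
# Binary frontdoor model: U → X, U → Y (U unobserved), X → M → Y
p_u = {0: 0.5, 1: 0.5}                # P(U)
p_x_u = {0: 0.2, 1: 0.8}              # P(X=1 | U)
p_m_x = {0: 0.1, 1: 0.9}              # P(M=1 | X)
p_y_mu = {(0, 0): 0.1, (0, 1): 0.4,   # P(Y=1 | M, U)
          (1, 0): 0.6, (1, 1): 0.9}

def p_joint(x, m, y):
    """Observational P(x, m, y), marginalizing out the unobserved U."""
    total = 0.0
    for u in (0, 1):
        px = p_x_u[u] if x else 1 - p_x_u[u]
        pm = p_m_x[x] if m else 1 - p_m_x[x]
        py = p_y_mu[(m, u)] if y else 1 - p_y_mu[(m, u)]
        total += p_u[u] * px * pm * py
    return total

def p_x(x):
    return sum(p_joint(x, m, y) for m in (0, 1) for y in (0, 1))

def p_m_given_x(m, x):
    return sum(p_joint(x, m, y) for y in (0, 1)) / p_x(x)

def p_y_given_xm(x, m):
    return p_joint(x, m, 1) / sum(p_joint(x, m, y) for y in (0, 1))

def frontdoor(x):
    """P(Y=1 | do(X=x)) from observational quantities only."""
    return sum(p_m_given_x(m, x) *
               sum(p_y_given_xm(xp, m) * p_x(xp) for xp in (0, 1))
               for m in (0, 1))

def truth(x):
    """Ground truth: intervene in the structural model (cut U → X)."""
    return sum(p_u[u] *
               sum((p_m_x[x] if m else 1 - p_m_x[x]) * p_y_mu[(m, u)]
                   for m in (0, 1))
               for u in (0, 1))

print(frontdoor(1) - frontdoor(0), truth(1) - truth(0))  # both ≈ 0.40
```

The frontdoor estimate matches the interventional truth exactly, even though U never appears in the observational quantities.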

def demonstrate_frontdoor_criterion():
    """Demonstrate the frontdoor criterion"""
    
    print("=== The Frontdoor Criterion ===\n")
    
    # Create frontdoor example DAG
    # Treatment → Mediator → Outcome
    # Confounder → Treatment and Confounder → Outcome (unobserved confounder)
    
    G = nx.DiGraph()
    G.add_edges_from([
        ('Treatment', 'Mediator'),
        ('Mediator', 'Outcome'),
        ('Confounder', 'Treatment'),
        ('Confounder', 'Outcome')
    ])
    
    plt.figure(figsize=(10, 6))
    pos = {
        'Treatment': (-1, 0),
        'Mediator': (0, 0),
        'Outcome': (1, 0),
        'Confounder': (0, 1)
    }
    
    nx.draw(G, pos, with_labels=True, node_color='lightyellow', 
            node_size=2500, font_size=11, arrows=True, 
            arrowstyle='->', arrowsize=20)
    
    plt.title('Frontdoor Criterion Example\nConfounder affects both Treatment and Outcome directly')
    plt.axis('off')
    plt.show()
    
    print("Frontdoor Criterion Analysis:")
    print("- Confounder → Treatment → Mediator → Outcome ← Confounder")
    print("- Confounder is unobserved (can't use backdoor adjustment)")
    print("- Mediator satisfies the frontdoor criterion")
    print("- Can estimate the causal effect through the mediator")
    print()
    
    # Demonstrate frontdoor adjustment with data
    print("📊 Frontdoor Adjustment Example:\n")
    
    n_samples = 2000
    
    # Unobserved confounder
    confounder = np.random.normal(0, 1, n_samples)
    
    # Treatment (affected by confounder)
    treatment = (0.5 * confounder + np.random.normal(0, 0.5, n_samples) > 0).astype(int)
    
    # Mediator (affected only by treatment)
    mediator = 2 * treatment + np.random.normal(0, 1, n_samples)
    
    # Outcome (affected by mediator and confounder)
    outcome = 1.5 * mediator + 0.8 * confounder + np.random.normal(0, 1, n_samples)
    
    frontdoor_data = pd.DataFrame({
        'Treatment': treatment,
        'Mediator': mediator,
        'Outcome': outcome,
        'Confounder': confounder  # Unobserved in practice
    })
    
    # Naive estimate (biased due to confounder)
    treated_outcome = frontdoor_data[frontdoor_data['Treatment'] == 1]['Outcome'].mean()
    control_outcome = frontdoor_data[frontdoor_data['Treatment'] == 0]['Outcome'].mean()
    naive_effect = treated_outcome - control_outcome
    
    # Frontdoor adjustment (linear case): chain the two stages
    #   Step 1: effect of Treatment on Mediator
    #   Step 2: effect of Mediator on Outcome, holding Treatment fixed
    #           (conditioning on Treatment blocks the backdoor path
    #            Mediator ← Treatment ← Confounder → Outcome)
    t = frontdoor_data['Treatment'].values
    m = frontdoor_data['Mediator'].values
    y = frontdoor_data['Outcome'].values
    
    b_tm = np.cov(t, m)[0, 1] / np.var(t, ddof=1)  # Treatment → Mediator slope
    
    X_design = np.column_stack([np.ones_like(m), m, t])
    b_my = np.linalg.lstsq(X_design, y, rcond=None)[0][1]  # Mediator → Outcome slope
    
    # The product of the two stages estimates the causal effect
    frontdoor_effect = b_tm * b_my
    
    # True effect (from data generation)
    # Treatment → Mediator: coefficient 2
    # Mediator → Outcome: coefficient 1.5
    # Total effect: 2 * 1.5 = 3
    true_effect = 3.0
    
    print(f"Naive estimate (biased): {naive_effect:.2f}")
    print(f"Frontdoor adjustment: {frontdoor_effect:.2f}")
    print(f"True causal effect: {true_effect:.2f}")
    print()
    
    # Visualize
    plt.figure(figsize=(12, 5))
    
    plt.subplot(1, 2, 1)
    sns.scatterplot(data=frontdoor_data, x='Treatment', y='Outcome', alpha=0.6)
    plt.title('Treatment vs Outcome\nConfounding is Present')
    plt.xticks([0, 1], ['Control', 'Treated'])
    
    plt.subplot(1, 2, 2)
    sns.scatterplot(data=frontdoor_data, x='Mediator', y='Outcome', 
                   hue='Treatment', alpha=0.6)
    plt.title('Mediator vs Outcome\nFrontdoor Path')
    
    plt.tight_layout()
    plt.show()
    
    print("🔑 Key Insights:")
    print("- Frontdoor criterion works when confounders are unobserved")
    print("- Uses mediators to identify causal effects")
    print("- More complex than backdoor adjustment")
    print("- Requires strong assumptions about the mediator")
    
    return frontdoor_data

# Demonstrate frontdoor criterion
frontdoor_data = demonstrate_frontdoor_criterion()

πŸ₯ Real-World Application: Medical Treatment EvaluationΒΆ

Let’s apply causal graphs to a medical scenario where we want to evaluate the effect of a new drug on patient outcomes.

def medical_causal_analysis():
    """Apply causal graphs to medical treatment evaluation"""
    
    print("=== Medical Treatment Evaluation: Causal Graphs in Practice ===\n")
    
    # Create a realistic medical DAG
    G = nx.DiGraph()
    
    # Variables
    nodes = [
        'Age', 'Severity', 'Comorbidities',  # Pre-treatment confounders
        'Treatment',  # The drug
        'Adherence', 'Side_Effects',  # Post-treatment variables
        'Outcome'  # Health outcome
    ]
    
    G.add_nodes_from(nodes)
    
    # Causal relationships
    edges = [
        # Confounders affect treatment assignment
        ('Age', 'Treatment'),
        ('Severity', 'Treatment'),
        ('Comorbidities', 'Treatment'),
        
        # Confounders affect outcome directly
        ('Age', 'Outcome'),
        ('Severity', 'Outcome'),
        ('Comorbidities', 'Outcome'),
        
        # Treatment affects post-treatment variables
        ('Treatment', 'Adherence'),
        ('Treatment', 'Side_Effects'),
        
        # Post-treatment variables affect outcome
        ('Adherence', 'Outcome'),
        ('Side_Effects', 'Outcome'),
        
        # Treatment directly affects outcome
        ('Treatment', 'Outcome')
    ]
    
    G.add_edges_from(edges)
    
    # Visualize
    plt.figure(figsize=(14, 8))
    
    # Position nodes for clarity
    pos = {
        'Age': (-2, 2), 'Severity': (0, 2), 'Comorbidities': (2, 2),
        'Treatment': (0, 0),
        'Adherence': (-1, -1), 'Side_Effects': (1, -1),
        'Outcome': (0, -2)
    }
    
    # Color nodes by type
    node_colors = ['lightcoral'] * 3 + ['lightgreen'] + ['lightblue'] * 2 + ['gold']
    
    nx.draw(G, pos, with_labels=True, node_color=node_colors, 
            node_size=3000, font_size=10, font_weight='bold',
            arrows=True, arrowstyle='->', arrowsize=20,
            edge_color='gray', width=2)
    
    plt.title('Medical Treatment Evaluation DAG\nEvaluating Drug Effectiveness')
    plt.axis('off')
    plt.show()
    
    print("Medical DAG Analysis:")
    print("- Age, Severity, Comorbidities are confounders")
    print("- They affect both treatment assignment and outcome")
    print("- Adherence and Side Effects are mediators")
    print("- Treatment has direct and indirect effects on outcome")
    print()
    
    # Identify backdoor paths
    print("Backdoor Paths (must be blocked for identification):")
    print("1. Treatment ← Age → Outcome")
    print("2. Treatment ← Severity → Outcome")
    print("3. Treatment ← Comorbidities → Outcome")
    print()
    
    print("Valid Adjustment Sets:")
    print("- Full backdoor set: {Age, Severity, Comorbidities}")
    print("- This blocks all backdoor paths")
    print("- Allows unbiased estimation of treatment effect")
    print()
    
    # Simulate medical data
    print("📊 Simulating Medical Treatment Data:\n")
    
    n_patients = 1000
    
    # Generate confounders
    age = np.random.normal(50, 15, n_patients)
    severity = np.random.normal(0, 1, n_patients)
    comorbidities = np.random.poisson(1, n_patients)
    
    # Treatment assignment (affected by confounders)
    treatment_score = (0.02 * age + severity + 0.5 * comorbidities + 
                      np.random.normal(0, 0.5, n_patients))
    treatment = (treatment_score > treatment_score.median()).astype(int)
    
    # Post-treatment variables
    adherence = 0.8 * treatment + np.random.normal(0, 0.2, n_patients)
    side_effects = 0.3 * treatment + np.random.normal(0, 0.1, n_patients)
    
    # Outcome (affected by everything)
    outcome = (5 * treatment +  # Direct treatment effect
               2 * adherence +   # Through adherence
               -1 * side_effects + # Through side effects
               -0.05 * age +     # Age effect
               -2 * severity +   # Severity effect
               -0.8 * comorbidities + # Comorbidities effect
               np.random.normal(0, 2, n_patients))
    
    medical_data = pd.DataFrame({
        'Age': age,
        'Severity': severity,
        'Comorbidities': comorbidities,
        'Treatment': treatment,
        'Adherence': adherence,
        'Side_Effects': side_effects,
        'Outcome': outcome
    })
    
    # Compare naive vs adjusted estimates
    naive_effect = (medical_data[medical_data['Treatment'] == 1]['Outcome'].mean() - 
                   medical_data[medical_data['Treatment'] == 0]['Outcome'].mean())
    
    # Simple adjustment: stratify by severity tertiles
    medical_data['Severity_Tertile'] = pd.qcut(medical_data['Severity'], 3)
    
    adjusted_effects = []
    for _, group in medical_data.groupby('Severity_Tertile'):
        treated_outcome = group[group['Treatment'] == 1]['Outcome'].mean()
        control_outcome = group[group['Treatment'] == 0]['Outcome'].mean()
        if not (np.isnan(treated_outcome) or np.isnan(control_outcome)):
            effect = treated_outcome - control_outcome
            weight = len(group) / len(medical_data)
            adjusted_effects.append(effect * weight)
    
    adjusted_effect = sum(adjusted_effects)
    
    print(f"Naive treatment effect: {naive_effect:.2f}")
    print(f"Adjusted for severity: {adjusted_effect:.2f}")
    print("True total effect: 6.30 (direct 5.00 plus mediated paths, from data generation)")
    print()
    
    print("🔍 Key Takeaways:")
    print("- Medical studies often have complex confounding")
    print("- Age, severity, and comorbidities must be controlled")
    print("- Treatment effects can be direct and indirect")
    print("- Proper causal analysis prevents incorrect conclusions")
    
    return medical_data

# Apply to medical scenario
medical_data = medical_causal_analysis()
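One caveat worth making explicit: in this DAG, a regression that controls for the confounders but not the mediators recovers the total effect (direct 5 plus the mediated paths 0.8×2 and 0.3×(-1), about 6.3), while additionally conditioning on the mediators isolates the direct effect of about 5. A sketch on freshly simulated data from the same generating process:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
age = rng.normal(50, 15, n)
severity = rng.normal(0, 1, n)
comorb = rng.poisson(1, n)
score = 0.02 * age + severity + 0.5 * comorb + rng.normal(0, 0.5, n)
treat = (score > np.median(score)).astype(float)
adherence = 0.8 * treat + rng.normal(0, 0.2, n)
side_effects = 0.3 * treat + rng.normal(0, 0.1, n)
outcome = (5 * treat + 2 * adherence - side_effects
           - 0.05 * age - 2 * severity - 0.8 * comorb + rng.normal(0, 2, n))

def treat_coef(covariates):
    """OLS coefficient on treatment, adjusting for the given covariates."""
    X = np.column_stack([np.ones(n), treat] + covariates)
    return np.linalg.lstsq(X, outcome, rcond=None)[0][1]

# Confounders only -> TOTAL effect (direct + mediated), ~6.3
total_eff = treat_coef([age, severity, comorb])
# Confounders + mediators -> DIRECT effect only, ~5.0
direct_eff = treat_coef([age, severity, comorb, adherence, side_effects])
print(f"total: {total_eff:.2f}, direct: {direct_eff:.2f}")
```

This is why mediators (here: adherence and side effects) should only be conditioned on when the direct effect is the quantity of interest.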

🎯 Key Takeaways

1. DAGs as Causal Maps

  • Directed Acyclic Graphs represent causal assumptions visually

  • Arrows show causal direction, not just correlation

  • No cycles allowed (causality doesn't loop)

2. D-Separation Rules

  • Chain: A → M → B (blocked by M)

  • Fork: A ← M → B (blocked by M)

  • Collider: A → M ← B (opened by M)

  • Determines conditional independence from graph structure

3. Backdoor Criterion

  • Identifies valid adjustment sets for confounding

  • Blocks all paths from treatment to outcome that go through confounders

  • Allows causal effect estimation from observational data

4. Frontdoor Criterion

  • Alternative when confounders are unobserved

  • Uses mediators to identify causal effects

  • More restrictive assumptions than backdoor

5. Practical Applications

  • Medical treatment evaluation

  • Policy impact assessment

  • Business decision making

  • A/B test analysis

πŸ” Critical Thinking QuestionsΒΆ

  1. Draw a DAG for your favorite correlation. Is it causal?

  2. What's a collider in your field that might cause bias?

  3. Can you think of a frontdoor path in social science?

  4. How would you identify confounders in an observational study?

🚀 Next Steps

Now that you understand causal graphs, you're ready for:

  • Experimental Design: RCTs and quasi-experiments

  • Matching Methods: Propensity score matching

  • Instrumental Variables: Natural experiments

  • Regression Discontinuity: Local randomization

Remember: Causal graphs are your roadmap for navigating from correlation to causation. They make your assumptions explicit and help you identify valid estimation strategies!

“The graph is not the territory, but it is a map of the territory.” - Judea Pearl