Temporal Causal Modeling with Graphical Granger Methods
Andrew Arnold (Carnegie Mellon University)
Yan Liu (IBM T.J. Watson Research)
Naoki Abe (IBM T.J. Watson Research)
SIGKDD 07
August 13, 2007
Talk Outline
• Introduction and motivation
• Overview of Granger causality
• Graphical Granger methods
  – Exhaustive Granger
  – Lasso Granger
  – SIN Granger
  – Vector auto-regression (VAR)
• Experimental results
A Motivating Example: Key Performance Indicator Data (KPI) in Corporate Index Management [S&P]
Time (columns) × Variables (rows)
Company                HAL        HAL        HAL        HAL        HAL        HAL        HAL
Year                   1999       2000       2000       2000       2000       2001       2001
Quarter                4          1          2          3          4          1          2
Revenue ($M)           6.24       6.54       5.82       3.89       4.1        4.41       3.6
Revenue-to-RD          2.185704   1.734358   1.381822   0.416212   0.843057   0.906083   0.930714
Revenue-to-RD CAGR     -          -          -          -          -0.61429   -0.47757   -0.32646
Innovation Index       0.517621   0.578062   0.567874   0.98624    0.696722   0.679335   0.734627
Innovation Index CAGR  -          -          -          -          0.346008   0.175194   0.229845
CapEx to Revenue       0.152292   0.258789   0.111111   0.63592    1.33114    1.389658   0.009722
KPI Case Study: Temporal Causal Modeling for Identifying Levers of Corporate Performance
• How can we leverage information in temporal data to assist causal modeling and inference?
• Key idea: A cause necessarily precedes its effects…
Granger Causality
• Granger causality
  – Introduced by the Nobel Prize-winning economist Clive Granger [Granger '69]
• Definition: a time series x is said to "Granger cause" another time series y if and only if:
  – regressing y on past values of both y and x
  – is statistically significantly better than regressing y on past values of y only
  – Assumption: no common latent causes
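The definition above can be sketched as a comparison of two least-squares fits. This is a minimal illustration, not the authors' code: the function names, the choice of an F statistic as the "significantly better" criterion, the lag of 2, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def lagged_design(series_list, max_lag):
    """Intercept plus lags 1..max_lag of every series, aligned with y[max_lag:]."""
    n = len(series_list[0])
    cols = [np.ones(n - max_lag)]
    for s in series_list:
        for lag in range(1, max_lag + 1):
            cols.append(s[max_lag - lag : n - lag])  # value at time t - lag
    return np.column_stack(cols)

def rss(design, target):
    """Residual sum of squares of an ordinary least-squares fit."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return float(resid @ resid)

def granger_f_statistic(x, y, max_lag=2):
    """F statistic for 'x Granger-causes y' (hypothetical helper; larger = stronger evidence)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    target = y[max_lag:]
    restricted = lagged_design([y], max_lag)   # past values of y only
    full = lagged_design([y, x], max_lag)      # past values of both y and x
    rss_r, rss_f = rss(restricted, target), rss(full, target)
    dof = len(target) - full.shape[1]
    return ((rss_r - rss_f) / max_lag) / (rss_f / dof)

# Synthetic example (assumed data): x drives y with a one-step delay, not vice versa.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.normal()

f_xy = granger_f_statistic(x, y)  # tests x -> y
f_yx = granger_f_statistic(y, x)  # tests y -> x
print(f"F(x -> y) = {f_xy:.1f}, F(y -> x) = {f_yx:.2f}")
```

In practice the statistic would be compared against a critical value of the F distribution with (max_lag, dof) degrees of freedom to decide significance.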
Variable Space Expansion & Feature Space Mapping
Graphical Granger Methods
• Exhaustive Granger
  – Test all possible univariate Granger models independently
• Lasso Granger
  – Use L1-penalized regression to choose sparse multivariate regression models
  – [Meinshausen & Bühlmann, '06]
• SIN Granger
  – Do matrix inversion to find correlations between features across time
  – [Drton & Perlman, '04]
• Vector auto-regression (VAR)
  – Fit data to linear-normal time series model
  – [Gilbert, '95]
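As a rough illustration of the Lasso Granger idea, the sketch below regresses each feature on the lagged values of all features with an L1 penalty and draws an edge j -> i whenever some lagged coefficient of j is nonzero. The slides do not specify a solver or an edge rule, so the coordinate-descent routine, the penalty weight alpha, the edge rule, and all function names here are assumptions for the example.

```python
import numpy as np

def lasso_cd(A, b, alpha, n_iter=200):
    """Coordinate-descent Lasso: minimize (1/2n)||b - A w||^2 + alpha * ||w||_1."""
    n, p = A.shape
    w = np.zeros(p)
    col_sq = (A ** 2).sum(axis=0) / n
    resid = b.copy()
    for _ in range(n_iter):
        for k in range(p):
            if col_sq[k] == 0.0:
                continue
            resid += A[:, k] * w[k]                # drop column k's contribution
            rho = A[:, k] @ resid / n
            w[k] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[k]  # soft threshold
            resid -= A[:, k] * w[k]                # restore it with the new weight
    return w

def lasso_granger(data, max_lag=1, alpha=0.1):
    """Boolean adjacency matrix; adj[j, i] is True when feature j is selected as a cause of i."""
    data = np.asarray(data, dtype=float)
    n, p = data.shape
    # Column block for each lag holds all features at time t - lag.
    lagged = np.hstack([data[max_lag - lag : n - lag] for lag in range(1, max_lag + 1)])
    lagged = lagged - lagged.mean(axis=0)          # centering replaces an intercept
    adj = np.zeros((p, p), dtype=bool)
    for i in range(p):
        target = data[max_lag:, i] - data[max_lag:, i].mean()
        w = lasso_cd(lagged, target, alpha)
        # Rows of the reshaped weights are lags, columns are features.
        adj[:, i] = (np.abs(w.reshape(max_lag, p)) > 0).any(axis=0)
    return adj

# Synthetic example (assumed data): feature 0 drives feature 1; feature 2 is pure noise.
rng = np.random.default_rng(0)
n = 1000
data = np.zeros((n, 3))
for t in range(1, n):
    data[t, 0] = 0.5 * data[t - 1, 0] + rng.normal()
    data[t, 1] = 0.8 * data[t - 1, 0] + rng.normal()
    data[t, 2] = rng.normal()

graph = lasso_granger(data, max_lag=1, alpha=0.2)
print(graph.astype(int))
```

Unlike the exhaustive approach, each target here is fit once against all candidate causes jointly, so the penalty decides which lagged features survive. The choice of alpha controls how sparse the resulting graph is.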
Exhaustive Granger vs. Lasso Granger
Baseline methods: SIN and VAR
• SIN
• VAR
Empirical Evaluation of Competing Methods
• Evaluation by simulation
  – Sample data from synthetic (linear normal) causal model
  – Learn using a number of competing methods
• Compare learned graphs to original model
  – Measure similarity of output graph to original graph in terms of
    • Precision of predicted edges
    • Recall of predicted edges
    • F1 of predicted edges
• Parameterize performance analysis
  – Randomly sample graphs from parameter space
    • Lag; Features; Affinity; Noise; Samples per feature; Samples per feature per lag
  – Conditioning to see interaction effects
    • E.g. effect of # features when samples_per_feature_per_lag is small vs. large
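The edge-level comparison between a learned graph and the original model can be sketched as follows. Graphs are represented here as sets of directed (cause, effect) pairs. That representation, the function name, and the toy graphs are assumptions for illustration only.

```python
def edge_scores(true_edges, predicted_edges):
    """Precision, recall and F1 of predicted directed edges against the true graph."""
    true_edges, predicted_edges = set(true_edges), set(predicted_edges)
    hits = len(true_edges & predicted_edges)       # correctly predicted edges
    precision = hits / len(predicted_edges) if predicted_edges else 0.0
    recall = hits / len(true_edges) if true_edges else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Toy example: one of two predicted edges is correct; one of three true edges is found.
true_graph = {("a", "b"), ("b", "c"), ("a", "c")}
learned_graph = {("a", "b"), ("c", "b")}

precision, recall, f1 = edge_scores(true_graph, learned_graph)
print(precision, recall, f1)
```

Because edges are directed, a reversed edge such as ("c", "b") counts as both a false positive and a missed true edge.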
Experiment 1A: Performance vs. Factors
- Random sampling all factors -
Experiment 1’s Efficiency
Experiment 1B: Performance vs. Factors
- Fixing other factors -
Experiment 1C: Performance vs. Factors
- Detail: Parametric Conditioning -
Experiment 2: Learned Graphs
Experiment 3: Real World Data
Output Graphs on the Corporate KPI Data