Identifying Robust Activation in fMRI. Thomas Nichols, Ph.D., Assistant Professor, Department of Biostatistics, University of Michigan. http://www.sph.umich.edu/~nichols FBIRN, March 13, 2006.
1
Identifying Robust Activation in fMRI
Thomas Nichols, Ph.D.
Assistant Professor
Department of Biostatistics
University of Michigan
http://www.sph.umich.edu/~nichols
FBIRN
March 13, 2006
2
Are Robust Activations a Problem?
• Robust activation
  – Proposed definition: An effect that is detected regardless of the specific model or methods used
• Shouldn’t we be worried about non-robust activations?
3
Robustness Overview
• 1 voxel | Univariate
  – Validity
  – Sensitivity
• Images | Mass-univariate
  – Validity for some multiple-testing Type I error metric
  – Sensitivity, depending on metric
4
Robustness & Test Validity
• Parametric, Two-sample t-test
  – Famously robust
    • False positive rate ≈ α even...
    • Under non-normality, heterogeneous variance
    • Most robust with balanced data
  – Can have problems with outliers
  – False positive rate may be < α
• Impact for imaging
  – Simple block designs probably very safe
Univariate
5
Univariate
Robustness & Test Validity
• Non-Parametric tests
  – “Exact” by construction
    • False positive rate precisely α
  – NB: Due to discreteness, the desired α may not be available
  – Not a generic modeling framework
    • No “permutation GLM”
    • Autocorrelation challenging
• Impact for imaging
  – Within subject, must account for autocorrelation
  – Between subject, simple models easy
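The discreteness caveat can be made concrete with a minimal sketch of an exact one-sample sign-flip permutation test. The function names and the five paired differences are made up for illustration, and the test assumes differences that are symmetric about zero under the null. With n subjects there are only 2^n sign flips, so attainable levels are multiples of 1/2^n.

```python
from itertools import product


def sign_flip_pvalue(diffs):
    """Exact one-sided p-value by enumerating all 2**n sign flips.

    Assumes the differences are symmetric about zero under H0.
    """
    observed = sum(diffs)
    n_perm = 0
    n_extreme = 0
    for signs in product((1, -1), repeat=len(diffs)):
        n_perm += 1
        if sum(s * d for s, d in zip(signs, diffs)) >= observed:
            n_extreme += 1
    return n_extreme / n_perm


def largest_attainable_level(n, alpha):
    """Largest multiple of 1/2**n that does not exceed alpha."""
    n_perm = 2 ** n
    return int(alpha * n_perm) / n_perm


diffs = [1.2, 0.8, 1.5, 0.3, 0.9]          # made-up paired differences
print(sign_flip_pvalue(diffs))              # 1/32, the smallest p possible with n = 5
print(largest_attainable_level(5, 0.05))    # 1/32 = 0.03125
print(largest_attainable_level(12, 0.05))   # 204/4096
```

With five subjects, a nominal 0.05 test is really a 0.03125 test; with twelve subjects the gap is already small.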
6
Robustness & Test Power
• Parametric, Two-sample t-test
  – Reduced sensitivity
    • From outliers or, with un-balanced data, non-normality or heteroscedasticity
• Impact for imaging
  – Safe, but possibly conservative approach
  – Not getting the most out of the data
Univariate
7
Univariate
Robustness & Test Power
• Non-Parametric tests
  – Sensitivity varies with test!
    • Just because all tests are “Exact” doesn’t mean all have same sensitivity to Ha
  – When Normality true, or almost, t-test is optimal
    • Indicates permutation t-test is good
  – When data very non-normal, other tests better
    • E.g. median
    • “Robust” methods – Iteratively Re-weighted Least Squares (Wager, NI, 2005)
8
Univariate
Robustness & Test Power
• Non-Parametric tests
• In-flight Monte Carlo Simulation
  – One-sample test on differences, 12 Subjects
    • 11 Ss have effect size 1
    • 1 S has effect size -2
  – Compare power of two permutation tests
    • Median & t-test
  – Conclusion
    • Both tests “exact”, but Median more sensitive in the presence of outliers

Test Statistic    Power (α = 0.05)
T-test            60.9%
Median            68.7%
Normal data, 1,000 realizations
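A simulation of this kind can be sketched as follows. This is not the code behind the slide. It assumes unit-variance Normal data (so "effect size" is the mean), random rather than exhaustive sign flips, and fewer realizations than the slide's 1,000 to keep the run short, so the printed powers will not match the table exactly. All function names are hypothetical.

```python
import math
import random


def t_stat(x):
    """One-sample t statistic."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / (n - 1)
    return mean / math.sqrt(var / n)


def median_stat(x):
    """Sample median."""
    s = sorted(x)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else 0.5 * (s[mid - 1] + s[mid])


def perm_pvalue(x, stat, n_perm, rng):
    """One-sided Monte Carlo sign-flip p-value for the chosen statistic."""
    observed = stat(x)
    n_extreme = 1  # the observed labelling counts as one permutation
    for _ in range(n_perm):
        flipped = [v if rng.random() < 0.5 else -v for v in x]
        if stat(flipped) >= observed:
            n_extreme += 1
    return n_extreme / (n_perm + 1)


def estimate_power(stat, n_real=200, n_perm=199, alpha=0.05, seed=0):
    """Fraction of simulated data sets in which the permutation test rejects."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_real):
        # 11 subjects with mean 1 and one outlying subject with mean -2
        x = [rng.gauss(1.0, 1.0) for _ in range(11)] + [rng.gauss(-2.0, 1.0)]
        if perm_pvalue(x, stat, n_perm, rng) <= alpha:
            hits += 1
    return hits / n_real


print("t-test power:", estimate_power(t_stat))
print("median power:", estimate_power(median_stat))
```

Both tests use the same permutation scheme and differ only in the statistic passed in, which is the point of the slide: exactness comes from the permutation, sensitivity from the statistic.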
9
Robustness & Test Power
• Implications for Imaging
  – Non-normality (group heterogeneity) can reduce sensitivity
  – Alternate test statistics can out-perform standard methods
Univariate
10
Mass-Univariate Inference
• Interesting Result?
  – t = 5.446
  – P = 4.3×10⁻⁵!
• Look at the data
  – Contrast unremarkable
  – Standard deviation low
  – White matter!
• Must account for multiple tests!
FIAC group data, 15 subjects, block design data
Different Speaker & Sentence Effect
11
Mass-Univariate
Robustness & Test Validity
• 100,000 tests, 0-100,000 false positives!
  – No unique measure of false positives
• Just two:
  – Familywise Error Rate (FWER)
    • Chance of existence of one or more false positives
  – False Discovery Rate (FDR)
    • Expected fraction of false positives (among all detections)
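The two metrics can be illustrated with a small simulation in which every test is truly null. This is a sketch under stated assumptions: independent tests with uniform p-values, 1,000 tests instead of 100,000 to keep it fast, and hypothetical function names. FWER is estimated as the fraction of experiments with at least one false positive, and FDR as the average false discovery proportion.

```python
import random


def false_positive_summary(rejected, is_null):
    """Return (V, R, FDP): false positives, total detections, V / max(R, 1)."""
    v = sum(1 for r, h0 in zip(rejected, is_null) if r and h0)
    r_total = sum(1 for r in rejected if r)
    return v, r_total, v / max(r_total, 1)


def simulate_all_null(threshold, n_tests=1000, n_experiments=200, seed=0):
    """Estimate (FWER, FDR) when every test is truly null."""
    rng = random.Random(seed)
    any_false = 0
    fdp_sum = 0.0
    for _ in range(n_experiments):
        rejected = [rng.random() <= threshold for _ in range(n_tests)]
        v, r_total, fdp = false_positive_summary(rejected, [True] * n_tests)
        any_false += v >= 1
        fdp_sum += fdp
    return any_false / n_experiments, fdp_sum / n_experiments


print(simulate_all_null(0.05))          # uncorrected threshold
print(simulate_all_null(0.05 / 1000))   # Bonferroni threshold
```

When all nulls are true, every detection is a false positive, so the two estimates coincide; they separate only when some real effects are present.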
12
Mass-Univariate
Robustness & Test Validity
FWER methods
• Parametric, Random Field Theory
  – Provides thresholds that control FWER
  – Assumes data is smooth random field
• Very flexible framework
  – Closed form results for t/Z/F...
• Can be conservative
  – Low DF
  – Low smoothness
13
Mass-Univariate
Robustness & Test Validity
FWER methods
• Non-Parametric
  – Use permutation to find null max distribution
  – No smoothness assumptions
  – “Exact” control of FWER
• Not very flexible
  – But can get a lot of mileage out of 1-, 2-sample t, and correlation
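The "null max distribution" idea can be sketched for a one-sample design as follows. This is an illustration rather than any package's implementation. It assumes sign flipping of whole subject images, random rather than exhaustive permutations, and made-up toy data (10 subjects, 20 voxels, an effect added at voxel 0); the function names are hypothetical.

```python
import math
import random


def t_stat(x):
    """One-sample t statistic."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / (n - 1)
    return mean / math.sqrt(var / n)


def voxel_stats(data, signs):
    """t statistic at each voxel after multiplying each subject's image by its sign."""
    n_vox = len(data[0])
    return [t_stat([s * subj[v] for s, subj in zip(signs, data)]) for v in range(n_vox)]


def max_stat_threshold(data, n_perm=199, alpha=0.05, seed=0):
    """FWER threshold from the permutation distribution of the maximum statistic."""
    rng = random.Random(seed)
    observed = voxel_stats(data, [1] * len(data))
    max_null = [max(observed)]  # the observed labelling is one of the permutations
    for _ in range(n_perm):
        signs = [rng.choice((1, -1)) for _ in data]
        max_null.append(max(voxel_stats(data, signs)))
    max_null.sort(reverse=True)
    c = int(alpha * len(max_null))   # floor(alpha * N)
    threshold = max_null[c]          # the (c + 1)-th largest maximum
    return threshold, [t > threshold for t in observed]


# Toy data: 10 subjects x 20 voxels of noise, with an effect added at voxel 0.
rng = random.Random(42)
data = [[rng.gauss(0.0, 1.0) for _ in range(20)] for _ in range(10)]
for subj in data:
    subj[0] += 5.0

threshold, detected = max_stat_threshold(data)
print(threshold, [v for v, d in enumerate(detected) if d])
```

Because each sign is applied to a subject's entire image, the spatial correlation between voxels is preserved in every permutation, which is why no smoothness assumption is needed.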
14
Mass-Univariate
Robustness & Test Power
FWER methods
• Parametric, Random Field Theory
  – Can be conservative when...
    • Low DF
    • Low smoothness
• Nonparametric Permutation
  – More powerful when RFT has problems
15
FWER Thresholds: RFT vs. Perm
• RF & Perm adapt to smoothness
• Perm & Truth close
• Bonferroni close to truth for low smoothness
[Figure panels: 9 df, 19 df]
16
Real Data – Threshold: RFT vs. Bonf. vs. Perm.
t Threshold (0.05 Corrected)

Study                 df   RF        Bonf    Perm
Verbal Fluency         4   4701.32   42.59   10.14
Location Switching     9     11.17    9.07    5.83
Task Switching         9     10.79   10.35    5.10
Faces: Main Effect    11     10.43    9.07    7.92
Faces: Interaction    11     10.70    9.07    8.26
Item Recognition      11      9.87    9.80    7.67
Visual Motion         11     11.07    8.92    8.40
Emotional Pictures    12      8.48    8.41    7.15
Pain: Warning         22      5.93    6.05    4.99
Pain: Anticipation    22      5.87    6.05    5.05
17
Real Data – Number of Voxels Found: RFT vs. Bonf. vs. Perm.
No. Significant Voxels (0.05 Corrected)

                           t      t      t      SmVar t
Study                 df   RF     Bonf   Perm   Perm
Verbal Fluency         4     0      0      0       0
Location Switching     9     0      0    158     354
Task Switching         9     4      6   2241    3447
Faces: Main Effect    11   127    371    917    4088
Faces: Interaction    11     0      0      0       0
Item Recognition      11     5      5     58     378
Visual Motion         11   626   1260   1480    4064
Emotional Pictures    12     0      0      0       7
Pain: Warning         22   127    116    221     347
Pain: Anticipation    22    74     55    182     402
18
Mass-Univariate Inference
• FWER-Corrected P-value: 0.9878
• FDR-Corrected P-value: 0.1122
• Interpretation
  – This result is totally consistent with the null hyp. when searching 26,000 voxels
FIAC group data, 15 subjects, block design data
Different Speaker & Sentence Effect
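A rough back-of-envelope check shows why such a small uncorrected P-value is unremarkable over this many voxels. The sketch below uses only the figures quoted in these slides (uncorrected P = 4.3×10⁻⁵, 26,000 voxels) and a plain Bonferroni adjustment. It is not the method that produced the corrected P-values above and does not reproduce them.

```python
n_voxels = 26_000          # search volume quoted on the slide
p_uncorrected = 4.3e-5     # uncorrected P-value of the "interesting" voxel

# Expected number of voxels at least this extreme if the null holds everywhere
expected_null_hits = n_voxels * p_uncorrected

# Bonferroni-adjusted P-value, capped at 1
p_bonferroni = min(1.0, expected_null_hits)

print(round(expected_null_hits, 3), p_bonferroni)
```

With more than one such voxel expected by chance alone, the Bonferroni-adjusted P-value is capped at 1.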
19
Robustness Conclusions
• Separately consider validity and sensitivity
• Validity
  – Most methods fairly robust
  – Event-related fMRI probably least robust
• Sensitivity
  – Standard univariate methods suffer under non-normality, heterogeneity
  – RFT FWER thresholds can lack sensitivity under low DF, low smoothness
  – Nonparametric methods, while not fully general, provide good power under problematic settings
20
Permutation for fMRI: BOLD vs. ASL
• Temporal Autocorrelation
  – BOLD fMRI has it
  – Makes permutation test difficult
• Differenced ASL data
  – Differenced ASL data white (Aguirre et al)
  – Permutation test now easy
    • Though Aguirre found that regressing out movement parameters was necessary to get nominal FPRs
21
BOLD vs. ASL: My stance: Don’t Difference!
• Model the control/label effect
  – Differenced data has length n/2
  – Only using ½ the data is suboptimal
  – Gauss-Markov Theorem
    • Optimally precise estimates come from full, whitened model
• Advantages
  – Uses standard BOLD fMRI modeling tools
• Reference
  – Mumford, Hernandez & Nichols, Estimation Efficiency and Statistical Power in Arterial Spin Labeling FMRI. Provisionally accepted, NeuroImage.
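The Gauss-Markov point can be checked numerically in a deliberately simplified setting. The sketch below is not the model from the cited paper. It assumes a single boxcar regressor with no intercept and AR(1) noise with correlation rho^|i-j|, and all names are hypothetical. It compares the sampling variance of the ordinary least squares estimate with that of the estimate from the whitened model.

```python
import math


def ar1_whiten(x, rho):
    """Whiten a series whose errors have AR(1) correlation rho**|i - j|."""
    out = [math.sqrt(1.0 - rho ** 2) * x[0]]
    out += [x[t] - rho * x[t - 1] for t in range(1, len(x))]
    return out


def ols_variance(x, rho):
    """Variance of the OLS estimate in y = x*beta + e, Cov(e)_ij = rho**|i - j|."""
    n = len(x)
    xvx = sum(x[i] * x[j] * rho ** abs(i - j) for i in range(n) for j in range(n))
    xx = sum(v * v for v in x)
    return xvx / xx ** 2


def gls_variance(x, rho):
    """Variance of the whitened (GLS) estimate under the same noise model."""
    xw = ar1_whiten(x, rho)
    # Whitened errors are uncorrelated with variance 1 - rho**2
    return (1.0 - rho ** 2) / sum(v * v for v in xw)


# Boxcar regressor: 10 scans off, 10 scans on, 80 scans in total (made up)
x = [1.0 if (t // 10) % 2 else 0.0 for t in range(80)]
for rho in (0.0, 0.3, 0.6):
    print(rho, ols_variance(x, rho), gls_variance(x, rho))
```

The whitened estimate never has larger variance than the unwhitened one, and the two agree when rho is zero.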
22
ASL w/out Differencing
• Model all n observations
• Predictors
  – Baseline BOLD
  – Baseline perfusion
  – BOLD
  – Perfusion
Full Data Design Matrix Columns
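One way such a four-column design could be built is sketched below. The coding is an assumption made for illustration, not taken from the slides: scans alternate control (+0.5) and label (-0.5), the BOLD and Perfusion columns are read as the task-related effects, and the function names are hypothetical.

```python
def asl_full_design(task):
    """One row per scan: [baseline BOLD, baseline perfusion, BOLD, perfusion].

    Hypothetical coding: scans alternate control (+0.5) and label (-0.5),
    and `task` is a 0/1 indicator of the task condition at each scan.
    """
    rows = []
    for t, on in enumerate(task):
        tag = 0.5 if t % 2 == 0 else -0.5
        rows.append([1.0, tag, float(on), tag * on])
    return rows


def pairwise_difference(y):
    """Control minus label for consecutive pairs: n scans become n/2 differences."""
    return [y[t] - y[t + 1] for t in range(0, len(y) - 1, 2)]


task = [0] * 8 + [1] * 8
X = asl_full_design(task)
print(len(X), len(X[0]))                          # 16 scans, 4 columns
print(len(pairwise_difference(list(range(16)))))  # 8 differences
```

The full model keeps all 16 rows, whereas pairwise differencing leaves only 8 observations to fit.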
23
ASL w/out Differencing
• Two key aspects
  – Model all data
  – Account for autocorrelation
• Theoretical Result
  – Better power!
Difference in Power Relative to Modeling Full Data and Autocorrelation
24
ASL w/out Differencing
• Two key aspects
  – Model all data
  – Account for autocorrelation
• Real Data Result
  – Bigger Z’s! (on average)
Difference in Z score: Full Model GLS vs. Difference Data OLS
25
ASL Conclusion
• Intrasubject Inferences with ASL
  – Differenced ASL data only white when noise is 1/f
  – Worry about validity of intrasubject permutation test
• Group Inferences with ASL
  – Data then looks just like BOLD fMRI
  – Permutation test easy again