t tests, anovas and regression tom jenkins ellen meierotto spm methods for dummies 2007
![Page 1: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/1.jpg)
T tests, ANOVAs and regression
Tom Jenkins, Ellen Meierotto
SPM Methods for Dummies 2007
![Page 2: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/2.jpg)
Why do we need t tests?
![Page 3: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/3.jpg)
Objectives
- Types of error
- Probability distribution
- Z scores
- T tests
- ANOVAs
![Page 4: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/4.jpg)
![Page 5: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/5.jpg)
Error
- Null hypothesis
- Type 1 error (α): false positive
- Type 2 error (β): false negative
![Page 6: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/6.jpg)
Normal distribution
![Page 7: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/7.jpg)
Z scores
- Standardised normal distribution: µ = 0, σ = 1
- Z scores: 0, 1, 1.65, 1.96
- Need to know the population standard deviation
- Z = (x - μ)/σ for one point compared to the population
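A minimal sketch of the Z score formula, assuming a made-up population (mean 100, standard deviation 15) and a made-up observation; the function name `z_score` is illustrative. It also converts the Z score into a one-tailed p value using the standard normal distribution.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardise one observation against a population with known mean and SD."""
    return (x - mu) / sigma

# Hypothetical population: mean 100, standard deviation 15 (illustrative values).
z = z_score(129.4, mu=100.0, sigma=15.0)

# One-tailed p value: probability of a value at least this large under the null.
p_one_tailed = 1.0 - NormalDist().cdf(z)

print(f"z = {z:.2f}, one-tailed p = {p_one_tailed:.3f}")
```

A Z score near 1.96 leaves about 2.5% of the standard normal distribution in the upper tail, which is why that value appears in the list above.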
![Page 8: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/8.jpg)
T tests
Comparing means
- 1 sample t
- 2 sample t
- Paired t
![Page 9: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/9.jpg)
Different sample variances
![Page 10: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/10.jpg)
2 sample t tests
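
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}}$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
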
Pooled standard error of the mean
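A minimal sketch of the 2 sample t statistic, assuming the standard error of the difference is sqrt(s1²/n1 + s2²/n2) with sample variances computed on n - 1. The function name `two_sample_t` and the data are made up for illustration.

```python
import math
from statistics import mean, variance

def two_sample_t(sample1, sample2):
    """t = (mean1 - mean2) / sqrt(s1^2/n1 + s2^2/n2)."""
    n1, n2 = len(sample1), len(sample2)
    # statistics.variance uses the n - 1 divisor (sample variance).
    se_diff = math.sqrt(variance(sample1) / n1 + variance(sample2) / n2)
    return (mean(sample1) - mean(sample2)) / se_diff

# Made-up measurements for two independent groups (illustrative only).
group_a = [5.1, 4.9, 5.6, 5.8, 5.0, 5.4]
group_b = [4.2, 4.8, 4.4, 4.9, 4.1, 4.6]
t = two_sample_t(group_a, group_b)
print(f"t = {t:.3f}")
```

The larger the difference between the means relative to this standard error, the larger the absolute t value.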
![Page 11: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/11.jpg)
1 sample t test
![Page 12: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/12.jpg)
The effect of degrees of freedom on t distribution
![Page 13: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/13.jpg)
Paired t tests
![Page 14: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/14.jpg)
T tests in SPM: Did the observed signal change occur by chance or is it statistically significant?
- Recall the GLM: Y = Xβ + ε
- β1 is an estimate of signal change over time attributable to the condition of interest
- Set up contrast (cT) 1 0 for β1: (1×β1 + 0×β2 + … + 0×βn) / s.d.
- Null hypothesis: cTβ = 0, i.e. no significant effect at each voxel for condition β1
- Contrast 1 -1: Is the difference between 2 conditions significantly non-zero?
- t = cTβ / sd[cTβ] (1 sided)
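A minimal sketch of the contrast t statistic t = cTβ / sd[cTβ], assuming ordinary least squares with independent, equal-variance errors and a two-column design matrix. The function name `contrast_t`, the toy design and the data are illustrative assumptions; this is not SPM's own code.

```python
import math

def contrast_t(X, y, c):
    """t = c'b / sd[c'b] for a design matrix X given as a list of 2-element rows."""
    n = len(y)
    # Entries of X'X (2x2, symmetric) and X'y (2x1).
    a11 = sum(r[0] * r[0] for r in X)
    a12 = sum(r[0] * r[1] for r in X)
    a22 = sum(r[1] * r[1] for r in X)
    g1 = sum(r[0] * yi for r, yi in zip(X, y))
    g2 = sum(r[1] * yi for r, yi in zip(X, y))
    det = a11 * a22 - a12 * a12
    inv = [[a22 / det, -a12 / det], [-a12 / det, a11 / det]]
    # Least-squares parameter estimates: beta = (X'X)^-1 X'y.
    beta = [inv[0][0] * g1 + inv[0][1] * g2, inv[1][0] * g1 + inv[1][1] * g2]
    resid = [yi - (r[0] * beta[0] + r[1] * beta[1]) for r, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - 2)  # residual variance, n - 2 df
    c_inv_c = sum(c[i] * inv[i][j] * c[j] for i in range(2) for j in range(2))
    effect = c[0] * beta[0] + c[1] * beta[1]
    return effect / math.sqrt(sigma2 * c_inv_c)

# Toy design: four scans of condition 1 followed by four scans of condition 2.
X = [[1, 0]] * 4 + [[0, 1]] * 4
y = [3.0, 4.0, 5.0, 4.0, 1.0, 2.0, 1.0, 2.0]

t_cond1 = contrast_t(X, y, [1, 0])   # is beta1 different from zero?
t_diff = contrast_t(X, y, [1, -1])   # is beta1 - beta2 different from zero?
print(f"t(1 0) = {t_cond1:.3f}, t(1 -1) = {t_diff:.3f}")
```

Changing only the contrast vector changes the question being asked of the same fitted model.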
![Page 15: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/15.jpg)
![Page 16: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/16.jpg)
ANOVA
- Variances not means
- Total variance = model variance + error variance
- Results in an F score, corresponding to a p value

Variance:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

F test = Model variance / Error variance
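A minimal sketch of a one-way ANOVA, assuming the model variance is the between-group sum of squares on k - 1 degrees of freedom and the error variance is the within-group sum of squares on n - k degrees of freedom. The function name `one_way_f` and the data are made up for illustration.

```python
from statistics import mean

def one_way_f(groups):
    """Return F plus the total, model and error sums of squares."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Model (between groups): group means around the grand mean.
    ss_model = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Error (within groups): values around their own group mean.
    ss_error = sum((v - mean(g)) ** 2 for g in groups for v in g)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    f = (ss_model / (k - 1)) / (ss_error / (n - k))
    return f, ss_total, ss_model, ss_error

# Made-up data for three groups (illustrative only).
groups = [[4.1, 5.0, 4.6, 4.8], [5.9, 6.3, 5.5, 6.1], [4.9, 5.2, 5.6, 5.0]]
f, ss_total, ss_model, ss_error = one_way_f(groups)
print(f"F = {f:.3f}")
print(f"total = {ss_total:.3f}, model + error = {ss_model + ss_error:.3f}")
```

The second printed line shows the partition of the total sum of squares into model and error parts.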
![Page 17: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/17.jpg)
Partitioning the variance
[Figure: the Group 1 and Group 2 data shown in three panels, one each for total, model and error variance]
Total = Model (between groups) + Error (within groups)
![Page 18: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/18.jpg)
T vs F tests
- F tests: any differences between multiple groups, interactions
- Have to determine where differences are post-hoc
- SPM T: one-tailed (con)
- SPM F: two-tailed (ess)
![Page 19: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/19.jpg)
![Page 20: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/20.jpg)
Conclusions
- T tests describe how unlikely it is that experimental differences are due to chance
- The higher the t score, the smaller the p value, and the more unlikely it is that the difference is due to chance
- Can compare a sample with a population, or 2 samples, paired or unpaired
- ANOVA/F tests are similar but use variances instead of means, and can be applied to more than 2 groups and other more complex scenarios
![Page 21: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/21.jpg)
Acknowledgements
- MfD slides 2004-2006
- Van Belle, Biostatistics
- Human Brain Function
- Wikipedia
![Page 22: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/22.jpg)
Correlation and Regression
![Page 23: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/23.jpg)
Topics Covered:
- Is there a relationship between x and y?
- What is the strength of this relationship?
  - Pearson's r
- Can we describe this relationship and use it to predict y from x?
  - Regression
- Is the relationship we have described statistically significant?
  - F- and t-tests
- Relevance to SPM
  - GLM
![Page 24: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/24.jpg)
Relationship between x and y
- Correlation describes the strength and direction of a linear relationship between two variables
- Regression tells you how well a certain independent variable predicts a dependent variable
- CORRELATION ≠ CAUSATION
- In order to infer causality: manipulate the independent variable and observe the effect on the dependent variable
![Page 25: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/25.jpg)
Scattergrams
[Figure: three scatter plots of Y against X, showing positive correlation, negative correlation and no correlation]
![Page 26: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/26.jpg)
Variance vs. Covariance
Do two variables change together?

Covariance ~ DX * DY:

$$\operatorname{cov}(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n}$$

Variance ~ DX * DX:

$$S_x^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}$$
![Page 27: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/27.jpg)
Covariance
- When X ↑ and Y ↑: cov(x, y) is positive
- When X ↑ and Y ↓: cov(x, y) is negative
- When there is no consistent relationship: cov(x, y) = 0

$$\operatorname{cov}(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n}$$
![Page 28: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/28.jpg)
Example Covariance
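[Figure: scatter plot of the five (x, y) points in the table, both axes running from 0 to 7]

| $x_i$ | $y_i$ | $x_i - \bar{x}$ | $y_i - \bar{y}$ | $(x_i - \bar{x})(y_i - \bar{y})$ |
|---|---|---|---|---|
| 0 | 3 | -3 | 0 | 0 |
| 2 | 2 | -1 | -1 | 1 |
| 3 | 4 | 0 | 1 | 0 |
| 4 | 0 | 1 | -3 | -3 |
| 6 | 6 | 3 | 3 | 9 |
| $\bar{x} = 3$ | $\bar{y} = 3$ | | | $\sum = 7$ |

$$\operatorname{cov}(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n} = \frac{7}{5} = 1.4$$

What does this number tell us?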
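The worked example can be checked with a few lines of code. This is a minimal sketch that uses the five (x, y) pairs from this slide and divides by n, as the slide's result of 1.4 implies; the function name `covariance` is illustrative.

```python
from statistics import mean

def covariance(xs, ys):
    """Mean of the products of deviations, using the n divisor."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / len(xs)

# Data from the worked example on this slide.
x = [0, 2, 3, 4, 6]
y = [3, 2, 4, 0, 6]
print(covariance(x, y))  # sum of products is 7, so 7 / 5 = 1.4
```

Swapping x and y gives the same value, since the products of deviations are symmetric.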
![Page 29: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/29.jpg)
Example of how covariance value relies on variance
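
| Subject | x (high variance) | y (high variance) | x error * y error | x (low variance) | y (low variance) | x error * y error |
|---|---|---|---|---|---|---|
| 1 | 101 | 100 | 2500 | 54 | 53 | 9 |
| 2 | 81 | 80 | 900 | 53 | 52 | 4 |
| 3 | 61 | 60 | 100 | 52 | 51 | 1 |
| 4 | 51 | 50 | 0 | 51 | 50 | 0 |
| 5 | 41 | 40 | 100 | 50 | 49 | 1 |
| 6 | 21 | 20 | 900 | 49 | 48 | 4 |
| 7 | 1 | 0 | 2500 | 48 | 47 | 9 |
| Mean | 51 | 50 | | 51 | 50 | |
| Sum of x error * y error | | | 7000 | | | 28 |
| Covariance (sum / (n - 1)) | | | 1166.67 | | | 4.67 |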
![Page 30: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/30.jpg)
Pearson’s R
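- Covariance on its own does not really tell us anything, because its size depends on the scale (variance) of the data
- Solution: standardise this measure
- Pearson's R: standardise by dividing the covariance by the standard deviations of x and y:

$$r_{xy} = \frac{\operatorname{cov}(x, y)}{s_x s_y}$$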
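A minimal sketch of why standardising helps, using the high-variance and low-variance data sets from the covariance example in these slides. Their covariances are very different (1166.67 and 4.67), yet Pearson's r comes out the same for both. The function name `pearson_r` is illustrative.

```python
import math
from statistics import mean

def pearson_r(xs, ys):
    """r = cov(x, y) / (s_x * s_y)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    # The divisor (n or n - 1) cancels between cov(x, y) and s_x * s_y.
    return sxy / math.sqrt(sxx * syy)

high_x = [101, 81, 61, 51, 41, 21, 1]
high_y = [100, 80, 60, 50, 40, 20, 0]
low_x = [54, 53, 52, 51, 50, 49, 48]
low_y = [53, 52, 51, 50, 49, 48, 47]

print(pearson_r(high_x, high_y), pearson_r(low_x, low_y))
```

In both data sets y is exactly x - 1, so the relationship is perfectly linear regardless of how spread out the values are.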
![Page 31: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/31.jpg)
Basic assumptions
- Normal distributions
- Variances are constant and not zero
- Independent sampling: no autocorrelations
- No errors in the values of the independent variable
- All causation in the model is one-way (not necessary mathematically, but essential for prediction)
![Page 32: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/32.jpg)
Pearson’s R: degree of linear dependence
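
$$\operatorname{cov}(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n}$$

$$r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n\, s_x s_y}$$

$$-1 \le r \le 1$$

$$r_{xy} = \frac{\sum_{i=1}^{n} Z_{x_i} Z_{y_i}}{n}$$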
![Page 33: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/33.jpg)
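Limitations of r
- r is actually $\hat{r}$
  - r = true r of whole population
  - $\hat{r}$ = estimate of r based on data
- r is very sensitive to extreme values:

[Figure: scatter plot accompanying the point about extreme values, x axis from 0 to 6, y axis from 0 to 5]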
![Page 34: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/34.jpg)
In the real world…
- r is never 1 or -1
- Interpretations for correlations in psychological research (Cohen):

| Correlation | Negative | Positive |
|---|---|---|
| Small | -0.29 to -0.10 | 0.10 to 0.29 |
| Medium | -0.49 to -0.30 | 0.30 to 0.49 |
| Large | -1.00 to -0.50 | 0.50 to 1.00 |
![Page 35: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/35.jpg)
Regression
Correlation tells you if there is an association between x and y but it doesn’t describe the relationship or allow you to predict one variable from the other.
To do this we need REGRESSION!
![Page 36: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/36.jpg)
Best-fit Line
- Aim of linear regression is to fit a straight line, ŷ = ax + b, to data that gives the best prediction of y for any value of x
- This will be the line that minimises the distance between the data and the fitted line, i.e. the residuals

[Figure: scatter plot with fitted line ŷ = ax + b; labels mark the slope, the intercept, a true value y_i, its predicted value ŷ, and the residual error ε between them]
![Page 37: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/37.jpg)
Least Squares Regression
- To find the best line we must minimise the sum of the squares of the residuals (the vertical distances from the data points to our line)
- Model line: ŷ = ax + b, where a = slope and b = intercept
- Residual (ε) = y - ŷ
- Sum of squares of residuals = Σ(y - ŷ)²
- We must find the values of a and b that minimise Σ(y - ŷ)²
![Page 38: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/38.jpg)
Finding b
First we find the value of b that gives the minimum sum of squares
[Figure: scatter plots with the line drawn at different intercepts b, showing the residuals ε]
Trying different values of b is equivalent to shifting the line up and down the scatter plot
![Page 39: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/39.jpg)
Finding a
Now we find the value of a that gives the minimum sum of squares
[Figure: scatter plots with lines of different slope, each passing through the same intercept b]
Trying out different values of a is equivalent to changing the slope of the line, while b stays constant
![Page 40: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/40.jpg)
Minimising sums of squares
- Need to minimise Σ(y - ŷ)²
- ŷ = ax + b, so need to minimise: Σ(y - ax - b)²
- If we plot the sums of squares for all different values of a and b we get a parabola, because it is a squared term
- So the minimum sum of squares is at the bottom of the curve, where the gradient is zero

[Figure: parabola of sums of squares (S) plotted against values of a and b, with min S marked where the gradient = 0]
![Page 41: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/41.jpg)
The maths bit
- So we can find the a and b that give the minimum sum of squares by taking partial derivatives of Σ(y - ax - b)² with respect to a and b separately
- Then we set these derivatives equal to 0 and solve, which gives the values of a and b with the minimum sum of squares
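As a sketch of the step just described, the two partial derivatives can be written out as follows. The symbol S for the sum of squares, the index i, and the bar notation for sample means are introduced here for illustration; the last equality assumes the covariance and the standard deviations use the same divisor.

```latex
\begin{aligned}
S(a, b) &= \sum_{i=1}^{n} (y_i - a x_i - b)^2 \\
\frac{\partial S}{\partial b} &= -2 \sum_{i=1}^{n} (y_i - a x_i - b) = 0
  \;\Rightarrow\; b = \bar{y} - a\bar{x} \\
\frac{\partial S}{\partial a} &= -2 \sum_{i=1}^{n} x_i (y_i - a x_i - b) = 0
  \;\Rightarrow\; a = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
  = \frac{\operatorname{cov}(x, y)}{s_x^2} = r\,\frac{s_y}{s_x}
\end{aligned}
```

The second line is obtained by substituting b from the first into the derivative with respect to a.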
![Page 42: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/42.jpg)
The solution
Doing this gives the following equations for a and b:

$$a = r\,\frac{s_y}{s_x}$$

where r = correlation coefficient of x and y, $s_y$ = standard deviation of y, $s_x$ = standard deviation of x

You can see that:
- A low correlation coefficient gives a flatter slope (small value of a)
- Large spread of y, i.e. high standard deviation, results in a steeper slope (high value of a)
- Large spread of x, i.e. high standard deviation, results in a flatter slope (low value of a)
![Page 43: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/43.jpg)
The solution cont.
- Our model equation is ŷ = ax + b
- This line must pass through the point of means $(\bar{x}, \bar{y})$, so:

$$\bar{y} = a\bar{x} + b \quad\Rightarrow\quad b = \bar{y} - a\bar{x}$$

- We can put our equation for a into this, giving:

$$b = \bar{y} - r\,\frac{s_y}{s_x}\,\bar{x}$$

where r = correlation coefficient of x and y, $s_y$ = standard deviation of y, $s_x$ = standard deviation of x

- The smaller the correlation, the closer the intercept is to the mean of y
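A minimal sketch of the solution, computing the slope as a = r · s_y / s_x and the intercept as b = ȳ - a·x̄. It reuses the five (x, y) pairs from the covariance example earlier in these slides; the function name `fit_line` is illustrative.

```python
import math
from statistics import mean

def fit_line(xs, ys):
    """Return slope a, intercept b and correlation r of the least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)
    a = r * math.sqrt(syy / sxx)   # a = r * s_y / s_x (the divisor cancels)
    b = y_bar - a * x_bar          # the line passes through (x_bar, y_bar)
    return a, b, r

x = [0, 2, 3, 4, 6]
y = [3, 2, 4, 0, 6]
a, b, r = fit_line(x, y)
print(f"a = {a:.2f}, b = {b:.2f}, r = {r:.2f}")
```

Here x and y happen to have the same spread, so the slope equals r; rescaling y would change the slope but not r.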
![Page 44: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/44.jpg)
Back to the model
We can calculate the regression line for any data, but the important question is:
How well does this line fit the data, or how good is it at predicting y from x?
![Page 45: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/45.jpg)
How good is our model?

Total variance of y:

$$s_y^2 = \frac{\sum (y - \bar{y})^2}{n - 1} = \frac{SS_y}{df_y}$$

Variance of predicted y values (ŷ):

$$s_{\hat{y}}^2 = \frac{\sum (\hat{y} - \bar{y})^2}{n - 1} = \frac{SS_{pred}}{df_{\hat{y}}}$$

This is the variance explained by our regression model

Error variance:

$$s_{error}^2 = \frac{\sum (y - \hat{y})^2}{n - 2} = \frac{SS_{er}}{df_{er}}$$

This is the variance of the error between our predicted y values and the actual y values, and thus is the variance in y that is NOT explained by the regression model
![Page 46: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/46.jpg)
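How good is our model cont.

Total variance = predicted variance + error variance

$$s_y^2 = s_{\hat{y}}^2 + s_{er}^2$$

(Strictly, this partition is exact for the sums of squares, $SS_y = SS_{pred} + SS_{er}$; for the variances it is exact only when all three use the same divisor.)

Conveniently, via some complicated rearranging:

$$s_{\hat{y}}^2 = r^2 s_y^2$$

$$r^2 = \frac{s_{\hat{y}}^2}{s_y^2}$$

so r² is the proportion of the variance in y that is explained by our regression model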
![Page 47: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/47.jpg)
How good is our model cont.
Insert $r^2 s_y^2$ into $s_y^2 = s_{\hat{y}}^2 + s_{er}^2$ and rearrange to get:

$$s_{er}^2 = s_y^2 - r^2 s_y^2 = s_y^2 (1 - r^2)$$

From this we can see that the greater the correlation the smaller the error variance, so the better our prediction
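A minimal numerical check of these relationships, working with sums of squares so that the identities hold exactly. It reuses the five (x, y) pairs from the covariance example earlier in these slides and computes the slope as Σ(x - x̄)(y - ȳ) / Σ(x - x̄)², which is equivalent to r · s_y / s_x. Variable names are illustrative.

```python
from statistics import mean

x = [0, 2, 3, 4, 6]
y = [3, 2, 4, 0, 6]
x_bar, y_bar = mean(x), mean(y)

# Least-squares line y_hat = a*x + b.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
a = sxy / sxx
b = y_bar - a * x_bar
y_hat = [a * xi + b for xi in x]

ss_y = sum((yi - y_bar) ** 2 for yi in y)                 # total
ss_pred = sum((yh - y_bar) ** 2 for yh in y_hat)          # explained by the model
ss_er = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # unexplained (error)
r_squared = sxy ** 2 / (sxx * ss_y)

print(f"SS_y = {ss_y:.4f}, SS_pred + SS_er = {ss_pred + ss_er:.4f}")
print(f"r^2 = {r_squared:.4f}, SS_pred / SS_y = {ss_pred / ss_y:.4f}")
print(f"SS_er = {ss_er:.4f}, SS_y * (1 - r^2) = {ss_y * (1 - r_squared):.4f}")
```

Each printed line should show two matching numbers, one computed directly and one from the identity.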
![Page 48: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/48.jpg)
Is the model significant?
i.e. do we get a significantly better prediction of y from our regression equation than by just predicting the mean?

F-statistic:

$$F_{(df_{\hat{y}},\, df_{er})} = \frac{s_{\hat{y}}^2}{s_{er}^2} = \dots = \frac{r^2 (n - 2)}{1 - r^2}$$

(via some complicated rearranging, with the model variance taken on 1 degree of freedom)

And it follows that (because F = t²):

$$t_{(n-2)} = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}}$$

So all we need to know are r and n!
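A minimal sketch of computing the test statistics from r and n alone, assuming the standard identities for simple linear regression, F(1, n - 2) = r²(n - 2) / (1 - r²) and t(n - 2) = r·√(n - 2) / √(1 - r²). The function names and the example values of r and n are illustrative.

```python
import math

def f_from_r(r, n):
    """F(1, n - 2) for a simple linear regression, from r and n alone."""
    return r ** 2 * (n - 2) / (1 - r ** 2)

def t_from_r(r, n):
    """t(n - 2) for the same test; F equals t squared."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r, n = 0.35, 5   # illustrative values
f = f_from_r(r, n)
t = t_from_r(r, n)
print(f"F(1, {n - 2}) = {f:.4f}, t({n - 2}) = {t:.4f}, t^2 = {t * t:.4f}")
```

The p value would then be read from the F distribution with (1, n - 2) degrees of freedom or the t distribution with n - 2 degrees of freedom.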
![Page 49: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/49.jpg)
General Linear Model
Linear regression is actually a form of the General Linear Model where the parameters are a, the slope of the line, and b, the intercept.
y = ax + b + ε
A General Linear Model is just any model that describes the data in terms of a straight line
![Page 50: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/50.jpg)
Multiple regression
- Multiple regression is used to determine the effect of a number of independent variables, x1, x2, x3 etc., on a single dependent variable, y
- The different x variables are combined in a linear way and each has its own regression coefficient:

y = a1x1 + a2x2 + … + anxn + b + ε

- The a parameters reflect the independent contribution of each independent variable, x, to the value of the dependent variable, y
- i.e. the amount of variance in y that is accounted for by each x variable after all the other x variables have been accounted for
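A minimal sketch of fitting y = a1x1 + a2x2 + b by least squares, assuming the normal equations (XᵀX)β = Xᵀy are solved by Gauss-Jordan elimination. The helper names `solve` and `multiple_regression` and the noise-free data are made up for illustration; a real analysis would use a numerical library.

```python
def solve(A, rhs):
    """Solve A x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [v] for row, v in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def multiple_regression(xs, y):
    """Least-squares coefficients [a1, ..., ak, b] for y = a1*x1 + ... + ak*xk + b."""
    rows = [list(x) + [1.0] for x in xs]   # constant column carries the intercept b
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Noise-free illustration generated from y = 2*x1 - 1*x2 + 0.5.
xs = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [2 * x1 - x2 + 0.5 for x1, x2 in xs]
a1, a2, b = multiple_regression(xs, y)
print(f"a1 = {a1:.3f}, a2 = {a2:.3f}, b = {b:.3f}")
```

Because the data contain no noise, the fit should recover the coefficients used to generate them.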
![Page 51: T tests, ANOVAs and regression Tom Jenkins Ellen Meierotto SPM Methods for Dummies 2007](https://reader035.vdocuments.pub/reader035/viewer/2022062404/5515dc6a550346d46f8b4af2/html5/thumbnails/51.jpg)
SPM
- Linear regression is a GLM that models the effect of one independent variable, x, on ONE dependent variable, y
- Multiple Regression models the effect of several independent variables, x1, x2 etc, on ONE dependent variable, y
- Both are types of General Linear Model
- GLM can also allow you to analyse the effects of several independent x variables on several dependent variables, y1, y2, y3 etc, in a linear combination
- This is what SPM does and will be explained soon…