Graphs in Statistical Analysis
F. J. Anscombe
Dept. of Statistics, Yale Univ.
The American Statistician, 1973
Jan 2, 2014
Hee-gook Jun
2 / 19
Outline
Introduction
Regression Analysis Model
Graphs in Statistical Analysis
Conclusion
3 / 19
Both Calculations and Graphs
Both should be made and studied
Each will contribute to understanding
4 / 19
Stereotype of Graph
Calculations are exact, but graphs are rough
Intricate calculations are virtuous, whereas looking at the data is cheating
For any particular kind of data, there is just one correct set of statistical calculations
5 / 19
Purpose of Graph
Perceive broad features of the data
Look behind those broad features
Check whether the assumptions of the statistical calculation are correct
6 / 19
Good Statistical Analysis is
Not a simple routine: it takes more than one pass through the computer
Sensitive to specific features in the data
Sensitive to general background information about the data
7 / 19
Outline
Introduction
Regression Analysis Model
Graphs in Statistical Analysis
Conclusion
8 / 19
Regression Analysis
Explain data
Estimate new data
[Figure: two plots of y against x; the fitted line is $y = a + bx$, with intercept $a$ marked.]
9 / 19
Regression Analysis Model
Data
x: 1, 2, 3, 4, 5, ...
y: 1.5, 2.0, 2.5, 3.0, 2.9, ...
(x, y): (1.0, 1.5), (2.0, 2.0), (3.0, 2.5), (4.0, 3.0), …
Model
$f: X \to Y$, of the form $y = a + bx$
$f(x) = 1 + 0.5x$
New instance (x = 10, y = ?) → f(10) = 1 + 0.5 × 10 = 6
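As a rough check on this slide, here is a minimal Python sketch that fits a line of the form y = a + bx to the four (x, y) pairs listed above and evaluates it at x = 10. Least squares is an assumption here, since the slide does not name the fitting method, and `fit_line` and `f` are illustrative names rather than anything from the presentation.

```python
# The four (x, y) pairs listed on the slide.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.5, 2.0, 2.5, 3.0]


def fit_line(xs, ys):
    """Return (a, b) minimising the sum of squared errors for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b


a, b = fit_line(xs, ys)


def f(x):
    """Fitted model: estimate y for a new x."""
    return a + b * x


print(f"a = {a:.2f}, b = {b:.2f}, f(10) = {f(10):.2f}")
```

The same `f` can then be applied to any new instance, which is the "estimate new data" use of regression.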
10 / 19
Residual Value [1/3]
[Figure: two plots of y against x. One shows a perfectly fitted model; the other shows $f(x) = a + bx$ with unexplained variation.]
11 / 19
Residual Value [2/3]
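[Figure: plot of y against x showing the fitted line $\hat{y} = f(x) = a + bx$, an observed point $(x, y)$, the fitted point $(x, \hat{y})$, and the error between them.]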
12 / 19
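Residual Value [3/3]
$\hat{y}_i = \beta_0 + \beta_1 x_i$
$y_i = \hat{y}_i + \varepsilon_i$
$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
[Figure: plot of y against x showing an observed point $(x, y)$, the fitted point $(x, \hat{y})$, and the error between them.]
Sum of errors?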
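The question of summing the errors can be explored numerically. The sketch below is only an illustration: it assumes an ordinary least-squares fit (the slides do not name the method) and reuses the five x and y values shown on the Regression Analysis Model slide. The names `b0`, `b1` and `residuals` are chosen here to mirror $\beta_0$, $\beta_1$ and $\varepsilon_i$.

```python
# Five (x, y) values shown on the Regression Analysis Model slide.
xs = [1, 2, 3, 4, 5]
ys = [1.5, 2.0, 2.5, 3.0, 2.9]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least-squares estimates of the slope (b1) and intercept (b0).
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

# Fitted values y_hat_i and residuals e_i = y_i - y_hat_i.
fitted = [b0 + b1 * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, fitted)]

sum_of_errors = sum(residuals)
sum_of_squared_errors = sum(e ** 2 for e in residuals)

print(f"sum of errors         = {sum_of_errors:.6f}")
print(f"sum of squared errors = {sum_of_squared_errors:.6f}")
```

For a least-squares line with an intercept, the signed errors cancel to zero up to rounding, so their plain sum says nothing about the quality of the fit. The sum of squared errors does not cancel, which is why it is the quantity usually minimised.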
13 / 19
Outline
Introduction
Regression Analysis Model
Graphs in Statistical Analysis
Conclusion
14 / 19
Numerical Calculations
Data set 1     Data set 2     Data set 3     Data set 4
   x      y       x      y       x      y       x      y
10.0   8.04    10.0   9.14    10.0   7.46     8.0   6.58
 8.0   6.95     8.0   8.14     8.0   6.77     8.0   5.76
13.0   7.58    13.0   8.74    13.0  12.74     8.0   7.71
 9.0   8.81     9.0   8.77     9.0   7.11     8.0   8.84
11.0   8.33    11.0   9.26    11.0   7.81     8.0   8.47
14.0   9.96    14.0   8.10    14.0   8.84     8.0   7.04
 6.0   7.24     6.0   6.13     6.0   6.08     8.0   5.25
 4.0   4.26     4.0   3.10     4.0   5.39    19.0  12.50
12.0  10.84    12.0   9.13    12.0   8.15     8.0   5.56
 7.0   4.82     7.0   7.26     7.0   6.42     8.0   7.91
 5.0   5.68     5.0   4.74     5.0   5.73     8.0   6.89

                  Data set 1     Data set 2     Data set 3     Data set 4
N                 11             11             11             11
Mean(x)           9.0            9.0            9.0            9.0
Mean(y)           7.5            7.5            7.5            7.5
Regression line   y = 3 + 0.5x   y = 3 + 0.5x   y = 3 + 0.5x   y = 3 + 0.5x
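The summary figures on this slide can be recomputed from the raw values. The following is a minimal Python sketch, assuming an ordinary least-squares line for the "Regression line" row; `summarize` and `datasets` are illustrative names introduced here.

```python
# The four data sets from the slide; sets 1-3 share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
datasets = {
    1: (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    4: ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
        [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return (N, mean x, mean y, intercept a, slope b) for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return n, mean_x, mean_y, a, b


summaries = {k: summarize(xs, ys) for k, (xs, ys) in datasets.items()}

for k, (n, mean_x, mean_y, a, b) in summaries.items():
    print(f"Data set {k}: N={n}, mean(x)={mean_x:.1f}, "
          f"mean(y)={mean_y:.1f}, y = {a:.1f} + {b:.1f}x")
```

Rounded to one decimal place, the four data sets are indistinguishable by these numbers alone, which is the point the following slides make with graphs.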
15 / 19
Data set 1
The kind of thing most people would see in their mind’s eye
16 / 19
Data set 2
Does not conform with the theoretical description
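Without the plot, one way to see the mismatch is to look at the signs of the residuals in order of increasing x. This is a hedged sketch, not the presenter's method: it assumes a least-squares line and uses the Data set 2 values from the Numerical Calculations slide.

```python
# Data set 2 from the Numerical Calculations slide.
xs = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ys = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
     / sum((x - mean_x) ** 2 for x in xs))
a = mean_y - b * mean_x

# Sign of each residual y - (a + b*x), ordered by increasing x.
pairs = sorted(zip(xs, ys))
signs = "".join("+" if y - (a + b * x) > 0 else "-" for x, y in pairs)

print(signs)
```

If the straight-line model held, the signs would be scattered irregularly. Here they form one unbroken run of positive residuals in the middle with negative residuals at both ends, which suggests a curved relationship rather than a linear one.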
17 / 19
Data set 3
One of the observations is far from the line
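The stray observation can also be located numerically by finding the largest residual. The sketch below assumes a least-squares line and uses the Data set 3 values from the Numerical Calculations slide; `worst_x` and `worst_residual` are illustrative names.

```python
# Data set 3 from the Numerical Calculations slide.
xs = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ys = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
     / sum((x - mean_x) ** 2 for x in xs))
a = mean_y - b * mean_x

# Residual for every observation, keyed by its x value.
residuals = {x: y - (a + b * x) for x, y in zip(xs, ys)}

# The observation whose residual is largest in absolute value.
worst_x = max(residuals, key=lambda x: abs(residuals[x]))
worst_residual = residuals[worst_x]

print(f"largest residual at x = {worst_x}: {worst_residual:.2f}")
```

A single residual that dwarfs the rest is a cue to inspect that observation before trusting the fitted line.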
18 / 19
Data set 4
There was something unsatisfactory about the data set
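One possible reading of what is unsatisfactory, offered as an assumption since the slide does not spell it out, is that the fitted slope rests on a single observation. The sketch below checks this on the Data set 4 values from the Numerical Calculations slide by computing the spread of x with and without the point at x = 19.

```python
# Data set 4 from the Numerical Calculations slide.
xs = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
ys = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]


def sum_sq_dev(values):
    """Sum of squared deviations from the mean (the slope's denominator)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


# Spread of x with all 11 observations.
sxx_all = sum_sq_dev(xs)

# Spread of x after dropping the single observation at x = 19.
xs_rest = [x for x in xs if x != 19]
sxx_rest = sum_sq_dev(xs_rest)

print(f"Sxx with all points:      {sxx_all}")
print(f"Sxx without the x=19 one: {sxx_rest}")
```

With the x = 19 observation removed, every remaining x equals 8, so the spread of x is zero and a least-squares slope cannot be computed at all. The regression line on the summary slide therefore depends entirely on that one point.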
19 / 19
Conclusion
Both calculations and graphs contribute to understanding
Thought and ingenuity devoted to devising good graphs are likely to pay off