Chapter 6
Department of Mathematics. Lecturer: Nguyễn Đình Huy
Chapter 4
APPLYING MS-EXCEL TO CORRELATION AND REGRESSION ANALYSIS
Correlation analysis
Regression analysis
Simple
Multiple (multi-parameter)
A- CORRELATION ANALYSIS
6.1 Statistical concepts
Two random variables Y and X may be linearly related (a and b), show a linear trend (c), or be unrelated (d and c).
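Pearson correlation coefficient:
$$\rho = \frac{\mathrm{COV}(X, Y)}{\sigma_X \sigma_Y}; \quad \sigma_X^2 = \frac{1}{N}\sum_{i=1}^{N}(X_i - \mu_X)^2; \quad \sigma_Y^2 = \frac{1}{N}\sum_{i=1}^{N}(Y_i - \mu_Y)^2$$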
Correlation analysis examines the trend and the degree of an association, whereas regression analysis determines the quantitative relationship between two random variables Y and X. The correlation coefficient can be estimated by the expression:
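$$R = \frac{S_{XY}}{\sqrt{S_{XX}\,S_{YY}}} = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^2 \, \sum_{i=1}^{n}(Y_i - \bar{Y})^2}}$$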
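The estimator above can be checked by hand with a few lines of Python. This is a minimal sketch, not part of the original notes: the function name `pearson_r` is an illustrative assumption, and the data are the time, temperature and yield values of Example 17 later in this chapter.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation R = S_XY / sqrt(S_XX * S_YY)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Data of Example 17 (time in minutes, temperature in deg C, yield in %).
time_min = [15, 30, 60, 15, 30, 60, 15, 30, 60]
temperature = [105, 105, 105, 120, 120, 120, 135, 135, 135]
yield_pct = [1.87, 2.02, 3.28, 3.05, 4.07, 5.54, 5.03, 6.45, 7.26]

r_time = pearson_r(time_min, yield_pct)
r_temp = pearson_r(temperature, yield_pct)
print(f"R(time, yield) = {r_time:.4f}")
print(f"R(temperature, yield) = {r_temp:.4f}")
```

For a single predictor, |R| equals the "Multiple R" value that Excel reports in its Regression output, which gives a convenient cross-check.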
The correlation coefficient is used to assess the degree of association:
Value of |R|    Degree
6.2.1 Enter the data into the worksheet
6.2.3 Applying Correlation
a- Click the Tools menu, then the Data Analysis command.
b- Select Correlation in the Data Analysis dialog box, then click OK.
c- In the Correlation dialog box, specify in turn:
Input range (Input Range),
Grouping by rows or columns (Grouped By),
Data labels (Labels in First Row/Column),
Output range (Output Range)
Correlation dialog box
Results
Correlation coefficients: R (moisture/time) = 0.97; R (temperature/time) = 0.97 and R (moisture/temperature) = 0.95
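The Correlation tool reports its results as a lower-triangular matrix of pairwise coefficients. The worksheet behind the results above is not reproduced in these notes, so the sketch below uses the three columns of Example 17 (later in this chapter) as a stand-in. The column labels and the helper `pearson_r` are illustrative assumptions.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient of two equally long samples."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Stand-in data: the columns of Example 17, entered column-wise.
columns = {
    "X1 time": [15, 30, 60, 15, 30, 60, 15, 30, 60],
    "X2 temp": [105, 105, 105, 120, 120, 120, 135, 135, 135],
    "Y yield": [1.87, 2.02, 3.28, 3.05, 4.07, 5.54, 5.03, 6.45, 7.26],
}
names = list(columns)

# Full matrix of pairwise coefficients.
matrix = {(a, b): pearson_r(columns[a], columns[b]) for a in names for b in names}

# Print only the lower triangle, as the Excel tool does.
for i, a in enumerate(names):
    cells = [f"{matrix[(a, b)]:8.4f}" for b in names[: i + 1]]
    print(f"{a:8s}", *cells)
```

In this particular data set time and temperature are uncorrelated with each other, because every time level is combined with every temperature level.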
B- REGRESSION ANALYSIS
6.4 Statistical concepts
Linear regression analysis is widely applied in science. For example, the regression line (line of best fit) is commonly used to predict the shelf life or expiry date of a drug.
(Theoretical)
(Estimated)
The regression equation can be estimated by the least-squares method (least-squares estimation).
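C- SIMPLE LINEAR REGRESSION
6.5 General equation
$$Y_{|X} = B_0 + BX$$
$$B_0 = \bar{Y} - B\bar{X}$$
$$B = \frac{\sum_i X_i Y_i - \left(\sum_i X_i\right)\left(\sum_i Y_i\right)/N}{\sum_i X_i^2 - \left(\sum_i X_i\right)^2/N}$$
Y: dependent variable (dependent / response variable)
X: independent variable (independent / predictor variable)
B0 and B: regression coefficients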
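The two formulas for B and B0 translate directly into code. The following is a small sketch under the assumption that the data are held in plain Python lists; `fit_simple` is a name chosen here for illustration, and the data are the temperature and yield columns of Example 17.

```python
def fit_simple(xs, ys):
    """Least-squares intercept B0 and slope B for Y = B0 + B*X."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    # B = (sum XY - sum X * sum Y / N) / (sum X^2 - (sum X)^2 / N)
    b = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)
    # B0 = mean(Y) - B * mean(X)
    b0 = sum_y / n - b * sum_x / n
    return b0, b

temperature = [105, 105, 105, 120, 120, 120, 135, 135, 135]
yield_pct = [1.87, 2.02, 3.28, 3.05, 4.07, 5.54, 5.03, 6.45, 7.26]

b0_temp, b_temp = fit_simple(temperature, yield_pct)
print(f"B0 = {b0_temp:.4f}, B = {b_temp:.4f}")
```

The result agrees with the Intercept and X2 coefficients that Excel reports for the temperature model in Example 17.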
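ANOVA table
| Source of variation | Degrees of freedom | Sum of squares | Mean square | Test statistic |
|---|---|---|---|---|
| Regression | 1 | $SSR = \sum_i (Y'_i - \bar{Y})^2$ | MSR = SSR | F = MSR/MSE |
| Error | N - 2 | $SSE = \sum_i (Y_i - Y'_i)^2$ | MSE = SSE/(N - 2) | |
| Total | N - 1 | $SST = \sum_i (Y_i - \bar{Y})^2 = SSR + SSE$ | | |

($Y'_i$ is the value of Y predicted by the regression equation at $X_i$.)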
Summary statistics
R squared (R Square):
$$R^2 = \frac{SSR}{SST}$$
($100R^2$: percentage of the variation in Y that is explained by X)
Standard error (Standard Error):
$$S = \sqrt{\frac{1}{N-2}\sum_i (Y_i - Y'_i)^2}$$
(The less scattered the data, the closer S is to zero)
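To make the ANOVA quantities concrete, the sketch below computes SSR, SSE, SST, R squared and S for a one-predictor fit. It is an illustration only: the function name `anova_simple` and the dictionary keys are assumptions, and the data are again the temperature and yield columns of Example 17.

```python
import math

def anova_simple(xs, ys):
    """Sums of squares, R^2 and standard error for Y = B0 + B*X."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    b0 = y_bar - b * x_bar
    fitted = [b0 + b * x for x in xs]                      # Y'_i
    ssr = sum((f - y_bar) ** 2 for f in fitted)            # regression
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))    # error
    sst = sum((y - y_bar) ** 2 for y in ys)                # total
    return {
        "SSR": ssr,
        "SSE": sse,
        "SST": sst,
        "R2": ssr / sst,
        "S": math.sqrt(sse / (n - 2)),
    }

temperature = [105, 105, 105, 120, 120, 120, 135, 135, 135]
yield_pct = [1.87, 2.02, 3.28, 3.05, 4.07, 5.54, 5.03, 6.45, 7.26]

table = anova_simple(temperature, yield_pct)
for name, value in table.items():
    print(f"{name:4s} {value:.6f}")
```

SSR and SSE add up to SST, which is a quick sanity check on any implementation of this table.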
Statistical tests
For a regression equation $Y_{|X} = B_0 + BX$, the statistical significance of the coefficients $B_i$ ($B_0$ or $B$) is assessed with the t test (Student distribution), while the adequacy of the equation $Y_{|X} = f(X)$ is assessed with the F test (Fisher distribution).
t test
- Hypotheses:
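H0: $\beta_i = 0$ (the regression coefficient is not significant)
H1: $\beta_i \neq 0$ (the regression coefficient is significant)
- Test statistic:
$$t = \frac{|B_i - \beta_i|}{\sqrt{S_n^2}}; \quad S_n^2 = \frac{S^2}{\sum_i (X_i - \bar{X})^2}$$
which under H0 becomes
$$t = \frac{|B|}{\sqrt{S_n^2}}$$
(as written, $S_n^2$ is the variance of the slope B).
Student distribution with $\gamma = N - 2$ degrees of freedom
- Decision:
If $t < t_\alpha(N-2)$ ⇒ accept hypothesis H0.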
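The decision rule above can be sketched in Python as follows. This is an illustrative sketch, not the notes' own procedure: `slope_t_test` is a hypothetical helper, and the critical value 2.365 (alpha = 0.05, 7 degrees of freedom) is taken from the worked example later in this chapter rather than computed.

```python
import math

def slope_t_test(xs, ys, t_critical):
    """t statistic for the slope B of Y = B0 + B*X, tested against beta = 0."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    b0 = y_bar - b * x_bar
    sse = sum((y - (b0 + b * x)) ** 2 for x, y in zip(xs, ys))
    s2 = sse / (n - 2)              # S^2, the residual variance
    s_n = math.sqrt(s2 / s_xx)      # standard error of the slope
    t = abs(b) / s_n
    reject_h0 = t >= t_critical     # t < critical value => accept H0
    return t, s_n, reject_h0

time_min = [15, 30, 60, 15, 30, 60, 15, 30, 60]
temperature = [105, 105, 105, 120, 120, 120, 135, 135, 135]
yield_pct = [1.87, 2.02, 3.28, 3.05, 4.07, 5.54, 5.03, 6.45, 7.26]

T_CRIT = 2.365  # t(0.05; N - 2 = 7), value quoted in Example 17
t_time, se_time, reject_time = slope_t_test(time_min, yield_pct, T_CRIT)
t_temp, se_temp, reject_temp = slope_t_test(temperature, yield_pct, T_CRIT)
print(f"time:        t = {t_time:.3f}, reject H0: {reject_time}")
print(f"temperature: t = {t_temp:.3f}, reject H0: {reject_temp}")
```

The standard errors and t values correspond to the "Standard Error" and "t Stat" columns of the Excel Regression output for the slope.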
F test
- Hypotheses:
H0: $\beta_i = 0$ (the regression equation is not adequate)
H1: $\beta_i \neq 0$ (the regression equation is adequate)
- Test statistic:
$$F = \frac{MSR}{MSE}$$
Fisher distribution with v1 = 1, v2 = N - 2
- Conclusion:
If $F < F_\alpha(1, N-2)$ ⇒ accept hypothesis H0.
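In simple linear regression the F test and the t test on the slope are equivalent. The short derivation below is added as a worked check and is not part of the original notes; it uses only the definitions given above, together with the fact that $Y'_i - \bar{Y} = B(X_i - \bar{X})$. The numbers are those reported for the temperature model of Example 17 later in this chapter.

```latex
\begin{align*}
SSR &= \sum_i (Y'_i - \bar{Y})^2 = B^2 \sum_i (X_i - \bar{X})^2,
\qquad MSE = S^2 \\
F &= \frac{MSR}{MSE}
   = \frac{B^2 \sum_i (X_i - \bar{X})^2}{S^2}
   = \left( \frac{B}{\sqrt{S_n^2}} \right)^2 = t^2,
\qquad S_n^2 = \frac{S^2}{\sum_i (X_i - \bar{X})^2} \\
t &= 4.757191 \;\Rightarrow\; t^2 \approx 22.631 \approx F = 22.63086
\end{align*}
```

The same identity explains why the P-value of the slope and "Significance F" coincide in every one-predictor Excel output.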
D- MULTIPLE LINEAR REGRESSION
In multiple (multi-parameter) linear regression, the dependent variable Y is related to k independent variables $X_i$ (i = 1, 2, ..., k) instead of to a single one as in simple linear regression.
General equation:
$$Y_{|X_1, X_2, \ldots, X_k} = B_0 + B_1X_1 + B_2X_2 + \ldots + B_kX_k$$
The multiple regression equation can be written in matrix form:
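The matrix expression itself is not shown here. One standard way to write it, offered as an assumption about the intended form rather than as the notes' own notation, stacks the N observations into a design matrix whose first column is all ones (the symbols $\mathbf{X}$, $\mathbf{B}$, $\mathbf{e}$ and the index convention $X_{ij}$ = observation i of variable j are introduced here for illustration):

```latex
\[
\underbrace{\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{pmatrix}}_{\mathbf{Y}}
=
\underbrace{\begin{pmatrix}
1 & X_{11} & X_{12} & \cdots & X_{1k} \\
1 & X_{21} & X_{22} & \cdots & X_{2k} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & X_{N1} & X_{N2} & \cdots & X_{Nk}
\end{pmatrix}}_{\mathbf{X}}
\underbrace{\begin{pmatrix} B_0 \\ B_1 \\ \vdots \\ B_k \end{pmatrix}}_{\mathbf{B}}
+ \mathbf{e},
\qquad
\mathbf{B} = (\mathbf{X}^{T}\mathbf{X})^{-1}\mathbf{X}^{T}\mathbf{Y}
\]
```

The second expression is the usual least-squares solution of the normal equations; it requires $\mathbf{X}^{T}\mathbf{X}$ to be invertible.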
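ANOVA table
| Source of variation | Degrees of freedom | Sum of squares | Mean square | Test statistic |
|---|---|---|---|---|
| Regression | k | SSR | MSR = SSR/k | F = MSR/MSE |
| Error | N - k - 1 | SSE | MSE = SSE/(N - k - 1) | |
| Total | N - 1 | SST = SSR + SSE | | |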
Summary statistics:
R squared:
$$R^2 = \frac{SSR}{SST} = \frac{kF}{(N-k-1) + kF}$$
($R^2 \geq 0.81$ is fairly good)
Adjusted R squared (Adjusted R Square):
$$R^2_{adj} = \frac{(N-1)R^2 - k}{N-k-1} = R^2 - \frac{k(1-R^2)}{N-k-1}$$
($R^2_{adj}$ becomes negative or undefined if $R^2$ or N is small)
Standard error:
$$S = \sqrt{\frac{SSE}{N-k-1}}$$
($S \leq 0.30$ is fairly good)
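The summary statistics can all be derived from the two sums of squares in the ANOVA table. The sketch below is an illustration under that assumption; `summary_stats` is a hypothetical helper, and the inputs are the SSR and SSE values that Excel reports for the two-predictor model of Example 17 (N = 9, k = 2).

```python
import math

def summary_stats(ssr, sse, n, k):
    """F, R^2, adjusted R^2 and standard error from SSR, SSE, N and k."""
    sst = ssr + sse
    msr = ssr / k
    mse = sse / (n - k - 1)
    f = msr / mse
    r2 = ssr / sst
    # Equivalent expression in terms of F, as given in the text.
    r2_from_f = k * f / ((n - k - 1) + k * f)
    r2_adj = ((n - 1) * r2 - k) / (n - k - 1)
    s = math.sqrt(mse)
    return {"F": f, "R2": r2, "R2_from_F": r2_from_f, "R2_adj": r2_adj, "S": s}

stats = summary_stats(ssr=28.55973413, sse=0.652088095, n=9, k=2)
for name, value in stats.items():
    print(f"{name:10s} {value:.6f}")
```

Both routes to R squared give the same number, which confirms that the two forms of the formula are algebraically identical.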
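Statistical tests
These are similar to simple regression, but note the following:
- In the t test:
H0: $\beta_i = 0$ (the regression coefficients are not significant)
H1: $\beta_i \neq 0$ (at least some regression coefficients are significant)
Degrees of freedom of t: $\gamma = N - k - 1$.
$$t = \frac{|B_i - \beta_i|}{\sqrt{S_n^2}}; \quad S_n^2 = \frac{S^2}{\sum_i (X_i - \bar{X})^2}$$
- In the F test:
H0: $\beta_i = 0$ (the regression equation is not adequate)
H1: $\beta_i \neq 0$ (the regression equation is adequate for at least some $B_i$)
Degrees of freedom of F: v1 = k; v2 = N - k - 1.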
APPLYING MS-EXCEL
Example 17: Three temperature levels (105, 120 and 135 °C) were combined with three reaction times (15, 30 and 60 minutes) to carry out a synthesis reaction. The reaction yields (%) are shown in the following table:
| Time (min) X1 | Temperature (°C) X2 | Yield (%) Y |
|---|---|---|
| 15 | 105 | 1.87 |
| 30 | 105 | 2.02 |
| 60 | 105 | 3.28 |
| 15 | 120 | 3.05 |
| 30 | 120 | 4.07 |
| 60 | 120 | 5.54 |
| 15 | 135 | 5.03 |
| 30 | 135 | 6.45 |
| 60 | 135 | 7.26 |

Are temperature and/or time linearly related to the yield of the synthesis reaction? If so, what yield is expected at 115 °C for 50 minutes?
Enter the data into the worksheet
The data must be entered in columns:
Using Regression
Click the Tools menu, then the Data Analysis command.
Select Regression in the Data Analysis dialog box, then click OK.
In the Regression dialog box, specify in turn:
Range of the Y variable (Input Y Range)
Range of the X variable(s) (Input X Range)
Data labels (Labels)
Confidence level (Confidence Level)
Output location (Output Range)
Other options include the fitted line (Line Fit Plots) and the residual plots (Residual Plots)
Regression dialog box
Regression equation $Y_{|X_1} = f(X_1)$
$$Y_{|X_1} = 2.73 + 0.04X_1$$
($R^2$ = 0.21; S = 1.81)
| Regression Statistics | Value |
|---|---|
| Multiple R | 0.462512069 |
| R Square | 0.213917414 |
| Adjusted R Square | 0.101619901 |
| Standard Error | 1.811191587 |
| Observations | 9 |

ANOVA
| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 6.24891746 | 6.24891746 | 1.904917 | 0.209994918 |
| Residual | 7 | 22.96290476 | 3.280414966 | | |
| Total | 8 | 29.21182222 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% |
|---|---|---|---|---|---|
| Intercept | 2.726666667 | 1.280705853 | 2.129034282 | 0.070771 | -0.301721453 |
| X1 | 0.044539683 | 0.032270754 | 1.38018722 | 0.209995 | -0.031768525 |
t0 = 2.129 < t0.05 = 2.365 (or P-value = 0.071 > 0.05)
⇒ Accept hypothesis H0.
t1 = 1.380 < t0.05 = 2.365 (or P-value = 0.210 > 0.05)
⇒ Accept hypothesis H0.
F = 1.905 < F0.05 = 5.590 (or Significance F = 0.210 > 0.05)
⇒ Accept hypothesis H0.
Hence both coefficients, 2.73 (B0) and 0.04 (B1), of the regression equation $Y_{|X_1} = 2.73 + 0.04X_1$ are not statistically significant. In other words, this regression equation is not adequate.
Conclusion: Time is not linearly related to the yield of the synthesis reaction.
Regression equation $Y_{|X_2} = f(X_2)$
$$Y_{|X_2} = -11.14 + 0.13X_2$$
($R^2$ = 0.76; S = 0.99)
| Regression Statistics | Value |
|---|---|
| Multiple R | 0.873933544 |
| R Square | 0.76375984 |
| Adjusted R Square | 0.730011246 |
| Standard Error | 0.99290379 |
| Observations | 9 |

ANOVA
| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 22.31081667 | 22.31082 | 22.63086 | 0.002066188 |
| Residual | 7 | 6.901005556 | 0.985858 | | |
| Total | 8 | 29.21182222 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% |
|---|---|---|---|---|---|
| Intercept | -11.14111111 | 3.25965608 | -3.41788 | 0.011168 | -18.84897293 |
| X2 | 0.128555556 | 0.027023418 | 4.757191 | 0.002066 | 0.064655325 |
|t0| = 3.418 > t0.05 = 2.365 (or P-value = 0.011 < 0.05)
⇒ Reject hypothesis H0.
t2 = 4.757 > t0.05 = 2.365 (or P-value = 0.00206 < 0.05)
⇒ Reject hypothesis H0.
F = 22.631 > F0.05 = 5.590 (or Significance F = 0.00206 < 0.05)
⇒ Reject hypothesis H0.
Hence both coefficients, -11.14 (B0) and 0.13 (B2), of the regression equation $Y_{|X_2} = -11.14 + 0.13X_2$ are statistically significant. In other words, this regression equation is adequate.
Conclusion: Temperature is linearly related to the yield of the synthesis reaction.
Regression equation $Y_{|X_1,X_2} = f(X_1, X_2)$
$$Y_{|X_1,X_2} = -12.70 + 0.04X_1 + 0.13X_2$$
($R^2$ = 0.98; S = 0.33)
| Regression Statistics | Value |
|---|---|
| Multiple R | 0.988775634 |
| R Square | 0.977677254 |
| Adjusted R Square | 0.970236338 |
| Standard Error | 0.329668544 |
| Observations | 9 |

ANOVA
| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 2 | 28.55973413 | 14.27987 | 131.3921 | 1.11235E-05 |
| Residual | 6 | 0.652088095 | 0.108681 | | |
| Total | 8 | 29.21182222 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% |
|---|---|---|---|---|---|
| Intercept | -12.7 | 1.101638961 | -11.5283 | 2.56E-05 | -15.39561342 |
| X1 | 0.044539683 | 0.005873842 | 7.582718 | 0.000274 | 0.03016691 |
| X2 | 0.128555556 | 0.008972441 | 14.32782 | 7.23E-06 | 0.106600783 |
|t0| = 11.528 > t0.05 = 2.447 (or P-value = 2.56 × 10^-5 < 0.05)
⇒ Reject hypothesis H0.
t1 = 7.583 > t0.05 = 2.447 (or P-value = 0.00027 < 0.05)
⇒ Reject hypothesis H0.
t2 = 14.328 > t0.05 = 2.447 (or P-value = 7.23 × 10^-6 < 0.05)
⇒ Reject hypothesis H0.
F = 131.392 > F0.05 = 5.140 (or Significance F = 1.112 × 10^-5 < 0.05)
⇒ Reject hypothesis H0.
(With N - k - 1 = 6 degrees of freedom, the critical value is t0.05 = 2.447.)
Hence all three coefficients, -12.70 (B0), 0.04 (B1) and 0.13 (B2), of the regression equation $Y_{|X_1,X_2} = -12.70 + 0.04X_1 + 0.13X_2$ are statistically significant. In other words, this regression equation is adequate.
Conclusion: The yield of the synthesis reaction is linearly related to both factors, time and temperature.
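The two-predictor fit can be reproduced outside Excel by solving the normal equations $(X^TX)B = X^TY$. The sketch below is one possible implementation in plain Python and is not the method Excel uses internally; the helper names `solve` and `fit_multiple` are assumptions made for illustration.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_multiple(rows, ys):
    """Least-squares coefficients [B0, B1, ..., Bk] via the normal equations."""
    design = [[1.0] + list(r) for r in rows]         # leading column of ones
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(p)]
    return solve(xtx, xty)

# Example 17: (time in minutes, temperature in deg C) -> yield in %
rows = [(15, 105), (30, 105), (60, 105), (15, 120), (30, 120),
        (60, 120), (15, 135), (30, 135), (60, 135)]
ys = [1.87, 2.02, 3.28, 3.05, 4.07, 5.54, 5.03, 6.45, 7.26]

b0, b1, b2 = fit_multiple(rows, ys)
fitted = [b0 + b1 * x1 + b2 * x2 for x1, x2 in rows]
sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))
y_bar = sum(ys) / len(ys)
sst = sum((y - y_bar) ** 2 for y in ys)
r2 = 1 - sse / sst
s = math.sqrt(sse / (len(ys) - 2 - 1))               # N - k - 1 with k = 2
print(f"B0 = {b0:.4f}, B1 = {b1:.6f}, B2 = {b2:.6f}")
print(f"R2 = {r2:.4f}, S = {s:.4f}")
```

Because every time level is paired with every temperature level, B1 and B2 here equal the slopes of the two single-predictor fits; with unbalanced data they generally would not.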
The linearity of the equation $Y_{|X_1,X_2} = -12.70 + 0.04X_1 + 0.13X_2$ can be shown on scatterplots:
To predict the reaction yield with the regression equation $Y_{|X_1,X_2} = -12.70 + 0.04X_1 + 0.13X_2$, simply select a cell, for example B21, enter the formula, and obtain the result shown below:
B21 = B17 + B18 * 50 + B19 * 115
| | A | B | C | D |
|---|---|---|---|---|
| 17 | Intercept | -12.7 | 1.101638961 | -11.52827782 |
| 18 | X1 | 0.044539683 | 0.005873842 | 7.582717626 |
| 19 | X2 | 0.128555556 | 0.008972441 | 14.32782351 |
| 20 | | | | |
| 21 | Predicted | 4.310873016 | | |

Note: B17 is the cell holding B0, B18 the cell holding B1 and B19 the cell holding B2; 50 is the value of X1 (time) and 115 is the value of X2 (temperature).
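The same prediction can be written as a small function, which makes it easy to evaluate other conditions. This is an illustrative sketch; `predict_yield` is a hypothetical name and the default coefficients are copied from cells B17 to B19 of the worksheet above.

```python
def predict_yield(time_min, temp_c,
                  b0=-12.7, b1=0.044539683, b2=0.128555556):
    """Predicted yield (%) from Y = B0 + B1*X1 + B2*X2."""
    return b0 + b1 * time_min + b2 * temp_c

predicted = predict_yield(50, 115)   # 50 minutes at 115 deg C
print(f"Predicted yield: {predicted:.4f} %")
```

Predictions are most trustworthy inside the studied ranges (15 to 60 minutes, 105 to 135 °C); the point queried here lies within both.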
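APPENDIX:
Table of critical values used in the test for rejecting outlying values:
Test statistics:
$$G_1 = \frac{Y_2 - Y_1}{Y_N - Y_1} \quad (N = 3 \text{ to } 7)$$
$$G_2 = \frac{Y_3 - Y_1}{Y_{N-1} - Y_1} \quad (N = 8 \text{ to } 13)$$
$$G_3 = \frac{Y_3 - Y_1}{Y_{N-2} - Y_1} \quad (N = 14 \text{ to } 24)$$
| Test statistic | Number of observations N | Critical value $G_P$ (P = 0.01) |
|---|---|---|
| G1 | 3 | 0.976 |
| G1 | 4 | 0.846 |
| G1 | 5 | 0.729 |
| G1 | 6 | 0.644 |
| G1 | 7 | 0.586 |
| G2 | 8 | 0.780 |
| G2 | 9 | 0.725 |
| G2 | 10 | 0.678 |
| G2 | 11 | 0.638 |
| G2 | 12 | 0.605 |
| G2 | 13 | 0.578 |
| G3 | 14 | 0.602 |
| G3 | 15 | 0.579 |
| G3 | 16 | 0.559 |
| G3 | 17 | 0.542 |
| G3 | 18 | 0.527 |
| G3 | 19 | 0.514 |
| G3 | 20 | 0.502 |
| G3 | 21 | 0.491 |
| G3 | 22 | 0.481 |
| G3 | 23 | 0.472 |
| G3 | 24 | 0.464 |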
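A sketch of how the table might be applied is given below. It rests on assumptions that the appendix does not spell out: the observations are sorted so that the suspect value is Y1 (ascending order to test the smallest value, descending to test the largest), and a value is flagged when G exceeds the tabulated critical value. The function names and the sample data are hypothetical.

```python
# Critical values G_P (P = 0.01) copied from the appendix table, keyed by N.
CRITICAL = {
    3: 0.976, 4: 0.846, 5: 0.729, 6: 0.644, 7: 0.586,
    8: 0.780, 9: 0.725, 10: 0.678, 11: 0.638, 12: 0.605, 13: 0.578,
    14: 0.602, 15: 0.579, 16: 0.559, 17: 0.542, 18: 0.527, 19: 0.514,
    20: 0.502, 21: 0.491, 22: 0.481, 23: 0.472, 24: 0.464,
}

def g_statistic(ordered):
    """G1, G2 or G3 (chosen by N) for values ordered with the suspect first."""
    n = len(ordered)
    if n not in CRITICAL:
        raise ValueError("the table covers N = 3 to 24 only")
    y1 = ordered[0]
    if n <= 7:       # G1 = (Y2 - Y1) / (YN - Y1)
        num, den = ordered[1] - y1, ordered[-1] - y1
    elif n <= 13:    # G2 = (Y3 - Y1) / (Y(N-1) - Y1)
        num, den = ordered[2] - y1, ordered[-2] - y1
    else:            # G3 = (Y3 - Y1) / (Y(N-2) - Y1)
        num, den = ordered[2] - y1, ordered[-3] - y1
    return num / den

def outlier_check(values, suspect="high"):
    """Return (G, flagged) for the largest ('high') or smallest ('low') value."""
    ordered = sorted(values, reverse=(suspect == "high"))
    g = g_statistic(ordered)
    return g, g > CRITICAL[len(values)]

sample = [10.1, 10.2, 10.3, 10.2, 13.9]   # hypothetical measurements
g_high, high_flagged = outlier_check(sample, "high")
g_low, low_flagged = outlier_check(sample, "low")
print(f"largest value:  G = {g_high:.3f}, outlier: {high_flagged}")
print(f"smallest value: G = {g_low:.3f}, outlier: {low_flagged}")
```

Keeping the critical values in a dictionary keyed by N mirrors the layout of the table and makes it obvious when a sample size falls outside the tabulated range.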