Matrix Description of Data (数据的矩阵描述), staff.ustc.edu.cn/~zhangwm/matrix_analysis/courseware/...


Upload: vuthien

Post on 11-Apr-2018


Page 1: Matrix Description of Data

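Page 2: Array of Data

$$
\mathbf{X} =
\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1k} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2k} & \cdots & x_{2p} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
x_{j1} & x_{j2} & \cdots & x_{jk} & \cdots & x_{jp} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nk} & \cdots & x_{np}
\end{bmatrix}
$$

Each row of $\mathbf{X}$ is one sample (样本): row $j$ holds the $p$ measurements $x_{j1}, \ldots, x_{jp}$.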

Page 3: Descriptive Statistics

- Summary numbers to assess the information contained in data
- Basic descriptive statistics
  - Sample mean
  - Sample variance
  - Sample standard deviation
  - Sample covariance
  - Sample correlation coefficient

Page 4: Sample Mean and Sample Variance

$$
\bar{x}_k = \frac{1}{n} \sum_{j=1}^{n} x_{jk}
$$

$$
s_k^2 = s_{kk} = \frac{1}{n} \sum_{j=1}^{n} \left( x_{jk} - \bar{x}_k \right)^2, \qquad k = 1, 2, \ldots, p
$$
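As a minimal sketch of the two formulas on this slide, the snippet below computes the column means and the divisor-$n$ variances of a small data matrix with NumPy. The data values are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```python
import numpy as np

# Hypothetical data matrix: n = 4 samples (rows), p = 2 variables (columns).
X = np.array([[42.0, 4.0],
              [52.0, 5.0],
              [48.0, 4.0],
              [58.0, 3.0]])
n, p = X.shape

# Sample mean of each variable: x_bar_k = (1/n) * sum_j x_jk
x_bar = X.sum(axis=0) / n

# Sample variance with divisor n: s_kk = (1/n) * sum_j (x_jk - x_bar_k)^2
s_kk = ((X - x_bar) ** 2).sum(axis=0) / n

# Sample standard deviation: square root of the variance
s_k = np.sqrt(s_kk)

print(x_bar, s_kk, s_k)
```

Note that `X.var(axis=0)` uses the same divisor $n$ by default (`ddof=0`), so it agrees with `s_kk` here.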

Page 5: Sample Covariance and Sample Correlation Coefficient

$$
s_{ik} = \frac{1}{n} \sum_{j=1}^{n} \left( x_{ji} - \bar{x}_i \right)\left( x_{jk} - \bar{x}_k \right)
$$

$$
r_{ik} = \frac{s_{ik}}{\sqrt{s_{ii}}\,\sqrt{s_{kk}}}
= \frac{\sum_{j=1}^{n} \left( x_{ji} - \bar{x}_i \right)\left( x_{jk} - \bar{x}_k \right)}
{\sqrt{\sum_{j=1}^{n} \left( x_{ji} - \bar{x}_i \right)^2}\,\sqrt{\sum_{j=1}^{n} \left( x_{jk} - \bar{x}_k \right)^2}}
$$

$$
s_{ik} = s_{ki}, \quad r_{ik} = r_{ki}, \qquad i = 1, 2, \ldots, p; \; k = 1, 2, \ldots, p
$$
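The covariance and correlation formulas above can be sketched directly in NumPy. The helper names `sample_cov` and `sample_corr` and the data are hypothetical, introduced only for illustration; the divisor is $n$, as on the slide.

```python
import numpy as np

def sample_cov(x, y):
    """Sample covariance s_ik with divisor n."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.mean((x - x.mean()) * (y - y.mean()))

def sample_corr(x, y):
    """Sample correlation r_ik = s_ik / (sqrt(s_ii) * sqrt(s_kk))."""
    return sample_cov(x, y) / np.sqrt(sample_cov(x, x) * sample_cov(y, y))

# Hypothetical measurements of two variables on n = 4 samples.
x1 = [42.0, 52.0, 48.0, 58.0]
x2 = [4.0, 5.0, 4.0, 3.0]

s12 = sample_cov(x1, x2)
r12 = sample_corr(x1, x2)
print(s12, r12)
```

Because the divisor cancels in the ratio, `r12` is the same whether the covariances use $n$ or $n-1$.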

Page 6: Standardized Values (or Standardized Scores)

$$
\frac{x_{jk} - \bar{x}_k}{\sqrt{s_{kk}}}
$$

- Centered at zero
- Unit standard deviation
- Sample correlation coefficient can be regarded as a sample covariance of two standardized variables
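A short numerical check of the three properties of standardized values, under the assumption that variances use divisor $n$ as elsewhere in these slides. The function name `standardize` and the data are hypothetical.

```python
import numpy as np

def standardize(X):
    """Column-wise standardized values (x_jk - x_bar_k) / sqrt(s_kk)."""
    X = np.asarray(X, dtype=float)
    # np.std uses divisor n by default (ddof=0), matching s_kk on the slides.
    return (X - X.mean(axis=0)) / X.std(axis=0)

# Hypothetical data matrix: rows are samples, columns are variables.
X = np.array([[42.0, 4.0],
              [52.0, 5.0],
              [48.0, 4.0],
              [58.0, 3.0]])

Z = standardize(X)

# Sample covariance matrix of the standardized data (its column means are zero).
cov_of_standardized = (Z.T @ Z) / len(Z)

# Sample correlation matrix of the original data.
R = np.corrcoef(X, rowvar=False)
print(cov_of_standardized)
print(R)
```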

Page 7: Properties of Sample Correlation Coefficient

- The value of $r_{ik}$ is between -1 and 1
- Its magnitude measures the strength of the linear association
- Its sign indicates the direction of the association
- The value remains unchanged if all $x_{ji}$'s and $x_{jk}$'s are changed to $y_{ji} = a x_{ji} + b$ and $y_{jk} = c x_{jk} + d$, respectively, provided that the constants $a$ and $c$ have the same sign
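The invariance property can be checked numerically. The sketch below uses hypothetical data and constants; it also shows the complementary case where $a$ and $c$ have opposite signs, in which the correlation keeps its magnitude but flips sign.

```python
import numpy as np

def corr(x, y):
    """Sample correlation coefficient of two 1-D arrays."""
    return np.corrcoef(x, y)[0, 1]

# Hypothetical measurements of variables i and k.
x_i = np.array([42.0, 52.0, 48.0, 58.0])
x_k = np.array([4.0, 5.0, 4.0, 3.0])

r = corr(x_i, x_k)

# a = 2 and c = 0.5 have the same sign: correlation is unchanged.
r_same_sign = corr(2.0 * x_i + 3.0, 0.5 * x_k - 1.0)

# a = -2 and c = 0.5 have opposite signs: correlation changes sign.
r_opposite_sign = corr(-2.0 * x_i + 3.0, 0.5 * x_k - 1.0)

print(r, r_same_sign, r_opposite_sign)
```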

Page 8: Arrays of Basic Descriptive Statistics

$$
\bar{\mathbf{x}} = \begin{bmatrix} \bar{x}_1 \\ \bar{x}_2 \\ \vdots \\ \bar{x}_p \end{bmatrix}, \quad
\mathbf{S}_n = \begin{bmatrix}
s_{11} & s_{12} & \cdots & s_{1p} \\
s_{21} & s_{22} & \cdots & s_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
s_{p1} & s_{p2} & \cdots & s_{pp}
\end{bmatrix}, \quad
\mathbf{R} = \begin{bmatrix}
1 & r_{12} & \cdots & r_{1p} \\
r_{21} & 1 & \cdots & r_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
r_{p1} & r_{p2} & \cdots & 1
\end{bmatrix}
$$

Page 9: Random Vectors and Random Matrices

- Random vector: a vector whose elements are random variables
- Random matrix: a matrix whose elements are random variables

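Page 10: Expected Value of a Random Matrix

$$
E(\mathbf{X}) = \begin{bmatrix}
E(X_{11}) & E(X_{12}) & \cdots & E(X_{1p}) \\
E(X_{21}) & E(X_{22}) & \cdots & E(X_{2p}) \\
\vdots & \vdots & \ddots & \vdots \\
E(X_{n1}) & E(X_{n2}) & \cdots & E(X_{np})
\end{bmatrix}
$$

$$
E(X_{ij}) = \begin{cases}
\displaystyle \int_{-\infty}^{\infty} x_{ij}\, f_{ij}(x_{ij})\, dx_{ij} & \text{(continuous } X_{ij}\text{)} \\[2ex]
\displaystyle \sum_{\text{all } x_{ij}} x_{ij}\, p_{ij}(x_{ij}) & \text{(discrete } X_{ij}\text{)}
\end{cases}
$$

For constant matrices $\mathbf{A}$ and $\mathbf{B}$ of conformable dimensions:

$$
E(\mathbf{A}\mathbf{X}\mathbf{B}) = \mathbf{A}\, E(\mathbf{X})\, \mathbf{B}
$$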
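The rule $E(\mathbf{A}\mathbf{X}\mathbf{B}) = \mathbf{A}E(\mathbf{X})\mathbf{B}$ can be illustrated with a small discrete random matrix whose expectation is an exact finite sum. The outcomes, probabilities, and the matrices `A` and `B` below are hypothetical.

```python
import numpy as np

# Hypothetical discrete random 2x2 matrix X: three possible outcomes
# with probabilities 0.5, 0.3 and 0.2.
values = np.array([[[1.0, 0.0], [2.0, 1.0]],
                   [[0.0, 3.0], [1.0, 1.0]],
                   [[2.0, 2.0], [0.0, 4.0]]])
probs = np.array([0.5, 0.3, 0.2])

def expectation(mats, probs):
    """Element-wise expected value: sum over outcomes of probability * matrix."""
    return np.tensordot(probs, mats, axes=1)

# Hypothetical constant matrices.
A = np.array([[1.0, 2.0], [0.0, -1.0]])
B = np.array([[3.0, 1.0], [1.0, 0.0]])

E_X = expectation(values, probs)
lhs = expectation(np.array([A @ v @ B for v in values]), probs)  # E(AXB)
rhs = A @ E_X @ B                                                # A E(X) B
print(lhs)
print(rhs)
```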

Page 11: Population Mean Vectors

Random vector: $\mathbf{X}' = [X_1, X_2, \ldots, X_p]$

Joint probability density function: $f(\mathbf{x}) = f(x_1, x_2, \ldots, x_p)$

Marginal probability distribution: $f_i(x_i)$

$$
\mu_i = E(X_i), \qquad \sigma_i^2 = E(X_i - \mu_i)^2
$$

$$
\boldsymbol{\mu} = E(\mathbf{X}) = [\mu_1, \mu_2, \ldots, \mu_p]'
$$

Page 12: Covariance

$$
\sigma_{ik} = \operatorname{Cov}(X_i, X_k) = E\left[(X_i - \mu_i)(X_k - \mu_k)\right]
$$

$$
\sigma_{ik} = \begin{cases}
\displaystyle \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} (x_i - \mu_i)(x_k - \mu_k)\, f_{ik}(x_i, x_k)\, dx_i\, dx_k & \text{(continuous)} \\[2ex]
\displaystyle \sum_{\text{all } x_i} \sum_{\text{all } x_k} (x_i - \mu_i)(x_k - \mu_k)\, p_{ik}(x_i, x_k) & \text{(discrete)}
\end{cases}
$$

Page 13: Statistically Independent

$$
P[X_i \le x_i \text{ and } X_k \le x_k] = P[X_i \le x_i]\, P[X_k \le x_k]
$$

$$
f_{ik}(x_i, x_k) = f_i(x_i)\, f_k(x_k)
$$

$$
f_{12 \cdots p}(x_1, x_2, \ldots, x_p) = f_1(x_1)\, f_2(x_2) \cdots f_p(x_p)
$$

$$
\operatorname{Cov}(X_i, X_k) = 0 \quad \text{if } X_i, X_k \text{ are independent (the converse is not true in general)}
$$
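To illustrate why the converse fails, the sketch below uses a standard counterexample that is not from the slides: $X$ uniform on $\{-1, 0, 1\}$ and $Y = X^2$. It uses probability mass functions, the discrete analogue of the densities above. The covariance is exactly zero, yet the joint probability does not factor.

```python
import numpy as np

# Assumed example: X is uniform on {-1, 0, 1} and Y = X^2 (fully determined by X).
x = np.array([-1.0, 0.0, 1.0])
p = np.array([1 / 3, 1 / 3, 1 / 3])
y = x ** 2

mu_x = np.sum(p * x)
mu_y = np.sum(p * y)

# Population covariance: sum over outcomes of p * (x - mu_x) * (y - mu_y).
cov_xy = np.sum(p * (x - mu_x) * (y - mu_y))

# Independence would require P(X=0, Y=0) = P(X=0) * P(Y=0).
p_joint = 1 / 3              # X = 0 forces Y = 0
p_product = (1 / 3) * (1 / 3)  # P(X=0) * P(Y=0), since Y = 0 only when X = 0

print(cov_xy, p_joint, p_product)
```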

Page 14: Population Variance-Covariance Matrices

$$
\boldsymbol{\Sigma} = E(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})'
= E\left( \begin{bmatrix} X_1 - \mu_1 \\ X_2 - \mu_2 \\ \vdots \\ X_p - \mu_p \end{bmatrix}
\left[ X_1 - \mu_1,\; X_2 - \mu_2,\; \ldots,\; X_p - \mu_p \right] \right)
$$

$$
\boldsymbol{\Sigma} = \operatorname{Cov}(\mathbf{X}) = \begin{bmatrix}
\sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\
\sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp}
\end{bmatrix}
$$

Page 15: Population Correlation Coefficients

$$
\rho_{ik} = \frac{\sigma_{ik}}{\sqrt{\sigma_{ii}}\,\sqrt{\sigma_{kk}}}, \qquad \rho_{ii} = 1
$$

$$
\boldsymbol{\rho} = \begin{bmatrix}
\rho_{11} & \rho_{12} & \cdots & \rho_{1p} \\
\rho_{21} & \rho_{22} & \cdots & \rho_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
\rho_{p1} & \rho_{p2} & \cdots & \rho_{pp}
\end{bmatrix}
$$

Page 16: Standard Deviation Matrix

$$
\mathbf{V}^{1/2} = \begin{bmatrix}
\sqrt{\sigma_{11}} & 0 & \cdots & 0 \\
0 & \sqrt{\sigma_{22}} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sqrt{\sigma_{pp}}
\end{bmatrix}
$$

$$
\mathbf{V}^{1/2} \boldsymbol{\rho}\, \mathbf{V}^{1/2} = \boldsymbol{\Sigma}, \qquad
\boldsymbol{\rho} = \mathbf{V}^{-1/2}\, \boldsymbol{\Sigma}\, \mathbf{V}^{-1/2}
$$

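Page 17: Correlation Matrix from Covariance Matrix

$$
\boldsymbol{\Sigma} = \begin{bmatrix}
\sigma_{11} & \sigma_{12} & \sigma_{13} \\
\sigma_{21} & \sigma_{22} & \sigma_{23} \\
\sigma_{31} & \sigma_{32} & \sigma_{33}
\end{bmatrix}
= \begin{bmatrix} 4 & 1 & 2 \\ 1 & 9 & -3 \\ 2 & -3 & 25 \end{bmatrix}
$$

$$
\mathbf{V}^{1/2} = \begin{bmatrix}
\sqrt{\sigma_{11}} & 0 & 0 \\ 0 & \sqrt{\sigma_{22}} & 0 \\ 0 & 0 & \sqrt{\sigma_{33}}
\end{bmatrix}
= \begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 5 \end{bmatrix}, \qquad
\mathbf{V}^{-1/2} = \begin{bmatrix} 1/2 & 0 & 0 \\ 0 & 1/3 & 0 \\ 0 & 0 & 1/5 \end{bmatrix}
$$

$$
\boldsymbol{\rho} = \mathbf{V}^{-1/2}\, \boldsymbol{\Sigma}\, \mathbf{V}^{-1/2}
= \begin{bmatrix} 1 & 1/6 & 1/5 \\ 1/6 & 1 & -1/5 \\ 1/5 & -1/5 & 1 \end{bmatrix}
$$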
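The numerical example on this slide can be reproduced in a few lines of NumPy. This is a minimal sketch; the variable names are illustrative.

```python
import numpy as np

# Covariance matrix from the slide.
Sigma = np.array([[4.0, 1.0, 2.0],
                  [1.0, 9.0, -3.0],
                  [2.0, -3.0, 25.0]])

# Standard deviation matrix V^{1/2} = diag(sqrt(sigma_11), ..., sqrt(sigma_pp)).
V_half = np.diag(np.sqrt(np.diag(Sigma)))
V_half_inv = np.linalg.inv(V_half)

# Correlation matrix rho = V^{-1/2} Sigma V^{-1/2}.
rho = V_half_inv @ Sigma @ V_half_inv
print(rho)
```

Multiplying back, `V_half @ rho @ V_half` recovers `Sigma`, which is the other identity on the Standard Deviation Matrix slide.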

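Page 18: Partitioning Covariance Matrix

$$
\mathbf{X} = \left[ \begin{array}{c} X_1 \\ \vdots \\ X_q \\ \hline X_{q+1} \\ \vdots \\ X_p \end{array} \right]
= \left[ \begin{array}{c} \mathbf{X}^{(1)} \\ \hline \mathbf{X}^{(2)} \end{array} \right], \qquad
\boldsymbol{\mu} = \left[ \begin{array}{c} \boldsymbol{\mu}^{(1)} \\ \hline \boldsymbol{\mu}^{(2)} \end{array} \right]
$$

$$
(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})' =
\begin{bmatrix}
(\mathbf{X}^{(1)} - \boldsymbol{\mu}^{(1)})(\mathbf{X}^{(1)} - \boldsymbol{\mu}^{(1)})' &
(\mathbf{X}^{(1)} - \boldsymbol{\mu}^{(1)})(\mathbf{X}^{(2)} - \boldsymbol{\mu}^{(2)})' \\
(\mathbf{X}^{(2)} - \boldsymbol{\mu}^{(2)})(\mathbf{X}^{(1)} - \boldsymbol{\mu}^{(1)})' &
(\mathbf{X}^{(2)} - \boldsymbol{\mu}^{(2)})(\mathbf{X}^{(2)} - \boldsymbol{\mu}^{(2)})'
\end{bmatrix}
$$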

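Page 19: Partitioning Covariance Matrix

$$
\boldsymbol{\Sigma} = E(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})'
= \left[ \begin{array}{cc} \boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} \\ \boldsymbol{\Sigma}_{21} & \boldsymbol{\Sigma}_{22} \end{array} \right]
$$

$$
\boldsymbol{\Sigma} = \left[ \begin{array}{ccc|ccc}
\sigma_{11} & \cdots & \sigma_{1q} & \sigma_{1,q+1} & \cdots & \sigma_{1p} \\
\vdots & & \vdots & \vdots & & \vdots \\
\sigma_{q1} & \cdots & \sigma_{qq} & \sigma_{q,q+1} & \cdots & \sigma_{qp} \\ \hline
\sigma_{q+1,1} & \cdots & \sigma_{q+1,q} & \sigma_{q+1,q+1} & \cdots & \sigma_{q+1,p} \\
\vdots & & \vdots & \vdots & & \vdots \\
\sigma_{p1} & \cdots & \sigma_{pq} & \sigma_{p,q+1} & \cdots & \sigma_{pp}
\end{array} \right]
$$

$$
\boldsymbol{\Sigma}_{12} = \boldsymbol{\Sigma}_{21}'
$$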
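Partitioning a covariance matrix amounts to slicing it at index $q$. The sketch below assumes a hypothetical helper `partition_cov` and reuses a 3×3 covariance matrix with $q = 1$ purely for illustration.

```python
import numpy as np

def partition_cov(Sigma, q):
    """Split a p x p covariance matrix into the blocks Sigma_11, Sigma_12, Sigma_21, Sigma_22,
    where the first q variables form X^(1) and the remaining p - q form X^(2)."""
    Sigma = np.asarray(Sigma, dtype=float)
    return Sigma[:q, :q], Sigma[:q, q:], Sigma[q:, :q], Sigma[q:, q:]

# Illustrative covariance matrix (p = 3), split after the first variable (q = 1).
Sigma = np.array([[4.0, 1.0, 2.0],
                  [1.0, 9.0, -3.0],
                  [2.0, -3.0, 25.0]])

S11, S12, S21, S22 = partition_cov(Sigma, q=1)
print(S11, S12, S21, S22, sep="\n")
```

Reassembling the blocks with `np.block([[S11, S12], [S21, S22]])` returns the original matrix, and `S12` equals `S21.T` because `Sigma` is symmetric.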

Page 20: Linear Combinations of Random Variables

The linear combination $\mathbf{c}'\mathbf{X} = c_1 X_1 + \cdots + c_p X_p$ has

mean $E(\mathbf{c}'\mathbf{X}) = \mathbf{c}'\boldsymbol{\mu}$

variance $\operatorname{Var}(\mathbf{c}'\mathbf{X}) = \mathbf{c}'\boldsymbol{\Sigma}\mathbf{c}$

where $\boldsymbol{\mu} = E(\mathbf{X})$ and $\boldsymbol{\Sigma} = \operatorname{Cov}(\mathbf{X})$.

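Page 21: Example of Linear Combinations of Random Variables

$$
E(aX_1 + bX_2) = aE(X_1) + bE(X_2) = a\mu_1 + b\mu_2
$$

$$
\begin{aligned}
\operatorname{Var}(aX_1 + bX_2)
&= E\left[(aX_1 + bX_2) - (a\mu_1 + b\mu_2)\right]^2 \\
&= E\left[a(X_1 - \mu_1) + b(X_2 - \mu_2)\right]^2 \\
&= E\left[a^2 (X_1 - \mu_1)^2 + b^2 (X_2 - \mu_2)^2 + 2ab (X_1 - \mu_1)(X_2 - \mu_2)\right] \\
&= a^2 \sigma_{11} + b^2 \sigma_{22} + 2ab\, \sigma_{12}
\end{aligned}
$$

With $\mathbf{c}' = [a, b]$ and $\mathbf{X}' = [X_1, X_2]$:

$$
E(\mathbf{c}'\mathbf{X}) = \mathbf{c}'\boldsymbol{\mu}, \qquad
\operatorname{Var}(\mathbf{c}'\mathbf{X}) = [a \;\; b]
\begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{bmatrix}
\begin{bmatrix} a \\ b \end{bmatrix}
= \mathbf{c}'\boldsymbol{\Sigma}\mathbf{c}
$$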
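A quick numerical check that the quadratic form $\mathbf{c}'\boldsymbol{\Sigma}\mathbf{c}$ matches the expanded expression $a^2\sigma_{11} + b^2\sigma_{22} + 2ab\,\sigma_{12}$. The values of `a`, `b`, `mu`, and `Sigma` are hypothetical.

```python
import numpy as np

# Hypothetical coefficients, mean vector and covariance matrix for p = 2.
a, b = 2.0, -1.0
mu = np.array([1.0, 3.0])
Sigma = np.array([[4.0, 1.0],
                  [1.0, 9.0]])
c = np.array([a, b])

mean_lc = c @ mu                 # E(c'X) = c' mu
var_quadratic = c @ Sigma @ c    # Var(c'X) = c' Sigma c
var_expanded = a**2 * Sigma[0, 0] + b**2 * Sigma[1, 1] + 2 * a * b * Sigma[0, 1]

print(mean_lc, var_quadratic, var_expanded)
```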

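Page 22: Linear Combinations of Random Variables

$$
\mathbf{Z} = \mathbf{C}\mathbf{X}, \qquad
\mathbf{C} = \begin{bmatrix}
c_{11} & c_{12} & \cdots & c_{1p} \\
c_{21} & c_{22} & \cdots & c_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
c_{p1} & c_{p2} & \cdots & c_{pp}
\end{bmatrix}
$$

$$
\boldsymbol{\mu}_{\mathbf{Z}} = E(\mathbf{Z}) = E(\mathbf{C}\mathbf{X}) = \mathbf{C}\boldsymbol{\mu}_{\mathbf{X}}
$$

$$
\boldsymbol{\Sigma}_{\mathbf{Z}} = \operatorname{Cov}(\mathbf{Z}) = \operatorname{Cov}(\mathbf{C}\mathbf{X}) = \mathbf{C}\boldsymbol{\Sigma}_{\mathbf{X}}\mathbf{C}'
$$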
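The sample mean vector and the divisor-$n$ sample covariance matrix obey the same transformation rules as $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$, which gives an exact numerical check of $\mathbf{Z} = \mathbf{C}\mathbf{X}$. The data and the matrix `C` below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 50 observations of a 3-dimensional vector, one per row.
X = rng.normal(size=(50, 3))

# Hypothetical 3x3 coefficient matrix C.
C = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0],
              [1.0, 1.0, 1.0]])

# Each transformed observation is z_j = C x_j, so in row form Z = X C'.
Z = X @ C.T

mean_X = X.mean(axis=0)
cov_X = np.cov(X, rowvar=False, bias=True)   # divisor n

mean_Z = Z.mean(axis=0)
cov_Z = np.cov(Z, rowvar=False, bias=True)

print(np.allclose(mean_Z, C @ mean_X))        # mean of Z equals C times mean of X
print(np.allclose(cov_Z, C @ cov_X @ C.T))    # covariance of Z equals C S C'
```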

Page 23: Sample Mean Vector and Covariance Matrix

$$
\bar{\mathbf{x}}' = [\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_p]
$$

$$
\mathbf{S}_n = \begin{bmatrix}
s_{11} & \cdots & s_{1p} \\
\vdots & \ddots & \vdots \\
s_{1p} & \cdots & s_{pp}
\end{bmatrix}
= \begin{bmatrix}
\dfrac{1}{n} \sum_{j=1}^{n} (x_{j1} - \bar{x}_1)^2 & \cdots & \dfrac{1}{n} \sum_{j=1}^{n} (x_{j1} - \bar{x}_1)(x_{jp} - \bar{x}_p) \\
\vdots & \ddots & \vdots \\
\dfrac{1}{n} \sum_{j=1}^{n} (x_{j1} - \bar{x}_1)(x_{jp} - \bar{x}_p) & \cdots & \dfrac{1}{n} \sum_{j=1}^{n} (x_{jp} - \bar{x}_p)^2
\end{bmatrix}
$$
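In matrix form, $\mathbf{S}_n$ is the product of the centered data matrix with its own transpose, divided by $n$. The sketch below assumes a hypothetical helper `sample_mean_and_cov` and illustrative data.

```python
import numpy as np

def sample_mean_and_cov(X):
    """Return the sample mean vector and the divisor-n sample covariance matrix S_n."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    x_bar = X.mean(axis=0)
    D = X - x_bar          # deviations x_jk - x_bar_k, one row per sample
    S_n = D.T @ D / n      # entry (i, k) is (1/n) * sum_j d_ji * d_jk
    return x_bar, S_n

# Hypothetical data matrix: n = 4 samples, p = 2 variables.
X = np.array([[42.0, 4.0],
              [52.0, 5.0],
              [48.0, 4.0],
              [58.0, 3.0]])

x_bar, S_n = sample_mean_and_cov(X)
print(x_bar)
print(S_n)
```

NumPy's `np.cov(X, rowvar=False, bias=True)` gives the same matrix; without `bias=True` it uses the divisor $n - 1$ instead.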

Page 24: Partitioning Sample Mean Vector

$$
\bar{\mathbf{x}} = \left[ \begin{array}{c} \bar{x}_1 \\ \vdots \\ \bar{x}_q \\ \hline \bar{x}_{q+1} \\ \vdots \\ \bar{x}_p \end{array} \right]
= \left[ \begin{array}{c} \bar{\mathbf{x}}^{(1)} \\ \hline \bar{\mathbf{x}}^{(2)} \end{array} \right]
$$

Page 25: Partitioning Sample Covariance Matrix

$$
\mathbf{S}_n = \left[ \begin{array}{ccc|ccc}
s_{11} & \cdots & s_{1q} & s_{1,q+1} & \cdots & s_{1p} \\
\vdots & & \vdots & \vdots & & \vdots \\
s_{q1} & \cdots & s_{qq} & s_{q,q+1} & \cdots & s_{qp} \\ \hline
s_{q+1,1} & \cdots & s_{q+1,q} & s_{q+1,q+1} & \cdots & s_{q+1,p} \\
\vdots & & \vdots & \vdots & & \vdots \\
s_{p1} & \cdots & s_{pq} & s_{p,q+1} & \cdots & s_{pp}
\end{array} \right]
= \left[ \begin{array}{cc} \mathbf{S}_{11} & \mathbf{S}_{12} \\ \mathbf{S}_{21} & \mathbf{S}_{22} \end{array} \right]
$$

$$
\mathbf{S}_{12} = \mathbf{S}_{21}'
$$

Page 26: Population and Sample

- Population (a random variable or vector), with its distribution: the goal
- Statistic (a random variable or vector), with its distribution: the bridge
- Data: the starting point

Computation leads from the data to the statistic; derivation links the statistic to the population.

Page 27: Random Matrix

$$
\mathbf{X} = \begin{bmatrix}
X_{11} & X_{12} & \cdots & X_{1p} \\
X_{21} & X_{22} & \cdots & X_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
X_{n1} & X_{n2} & \cdots & X_{np}
\end{bmatrix}
= \begin{bmatrix} \mathbf{X}_1' \\ \mathbf{X}_2' \\ \vdots \\ \mathbf{X}_n' \end{bmatrix}
$$

Page 28: Random Sample

- Row vectors $\mathbf{X}_1', \mathbf{X}_2', \ldots, \mathbf{X}_n'$ represent independent observations from a common joint distribution with density function $f(\mathbf{x}) = f(x_1, x_2, \ldots, x_p)$
- Mathematically, the joint density function of $\mathbf{X}_1', \mathbf{X}_2', \ldots, \mathbf{X}_n'$ is

$$
f(\mathbf{x}_1)\, f(\mathbf{x}_2) \cdots f(\mathbf{x}_n), \qquad f(\mathbf{x}_j) = f(x_{j1}, x_{j2}, \ldots, x_{jp})
$$

Page 29: Random Sample

- Measurements of a single trial, such as $\mathbf{X}_j' = [X_{j1}, X_{j2}, \ldots, X_{jp}]$, will usually be correlated
- The measurements from different trials must be independent
- The independence of measurements from trial to trial may not hold when the variables are likely to drift over time

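Page 30: Result 1

If $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ are a random sample from a joint distribution that has mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$, then

$$
E(\bar{\mathbf{X}}) = \boldsymbol{\mu} \quad (\bar{\mathbf{X}} \text{ as an unbiased point estimate of } \boldsymbol{\mu}), \qquad
\operatorname{Cov}(\bar{\mathbf{X}}) = \frac{1}{n} \boldsymbol{\Sigma}
$$

$$
E(\mathbf{S}_n) = \frac{n-1}{n} \boldsymbol{\Sigma}, \qquad
E\left( \frac{n}{n-1} \mathbf{S}_n \right) = \boldsymbol{\Sigma}
\quad \left( \mathbf{S} = \frac{n}{n-1} \mathbf{S}_n \text{ as an unbiased point estimate of } \boldsymbol{\Sigma} \right)
$$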
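Result 1 can be verified exactly on a small example by enumerating every possible sample instead of simulating. The sketch below assumes a hypothetical population of four equally likely points in the plane and samples of size $n = 3$ drawn independently (with replacement), so all $4^3$ ordered samples are equally likely.

```python
import itertools
import numpy as np

# Hypothetical population: 4 equally likely points in R^2.
points = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 1.0], [4.0, 3.0]])
mu = points.mean(axis=0)
Sigma = (points - mu).T @ (points - mu) / len(points)

n = 3
x_bars, S_ns = [], []

# Enumerate every ordered sample of size n (independent draws, all equally likely).
for idx in itertools.product(range(len(points)), repeat=n):
    sample = points[list(idx)]
    x_bar = sample.mean(axis=0)
    D = sample - x_bar
    x_bars.append(x_bar)
    S_ns.append(D.T @ D / n)      # S_n with divisor n

x_bars = np.array(x_bars)

E_x_bar = x_bars.mean(axis=0)                                  # E(X_bar)
Cov_x_bar = (x_bars - mu).T @ (x_bars - mu) / len(x_bars)      # Cov(X_bar)
E_S_n = np.mean(S_ns, axis=0)                                  # E(S_n)

print(np.allclose(E_x_bar, mu))
print(np.allclose(Cov_x_bar, Sigma / n))
print(np.allclose(E_S_n, (n - 1) / n * Sigma))
```

Averaging over all samples gives the expectations exactly (up to floating-point rounding), so the three identities hold without any Monte Carlo error.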

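Page 31: Proof of Result 1

$$
E(\bar{\mathbf{X}}) = E\left( \frac{1}{n}\mathbf{X}_1 + \frac{1}{n}\mathbf{X}_2 + \cdots + \frac{1}{n}\mathbf{X}_n \right)
= \frac{1}{n}E(\mathbf{X}_1) + \frac{1}{n}E(\mathbf{X}_2) + \cdots + \frac{1}{n}E(\mathbf{X}_n) = \boldsymbol{\mu}
$$

$$
(\bar{\mathbf{X}} - \boldsymbol{\mu})(\bar{\mathbf{X}} - \boldsymbol{\mu})'
= \left( \frac{1}{n} \sum_{j=1}^{n} (\mathbf{X}_j - \boldsymbol{\mu}) \right) \left( \frac{1}{n} \sum_{\ell=1}^{n} (\mathbf{X}_\ell - \boldsymbol{\mu}) \right)'
= \frac{1}{n^2} \sum_{j=1}^{n} \sum_{\ell=1}^{n} (\mathbf{X}_j - \boldsymbol{\mu})(\mathbf{X}_\ell - \boldsymbol{\mu})'
$$

$$
\operatorname{Cov}(\bar{\mathbf{X}}) = E(\bar{\mathbf{X}} - \boldsymbol{\mu})(\bar{\mathbf{X}} - \boldsymbol{\mu})'
= \frac{1}{n^2} \sum_{j=1}^{n} \sum_{\ell=1}^{n} E(\mathbf{X}_j - \boldsymbol{\mu})(\mathbf{X}_\ell - \boldsymbol{\mu})'
$$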

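Page 32: Proof of Result 1

$$
E(\mathbf{X}_j - \boldsymbol{\mu})(\mathbf{X}_\ell - \boldsymbol{\mu})' = \mathbf{0} \quad \text{for } j \ne \ell \text{ because of independence.}
$$

$$
\operatorname{Cov}(\bar{\mathbf{X}}) = \frac{1}{n^2} \sum_{j=1}^{n} E(\mathbf{X}_j - \boldsymbol{\mu})(\mathbf{X}_j - \boldsymbol{\mu})'
= \frac{1}{n^2} (n \boldsymbol{\Sigma}) = \frac{1}{n} \boldsymbol{\Sigma}
$$

$$
\mathbf{S}_n = \frac{1}{n} \sum_{j=1}^{n} (\mathbf{X}_j - \bar{\mathbf{X}})(\mathbf{X}_j - \bar{\mathbf{X}})'
$$

$$
\sum_{j=1}^{n} (\mathbf{X}_j - \bar{\mathbf{X}})(\mathbf{X}_j - \bar{\mathbf{X}})'
= \sum_{j=1}^{n} (\mathbf{X}_j - \bar{\mathbf{X}})\mathbf{X}_j' + \left( \sum_{j=1}^{n} (\mathbf{X}_j - \bar{\mathbf{X}}) \right)(-\bar{\mathbf{X}})'
= \sum_{j=1}^{n} \mathbf{X}_j \mathbf{X}_j' - n \bar{\mathbf{X}} \bar{\mathbf{X}}'
$$

$$
E(\mathbf{X}_j \mathbf{X}_j') = E(\mathbf{X}_j - \boldsymbol{\mu} + \boldsymbol{\mu})(\mathbf{X}_j - \boldsymbol{\mu} + \boldsymbol{\mu})' = \boldsymbol{\Sigma} + \boldsymbol{\mu}\boldsymbol{\mu}'
$$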

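Page 33: Proof of Result 1

$$
E(\bar{\mathbf{X}} \bar{\mathbf{X}}') = E(\bar{\mathbf{X}} - \boldsymbol{\mu} + \boldsymbol{\mu})(\bar{\mathbf{X}} - \boldsymbol{\mu} + \boldsymbol{\mu})' = \frac{1}{n} \boldsymbol{\Sigma} + \boldsymbol{\mu}\boldsymbol{\mu}'
$$

$$
E(\mathbf{S}_n) = \frac{1}{n} \left( \sum_{j=1}^{n} E(\mathbf{X}_j \mathbf{X}_j') - n E(\bar{\mathbf{X}} \bar{\mathbf{X}}') \right)
= \boldsymbol{\Sigma} + \boldsymbol{\mu}\boldsymbol{\mu}' - \left( \frac{1}{n} \boldsymbol{\Sigma} + \boldsymbol{\mu}\boldsymbol{\mu}' \right)
= \frac{n-1}{n} \boldsymbol{\Sigma}
$$

$$
\mathbf{S} = \frac{n}{n-1} \mathbf{S}_n = \frac{1}{n-1} \sum_{j=1}^{n} (\mathbf{X}_j - \bar{\mathbf{X}})(\mathbf{X}_j - \bar{\mathbf{X}})'
$$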

Page 34: Some Other Estimators

The expectation of the $(i, k)$th entry of $\mathbf{S}$:

$$
E(s_{ik}) = E\left( \frac{1}{n-1} \sum_{j=1}^{n} (X_{ji} - \bar{X}_i)(X_{jk} - \bar{X}_k) \right) = \sigma_{ik}
$$

$$
E(\sqrt{s_{ii}}) \ne \sqrt{\sigma_{ii}}, \qquad E(r_{ik}) \ne \rho_{ik}
$$

Biases $E(\sqrt{s_{ii}}) - \sqrt{\sigma_{ii}}$ and $E(r_{ik}) - \rho_{ik}$ can usually be ignored if the sample size $n$ is moderately large.
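The bias of $\sqrt{s_{ii}}$ and its decrease with $n$ can be seen exactly in a toy case. The sketch below assumes a hypothetical population: a fair 0/1 variable with variance $\sigma_{ii} = 0.25$. It enumerates every sample of size $n$ rather than simulating; the function name is illustrative.

```python
import itertools
import math

values = [0.0, 1.0]   # assumed population: fair 0/1 variable, each value with probability 1/2
sigma2 = 0.25         # its population variance

def expected_s_and_sqrt_s(n):
    """Exact E(s_ii) and E(sqrt(s_ii)) over all 2^n equally likely samples (divisor n - 1)."""
    s_values = []
    for sample in itertools.product(values, repeat=n):
        x_bar = sum(sample) / n
        s_values.append(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
    e_s = sum(s_values) / len(s_values)
    e_sqrt_s = sum(math.sqrt(s) for s in s_values) / len(s_values)
    return e_s, e_sqrt_s

for n in (2, 10):
    e_s, e_sqrt_s = expected_s_and_sqrt_s(n)
    bias = e_sqrt_s - math.sqrt(sigma2)
    print(n, e_s, e_sqrt_s, bias)
```

In this example $E(s_{ii})$ equals $\sigma_{ii}$ for both sample sizes, while $E(\sqrt{s_{ii}})$ falls below $\sqrt{\sigma_{ii}}$, with a smaller gap at $n = 10$ than at $n = 2$.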