Matrix Description of Data: staff.ustc.edu.cn/~zhangwm/matrix_analysis/courseware/...5 sample covariance...

Post on 11-Apr-2018


1

Matrix Description of Data

2

Array of Data

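\[
\mathbf{X} =
\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1k} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2k} & \cdots & x_{2p} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
x_{j1} & x_{j2} & \cdots & x_{jk} & \cdots & x_{jp} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nk} & \cdots & x_{np}
\end{bmatrix}
\]

Each row of the array is one sample.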

3

Descriptive Statistics

Summary numbers to assess the information contained in data

Basic descriptive statistics: sample mean, sample variance, sample standard deviation, sample covariance, sample correlation coefficient

4

Sample Mean and Sample Variance

\[
\bar{x}_k = \frac{1}{n} \sum_{j=1}^{n} x_{jk}
\]

\[
s_k^2 = s_{kk} = \frac{1}{n} \sum_{j=1}^{n} \left( x_{jk} - \bar{x}_k \right)^2,
\qquad k = 1, 2, \ldots, p
\]

5

Sample Covariance and Sample Correlation Coefficient

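\[
s_{ik} = \frac{1}{n} \sum_{j=1}^{n} \left( x_{ji} - \bar{x}_i \right)\left( x_{jk} - \bar{x}_k \right),
\qquad i = 1, 2, \ldots, p; \; k = 1, 2, \ldots, p
\]

\[
r_{ik} = \frac{s_{ik}}{\sqrt{s_{ii}}\,\sqrt{s_{kk}}}
= \frac{\sum_{j=1}^{n} \left( x_{ji} - \bar{x}_i \right)\left( x_{jk} - \bar{x}_k \right)}
       {\sqrt{\sum_{j=1}^{n} \left( x_{ji} - \bar{x}_i \right)^2}\,\sqrt{\sum_{j=1}^{n} \left( x_{jk} - \bar{x}_k \right)^2}}
\]

\[
s_{ik} = s_{ki}, \qquad r_{ik} = r_{ki}
\]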
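The formulas on slides 4 and 5 can be checked numerically. The following is a minimal NumPy sketch, not part of the original slides; the data array `X` is a made-up illustration with n = 4 samples and p = 2 variables, and the divisor is n, as in the slide formulas.

```python
import numpy as np

# Hypothetical data array: n = 4 samples (rows), p = 2 variables (columns).
X = np.array([[42.0, 4.0],
              [52.0, 5.0],
              [48.0, 4.0],
              [58.0, 3.0]])
n, p = X.shape

x_bar = X.mean(axis=0)       # sample means, one per variable
D = X - x_bar                # deviations x_jk - x_bar_k
S_n = D.T @ D / n            # s_ik with divisor n; the diagonal holds s_kk
s = np.sqrt(np.diag(S_n))    # sample standard deviations sqrt(s_kk)
R = S_n / np.outer(s, s)     # r_ik = s_ik / (sqrt(s_ii) * sqrt(s_kk))

print(x_bar)
print(S_n)
print(R)
```

Note that `np.cov` divides by n - 1 unless `bias=True` is passed, so `np.cov(X, rowvar=False, bias=True)` is the call that matches the divisor-n definition used here.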

6

Standardized Values (or Standardized Scores)

\[
\frac{x_{jk} - \bar{x}_k}{\sqrt{s_{kk}}}
\]

Centered at zero
Unit standard deviation
The sample correlation coefficient can be regarded as the sample covariance of two standardized variables
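A short NumPy sketch of standardization, under the assumption that the standard deviation uses divisor n as on the earlier slides. The data array is hypothetical. It illustrates that the standardized columns have mean zero and unit standard deviation, and that their sample covariance matrix coincides with the sample correlation matrix of the original data.

```python
import numpy as np

# Hypothetical data array: rows are samples, columns are variables.
X = np.array([[42.0, 4.0],
              [52.0, 5.0],
              [48.0, 4.0],
              [58.0, 3.0]])
n = X.shape[0]

x_bar = X.mean(axis=0)
s = X.std(axis=0)                 # ddof=0, i.e. sqrt(s_kk) with divisor n
Z = (X - x_bar) / s               # standardized values

cov_Z = Z.T @ Z / n               # sample covariance of Z (its column means are zero)
R = np.corrcoef(X, rowvar=False)  # sample correlation matrix of the original data
```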

7

Properties of Sample Correlation Coefficient

The value lies between -1 and 1.
The magnitude measures the strength of the linear association.
The sign indicates the direction of the association.
The value remains unchanged if all x_ji and x_jk are changed to y_ji = a x_ji + b and y_jk = c x_jk + d, respectively, provided that the constants a and c have the same sign.
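The invariance property can be illustrated with a few lines of NumPy. This is a sketch with made-up data and made-up constants a, b, c, d; the helper `corr` is a hypothetical name introduced only for this example. When a and c have opposite signs, the correlation changes sign, which is why the slide requires equal signs.

```python
import numpy as np

def corr(u, v):
    """Sample correlation coefficient of two equal-length 1-D arrays."""
    du, dv = u - u.mean(), v - v.mean()
    return float((du @ dv) / np.sqrt((du @ du) * (dv @ dv)))

# Hypothetical measurements of two variables on the same four samples.
x_i = np.array([42.0, 52.0, 48.0, 58.0])
x_k = np.array([4.0, 5.0, 4.0, 3.0])

r = corr(x_i, x_k)
r_both_pos = corr(3.0 * x_i + 1.0, 0.5 * x_k - 2.0)    # a > 0, c > 0
r_both_neg = corr(-3.0 * x_i + 1.0, -0.5 * x_k - 2.0)  # a < 0, c < 0
r_opposite = corr(-3.0 * x_i + 1.0, 0.5 * x_k - 2.0)   # a < 0, c > 0
```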

8

Arrays of Basic Descriptive Statistics

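\[
\bar{\mathbf{x}} =
\begin{bmatrix} \bar{x}_1 \\ \bar{x}_2 \\ \vdots \\ \bar{x}_p \end{bmatrix},
\qquad
\mathbf{S}_n =
\begin{bmatrix}
s_{11} & s_{12} & \cdots & s_{1p} \\
s_{21} & s_{22} & \cdots & s_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
s_{p1} & s_{p2} & \cdots & s_{pp}
\end{bmatrix}
\]

\[
\mathbf{R} =
\begin{bmatrix}
1      & r_{12} & \cdots & r_{1p} \\
r_{21} & 1      & \cdots & r_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
r_{p1} & r_{p2} & \cdots & 1
\end{bmatrix}
\]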

9

Random Vectors and Random Matrices

Random vector: a vector whose elements are random variables

Random matrix: a matrix whose elements are random variables

10

Expected Value of a Random Matrix

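\[
E(\mathbf{X}) =
\begin{bmatrix}
E(X_{11}) & E(X_{12}) & \cdots & E(X_{1p}) \\
E(X_{21}) & E(X_{22}) & \cdots & E(X_{2p}) \\
\vdots    & \vdots    & \ddots & \vdots    \\
E(X_{n1}) & E(X_{n2}) & \cdots & E(X_{np})
\end{bmatrix}
\]

\[
E(X_{ij}) =
\begin{cases}
\displaystyle \int_{-\infty}^{\infty} x_{ij}\, f_{ij}(x_{ij})\, dx_{ij} & \text{if } X_{ij} \text{ is continuous with density } f_{ij} \\[2ex]
\displaystyle \sum_{\text{all } x_{ij}} x_{ij}\, p_{ij}(x_{ij}) & \text{if } X_{ij} \text{ is discrete with probability function } p_{ij}
\end{cases}
\]

\[
E(\mathbf{A}\mathbf{X}\mathbf{B}) = \mathbf{A}\, E(\mathbf{X})\, \mathbf{B}
\]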

11

Population Mean Vectors

Random vector: \( \mathbf{X}' = [X_1, X_2, \ldots, X_p] \)

Joint probability density function: \( f(\mathbf{x}) = f(x_1, x_2, \ldots, x_p) \)

Marginal probability distribution: \( f_i(x_i) \)

\[
\mu_i = E(X_i), \qquad \sigma_i^2 = E(X_i - \mu_i)^2
\]

\[
E(\mathbf{X}) = \boldsymbol{\mu} = [\mu_1, \mu_2, \ldots, \mu_p]'
\]

12

Covariance

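\[
\sigma_{ik} = \operatorname{Cov}(X_i, X_k) = E(X_i - \mu_i)(X_k - \mu_k)
\]

\[
\sigma_{ik} =
\begin{cases}
\displaystyle \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x_i - \mu_i)(x_k - \mu_k)\, f_{ik}(x_i, x_k)\, dx_i\, dx_k & \text{(continuous case)} \\[2ex]
\displaystyle \sum_{\text{all } x_i} \sum_{\text{all } x_k} (x_i - \mu_i)(x_k - \mu_k)\, p_{ik}(x_i, x_k) & \text{(discrete case)}
\end{cases}
\]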

13

Statistically Independent

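\[
P[X_i \le x_i \text{ and } X_k \le x_k] = P[X_i \le x_i]\, P[X_k \le x_k]
\]

\[
f_{ik}(x_i, x_k) = f_i(x_i)\, f_k(x_k)
\]

\[
f_{12 \cdots p}(x_1, x_2, \ldots, x_p) = f_1(x_1)\, f_2(x_2) \cdots f_p(x_p)
\]

\( \operatorname{Cov}(X_i, X_k) = 0 \) if \( X_i, X_k \) are independent (the converse is not true in general)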
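A standard counterexample shows why the converse fails. The sketch below is an illustration added here, not taken from the slides: X is assumed uniform on {-1, 0, 1} and Y = X^2, so Y is completely determined by X, yet the covariance is exactly zero. Exact rational arithmetic from the standard library avoids rounding questions.

```python
from fractions import Fraction

# Hypothetical joint distribution: X uniform on {-1, 0, 1}, Y = X**2.
pmf = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

mu_x = sum(p * x for (x, y), p in pmf.items())
mu_y = sum(p * y for (x, y), p in pmf.items())
cov_xy = sum(p * (x - mu_x) * (y - mu_y) for (x, y), p in pmf.items())

# Marginal probability functions of X and Y.
p_x, p_y = {}, {}
for (x, y), p in pmf.items():
    p_x[x] = p_x.get(x, 0) + p
    p_y[y] = p_y.get(y, 0) + p

# Independence would require p(x, y) = p_x(x) * p_y(y) for every pair (x, y).
independent = all(pmf.get((x, y), 0) == p_x[x] * p_y[y]
                  for x in p_x for y in p_y)

print(cov_xy, independent)
```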

14

Population Variance-Covariance Matrices

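\[
\boldsymbol{\Sigma} = E(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})'
= E\left(
\begin{bmatrix} X_1 - \mu_1 \\ X_2 - \mu_2 \\ \vdots \\ X_p - \mu_p \end{bmatrix}
\begin{bmatrix} X_1 - \mu_1, & X_2 - \mu_2, & \ldots, & X_p - \mu_p \end{bmatrix}
\right)
\]

\[
\boldsymbol{\Sigma} = \operatorname{Cov}(\mathbf{X}) =
\begin{bmatrix}
\sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\
\sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\
\vdots      & \vdots      & \ddots & \vdots      \\
\sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp}
\end{bmatrix}
\]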
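For a discrete random vector, the population mean vector and covariance matrix follow directly from the joint probability function. The sketch below assumes a small made-up joint probability table `P` for p = 2 variables (the support values and probabilities are illustrative, not from the slides) and evaluates E(X) and E(X - mu)(X - mu)' as weighted sums over all outcomes.

```python
import numpy as np

# Hypothetical joint probability function p(x1, x2):
# rows index the values of X1, columns index the values of X2.
x1_vals = np.array([-1.0, 0.0, 1.0])
x2_vals = np.array([0.0, 1.0])
P = np.array([[0.24, 0.06],
              [0.16, 0.14],
              [0.40, 0.00]])

# List every outcome (x1, x2) as a row, in the same order as P.ravel().
G1, G2 = np.meshgrid(x1_vals, x2_vals, indexing="ij")
outcomes = np.column_stack([G1.ravel(), G2.ravel()])
w = P.ravel()                     # probability of each outcome

mu = w @ outcomes                 # E(X), the population mean vector
D = outcomes - mu                 # deviations from the mean
Sigma = (D * w[:, None]).T @ D    # E[(X - mu)(X - mu)']

print(mu)
print(Sigma)
```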

15

Population Correlation Coefficients

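\[
\rho_{ik} = \frac{\sigma_{ik}}{\sqrt{\sigma_{ii}}\,\sqrt{\sigma_{kk}}}, \qquad \rho_{ii} = 1
\]

\[
\boldsymbol{\rho} =
\begin{bmatrix}
\rho_{11} & \rho_{12} & \cdots & \rho_{1p} \\
\rho_{21} & \rho_{22} & \cdots & \rho_{2p} \\
\vdots    & \vdots    & \ddots & \vdots    \\
\rho_{p1} & \rho_{p2} & \cdots & \rho_{pp}
\end{bmatrix}
\]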

16

Standard Deviation Matrix

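\[
\mathbf{V}^{1/2} =
\begin{bmatrix}
\sqrt{\sigma_{11}} & 0 & \cdots & 0 \\
0 & \sqrt{\sigma_{22}} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sqrt{\sigma_{pp}}
\end{bmatrix}
\]

\[
\mathbf{V}^{1/2} \boldsymbol{\rho}\, \mathbf{V}^{1/2} = \boldsymbol{\Sigma},
\qquad
\boldsymbol{\rho} = \left(\mathbf{V}^{1/2}\right)^{-1} \boldsymbol{\Sigma} \left(\mathbf{V}^{1/2}\right)^{-1}
\]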

17

Correlation Matrix from Covariance Matrix

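\[
\boldsymbol{\Sigma} =
\begin{bmatrix}
\sigma_{11} & \sigma_{12} & \sigma_{13} \\
\sigma_{21} & \sigma_{22} & \sigma_{23} \\
\sigma_{31} & \sigma_{32} & \sigma_{33}
\end{bmatrix}
=
\begin{bmatrix}
4 & 1 & 2 \\
1 & 9 & -3 \\
2 & -3 & 25
\end{bmatrix}
\]

\[
\mathbf{V}^{1/2} =
\begin{bmatrix}
\sqrt{\sigma_{11}} & 0 & 0 \\
0 & \sqrt{\sigma_{22}} & 0 \\
0 & 0 & \sqrt{\sigma_{33}}
\end{bmatrix}
=
\begin{bmatrix}
2 & 0 & 0 \\
0 & 3 & 0 \\
0 & 0 & 5
\end{bmatrix},
\qquad
\left(\mathbf{V}^{1/2}\right)^{-1} =
\begin{bmatrix}
1/2 & 0 & 0 \\
0 & 1/3 & 0 \\
0 & 0 & 1/5
\end{bmatrix}
\]

\[
\boldsymbol{\rho} = \left(\mathbf{V}^{1/2}\right)^{-1} \boldsymbol{\Sigma} \left(\mathbf{V}^{1/2}\right)^{-1}
=
\begin{bmatrix}
1 & 1/6 & 1/5 \\
1/6 & 1 & -1/5 \\
1/5 & -1/5 & 1
\end{bmatrix}
\]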
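The computation on this slide can be reproduced in NumPy. This is a minimal sketch; the entries of `Sigma` are as read from the slide's 3 x 3 example, and the variable names are chosen here for illustration.

```python
import numpy as np

# Covariance matrix as read from the slide's example.
Sigma = np.array([[4.0, 1.0, 2.0],
                  [1.0, 9.0, -3.0],
                  [2.0, -3.0, 25.0]])

V_half = np.diag(np.sqrt(np.diag(Sigma)))  # standard deviation matrix V^{1/2}
V_half_inv = np.linalg.inv(V_half)         # (V^{1/2})^{-1}

rho = V_half_inv @ Sigma @ V_half_inv      # correlation matrix
Sigma_back = V_half @ rho @ V_half         # recovers Sigma from rho

print(rho)
```

Because V^{1/2} is diagonal, the same result is obtained by dividing entry (i, k) of Sigma by the product of the i-th and k-th standard deviations, for instance with `Sigma / np.outer(sd, sd)` where `sd = np.sqrt(np.diag(Sigma))`.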

18

Partitioning Covariance Matrix

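\[
\mathbf{X} =
\begin{bmatrix} X_1 \\ \vdots \\ X_q \\ X_{q+1} \\ \vdots \\ X_p \end{bmatrix}
=
\begin{bmatrix} \mathbf{X}^{(1)} \\ \mathbf{X}^{(2)} \end{bmatrix},
\qquad
\boldsymbol{\mu} =
\begin{bmatrix} \mu_1 \\ \vdots \\ \mu_q \\ \mu_{q+1} \\ \vdots \\ \mu_p \end{bmatrix}
=
\begin{bmatrix} \boldsymbol{\mu}^{(1)} \\ \boldsymbol{\mu}^{(2)} \end{bmatrix}
\]

\[
(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})' =
\begin{bmatrix}
(\mathbf{X}^{(1)} - \boldsymbol{\mu}^{(1)})(\mathbf{X}^{(1)} - \boldsymbol{\mu}^{(1)})' & (\mathbf{X}^{(1)} - \boldsymbol{\mu}^{(1)})(\mathbf{X}^{(2)} - \boldsymbol{\mu}^{(2)})' \\
(\mathbf{X}^{(2)} - \boldsymbol{\mu}^{(2)})(\mathbf{X}^{(1)} - \boldsymbol{\mu}^{(1)})' & (\mathbf{X}^{(2)} - \boldsymbol{\mu}^{(2)})(\mathbf{X}^{(2)} - \boldsymbol{\mu}^{(2)})'
\end{bmatrix}
\]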

19

Partitioning Covariance Matrix

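\[
\boldsymbol{\Sigma} = E(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})' =
\begin{bmatrix}
\boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} \\
\boldsymbol{\Sigma}_{21} & \boldsymbol{\Sigma}_{22}
\end{bmatrix}
=
\left[
\begin{array}{ccc|ccc}
\sigma_{11} & \cdots & \sigma_{1q} & \sigma_{1,q+1} & \cdots & \sigma_{1p} \\
\vdots & & \vdots & \vdots & & \vdots \\
\sigma_{q1} & \cdots & \sigma_{qq} & \sigma_{q,q+1} & \cdots & \sigma_{qp} \\
\hline
\sigma_{q+1,1} & \cdots & \sigma_{q+1,q} & \sigma_{q+1,q+1} & \cdots & \sigma_{q+1,p} \\
\vdots & & \vdots & \vdots & & \vdots \\
\sigma_{p1} & \cdots & \sigma_{pq} & \sigma_{p,q+1} & \cdots & \sigma_{pp}
\end{array}
\right]
\]

\[
\boldsymbol{\Sigma}_{12} = \boldsymbol{\Sigma}_{21}'
\]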

20

Linear Combinations of Random Variables

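Linear combination \( \mathbf{c}'\mathbf{X} = c_1 X_1 + \cdots + c_p X_p \)

has mean \( E(\mathbf{c}'\mathbf{X}) = \mathbf{c}'\boldsymbol{\mu} \)

and variance \( \operatorname{Var}(\mathbf{c}'\mathbf{X}) = \mathbf{c}'\boldsymbol{\Sigma}\mathbf{c} \),

where \( \boldsymbol{\mu} = E(\mathbf{X}) \) and \( \boldsymbol{\Sigma} = \operatorname{Cov}(\mathbf{X}) \)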

21

Example of Linear Combinations of Random Variables

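\[
E(aX_1 + bX_2) = aE(X_1) + bE(X_2) = a\mu_1 + b\mu_2
\]

\[
\begin{aligned}
\operatorname{Var}(aX_1 + bX_2) &= E[(aX_1 + bX_2) - (a\mu_1 + b\mu_2)]^2 \\
&= E[a(X_1 - \mu_1) + b(X_2 - \mu_2)]^2 \\
&= a^2\sigma_{11} + b^2\sigma_{22} + 2ab\,\sigma_{12}
\end{aligned}
\]

With \( \mathbf{c}' = [a, b] \) and \( \mathbf{X}' = [X_1, X_2] \):

\[
E(\mathbf{c}'\mathbf{X}) = \mathbf{c}'\boldsymbol{\mu},
\qquad
\operatorname{Var}(\mathbf{c}'\mathbf{X}) =
\begin{bmatrix} a & b \end{bmatrix}
\begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{bmatrix}
\begin{bmatrix} a \\ b \end{bmatrix}
= \mathbf{c}'\boldsymbol{\Sigma}\mathbf{c}
\]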

22

Linear Combinations of Random Variables

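\[
\mathbf{Z} = \mathbf{C}\mathbf{X},
\qquad
\mathbf{C} =
\begin{bmatrix}
c_{11} & c_{12} & \cdots & c_{1p} \\
c_{21} & c_{22} & \cdots & c_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
c_{p1} & c_{p2} & \cdots & c_{pp}
\end{bmatrix}
\]

\[
\boldsymbol{\mu}_{\mathbf{Z}} = E(\mathbf{Z}) = E(\mathbf{C}\mathbf{X}) = \mathbf{C}\boldsymbol{\mu}_{\mathbf{X}}
\]

\[
\boldsymbol{\Sigma}_{\mathbf{Z}} = \operatorname{Cov}(\mathbf{C}\mathbf{X}) = \mathbf{C}\boldsymbol{\Sigma}_{\mathbf{X}}\mathbf{C}'
\]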
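The rules for linear combinations have exact sample analogues, which makes them easy to check numerically: if each sample x_j is mapped to z_j = C x_j, the sample mean of the z_j is C times the sample mean of the x_j, and their sample covariance matrix is C S_n C'. The NumPy sketch below uses a made-up data array and a made-up coefficient matrix `C`; a single row of `C` plays the role of the vector c in the scalar case.

```python
import numpy as np

# Hypothetical data array: n = 4 samples (rows), p = 3 variables (columns).
X = np.array([[42.0, 4.0, 1.0],
              [52.0, 5.0, 0.0],
              [48.0, 4.0, 2.0],
              [58.0, 3.0, 1.0]])

# Hypothetical coefficient matrix; each row defines one linear combination.
C = np.array([[1.0, -1.0, 0.0],
              [2.0, 0.0, 1.0]])

x_bar = X.mean(axis=0)
S_n = np.cov(X, rowvar=False, bias=True)   # divisor n

Z = X @ C.T                                # row j of Z is (C x_j)'
z_bar = Z.mean(axis=0)
S_z = np.cov(Z, rowvar=False, bias=True)

c = C[0]                                   # one linear combination c'x
var_c = np.var(X @ c)                      # sample variance (divisor n) of c'x_j
```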

23

Sample Mean Vector and Covariance Matrix

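\[
\bar{\mathbf{x}}' = [\bar{x}_1, \ldots, \bar{x}_p],
\qquad
\bar{\mathbf{x}} =
\begin{bmatrix}
\frac{1}{n}\sum_{j=1}^{n} x_{j1} \\
\vdots \\
\frac{1}{n}\sum_{j=1}^{n} x_{jp}
\end{bmatrix}
\]

\[
\mathbf{S}_n =
\begin{bmatrix}
s_{11} & \cdots & s_{1p} \\
\vdots & \ddots & \vdots \\
s_{1p} & \cdots & s_{pp}
\end{bmatrix}
=
\begin{bmatrix}
\frac{1}{n}\sum_{j=1}^{n} (x_{j1} - \bar{x}_1)^2 & \cdots & \frac{1}{n}\sum_{j=1}^{n} (x_{j1} - \bar{x}_1)(x_{jp} - \bar{x}_p) \\
\vdots & \ddots & \vdots \\
\frac{1}{n}\sum_{j=1}^{n} (x_{j1} - \bar{x}_1)(x_{jp} - \bar{x}_p) & \cdots & \frac{1}{n}\sum_{j=1}^{n} (x_{jp} - \bar{x}_p)^2
\end{bmatrix}
\]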

24

Partitioning Sample Mean Vector

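\[
\bar{\mathbf{x}} =
\begin{bmatrix} \bar{x}_1 \\ \vdots \\ \bar{x}_q \\ \bar{x}_{q+1} \\ \vdots \\ \bar{x}_p \end{bmatrix}
=
\begin{bmatrix} \bar{\mathbf{x}}^{(1)} \\ \bar{\mathbf{x}}^{(2)} \end{bmatrix}
\]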

25

Partitioning Sample Covariance Matrix

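\[
\mathbf{S}_n =
\begin{bmatrix}
\mathbf{S}_{11} & \mathbf{S}_{12} \\
\mathbf{S}_{21} & \mathbf{S}_{22}
\end{bmatrix}
=
\left[
\begin{array}{ccc|ccc}
s_{11} & \cdots & s_{1q} & s_{1,q+1} & \cdots & s_{1p} \\
\vdots & & \vdots & \vdots & & \vdots \\
s_{q1} & \cdots & s_{qq} & s_{q,q+1} & \cdots & s_{qp} \\
\hline
s_{q+1,1} & \cdots & s_{q+1,q} & s_{q+1,q+1} & \cdots & s_{q+1,p} \\
\vdots & & \vdots & \vdots & & \vdots \\
s_{p1} & \cdots & s_{pq} & s_{p,q+1} & \cdots & s_{pp}
\end{array}
\right]
\]

\[
\mathbf{S}_{12} = \mathbf{S}_{21}'
\]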
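In code, partitioning amounts to slicing. The sketch below assumes a made-up data array with p = 3 variables, split after the first q = 2 of them; the block names `S11`, `S12`, `S21`, `S22` mirror the slide notation.

```python
import numpy as np

# Hypothetical data array: n = 5 samples (rows), p = 3 variables (columns).
X = np.array([[2.0, 1.0, 7.0],
              [4.0, 3.0, 5.0],
              [6.0, 2.0, 6.0],
              [8.0, 5.0, 2.0],
              [5.0, 4.0, 5.0]])
q = 2  # the first q variables form group (1), the remaining p - q form group (2)

x_bar = X.mean(axis=0)
S_n = np.cov(X, rowvar=False, bias=True)   # divisor n

x_bar_1, x_bar_2 = x_bar[:q], x_bar[q:]    # partitioned sample mean vector
S11, S12 = S_n[:q, :q], S_n[:q, q:]        # q x q and q x (p - q) blocks
S21, S22 = S_n[q:, :q], S_n[q:, q:]        # (p - q) x q and (p - q) x (p - q) blocks

S_rebuilt = np.block([[S11, S12], [S21, S22]])
```

The diagonal block `S11` is the sample covariance matrix of the first group of variables on its own, and `S12` holds the covariances between the two groups.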

26

Population and Sample

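Population (random variable or vector), with its distribution: the goal

Statistic (random variable or vector), with its distribution: the bridge

Data: the starting point

Computation leads from the data to the statistic; derivation links the statistic to the population.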

27

Random Matrix

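\[
\mathbf{X} =
\begin{bmatrix}
X_{11} & X_{12} & \cdots & X_{1p} \\
X_{21} & X_{22} & \cdots & X_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
X_{n1} & X_{n2} & \cdots & X_{np}
\end{bmatrix}
=
\begin{bmatrix}
\mathbf{X}_1' \\ \mathbf{X}_2' \\ \vdots \\ \mathbf{X}_n'
\end{bmatrix}
\]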

28

Random Sample

Row vectors X1', X2', …, Xn' represent independent observations from a common joint distribution with density function f(x) = f(x1, x2, …, xp).

Mathematically, the joint density function of X1', X2', …, Xn' is

\[
f(\mathbf{x}_1)\, f(\mathbf{x}_2) \cdots f(\mathbf{x}_n),
\qquad
f(\mathbf{x}_j) = f(x_{j1}, x_{j2}, \ldots, x_{jp})
\]

29

Random Sample

Measurements of a single trial, such as Xj' = [Xj1, Xj2, …, Xjp], will usually be correlated.
The measurements from different trials must be independent.
The independence of measurements from trial to trial may not hold when the variables are likely to drift over time.

30

Result 1

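If \( \mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n \) are a random sample from a joint distribution that has mean vector \( \boldsymbol{\mu} \) and covariance matrix \( \boldsymbol{\Sigma} \), then

\[
E(\bar{\mathbf{X}}) = \boldsymbol{\mu}
\quad (\bar{\mathbf{X}} \text{ as an unbiased point estimate of } \boldsymbol{\mu}),
\qquad
\operatorname{Cov}(\bar{\mathbf{X}}) = \frac{1}{n}\boldsymbol{\Sigma}
\]

\[
E(\mathbf{S}_n) = \frac{n-1}{n}\boldsymbol{\Sigma} = \boldsymbol{\Sigma} - \frac{1}{n}\boldsymbol{\Sigma}
\]

\[
E\left(\frac{n}{n-1}\mathbf{S}_n\right) = \boldsymbol{\Sigma}
\quad \left(\frac{n}{n-1}\mathbf{S}_n \text{ as an unbiased point estimate of } \boldsymbol{\Sigma}\right)
\]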

31

Proof of Result 1

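\[
E(\bar{\mathbf{X}}) = E\left(\frac{1}{n}\mathbf{X}_1 + \frac{1}{n}\mathbf{X}_2 + \cdots + \frac{1}{n}\mathbf{X}_n\right)
= \frac{1}{n}E(\mathbf{X}_1) + \frac{1}{n}E(\mathbf{X}_2) + \cdots + \frac{1}{n}E(\mathbf{X}_n) = \boldsymbol{\mu}
\]

\[
(\bar{\mathbf{X}} - \boldsymbol{\mu})(\bar{\mathbf{X}} - \boldsymbol{\mu})'
= \left(\frac{1}{n}\sum_{j=1}^{n} (\mathbf{X}_j - \boldsymbol{\mu})\right)\left(\frac{1}{n}\sum_{l=1}^{n} (\mathbf{X}_l - \boldsymbol{\mu})\right)'
= \frac{1}{n^2}\sum_{j=1}^{n}\sum_{l=1}^{n} (\mathbf{X}_j - \boldsymbol{\mu})(\mathbf{X}_l - \boldsymbol{\mu})'
\]

\[
\operatorname{Cov}(\bar{\mathbf{X}}) = E(\bar{\mathbf{X}} - \boldsymbol{\mu})(\bar{\mathbf{X}} - \boldsymbol{\mu})'
= \frac{1}{n^2}\sum_{j=1}^{n}\sum_{l=1}^{n} E(\mathbf{X}_j - \boldsymbol{\mu})(\mathbf{X}_l - \boldsymbol{\mu})'
\]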

32

Proof of Result 1

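\[
E(\mathbf{X}_j - \boldsymbol{\mu})(\mathbf{X}_l - \boldsymbol{\mu})' = \mathbf{0} \quad \text{for } j \ne l \text{ because of independence.}
\]

\[
\operatorname{Cov}(\bar{\mathbf{X}}) = \frac{1}{n^2}\sum_{j=1}^{n} E(\mathbf{X}_j - \boldsymbol{\mu})(\mathbf{X}_j - \boldsymbol{\mu})'
= \frac{1}{n^2}(n\boldsymbol{\Sigma}) = \frac{1}{n}\boldsymbol{\Sigma}
\]

\[
\mathbf{S}_n = \frac{1}{n}\sum_{j=1}^{n} (\mathbf{X}_j - \bar{\mathbf{X}})(\mathbf{X}_j - \bar{\mathbf{X}})'
\]

\[
\sum_{j=1}^{n} (\mathbf{X}_j - \bar{\mathbf{X}})(\mathbf{X}_j - \bar{\mathbf{X}})'
= \sum_{j=1}^{n} (\mathbf{X}_j - \bar{\mathbf{X}})\mathbf{X}_j' - \left(\sum_{j=1}^{n} (\mathbf{X}_j - \bar{\mathbf{X}})\right)\bar{\mathbf{X}}'
= \sum_{j=1}^{n} \mathbf{X}_j\mathbf{X}_j' - n\bar{\mathbf{X}}\bar{\mathbf{X}}'
\]

\[
E(\mathbf{X}_j\mathbf{X}_j') = E(\mathbf{X}_j - \boldsymbol{\mu} + \boldsymbol{\mu})(\mathbf{X}_j - \boldsymbol{\mu} + \boldsymbol{\mu})'
= \boldsymbol{\Sigma} + \boldsymbol{\mu}\boldsymbol{\mu}'
\]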

33

Proof of Result 1

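\[
E(\bar{\mathbf{X}}\bar{\mathbf{X}}') = E(\bar{\mathbf{X}} - \boldsymbol{\mu} + \boldsymbol{\mu})(\bar{\mathbf{X}} - \boldsymbol{\mu} + \boldsymbol{\mu})'
= \frac{1}{n}\boldsymbol{\Sigma} + \boldsymbol{\mu}\boldsymbol{\mu}'
\]

\[
E(\mathbf{S}_n) = \frac{1}{n}\sum_{j=1}^{n} E(\mathbf{X}_j\mathbf{X}_j') - E(\bar{\mathbf{X}}\bar{\mathbf{X}}')
= \left(\boldsymbol{\Sigma} + \boldsymbol{\mu}\boldsymbol{\mu}'\right) - \left(\frac{1}{n}\boldsymbol{\Sigma} + \boldsymbol{\mu}\boldsymbol{\mu}'\right)
= \frac{n-1}{n}\boldsymbol{\Sigma}
\]

\[
\mathbf{S} = \frac{n}{n-1}\mathbf{S}_n = \frac{1}{n-1}\sum_{j=1}^{n} (\mathbf{X}_j - \bar{\mathbf{X}})(\mathbf{X}_j - \bar{\mathbf{X}})'
\]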
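Result 1 can be verified exactly on a toy population by enumerating every possible sample instead of simulating. The sketch below is an added illustration under stated assumptions: the population is a made-up discrete distribution that puts equal probability on three points in the plane, and samples of size n = 3 are drawn independently, so each of the 27 ordered samples is equally likely.

```python
import itertools
import numpy as np

# Hypothetical population: X is equally likely to be any of these p = 2 vectors.
points = np.array([[0.0, 0.0],
                   [1.0, 2.0],
                   [3.0, 1.0]])
mu = points.mean(axis=0)                                # population mean vector
Sigma = (points - mu).T @ (points - mu) / len(points)   # population covariance matrix

n = 3
samples = list(itertools.product(range(len(points)), repeat=n))
weight = 1.0 / len(samples)          # every ordered sample has the same probability

E_xbar = np.zeros(2)
cov_xbar = np.zeros((2, 2))
E_Sn = np.zeros((2, 2))
for idx in samples:
    sample = points[list(idx)]       # n x p array, one independent draw per row
    x_bar = sample.mean(axis=0)
    D = sample - x_bar
    E_xbar += weight * x_bar
    cov_xbar += weight * np.outer(x_bar - mu, x_bar - mu)
    E_Sn += weight * (D.T @ D / n)   # S_n uses divisor n

E_S = n / (n - 1) * E_Sn             # expectation of S = n/(n-1) * S_n
```

The enumeration reproduces the three statements of the result for this toy case: the average of the sample means equals mu, their covariance equals Sigma / n, and the average of S_n equals (n - 1)/n times Sigma, so rescaling by n/(n - 1) removes the bias.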

34

Some Other Estimators

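The expectation of the (i, k)th entry of \( \mathbf{S} \):

\[
E(s_{ik}) = \frac{1}{n-1} E\left(\sum_{j=1}^{n} (X_{ji} - \bar{X}_i)(X_{jk} - \bar{X}_k)\right) = \sigma_{ik}
\]

\[
E(\sqrt{s_{ii}}) \ne \sqrt{\sigma_{ii}}, \qquad E(r_{ik}) \ne \rho_{ik}
\]

The biases \( E(\sqrt{s_{ii}}) - \sqrt{\sigma_{ii}} \) and \( E(r_{ik}) - \rho_{ik} \) can usually be ignored if the sample size n is moderately large.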
