Matrix Description of Data (staff.ustc.edu.cn/~zhangwm/matrix_analysis/courseware/...)
Post on 11-Apr-2018
1
Matrix Description of Data
2
Array of Data
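$$
\mathbf{X} =
\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1k} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2k} & \cdots & x_{2p} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
x_{j1} & x_{j2} & \cdots & x_{jk} & \cdots & x_{jp} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nk} & \cdots & x_{np}
\end{bmatrix}
$$
Each row of the array is one sample: row $j$ holds the $p$ measurements $x_{j1}, x_{j2}, \ldots, x_{jp}$.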
3
Descriptive Statistics
Summary numbers to assess the information contained in data
Basic descriptive statistics:
- Sample mean
- Sample variance
- Sample standard deviation
- Sample covariance
- Sample correlation coefficient
4
Sample Mean and Sample Variance
$$
\bar{x}_k = \frac{1}{n} \sum_{j=1}^{n} x_{jk}, \qquad
s_k^2 = s_{kk} = \frac{1}{n} \sum_{j=1}^{n} (x_{jk} - \bar{x}_k)^2, \qquad
k = 1, 2, \ldots, p
$$
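As a rough numerical check of the sample mean and sample variance formulas on this slide, the sketch below computes both for a single variable, using the divisor n. The function names and the four data values are illustrative assumptions, not part of the slides.

```python
def sample_mean(column):
    """Sample mean of one variable: (1/n) * sum of the observations."""
    return sum(column) / len(column)


def sample_variance(column):
    """Sample variance with divisor n (not n - 1), matching this slide."""
    mean = sample_mean(column)
    return sum((x - mean) ** 2 for x in column) / len(column)


# Hypothetical data: n = 4 observations of variable k.
x_k = [42.0, 52.0, 48.0, 58.0]

mean_k = sample_mean(x_k)        # 50.0
var_k = sample_variance(x_k)     # 34.0
std_k = var_k ** 0.5             # sample standard deviation
```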
5
Sample Covariance and Sample Correlation Coefficient
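$$
s_{ik} = \frac{1}{n} \sum_{j=1}^{n} (x_{ji} - \bar{x}_i)(x_{jk} - \bar{x}_k), \qquad
i = 1, 2, \ldots, p; \; k = 1, 2, \ldots, p
$$
$$
r_{ik} = \frac{s_{ik}}{\sqrt{s_{ii}} \sqrt{s_{kk}}}
= \frac{\sum_{j=1}^{n} (x_{ji} - \bar{x}_i)(x_{jk} - \bar{x}_k)}
{\sqrt{\sum_{j=1}^{n} (x_{ji} - \bar{x}_i)^2} \, \sqrt{\sum_{j=1}^{n} (x_{jk} - \bar{x}_k)^2}}
$$
$$
s_{ik} = s_{ki}, \qquad r_{ik} = r_{ki}
$$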
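The sample covariance and sample correlation coefficient can be sketched in a few lines. The helper names and the two data columns below are assumptions made for illustration; the divisor is n, as elsewhere in these slides.

```python
import math


def sample_cov(x_i, x_k):
    """Sample covariance s_ik with divisor n."""
    n = len(x_i)
    mean_i = sum(x_i) / n
    mean_k = sum(x_k) / n
    return sum((a - mean_i) * (b - mean_k) for a, b in zip(x_i, x_k)) / n


def sample_corr(x_i, x_k):
    """Sample correlation r_ik = s_ik / (sqrt(s_ii) * sqrt(s_kk))."""
    return sample_cov(x_i, x_k) / math.sqrt(
        sample_cov(x_i, x_i) * sample_cov(x_k, x_k)
    )


# Hypothetical data: n = 4 observations of two variables.
x1 = [42.0, 52.0, 48.0, 58.0]
x2 = [4.0, 5.0, 4.0, 3.0]

s12 = sample_cov(x1, x2)     # -1.5
r12 = sample_corr(x1, x2)    # -1.5 / sqrt(34 * 0.5)
```

Swapping the arguments leaves both values unchanged, which is the symmetry of s and r in their two indices.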
6
Standardized Values (or Standardized Scores)
- Centered at zero
- Unit standard deviation
- The sample correlation coefficient can be regarded as the sample covariance of two standardized variables
$$
\frac{x_{jk} - \bar{x}_k}{\sqrt{s_{kk}}}
$$
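The claim that the correlation coefficient is the covariance of two standardized variables can be checked numerically. This is a minimal sketch with assumed helper names and made-up data; it uses the divisor n throughout.

```python
import math


def sample_cov(u, v):
    """Sample covariance with divisor n."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / n


def standardize(column):
    """Subtract the sample mean and divide by the sample standard deviation."""
    mean = sum(column) / len(column)
    sd = math.sqrt(sample_cov(column, column))
    return [(x - mean) / sd for x in column]


# Hypothetical data: n = 4 observations of two variables.
x1 = [42.0, 52.0, 48.0, 58.0]
x2 = [4.0, 5.0, 4.0, 3.0]

z1, z2 = standardize(x1), standardize(x2)
z1_mean = sum(z1) / len(z1)               # approximately 0
z1_var = sample_cov(z1, z1)               # approximately 1
cov_of_standardized = sample_cov(z1, z2)  # approximately r_12
r12 = sample_cov(x1, x2) / math.sqrt(sample_cov(x1, x1) * sample_cov(x2, x2))
```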
7
Properties of Sample Correlation Coefficient
- The value is between -1 and 1
- The magnitude measures the strength of the linear association
- The sign indicates the direction of the association
- The value remains unchanged if all $x_{ji}$'s and $x_{jk}$'s are changed to $y_{ji} = a x_{ji} + b$ and $y_{jk} = c x_{jk} + d$, respectively, provided that the constants $a$ and $c$ have the same sign
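The invariance property in the last item can be illustrated with a short sketch. The data and the constants a, b, c, d below are arbitrary choices made for illustration. When a and c have opposite signs, the magnitude is kept but the sign flips.

```python
import math


def sample_corr(u, v):
    """Sample correlation coefficient of two equal-length columns."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)


# Hypothetical data for variables i and k.
x_i = [42.0, 52.0, 48.0, 58.0]
x_k = [4.0, 5.0, 4.0, 3.0]
r = sample_corr(x_i, x_k)

# y_ji = a * x_ji + b and y_jk = c * x_jk + d with a = 3, c = 0.5 (same sign).
same_sign = sample_corr([3 * x + 10 for x in x_i], [0.5 * x - 2 for x in x_k])

# Same transformation but with c = -0.5 (opposite sign to a).
opposite_sign = sample_corr([3 * x + 10 for x in x_i], [-0.5 * x - 2 for x in x_k])
```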
8
Arrays of Basic Descriptive Statistics
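$$
\bar{\mathbf{x}} =
\begin{bmatrix} \bar{x}_1 \\ \bar{x}_2 \\ \vdots \\ \bar{x}_p \end{bmatrix},
\qquad
\mathbf{S}_n =
\begin{bmatrix}
s_{11} & s_{12} & \cdots & s_{1p} \\
s_{21} & s_{22} & \cdots & s_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
s_{p1} & s_{p2} & \cdots & s_{pp}
\end{bmatrix},
\qquad
\mathbf{R} =
\begin{bmatrix}
1      & r_{12} & \cdots & r_{1p} \\
r_{21} & 1      & \cdots & r_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
r_{p1} & r_{p2} & \cdots & 1
\end{bmatrix}
$$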
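The three arrays on this slide (sample mean vector, sample covariance matrix with divisor n, and sample correlation matrix) can all be computed from a data array whose rows are samples. The sketch below uses plain nested lists; the function names and the 4 x 2 data array are assumptions for illustration.

```python
import math


def column_means(X):
    """Sample mean vector: one mean per column (variable)."""
    n = len(X)
    return [sum(row[k] for row in X) / n for k in range(len(X[0]))]


def covariance_matrix(X):
    """Sample covariance matrix S_n with divisor n."""
    n, p = len(X), len(X[0])
    m = column_means(X)
    return [
        [sum((row[i] - m[i]) * (row[k] - m[k]) for row in X) / n for k in range(p)]
        for i in range(p)
    ]


def correlation_matrix(S):
    """Sample correlation matrix R with entries s_ik / sqrt(s_ii * s_kk)."""
    p = len(S)
    return [[S[i][k] / math.sqrt(S[i][i] * S[k][k]) for k in range(p)] for i in range(p)]


# Hypothetical data array: n = 4 samples (rows), p = 2 variables (columns).
X = [[42.0, 4.0], [52.0, 5.0], [48.0, 4.0], [58.0, 3.0]]

x_bar = column_means(X)          # [50.0, 4.0]
S_n = covariance_matrix(X)       # [[34.0, -1.5], [-1.5, 0.5]]
R = correlation_matrix(S_n)      # unit diagonal, r_12 off the diagonal
```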
9
Random Vectors and Random Matrices
Random vector: a vector whose elements are random variables
Random matrix: a matrix whose elements are random variables
10
Expected Value of a Random Matrix
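$$
E(\mathbf{X}) =
\begin{bmatrix}
E(X_{11}) & E(X_{12}) & \cdots & E(X_{1p}) \\
E(X_{21}) & E(X_{22}) & \cdots & E(X_{2p}) \\
\vdots    & \vdots    & \ddots & \vdots    \\
E(X_{n1}) & E(X_{n2}) & \cdots & E(X_{np})
\end{bmatrix}
$$
$$
E(X_{ij}) =
\begin{cases}
\displaystyle \int_{-\infty}^{\infty} x_{ij} f_{ij}(x_{ij}) \, dx_{ij} & \text{(continuous case)} \\[2ex]
\displaystyle \sum_{\text{all } x_{ij}} x_{ij} \, p_{ij}(x_{ij}) & \text{(discrete case)}
\end{cases}
$$
$$
E(\mathbf{A}\mathbf{X}\mathbf{B}) = \mathbf{A} \, E(\mathbf{X}) \, \mathbf{B}
$$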
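The rule E(AXB) = A E(X) B can be verified exactly for a small discrete random matrix. In the sketch below, the three outcomes, their probabilities, and the constant matrices A and B are all made up for illustration; exact fractions avoid rounding issues.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [
        [sum(A[i][t] * B[t][j] for t in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def expected_matrix(outcomes):
    """Entrywise expectation of a discrete random matrix.

    outcomes is a list of (probability, matrix) pairs whose probabilities sum to 1.
    """
    rows, cols = len(outcomes[0][1]), len(outcomes[0][1][0])
    return [
        [sum(prob * M[i][j] for prob, M in outcomes) for j in range(cols)]
        for i in range(rows)
    ]


# Hypothetical distribution of a 2 x 2 random matrix X.
outcomes = [
    (Fraction(1, 2), [[1, 0], [2, 1]]),
    (Fraction(1, 3), [[0, 3], [1, 1]]),
    (Fraction(1, 6), [[6, 0], [0, 6]]),
]
A = [[1, 2], [0, 1]]   # constant matrices (assumed values)
B = [[2, 0], [1, 1]]

E_X = expected_matrix(outcomes)
lhs = expected_matrix([(prob, matmul(matmul(A, M), B)) for prob, M in outcomes])
rhs = matmul(matmul(A, E_X), B)
```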
11
Population Mean Vectors
Random vector: $\mathbf{X}' = [X_1, X_2, \ldots, X_p]$
Joint probability density function: $f(\mathbf{x}) = f(x_1, x_2, \ldots, x_p)$
Marginal probability distribution: $f_i(x_i)$
$$
\mu_i = E(X_i), \qquad \sigma_i^2 = E(X_i - \mu_i)^2
$$
$$
E(\mathbf{X})' = \boldsymbol{\mu}' = [\mu_1, \mu_2, \ldots, \mu_p]
$$
12
Covariance
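$$
\sigma_{ik} = \mathrm{Cov}(X_i, X_k) = E[(X_i - \mu_i)(X_k - \mu_k)]
$$
$$
\sigma_{ik} =
\begin{cases}
\displaystyle \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x_i - \mu_i)(x_k - \mu_k) \, f_{ik}(x_i, x_k) \, dx_i \, dx_k & \text{(continuous case)} \\[2ex]
\displaystyle \sum_{\text{all } x_i} \sum_{\text{all } x_k} (x_i - \mu_i)(x_k - \mu_k) \, p_{ik}(x_i, x_k) & \text{(discrete case)}
\end{cases}
$$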
13
Statistically Independent
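$$
P[X_i \le x_i \text{ and } X_k \le x_k] = P[X_i \le x_i] \, P[X_k \le x_k]
$$
$$
f_{ik}(x_i, x_k) = f_i(x_i) \, f_k(x_k)
$$
$$
f_{12 \cdots p}(x_1, x_2, \ldots, x_p) = f_1(x_1) \, f_2(x_2) \cdots f_p(x_p)
$$
$$
\mathrm{Cov}(X_i, X_k) = 0 \quad \text{if } X_i, X_k \text{ are independent (the converse is not true in general)}
$$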
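A standard counterexample shows why zero covariance does not imply independence: take X uniform on {-1, 0, 1} and Y = X squared. This example is not from the slides; the sketch below simply checks both facts with exact fractions.

```python
from fractions import Fraction

# Joint distribution of (X, Y) with X uniform on {-1, 0, 1} and Y = X**2.
joint = {
    (-1, 1): Fraction(1, 3),
    (0, 0): Fraction(1, 3),
    (1, 1): Fraction(1, 3),
}


def expectation(g):
    """Expected value of g(X, Y) under the joint distribution."""
    return sum(prob * g(x, y) for (x, y), prob in joint.items())


mu_x = expectation(lambda x, y: x)                            # 0
mu_y = expectation(lambda x, y: y)                            # 2/3
cov_xy = expectation(lambda x, y: (x - mu_x) * (y - mu_y))    # 0

# Independence would require P[X = 0 and Y = 0] = P[X = 0] * P[Y = 0].
p_x0 = sum(prob for (x, y), prob in joint.items() if x == 0)  # 1/3
p_y0 = sum(prob for (x, y), prob in joint.items() if y == 0)  # 1/3
p_x0_and_y0 = joint.get((0, 0), Fraction(0))                  # 1/3
factorizes = p_x0_and_y0 == p_x0 * p_y0                       # False
```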
14
Population Variance-Covariance Matrices
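$$
\boldsymbol{\Sigma} = E(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})'
= E\left(
\begin{bmatrix} X_1 - \mu_1 \\ X_2 - \mu_2 \\ \vdots \\ X_p - \mu_p \end{bmatrix}
\begin{bmatrix} X_1 - \mu_1, & X_2 - \mu_2, & \ldots, & X_p - \mu_p \end{bmatrix}
\right)
$$
$$
\boldsymbol{\Sigma} = \mathrm{Cov}(\mathbf{X}) =
\begin{bmatrix}
\sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\
\sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\
\vdots      & \vdots      & \ddots & \vdots      \\
\sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp}
\end{bmatrix}
$$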
15
Population Correlation Coefficients
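$$
\rho_{ik} = \frac{\sigma_{ik}}{\sqrt{\sigma_{ii}} \sqrt{\sigma_{kk}}}, \qquad \rho_{ii} = 1
$$
$$
\boldsymbol{\rho} =
\begin{bmatrix}
\rho_{11} & \rho_{12} & \cdots & \rho_{1p} \\
\rho_{21} & \rho_{22} & \cdots & \rho_{2p} \\
\vdots    & \vdots    & \ddots & \vdots    \\
\rho_{p1} & \rho_{p2} & \cdots & \rho_{pp}
\end{bmatrix}
$$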
16
Standard Deviation Matrix
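$$
\mathbf{V}^{1/2} =
\begin{bmatrix}
\sqrt{\sigma_{11}} & 0 & \cdots & 0 \\
0 & \sqrt{\sigma_{22}} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sqrt{\sigma_{pp}}
\end{bmatrix}
$$
$$
\mathbf{V}^{1/2} \boldsymbol{\rho} \, \mathbf{V}^{1/2} = \boldsymbol{\Sigma}, \qquad
\boldsymbol{\rho} = \mathbf{V}^{-1/2} \boldsymbol{\Sigma} \, \mathbf{V}^{-1/2}
$$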
17
Correlation Matrix from Covariance Matrix
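$$
\boldsymbol{\Sigma} =
\begin{bmatrix}
\sigma_{11} & \sigma_{12} & \sigma_{13} \\
\sigma_{21} & \sigma_{22} & \sigma_{23} \\
\sigma_{31} & \sigma_{32} & \sigma_{33}
\end{bmatrix}
=
\begin{bmatrix}
4 & 1 & 2 \\
1 & 9 & -3 \\
2 & -3 & 25
\end{bmatrix}
$$
$$
\mathbf{V}^{1/2} =
\begin{bmatrix}
\sqrt{\sigma_{11}} & 0 & 0 \\
0 & \sqrt{\sigma_{22}} & 0 \\
0 & 0 & \sqrt{\sigma_{33}}
\end{bmatrix}
=
\begin{bmatrix}
2 & 0 & 0 \\
0 & 3 & 0 \\
0 & 0 & 5
\end{bmatrix},
\qquad
\mathbf{V}^{-1/2} =
\begin{bmatrix}
1/2 & 0 & 0 \\
0 & 1/3 & 0 \\
0 & 0 & 1/5
\end{bmatrix}
$$
$$
\boldsymbol{\rho} = \mathbf{V}^{-1/2} \boldsymbol{\Sigma} \, \mathbf{V}^{-1/2} =
\begin{bmatrix}
1 & 1/6 & 1/5 \\
1/6 & 1 & -1/5 \\
1/5 & -1/5 & 1
\end{bmatrix}
$$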
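The worked example on this slide (covariance matrix with rows 4, 1, 2; 1, 9, -3; 2, -3, 25) can be reproduced numerically. The sketch below builds the standard deviation matrix and its inverse as nested lists; the helper name `matmul` is an assumption for illustration.

```python
import math


def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [
        [sum(A[i][t] * B[t][j] for t in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


Sigma = [[4, 1, 2], [1, 9, -3], [2, -3, 25]]
p = len(Sigma)

# Standard deviation matrix V^(1/2): square roots of the diagonal of Sigma.
V_half = [[math.sqrt(Sigma[i][i]) if i == k else 0.0 for k in range(p)] for i in range(p)]
# Its inverse V^(-1/2): reciprocals of those square roots.
V_half_inv = [[1.0 / V_half[i][i] if i == k else 0.0 for k in range(p)] for i in range(p)]

# Correlation matrix: V^(-1/2) Sigma V^(-1/2).
rho = matmul(matmul(V_half_inv, Sigma), V_half_inv)

# Going back: V^(1/2) rho V^(1/2) should recover Sigma.
Sigma_back = matmul(matmul(V_half, rho), V_half)
```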
18
Partitioning Covariance Matrix
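$$
\mathbf{X} =
\begin{bmatrix} X_1 \\ \vdots \\ X_q \\ \hline X_{q+1} \\ \vdots \\ X_p \end{bmatrix}
=
\begin{bmatrix} \mathbf{X}^{(1)} \\ \hline \mathbf{X}^{(2)} \end{bmatrix},
\qquad
\boldsymbol{\mu} = E(\mathbf{X}) =
\begin{bmatrix} \boldsymbol{\mu}^{(1)} \\ \hline \boldsymbol{\mu}^{(2)} \end{bmatrix}
$$
$$
(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})' =
\begin{bmatrix}
(\mathbf{X}^{(1)} - \boldsymbol{\mu}^{(1)})(\mathbf{X}^{(1)} - \boldsymbol{\mu}^{(1)})' &
(\mathbf{X}^{(1)} - \boldsymbol{\mu}^{(1)})(\mathbf{X}^{(2)} - \boldsymbol{\mu}^{(2)})' \\
(\mathbf{X}^{(2)} - \boldsymbol{\mu}^{(2)})(\mathbf{X}^{(1)} - \boldsymbol{\mu}^{(1)})' &
(\mathbf{X}^{(2)} - \boldsymbol{\mu}^{(2)})(\mathbf{X}^{(2)} - \boldsymbol{\mu}^{(2)})'
\end{bmatrix}
$$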
19
Partitioning Covariance Matrix
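$$
\boldsymbol{\Sigma} = E(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})' =
\begin{bmatrix}
\boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} \\
\boldsymbol{\Sigma}_{21} & \boldsymbol{\Sigma}_{22}
\end{bmatrix}
=
\left[
\begin{array}{ccc|ccc}
\sigma_{11} & \cdots & \sigma_{1q} & \sigma_{1,q+1} & \cdots & \sigma_{1p} \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
\sigma_{q1} & \cdots & \sigma_{qq} & \sigma_{q,q+1} & \cdots & \sigma_{qp} \\
\hline
\sigma_{q+1,1} & \cdots & \sigma_{q+1,q} & \sigma_{q+1,q+1} & \cdots & \sigma_{q+1,p} \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
\sigma_{p1} & \cdots & \sigma_{pq} & \sigma_{p,q+1} & \cdots & \sigma_{pp}
\end{array}
\right]
$$
Here $\boldsymbol{\Sigma}_{11}$ is $q \times q$, $\boldsymbol{\Sigma}_{12}$ is $q \times (p-q)$, $\boldsymbol{\Sigma}_{21}$ is $(p-q) \times q$, and $\boldsymbol{\Sigma}_{22}$ is $(p-q) \times (p-q)$.
$$
\boldsymbol{\Sigma}_{12} = \boldsymbol{\Sigma}_{21}'
$$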
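Partitioning a covariance matrix into its four blocks amounts to slicing rows and columns at position q. The sketch below is illustrative only: the function names, the 3 x 3 matrix, and the choice q = 2 are assumptions.

```python
def partition(Sigma, q):
    """Split a p x p matrix into blocks Sigma11, Sigma12, Sigma21, Sigma22.

    The first q variables form group (1); the remaining p - q form group (2).
    """
    Sigma11 = [row[:q] for row in Sigma[:q]]
    Sigma12 = [row[q:] for row in Sigma[:q]]
    Sigma21 = [row[:q] for row in Sigma[q:]]
    Sigma22 = [row[q:] for row in Sigma[q:]]
    return Sigma11, Sigma12, Sigma21, Sigma22


def transpose(M):
    """Transpose of a matrix stored as nested lists."""
    return [list(col) for col in zip(*M)]


# Hypothetical 3 x 3 covariance matrix, split after the first q = 2 variables.
Sigma = [[4, 1, 2], [1, 9, -3], [2, -3, 25]]
Sigma11, Sigma12, Sigma21, Sigma22 = partition(Sigma, 2)
```

Because the full matrix is symmetric, the off-diagonal blocks are transposes of each other.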
20
Linear Combinations of Random Variables
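The linear combination $\mathbf{c}'\mathbf{X} = c_1 X_1 + \cdots + c_p X_p$ has
$$
\text{mean } E(\mathbf{c}'\mathbf{X}) = \mathbf{c}'\boldsymbol{\mu}, \qquad
\text{variance } \mathrm{Var}(\mathbf{c}'\mathbf{X}) = \mathbf{c}'\boldsymbol{\Sigma}\mathbf{c},
$$
where $\boldsymbol{\mu} = E(\mathbf{X})$ and $\boldsymbol{\Sigma} = \mathrm{Cov}(\mathbf{X})$.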
21
Example of Linear Combinations of Random Variables
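$$
E(aX_1 + bX_2) = aE(X_1) + bE(X_2) = a\mu_1 + b\mu_2
$$
$$
\begin{aligned}
\mathrm{Var}(aX_1 + bX_2)
&= E[(aX_1 + bX_2) - (a\mu_1 + b\mu_2)]^2 \\
&= E[a(X_1 - \mu_1) + b(X_2 - \mu_2)]^2 \\
&= E[a^2 (X_1 - \mu_1)^2 + b^2 (X_2 - \mu_2)^2 + 2ab (X_1 - \mu_1)(X_2 - \mu_2)] \\
&= a^2 \sigma_{11} + b^2 \sigma_{22} + 2ab \, \sigma_{12}
\end{aligned}
$$
With $\mathbf{c}' = [a, b]$ and $\mathbf{X}' = [X_1, X_2]$:
$$
E(\mathbf{c}'\mathbf{X}) = \mathbf{c}'\boldsymbol{\mu}, \qquad
\mathrm{Var}(\mathbf{c}'\mathbf{X}) =
\begin{bmatrix} a & b \end{bmatrix}
\begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{bmatrix}
\begin{bmatrix} a \\ b \end{bmatrix}
= \mathbf{c}'\boldsymbol{\Sigma}\mathbf{c}
$$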
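To see that the expanded variance of a two-variable linear combination agrees with the quadratic form, here is a small arithmetic sketch. The values of a, b, the mean vector, and the covariance matrix are invented for illustration.

```python
def quadratic_form(c, Sigma):
    """Return c' Sigma c for a list c and a nested-list matrix Sigma."""
    p = len(c)
    return sum(c[i] * Sigma[i][k] * c[k] for i in range(p) for k in range(p))


# Hypothetical constants and population parameters.
a, b = 2, -1
mu = [3, 1]
Sigma = [[4, 1], [1, 9]]   # sigma_11 = 4, sigma_12 = 1, sigma_22 = 9

# Mean of a*X1 + b*X2: a*mu_1 + b*mu_2.
mean_lc = a * mu[0] + b * mu[1]                                   # 5

# Variance, expanded term by term: a^2 s11 + b^2 s22 + 2ab s12.
var_expanded = a**2 * Sigma[0][0] + b**2 * Sigma[1][1] + 2 * a * b * Sigma[0][1]  # 21

# The same variance as the quadratic form c' Sigma c with c' = [a, b].
var_quadratic = quadratic_form([a, b], Sigma)                     # 21
```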
22
Linear Combinations of Random Variables
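$$
\mathbf{Z} = \mathbf{C}\mathbf{X}, \qquad
\mathbf{C} =
\begin{bmatrix}
c_{11} & c_{12} & \cdots & c_{1p} \\
c_{21} & c_{22} & \cdots & c_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
c_{p1} & c_{p2} & \cdots & c_{pp}
\end{bmatrix}
$$
$$
\boldsymbol{\mu}_{\mathbf{Z}} = E(\mathbf{Z}) = E(\mathbf{C}\mathbf{X}) = \mathbf{C}\boldsymbol{\mu}_{\mathbf{X}}
$$
$$
\boldsymbol{\Sigma}_{\mathbf{Z}} = \mathrm{Cov}(\mathbf{Z}) = \mathrm{Cov}(\mathbf{C}\mathbf{X}) = \mathbf{C}\boldsymbol{\Sigma}_{\mathbf{X}}\mathbf{C}'
$$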
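For several linear combinations at once, Z = CX, the mean vector and covariance matrix follow from two matrix products. The sketch below uses a made-up 2 x 2 example in which Z1 = X1 - X2 and Z2 = X1 + X2; all numbers and helper names are assumptions.

```python
def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [
        [sum(A[i][t] * B[t][j] for t in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def transpose(M):
    """Transpose of a matrix stored as nested lists."""
    return [list(col) for col in zip(*M)]


# Hypothetical coefficient matrix and population parameters of X.
C = [[1, -1], [1, 1]]
mu_X = [[3], [1]]               # column vector
Sigma_X = [[4, 1], [1, 9]]

mu_Z = matmul(C, mu_X)                                   # [[2], [4]]
Sigma_Z = matmul(matmul(C, Sigma_X), transpose(C))       # [[11, -5], [-5, 15]]
```

The diagonal entries match the single-combination rule: Var(X1 - X2) = 4 + 9 - 2(1) = 11 and Var(X1 + X2) = 4 + 9 + 2(1) = 15.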
23
Sample Mean Vector and Covariance Matrix
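$$
\bar{\mathbf{x}}' = [\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_p]
$$
$$
\mathbf{S}_n =
\begin{bmatrix}
s_{11} & \cdots & s_{1p} \\
\vdots & \ddots & \vdots \\
s_{1p} & \cdots & s_{pp}
\end{bmatrix}
=
\begin{bmatrix}
\dfrac{1}{n} \sum_{j=1}^{n} (x_{j1} - \bar{x}_1)^2 & \cdots &
\dfrac{1}{n} \sum_{j=1}^{n} (x_{j1} - \bar{x}_1)(x_{jp} - \bar{x}_p) \\
\vdots & \ddots & \vdots \\
\dfrac{1}{n} \sum_{j=1}^{n} (x_{j1} - \bar{x}_1)(x_{jp} - \bar{x}_p) & \cdots &
\dfrac{1}{n} \sum_{j=1}^{n} (x_{jp} - \bar{x}_p)^2
\end{bmatrix}
$$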
24
Partitioning Sample Mean Vector
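$$
\bar{\mathbf{x}} =
\begin{bmatrix} \bar{x}_1 \\ \vdots \\ \bar{x}_q \\ \hline \bar{x}_{q+1} \\ \vdots \\ \bar{x}_p \end{bmatrix}
=
\begin{bmatrix} \bar{\mathbf{x}}^{(1)} \\ \hline \bar{\mathbf{x}}^{(2)} \end{bmatrix}
$$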
25
Partitioning Sample Covariance Matrix
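$$
\mathbf{S}_n =
\left[
\begin{array}{ccc|ccc}
s_{11} & \cdots & s_{1q} & s_{1,q+1} & \cdots & s_{1p} \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
s_{q1} & \cdots & s_{qq} & s_{q,q+1} & \cdots & s_{qp} \\
\hline
s_{q+1,1} & \cdots & s_{q+1,q} & s_{q+1,q+1} & \cdots & s_{q+1,p} \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
s_{p1} & \cdots & s_{pq} & s_{p,q+1} & \cdots & s_{pp}
\end{array}
\right]
=
\begin{bmatrix}
\mathbf{S}_{11} & \mathbf{S}_{12} \\
\mathbf{S}_{21} & \mathbf{S}_{22}
\end{bmatrix}
$$
Here $\mathbf{S}_{11}$ is $q \times q$, $\mathbf{S}_{12}$ is $q \times (p-q)$, $\mathbf{S}_{21}$ is $(p-q) \times q$, and $\mathbf{S}_{22}$ is $(p-q) \times (p-q)$.
$$
\mathbf{S}_{12} = \mathbf{S}_{21}'
$$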
26
Population and Sample
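Population (random variable or vector), described by its distribution: the goal
Statistic (random variable or vector), described by its distribution: the bridge
Data: the starting point
Arrow labels in the diagram: Derivation, Computation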
27
Random Matrix
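$$
\mathbf{X} =
\begin{bmatrix}
X_{11} & X_{12} & \cdots & X_{1p} \\
X_{21} & X_{22} & \cdots & X_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
X_{n1} & X_{n2} & \cdots & X_{np}
\end{bmatrix}
=
\begin{bmatrix}
\mathbf{X}_1' \\ \mathbf{X}_2' \\ \vdots \\ \mathbf{X}_n'
\end{bmatrix}
$$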
28
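Random Sample
- Row vectors $\mathbf{X}_1', \mathbf{X}_2', \ldots, \mathbf{X}_n'$ represent independent observations from a common joint distribution with density function $f(\mathbf{x}) = f(x_1, x_2, \ldots, x_p)$
- Mathematically, the joint density function of $\mathbf{X}_1', \mathbf{X}_2', \ldots, \mathbf{X}_n'$ is
$$
f(\mathbf{x}_1) \, f(\mathbf{x}_2) \cdots f(\mathbf{x}_n), \qquad
f(\mathbf{x}_j) = f(x_{j1}, x_{j2}, \ldots, x_{jp})
$$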
29
Random Sample
- Measurements of a single trial, such as $\mathbf{X}_j' = [X_{j1}, X_{j2}, \ldots, X_{jp}]$, will usually be correlated
- The measurements from different trials must be independent
- The independence of measurements from trial to trial may not hold when the variables are likely to drift over time
30
Result 1
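If $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ are a random sample from a joint distribution that has mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$, then
$$
E(\bar{\mathbf{X}}) = \boldsymbol{\mu} \quad (\bar{\mathbf{X}} \text{ as an unbiased point estimate of } \boldsymbol{\mu}), \qquad
\mathrm{Cov}(\bar{\mathbf{X}}) = \frac{1}{n} \boldsymbol{\Sigma}
$$
$$
E(\mathbf{S}_n) = \frac{n-1}{n} \boldsymbol{\Sigma}, \qquad
E\left( \frac{n}{n-1} \mathbf{S}_n \right) = \boldsymbol{\Sigma} \quad
\left( \mathbf{S} = \frac{n}{n-1} \mathbf{S}_n \text{ as an unbiased point estimate of } \boldsymbol{\Sigma} \right)
$$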
31
Proof of Result 1
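$$
E(\bar{\mathbf{X}}) = E\left( \frac{1}{n} \mathbf{X}_1 + \frac{1}{n} \mathbf{X}_2 + \cdots + \frac{1}{n} \mathbf{X}_n \right)
= \frac{1}{n} E(\mathbf{X}_1) + \frac{1}{n} E(\mathbf{X}_2) + \cdots + \frac{1}{n} E(\mathbf{X}_n) = \boldsymbol{\mu}
$$
$$
(\bar{\mathbf{X}} - \boldsymbol{\mu})(\bar{\mathbf{X}} - \boldsymbol{\mu})'
= \left( \frac{1}{n} \sum_{j=1}^{n} (\mathbf{X}_j - \boldsymbol{\mu}) \right)
\left( \frac{1}{n} \sum_{\ell=1}^{n} (\mathbf{X}_\ell - \boldsymbol{\mu}) \right)'
= \frac{1}{n^2} \sum_{j=1}^{n} \sum_{\ell=1}^{n} (\mathbf{X}_j - \boldsymbol{\mu})(\mathbf{X}_\ell - \boldsymbol{\mu})'
$$
$$
\mathrm{Cov}(\bar{\mathbf{X}}) = E(\bar{\mathbf{X}} - \boldsymbol{\mu})(\bar{\mathbf{X}} - \boldsymbol{\mu})'
= \frac{1}{n^2} \sum_{j=1}^{n} \sum_{\ell=1}^{n} E(\mathbf{X}_j - \boldsymbol{\mu})(\mathbf{X}_\ell - \boldsymbol{\mu})'
$$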
32
Proof of Result 1
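$E(\mathbf{X}_j - \boldsymbol{\mu})(\mathbf{X}_\ell - \boldsymbol{\mu})' = \mathbf{0}$ for $j \ne \ell$ because of independence.
$$
\mathrm{Cov}(\bar{\mathbf{X}}) = \frac{1}{n^2} \sum_{j=1}^{n} E(\mathbf{X}_j - \boldsymbol{\mu})(\mathbf{X}_j - \boldsymbol{\mu})'
= \frac{1}{n^2} (n \boldsymbol{\Sigma}) = \frac{1}{n} \boldsymbol{\Sigma}
$$
$$
\mathbf{S}_n = \frac{1}{n} \sum_{j=1}^{n} (\mathbf{X}_j - \bar{\mathbf{X}})(\mathbf{X}_j - \bar{\mathbf{X}})'
$$
$$
\sum_{j=1}^{n} (\mathbf{X}_j - \bar{\mathbf{X}})(\mathbf{X}_j - \bar{\mathbf{X}})'
= \sum_{j=1}^{n} (\mathbf{X}_j - \bar{\mathbf{X}}) \mathbf{X}_j' + \left( \sum_{j=1}^{n} (\mathbf{X}_j - \bar{\mathbf{X}}) \right) (-\bar{\mathbf{X}})'
= \sum_{j=1}^{n} \mathbf{X}_j \mathbf{X}_j' - n \bar{\mathbf{X}} \bar{\mathbf{X}}'
$$
$$
E(\mathbf{X}_j \mathbf{X}_j') = E[(\mathbf{X}_j - \boldsymbol{\mu} + \boldsymbol{\mu})(\mathbf{X}_j - \boldsymbol{\mu} + \boldsymbol{\mu})']
= \boldsymbol{\Sigma} + \boldsymbol{\mu}\boldsymbol{\mu}'
$$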
33
Proof of Result 1
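$$
E(\bar{\mathbf{X}} \bar{\mathbf{X}}') = E[(\bar{\mathbf{X}} - \boldsymbol{\mu} + \boldsymbol{\mu})(\bar{\mathbf{X}} - \boldsymbol{\mu} + \boldsymbol{\mu})']
= \frac{1}{n} \boldsymbol{\Sigma} + \boldsymbol{\mu}\boldsymbol{\mu}'
$$
$$
E(\mathbf{S}_n) = \frac{1}{n} \sum_{j=1}^{n} E(\mathbf{X}_j \mathbf{X}_j') - E(\bar{\mathbf{X}} \bar{\mathbf{X}}')
= \left( \boldsymbol{\Sigma} + \boldsymbol{\mu}\boldsymbol{\mu}' \right) - \left( \frac{1}{n} \boldsymbol{\Sigma} + \boldsymbol{\mu}\boldsymbol{\mu}' \right)
= \frac{n-1}{n} \boldsymbol{\Sigma}
$$
$$
\mathbf{S} = \frac{n}{n-1} \mathbf{S}_n = \frac{1}{n-1} \sum_{j=1}^{n} (\mathbf{X}_j - \bar{\mathbf{X}})(\mathbf{X}_j - \bar{\mathbf{X}})'
$$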
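Result 1 can be checked exactly on a tiny discrete population by enumerating every possible random sample. The sketch below is only an illustration under stated assumptions: the three equally likely population points, the sample size n = 2, and all function names are invented. It uses exact fractions so that the equalities hold without rounding.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population: three equally likely values of X' = [X1, X2].
population = [(0, 0), (1, 2), (3, 1)]
n, p = 2, 2


def mean_vector(points):
    """Componentwise average of a list of p-dimensional points."""
    return [Fraction(sum(pt[k] for pt in points), len(points)) for k in range(p)]


def scatter_over(points, divisor):
    """Sum of (x - mean)(x - mean)' over the points, divided by `divisor`."""
    m = mean_vector(points)
    return [
        [sum((pt[i] - m[i]) * (pt[k] - m[k]) for pt in points) / divisor for k in range(p)]
        for i in range(p)
    ]


def average(matrices):
    """Entrywise average of equally likely matrices (their expectation)."""
    return [
        [sum(M[i][k] for M in matrices) / len(matrices) for k in range(p)]
        for i in range(p)
    ]


mu = mean_vector(population)
Sigma = scatter_over(population, len(population))   # population covariance

# All n-tuples drawn independently from the population are equally likely.
samples = list(product(population, repeat=n))

E_Sn = average([scatter_over(s, n) for s in samples])      # divisor n
E_S = average([scatter_over(s, n - 1) for s in samples])   # divisor n - 1

x_bars = [mean_vector(s) for s in samples]
E_xbar = mean_vector(x_bars)
Cov_xbar = scatter_over(x_bars, len(x_bars))
```

With these numbers, the expectation of the divisor-n matrix comes out as half of the population covariance (since (n - 1)/n = 1/2), while the divisor-(n - 1) version recovers it exactly.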
34
Some Other Estimators
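The expectation of the $(i, k)$th entry of $\mathbf{S}$:
$$
E(s_{ik}) = E\left( \frac{1}{n-1} \sum_{j=1}^{n} (X_{ji} - \bar{X}_i)(X_{jk} - \bar{X}_k) \right) = \sigma_{ik}
$$
$$
E(\sqrt{s_{ii}}) \ne \sqrt{\sigma_{ii}}, \qquad E(r_{ik}) \ne \rho_{ik}
$$
The biases $E(\sqrt{s_{ii}}) - \sqrt{\sigma_{ii}}$ and $E(r_{ik}) - \rho_{ik}$ can usually be ignored if the sample size $n$ is moderately large.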
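The point that an unbiased variance estimate does not give an unbiased standard deviation can be seen by enumerating all samples from a small discrete population. This is a hedged sketch: the population values and the sample size n = 2 are assumptions chosen so the enumeration stays tiny.

```python
import math
from fractions import Fraction
from itertools import product

# Hypothetical population: three equally likely values of a single variable.
population = [0, 1, 3]
n = 2

mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)   # 14/9


def unbiased_variance(sample):
    """Sample variance with divisor n - 1 (the diagonal entry s_ii of S)."""
    mean = Fraction(sum(sample), len(sample))
    return sum((x - mean) ** 2 for x in sample) / (len(sample) - 1)


# Every n-tuple drawn independently from the population is equally likely.
samples = list(product(population, repeat=n))

expected_s = sum(unbiased_variance(s) for s in samples) / len(samples)
expected_sqrt_s = sum(math.sqrt(unbiased_variance(s)) for s in samples) / len(samples)

# Bias of sqrt(s_ii) as an estimate of sqrt(sigma_ii); negative here.
bias_of_sqrt = expected_sqrt_s - math.sqrt(sigma2)
```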