Yu-Te Wu (吳育德), Institute of Radiological Sciences, National Yang-Ming University; Integrated Brain Research Laboratory, Taipei Veterans General Hospital
Introduction To Principal Component Analysis
Principal Component Analysis
Principal component analysis (PCA) seeks a projection that best represents the data.
The eigenvectors that retain the most information are those that correspond to the largest eigenvalues.
Let X denote an m-dimensional random vector representing the environment of interest. We assume that the random vector X has zero mean:
$$E[X] = 0$$
Let q denote a unit vector also of dimension m, onto which the vector X is to be projected. This projection is defined by the inner product of the vectors X and q, as shown by
$$A = X^T q = q^T X$$
subject to the constraint
$$\|q\| = (q^T q)^{1/2} = 1$$
Under the assumption that the random vector X has zero mean, it follows that the mean value of the projection A is zero too:
$$E[A] = q^T E[X] = 0$$
The variance of A is therefore the same as its mean-square value, and so we may write
$$\sigma^2 = E[A^2] = E[(q^T X)(X^T q)] = q^T E[X X^T]\, q = q^T R q$$
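The identity between the mean-square value of the projection and the quadratic form in R can be checked numerically. The following is a minimal sketch, assuming the sample data from the Matlab example in these slides, a sample estimate of R with n - 1 in the denominator, and an arbitrarily chosen direction (the angle theta is hypothetical and not part of the original material).

```matlab
% Data taken from the Matlab example in these slides
x = [-5 -4 -3 -2 -1 1 2 3 4 5];
y = [ 7  5  7  8  9 4 5 1 8 6];
X = [x - mean(x); y - mean(y)];    % 2-by-n matrix, each row has zero mean
n = size(X, 2);
R = (X * X') / (n - 1);            % sample estimate of E[X X^T]

theta = pi/6;                      % hypothetical angle, chosen arbitrarily
q = [cos(theta); sin(theta)];      % unit vector, q'*q = 1
a = q' * X;                        % projections A = q^T X, one per sample

var_direct = sum(a.^2) / (n - 1);  % mean-square value of A (A has zero mean)
var_probe  = q' * R * q;           % quadratic form q^T R q
```

Under these assumptions, var_direct and var_probe should agree up to rounding error for any unit vector q.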
The m-by-m matrix R is the correlation matrix of the random vector X, formally defined as the expectation of the outer product of the vector X with itself, as shown by
$$R = E[X X^T]$$
We observe that the correlation matrix R is symmetric, which means that
$$R^T = R$$
The variance $\sigma^2$ of the projection A is a function of the unit vector q; we may thus write
$$\psi(q) = \sigma^2 = q^T R q$$
on the basis of which we may think of $\psi(q)$ as a variance probe.
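The next issue to be considered is that of finding those unit vectors q along which $\psi(q)$ has extremal values (local maxima or minima).
If q is a unit vector such that the variance probe $\psi(q)$ has an extremal value, then for any small perturbation $\delta q$ of the unit vector q, we find that, to first order in $\delta q$,
$$\psi(q + \delta q) = \psi(q)$$
From the definition of the variance probe,
$$\begin{aligned}
\psi(q + \delta q) &= (q + \delta q)^T R\, (q + \delta q) \\
&= q^T R q + 2 (\delta q)^T R q + (\delta q)^T R\, \delta q \\
&\approx q^T R q + 2 (\delta q)^T R q \\
&= \psi(q) + 2 (\delta q)^T R q
\end{aligned}$$
where the second-order term $(\delta q)^T R\, \delta q$ has been ignored. Hence
$$(\delta q)^T R q = 0 \tag{1}$$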
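Only those perturbations are admissible for which the Euclidean norm of the perturbed vector $q + \delta q$ remains equal to unity; that is,
$$\|q + \delta q\| = 1$$
or equivalently,
$$(q + \delta q)^T (q + \delta q) = 1$$
Expanding the left-hand side gives
$$q^T q + 2 (\delta q)^T q + (\delta q)^T \delta q = 1$$
Since $q^T q = 1$, to first order in $\delta q$ we have
$$(\delta q)^T q = 0 \tag{2}$$
This means that the perturbations $\delta q$ must be orthogonal to q, and therefore only a change in the direction of q is permitted.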
We introduce a scaling factor $\lambda$ into equation (2), with the same dimensions as the entries of the correlation matrix R in (1). We may then combine (1) and (2) into
$$(\delta q)^T R q - \lambda (\delta q)^T q = 0$$
or equivalently,
$$(\delta q)^T (R q - \lambda q) = 0$$
For this condition to hold, it is necessary and sufficient to have
$$R q = \lambda q$$
This is the equation that governs the unit vectors q for which the variance probe $\psi(q)$ has extremal values.
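The link between extremal values of the variance probe and the eigenvalue equation can be illustrated by brute force in two dimensions. The following is a minimal sketch, assuming the sample data from the Matlab example in these slides and a hypothetical grid of unit vectors sampled every 0.1 degree; it is an illustration, not part of the original derivation.

```matlab
% Data taken from the Matlab example in these slides
x = [-5 -4 -3 -2 -1 1 2 3 4 5];
y = [ 7  5  7  8  9 4 5 1 8 6];
X = [x - mean(x); y - mean(y)];           % zero-mean data, 2-by-n
R = (X * X') / (size(X, 2) - 1);          % sample correlation matrix

theta = linspace(0, pi, 1801);            % hypothetical grid of directions
Qs = [cos(theta); sin(theta)];            % each column is a unit vector q
psi = sum(Qs .* (R * Qs), 1);             % psi(q) = q^T R q for every column

[psi_max, imax] = max(psi);               % largest value of the variance probe
psi_min = min(psi);                       % smallest value of the variance probe
q_max = Qs(:, imax);                      % direction that attains the maximum

lambda = eig(R);                          % eigenvalues of R, for comparison
residual = norm(R * q_max - psi_max * q_max);  % small if R q = lambda q holds
```

Up to the grid resolution, psi_max and psi_min should match the largest and smallest entries of lambda, and residual should be close to zero.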
The equation
$$R q = \lambda q$$
is recognized as the eigenvalue problem, commonly encountered in linear algebra (Strang, 1980).
Let the eigenvalues of the m-by-m matrix R be denoted by $\lambda_1, \lambda_2, \ldots, \lambda_m$, and the associated eigenvectors be denoted by $q_1, q_2, \ldots, q_m$, respectively.
We may then write
$$R q_j = \lambda_j q_j, \quad j = 1, 2, \ldots, m$$
Let the corresponding eigenvalues be arranged in decreasing order:
$$\lambda_1 > \lambda_2 > \cdots > \lambda_j > \cdots > \lambda_m$$
so that $\lambda_1 = \lambda_{\max}$.
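Let the associated eigenvectors be used to construct the m-by-m matrix
$$Q = [q_1, q_2, \ldots, q_j, \ldots, q_m]$$
We may then combine the set of m equations
$$R q_j = \lambda_j q_j, \quad j = 1, 2, \ldots, m$$
into a single equation:
$$R Q = Q \Lambda$$
where $\Lambda$ is a diagonal matrix defined by the eigenvalues of matrix R:
$$\Lambda = \mathrm{diag}[\lambda_1, \lambda_2, \ldots, \lambda_j, \ldots, \lambda_m]$$
The matrix Q is an orthogonal (unitary) matrix in the sense that its column vectors (i.e., the eigenvectors of R) satisfy the condition of orthonormality:
$$q_i^T q_j = \begin{cases} 1, & j = i \\ 0, & j \neq i \end{cases}$$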
Equivalently, we may write
$$Q^T Q = I$$
from which we deduce that the inverse of matrix Q is the same as its transpose, as shown by
$$Q^T = Q^{-1}$$
This means that we may rewrite $R Q = Q \Lambda$ in a form known as the orthogonal similarity transformation:
$$Q^T R Q = \Lambda$$
or in expanded form,
$$q_j^T R q_k = \begin{cases} \lambda_j, & k = j \\ 0, & k \neq j \end{cases}$$
The orthogonal similarity (unitary) transformation $Q^T R Q = \Lambda$ transforms the correlation matrix R into a diagonal matrix of eigenvalues. The correlation matrix R may itself be expressed in terms of its eigenvalues and eigenvectors as:
$$R = \sum_{i=1}^{m} \lambda_i q_i q_i^T$$
which is referred to as the spectral theorem.
The equations $Q^T R Q = \Lambda$ and $R = \sum_{i=1}^{m} \lambda_i q_i q_i^T$ are two equivalent representations of the eigendecomposition of the correlation matrix R.
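Orthonormality of the eigenvectors, the orthogonal similarity transformation, and the spectral expansion of R can all be verified numerically. The following is a minimal sketch, assuming the sample data from the Matlab example in these slides; the explicit symmetrization step is an added precaution and not part of the original material.

```matlab
% Data taken from the Matlab example in these slides
x = [-5 -4 -3 -2 -1 1 2 3 4 5];
y = [ 7  5  7  8  9 4 5 1 8 6];
X = [x - mean(x); y - mean(y)];           % zero-mean data, 2-by-n
R = (X * X') / (size(X, 2) - 1);          % sample correlation matrix
R = (R + R') / 2;                         % enforce exact symmetry (precaution)

[Q, Lambda] = eig(R);                     % columns of Q: eigenvectors of R

orth_err = norm(Q' * Q - eye(2));         % checks Q^T Q = I
diag_err = norm(Q' * R * Q - Lambda);     % checks Q^T R Q = Lambda

% Rebuild R from the sum of lambda_i * q_i * q_i^T
R_sum = zeros(2);
for i = 1:2
    R_sum = R_sum + Lambda(i, i) * Q(:, i) * Q(:, i)';
end
spec_err = norm(R_sum - R);               % checks the spectral expansion
```

All three error measures should be at the level of rounding error when R is symmetric.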
Principal component analysis and the eigendecomposition of matrix R are basically one and the same, just viewing the problem in different ways.
This equivalence follows from
$$\psi(q) = q^T R q$$
and
$$R = \sum_{i=1}^{m} \lambda_i q_i q_i^T$$
where we see that the variance probes and eigenvalues are indeed equal, as shown by
$$\psi(q_j) = \lambda_j, \quad j = 1, 2, \ldots, m$$
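The equality between the variance probes evaluated at the eigenvectors and the eigenvalues can be checked directly. The following is a minimal sketch, assuming the sample data from the Matlab example in these slides; the explicit sort into decreasing order is an added step so that the first entry corresponds to the largest eigenvalue.

```matlab
% Data taken from the Matlab example in these slides
x = [-5 -4 -3 -2 -1 1 2 3 4 5];
y = [ 7  5  7  8  9 4 5 1 8 6];
X = [x - mean(x); y - mean(y)];           % zero-mean data, 2-by-n
R = (X * X') / (size(X, 2) - 1);          % sample correlation matrix

[Q, Lambda] = eig(R);
[lambda_sorted, order] = sort(diag(Lambda), 'descend');  % decreasing order
Q = Q(:, order);                          % reorder eigenvectors to match

psi = zeros(2, 1);
for j = 1:2
    psi(j) = Q(:, j)' * R * Q(:, j);      % variance probe at eigenvector q_j
end
gap = max(abs(psi - lambda_sorted));      % largest mismatch between psi and lambda
```

The value of gap should be at the level of rounding error.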
We may now summarize the two important findings we have made from the eigenstructure of principal component analysis:
The eigenvectors of the correlation matrix R pertaining to the zero-mean random vector X define the unit vectors $q_j$, representing the principal directions along which the variance probes $\psi(q_j)$ have their extremal values.
The associated eigenvalues define the extremal values of the variance probes $\psi(q_j)$.
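A Matlab Example
x=[-5 -4 -3 -2 -1 1 2 3 4 5];
y=[ 7 5 7 8 9 4 5 1 8 6];
% Center each variable so that the data have zero mean
x1=x-mean(x); x2=y-mean(y);
% 2-by-2 sample correlation matrix of the centered data (n-1 = 9)
R=[sum(x1.*x1)/9 sum(x1.*x2)/9;
sum(x2.*x1)/9 sum(x2.*x2)/9]
% Columns of V are the eigenvectors of R; diag(D) holds the eigenvalues
[V,D]=eig(R);
% Project the centered data onto the second eigenvector
% (for a symmetric R, eig typically lists eigenvalues in ascending order)
z= [x1; x2]'* V(:,2);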
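The example relies on the second column of V belonging to the largest eigenvalue. The following is a minimal sketch that instead selects that column explicitly and confirms that the variance of the projected data equals the largest eigenvalue; the variable names k, lambda_max and var_z are introduced here for illustration only.

```matlab
x = [-5 -4 -3 -2 -1 1 2 3 4 5];
y = [ 7  5  7  8  9 4 5 1 8 6];
X = [x - mean(x); y - mean(y)];           % zero-mean data, 2-by-n
n = size(X, 2);
R = (X * X') / (n - 1);                   % same R as in the example above

[V, D] = eig(R);
[lambda_max, k] = max(diag(D));           % index of the largest eigenvalue
z = X' * V(:, k);                         % projection onto the first principal direction

var_z = sum(z.^2) / (n - 1);              % variance of the projection (z has zero mean)
```

Under these assumptions, var_z should equal lambda_max up to rounding error, in line with the finding that the eigenvalues are the extremal values of the variance probe.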