Lanczos Method
Lanczos method ct.
The coefficients are obtained from

β_{k−1} = (v^{k−1})^T A v^k = (A v^{k−1})^T v^k
        = (γ_{k−1} v^k + α_{k−1} v^{k−1} + β_{k−2} v^{k−2})^T v^k = γ_{k−1},

i.e.

A v^k = β_k v^{k+1} + α_k v^k + β_{k−1} v^{k−1}.   (1)

Thus,

α_k = (v^k)^T A v^k,

and the condition ‖v^{k+1}‖_2 = 1 yields

v^{k+1} = (A v^k − α_k v^k − β_{k−1} v^{k−1}) / ‖A v^k − α_k v^k − β_{k−1} v^{k−1}‖_2,
β_k = ‖A v^k − α_k v^k − β_{k−1} v^{k−1}‖_2.   (2)

If the denominator in (2) vanishes, then A v^k ∈ K_k(v, A), and therefore
K_k(v, A) is a k-dimensional invariant subspace of A.
TUHH Heinrich Voss Krylov subspace methods Eigenvalue problems 2012 8 / 95
Lanczos Method
Lanczos method ct.
Require: initial vector r^0 ≠ 0
1: v^0 = 0; k = 1
2: β_0 = ‖r^0‖_2
3: while r^{k−1} ≠ 0 do
4:   v^k = r^{k−1}/β_{k−1}
5:   r^k = A v^k
6:   r^k = r^k − β_{k−1} v^{k−1}
7:   α_k = (v^k)^T r^k
8:   r^k = r^k − α_k v^k
9:   β_k = ‖r^k‖_2
10:  k = k + 1
11: end while

Then with T_k = tridiag{β_{j−1}, α_j, β_j}

A V_k = V_k T_k + r^k (e^k)^T,  r^k = A v^k − α_k v^k − β_{k−1} v^{k−1},

and r^k = ‖r^k‖_2 v^{k+1} = β_k v^{k+1}.
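The algorithm translates almost line by line into code. The following sketch (my own Python/NumPy rendering of the plain method, without reorthogonalization and without handling a lucky breakdown) also checks the factorization A V_k = V_k T_k + r^k (e^k)^T:

```python
import numpy as np

def lanczos(A, v, m):
    """Plain Lanczos: m steps, no reorthogonalization.
    Returns V (n x m, orthonormal columns in exact arithmetic),
    T (m x m symmetric tridiagonal) and the final residual r
    with A V = V T + r e_m^T."""
    n = A.shape[0]
    V = np.zeros((n, m))
    alpha = np.zeros(m)
    beta = np.zeros(m)
    r = v.astype(float)
    b = np.linalg.norm(r)          # beta_0
    for k in range(m):
        V[:, k] = r / b
        r = A @ V[:, k]
        if k > 0:
            r -= b * V[:, k - 1]   # subtract beta_{k-1} v^{k-1}
        alpha[k] = V[:, k] @ r
        r -= alpha[k] * V[:, k]
        b = np.linalg.norm(r)      # beta_k
        beta[k] = b
    T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
    return V, T, r

rng = np.random.default_rng(0)
B = rng.standard_normal((100, 100))
A = (B + B.T) / 2                  # symmetric test matrix
V, T, r = lanczos(A, rng.standard_normal(100), 10)

# factorization A V = V T + r e_m^T and orthonormality of V
E = A @ V - V @ T
E[:, -1] -= r
print(np.linalg.norm(E), np.linalg.norm(V.T @ V - np.eye(10)))
```

In exact arithmetic the columns of V are orthonormal; the second printed number quantifies how well this survives in floating point for a short run.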
Lanczos Method
Lucky termination
The Lanczos method may terminate with βj = 0 for some j .
Then v^{j+1} = 0, and therefore

A v^j = α_j v^j + β_{j−1} v^{j−1} ∈ K_j(A, v^1).
For i < j it holds by construction

A v^i ∈ K_j(A, v^1).

Hence, K_j(A, v^1) is an invariant subspace of A, and therefore every
eigenvalue θ_i^{(j)} of T_j is an eigenvalue of A, and the corresponding Ritz vectors
are eigenvectors of A.
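Such a breakdown is easy to provoke: if the starting vector has components along only j eigenvectors, then β_j vanishes up to roundoff and the j Ritz values are exact eigenvalues. A small NumPy illustration (the matrix and starting vector are my own choices):

```python
import numpy as np

A = np.diag(np.arange(1.0, 11.0))      # eigenvalues 1, 2, ..., 10
v = np.zeros(10); v[:3] = 1.0          # v^1 in span{u^1, u^2, u^3}

# three Lanczos steps
V = np.zeros((10, 3)); alpha = np.zeros(3); beta = []
r, b = v.copy(), np.linalg.norm(v)
for k in range(3):
    V[:, k] = r / b
    r = A @ V[:, k] - (b * V[:, k - 1] if k > 0 else 0)
    alpha[k] = V[:, k] @ r
    r -= alpha[k] * V[:, k]
    b = np.linalg.norm(r)
    beta.append(b)

T = np.diag(alpha) + np.diag(beta[:2], 1) + np.diag(beta[:2], -1)
print(beta[-1])                        # beta_3 ~ 0: lucky termination
print(np.linalg.eigvalsh(T))           # the eigenvalues 1, 2, 3
```

K_3(v^1, A) = span{e_1, e_2, e_3} is invariant here, so T_3 is similar to the restriction of A to that subspace.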
Lanczos Method
Error bound
If θ_i^{(m)} are the Ritz values (eigenvalues of T_m), s_i^{(m)} the corresponding
eigenvectors, and x_i^{(m)} = V_m s_i^{(m)} the Ritz vectors, it holds

(A − θ_i^{(m)} I) x_i^{(m)} = (A − θ_i^{(m)} I) V_m s_i^{(m)} = A V_m s_i^{(m)} − V_m (θ_i^{(m)} s_i^{(m)})
= V_m T_m s_i^{(m)} + β_m v^{m+1} (e^m)^T s_i^{(m)} − V_m T_m s_i^{(m)} = β_m v^{m+1} (e^m)^T s_i^{(m)},

which implies

‖(A − θ_i^{(m)} I) x_i^{(m)}‖_2 = β_m |s_{m,i}^{(m)}|.

Then by the Krylov–Bogoliubov theorem there exists an eigenvalue λ of A
such that

|λ − θ_i^{(m)}| ≤ β_m |s_{m,i}^{(m)}|.

Notice that this error bound can be computed without determining the Ritz
vector x_i^{(m)} = V_m s_i^{(m)}.
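A numerical check of this bound (test matrix and parameters are my own): for every Ritz value θ_i^{(m)}, the distance to the nearest eigenvalue of A should not exceed β_m |s_{m,i}^{(m)}|.

```python
import numpy as np

rng = np.random.default_rng(1)
lam = np.sort(rng.uniform(0.0, 10.0, 60))
A = np.diag(lam)                       # matrix with known spectrum
m = 12

# m plain Lanczos steps (no reorthogonalization)
V = np.zeros((60, m)); alpha = np.zeros(m); beta = np.zeros(m)
r = rng.standard_normal(60); b = np.linalg.norm(r)
for k in range(m):
    V[:, k] = r / b
    r = A @ V[:, k] - (b * V[:, k - 1] if k > 0 else 0)
    alpha[k] = V[:, k] @ r
    r -= alpha[k] * V[:, k]
    b = np.linalg.norm(r); beta[k] = b

T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
theta, S = np.linalg.eigh(T)           # Ritz values, eigenvectors of T_m
bounds = beta[-1] * np.abs(S[-1, :])   # beta_m |s_{m,i}^{(m)}|
errors = np.min(np.abs(lam[:, None] - theta[None, :]), axis=0)
print(np.all(errors <= bounds + 1e-10))
```

Only β_m and the last row of the eigenvector matrix of T_m are needed; the Ritz vectors themselves are never formed.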
Lanczos Method
Convergence
The Lanczos method can be seen as a generalization of the power method, which
obtains its approximate eigenvector from the last iterate only.
As for the power method we therefore can expect fast convergence to the
eigenvalue which is maximal in modulus and to the corresponding
eigenvector.
One can influence the convergence of the power method by shifts, either to
separate the wanted eigenvalue more from the remaining spectrum or to
enforce convergence to a different eigenvalue.
The Lanczos method is invariant under shifts, since

K_m(v^1, A) = K_m(v^1, A + αI) for all α ∈ R.

Hence, we can expect convergence of the Lanczos method to the extreme
eigenvalues first.
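The shift invariance is visible in the recurrence itself: replacing A by A + αI changes each α_k into α_k + α and leaves the β_k and the basis vectors unchanged, so T_m becomes T_m + αI. A quick check (setup mine):

```python
import numpy as np

def lanczos_T(A, v, m):
    """Return the tridiagonal T_m of m plain Lanczos steps."""
    n = A.shape[0]
    V = np.zeros((n, m)); alpha = np.zeros(m); beta = np.zeros(m)
    r = v.astype(float); b = np.linalg.norm(r)
    for k in range(m):
        V[:, k] = r / b
        r = A @ V[:, k] - (b * V[:, k - 1] if k > 0 else 0)
        alpha[k] = V[:, k] @ r
        r -= alpha[k] * V[:, k]
        b = np.linalg.norm(r); beta[k] = b
    return np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)

rng = np.random.default_rng(2)
B = rng.standard_normal((50, 50)); A = (B + B.T) / 2
v = rng.standard_normal(50)

T = lanczos_T(A, v, 8)
T_shifted = lanczos_T(A + 3.0 * np.eye(50), v, 8)
print(np.linalg.norm(T_shifted - (T + 3.0 * np.eye(8))))   # ~ 0
```

Consequently the Ritz values of A + αI are exactly the Ritz values of A shifted by α; a shift cannot steer the iteration toward interior eigenvalues.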
Lanczos Method
Monotonicity
Assume that the eigenvalues of A and the Ritz values with respect to K_m are
ordered by magnitude:

λ_1 ≤ λ_2 ≤ · · · ≤ λ_n,  θ_1^{(m)} ≤ θ_2^{(m)} ≤ · · · ≤ θ_m^{(m)}.

Then the minmax principle yields (S denotes a subspace of C^m and S̃ a
subspace of C^n)

θ_j^{(m)} = min_{dim S = j} max_{y ∈ S} (y^H T_m y)/(y^H y)
          = min_{dim S = j} max_{y ∈ S} (y^H V_m^H A V_m y)/(y^H V_m^H V_m y)
          = min_{dim S̃ = j, S̃ ⊂ K_m} max_{x ∈ S̃} (x^H A x)/(x^H x)
          ≥ min_{dim S̃ = j, S̃ ⊂ K_{m+1}} max_{x ∈ S̃} (x^H A x)/(x^H x) = θ_j^{(m+1)}
          ≥ min_{dim S̃ = j} max_{x ∈ S̃} (x^H A x)/(x^H x) = λ_j.

Each (finite) sequence {θ_j^{(m)}}_{m = j, j+1, j+2, ...} therefore is monotonically decreasing
and bounded below by λ_j.

Likewise, the sequence of the j-th largest eigenvalues of T_m is monotonically
increasing and bounded above by the j-th largest eigenvalue of A.
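The monotone convergence of θ_1^{(m)} toward λ_1 can be observed by reading the leading principal submatrices T_m off one longer Lanczos run (test matrix mine):

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((80, 80)); A = (B + B.T) / 2
lam1 = np.linalg.eigvalsh(A)[0]
v = rng.standard_normal(80)

# run 20 Lanczos steps once; T_m is the leading m x m block of T_20
m_max = 20
V = np.zeros((80, m_max)); alpha = np.zeros(m_max); beta = np.zeros(m_max)
r, b = v.copy(), np.linalg.norm(v)
for k in range(m_max):
    V[:, k] = r / b
    r = A @ V[:, k] - (b * V[:, k - 1] if k > 0 else 0)
    alpha[k] = V[:, k] @ r
    r -= alpha[k] * V[:, k]
    b = np.linalg.norm(r); beta[k] = b

T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
theta1 = [np.linalg.eigvalsh(T[:m, :m])[0] for m in range(1, m_max + 1)]
print(theta1[0], theta1[-1], lam1)     # decreasing toward lambda_1
```

The decrease from step to step is exactly the Cauchy interlacing of the nested tridiagonal matrices T_m.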
Lanczos Method
Convergence
Before deriving results about the speed of convergence of the Lanczos method,
we first prove a bound for the angle between an eigenvector of A and the Krylov
space K_m(v^1, A). Denote by u^i a system of orthonormal eigenvectors
corresponding to the eigenvalues λ_i.
LEMMA 10.1

Let P_i be the orthogonal projector onto the eigenspace corresponding to λ_i. If
P_i v^1 ≠ 0, then

tan δ(u^i, K_m) = min_{p ∈ Π_{m−1}, p(λ_i)=1} ‖p(A) y^i‖_2 · tan δ(u^i, v^1),

where

y^i = (I − P_i) v^1 / ‖(I − P_i) v^1‖_2   if (I − P_i) v^1 ≠ 0,
y^i = 0                                   otherwise.
Lanczos Method
Proof
The Krylov space K_m(v^1, A) consists of all vectors which can be written as
x = q(A) v^1, where q ∈ Π_{m−1} is any polynomial of degree at most m − 1.

With the orthogonal decomposition

x = q(A) v^1 = q(A) P_i v^1 + q(A)(I − P_i) v^1

it holds for the angle δ(x, u^i) between x and u^i

tan δ(x, u^i) = ‖q(A)(I − P_i) v^1‖_2 / ‖q(A) P_i v^1‖_2
             = (‖q(A) y^i‖_2 / |q(λ_i)|) · (‖(I − P_i) v^1‖_2 / ‖P_i v^1‖_2),

and the scaling p(λ) := q(λ)/q(λ_i) yields

tan δ(x, u^i) = ‖p(A) y^i‖_2 · tan δ(v^1, u^i),

from which we get the statement by minimizing over all x ∈ K_m(v^1, A). □
Lanczos Method
Convergence ct.
Inserting any polynomial p of degree at most m − 1 which satisfies p(λ_i) = 1, one
obtains an upper bound for tan δ(u^i, K_m(v^1, A)) from the last lemma.
Definition
The Chebyshev polynomials are defined as

c_k(t) := cos(k · arccos t),  |t| ≤ 1,  k ∈ N ∪ {0}.
In the following lemma we collect some properties of the Chebyshev
polynomials that are needed in the sequel.
Lanczos Method
Lemma 3.1
(i) The functions c_k(t) satisfy the recurrence formula

c_0(t) ≡ 1,  c_1(t) = t,  c_{k+1}(t) = 2t c_k(t) − c_{k−1}(t),  k ≥ 1.

In particular, c_k is a polynomial of degree k.

(ii) For |t| ≥ 1 the Chebyshev polynomials have the following representation:

c_k(t) = cosh(k · arcosh t),  t ∈ R, |t| ≥ 1.

Proof: (i) follows from the properties of the cosine function, in particular from

cos((k + 1)t) + cos((k − 1)t) = 2 cos(kt) · cos t.

(ii) follows from the fact that the functions cosh(k · arcosh t) satisfy the same
recurrence formula. □
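Both representations can be checked directly against the recurrence (a small pure-Python sketch):

```python
import math

def cheb(k, t):
    """Chebyshev polynomial c_k(t) via the three-term recurrence."""
    if k == 0:
        return 1.0
    c_prev, c = 1.0, t
    for _ in range(k - 1):
        c_prev, c = c, 2.0 * t * c - c_prev
    return c

# (i) matches cos(k arccos t) on [-1, 1]
for t in [-0.9, -0.3, 0.0, 0.5, 0.99]:
    assert abs(cheb(7, t) - math.cos(7 * math.acos(t))) < 1e-12

# (ii) matches cosh(k arcosh t) for t >= 1
for t in [1.0, 1.5, 3.0]:
    assert abs(cheb(6, t) - math.cosh(6 * math.acosh(t))) < 1e-8
print("ok")
```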
Lanczos Method
Theorem 10.2
Let α, β, γ ∈ R with α < β and γ ∉ (α, β). Then the minimization problem

min_{p ∈ Π_m, p(γ)=1} max_{t ∈ [α,β]} |p(t)|

has a unique solution, namely the scaled Chebyshev polynomial

ĉ_m(t) := c_m(1 + 2(t − β)/(β − α)) / c_m(1 + 2(γ − β)/(β − α))   for γ > β,
ĉ_m(t) := c_m(1 + 2(α − t)/(β − α)) / c_m(1 + 2(α − γ)/(β − α))   for γ < α.
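For the case γ > β the two defining properties are easy to verify numerically: the scaled polynomial equals 1 at γ, and on [α, β] its maximal modulus is 1/c_m(1 + 2(γ − β)/(β − α)), attained alternately at the mapped Chebyshev points. The parameter values below are my own:

```python
import math

def cheb(k, t):
    """Chebyshev polynomial c_k(t) via the three-term recurrence."""
    if k == 0:
        return 1.0
    c_prev, c = 1.0, t
    for _ in range(k - 1):
        c_prev, c = c, 2.0 * t * c - c_prev
    return c

# interval [a, b], normalization point g > b
a, b, g, m = 0.0, 1.0, 1.5, 8
denom = cheb(m, 1.0 + 2.0 * (g - b) / (b - a))

def chat(t):
    """Scaled Chebyshev polynomial of Theorem 10.2 (case gamma > beta)."""
    return cheb(m, 1.0 + 2.0 * (t - b) / (b - a)) / denom

grid_max = max(abs(chat(a + (b - a) * i / 5000)) for i in range(5001))
print(chat(g), grid_max, 1.0 / denom)   # 1.0; grid max close to 1/denom
```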
Lanczos Method
Proof
We restrict ourselves to the case γ > β.

From 1 + 2(γ − β)/(β − α) > 1 it follows that c_m(1 + 2(γ − β)/(β − α)) ≠ 0, hence
ĉ_m is well defined; obviously, ĉ_m is a polynomial of degree m, and ĉ_m(γ) = 1
holds.

It remains to show that ĉ_m is the unique solution of the minimum problem.

Assume that q_m is a polynomial of degree m such that q_m(γ) = 1 and

max_{α ≤ t ≤ β} |q_m(t)| ≤ 1 / c̄_m,  where c̄_m := c_m(1 + 2(γ − β)/(β − α)).

Obviously, the Chebyshev polynomial c_m attains the values +1 and −1
alternately at the arguments τ_j := cos(jπ/m), j = 0, . . . , m. Hence,

ĉ_m(t_j) = (−1)^j / c̄_m   at the points t_j ∈ [α, β] with (2t_j − α − β)/(β − α) = τ_j,
j = 0, . . . , m.
Lanczos Method
Proof ct.
Let r_m := ĉ_m − q_m. Then from

|q_m(t_j)| ≤ 1/c̄_m = |ĉ_m(t_j)|,  j = 0, . . . , m,

one obtains that

r_m(t_j) ≥ 0 if j is even,  r_m(t_j) ≤ 0 if j is odd.

From the continuity of r_m we get the existence of a root in every interval
[t_j, t_{j−1}], j = 1, . . . , m. Moreover, r_m(γ) = ĉ_m(γ) − q_m(γ) = 0.

Therefore, the polynomial r_m of degree m has at least m + 1 roots, and it
follows that r_m ≡ 0, i.e. q_m = ĉ_m. □
Lanczos Method
THEOREM 10.3
The angle δ(u^i, K_m(v^1, A)) between the exact eigenvector u^i and the m-th
Krylov space satisfies the inequality

tan δ(u^i, K_m) ≤ κ_i / c_{m−i}(1 + 2ρ_i) · tan δ(u^i, v^1)   (1)

where

κ_1 = 1,  κ_i = ∏_{j=1}^{i−1} (λ_n − λ_j)/(λ_i − λ_j)   for i > 1   (2)

and

ρ_i = (λ_{i+1} − λ_i)/(λ_n − λ_{i+1}).   (3)

In particular, for i = 1 one gets the estimate

tan δ(u^1, K_m(v^1, A)) ≤ tan δ(v^1, u^1) / c_{m−1}(1 + 2ρ_1),  where
ρ_1 = (λ_2 − λ_1)/(λ_n − λ_2).

Crucial for the convergence is the distance of the two smallest eigenvalues
relative to the width of the entire spectrum.
Lanczos Method
Proof
We consider the case i = 1 first.

Expanding the vector y^1 in the basis {u^j} of eigenvectors yields

y^1 = Σ_{j=1}^n α_j u^j,  where Σ_{j=1}^n |α_j|^2 = 1

(note that α_1 = 0, since y^1 ⊥ u^1), from which we get

‖p(A) y^1‖_2^2 = Σ_{j=2}^n |p(λ_j) α_j|^2 ≤ max_{j=2,...,n} |p(λ_j)|^2 ≤ max_{λ ∈ [λ_2, λ_n]} |p(λ)|^2,

and the statement follows from Theorem 10.2.
Lanczos Method
Proof ct.
For i > 1 we consider in Lemma 10.1 polynomials of the form

p(λ) := [(λ − λ_1) · · · (λ − λ_{i−1})] / [(λ_i − λ_1) · · · (λ_i − λ_{i−1})] · q(λ)

with q ∈ Π_{m−i} and q(λ_i) = 1.

Then one gets as before

‖p(A) y^i‖_2 ≤ max_{λ ∈ [λ_{i+1}, λ_n]} | ∏_{j=1}^{i−1} (λ − λ_j)/(λ_i − λ_j) · q(λ) |
            ≤ ∏_{j=1}^{i−1} (λ_n − λ_j)/(λ_i − λ_j) · max_{λ ∈ [λ_{i+1}, λ_n]} |q(λ)|.

The result follows by minimizing this expression over all polynomials q
satisfying the constraint q(λ_i) = 1. □
Lanczos Method
THEOREM 10.4 (Kaniel & Paige; first eigenvalue)

Let A ∈ C^{n×n} be Hermitian with eigenvalues λ_1 ≤ λ_2 ≤ · · · ≤ λ_n and
corresponding orthonormal eigenvectors u^1, . . . , u^n.

If θ_1^{(m)} ≤ · · · ≤ θ_m^{(m)} denote the eigenvalues of the matrix T_m obtained after m
steps of Lanczos' method, then

0 ≤ θ_1^{(m)} − λ_1 ≤ (λ_n − λ_1) · (tan δ(u^1, v^1) / c_{m−1}(1 + 2ρ_1))^2,

where ρ_1 = (λ_2 − λ_1)/(λ_n − λ_2).

Crucial for the speed of convergence is the growth of c_{m−1}(1 + 2ρ_1), i.e. the
separation of the first two eigenvalues relative to the width of the entire
spectrum of A.
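The bound can be verified numerically. With a diagonal A the eigenvector u^1 is a coordinate vector, so tan δ(u^1, v^1) is explicit, and c_{m−1} can be evaluated through the cosh representation of Lemma 3.1 (the spectrum and all parameters below are my own test setup):

```python
import math
import numpy as np

lam = np.concatenate(([1.0], np.linspace(20.0, 100.0, 99)))   # spectrum
A = np.diag(lam); n = 100
rng = np.random.default_rng(4)
v = rng.standard_normal(n); v /= np.linalg.norm(v)

m = 10
# m plain Lanczos steps
V = np.zeros((n, m)); alpha = np.zeros(m); beta = np.zeros(m)
r, b = v.copy(), 1.0                    # ||v|| = 1
for k in range(m):
    V[:, k] = r / b
    r = A @ V[:, k] - (b * V[:, k - 1] if k > 0 else 0)
    alpha[k] = V[:, k] @ r
    r -= alpha[k] * V[:, k]
    b = np.linalg.norm(r); beta[k] = b
T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
theta1 = np.linalg.eigvalsh(T)[0]

tan_delta = math.sqrt(1.0 - v[0] ** 2) / abs(v[0])   # tan d(u^1, v^1)
rho1 = (lam[1] - lam[0]) / (lam[-1] - lam[1])
c = math.cosh((m - 1) * math.acosh(1.0 + 2.0 * rho1))  # c_{m-1}(1+2 rho_1)
bound = (lam[-1] - lam[0]) * (tan_delta / c) ** 2

print(theta1 - lam[0], bound)           # actual error vs. Kaniel-Paige bound
```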
Lanczos Method
Proof
The left inequality follows from Rayleigh’s principle.
We have

θ_1^{(m)} = min_{x ∈ K_m(v^1,A), x ≠ 0} (x^H A x)/(x^H x),

and, since each x ∈ K_m(v^1, A) can be represented as x = q(A) v^1 for some
q ∈ Π_{m−1}, it follows

θ_1^{(m)} − λ_1 = min_{x ∈ K_m(v^1,A), x ≠ 0} (x^H (A − λ_1 I) x)/(x^H x)
              = min_{q ∈ Π_{m−1}, q ≠ 0} [(v^1)^H q(A)^H (A − λ_1 I) q(A) v^1] / [(v^1)^H q(A)^2 v^1].
Lanczos Method
Proof ct.
With v^1 = Σ_{j=1}^n α_j u^j it holds

θ_1^{(m)} − λ_1 = min_{q ∈ Π_{m−1}, q ≠ 0} [Σ_{j=2}^n (λ_j − λ_1) |α_j q(λ_j)|^2] / [Σ_{j=1}^n |α_j q(λ_j)|^2]
≤ (λ_n − λ_1) min_{q ∈ Π_{m−1}, q ≠ 0} [Σ_{j=2}^n |α_j q(λ_j)|^2] / |α_1 q(λ_1)|^2
≤ (λ_n − λ_1) min_{q ∈ Π_{m−1}, q ≠ 0} max_{j=2,...,n} (|q(λ_j)|^2/|q(λ_1)|^2) · (Σ_{j=2}^n |α_j|^2) / |α_1|^2.

Defining p(λ) := q(λ)/q(λ_1), and observing that the set of all p obtained as q
ranges over Π_{m−1} is the set of all polynomials of degree at most m − 1
satisfying the constraint p(λ_1) = 1, we get (note Σ_{j=2}^n |α_j|^2 / |α_1|^2 = tan^2 δ(u^1, v^1))

θ_1^{(m)} − λ_1 ≤ (λ_n − λ_1) · tan^2 δ(u^1, v^1) · min_{p ∈ Π_{m−1}, p(λ_1)=1} max_{λ ∈ [λ_2, λ_n]} |p(λ)|^2,

and the statement follows from Theorem 10.2. □
Lanczos Method
Example
Matrix A has eigenvalue λ_1 = 1 and 99 eigenvalues uniformly distributed in [α, β].

       [α, β] = [20, 100]           [α, β] = [2, 100]
 it.   error        bound           error        bound
  2    3.3890e+001  3.4832e+003     1.9123e+001  1.1385e+005
  3    1.9708e+001  1.0090e+002     1.0884e+001  7.5862e+004
  4    6.6363e+000  1.5083e+001     5.8039e+000  5.5241e+004
  5    1.5352e+000  2.2440e+000     4.1850e+000  3.8224e+004
 10    7.9258e-005  1.6293e-004     1.8948e+000  4.2934e+003
 15    4.3730e-009  1.1827e-008     1.0945e+000  4.1605e+002
 20    2.9843e-013  8.5861e-013     1.6148e-001  3.9725e+001
 25    —            —               1.4843e-002  3.7876e+000
 30    —            —               1.6509e-003  3.6109e-001
 35    —            —               2.0658e-004  3.4423e-002
 40    —            —               5.8481e-006  3.2816e-003
 45    —            —               9.5770e-007  3.1285e-004
 50    —            —               3.0155e-009  2.9824e-005
 55    —            —               8.0487e-012  2.8432e-006
 60    —            —               3.1530e-014  2.7105e-007
Lanczos Method
THEOREM 10.5 (Kaniel & Paige; higher eigenvalues)
Under the conditions of Theorem 10.4 it holds

0 ≤ θ_j^{(m)} − λ_j ≤ (λ_n − λ_1) · (κ_j^{(m)} tan δ(v^1, u^j) / c_{m−j}(1 + 2ρ_j))^2

with

ρ_j = (λ_{j+1} − λ_j)/(λ_n − λ_{j+1})

and

κ_1^{(m)} ≡ 1,  κ_j^{(m)} = ∏_{i=1}^{j−1} (λ_n − θ_i^{(m)})/(λ_j − θ_i^{(m)}).

The general case j > 1 can be proved by using the maxmin characterization
of θ_j^{(m)} of Courant and Fischer.

Analogous results hold for the largest eigenvalues and Ritz values.
Lanczos Method
Orthogonality of basis vectors
In exact arithmetic the Lanczos method generates an orthonormal basis of the
Krylov space K_m(v^1, A). In the algorithm only the orthogonality with respect to
the two basis vectors v^j and v^{j−1} obtained in the last two steps is enforced;
orthogonality with respect to the previous vectors v^i follows from the symmetry
of A.

It can be shown that in floating point arithmetic the orthogonality is destroyed
as soon as a sequence θ_i^{(j)}, j = 1, 2, . . . , of Ritz values has converged to an
eigenvalue λ of A, i.e. when the residual β_j s_{j,i}^{(j)} has become small.

Thereafter all vectors v^j acquire a component in the direction of the eigenspace
of the converged eigenvalue, and a duplicate copy of that eigenvalue shows up
in the spectrum of the tridiagonal matrix T_m.

This effect was first observed and studied by Paige (1971). A detailed
discussion is contained in the monograph of Parlett (1998).
Lanczos Method
Orthogonality of basis vectors ct.
Note that these multiple Ritz values have nothing to do with possible multiple
eigenvalues of the given matrix; they occur simply as a result of a converged
eigenvalue.

For multiple eigenvalues the Lanczos (and Arnoldi) method in exact arithmetic
can only detect one eigenvector, namely the projection of the initial vector v^1
onto the corresponding eigenspace. Further eigenvectors can only be obtained
by restarting with a different initial vector or by a block Lanczos (Arnoldi)
method.

A simple trick to detect duplicate copies of an eigenvalue, advocated by
Cullum & Willoughby (1986), is the following.

Compute the eigenvalues of the reduced matrix T̃_m ∈ R^{(m−1)×(m−1)} obtained
from T_m by deleting the first row and column. Those eigenvalues of T_m that
differ by less than a small multiple of machine precision from an eigenvalue of
T̃_m are the unwanted ones, i.e. the ones due to loss of orthogonality.
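The test is mechanical: compare the spectrum of T_m with that of T_m after deleting its first row and column. The sketch below applies it to an artificial 3×3 tridiagonal matrix whose tiny coupling mimics a converged, duplicated Ritz value (the construction is mine, not from the slides):

```python
import numpy as np

def spurious(T, tol=1e-10):
    """Cullum-Willoughby test: eigenvalues of T that coincide (up to tol)
    with an eigenvalue of T with its first row/column deleted are flagged
    as spurious."""
    ev = np.linalg.eigvalsh(T)
    ev_red = np.linalg.eigvalsh(T[1:, 1:])
    return [t for t in ev if np.min(np.abs(ev_red - t)) < tol]

# artificial "T_m": the tiny off-diagonal 1e-13 decouples the leading
# entry, so the eigenvalues of the trailing 2x2 block appear in both spectra
T = np.array([[2.0, 1e-13, 0.0],
              [1e-13, 5.0, 1.0],
              [0.0, 1.0, 7.0]])
print(spurious(T))                 # the two trailing-block eigenvalues
print(np.linalg.eigvalsh(T))       # the genuine eigenvalue near 2 is kept
```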
Lanczos Method
Example
Convergence of the Lanczos process for A = diag(rand(100,1))