Large Sparse Linear System · sds.postech.ac.kr/.../2020/08/large-sparse-linear-system.pdf · 2020. 8....

Pohang University of Science and Technology (POSTECH), Department of Industrial and Management Engineering. Large Sparse Linear System. JeYong Lee, Statistics and Data Science Lab. August 5, 2020




Contents

1. Before we study the iterative method

2. Krylov subspace method

3. Arnoldi process

4. Lanczos process

5. GMRES

6. Next seminar


Before we study the iterative method

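• Gram-Schmidt process

A method for orthonormalizing a set of vectors.

Given linearly independent vectors $v_1, v_2, v_3, \dots$, we can construct orthonormal vectors $e_1, e_2, e_3, \dots$ such that $e_1, \dots, e_k$ span the same subspace as $v_1, \dots, v_k$ for every $k$.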
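For reference, the construction can be written out as follows. The projection notation $\operatorname{proj}_u(v)$ and the intermediate vectors $u_k$ are assumptions of this sketch; the slide itself only names the inputs $v_k$ and the outputs $e_k$.

```latex
\operatorname{proj}_{u}(v) = \frac{\langle v, u \rangle}{\langle u, u \rangle}\, u,
\qquad
u_k = v_k - \sum_{j=1}^{k-1} \operatorname{proj}_{u_j}(v_k),
\qquad
e_k = \frac{u_k}{\lVert u_k \rVert},
\qquad k = 1, 2, 3, \dots
```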


• Modified Gram-Schmidt process

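Instead of computing the vector $u_k$ in a single step as

$$u_k = v_k - \operatorname{proj}_{u_1}(v_k) - \operatorname{proj}_{u_2}(v_k) - \dots - \operatorname{proj}_{u_{k-1}}(v_k), \qquad \operatorname{proj}_u(v) = \frac{\langle v, u \rangle}{\langle u, u \rangle} u,$$

it is computed sequentially, to reduce the effect of rounding errors [1]:

$$u_k^{(1)} = v_k - \operatorname{proj}_{u_1}(v_k), \quad u_k^{(2)} = u_k^{(1)} - \operatorname{proj}_{u_2}(u_k^{(1)}), \quad \dots, \quad u_k^{(k-1)} = u_k^{(k-2)} - \operatorname{proj}_{u_{k-1}}(u_k^{(k-2)}), \quad u_k = u_k^{(k-1)}$$

[1] The Gram-Schmidt process, Wikipedia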


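Algorithm: Modified Gram–Schmidt [1]

1. For i = 1 to n
2.     $v_i = a_i$
3.     For j = 1 to i − 1 (orthogonalization)
4.         $r_{ji} = \langle v_i, q_j \rangle$
5.         $v_i = v_i - r_{ji} q_j$
6.     $r_{ii} = \lVert v_i \rVert$ (normalizing)
7.     $q_i = v_i / r_{ii}$

For i = 1 the inner loop is empty, so $q_1 = a_1 / \lVert a_1 \rVert$. The coefficients are stored as $r_{ji}$ with $j \le i$, so that the matrix $R$ is upper triangular.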

[1] Trefethen and Bau. (1997). Numerical Linear Algebra

CGS : 9.1852e-12 MGS : 8.3750e-14

CGS : 2.9912 MGS : 2.1554e-11
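The difference between the two variants can be reproduced with a short NumPy sketch. The function names, the Hilbert test matrix, and the error measure $\lVert Q^T Q - I \rVert$ are assumptions chosen for illustration; they are not necessarily the setup behind the numbers quoted on the slide.

```python
import numpy as np

def classical_gram_schmidt(A):
    """QR by classical Gram-Schmidt: coefficients come from the ORIGINAL column a_i."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for i in range(n):
        v = A[:, i].copy()
        for j in range(i):
            R[j, i] = Q[:, j] @ A[:, i]      # uses the original a_i
            v -= R[j, i] * Q[:, j]
        R[i, i] = np.linalg.norm(v)
        Q[:, i] = v / R[i, i]
    return Q, R

def modified_gram_schmidt(A):
    """QR by modified Gram-Schmidt: coefficients come from the UPDATED vector v."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for i in range(n):
        v = A[:, i].copy()
        for j in range(i):
            R[j, i] = Q[:, j] @ v            # uses the already-updated v
            v -= R[j, i] * Q[:, j]
        R[i, i] = np.linalg.norm(v)
        Q[:, i] = v / R[i, i]
    return Q, R

# Ill-conditioned test matrix (8 x 8 Hilbert matrix), an assumed example.
n = 8
A = 1.0 / (np.arange(1, n + 1)[:, None] + np.arange(n)[None, :])

Q_cgs, R_cgs = classical_gram_schmidt(A)
Q_mgs, R_mgs = modified_gram_schmidt(A)

# Loss of orthogonality, measured as ||Q^T Q - I||.
err_cgs = np.linalg.norm(Q_cgs.T @ Q_cgs - np.eye(n))
err_mgs = np.linalg.norm(Q_mgs.T @ Q_mgs - np.eye(n))
print(f"CGS: {err_cgs:.4e}  MGS: {err_mgs:.4e}")
```

Both variants reproduce $A = QR$ to working precision; they differ in how well the computed columns of $Q$ stay orthogonal.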


• CGS vs MGS in the sense of the numerical stability

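CGS: $v_j = v_j - (v_k^T x_j)\, v_k$ vs. MGS: $v_k = v_k - (v_j^T v_k)\, v_j$

CGS computes each projection coefficient from the original vector $x_j$, whereas MGS computes it from the vector that has already been updated by the earlier projections.

Now consider the orthogonalization process of the two methods.

Suppose an error is made in computing $q_2$ in CGS, so that $q_1^T q_2 = \delta$ is small but nonzero. This error is never corrected in the later steps, and it accumulates.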

[Figure: CGS vs. MGS, comparing orthogonality with respect to $q_1$ and $q_2$]
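A three-vector calculation makes the argument concrete. It is an idealized sketch: $q_1$ and $q_2$ are assumed to be unit vectors with $q_1^T q_2 = \delta$, all later arithmetic is assumed exact, and $v_3$ denotes the third vector before normalization.

```latex
\text{CGS:}\quad v_3 = a_3 - (q_1^T a_3)\, q_1 - (q_2^T a_3)\, q_2
\;\Longrightarrow\;
q_2^T v_3 = -(q_1^T a_3)\,\delta, \qquad
q_1^T v_3 = -(q_2^T a_3)\,\delta

\text{MGS:}\quad v_3^{(1)} = a_3 - (q_1^T a_3)\, q_1, \quad
v_3 = v_3^{(1)} - (q_2^T v_3^{(1)})\, q_2
\;\Longrightarrow\;
q_2^T v_3 = 0, \qquad
q_1^T v_3 = -(q_2^T v_3^{(1)})\,\delta
```

In this model the CGS result is non-orthogonal to both $q_1$ and $q_2$, while the MGS result stays orthogonal to $q_2$ and carries the error only in the $q_1$ direction.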


• QR factorization

[1] Trefethen and Bau. (1997). Numerical Linear Algebra

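Assume that $A \in \mathbb{C}^{m \times n}$ ($m \ge n$) has full column rank $n$.

In many applications we are interested in the column spaces of the matrix $A$, that is, the successive spaces spanned by its columns $a_1, a_2, \dots$.

We want the sequence $q_1, q_2, q_3, \dots$ to have the property

$$\langle a_1, a_2, \dots, a_j \rangle = \langle q_1, q_2, \dots, q_j \rangle, \quad j = 1, \dots, n.$$

In matrix form,

$$\begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix} = \begin{bmatrix} q_1 & q_2 & \cdots & q_n \end{bmatrix} \begin{bmatrix} r_{11} & \cdots & r_{1n} \\ & \ddots & \vdots \\ 0 & & r_{nn} \end{bmatrix},$$

where the diagonal entries $r_{kk}$ are nonzero.

Thus $a_1, a_2, \dots, a_n$ can be expressed as linear combinations of the orthonormal vectors $q_1, q_2, \dots, q_n$ [1]:

$$a_1 = r_{11} q_1, \quad a_2 = r_{12} q_1 + r_{22} q_2, \quad \dots, \quad a_n = r_{1n} q_1 + r_{2n} q_2 + \dots + r_{nn} q_n$$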
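The nested-span property can be checked numerically. The sketch below uses `numpy.linalg.qr` on a random test matrix; the matrix size, the seed, and the rank-based span test are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))   # m = 6 >= n = 4; full column rank with probability 1

Q, R = np.linalg.qr(A)            # reduced QR: Q is 6 x 4, R is 4 x 4 upper triangular

# <a_1..a_j> = <q_1..q_j> holds if stacking both sets does not raise the rank above j.
same_span = []
for j in range(1, A.shape[1] + 1):
    stacked = np.hstack([A[:, :j], Q[:, :j]])
    same_span.append(np.linalg.matrix_rank(stacked) == j)

print(same_span)
```

Because $R$ is upper triangular, column $a_j$ only involves $q_1, \dots, q_j$, which is exactly what the rank test confirms.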


Direct and iterative methods [1]

• Direct method
    • Solves the problem by a finite sequence of operations
    • In the absence of rounding errors, it would deliver an exact solution
    • Operates directly on the elements of a matrix
    • Requires $O(m^3)$ work for a general matrix $A \in \mathbb{C}^{m \times m}$

• Iterative method
    • Solves the problem by finding successive approximations to the solution, starting from an initial guess
    • Useful even for linear problems involving a large number of variables, where direct methods would be prohibitively expensive
    • Can exploit sparsity structure and operate in $O(m^2)$ work

[1] Trefethen and Bau. (1997). Numerical Linear Algebra


• Exploiting sparsity in the product $A \cdot x$ [1]

Let $A$ be an $n \times n$ matrix and $x$ an $n \times 1$ vector.

Dense case:
• Row 1: $a_{11} x_1 + \dots + a_{1n} x_n$ costs $2n - 1$ flops
• Total flops: $2n^2 - n$

Sparse case:
• Row 1: $a_{11} x_1 + 0 \cdot x_2 + \dots + 0 \cdot x_n$ costs $2N(A)_1 - 1$ flops, where $N(A)_1$ is the number of nonzero elements in row 1
• Total flops: $2N(A) - n$

$N(A)$ is the number of nonzero elements of the sparse matrix.

In many practical cases the notation $\nu$, the number of nonzero elements per row, is used instead. [1]

[1] E. Chow and Y. Saad. (2014). Preconditioned Krylov Subspace Methods for Sampling Multivariate Gaussian Distributions. SIAM Journal on Scientific Computing.
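The flop counts above can be checked with a hand-written compressed sparse row (CSR) product. This is a minimal sketch: the helper names `dense_to_csr` and `csr_matvec`, the CSR layout, and the tridiagonal test matrix are assumptions made for illustration, and the flop counter assumes every multiplication and addition counts as one flop.

```python
import numpy as np

def dense_to_csr(A):
    """Convert a dense matrix to CSR arrays: values, column indices, row pointers."""
    data, indices, indptr = [], [], [0]
    for row in A:
        nz = np.nonzero(row)[0]
        data.extend(row[nz])
        indices.extend(nz)
        indptr.append(len(data))
    return np.array(data, dtype=float), np.array(indices, dtype=int), np.array(indptr)

def csr_matvec(data, indices, indptr, x):
    """Compute y = A x touching only the stored nonzeros, and count the flops."""
    n = len(indptr) - 1
    y = np.zeros(n)
    flops = 0
    for i in range(n):
        start, end = indptr[i], indptr[i + 1]
        nnz_row = end - start
        y[i] = data[start:end] @ x[indices[start:end]]
        if nnz_row > 0:
            flops += 2 * nnz_row - 1   # nnz_row multiplications, nnz_row - 1 additions
    return y, flops

n = 5
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # tridiagonal, N(A) = 13
x = np.arange(1.0, n + 1)

data, indices, indptr = dense_to_csr(A)
y, sparse_flops = csr_matvec(data, indices, indptr, x)
dense_flops = 2 * n**2 - n

print(sparse_flops, dense_flops)   # 2 N(A) - n versus 2 n^2 - n
```

For this matrix the sparse product needs $2 \cdot 13 - 5 = 21$ flops against $2 \cdot 25 - 5 = 45$ for the dense one; the gap widens quickly as $n$ grows with a fixed number of nonzeros per row.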


Krylov Subspace

• The definition of the Krylov subspace

Def. Krylov subspace [1]

The order-$r$ Krylov subspace is the linear subspace spanned by the images of $b$ under the first $r$ powers of $A$ (starting from $A^0 = I$):

$$K_r(A, b) = \operatorname{span}\{b, Ab, A^2 b, \dots, A^{r-1} b\}$$

• Why Krylov subspace? [2]
    • Assume you have to solve the linear equation $Ax = b$ when $A$ is large and sparse.
    • Solving this system by Gaussian elimination requires $O(n^3)$ operations.
    • Matrix-vector multiplications, however, can be computed much more cheaply.
    • So $K_n$ is not difficult to handle even when $A$ is very large.

    • We will use iterative methods based on the Krylov subspace to solve the linear system.
    • The Arnoldi process is a well-known algorithm that underlies many of the algorithms that follow.
    • Our final goal is to reduce the number of operations from $O(n^3)$ to $O(n^2)$.

[1] Krylov subspace, Wikipedia

[2] Ilse C.F. Ipsen and Carl D. Meyer. (1997). The Idea Behind Krylov Methods. The American Mathematical Monthly, 105(10).
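The spanning vectors of a Krylov subspace can be generated with matrix-vector products only, without ever forming a power of $A$. The function name `krylov_matrix` and the small tridiagonal example are assumptions made for this sketch.

```python
import numpy as np

def krylov_matrix(A, b, r):
    """Return the n x r matrix [b, Ab, ..., A^(r-1) b] using r - 1 mat-vec products."""
    n = b.shape[0]
    K = np.zeros((n, r))
    K[:, 0] = b
    for k in range(1, r):
        K[:, k] = A @ K[:, k - 1]   # reuse the previous column instead of forming A^k
    return K

n = 6
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)

K = krylov_matrix(A, b, 3)
print(K)
```

The columns of this matrix are usually far from orthogonal, which is one motivation for building an orthonormal basis of the same subspace instead.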


Arnoldi process

• The algorithm of the Arnoldi process

[1] Saad, Y. (2003). Iterative Methods for Sparse Linear Systems. SIAM, 2nd edition.

Algorithm. Arnoldi process [1]

1. Choose a vector $v_1$ such that $\lVert v_1 \rVert_2 = 1$
2. For j = 1, 2, …, m, Do
3.     Compute $h_{ij} = \langle A v_j, v_i \rangle$ for i = 1, 2, …, j
4.     Compute $w_j = A v_j - \sum_{i=1}^{j} h_{ij} v_i$
5.     $h_{j+1,j} = \lVert w_j \rVert_2$
6.     If $h_{j+1,j} = 0$ then Stop
7.     $v_{j+1} = w_j / h_{j+1,j}$
8. EndDo

Steps 3 and 4 are the same as CGS:
• Each $A v_j$ is a given vector to be orthogonalized
• Each $v_i$ is an orthonormal vector; together they form a basis of the Krylov subspace


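Algorithm. Arnoldi process (MGS) [1]

1. Choose a vector $v_1$ such that $\lVert v_1 \rVert_2 = 1$
2. For j = 1, 2, …, m, Do
3.     Compute $w_j = A v_j$
4.     For i = 1, 2, …, j, Do
5.         $h_{ij} = \langle w_j, v_i \rangle$
6.         $w_j = w_j - h_{ij} v_i$
7.     EndDo
8.     $h_{j+1,j} = \lVert w_j \rVert_2$. If $h_{j+1,j} = 0$ then Stop
9.     $v_{j+1} = w_j / h_{j+1,j}$
10. EndDo

Steps 3 to 7 are the same as MGS: each coefficient $h_{ij}$ is computed from the already-updated $w_j$.
• Each $A v_j$ is a given vector to be orthogonalized
• Each $v_i$ is an orthonormal vector; together they form a basis of the Krylov subspace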
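A minimal NumPy sketch of the modified Gram-Schmidt variant follows. It assumes a real matrix, and the function name `arnoldi_mgs`, the return convention on breakdown, and the random test matrix are illustrative choices rather than anything taken from [1].

```python
import numpy as np

def arnoldi_mgs(A, v1, m):
    """Run m Arnoldi steps (MGS variant) on a real matrix A.

    Returns V (n x (m+1)) with orthonormal columns and the (m+1) x m
    Hessenberg matrix H. On breakdown at step j, returns V_j and the
    square j x j matrix H_j instead.
    """
    n = v1.shape[0]
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v1 / np.linalg.norm(v1)
    for j in range(m):
        w = A @ V[:, j]                    # new direction to orthogonalize
        for i in range(j + 1):
            H[i, j] = w @ V[:, i]          # coefficient from the updated w (MGS)
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] == 0.0:             # exact test as in the algorithm; real codes often use a tolerance
            return V[:, : j + 1], H[: j + 1, : j + 1]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8))
v1 = rng.standard_normal(8)
m = 5

V, H = arnoldi_mgs(A, v1, m)
print(np.linalg.norm(A @ V[:, :m] - V @ H))   # residual of A V_m = V_{m+1} H
```

The printed residual should be at the level of rounding error, and $V^T V$ should be close to the identity.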


The Details of the Arnoldi process

[1] Saad, Y. (2003). Iterative Methods for Sparse Linear Systems. SIAM, 2nd edition.

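1. $K_m = \langle b, Ab, A^2 b, \dots, A^{m-1} b \rangle = \langle v_1, v_2, \dots, v_m \rangle$, with $v_1 = b / \lVert b \rVert$

Assume that the Arnoldi process does not stop before the $m$th step. Then the vectors $\{v_1, v_2, \dots, v_m\}$ form an orthonormal basis of the Krylov subspace $K_m(A, v_1)$. The Arnoldi process can therefore be described as the systematic construction of orthonormal bases for successive Krylov subspaces.

2. $V_m = [\, v_1 \;\; v_2 \;\; \cdots \;\; v_m \,]$ is the $n \times m$ matrix whose columns are the vectors $v_i$, and $\bar{H}_m$ is the $(m+1) \times m$ Hessenberg matrix with entries $h_{ij}$.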


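Recall the orthogonalization step in the Arnoldi process:

$$h_{j+1,j} v_{j+1} = A v_j - (h_{1j} v_1 + h_{2j} v_2 + \dots + h_{jj} v_j) \quad \text{for } j = 1, 2, \dots, m$$

Rearranging gives

$$A v_j = h_{1j} v_1 + h_{2j} v_2 + \dots + h_{j+1,j} v_{j+1} = \sum_{i=1}^{j+1} h_{ij} v_i \quad \text{for } j = 1, 2, \dots, m$$

In matrix form,

$$A V_m = V_{m+1} \bar{H}_m = V_m H_m + h_{m+1,m} v_{m+1} e_m^T, \qquad \bar{H}_m = \begin{bmatrix} H_m \\ h_{m+1,m} e_m^T \end{bmatrix}, \qquad H_m = \begin{bmatrix} I_{m \times m} & 0_{m \times 1} \end{bmatrix} \bar{H}_m$$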


The Details of the Arnoldi process [1][2]

[1] Saad, Y. (2003). Iterative Methods for Sparse Linear Systems. SIAM, 2nd edition.

[2] Trefethen and Bau. (1997). Numerical Linear Algebra

• $h_{n+1,n}$ works as the stopping criterion: the Arnoldi process breaks down at step $n$ when $h_{n+1,n} = 0$.

• Breakdown means that $K_n$ is an invariant subspace of $A$, i.e. $A K_n \subseteq K_n$.
    • It follows that $K_n = K_{n+1} = K_{n+2} = \dots$
    • The subspace has reached the point where it cannot be extended any further.

• Each eigenvalue of $H_n$ is an eigenvalue of $A$.

• If $A$ is nonsingular, then the solution $x$ of the system of equations $Ax = b$ lies in $K_n$.

This is why we consider the Arnoldi process to be the underlying algorithm of the iterative methods that use the Krylov subspace.
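These breakdown properties can be observed on a small example in which $b$ lies in a two-dimensional invariant subspace. The sketch below is self-contained; the function name `arnoldi`, the breakdown tolerance `tol`, and the diagonal test matrix are assumptions chosen so that the breakdown happens at step 2.

```python
import numpy as np

def arnoldi(A, b, max_steps, tol=1e-12):
    """Arnoldi (MGS variant) that stops when h_{j+1,j} <= tol (the tolerance is an assumption).

    Returns V_k, the square matrix H_k, and the breakdown step k (None if no breakdown).
    """
    n = b.shape[0]
    V = np.zeros((n, max_steps + 1))
    H = np.zeros((max_steps + 1, max_steps))
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(max_steps):
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = w @ V[:, i]
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] <= tol:
            k = j + 1
            return V[:, :k], H[:k, :k], k
        V[:, j + 1] = w / H[j + 1, j]
    return V[:, :max_steps], H[:max_steps, :max_steps], None

A = np.diag([1.0, 2.0, 3.0, 4.0])
b = np.array([1.0, 1.0, 0.0, 0.0])      # lies in the invariant subspace span{e_1, e_2}

V, H, steps = arnoldi(A, b, max_steps=4)

eigs_H = np.sort(np.linalg.eigvals(H).real)   # eigenvalues of H_k
x = np.linalg.solve(A, b)                     # exact solution of A x = b
x_in_K = V @ (V.T @ x)                        # projection of x onto K_k

print(steps, eigs_H, np.linalg.norm(x - x_in_K))
```

Here the process stops after two steps, the eigenvalues of $H_2$ are the eigenvalues 1 and 2 of $A$, and projecting the exact solution onto $K_2$ leaves it unchanged.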


Lanczos process

• The Motivation of the Lanczos process [1][2]

[1] Saad, Y. (2003). Iterative Methods for Sparse Linear Systems. SIAM, 2nd edition.

[2] Trefethen and Bau. (1997). Numerical Linear Algebra

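Assume that the Arnoldi process is applied to a real symmetric matrix $A$ (that is, some structure is imposed on $A$). Then $H_m = V_m^T A V_m$ is both symmetric and upper Hessenberg, so the Hessenberg matrix reduces to a tridiagonal matrix.

This means that the $(n+1)$-term recurrence (at step $n$) reduces to a three-term recurrence, which is much cheaper:

Arnoldi: $A v_j = h_{1j} v_1 + h_{2j} v_2 + \dots + h_{j+1,j} v_{j+1}$

Lanczos: $A v_j = \beta_j v_{j-1} + \alpha_j v_j + \beta_{j+1} v_{j+1}$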


• The algorithm of the Lanczos process [1][2]


Algorithm. Lanczos process [1]

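1. Choose an initial vector $v_1$ such that $\lVert v_1 \rVert_2 = 1$. Set $\beta_1 \equiv 0$, $v_0 \equiv 0$
2. For j = 1, 2, …, m, Do
3.     $w_j := A v_j - \beta_j v_{j-1}$
4.     $\alpha_j := \langle w_j, v_j \rangle$
5.     $w_j := w_j - \alpha_j v_j$
6.     $\beta_{j+1} := \lVert w_j \rVert_2$. If $\beta_{j+1} = 0$ then Stop
7.     $v_{j+1} := w_j / \beta_{j+1}$
8. EndDo

Numerical stability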
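The three-term recurrence translates almost line by line into NumPy. The sketch assumes a real symmetric matrix and performs no reorthogonalization; the function name `lanczos`, the zero-based storage of $\alpha_j$ and $\beta_j$, and the random test matrix are illustrative assumptions.

```python
import numpy as np

def lanczos(A, v1, m):
    """Run m Lanczos steps on a real symmetric A.

    Returns V_k (n x k) and the k x k tridiagonal matrix T_k, where k = m
    unless the process breaks down earlier.
    """
    n = v1.shape[0]
    V = np.zeros((n, m + 1))
    alpha = np.zeros(m)        # alpha[j] stores alpha_{j+1}
    beta = np.zeros(m + 1)     # beta[j] stores beta_{j+1}; beta[0] = beta_1 = 0
    V[:, 0] = v1 / np.linalg.norm(v1)
    k = m
    for j in range(m):
        w = A @ V[:, j]
        if j > 0:
            w = w - beta[j] * V[:, j - 1]   # subtract beta_j v_{j-1}
        alpha[j] = w @ V[:, j]
        w = w - alpha[j] * V[:, j]
        beta[j + 1] = np.linalg.norm(w)
        if beta[j + 1] == 0.0:              # breakdown: invariant subspace found
            k = j + 1
            break
        V[:, j + 1] = w / beta[j + 1]
    off = beta[1:k]
    T = np.diag(alpha[:k]) + np.diag(off, 1) + np.diag(off, -1)
    return V[:, :k], T

rng = np.random.default_rng(0)
M = rng.standard_normal((10, 10))
A = M + M.T                      # real symmetric test matrix
v1 = rng.standard_normal(10)

V, T = lanczos(A, v1, m=4)
print(np.linalg.norm(V.T @ A @ V - T))
```

Only the two most recent basis vectors enter each step, which is where the saving over the Arnoldi process comes from. For larger numbers of steps the computed vectors can lose orthogonality in floating-point arithmetic, so this check is only expected to hold cleanly for small `m`.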


GMRES

• GMRES (Generalized Minimal RESidual) [1][2]

[1] Saad, Y. (2003). Iterative Methods for Sparse Linear Systems. SIAM, 2nd edition.

[2] Trefethen and Bau. (1997). Numerical Linear Algebra

Now we consider how the Arnoldi process can be used to solve the system of equations $Ax = b$. The standard algorithm of this kind for nonsymmetric systems is known as GMRES.

Idea: approximate the exact solution $x_*$ by the vector $x_n \in K_n$ that minimizes the norm of the residual $r_n = b - A x_n$. This is a least squares problem.

Since $x_n \in K_n$, it can be represented as a linear combination of the columns of the Krylov matrix $K_n = [\, b \;\; Ab \;\; \cdots \;\; A^{n-1} b \,]$ or of the orthonormal basis $\{v_1, v_2, \dots, v_n\}$.

Thus the problem is to find a coefficient vector $c \in \mathbb{C}^n$ such that

$$c = \arg\min_{c} \lVert A K_n c - b \rVert$$

Considering $x_n = V_n y$ instead of $K_n c$, where $y \in \mathbb{C}^n$, the problem becomes

$$y = \arg\min_{y} \lVert A V_n y - b \rVert$$


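$$\arg\min_{y} \lVert A V_n y - b \rVert = \arg\min_{y} \lVert V_{n+1} \bar{H}_n y - b \rVert$$

since $A V_n = V_{n+1} \bar{H}_n$,

$$= \arg\min_{y} \lVert \bar{H}_n y - V_{n+1}^* b \rVert$$

since both vectors lie in the column space of $V_{n+1}$, whose columns are orthonormal, so multiplying by $V_{n+1}^*$ does not change the norm,

$$= \arg\min_{y} \big\lVert \bar{H}_n y - \lVert b \rVert e_1 \big\rVert$$

since we set the initial vector $v_1 = b / \lVert b \rVert$, so that

$$V_{n+1}^* b = \begin{bmatrix} v_1^* \\ \vdots \\ v_{n+1}^* \end{bmatrix} \lVert b \rVert v_1 = \lVert b \rVert e_1.$$

At step $n$ of GMRES we solve this problem for $y$, then set $x_n = V_n y$.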


Algorithm. GMRES [2]

1. $v_1 = b / \lVert b \rVert$
2. For n = 1, 2, … Do
3.     < step n of Arnoldi iteration >
4.     Find $y$ to minimize $\big\lVert \bar{H}_n y - \lVert b \rVert e_1 \big\rVert$ $(= \lVert r_n \rVert)$
5.     $x_n = V_n y$
6. EndDo
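The algorithm can be sketched in NumPy by combining an Arnoldi step with a small dense least squares solve. This is an unrestarted, unpreconditioned illustration with initial guess zero and nonzero $b$; the function name `gmres_sketch`, the parameters `max_iter` and `tol`, the breakdown threshold, and the use of `numpy.linalg.lstsq` for the small problem are assumptions of this sketch (practical implementations usually update a QR factorization of the Hessenberg matrix with Givens rotations instead).

```python
import numpy as np

def gmres_sketch(A, b, max_iter, tol=1e-10):
    """Unrestarted GMRES with x_0 = 0 for a real matrix A and nonzero b."""
    n = b.shape[0]
    beta = np.linalg.norm(b)
    V = np.zeros((n, max_iter + 1))
    H = np.zeros((max_iter + 1, max_iter))
    V[:, 0] = b / beta                        # step 1: v_1 = b / ||b||
    x = np.zeros(n)
    residuals = []
    for k in range(max_iter):
        # Step k+1 of the Arnoldi iteration (MGS variant).
        w = A @ V[:, k]
        for i in range(k + 1):
            H[i, k] = w @ V[:, i]
            w = w - H[i, k] * V[:, i]
        H[k + 1, k] = np.linalg.norm(w)
        breakdown = H[k + 1, k] <= 1e-14 * beta
        if not breakdown:
            V[:, k + 1] = w / H[k + 1, k]
        # Small least squares problem: minimize || H y - ||b|| e_1 ||.
        Hk = H[: k + 2, : k + 1]
        rhs = np.zeros(k + 2)
        rhs[0] = beta
        y, *_ = np.linalg.lstsq(Hk, rhs, rcond=None)
        x = V[:, : k + 1] @ y                 # x_n = V_n y
        residuals.append(np.linalg.norm(Hk @ y - rhs))
        if residuals[-1] <= tol * beta or breakdown:
            break
    return x, residuals

rng = np.random.default_rng(0)
n = 20
A = 4.0 * np.eye(n) + 0.3 * rng.standard_normal((n, n))   # nonsymmetric test matrix
b = rng.standard_normal(n)

x, residuals = gmres_sketch(A, b, max_iter=n)
print(len(residuals), np.linalg.norm(A @ x - b))
```

The residual norm is read off the small least squares problem, so no extra product with $A$ is needed to monitor convergence, and the recorded residual norms never increase from one step to the next.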


FOM (Full Orthogonalization Method)

[1] Saad, Y. (2003). Iterative Methods for Sparse Linear Systems. SIAM, 2nd edition.

[2] Trefethen and Bau. (1997). Numerical Linear Algebra

Algorithm. FOM [1]

• FOM based on GMRES [1][2]


Next Seminar


• Conjugate Gradient Method
    • For symmetric positive definite systems $Ax = b$
    • GMRES, in contrast, is the method for a general matrix $A$

• Preconditioning
    • For fast convergence: a transformation that conditions a given problem into a form more suitable for numerical solution methods
    • Reduces the condition number and repositions the spectrum of the matrix $A$

• Application from a related paper
    • Sampling random multivariate Gaussian samples