cmsi計算科学技術特論a(10) 行列計算における高速アルゴリズム1
TRANSCRIPT
(1/50)
(2/50)
はじめに
(3/50)
1
•1
A Rm m Ab Rm x Rm
A
•
shift-and invert Lanczos -
Ax = b
(4/50)
•
•LU
CG
(5/50)
•A = LU LU Ly = b Ux = y
ALU
A A = LLT
•
1 LU b
, , Vol. 11, No. 4, pp. 14 – 18 (2006).
(6/50)
•x* x(0), x(1), x(2), …
•
A ApA
#
(7/50)
•x(n+1) = Cx(n) + d C d
x* C dA = M – N M C = M–1N d = M–1b
1 C- SOR ADI
•x(n) x(n+1) x(n)
(8/50)
•A Rm m ATA m
0
1(A) 2(A) m(A)AA m(A) > 0
•A Rm m (A) = 1(A)/ m(A) A
11 2
# r(n) = Ax(n) – b x(n)
#
(9/50)
クリロフ部分空間と射影
(10/50)
•A Rm m b Rm Rm
Kn(A; b) A b n
•x(0) = 0 x(n) Kn(A; b) n 1{x(n)}
# 0K1(A; b) K2(A; b) Kn(A; b)
Kn(A; b) x(n)
Kn(A; b) = span{b, Ab, A2b, …, An–1b}
(11/50)
•Kn(A; b) Kn+1(A; b)
Kn(A; b) = Kn+1(A; b)
•A Kn(A; b) = Kn+1(A; b)n Ax = b x* Kn(A; b)
•Anb = c0b + c1Ab + + cn–1An–1b c1, c2, …, cn–1
c0 = 0 A nc0 0 c0
A((1/c0)An–1b – (cn–1/c0)An–2b – – (c1/c0)b) = b x* Kn(A; b)
(12/50)
•Kn(A; b) x(n) = d0b + d1Ab + + dn–1An–1b
r(n) = Ax(n) – b
Kn(A; b) x(n) 1 1
r(n) = – b + d0Ab + d1A2b + + dn–1Anb = n(A)b
n(z) = – 1 + d0z + d1z2 + + dn–1zn –1 n
(13/50)
•x(n) Kn(A; b)
•
{vn}
Arnordi
b q1 = b / b 2v2 = Aq1v2 q1 q2v3 = Aq2v3 q1, q2 q3
(14/50)
• Arnoldi
• Arnoldi
Arnoldi compact-WY ,2010 , pp. 39-40 (2010).
(15/50)
• Arnoldi Arnoldi Aqi q1, q2, …, qi+1
A n Arnoldi
Aqi = h1i q1 + h2i q2 + + hi+1,i qi+1
AQn = Qn+1Hn
m n
(n+1) n
(16/50)
• Arnoldi Arnoldi AQn = Qn+1Hn Qn
T Qn
A span{q1, q2, …, qn} = Kn(A; b)
# Arnoldi
n n
(17/50)
A
• LanczosQn
TAQn = Hn
Hn 3 Tn
Anoldi hji i j+10
Aqi qi qi–1
Arnoldi Lanczos
(18/50)
A
• Lanczos
n = hnn, n = hn+1,n = hn,n+1
• Lanczosqn qn–1 qn+1 31 n
(19/50)
•x(n) Kn(A; b)
Ritz-Galerkin Petrov-Galerkin 3
(20/50)
I
•r(n) = Ax(n) – b x(n) Kn(A; b)
# e(n) = x(n) – x*
GMRES MINRES
(21/50)
II Ritz-Galerkin
•r(n) = Ax(n) – b Kn(A; b) x(n)
Kn(A; b) 0
•K1(A; b) K2(A; b) Kn(A; b)r(0) = b, r(1), r(2), …, r(n–1) Kn(A; b)
# r(n–1) Kn(A; b) n
Ax(n) – b Kn(A; b)
(22/50)
II Ritz-Galerkin
• A(x) = (1/2) xTAx – xTb Ax= b (x)
x(n) Kn(A; b) x(n) = Qny(n) y(n) Rn
(x(n)) = (1/2) (Qny(n))T AQny(n) – (Qny(n))Tby(n) Qn
T(AQny(n) – b) = 0Ax(n) – b Kn(A; b)
Ritz-Galerkin (x(n)) x(n)
(23/50)
II Ritz-Galerkin
• Ae(n) = x(n) – x* A
Ritz-Galerkin A x(n)
• Ritz-Galerkin CG
FOM# Ritz-Galerkin
(24/50)
III Petrov-Galerkin
•Rm L1 L2 L3 Ln n
r(n) = Ax(n) – b Ln x(n)
Ln 0Ln b* AT b*
Kn(AT; b*)
• Petrov-GalerkinBi-CG QMR
# Petrov-GalerkinPetrov-Galerkin CGS Bi-CGSTABGPBi-CG
Ax(n) – b Ln
(25/50)
•–1 n Pn
n n(z)
• Ritz-Galerkin Ax(n) e(n) = x(n) – x* Ae(n) = Ax(n) – b = r(n)
Ritz-Galerkin
min P n(A)b 2n n
e(n)TAe(n) = r(n)TA–1r(n) = bTn(A)A–1
n(A)b= bTA–1
n(A)A n(A)A–1b= e(0)T
n(A)A n(A)e(0) = n(A)e(0)A
min P n(A)e(0)An n
(26/50)
代表的なクリロフ部分空間法
(27/50)
•GMRES Generalized Minimum ResidualMINRES Minimum Residual
• Ritz-GalerkinCG Conjugate GradientFOM Full Orthogonalizaion Method
• Petrov-GalerkinBi-CG Bi-Conjugate GradientQMR Quasi Minimal ResidualCGS Conjugate Gradient SquaredBi-CGSTAB Stabilized Bi-CG
(28/50)
GMRES
•Kn(A; b) r(n) = Ax(n) – b x(n)
x(n) Kn(A; b) x(n) = Qny(n) y(n) Rn
2 Arnoldi AQn = Qn+1Hn
3 q1 = b / b 2
4 Qn+1 Qn+12
2 n+1 y(n) ny(n) 2
(29/50)
GMRES
• 22 min Hny(n) – b 2 e1 2 Hn
Hn= Un Rn Un R(n+1) n, Rn Rn n QR 1Rny(n) = Un
T b 2 e1
Hn (n+1) nO(n2)
Hn–1 Hn O(n)
# n
(30/50)
GMRES•
Arnoldi Kn(A; b)
Hn QRHn QR
UnT b 2 e1
Rny(n) = UnT b 2 e1
x(n) = Qny(n)
(31/50)
GMRES
• GMRES
n#
1 n# qn q1, q1, …, qn–1
#
• MINRES
Hn 3 1n
x(n) ( (A))2 *
* H. A. van der Vorst “Iterative Krylov Methods for Large Linear Systems”, Cambridge Univ. Press, 2003.
(32/50)
CG
•A
r(n) = Ax(n) – b Kn(A; b) x(n)
x(n) Kn(A; b) x(n) = Qny(n) y(n) Rn
Lanczos 3 Tn Qn = [q1| q2| |qn]Tn = LnUn LU x(n)
•q1, q2, …, qn
# n
QnT(Ax(n) – b) = Qn
T(AQny(n) – b) = Tn y(n) – QnTb = Tn y(n) – e1 = 0.
y(n) = Un–1Ln
–1e1, x(n) = Qny(n)
(33/50)
CG
•x(n)
x(n) = Qny(n) = (QnUn–1)(Ln
–1e1) [p0 | p1 | | pn–1][v0, v1, …, vn–1]T
pn vn
vn–1 = – n–2vn–2 / n–1
(34/50)
•
pn–1 = qn – n–2 pn–2
2 x(n)
CG
CG
x(n) = [p0 | p1 | | pn–1][v0, v1, …, vn–1]T = x(n–1) + vn–1 pn–1
(35/50)
CG
•
•(x) = (1/2) xTAx – xTb
pn–1 npn–1 n
xn
(36/50)
CG
• CGKn(A; b) = span{b, Ab, A2b, …, An–1b} = span{x1, x2,…, xn}
= span{r0, r1,…, rn–1} = span{p0, p1,…, pn–1}rnTrj = 0 (j < n)
A- pnTApj = 0 (j < n)
# pn
• CG
A en A = xn – x*A n
1 n# 3
(37/50)
CG
• CGA en A =
A
•en A
Pn 1 n (A) A( )
(38/50)
CG
• CGA m* CG m*
# ( ) m* 0 m*
n = m* 0
#
•1
CGAx = b
1
(39/50)
Bi-CG
•CG
•
(40/50)
Bi-CG
• Breakdownrn sn 0
1 breakdownTn LU 0
breakdown 2 breakdown
• Bi-CGn
1 n# GMRES# 3
A AT
# GMRES 2
(41/50)
Bi-CG
• CGSBi-CG rn = (Rn(A))2b, sn = b*
# snTrn Bi-CG2
# 2 Bi-CGA AT
• Bi-CGSTABSn(z) = (1 – 0z)(1 – 1z) (1 – n–1z) n Sn(z)
rn = Sn(A)R(A)b, sn = b*
# Bi-CG
nCGS
(42/50)
Bi-CG
• GPBi-CGBi-CGSTAB Sn(z)
Bi-CGSTAB
• QMRBi-CG
Bi-CG# Bi-CG#
(43/50)
前処理
(44/50)
•K A
K–1Ax = K–1b K–1A A
Ap K–1Ap
•AK–1 y = b x = K–1 y
K1–1AK2
–1 y = K1–1b x = K2
–1 y
• KK–1AKK–1x
(45/50)
• LUA LU A LU
K = LU ILU(0)# ILU(1)
ILU(1)
IC(0)
# K–1x L–1x U–1y# LU
ILUT U–1
ILUC
(46/50)
• SPAI; SParse Approximate InverseA A–1 M
MminM I – AM F
LUILU
•D1, D2 A D1
–1AD2–1
•LU
(47/50)
終わりに
(48/50)
•Ax = b
• GMRES CG Bi-CG
•
•Lis PETSc
(49/50)
参考文献
(50/50)
• L. N. Trefethen and D. Bau III: “Numerical Linear Algebra”, SIAM, Philadelphia, 1997.
• H. A. van der Vorst: “Iterative Krylov Methods for Large Linear Systems”, Cambridge University Press, Cambridge, 2003.
• Y. Saad: “Iterative Methods for Sparse Linear Systems”, PWS Publishing Company, Boston, 1996.
• R. Barrett et al.: “Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods”, SIAM, Philadelphia, 1994.
• , : “ ”, , 1996.
• , : “ ”, , 2009.