multigrid method - cc.u-tokyo.ac.jp · multigrid method 2009 3 12 ... – enop steepest descent...

of 38 /38
Multigrid Multigrid Method Method 2009 2009 3 3 12 12 20090312KN 2 SMASH • Science • Modeling • Algorithm • Software MPI • Hardware H H ardware ardware S S oftware oftware A A lgorithm lgorithm M M odeling odeling S S cience cience 20090312KN 3 Algorithm – SM,SH HardwareSoftware H H ardware ardware S S oftware oftware A A lgorithm lgorithm M M odeling odeling S S cience cience 20090312KN 4 Finite Volume MethodFVM Direct Finite Difference Method http://nkl.cc.u-tokyo.ac.jp/seminars/0812-JSIAM/2008JSIAMfall-01.pdf – Gauss-Seidel CGIC(0)ICCG f z y x 2 2 2 2 2 2

Author: hoangcong

Post on 30-Jul-2018

214 views

Category:

Documents


0 download

Embed Size (px)

TRANSCRIPT

  • MultigridMultigrid MethodMethod

    20092009331212

    20090312KN 2

    !SMASH Science

    "#$%&'((( Modeling

    )*%+,-.%*/$((( Algorithm

    012345678%129:;[email protected]?%MPI((( Hardware

    BCD?%EFG(((

    HHardwareardware

    SSoftwareoftware

    AAlgorithmlgorithm

    MModelingodeling

    SSciencecience

    20090312KN 3

    HIJKLMNOPQRSTUVW(((

    XYZYR[\]^VI_Q:`abIcdefSg-

    eRhiMSj]kAlgorithmlRmJPQ SM,SHRmJHdUnVop]qrsnV

    tu !vwxy z{MfS%

    |nb}Hardware%Software |nb}W~IdQsqaYnV

    |nb}4Jg-fPQsqaYnV |nb}345678Jg-fPQsqaYnV z{MfMqVsqaYnVW(((

    HHardwareardware

    SSoftwareoftware

    AAlgorithmlgorithm

    MModelingodeling

    SSciencecience

    20090312KN 4

    fPQ3=6R-

    9:;39:;

    +,#Finite Volume Method%FVMG 2R-.%-.MJ )*Direct Finite Difference MethodfqYQ

    http://nkl.cc.u-tokyo.ac.jp/seminars/0812-JSIAM/2008JSIAMfall-01.pdf

    6%#@

    ]QK9:;

  • 20090312KN 5

  • 9

    ]cpQvw129:;RO4n_BEM%4%MO%MD

    rusparse >N4nFEM%FDM%MD%BEM

    10

    ruq%fR_

    -.%4%MD

    11

    rueeMRSeRfR)*%+,-.

    20090312KN 12

    [email protected]]R-.fRW_Q

    i

    Sia

    Sib

    Sic

    dia dib

    dic

    ab

    c

    dbidai

    dci

    0 iik ikkiikik QVdd

    S

    Vi -.#

    S

    dij -.s{MR

    Q #@

    -.fR

    #@

    die

  • 13

    +,#Ri6>WVru

    16

    15

    14

    13

    12

    11

    10

    9

    8

    7

    6

    5

    4

    3

    2

    1

    16

    15

    14

    13

    12

    11

    10

    9

    8

    7

    6

    5

    4

    3

    2

    1

    }{}]{[

    FFFFFFFFFFFFFFFF

    DXXXXDXXXX

    XDXXXXXDXX

    XXDXXXXXXXDXXXX

    XXXXDXXXXXXXDXX

    XXDXXXXXXXDXXXX

    XXXXDXXXXXXXDXX

    XXDXXXXXDX

    XXXXDXXXXD

    FK

    14

    ru0*R6R)*%FEM%FVM

    15

    ru4rufIq6RqV

    NNNNNN

    NNNNNN

    NN

    NN

    aaaaaaaa

    aaaaaaaa

    ,1,2,1,

    ,11,12,11,1

    ,21,22221

    ,11,11211

    ...

    .........

    ...

    N

    N

    N

    N

    yy

    yy

    xx

    xx

    1

    2

    1

    1

    2

    1

    !

    16

    ]cpQvw129:;RO4n_BEM%4%MO%MD

    rusparse >N4nFEM%FDM%MD%BEM

  • 17

    Direct Method GaussR%HLU*0%H+J>0%H+*ru;>0fJ [email protected]%k6%l;hmRqSPD

    345678 EnopSteepest Descent MethodRq x(i)= x(i-1) + %ip(i)

    x(i)

  • 20090312KN 29

    bxAxxxf ,,21

    AhhbAxhxfhxf ,21, R4h

    AhhbAxhxf

    AhhbhAxhbxAxx

    bhbxAhhAhxAxhAxx

    bhbxAhhxAxhx

    bhxhxAhxhxf

    ,21,

    ,21,,,,

    21

    ,,,21,

    21,

    21,

    21

    ,,,21,

    21

    ,)(,21

    20090312KN 30

    R3456782/5

    )()()1( kk

    kk pxx %

    CGSR x(0)sDI%f(x)REJJrsPQz%kPRJ x(k)frs9 p(k)W{b}fPQf

    )()()()()(2)()( ,,21 kkk

    kkk

    kk

    kk xfAxbpApppxf %%%

    f(x(k+1))JE]PQ}]S

    )()(

    )()(

    )()(

    )()()()(

    ,,

    ,,0 kk

    kk

    kk

    kk

    kk

    kk

    k

    Apprp

    AppAxbppxf

    * %%%

    )()( kk Axbr Sk]PQ[)

    20090312KN 31

    R3456783/5

    )()()1( kk

    kk Aprr %[) r(k)q#pR;]bI M1Q

    )0()0()()1()1( , prprp kkkk '

    rs9 J#pRG;]bIQ

    ;ARfeSpR]k+1H]t< yW{YVRM_QW%

  • 20090312KN 33

    R3456785/5

    )()(

    )()1(

    )()()()1()()()1()()1(

    ,,

    0,,,,

    kk

    kk

    k

    kkk

    kkkkk

    kkk

    AppApr

    AppAprApprApp

    *

    '

    ''

    4 p(k)%[)4 r(k)]^VIq#pRWd

    0, )()1( kk App 4 p(k)WA]UIM_Q

    )()()()(

    )()()()(

    ,,0,,0,

    kkkk

    jiji

    rrrpjirrjiApp

    NMV]MKn[)4 r(k)SNNUsUnV%bISOaWNNRf1]NH#]5)PQS)R78W_Q

    20090312KN 34

    %k%'k

    k

    kk

    k

    kkkkk

    kk

    kk

    kk

    kk

    k

    rrrrrApr

    rrrr

    AppApr

    %%

    '

    )1()1()1()()1()()1(

    )()(

    )1()1(

    )()(

    )()1(

    ,,,

    ,,

    ,,

    S%k%'kSqbfn2]2M1Q

    )()()()()()(

    )()(

    )()(

    )()(

    )()(

    )()(

    ,,,,

    ,,

    ,,

    kkkk

    kk

    kk

    kk

    kk

    kk

    kk

    k

    rrrpApprr

    Apprp

    AppAxbp

    %

    20090312KN 35

    preconditioningfS? R5)SruR+J*]

    +J*W4nd%s^1]V&5)WVru condition numbere^!EE+J2

    W1]V&5)U:PV

    f qfRru[A]]d}ru[M]JFPQef]bI+J*JPQ ru[M]]bIR9:;[A]{x}={b}J[A]{x}={b}fPQeeM[A]=[M]-1[A]%{b}=[M]-1{b} M_Q

    [A]=[M]-1[A]Wru]pYV%fVef]nQ

    klSru%rufq]/PQW%SruJ]PQefWV

    20090312KN 36

    niaan

    ijjijii ,,2,1

    ,1+

    f

    irRf#$R*RJRqfRJW1V

  • 20090312KN 37

    Preconditioned Conjugate Gradient Method PCG

    Compute r(0)= b-[A]x(0)

    for i= 1, 2, solve [M]z(i-1)= r(i-1)

    &i-1= r(i-1) z(i-1)if i=1p(1)= z(0)

    else'i-1= &i-1/&i-2p(i)= z(i-1) + 'i-1 z(i)

    endifq(i)= [A]p(i)

    %i = &i-1/p(i)q(i)x(i)= x(i-1) + %ip(i)r(i)= r(i-1) - %iq(i)check convergence |r|

    end

    ]:Q1 S

    , - ( ) , -rMz 1

    ( ) ( ) ( ) ( )AMAM .. ,11

    f6?!V

    ( ) ( ) ( ) ( )DMDM ,11

    R;ARru

    ( ) ( ) ( ) ( )AMAM ,11

    krulR Wg-

    20090312KN 38

    f6?%ShT

    rufUI%qfRruRf*RJU}ruJru [M] fPQ f6?%ShTpoint-Jacobi

    ( )

    N

    N

    DD

    DD

    M

    0...00000.........00000...0

    1

    2

    1

    solve [M]z(i-1)= r(i-1)fV]ruJ]QefWM1Q

    nxyMS5)PQ

    20090312KN 39

    ILU(0), IC(0) Eqd/YIVQru

    HLU*< Incomplete LU Factorization

    Hhg*< Incomplete Cholesky Factorizationeru

    Hn qfRruWMq%ruSfS,nV fill-in qfRruf0>fill-in_UJbIVQRW

    ILU0%IC0

    20090312KN 40

    K9:;R

  • 20090312KN 41

    Z4R

    "#$%&'%Intel"()*+,->$ cd>$ source /opt/itc/mpi/mpiswitch.sh mpich-mx-intel>$ cp /home/h08000/mgclass.tar .>$ tar xvf mgclass.tar

    ./0123456789:;[email protected]

    B0/CD%./01234567EJK

    LM"()*+CNO>$ source /opt/itc/mpi/mpiswitch.sh mpich-mx-hitachi

    20090312KN 42

    fPQ3=6R-

    9:;39:;

    +,#Finite Volume Method%FVMG 2R-.%-.MJ )*Direct Finite Difference MethodfqYQ

    http://nkl.cc.u-tokyo.ac.jp/seminars/0812-JSIAM/2008JSIAMfall-01.pdf

    6%#@

    ]QK9:;

  • 20090312KN 45

    K39:;)*

    211

    11

    2/12/1

    2x

    xxx

    xdxd

    dxd

    dxd

    dxd

    iii

    iiii

    ii

    i

    022

    Q

    dxd

    i

    x x

    i-1 i i+1

    20090312KN 46

    +,#Finite Volume Method FVM

    [email protected]]

    i

    Sia

    Sib

    Sic

    dia dib

    dic

    ab

    c

    dbidai

    dci

    0 iik ikkiikik QVdd

    S

    Vi -.#

    S

    dij -.s{MR

    Q #@

    -.fR

    #@

    die

    20090312KN 47

    K)*fR231/3

    ibbiib

    ibib Sdd

    Qs /

    iSia Sib

    dia dib dbi

    a b

    dai

    KMRhR^92C Sik= h-.# Vi = h2{MR dij=h/2

    h

    [email protected]

    [email protected]!K4

    1

    20090312KN 48

    K)*fR232/3

    0 iik ikkiikik QVdd

    S

    01 ik ikkiikik

    i

    Qdd

    SV

    MJViMQ

    eR+*]PQf

    iSia Sib

    dia dib dbi

    a b

    dai

    hKMRhR^92C Sik= h-.# Vi = h2{MR dij=h/2

    1

  • 20090312KN 49

    K)*fR233/3

    222211hhh

    biaibia

    iSia Sib

    dia dib dbi

    a b

    dai

    bakik

    kik

    kiik

    ik

    ihh

    hhdd

    SV ,2

    22

    11

    h

    bakik

    bakik

    bakik hh

    hhhh

    hh ,2,2,2

    11

    22

    1

    KMRhR^92C Sik= h-.# Vi = h2{MR dij=h/2

    1

    20090312KN 50

    v^UV)*0]

    x

    yz

    NXNY

    NZ Z

    XY

    x

    yz

    x

    yz

    NXNY

    NZ Z

    XY

    Z

    XY

    XY

    20090312KN 51

    [email protected]%g-nZ4

    mgC

    solMG39:;4O

    mesh.datCZ4

    INPUT.DATZ4

  • 20090312KN 53

    =>[email protected]

    $> cd /run

    $> ls mgmg

    $> f90 Oss noparallel mg.f o mg

    $> cd ../src$> make$> ls ../run/solMG

    solMG

    C mg

    39:;4O solMG

    hmmg.f%[email protected]

    20090312KN 54

    =>[email protected]

    $> cd /run$> $> ./mg

    3$> ls mesh.dat

    mesh.dat

    NX%NY%NZJ$PQf%kmesh.datlWYQ%;=>[email protected]=NY=NZnRM%NXR$

    NXNY

    NZ

    X,Y,Z9 R-.

    20090312KN 55

    mesh.dat1/5PPPQR Q S QP T QPQ U QPS TR PQTSUQV SQQUT PW TPQR VS UPVRWT RQPWV U VPP

    PWQ Q SQQ QQQ TPQPQP S USQQQQSPTRTQPQQQTS QVUQSPQQU RP RQTPQRUVS VQUQPQVR T WQRPPQW Q QQ PQ WQ QP QPQQ QSQ PPQQ QPWQTP QPQPQQQSQ QUS QQPQSQP QQRT PQPQT QUQQ U PPQUQTQRQP R QPPQRQU QS V PPP

    XYQZ Z[XYQZ Z\]^_`

    \]^_`XYQa b

    ^\cdeeU [P

    20090312KN 56

    mesh.dat2/5

    XYQZ Z[XYQZ Z\]^_`

    \]^_`XYQa b

    ^\cdeeU [P

    X,Y,Z9 R-.

    PPPQR Q S QP T QPQ U QPS TR PQTSUQV SQQUT PW TPQR VS UPVRWT RQPWV U VPP

    PWQ Q SQQ QQQ TPQPQP S USQQQQSPTRTQPQQQTS QVUQSPQQU RP RQTPQRUVS VQUQPQVR T WQRPPQW Q QQ PQ WQ QP QPQQ QSQ PPQQ QPWQTP QPQPQQQSQ QUS QQPQSQP QQRT PQPQT QUQQ U PPQUQTQRQP R QPPQRQU QS V PPP

    NXNY

    NZ

  • 20090312KN 57

    mesh.dat3/5

    XYQZ Z[XYQZ Z\]^_`

    \]^_`XYQa b

    ^\cdeeU [P

    -.NXNYNZ

    PPPQR Q S QP T QPQ U QPS TR PQTSUQV SQQUT PW TPQR VS UPVRWT RQPWV U VPP

    PWQ Q SQQ QQQ TPQPQP S USQQQQSPTRTQPQQQTS QVUQSPQQU RP RQTPQRUVS VQUQPQVR T WQRPPQW Q QQ PQ WQ QP QPQQ QSQ PPQQ QPWQTP QPQPQQQSQ QUS QQPQSQP QQRT PQPQT QUQQ U PPQUQTQRQP R QPPQRQU QS V PPP

    NXNY

    NZ

    20090312KN 58

    mesh.dat4/5

    XYQZ Z[XYQZ Z\]^_`

    \]^_`XYQa b

    ^\cdeeU [P

    -.NEIBcell(i,k)

    1SUPMPU

    PPPQR Q S QP T QPQ U QPS TR PQTSUQV SQQUT PW TPQR VS UPVRWT RQPWV U VPP

    PWQ Q SQQ QQQ TPQPQP S USQQQQSPTRTQPQQQTS QVUQSPQQU RP RQTPQRUVS VQUQPQVR T WQRPPQW Q QQ PQ WQ QP QPQQ QSQ PPQQ QPWQTP QPQPQQQSQ QUS QQPQSQP QQRT PQPQT QUQQ U PPQUQTQRQP R QPPQRQU QS V PPP

    1 2 310 11 1219 20 21

    NXNY

    NZ

    23

    20090312KN 59

    NEIBcellUIVQ-.PRS0

    NEIBcell(icel,1)= icel 1NEIBcell(icel,2)= icel + 1NEIBcell(icel,3)= icel NXNEIBcell(icel,4)= icel + NXNEIBcell(icel,5)= icel NX*NYNEIBcell(icel,6)= icel + NX*NY

    NEIBcell(icel,2)NEIBcell(icel,1)

    NEIBcell(icel,3)

    NEIBcell(icel,5)

    NEIBcell(icel,4)NEIBcell(icel,6)

    20090312KN 60

    mesh.dat5/5

    XYQZ Z[XYQZ Z\]^_`

    \]^_`XYQa b

    ^\cdeeU [P

    X,Y,Z9 RXYZ(i,j)

    PPPQR Q S QP T QPQ U QPS TR PQTSUQV SQQUT PW TPQR VS UPVRWT RQPWV U VPP

    PWQ Q SQQ QQQ TPQPQP S USQQQQSPTRTQPQQQTS QVUQSPQQU RP RQTPQRUVS VQUQPQVR T WQRPPQW Q QQ PQ WQ QP QPQQ QSQ PPQQ QPWQTP QPQPQQQSQ QUS QQPQSQP QQRT PQPQT QUQQ U PPQUQTQRQP R QPPQRQU QS V PPP

    1 2 310 11 1219 20 21

    NXNY

    NZ

  • 20090312KN 61

    NEIBcellUIVQ-.PRS0

    x

    yz

    i= XYZ(icel,1)j= XYZ(icel,2), k= XYZ(icel,3)icel= (k-1)*NX*NY + (j-1)*NX + i

    NEIBcell(icel,1)= icel 1NEIBcell(icel,2)= icel + 1NEIBcell(icel,3)= icel NXNEIBcell(icel,4)= icel + NXNEIBcell(icel,5)= icel NX*NYNEIBcell(icel,6)= icel + NX*NY

    NEIBcell(icel,2)NEIBcell(icel,1)

    NEIBcell(icel,3)

    NEIBcell(icel,5)

    NEIBcell(icel,4)NEIBcell(icel,6)

    20090312KN 62

    =>[email protected]=>[email protected]%g-nZ4

    mgC

    solMG39:;4O

    mesh.datCZ4

    INPUT.DATZ4

    20090312KN 63

    =>[email protected]/2

    1.00e-0 1.00e-0 1.00e-0 DX/DY/DZ1.00 1.0e-08 OMEGA, EPSICCG1 SOLVER 1:GS,2:ICCG,3:MG100 100 ITERtotCr/p0.95 CHANGElevel

    DX, DY, DZ -.RX,Y,Z9 M

    OMEGA 1.0]

    EPSICCG 5)J "#

    b

    Axb )(k

    20090312KN 64

    =>[email protected]/2

    SOLVER 1:GSGauss-Seidel%2ICCG%3MG

    ITERtotCr%ITERtotCp ITERtotCrRestrictionvRRH ITERtotCpProlongationvRRH

    CHANGElevel RR[)fR2JbI%eRJ1pYR4v%rd

    1.00e-0 1.00e-0 1.00e-0 DX/DY/DZ1.00 1.0e-08 OMEGA, EPSICCG1 SOLVER 1:GS,2:ICCG,3:MG100 100 ITERtotCr/p0.95 CHANGElevel

  • 20090312KN 65

    r

    go.sh

    #@$-r mg-single#@$-q lecture1#@$-N 1#@$-J T1#@$-e err#@$-o 064_cg.lst#@$-lM 28GB#@$-lT 00:15:00#@$

    cd $PBS_O_WORKDIR

    ./solMGexit

    >$ cd /run>$ qsub go.sh

    20090312KN 66

    H% %T2K

    N Gauss-Seidel ICCG

    83= 512 1,2510.03 sec19

  • 20090312KN 69

    MultigridfShttps://computation.llnl.gov/casc/

    J/UIGY}]UIJFPQ%;]Z7fJb})WdPQ Gauss-Seidel%SOR

    UsUnW%R)SnsnsUnV

    20090312KN 70

    Gauss-Seidel%SORR5)

    ITERATION#

    RES

    IDU

    AL

    EBSd[)W4*

    20090312KN 71

    Gauss-Seidel%SORR5)

    ITERATION#

    RES

    IDU

    AL

    ][)W4UndnQ*

    20090312KN 72

    MultigridfShttps://computation.llnl.gov/casc/

    J/UIGY}]UIJFPQ%;]Z7fJb})WdPQ Gauss-Seidel%SOR

    0.00

    0.25

    0.50

    0.75

    1.00

    0 100 200 300 400 500

    Iterations

    Res

    idua

    l

    GS ICCG

  • 20090312KN 73

    MultigridfShttps://computation.llnl.gov/casc/

    J/UIGY}]UIJFPQ%;]Z7fJb})WdPQ Gauss-Seidel%SOR

    UsUnW%R)SnsnsUnV "%CxyvwWdnYXYp%5){MRHW*PQ

    SkxyZ7Hl]2PQ}%kScalable WxyZ7]R2PQlfSVnV

    ICCGR

    20090312KN 74

    MultigridfS01https://computation.llnl.gov/casc/

    MultigridfS%vsVfVJ/UI%R)J]%QM_QVJ/PY%R)J%QefWM1Q

    MultigridMS%R)JK]%QefWn} 5)Wd%s^scalable{MR5)HWxyZ7]oK%U}WbI WxyZ7]R2

    vwxy p

  • 20090312KN 77

    MultigridR-2/5

    ;3 fine grid levelC;:./0?=>?

  • 20090312KN 81

    uF(i) = SF (AF ,f)

    rF = f - AF uF(i)

    rC = RFC rF= RFC (f - AF uF(i))

    AC uC = rC

    uF(i) = RCF uC= RCF AC-1 RFC (f - AF uF(i))

    uF(i+1) = uF(i) + uF(i)= uF(i) + RCF AC-1 RFC (f - AF uF(i))

    3R2

    uC = AC-1 rC = AC-1 rC= AC-1 RFC (f - AF uF(i))

    20090312KN 82

    MultigridR34567824R

    vsVM9:;J

  • 20090312KN 85

    vsVMt]9:;J

  • 20090312KN 89

    MultigridR-5/5

    0 =0 2 30?kCC-vCz:%6o0V-Cycle7E

  • 20090312KN 93

    H% %T2K

    N Gauss-Seidel ICCG

    83= 512 1,2510.03 sec19

  • 20090312KN 97

    MultigridMultigrid is scalable !!!

    =>A}RxyvwWeak Scaling

    MG: scalable

    ICCG: unscalableMETHOD=2

    GS: unscalableMETHOD=1

    Number of Processors (Problem Size)

    Tim

    e to

    Sol

    utio

    n

    200

    150

    100

    50

    0100 101 102 103

    Based on LLNL

    20090312KN 98

    =>[email protected]://nkl.cc.u-tokyo.ac.jp/07w/CW08/2007-11-21-CW08.pdfhttp://nkl.cc.u-tokyo.ac.jp/07w/CW09/2007-11-28-CW09.pdf

    )* 39:;

    Gauss-Seidel ICCG MG

    Gauss-Seidel VZ4

    20090312KN 99

    823NRklCW{bI1NRklCJ2PQ

    EqVCS111H#W1C

    20090312KN 100

    =>[email protected]=>[email protected]%g-nZ4

    mgC

    solMG39:;4O

    mesh.datCZ4

    INPUT.DATZ4

  • 20090312KN 101

    =>[email protected]

    $> cd /W3/mg/run$> ./mg

    4$> ls mesh.dat

    mesh.dat

    pRNX!NY!NZJ$PQf%kmesh.datlWYQ

    }U((( NX%NY%NZS2R1MnpYnnVC]]@SnVWi4B?6mR WM1nV

    NXNY

    NZ

    20090312KN 102

    =>[email protected]/2

    1.00e-0 1.00e-0 1.00e-0 DX/DY/DZ1.00 1.0e-08 OMEGA, EPSICCG1 SOLVER 1:GS,2:ICCG,3:MG100 100 ITERtotCr/p0.95 CHANGElevel

    DX, DY, DZ -.RX,Y,Z9 M

    OMEGA =1.0Gauss-Seidel%QefS

    EPSICCG 5)J

    20090312KN 103

    =>[email protected]/2

    SOLVER 1:GSGauss-Seidel%2ICCG%3MG

    ITERtotCr%ITERtotCp ITERtotCrRestrictionvRRH ITERtotCpProlongationvRRH

    CHANGElevel RR[)fR2JbI%eRJ1pYR4v%rd

    1.00e-0 1.00e-0 1.00e-0 DX/DY/DZ1.00 1.0e-08 OMEGA, EPSICCG1 SOLVER 1:GS,2:ICCG,3:MG100 100 ITERtotCr/p0.95 CHANGElevel

    20090312KN 104

    ITERtotCr%ITERtotCpdo iter_out= 1, ITERmax

    enddo

    Restrictiondo iterk= 1, ITERtotCr

    (RELAX+RESTRICTION)

    enddo

    Prolongationdo iterk= 1, ITERtotCp

    (RELAX+PROLONGATION)

    enddo

  • 20090312KN 105

    CHANGElevel*WnR

    ITERATIONS

    RES

    IDU

    AL levelNEXTtoproceed

    lCHANGEleveRRif

    i

    i +1

    20090312KN 106

    nxy

    xyW^fnQ]S%RestrictionfProlongationMHRWg-

    RPq]UnpYnnV

    SXR]SnbIVnVW%^6Jb}qd5)PQW_Q

    20090312KN 107

    RPidir=1,%idir=-1

    if (idir.eq.1) thendo icel= levMGcelINDEX(lev-1)+1, levMGcelINDEX(lev)

    XPREV= XMG(icel)RF= BMG(icel) - WAcoefMG(1,icel)*XMG(NEIBcellMG(1,icel)) &

    & - WAcoefMG(2,icel)*XMG(NEIBcellMG(2,icel)) && - WAcoefMG(3,icel)*XMG(NEIBcellMG(3,icel)) && - WAcoefMG(4,icel)*XMG(NEIBcellMG(4,icel)) && - WAcoefMG(5,icel)*XMG(NEIBcellMG(5,icel)) && - WAcoefMG(6,icel)*XMG(NEIBcellMG(6,icel))

    XMG(icel)= (RF*DDMG(icel)-XPREV)*OMEGA + XPREVenddoelsedo icel= levMGcelINDEX(lev), levMGcelINDEX(lev-1)+1, -1

    XPREV= XMG(icel)RF= BMG(icel) - WAcoefMG(1,icel)*XMG(NEIBcellMG(1,icel)) &

    & - WAcoefMG(2,icel)*XMG(NEIBcellMG(2,icel)) && - WAcoefMG(3,icel)*XMG(NEIBcellMG(3,icel)) && - WAcoefMG(4,icel)*XMG(NEIBcellMG(4,icel)) && - WAcoefMG(5,icel)*XMG(NEIBcellMG(5,icel)) && - WAcoefMG(6,icel)*XMG(NEIBcellMG(6,icel))

    XMG(icel)= (RF*DDMG(icel)-XPREV)*OMEGA + XPREVenddo

    endif

    20090312KN 108

    R=>[email protected]

    W

    rC = RFC rF

    AC uC = rCuF(i) = RCF uC

  • 20090312KN 109

    R=>[email protected] /srcmainj=>[email protected]

    input

    pointer_initCZ4

    boundary_cella

    poi_geni6

    poi_gen_mgi6

    GSGauss-Seidel

    ICCGICCG

    MULTIG

    outucdG$

    RELAX_MG4]cpQ

    RESTRICT_MG4]cpQ,

    20090312KN 110

    H% %T2K

    N Gauss-Seidel ICCG

    83= 512 1,2510.03 sec19

  • 20090312KN 113

    D8::KHG:;:KHG:;:KHG:;

  • 20090312KN 121

    01?LkC%!?,kC MAC%SMACf%!fgh>:KHG:;:KHG:;

  • 20090312KN 125

    opoCq{Ir?so>:q

    tuC;nopq?qy

    Irnopq flexible

    +vkIrwnopq

    20090312KN 126

    20#Jv*GPQef]bICJPQ

    Level 012 nodes20 tri's

    Level 142 nodes80 tri's

    Level 2162 nodes320 tri's

    Level 42,562 nodes5,120 tri's

    Level 3642 nodes

    1,280 tri's

    20090312KN 127

    CRJXR{{i4B?6m]PQ

    GenerateFine Meshes

    CoarseGrid Info

    20090312KN 128

    *

    9 R]* nC#J

  • 20090312KN 129

    Semi-Coarsening ?vgz{>=|p6?

    xC;:D=0}0~v

  • 20090312KN 133

    ,(

    R1 ** cRcR TCFFC , cSkl, - ( ), -FCFC rRr *

    , - ( ), -TffffC rrrrr 43211111

    , - ( ) , -CTffff uuuuu 11114321 , - ( ), -CFCF uRu * c =1

    ,

    ,

    20090312KN 134

    [K4

    PE#0 PE#1

    LEVEL= i

    LEVEL= I+1

    PE#0 PE#1

    20090312KN 135

    tu%634 R]V4MS634 ]]Q

    ;=>[email protected]

    Parallel Serial Parallel

    Restriction Prolongation

    20090312KN 136

    1280x340=435,000 cells/PEup to 56M DOF on 128 PEs

    ICCG/IBM BG-L MGCG/IBM BG-LICCG/IBM SP-3 MGCG/IBM SP-3

    0

    250

    500

    750

    1000

    1250

    0 16 32 48 64 80 96 112 128

    PE#

    sec.

    1PE_}RxyZ7 kScalablelM_Y%PEW*%PnoxyZ7W1dnbIq% WonVS

    ICCGMS%H#xyZ7W1dnQf%HW*PQ}% q*PQ

    MGCGMSHWonV}% Sf&onV

    BG-L IBM BG/L SP-3 : IBM SP3

  • 20090312KN 137

    ] UIICCGfR23

    xy Rin= 0.50%r= 0.01play.p_XXXX 320#%1_}1000\320,000C PEJQf%1_}RCJ320,000]UI JRoutSG

    $> cd /run$> qsub go.sh

    $> cd /src-iccg$> make$> ls ../run/soliccg

    $> cd /src-mgcg$> make$> ls ../run/solmgcg

    20090312KN 138

    input.datR+*Wtest.grid.320r0.caseT : COARSEgrid4 : ILEVcoarseAPPROX : COARSEgrid METHOD10 : iterSSOR1 : SOLFLAG= 1 (1:ICCG, 2:MCCG)0 : NCOLORtot10000 : DEPTHsgl1.d0 : SIGMA= 1.d01.d-6 : EPS = 1.d-81000 : ITERmax=2 : iSLEV_MG_typeuniform : BOUNDARY conditions0.80d0, 0.80d0 : CMG1, CMG21.0d-24 : EPSMG1.0d0 : EPScoarse2 : COARSEsolver_flag10000, 5, 5 : ITERc/r/p/MAX1 : PEsmpTOT

    20090312KN 139

    input.datBCtest.grid.320r0.caseT : COARSEgrid4 : ILEVcoarseAPPROX : COARSEgrid METHOD10 : iterSSOR1 : SOLFLAG= 1 (1:ICCG, 2:MCCG)0 : NCOLORtot10000 : DEPTHsgl1.d0 : SIGMA= 1.d01.d-6 : EPS = 1.d-81000 : ITERmax=2 : iSLEV_MG_typeuniform : BOUNDARY conditions0.80d0, 0.80d0 : CMG1, CMG21.0d-24 : EPSMG1.0d0 : EPScoarse2 : COARSEsolver_flag10000, 5, 5 : ITERc/r/p/MAX1 : PEsmpTOT

    uniformXminXmaxYminYmaxZminZmax

    20090312KN 140

    R9R=RmaxMR

    K]=0f uniform

    ^20#RBM=0feRf2sYQf2q

    Xmin%Xmax Ymin%Ymax Zmin%Zmax

    kuniformlf23UIS5)V]ICCG

  • 20090312KN 141

    R9R=RmaxMR01^20#RBM=0feRf2sYQf2q

    20090312KN 142

    input.datCMG1,CMG2test.grid.320r0.caseT : COARSEgrid4 : ILEVcoarseAPPROX : COARSEgrid METHOD10 : iterSSOR1 : SOLFLAG= 1 (1:ICCG, 2:MCCG)0 : NCOLORtot10000 : DEPTHsgl1.d0 : SIGMA= 1.d01.d-6 : EPS = 1.d-81000 : ITERmax=2 : iSLEV_MG_typeuniform : BOUNDARY conditions0.80d0, 0.80d0 : CMG1, CMG21.0d-24 : EPSMG1.0d0 : EPScoarse2 : COARSEsolver_flag10000, 5, 5 : ITERc/r/p/MAX1 : PEsmpTOT

    R]@

    20090312KN 143

    kCMG1lfS ?0.80d0, 0.80d0 : CMG1, CMG2

    Iterations

    Res

    idua

    l

    R1

    R2

    R3 R4

    R5

    R6

    klfnbIVQk,lR

    4]cpQ]cVI

    R(i) > CMG1R(i-1)(0

  • 20090312KN 145

    input.datEPSMGtest.grid.320r0.caseT : COARSEgrid4 : ILEVcoarseAPPROX : COARSEgrid METHOD10 : iterSSOR1 : SOLFLAG= 1 (1:ICCG, 2:MCCG)0 : NCOLORtot10000 : DEPTHsgl1.d0 : SIGMA= 1.d01.d-6 : EPS = 1.d-81000 : ITERmax=2 : iSLEV_MG_typeuniform : BOUNDARY conditions0.80d0, 0.80d0 : CMG1, CMG21.0d-24 : EPSMG1.0d0 : EPScoarse2 : COARSEsolver_flag10000, 5, 5 : ITERc/r/p/MAX1 : PEsmpTOT

    EqVMR"])EqVMR J&e{M^#]:Qs

    20090312KN 146

    input.datITERc/r/p/MAXtest.grid.320r0.caseT : COARSEgrid4 : ILEVcoarseAPPROX : COARSEgrid METHOD10 : iterSSOR1 : SOLFLAG= 1 (1:ICCG, 2:MCCG)0 : NCOLORtot10000 : DEPTHsgl1.d0 : SIGMA= 1.d01.d-6 : EPS = 1.d-81000 : ITERmax=2 : iSLEV_MG_typeuniform : BOUNDARY conditions0.80d0, 0.80d0 : CMG1, CMG21.0d-24 : EPSMG1.0d0 : EPScoarse2 : COARSEsolver_flag10000, 5, 5 : ITERc/r/p/MAX1 : PEsmpTOT

    4MRGauss-SeidelRH

    20090312KN 147

    input.datITERc/r/p/MAXGauss-SeidelRH

    ITERcMAX:EqV

    ITERrMAX:,%v

    ITERpMAX:$%&%v

    Fine

    Coarse

    Fine

    Coarse

    10000, 5, 5 : ITERc/r/p/MAX

    eYRJJ1dPQf%H#RHS4PQWK_}R S*PQmR^6Rxyq_

    20090312KN 148

    go.sh#@$-r mg#@$-q lecture1#@$-N 4#@$-J T16#@$-e err#@$-o mgcg-064.lst#@$-lM 28GB#@$-lT 00:15:00#@$

    cd $PBS_O_WORKDIR

    mpirun n2.sh ./solmgcgexit

    soliccg ICCGsolmgcg MGCG

    $> cd /run$> qsub go.sh

  • 20090312KN 149

    h3

    4 #@$-N 1 #@$-J T4

    8 #@$-N 1 #@$-J T8

    16 #@$-N 1 #@$-J T16

    32 #@$-N 2 #@$-J T16

    64 #@$-N 4 #@$-J T16

    6H7.14 sec

    181H13.9 sec

    4

    19H26.0 sec

    2362H215.7 sec

    64

    9H12.1 sec

    1195H107.0 sec

    32

    7H9.50 sec

    610H54.0 sec

    16

    6H7.18 sec

    328H25.3 sec

    8

    MGCGICCGh3

    20090312KN 150

    h3=32, 10.24106 cellssoliccg (ICCG)1141 3.491998E-061161 2.085699E-061181 1.418004E-061195 8.236873E-07

    ### solver tot time 1.069786E+02 sec. PE= 0### comm. time 4.377219E+00 sec. PE= 0

    9.590832E+010 1000 1.660144E+04 1.658377E+04

    solmg (MGCG)1 1.189154E+002 6.495532E-023 1.383950E-024 5.833822E-045 2.290085E-046 1.387011E-057 8.196302E-068 1.108853E-069 3.970644E-07

    ### solver tot time 1.212685E+01 sec. PE= 0### MGCG 6.109128E-01 sec. PE= 0### MG 1.151593E+01 sec. PE= 0### relax 1.097669E+01 sec. PE= 0### restriction 4.404089E-01 sec. PE= 0### prolongation 8.220959E-02 sec. PE= 0### comm. time 1.852987E+00 sec. PE= 0

    8.471996E+010 1000 1.660144E+04 1.658377E+04

    20090312KN 151

    tu ]cpQxy

    C;:%yy/0?,g;

    0

    gP7HHG:;< $$+o HIDHierarchical Interface

    Decomposition