
  • Fault Tolerant Multigrid Solver

    Markus Huber (FAU Erlangen, [email protected])

    U. Rüde, B. Gmeiner (FAU)

    B. Wohlmuth, C. Waluga (TUM)

    Lehrstuhl für Informatik

    FAU Erlangen-Nürnberg

    www10.informatik.uni-erlangen.de

    EMG 2014, September 8-12

    Leuven (Belgium)

    Fault Tolerant Multigrid Solver - Markus Huber

  • 2

    Outline and Goals

    Outline:

    Multigrid and HPC

        Darcy and Stokes

        Hierarchical Hybrid Grids

        Scalability

    Fault Tolerant Multigrid Solver

        Faulty solution process

        Local recovery strategy

    Jumpy Coefficients

        Robustness in jumps

        To symmetrize or not to symmetrize

    Goals:

    Fault Tolerant Multigrid Strategies

    Jumpy Coefficients: No problem for geometric multigrid

  • 3

    Multigrid and HPC


  • 4

    Model Problems

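    Darcy equation:

    $-\operatorname{div}(\eta \cdot \nabla u) = f$

    with Dirichlet boundary conditions.

    FEM discretization:

    $Au = f$

    Application: Porous media

    Stokes equations:

    $-\operatorname{div}(2\eta \cdot \varepsilon(\mathbf{u})) + \nabla p = \mathbf{f}, \quad \operatorname{div}(\mathbf{u}) = 0$

    with $\varepsilon(\mathbf{u}) = \tfrac{1}{2}(\nabla\mathbf{u} + (\nabla\mathbf{u})^T)$ and Dirichlet boundary conditions.

    FEM discretization and Schur-complement formulation result in:

    $\begin{pmatrix} A & B^T \\ 0 & BA^{-1}B^T \end{pmatrix} \begin{pmatrix} \mathbf{u} \\ p \end{pmatrix} = \begin{pmatrix} \mathbf{f} \\ BA^{-1}\mathbf{f} \end{pmatrix}$

    Application: Flow problems, geophysics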
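
    The Schur-complement form of the Stokes system can be checked in a few lines. The sketch below assumes that the discretized Stokes equations have the usual saddle-point structure with blocks $A$, $B^T$, $B$ and a zero pressure block, and that $A$ is invertible; the slide states only the result, so this starting point is an assumption.

```latex
\begin{align*}
A\mathbf{u} + B^T p &= \mathbf{f}, \qquad B\mathbf{u} = 0
  && \text{(assumed discrete Stokes system)} \\
\mathbf{u} &= A^{-1}(\mathbf{f} - B^T p)
  && \text{(first block row solved for $\mathbf{u}$)} \\
0 = B\mathbf{u} &= BA^{-1}\mathbf{f} - BA^{-1}B^T p
  && \text{(inserted into the second block row)} \\
BA^{-1}B^T p &= BA^{-1}\mathbf{f}
  && \text{(second row of the Schur-complement system)}
\end{align*}
```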


  • 5

    Hierarchical Hybrid Grids (HHG)

    Unstructured input mesh is refined regularly:

    3D tetrahedral refinement

    Geometric hierarchy with one ghost layer:

    (volume) elements

    faces

    edges

    vertices

    [1] Gmeiner, B., Köstler, H., Stürmer, M. and Rüde, U. (2014): Parallel multigrid on hierarchical hybrid grids: a performance study on current high performance computing clusters. Concurrency and Computation: Practice and Experience, 26(1), pp. 217-240.

  • 6

    Hierarchical Hybrid Grids (HHG)

    Ghost Layer Communication

    Matrix-free implementation: Update via stencil-application
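
    A matrix-free update applies the stencil weights directly to the grid values instead of multiplying by an assembled sparse matrix. The following is a minimal sketch of that idea, not HHG code: it uses a 2D array and a five-point Laplacian as a stand-in for the 3D tetrahedral stencils, and the names `apply_stencil` and `LAPLACE_5PT` are hypothetical. The outermost ring of the array plays the role of the ghost layer, which is read but never written.

```python
import numpy as np

def apply_stencil(u, stencil):
    """Apply a 3x3 stencil to all interior points of u without assembling a matrix.

    The outermost ring of u acts as the ghost layer: it is read but never
    written, so the result is zero there.
    """
    n0, n1 = u.shape
    out = np.zeros_like(u)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            weight = stencil[di + 1, dj + 1]
            if weight != 0.0:
                # Shifted view holding the (di, dj) neighbor of every interior point.
                out[1:-1, 1:-1] += weight * u[1 + di:n0 - 1 + di, 1 + dj:n1 - 1 + dj]
    return out

# Hypothetical example stencil: 2D five-point Laplacian with unit mesh width.
LAPLACE_5PT = np.array([[0.0, -1.0, 0.0],
                        [-1.0, 4.0, -1.0],
                        [0.0, -1.0, 0.0]])

# Grid function u(i, j) = i^2 + j^2 on a 6x6 patch (including the ghost ring).
i, j = np.meshgrid(np.arange(6.0), np.arange(6.0), indexing="ij")
result = apply_stencil(i**2 + j**2, LAPLACE_5PT)
```

    In a parallel setting the ghost ring would be refreshed by communication with the neighboring process before each stencil application.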


  • 7

    HHG Weak Scalability on JuQueen for Stokes

    Nodes     Threads    Grid points    Resolution    Time (A)    Time (B)
    1         30         2.1 · 10^7     32 km         30 s        89 s
    4         240        1.6 · 10^8     16 km         38 s        114 s
    30        1 920      1.3 · 10^9     8 km          40 s        121 s
    240       15 360     1.1 · 10^10    4 km          44 s        133 s
    1 920     122 880    8.5 · 10^10    2 km          48 s        153 s
    15 360    983 040    6.9 · 10^11    1 km          54 s        170 s

    Discretization with prismatic elements

    Regular refinement of each block (non-curved boundaries)

    Spherical refinement of the icosahedral mesh

    Largest computation to date:

    2.76 · 10^12 unknowns

    [1] Gmeiner, B., Rüde, U., Stengel, H., Waluga, C., Wohlmuth, B.: Performance and scalability of hierarchical hybrid multigrid solvers for Stokes systems. SIAM J. Sci. Comput., submitted. 2013.
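
    The run times in the table can be turned into a weak-scaling efficiency, here taken as the time of the smallest run divided by the time of the current run. This is a small sketch using only the numbers from the table; the function name `weak_scaling_efficiency` and the choice of the smallest run as reference are assumptions made for illustration.

```python
# Run times copied from the weak-scaling table (seconds).
threads = [30, 240, 1920, 15360, 122880, 983040]
time_a = [30, 38, 40, 44, 48, 54]
time_b = [89, 114, 121, 133, 153, 170]

def weak_scaling_efficiency(times):
    """Efficiency relative to the smallest run: T(smallest) / T(current)."""
    return [times[0] / t for t in times]

# Print one line per run with the efficiencies of both timing columns.
for n, eff_a, eff_b in zip(threads,
                           weak_scaling_efficiency(time_a),
                           weak_scaling_efficiency(time_b)):
    print(f"{n:>7d} threads: (A) {eff_a:.2f}  (B) {eff_b:.2f}")
```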

  • 8

    Fault Tolerant Multigrid Solver


  • 9

    Fault Tolerant Multigrid Solver

    Why do fault tolerant algorithms become necessary in future HPC clusters?

    Faults become more frequent

    Trend towards core counts > 10^6 → power problem

    Hardware fault safety costs (redundancy)

    Data corruption

    Hardware failure (node crashes, damaged memory pages, uncorrected bit-flips, ...)

    Mean time to failure becomes comparable to or shorter than the checkpoint time

    [1] Cappello, F., Geist, A., Gropp, B., Kale, L., Kramer, B., and Snir, M.: Toward exascale resilience. International Journal of High Performance Computing Applications. 2009.

  • 10

    Fault Tolerant Techniques

    Resulting Strategies [1]:

    Hardware-based fault tolerance (hardware-based error correction, ...)

    Software-based fault tolerance (checkpointing, ...)

    Algorithm-based fault tolerance (resilient subspace correction [2], resilient Krylov space solver [3], ...)

    Goal: Fault Tolerant Multigrid Solver

    [1] Cui, T., Xu, J. and Zhang, C.-S.: An Error-Resilient Redundant Subspace Correction Method. ArXiv e-prints, (2013).

    [2] Xu, J.: Design And Analysis Of Fault-Tolerant Multilevel Iterative Algorithms. Position Paper, ExaMath2013.

    [3] Shantharam, M., Srinivasmurthy, S. and Raghavan, P.: Fault Tolerant Preconditioned Conjugate Gradient For Sparse Linear System Solution. Proceedings of the 26th ACM International Conference on Supercomputing. ACM, pp. 69-78.

  • 11

    Model Scenario

    Test scenario:

    One-to-one relationship: process - coarse grid tetrahedron

    One compute core crashes

    Only interior of a coarse grid tetrahedron is affected by the fault.

    Fault detectable


  • 12

    Faulty Solution Process

    Model problem:

    $-\Delta u = f$ in $\Omega$ + BC

    Fault of one process after 5 V-cycles:

    [Figure: residual (log scale, 1e-16 to 1e+00) versus iterations (0 to 15); curves: No Fault, Fault; the fault is marked in the plot.]

  • 13

    Schematic Recovery Strategy

    [Figure: schematic with the labels Fault and Recovery.]

  • 14

    Subdomain Problem

    Observation: Subdomain problem with Dirichlet data (interfaces redundant on different processes in HHG)

    Recovery:

    V-cycle: Application of one local V-cycle

    W-cycle: Application of one local W-cycle

    F-cycle: Application of one local F-cycle

    Smoothing: Application of 10 GS smoothing steps
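
    The fault and the local recovery can be imitated on a 1D Poisson problem. The sketch below is an illustration under stated assumptions, not the HHG implementation: it uses a 1D grid with 128 intervals, a V(3,3)-cycle with Gauss-Seidel smoothing, a "subdomain" given by the index range from n/4 to n/2, and the hypothetical function names `residual`, `gauss_seidel`, `v_cycle` and `solve`. The fault sets the interior values of the subdomain to zero; the recovery runs local V-cycles on the subdomain with the surviving interface values as Dirichlet data.

```python
import numpy as np

def residual(u, f, h):
    """Residual f - A u of the 1D Poisson problem at the interior points."""
    r = np.zeros_like(u)
    r[1:-1] = f[1:-1] - (2.0 * u[1:-1] - u[:-2] - u[2:]) / h**2
    return r

def gauss_seidel(u, f, h, sweeps):
    """Forward Gauss-Seidel sweeps; u[0] and u[-1] act as Dirichlet data."""
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])

def v_cycle(u, f, h):
    """One V(3,3)-cycle, in place; len(u) - 1 must be a power of two."""
    n = len(u) - 1
    if n == 2:                                   # coarsest grid: one unknown
        u[1] = 0.5 * (u[0] + u[2] + h * h * f[1])
        return u
    gauss_seidel(u, f, h, 3)
    r = residual(u, f, h)
    rc = np.zeros(n // 2 + 1)                    # full-weighting restriction
    rc[1:-1] = 0.25 * r[1:-3:2] + 0.5 * r[2:-2:2] + 0.25 * r[3:-1:2]
    ec = v_cycle(np.zeros_like(rc), rc, 2.0 * h)
    e = np.zeros_like(u)                         # linear interpolation
    e[::2] = ec
    e[1::2] = 0.5 * (ec[:-1] + ec[1:])
    u += e
    gauss_seidel(u, f, h, 3)
    return u

def solve(n=128, cycles=8, fault_after=None, local_cycles=0):
    """Run V-cycles; optionally lose a subdomain interior and recover it locally."""
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)
    f = np.pi**2 * np.sin(np.pi * x)             # exact solution sin(pi x)
    u = np.zeros(n + 1)
    a, b = n // 4, n // 2                        # faulty subdomain [x_a, x_b]
    history = []
    for k in range(cycles):
        if k == fault_after:
            u[a + 1:b] = 0.0                     # fault: interior values are lost
            local = u[a:b + 1].copy()            # interface values survive
            for _ in range(local_cycles):        # local Dirichlet problem
                v_cycle(local, f[a:b + 1], h)
            u[a:b + 1] = local
        v_cycle(u, f, h)
        history.append(np.linalg.norm(residual(u, f, h)))
    return history

no_fault = solve()
fault = solve(fault_after=5)
recovered = solve(fault_after=5, local_cycles=1)
```

    Each returned list holds the residual norm after every global V-cycle, so the three lists can be compared entry by entry in the same way as the residual curves on the slides.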


  • 15

    Recovery Strategies

    Fault of one process after 5 V-cycles and local recovery:

    [Figure: residual (log scale, 1e-16 to 1e+00) versus iterations (0 to 15); curves: No Fault, Fault, 10x Smoothing, 1x Vcycle, 1x Wcycle, 1x Fcycle; fault and recovery are marked in the plot.]

  • 16

    Jumpy Coefficients


  • 17

    Jumpy Coefficients for Darcy

    $-\operatorname{div}(\eta \cdot \nabla u) = 0$ in $(0, 2)^3$, $\eta \ge \eta_0 > 0$ + BC

    [Figure: coefficient distributions Layer (L), Checkerboard 2d (C2), Checkerboard 3d (C3).]

    Multigrid V(3,3)-cycle with GS-smoother convergence rates (16 million DOFs):

    Jump    (L)        (C2)       (C3)
    1.0     0.19405    0.19405    0.19405
    1e1     0.19291    0.24760    0.42382
    1e3     0.19272    0.39717    0.57756
    1e6     0.19302    0.39925    0.57957
    1e9     0.19302    0.39954    0.57957
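
    To see what these rates mean in practice, one can estimate how many V-cycles are needed for a given residual reduction. The sketch below assumes that the tabulated rate is a constant reduction factor per cycle and uses a target reduction of 1e-8, which is a hypothetical choice; the function name `cycles_needed` is likewise only illustrative.

```python
import math

def cycles_needed(rate, reduction=1e-8):
    """Smallest k with rate**k <= reduction, for a constant rate 0 < rate < 1."""
    return math.ceil(math.log(reduction) / math.log(rate))

# Convergence rates from the table for the largest jump (1e9).
rates = {"L": 0.19302, "C2": 0.39954, "C3": 0.57957}
cycles = {name: cycles_needed(rho) for name, rho in rates.items()}
```

    Under these assumptions the layered case needs 12 cycles, the 2D checkerboard 21 and the 3D checkerboard 34, so the rates stay bounded for large jumps but the cost roughly triples between the best and the worst configuration.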


  • 18

    To Symmetrize or Not to Symmetrize

    Multigrid as CG accelerator:

    Hybrid approach:

    Symmetrization within primitives

    But not across primitives

    [Figure: GS edge update with backward and forward sweeps.]

    Smoother types:

    JOR: presmoothing weighted Jacobi smoother (ω = 0.8), postsmoothing weighted Jacobi smoother (ω = 0.8)

    GS (I): presmoothing forward GS smoother, postsmoothing forward GS smoother

    GS (II): presmoothing forward GS on vertex, edge, face and volume, postsmoothing forward GS on vertex, edge, face and backward GS on volume

    GS (III): presmoothing forward GS, postsmoothing backward GS
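
    CG requires a symmetric preconditioner, which is why the ordering of the post-smoother matters. The sketch below illustrates this on a small 1D Laplacian and deliberately leaves out the coarse-grid correction, so it is a simplified stand-in for the multigrid cycle rather than the method used in the talk. It builds the matrix of a pre-sweep followed by a post-sweep and compares forward/forward sweeps (as in GS (I)) with forward/backward sweeps (as in GS (III)); the name `two_sweep_preconditioner` is hypothetical.

```python
import numpy as np

n = 6
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # 1D Laplacian, SPD
M_fwd = np.tril(A)      # forward Gauss-Seidel splitting: D + L
M_bwd = np.triu(A)      # backward Gauss-Seidel splitting: D + U = (D + L)^T

def two_sweep_preconditioner(M_pre, M_post):
    """Matrix B of r -> x with x1 = M_pre^{-1} r and x = x1 + M_post^{-1} (r - A x1)."""
    inv_pre = np.linalg.inv(M_pre)
    inv_post = np.linalg.inv(M_post)
    return inv_pre + inv_post - inv_post @ A @ inv_pre

B_fwd_fwd = two_sweep_preconditioner(M_fwd, M_fwd)   # like GS (I)
B_fwd_bwd = two_sweep_preconditioner(M_fwd, M_bwd)   # like GS (III)

symmetric_fwd_fwd = np.allclose(B_fwd_fwd, B_fwd_fwd.T)
symmetric_fwd_bwd = np.allclose(B_fwd_bwd, B_fwd_bwd.T)
```

    Only the forward/backward combination yields a symmetric operator, because the backward splitting is the transpose of the forward one.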

    [1] Holst, M. and Vandewalle, S.: Schwarz Methods: To Symmetrize or Not to Symmetrize. SIAM Journal on Numerical Analysis, Vol. 34, No. 2, pp. 699-722, 1997.

  • 19

    Jumpy Coefficients for Darcy: CG Acceleration

    Jump: 1e3

    [Figure: residual (log scale, 1e-16 to 1e+00) versus iterations (0 to 20); curves: Vcycle, JOR (ω = 0.8), GS (I), GS (II), GS (III).]

  • 20

    Pressure Correction for the Stokes-System

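    Recall discrete Stokes system in Schur-complement formulation:

    $\begin{pmatrix} A & B^T \\ 0 & BA^{-1}B^T \end{pmatrix} \begin{pmatrix} \mathbf{u} \\ p \end{pmatrix} = \begin{pmatrix} \mathbf{f} \\ BA^{-1}\mathbf{f} \end{pmatrix}$

    Pressure Correction [1]:

    for k = 0, 1, 2, ... (Outer Iteration) do

        Solve: $A\mathbf{u}^{k+1} = \mathbf{f} - B^T p^k$ (geo. MG)

        Solve: $S p^{k+1} = \tilde{f}$ (Inner Iteration: cg-iteration)

        with $S = BA^{-1}B^T$ and $\tilde{f} = BA^{-1}\mathbf{f}$.

    end

    Additional Costs: Each inner iteration requires one application of geo. multigrid for $A^{-1}$ in $S$.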
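
    The cost structure of the Schur-complement solve can be seen in a small dense example. The sketch below is an assumption-laden illustration: a random SPD matrix stands in for the velocity block, `np.linalg.solve` stands in for the geometric multigrid solve with $A$, the right-hand side of the pressure equation is taken as $BA^{-1}\mathbf{f}$ from the Schur-complement system, and the names `cg`, `solve_A` and `apply_S` are hypothetical. Every application of $S$ inside CG triggers one solve with $A$.

```python
import numpy as np

def cg(apply_op, rhs, tol=1e-12, max_iter=200):
    """Plain conjugate gradient for an SPD operator given as a function."""
    x = np.zeros_like(rhs)
    r = rhs - apply_op(x)
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        q = apply_op(p)
        alpha = rs / (p @ q)
        x += alpha * p
        r -= alpha * q
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

rng = np.random.default_rng(0)
n, m = 8, 3                           # velocity and pressure unknowns (toy sizes)
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)           # SPD stand-in for the velocity block
B = rng.standard_normal((m, n))       # stand-in for the discrete divergence
f = rng.standard_normal(n)

def solve_A(v):
    """Stand-in for one geometric multigrid solve with A."""
    return np.linalg.solve(A, v)

def apply_S(q):
    """Schur complement S = B A^{-1} B^T applied to q (one solve with A)."""
    return B @ solve_A(B.T @ q)

p = cg(apply_S, B @ solve_A(f))       # S p = B A^{-1} f
u = solve_A(f - B.T @ p)              # A u = f - B^T p
```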

    [1] Verfürth, R.: A combined conjugate gradient – multi-grid algorithm for the numerical solution of the Stokes problem. IMA Journal of Numerical Analysis (4), pp. 441-455, 1984.

  • 21

    Jumpy Coefficients for the Stokes-System

    [Figure: residual (log scale, 1e-16 to 1e+00) versus outer iterations (0 to 30); legend entries: 1, 10, 1.00E+003.]

  • 22

    Conclusion and Outlook

    Take home:

    Fault tolerant multigrid strategies

    Geometric multigrid for jumpy coefficients

    Future work:

    Extending fault tolerant strategies

    Application of fault tolerant MG to real parallel setting

    Tuning jumpy coefficients for mantle convection

    Study of variable viscosity

    Thank you for your attention!

    Thanks to:

    DFG SPP Exa