Vlasov Code Simulations: A Tutorial
Takayuki Umeda, Institute for Space-Earth Environmental Research, Nagoya University, [email protected]
In memory of Prof. Maha Ashour-Abdalla
March 22, 2006 @ Westin Miyako Kyoto
March 2, 2007 @ ISSS-8, Hilton Kauai, Hawaii
Background
• Geospace plasma: accessible to direct (in-situ) observation by spacecraft (cf. astrophysical plasma: observed only through its radiation).
• Various codes for space simulations: MHD, Hall-MHD, multi-fluid, hybrid-PIC/Vlasov, full-PIC/Vlasov, etc.
  – Fluid dynamics versus particle kinetics
  – Cross-scale coupling between them
• PC-cluster-type supercomputers have become standard in High-Performance Computing.
Global 3D MHD
(Figure by T. Ogino: global 3D MHD simulation, 900 x 600 x 600 grids (~80 GB), Δ = 0.1 RE ~ 640 km, box size 30 RE ~ 200,000 km.)
• Macro (>100,000 km): global magnetosphere
• Meso (10,000–1,000 km): local boundary layers, e.g. bow shock (discontinuity in flows), K-H instability (velocity shear layer), magnetic reconnection (current layer)
• Micro (100 km–10 m): wave-particle interactions
Space plasma simulations
• Meso-scale processes at local boundary layers (e.g., shocks, reconnection, K-H vortices) can be reproduced in macro-scale MHD simulations of the global magnetosphere when the spatial resolution is high enough.
• At much higher spatial resolution, non-MHD (kinetic) effects become important.
• Kinetic simulations are needed for micro-scale wave-particle interactions.
• Meso- and macro-scale kinetic simulations require huge computational resources.
Basic Equations
• MHD Eq. (single-fluid approximation): Hannes Alfvén [1942] (Nobel Prize in Physics 1970)
  Mass conservation:
    m ∂N/∂t + ∇·(mNU) = 0
  Momentum conservation:
    ∂(mNU)/∂t + ∇·(mNUU) + ∇P = J × B
  Energy conservation:
    ∂(mNU² + DP)/∂t + ∇·[(mNU² + DP)U] + 2∇·(PU) = 2E·J
• Vlasov Eq. (collisionless Boltzmann Eq.): Anatoly Vlasov [1938]
  Mass conservation in phase space: the first principle
    ∂f/∂t + v·∂f/∂x + (q/m)(E + v × B)·∂f/∂v = 0
Kinetic codes: PIC vs. Vlasov
Particle-In-Cell
• Lagrangian + Eulerian
• Difficult to parallelize
• High numerical noise
  – Self-consistent thermal fluctuations are enhanced
• Well-developed schemes
• Lower computational cost
• Tracks each particle
Vlasov
• Eulerian
• Easy to parallelize
• Low numerical noise
  – Thermal fluctuations imposed only as initial noise
• Schemes still under development
• Huge computational cost
• Follows the evolution of the velocity distribution function
History of Vlasov Simulations
• 1D (1x1v phase-space) Vlasov-Poisson problems
  – Late 1960's – present. => hands-on
  Nx × Nv = 100×100 ~ 100 KB
• "Reduced 2D" problems [e.g., Newman]
  – Guiding center: f(x, y, v||), with U⊥ = (E × B)/|B|²
• "2D" problems (21st century)
  – Drift kinetic, gyro kinetic: f(x, y, v||, v⊥)
  – 2x2v problems (e.g., K-H inst.): f(x, y, v⊥1, v⊥2)
  Nx×Ny×Nvx×Nvy = 100^4 ~ 4 GB
• 2.5D (2x3v phase-space) problems
  – Late 2000's – present.
  Nx×Ny×Nvx×Nvy×Nvz = 100^5 ~ 400 GB!
• Reduced 3D (3x2v) problems
  – Gyro kinetic: f(x, y, z, v||, v⊥)
  – Fusion plasma [e.g., Idomura, Watanabe]
• 3D (3x3v phase-space) problems
  – Difficult to run on present supercomputers: 100^6 ~ 40 TB!
History of Vlasov Simulations
Spec. of Personal WSs
Dual-CPU systems with large memory are important for the development of Vlasov codes.
• 2004 - 32bit-Linux, Xeon (P4: NetBurst – 2 cores), 2 GB (DDR 512MB×4) mem. ~ $3,000
• 2007 - 64bit-Linux, Xeon (Core – 4×2 cores), 8 GB (DDR2 2GB×4) mem. ~ $6,000
• 2010 - Xeon (Westmere – 6×2 cores), 48 GB (DDR3 4GB×12) mem. ~ $9,000
• 2012 - Xeon (Sandy Bridge – 8×2 cores), 128 GB (DDR3 8GB×16) mem. ~ $9,000
• 2016 - Xeon (Broadwell – 18×2 cores), 512 GB (DDR4 32GB×16) mem. ~ $18,000
National computers in Japan
• Earth Simulator (2002/03–2009/03) ©JAMSTEC/NEC
  – NEC SX-6 vector processor, 8 CPUs: 64 GFlops/node
  – 16 GB memory/node
  – 640 nodes, crossbar network
  – 40.96 TFlops peak
• K computer (2011/11–) ©RIKEN/Fujitsu
  – SPARC64 VIII, 8 cores: 128 GFlops/node
  – 16 GB memory/node
  – 82,944 nodes, Tofu network
  – 10.62 PFlops peak
• Post-K (2020?–)
  – A64FX, 48 cores: 2.7 TFlops/node
  – 32 GB/node
  – >370,000 nodes, Tofu network
Supercomputers in 2018
• More than 95% of the supercomputers in the world use x86_64 processors.
  – 110 systems use accelerators/coprocessors.
• Typical shared (single-node) memory: 16–64 GB ⇒ memory per single core: ≤1 GB/core.
• Total number of cores: >> 10,000, but memory per node does not increase…
⇒ Need to develop highly scalable parallel codes for large-scale simulations.
June 2018 data @ www.top500.org
Numerical Schemes for Vlasov
• Time stepping
• Interpolation
• Parallelization
Overview of Numerical Techniques for 1x1v (ES) code
Vlasov equation (1x1v, electrostatic):
  ∂f/∂t + vx ∂f/∂x + (q/m) Ex ∂f/∂vx = 0
"Splitting" Scheme [Cheng & Knorr, 1976]: split into two advection equations,
  ∂f/∂t + vx ∂f/∂x = 0
  ∂f/∂t + (q/m) Ex ∂f/∂vx = 0
Valid when vx and Ex are constant in time and space; use this as an approximation for small Δt and Δx.
Splitting Scheme
Splitting scheme for the 1x1v (1D) code:
• Shift in configuration space with Δt/2.
• Shift in velocity space by E with Δt.
• Shift in configuration space with Δt/2.
  f'(xi, vx,j) = f^t(xi − vx,j Δt/2, vx,j)
  f''(xi, vx,j) = f'(xi, vx,j − (q/m) Ex(xi) Δt)
  f^{t+Δt}(xi, vx,j) = f''(xi − vx,j Δt/2, vx,j)
Shifting of profiles with a "numerical interpolation."
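As an illustration of the three shifts above, here is a minimal sketch in Python/NumPy of one splitting step on a periodic 1x1v grid. It assumes linear interpolation for the shifts and, for simplicity, treats the velocity grid as periodic (acceptable only when f vanishes near the velocity boundaries); the function names and the interpolation order are illustrative, not the scheme used in a production code.

```python
import numpy as np

def shift_periodic(f, shift, dx):
    """Semi-Lagrangian shift of f along its last axis by 'shift' (physical units),
    using linear interpolation on a uniform periodic grid of spacing dx.
    'shift' is broadcast against the leading axes of f."""
    n = f.shape[-1]
    x = np.arange(n) * dx
    xd = (x - np.asarray(shift)[..., None]) % (n * dx)   # departure points
    idx = np.floor(xd / dx)
    w = xd / dx - idx                                    # linear weights
    i0 = idx.astype(int) % n
    i1 = (i0 + 1) % n
    return (1.0 - w) * np.take_along_axis(f, i0, axis=-1) \
           + w * np.take_along_axis(f, i1, axis=-1)

def split_step(f, Ex, vx, dx, dv, dt, q_over_m):
    """One Cheng & Knorr (1976) splitting step for f[j, i] = f(x_i, v_j):
    x-shift by v*dt/2, v-shift by (q/m)*E*dt, x-shift by v*dt/2."""
    f = shift_periodic(f, vx * 0.5 * dt, dx)             # shift in x with dt/2
    f = shift_periodic(f.T, q_over_m * Ex * dt, dv).T    # shift in vx by E with dt
    f = shift_periodic(f, vx * 0.5 * dt, dx)             # shift in x with dt/2
    return f
```

In a full 1x1v electrostatic loop, Ex would be recomputed from the charge density (e.g., by solving the Poisson equation) between the configuration-space shift and the velocity-space shift.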
Splitting Scheme [Cheng & Knorr, 1976]
(Figure: three animation frames on the (x, vx) phase-space grid, cells (i, j) to (i+1, j+1) with the field Ex, illustrating step by step how the updated value of f is obtained from the shifted split equations above.)
Semi-Lagrangian
1D advection equation:
  ∂f/∂t + V ∂f/∂x = 0
General solution: f ≡ f(x − Vt), so
  f(xi, t+Δt) = f(xi − VΔt, t)
Shift of profiles with a numerical interpolation.
(Figure: the profile f(x, t) is shifted by VΔt to give f(x, t+Δt).)
• The numerical integration in time becomes a numerical interpolation in space: O(Δx^n) = O(Δt^n).
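A minimal sketch of this idea for the 1D advection equation, assuming a periodic domain and NumPy's built-in linear interpolation (np.interp); the higher-order interpolation schemes discussed next improve the accuracy of the same shift.

```python
import numpy as np

def advect_1d(f, V, dx, dt):
    """One semi-Lagrangian step for df/dt + V df/dx = 0 on a periodic grid:
    f(x, t+dt) = f(x - V*dt, t), evaluated by interpolation at the departure points."""
    n = f.size
    L = n * dx
    x = np.arange(n) * dx
    xd = (x - V * dt) % L                               # departure points
    # append the periodic image of the first sample so that xd in [x[-1], L)
    # is interpolated correctly
    return np.interp(xd, np.append(x, L), np.append(f, f[0]))

# Example: advect a Gaussian pulse. The scheme remains stable even for
# CFL numbers above 1, since the time integration has been replaced by
# an interpolation in space (accuracy still depends on dt).
nx, dx, V, dt = 200, 0.1, 1.0, 0.25
x = np.arange(nx) * dx
f = np.exp(-((x - 10.0) / 1.0) ** 2)
for _ in range(100):
    f = advect_1d(f, V, dx, dt)
```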
Interpolation schemes
• Spline-type schemes
  – Traditional (e.g., Cheng & Knorr 1976-); still used (B-spline, etc., e.g., Shoucri); see the sketch below.
  – Global interpolation.
• Fourier (spectral) schemes
  – Good for periodic and continuous profiles.
  – Global interpolation.
• Multi-data schemes (CIP, MMA, etc.)
  – Accurate, but computationally expensive.
• Conservative schemes
  – Good for discontinuities by using a "limiter."
  – Mostly local interpolation.
  – Numerically dissipative; need to be higher-order…
=> hands-on
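For instance, a spline-type (global) interpolation can be dropped into the same semi-Lagrangian shift. The sketch below uses SciPy's periodic cubic spline as a stand-in for the B-spline interpolation mentioned above; it illustrates the idea, and is not the specific scheme of Cheng & Knorr or Shoucri.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def advect_1d_spline(f, V, dx, dt):
    """Semi-Lagrangian shift using a global periodic cubic spline instead of
    linear interpolation: more accurate, but every grid point contributes to
    the spline (hence a 'global interpolation')."""
    n = f.size
    L = n * dx
    x = np.arange(n) * dx
    # a periodic spline requires the first and last samples to coincide
    cs = CubicSpline(np.append(x, L), np.append(f, f[0]), bc_type='periodic')
    xd = (x - V * dt) % L                    # departure points
    return cs(xd)
```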
PIC vs. Vlasov with CIP
• Two-stream instability
• PIC (Nx=200, Np=100,000×Nx): no strong oscillations.
• Vlasov with CIP (Nx=200, Nv=100): background color changes (negative density); strong oscillations ⇒ physics break down!
CIP vs. Non-oscillatory
• Two-stream instability
• Vlasov with CIP (Nx=200, Nv=100): background color changes (negative density); strong oscillations ⇒ physics break down!
• Vlasov with the fifth-order, conservative, non-oscillatory scheme (Nx=200, Nv=100): no strong oscillations, but diffusive (needs more accuracy).
Concepts of Numerical Interpolation Schemes
• Memory saving (important for >3D)
  × Multi-data scheme (e.g., CIP, MMA)
  × Higher-order time integration (e.g., Runge-Kutta)
  ⇒ Semi-Lagrangian scheme
• Suitability for plasma physics
  ○ Conservative
  ○ Non-oscillatory
  ○ Positivity
  × TVD scheme
=> hands-on
Conservative Schemes
• Update in flux form:
  f_i^{t+Δt} = f_i^t − (U_{i+1/2} − U_{i−1/2})
  ∂/∂t Σ_i f_i = −(U_{Nx+1/2} − U_{1/2})
The total value is controlled by the flux through the boundaries (U_{1/2} = U_{Nx+1/2} in periodic systems).
• 1x1v system:
  f^{t+Δt}(x_i, v_j) = f^t(x_i, v_j) − [U_x(x_{i+1/2}, v_j) − U_x(x_{i−1/2}, v_j)]
  f^{t+Δt}(x_i, v_j) = f^t(x_i, v_j) − [U_v(x_i, v_{j+1/2}) − U_v(x_i, v_{j−1/2})]
  Summing the x-update over v_j gives
  ρ^{t+Δt}(x_i) = ρ^t(x_i) − [J_x(x_{i+1/2}) − J_x(x_{i−1/2})]
The charge-continuity equation is automatically satisfied.
Cf. non-"conservative" scheme: f_i^{t+Δt} = f_i^t + δf_i
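A minimal sketch of the flux-form update above for 1D advection on a periodic grid, using a first-order upwind flux purely for illustration (the schemes discussed in this tutorial use higher-order fluxes with limiters). Because cells exchange content only through the face fluxes U_{i±1/2}, the sum of f over the grid is conserved to round-off.

```python
import numpy as np

def conservative_step(f, v, dx, dt):
    """Flux-form update f_i <- f_i - (U_{i+1/2} - U_{i-1/2}) on a periodic grid.
    U_{i+1/2} is a first-order upwind flux; |v|*dt/dx <= 1 is assumed here."""
    c = v * dt / dx                      # CFL number
    if v >= 0.0:
        U = c * f                        # U_{i+1/2} takes the upwind cell i
    else:
        U = c * np.roll(f, -1)           # upwind cell is i+1
    return f - (U - np.roll(U, 1))       # U_{i+1/2} - U_{i-1/2}
```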
Flux limiters
• Conservative schemes are generally dissipative. ⇒ Higher-order schemes are needed.
• Overshoots/undershoots in velocity space become sources of spurious wave excitation.
  ⇒ Need to suppress "numerical oscillations" (spurious extrema) in velocity space.
  ⇒ Need to preserve positivity.
(Figure: f(v) profiles with and without a non-oscillatory scheme.)
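As one common way to meet these requirements, the sketch below adds a minmod-limited slope to the first-order flux above (a generic MUSCL-type limiter, not the specific limiter of Umeda et al.): the limited slope vanishes at local extrema, so no spurious extrema are created and a non-negative profile stays non-negative.

```python
import numpy as np

def minmod(a, b):
    """Minmod limiter: zero at local extrema, otherwise the smaller of the two slopes."""
    return np.where(a * b > 0.0, np.sign(a) * np.minimum(np.abs(a), np.abs(b)), 0.0)

def limited_step(f, v, dx, dt):
    """Second-order flux-form advection step (assumes v >= 0 and 0 <= c <= 1,
    periodic grid) with a minmod-limited slope: conservative, non-oscillatory,
    and positivity-preserving."""
    c = v * dt / dx
    slope = minmod(f - np.roll(f, 1), np.roll(f, -1) - f)  # limited slope in cell i
    face = f + 0.5 * (1.0 - c) * slope                     # reconstructed value at x_{i+1/2}
    U = c * face                                           # flux through x_{i+1/2}
    return f - (U - np.roll(U, 1))
```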
Linear Advection Test
• 3rd-order: Umeda, EPS 2008
• 4th-order: Umeda et al., CPC 2012
• 5th-order: Umeda et al., unpublished
5th-degree Lagrangian interpolation + conservative and non-oscillatory limiter; local extrema are detected with 5 points.
• Interpolation [e.g., Umeda et al. 2012]: 5th-order non-oscillatory, positive, and conservative semi-Lagrangian scheme
• Advection [Umeda et al. 2009]: multi-dimensional unsplitting advection scheme
Vlasov equation with electromagnetic fields:
  ∂f_s/∂t + v·∂f_s/∂r + (q_s/m_s)(E + v × B)·∂f_s/∂v = 0
split into advection and rotation parts:
  ∂f_s/∂t + v·∂f_s/∂r = 0                      (advection in r)
  ∂f_s/∂t + (q_s/m_s) E·∂f_s/∂v = 0             (advection in v)
  ∂f_s/∂t + (q_s/m_s)(v × B)·∂f_s/∂v = 0        (rotation in v)
• Rotation: [Schmitz & Grauer 2006], gyro-kinetics; back-substitution method, similar to the Boris rotation scheme on Cartesian grids.
Operator Splitting for EM Vlasov in hyper dimensions
• Shift f in configuration space with Δt/2
• Compute the current density with charge conservation
• Solve the electromagnetic fields by (implicit) FDTD
• Shift f in velocity space by the E-field with Δt/2
• Rotate f in velocity space by the B-field with Δt
• Shift f in velocity space by the E-field with Δt/2
• Shift f in configuration space with Δt/2
• The time chart is the same as that of an explicit Particle-In-Cell code.
• The Boris scheme is essential for solving the E × B drift.
• Conservative schemes are important for charge conservation.
Cf. Boris pusher
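For reference, the velocity rotation used in the Boris pusher of PIC codes is sketched below; in a Vlasov code the corresponding rotation is applied to the velocity-space coordinates of f (e.g., via the back-substitution method of Schmitz & Grauer), which this single-particle sketch does not implement.

```python
import numpy as np

def boris_rotate(v, B, q_over_m, dt):
    """Rotation half of the Boris pusher: rotate velocity v about B over dt.
    The two half E-kicks are applied separately, mirroring the
    E-shift / B-rotation splitting of the Vlasov time step."""
    t = 0.5 * q_over_m * dt * np.asarray(B, dtype=float)   # t = tan(theta/2) * b_hat
    s = 2.0 * t / (1.0 + np.dot(t, t))
    v_prime = v + np.cross(v, t)
    return v + np.cross(v_prime, s)
```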
Operator Splitting for EM Vlasov in hyper dimensions
Computational Resources
• 1x1v: Nx × Nv = 1000×1000 ~ 10 MB
  – Suitable for laptop computing.
• 2x2v: Nx×Ny×Nvx×Nvy = 100^4 ~ 4 GB
• 2x3v: Nx×Ny×Nvx×Nvy×Nvz = 100^5 ~ 400 GB
Large gaps! High-performance and parallel computing techniques are essential for hyper-dimensional simulations.
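A quick back-of-the-envelope estimate of these sizes (a sketch only: 8 bytes per value, with a hypothetical n_arrays factor standing in for species and work buffers, which is why the totals quoted above are a few times larger than a single array):

```python
def phase_space_gb(grid, n_arrays=2, bytes_per_value=8):
    """Memory (GB) for n_arrays copies of a distribution function on the given
    phase-space grid, assuming double precision."""
    cells = 1
    for n in grid:
        cells *= n
    return cells * n_arrays * bytes_per_value / 1024**3

print(phase_space_gb([1000, 1000], n_arrays=1))               # 1x1v: ~0.007 GB
print(phase_space_gb([100, 100, 100, 100], n_arrays=2))       # 2x2v: ~1.5 GB
print(phase_space_gb([100, 100, 100, 100, 100], n_arrays=2))  # 2x3v: ~149 GB
```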
FX1/K/FX10/FX100 Performance Evaluation
(Figure: performance comparison of FX1@Nagoya, FX1@JAXA, K, FX10@Tokyo, FX10@Kyushu, and FX100@Nagoya.)
Applications
"Local" simulations:
• Magnetic reconnection: 2x3v
• Kelvin-Helmholtz instability: 2x2v
• Rayleigh-Taylor instability: 2x2v
"Global" simulation:
• Interaction between the solar wind and a small magnetosphere: 2x3v
Comparison between low- and high-resolution runs
(Figure: numerical oscillations and wave generation; the "Debye sheath" is not solved correctly in the low-resolution run.)
Summary
• There are several "key" numerical schemes that save computational memory in Vlasov simulations:
  – Semi-Lagrangian time integration
  – Higher-order conservative schemes
  – Time stepping same as in PIC
• PC-cluster-type supercomputers are now common.
  ⇒ Parallel computing is essential for hyper-dimensional Vlasov simulations.
• High-performance computing techniques will be important for large-scale simulations.