CosmoGrid: simulating the universe across the globe
TRANSCRIPT
project participants: Simon Portegies Zwart (PI), Derek Groen, Stefan Harfst, Steven Rieder (Leiden Observatory); Jun Makino, Tomoaki Ishiyama, Keigo Nitadori, Kei Hiraki, Mary Inaba (Center for Computational Astrophysics, University of Tokyo); Steve McMillan, Enrico Vesperini, Otonyo Mangete (Drexel University); Cees de Laat, Paolo Grosso (University of Amsterdam) …
project funding: NCF, DEISA, NWO
project support: SARA, SURFnet, NetherLight (the Netherlands); StarLight (United States); CANARIE (Canada); T-LEX, KDDNET, FUMI, NOC (Japan)
high-resolution simulation
is there a bias in LCDM simulations that use re-simulation?
future of supercomputing is in distributed computing
challenging computer science problem
(Hoekstra et al., 2008)
(Ishiyama et al., 2009)
massively parallel TreePM code (force split sketched below)
highly optimized
load-balancing
force calculation with PhantomGRAPE
(Ishiyama et al., 2009)
(Nitadori et al., 2006)
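As a rough illustration of the TreePM idea (a sketch of the standard erf/erfc splitting, not GreeM's actual code; the scale r_s and all names here are assumptions):

```python
import numpy as np
from scipy.special import erf, erfc

def potential_split(r, r_s=1.0):
    """Split the 1/r potential into a long-range PM part and a
    short-range tree part via the usual erf/erfc decomposition."""
    x = r / (2.0 * r_s)
    phi_long = erf(x) / r    # smooth part, solved on the mesh with FFTs
    phi_short = erfc(x) / r  # steep part, handled by the tree with a cutoff
    return phi_long, phi_short

r = np.linspace(0.1, 10.0, 50)
phi_l, phi_s = potential_split(r)
assert np.allclose(phi_l + phi_s, 1.0 / r)  # the two parts reassemble 1/r
```

Beyond a few r_s the short-range part is negligible, so the tree walk only needs nearby particles; that locality is what makes the distributed decomposition below workable.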
● each supercomputer calculates only a part of the universe
● periodic exchange of boundary layers and mesh data (sketched below)
● long-distance communications using
– the MPWide communication library
– a reserved 10 Gbit lightpath
● can be extended to more than two supercomputers
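A minimal sketch of the per-step boundary exchange under a slab decomposition (a plain blocking channel stands in for MPWide; all names here are hypothetical):

```python
import numpy as np

def exchange_ghost_layers(slab, channel, g=1):
    """Swap ghost layers with the peer supercomputer once per step.

    `slab` is this site's part of the mesh with g ghost planes at each
    end; `channel` is assumed to provide blocking send(bytes) and a
    recv(nbytes) that returns exactly nbytes (e.g. a socket wrapper
    over the lightpath).
    """
    nbytes = slab[:g].nbytes
    channel.send(slab[g:2*g].tobytes())    # our lower edge -> peer
    channel.send(slab[-2*g:-g].tobytes())  # our upper edge -> peer
    shape = slab[:g].shape
    slab[-g:] = np.frombuffer(channel.recv(nbytes), slab.dtype).reshape(shape)
    slab[:g] = np.frombuffer(channel.recv(nbytes), slab.dtype).reshape(shape)
```

Both sides send before receiving; a production code would overlap the two directions (and the force calculation) with threads rather than block like this.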
● MPWide is a library designed for message passing over long-range networks
● features:
– used to set up communication between otherwise independent programs
– MPI-like user interface
– multi-threaded communication with parallel TCP streams (illustrated below)
– supports custom settings for each communication channel (e.g. buffer sizes and parallelism)
– supports restarts of member programs at run-time
● other uses: file transfers, user-space port forwarding
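The heart of the parallel-stream feature can be sketched like this (illustrative only, not MPWide's actual API; see Groen et al. 2010 for that):

```python
import socket, threading

def send_parallel(data: bytes, endpoints):
    """Stripe one message over several TCP streams, one thread each.

    `endpoints` is a list of (host, port) pairs. Over long fat networks
    a single TCP window ramps up slowly, so several independent streams
    usually give much better aggregate throughput.
    """
    n = len(endpoints)
    chunk = (len(data) + n - 1) // n

    def worker(i, host, port):
        with socket.create_connection((host, port)) as s:
            s.sendall(data[i * chunk:(i + 1) * chunk])

    threads = [threading.Thread(target=worker, args=(i, h, p))
               for i, (h, p) in enumerate(endpoints)]
    for t in threads: t.start()
    for t in threads: t.join()
```

MPWide additionally lets buffer sizes and the number of streams be tuned per channel, which matters when different channels cross very different networks.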
(Groen et al., 2010)
the actual network goes around the “wrong” side of the planet
• three times the distance
• latency about a third of a second, limited by the speed of light (see the estimate below)
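A back-of-the-envelope check of that figure (the path length is an assumed round number):

```python
# Amsterdam-Tokyo is ~9,000 km great circle; the lightpath routes the
# long way around, roughly three times that. Light in fiber travels at
# about 2/3 of c, so the round trip lands near a third of a second.
path_km = 27_000        # assumed one-way path length, the "wrong" way round
v_fiber_km_s = 2.0e5    # ~2/3 of the vacuum speed of light
rtt_s = 2 * path_km / v_fiber_km_s
print(f"round-trip latency ~ {rtt_s:.2f} s")  # ~0.27 s
```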
standard LCDM parameters:
box size (30 Mpc)³
2048³ particles
softening 175 pc
mass resolution ~10⁵ Msun (see the check below)
requirements:
~4 million CPU hours
~1 Tbyte memory
~110 Tbyte data storage
1.5 Gbyte data transfer/step
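The mass resolution follows from the box and particle count; a quick consistency check with assumed WMAP-era parameters (Omega_m and h are not read off the slide):

```python
# particle mass = mean matter density x box volume / number of particles
rho_crit = 2.775e11           # critical density in Msun / Mpc^3 per h^2
omega_m, h = 0.26, 0.72       # assumed cosmological parameters
rho_m = omega_m * rho_crit * h**2     # mean matter density, Msun / Mpc^3
m_p = rho_m * 30.0**3 / 2048**3       # (30 Mpc)^3 box, 2048^3 particles
print(f"particle mass ~ {m_p:.1e} Msun")   # ~1.2e5, i.e. ~10^5 Msun
```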
gridification of the code:
solved for two supercomputers
now generalizing to any number (see the sketch below)
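One natural generalization, sketched under the assumption of a 1-D slab decomposition of the periodic volume (the actual scheme may differ):

```python
def ring_neighbours(site: int, n_sites: int):
    """Sites holding the slabs adjacent to `site` when the periodic box
    is cut into n_sites slabs, one per supercomputer; each site only
    exchanges boundary layers with these two."""
    return (site - 1) % n_sites, (site + 1) % n_sites

# with two sites this reduces to the solved case: the peer is both
# the lower and the upper neighbour
assert ring_neighbours(0, 2) == (1, 1)
```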
network
– getting it up and running
– reliability
scheduling
– run-time queuing arrangements on different supercomputers
don’t laugh, it works :)
production simulation ongoing (z ≈ 1.5)
data transfer/reduction started
data will be published on the web page: http://www.2048x2048x2048.org/
future runs planned on many supercomputers: increased resolution and/or box size
high-z clump formation
use expertise gained with the Amsterdam-Tokyo run on existing grids (e.g. DEISA)