Page 1: MCS Vision

Pioneering Science and Technology
Office of Science, U.S. Department of Energy

MCS Vision

Petascale Computing

Grid Computing

Computational Science and Engineering

• Increase by several orders of magnitude the computing power that can be applied to individual scientific problems, thus enabling progress in understanding complex physical and biological systems.

• Interconnect the world’s most important scientific databases, computing systems, instruments and facilities to improve scientific productivity and remove barriers to collaboration.

• Make high-end computing a core tool for challenging modeling, simulation and analysis problems.

Page 2: MCS Vision

MCS Products/Resources

• Enabling technologies: “middleware,” “tools,” and support applications

• Scientific applications

• Hardware

• Other fundamental CS research

Page 3: MCS Vision

Enabling Technologies

• Globus Toolkit: software infrastructure and standards for Grid computing

• MPICH: our freely available implementation of MPI (a minimal MPI example follows this list)

• Jumpshot: software for analyzing message-passing behavior

• Parallel netCDF (PnetCDF): high-performance parallel I/O library

• PETSc: toolkit for the parallel solution of sparse linear and nonlinear systems

• Visualization (“Futures Lab”): scalable parallel visualization software and large-scale displays

• Access Grid: collaboration environment
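To make the middleware concrete, here is a minimal sketch of the kind of message-passing program that MPICH runs and Jumpshot can analyze; the ring-exchange pattern, file name, and launch commands are illustrative assumptions, not taken from the slides.

```c
/* ring.c -- minimal MPI example (illustrative only).
 * Each rank sends its rank number to the next rank in a ring
 * and reports what it received.  Build and run with MPICH, e.g.:
 *   mpicc ring.c -o ring && mpiexec -n 4 ./ring
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, recv_val;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int next = (rank + 1) % size;          /* neighbor to send to */
    int prev = (rank - 1 + size) % size;   /* neighbor to receive from */

    /* Exchange in a ring; MPI_Sendrecv avoids deadlock. */
    MPI_Sendrecv(&rank, 1, MPI_INT, next, 0,
                 &recv_val, 1, MPI_INT, prev, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    printf("rank %d of %d received %d from rank %d\n",
           rank, size, recv_val, prev);

    MPI_Finalize();
    return 0;
}
```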

Page 4: MCS Vision

Collaboration Technology – the Access Grid

• Multi-way meetings and conferences over the Internet

• Using high-quality video/audio technology

• Large format display: 200+ installations worldwide

• Easily replicated configurations, open source software

• www.accessgrid.org

Page 5: MCS Vision

The Grid Links People with Distributed Resources on a National Scale

[Diagram: the Grid linking people with distributed resources, including sensors, on a national scale.]

Page 6: MCS Vision

Some key scientific applications

• Flash: community code for general astrophysical phenomena; ASCI project with the University of Chicago

• Nek5: biological fluids

• pNeo: neocortex simulations for the study of epileptic seizures

• QMC: Monte Carlo simulations of atomic nuclei

• Nuclear reactor simulations

Page 7: MCS Vision

Hardware

Chiba City – software scalability R&D. Addresses scalability issues in system software, open source software, and applications code. 512 CPUs, 256 nodes, Myrinet, 2 TB storage, Linux. DOE OASCR funded; installed in 1999.

Jazz – Linux cluster for ANL applications. Supports and enhances the ANL application community, with 50+ projects from a spectrum of science and engineering divisions. 350 CPUs, Myrinet, 20 TB storage. ANL funded; installed in 2002; achieved 1.1 TF sustained.

Blue Gene prototype – coming soon: a two-rack system scalable to twenty racks.

Page 8: MCS Vision

Other HPC areas

• Architecture and Performance Evaluation

• Programming Models and Languages

• Systems Software

• Numerical Methods and Optimization

• Software components

• Software Verification

• Automatic Differentiation (a small illustration follows this list)
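The last item in this list, automatic differentiation, is easy to illustrate in miniature. The sketch below is a generic forward-mode AD example using dual numbers; it illustrates the technique only and is not one of the division's AD tools, and the function f is made up for illustration.

```c
/* Forward-mode automatic differentiation with dual numbers
 * (generic illustration).  A dual number carries a value and its
 * derivative; arithmetic on duals applies the chain rule
 * operation by operation.
 */
#include <stdio.h>
#include <math.h>

typedef struct { double val, dot; } dual;   /* value and derivative */

static dual d_add(dual a, dual b) {
    return (dual){ a.val + b.val, a.dot + b.dot };
}
static dual d_mul(dual a, dual b) {          /* product rule */
    return (dual){ a.val * b.val, a.dot * b.val + a.val * b.dot };
}
static dual d_sin(dual a) {                  /* chain rule for sin */
    return (dual){ sin(a.val), cos(a.val) * a.dot };
}

/* Example function: f(x) = x*x + sin(x), so f'(x) = 2x + cos(x). */
static dual f(dual x) { return d_add(d_mul(x, x), d_sin(x)); }

int main(void) {
    dual x = { 1.5, 1.0 };                   /* seed dx/dx = 1 */
    dual y = f(x);
    printf("f(1.5)  = %f\n", y.val);
    printf("f'(1.5) = %f (exact: %f)\n", y.dot, 2.0 * 1.5 + cos(1.5));
    return 0;
}
```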

Page 9: MCS Vision

I-WIRE Impact

Two concrete examples of the impact of I-WIRE:

• Flash, Blue Gene, and the NLCF

• TeraGrid

Page 10: MCS Vision

I-WIRE (Illinois Wired/Wireless Infrastructure for Research and Education)

[Network diagram: I-WIRE sites, including UIUC/NCSA, the Starlight international optical network hub (NU-Chicago), Argonne National Laboratory, the University of Chicago (main campus and Gleacher Center), IIT, UIC, the James R. Thompson Center (Illinois Century Network), a commercial fiber hub, and the 40 Gb/s Distributed Terascale Facility network.]

• State-funded dark-fiber optical infrastructure to support networking and applications research
- $11.5M total funding: $6.5M in FY00-03, $5M in process for FY04-05
- Application driven: Access Grid (telepresence and media); TeraGrid (computational and data grids)
- New technologies proving ground: optical network technologies; middleware and computer science research

• Deliverables
- A flexible infrastructure to support advanced applications and networking research

Page 11: MCS Vision

Status: I-WIRE Geography

[Map: I-WIRE fiber routes in the Chicago area (along the I-55, I-90/94, I-290, and I-294 corridors), linking UIC, IIT, Northwestern/Starlight (NU-C), the University of Chicago (main campus and Gleacher Center), the Illinois Century Network (ICN), a commercial fiber hub, Argonne National Laboratory (approx. 25 miles SW), and UIUC/NCSA in Urbana (approx. 140 miles south).]

Page 12: MCS Vision

TeraGrid Vision: A Unified National HPC Infrastructure that is Persistent and Reliable

• Largest NSF compute resources

• Largest DOE instrument (SNS)

• Fastest network

• Massive storage

• Visualization instruments

• Science Gateways

• Community databases

E.g., geosciences: four data collections, including high-resolution CT scans, global telemetry data, worldwide hydrology data, and regional LIDAR terrain data

Page 13: MCS Vision

Resources and Services (33 TF compute, 1.1 PB disk, 12 PB tape)

Sites: UC/ANL, Caltech, IU, NCSA, ORNL, PSC, Purdue, SDSC, TACC

Computational resources:
- UC/ANL: Itanium2 (0.5 TF), IA-32 (0.5 TF)
- Caltech: Itanium2 (0.8 TF)
- IU: Itanium/IA-32 (2.1 TF), Power4+ (1 TF)
- NCSA: Itanium2 (10 TF)
- PSC: TCS (6 TF), Marvel (0.3 TF)
- Purdue: heterogeneous (1.7 TF)
- SDSC: Itanium2 (4 TF), Power4+ (1.1 TF)
- TACC: IA-32 (5.2 TF), Sun (visualization)

Online storage: 20 TB, 170 TB, 6 TB, 230 TB, 150 TB, 540 TB, and 50 TB across seven of the sites

Archival storage: 150 TB, 1.5 PB, 2.4 PB, 6 PB, and 2 PB across five of the sites

Networking (Gbps to hub): 30 Gbps to Chicago (UC/ANL, NCSA, PSC); 30 Gbps to Los Angeles (Caltech, SDSC); 10 Gbps to Chicago (IU, Purdue, TACC); 10 Gbps to Atlanta (ORNL)

Databases and data collections at five sites; instruments at two sites; visualization at five sites; dissemination at seven sites.
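As a quick arithmetic check on the headline totals, the short sketch below (added for illustration; the per-site grouping follows the reading of the table given above) sums the listed compute, disk, and tape figures.

```c
/* Sum the TeraGrid resource figures listed above to check the
 * headline totals (~33 TF compute, ~1.1 PB disk, ~12 PB tape). */
#include <stdio.h>

int main(void) {
    double tflops[]  = { 0.5, 0.5, 0.8, 2.1, 1.0, 10.0, 6.0, 0.3, 1.7, 4.0, 1.1, 5.2 };
    double disk_tb[] = { 20, 170, 6, 230, 150, 540, 50 };
    double tape_pb[] = { 0.15, 1.5, 2.4, 6.0, 2.0 };

    double tf = 0, tb = 0, pb = 0;
    for (size_t i = 0; i < sizeof tflops  / sizeof *tflops;  i++) tf += tflops[i];
    for (size_t i = 0; i < sizeof disk_tb / sizeof *disk_tb; i++) tb += disk_tb[i];
    for (size_t i = 0; i < sizeof tape_pb / sizeof *tape_pb; i++) pb += tape_pb[i];

    printf("compute: %.1f TF\n", tf);         /* ~33 TF              */
    printf("disk:    %.2f PB\n", tb / 1000);  /* ~1.17 PB (~1.1 PB)  */
    printf("tape:    %.2f PB\n", pb);         /* ~12 PB              */
    return 0;
}
```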

Page 14: MCS Vision

Current TeraGrid Network

[Map: TeraGrid sites SDSC, TACC, Caltech, NCSA, Purdue, IU, UC/ANL, PSC, and ORNL connected through network hubs in Los Angeles, Chicago, and Atlanta.]

Resources: compute, data, instruments, science gateways

Page 15: MCS Vision

Flash

• Flash Project

- Community Astrophysics code

- DOE-funded ASCI program at UC/Argonne

- $4 million per year over ten years

- Currently in 7th year

• Flash Code/Framework

- Heavy emphasis on software engineering, performance, and usability

- 500+ downloads

- Active user community

- Runs on all major HPC platforms

- Public automated testing facility

- Extensive user documentation

Page 16: MCS Vision

Flash – Simulating Astrophysical Processes

Processes simulated with Flash include:

• Cellular detonation
• Compressed turbulence
• Helium burning on neutron stars
• Richtmyer-Meshkov instability
• Laser-driven shock instabilities
• Nova outbursts on white dwarfs
• Rayleigh-Taylor instability
• Flame-vortex interactions
• Gravitational collapse / Jeans instability
• Wave breaking on white dwarfs
• Shortly: relativistic accretion onto neutron stars
• Orszag-Tang MHD vortex
• Type Ia supernova
• Intracluster interactions
• Magnetic Rayleigh-Taylor

Page 17: MCS Vision

How has a fast network helped Flash?

• Flash in production for five years

• Generating terabytes of data

• Currently done “by hand”
- Data transferred locally from supercomputing centers for visualization/analysis
- Data remotely visualized at UC using Argonne servers
- Can harness data storage across several sites

• Not just “visionary” grid ideas are useful; immediate, “mundane” things matter as well!

Page 18: MCS Vision

Buoyed Progress in HPC

• FLASH is the flagship application for BG/L
- Currently being run on 4K processors at Watson
- Will run on 16K processors in several months

• Argonne partnership with Oak Ridge for the National Leadership Class Computing Facility
- Non-classified computing
- Blue Gene at Argonne
- X1 and Black Widow at ORNL
- Application focus groups apply for time

Page 19: MCS Vision

Petaflops Hardware is Just Around the Corner

[Chart: growth in computing performance from teraflops toward petaflops, with a “we are here” marker between the two.]

Page 20: MCS Vision

Diverse Architectures for Petaflop Systems

✓ IBM – Blue Gene
- Puts processors + cache + network interfaces on the same chip
- Achieves high packaging density and low power consumption

✓ Cray – Red Storm and X1
- 10K-processor (40 TF) Red Storm at SNL
- 1K-processor (20 TF) X1 at ORNL

• Emerging
- Field-programmable gate arrays
- Processor in memory
- Streams …

✓ = systems slated for the DOE National Leadership Computing Facility

Page 21: MCS Vision

NLCF Target Application Areas

SciDAC Fusion

SciDAC Astrophysics

Genomes to Life

Nanophase Materials

SciDAC Climate

SciDAC Chemistry

Page 22: MCS Vision

The Blue Gene Consortium: Goals

• Provide new capabilities to selected applications partnerships.

• Provide functional requirements for a petaflop/s version of Blue Gene.

• Build a community around a new class of architecture.
- Thirty university and lab partners
- About ten hardware partners and about twenty software collaborators

• Develop a new, sustainable model of partnership.
- “Research product” bypassing the normal “productization” process and costs
- Community-based support model (hub and spoke)

• Engage (or re-engage) computer science researchers with high-performance computing architecture.
- Broad community access to hardware systems
- Scalable operating system research and novel software research

• Partnership of DOE, NSF, NIH, NNSA, and IBM will work on computer science, computational science, and architecture development.

• Kickoff meeting was April 27, 2004, in Chicago.

Page 23: MCS Vision

Determining application fit

• How will applications map onto different petaflop architectures?

[Diagram: applications mapped onto architecture families – massively parallel, fine-grain systems (BG/L, BG/P, BG/Q; our focus), vector/parallel systems (X1, Black Widow), and clusters (Red Storm) – with open questions marked “?”.]

Page 24: MCS Vision

Application Analysis Process

[Diagram: iterative application analysis workflow]

• A priori algorithmic performance analysis
- May be based on data from conventional systems or BGSim
- Extrapolate conclusions; model 0th-order behavior

• Scalability analysis (Jumpshot, FPMPI)
- Look for inauspicious scalability bottlenecks
- Detailed message-passing statistics on BG hardware
- Identifies 0th-order scaling problems

• Performance analysis (PAPI, hpmlib)
- Use of the memory hierarchy, use of the pipeline, instruction mix, bandwidth vs. latency, etc.
- Drives detailed architectural adjustments

• Validate the model on BG/L hardware; refine the model, tune the code, and decide whether to rewrite algorithms
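To give a concrete sense of how the message-passing statistics used in this process are gathered, below is a minimal sketch of the standard MPI profiling (PMPI) interface on which tools in the FPMPI/Jumpshot family are built; the counters and the report printed at MPI_Finalize are illustrative assumptions, not the output of any particular tool, and the const-qualified MPI_Send signature assumes a recent MPI implementation.

```c
/* Minimal sketch of the MPI profiling (PMPI) interface, the mechanism
 * message-passing profilers build on.  Linking this file ahead of the
 * MPI library intercepts MPI_Send, counts calls and bytes, and reports
 * per-rank totals at MPI_Finalize.  Counter names are illustrative.
 */
#include <mpi.h>
#include <stdio.h>

static long long send_calls = 0;
static long long send_bytes = 0;

/* Intercepted MPI_Send: record statistics, then call the real send. */
int MPI_Send(const void *buf, int count, MPI_Datatype type,
             int dest, int tag, MPI_Comm comm)
{
    int type_size;
    MPI_Type_size(type, &type_size);
    send_calls++;
    send_bytes += (long long)count * type_size;
    return PMPI_Send(buf, count, type, dest, tag, comm);
}

/* Intercepted MPI_Finalize: print per-rank totals before shutdown. */
int MPI_Finalize(void)
{
    int rank;
    PMPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("rank %d: %lld MPI_Send calls, %lld bytes sent\n",
           rank, send_calls, send_bytes);
    return PMPI_Finalize();
}
```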

Page 25: MCS Vision

High performance resources operate in complex and highly distributed environments

• Argonne co-founded the Grid
- Establishing a persistent, standards-based infrastructure and application interfaces that enable high-performance access to computation and data
- ANL created the Global Grid Forum, an international standards body
- We lead development of the Globus Toolkit

• ANL staff are PIs and key contributors in many grid technology development and application projects
- High-performance data transport, Grid security, virtual organization management, Open Grid Services Architecture, …
- Access Grid: group-to-group collaboration via large multimedia displays