nvidia korea psg - gist · nvidia korea psg 이주석: jslee@ ... maxwell equation solver ring...


Upload: letram

Post on 14-Apr-2018


Page 1: NVIDIA Korea PSG - GIST · NVIDIA Korea PSG 이주석: jslee@ ... Maxwell equation solver Ring Oscillator ... ANSYS Mechanical Abaqus/Standard MSC Nastran Marc AFEA AMLS, FastFRS

NVIDIA Confidential

NVIDIA Korea PSG

이주석 : [email protected]

http://nvidiakoreapsc.com

Page 2

ENTERPRISE GROUP Visualization, Accelerated Computing & Virtualization

TESLA Accelerating Momentum in HPC and Big Data Analytics

QUADRO Revolutionizing Design & Visualization

GRID Enabling End-to-End Enterprise Virtualization

Page 3

CUDA: World’s Most Pervasive Parallel Programming Model

700+ University Courses

In 62 Countries 14,000 Institutions with CUDA Developers

2,000,000 CUDA Downloads

487,000,000 CUDA GPUs Shipped

Page 4

GPUs Power World’s 10 Greenest Supercomputers

Green500

Rank  MFLOPS/W  Site
1     4,503.17  GSIC Center, Tokyo Tech
2     3,631.86  Cambridge University
3     3,517.84  University of Tsukuba
4     3,185.91  Swiss National Supercomputing (CSCS)
5     3,130.95  ROMEO HPC Center
6     3,068.71  GSIC Center, Tokyo Tech
7     2,702.16  University of Arizona
8     2,629.10  Max-Planck
9     2,629.10  (Financial Institution)
10    2,358.69  CSIRO
37    1,959.90  Intel Endeavor (top Xeon Phi cluster)
49    1,247.57  Météo France (top CPU cluster)
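The efficiency gap in the Green500 list above can be expressed as a simple ratio. The following is a minimal Python sketch; the three MFLOPS/W values are copied from the list, while the function and constant names are illustrative only.

```python
# MFLOPS/W figures copied from the Green500 list on this slide.
TOP_GPU_SYSTEM = 4503.17      # rank 1, GSIC Center, Tokyo Tech
TOP_XEON_PHI_SYSTEM = 1959.90  # rank 37, Intel Endeavor
TOP_CPU_SYSTEM = 1247.57      # rank 49, Meteo France


def efficiency_ratio(system: float, reference: float) -> float:
    """Return how many times more energy-efficient `system` is than `reference`."""
    if reference <= 0:
        raise ValueError("reference efficiency must be positive")
    return system / reference


print(f"vs top CPU cluster:      {efficiency_ratio(TOP_GPU_SYSTEM, TOP_CPU_SYSTEM):.2f}x")
print(f"vs top Xeon Phi cluster: {efficiency_ratio(TOP_GPU_SYSTEM, TOP_XEON_PHI_SYSTEM):.2f}x")
```

On these figures the rank 1 system delivers roughly three and a half times the MFLOPS/W of the top CPU-only cluster.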

Page 5

Developer/Compute HPC/Big Data Graphics Life Science

Oil & Gas Finance Manufacturing Media & Entertainment

Graphics Virtualization Mobile App & Game Development

PC Game Development In-car Infotainment

WHERE ART MEETS SCIENCE MEETS ENGINEERING MEETS BUSINESS

4 Days

500+ Sessions

170+ Research Posters

5 Co-located Summits

48 Countries

3,438 registrants

www.nvidia.com/gtc

March 24-27, 2014 | San Jose, California

Page 6

Accelerated Computing Growing Fast

Hundreds of GPU Accelerated Apps: 113 (2011), 182 (2012), 242 (2013)

Rapid Adoption of Accelerators: % of HPC Customers with Accelerators, 2010-2013 (chart labels: 44%, 77%)

NVIDIA GPU is Accelerator of Choice: NVIDIA GPUs 85%, Intel Phi 4%, Others 11%

Sources: Intersect360 Research HPC User Site Census: Systems, July 2013; IDC HPC End-User MSC Study, 2013

Page 7

Performance gap continues to grow

[Chart: Peak Double Precision FLOPS (GFLOPS), 2008-2014, NVIDIA GPU (GT200, Fermi, K20X, GK210-Duo) vs x86 CPU (Nehalem, Sandy Bridge, Haswell)]

[Chart: Peak Memory Bandwidth (GB/s), 2008-2014, same NVIDIA GPU and x86 CPU generations]

Page 8

Hybrid GPU Solution

Application Code = Accelerator (GPU) + CPU

Accelerator: Only Critical Functions, Parallelize using CUDA Programming Model

CPU: Rest of Sequential CPU Code
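Because only the critical functions move to the GPU while the rest stays sequential on the CPU, the overall gain is bounded by the part left on the CPU. A rough way to reason about this is Amdahl's law. The sketch below is illustrative Python; the 90% fraction and the 20x kernel speedup are hypothetical inputs, not figures from these slides.

```python
def amdahl_speedup(accelerated_fraction: float, kernel_speedup: float) -> float:
    """Overall speedup when only `accelerated_fraction` of the runtime is sped up.

    accelerated_fraction: share of the original runtime spent in the offloaded functions (0..1).
    kernel_speedup: how much faster those functions run on the accelerator.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be in [0, 1]")
    if kernel_speedup <= 0:
        raise ValueError("kernel_speedup must be positive")
    sequential_fraction = 1.0 - accelerated_fraction
    return 1.0 / (sequential_fraction + accelerated_fraction / kernel_speedup)


# Hypothetical example: 90% of the runtime is offloaded and runs 20x faster.
overall = amdahl_speedup(0.90, 20.0)
print(f"overall speedup: {overall:.2f}x")
```

Even with a 20x faster kernel, the remaining 10% of sequential CPU code keeps the overall speedup below 7x in this example, which is why profiling to find the truly critical functions matters.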

Page 9

GPU Acceleration Across All Platforms

x86

POWER ARM64

NVIDIA GPU

NEW

Page 10

Connecting with CPUs via NVLink

2014-2015: Kepler, PCIe - 16 GB/s, x86 | ARM64 | Power8

2016: Pascal, NVLink - 80 GB/s, ARM64 | Power8+

Page 11

Arm + GPU

Page 12

Arm + GPU

SECO Hardware Development Kit

CUDA GPU Tegra ARM CPU

http://www.secoqseven.com/en/item/secocq7-mxm/

Page 13

GPUs Propel 64-Bit ARM into HPC

ARM64

Power Efficiency

System Configurability

Large, Open Ecosystem

GPU

Ultra-Fast Compute Perf

Hundreds of CUDA Apps

Large HPC Ecosystem

GPUs make ARM64 Competitive in HPC from Day One

Page 14

GPU-Accelerated ARM64 Development Platforms Now Available

RM1905D Development Platform

1U Rackmount Server

2x ARM64 CPUs + 2x Tesla K20 GPUs

Page 15

Availability of ARM64 Platforms

Partner & System: CirraScale RM1905D
System Description: 1U Rackmount, 2x ARM64 CPUs + 2x Tesla K20 GPUs
Availability: Order now. Start Shipping: End of July
Contact: Al Lucarelli, [email protected]

Partner & System: E4 Computer Erka
System Description: 3U Rackmount, 2x ARM64 CPUs + 2x Tesla K20 GPUs
Availability: Order now. Start Shipping: End of July
Contact: Piero Altoè, [email protected]

Partner & System: Eurotech Aurora
System Description: High density, Liquid cooled, 8x ARM64 CPUs + 32 Tesla K20 GPUs in 3U space
Availability: Est Availability Q4 2014
Contact: Giovanbattista Mattiussi, [email protected]

Production Server Systems will be available in Q4 2014

Page 16

IBM Partners with NVIDIA to Build Next-Generation Supercomputers

POWER 8 CPU + Tesla GPU

GPU-Accelerated POWER-Based Systems Available in 2014

Page 17

Introducing NVLINK and Stacked Memory

NVLINK
• GPU high speed interconnect
• 80-200 GB/s
• Planned support for POWER CPUs

Stacked Memory
• 4x Higher Bandwidth (~1 TB/s)
• 3x Larger Capacity
• 4x More Energy Efficient per bit

Page 18

Introducing NVLink

• Differential with embedded clock

• PCIe programming model (w/ DMA+)

• Unified Memory

• Cache coherency in Gen2.0

• 5 to 12X PCIe
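The "5 to 12X PCIe" figure can be checked against the bandwidths quoted elsewhere in this deck (PCIe at 16 GB/s, NVLink at 80-200 GB/s). The following is a minimal Python sketch under that assumption; the 8 GB payload is a hypothetical example, and real transfers would also see protocol overhead that this ignores.

```python
PCIE_GB_PER_S = 16.0                 # PCIe figure quoted on the roadmap slide
NVLINK_GB_PER_S_RANGE = (80.0, 200.0)  # NVLink range quoted on the NVLink slide


def bandwidth_ratio(link_gb_per_s: float, baseline_gb_per_s: float = PCIE_GB_PER_S) -> float:
    """Return how many times faster a link is than the baseline."""
    return link_gb_per_s / baseline_gb_per_s


def transfer_seconds(payload_gb: float, link_gb_per_s: float) -> float:
    """Idealized transfer time at peak bandwidth (no latency or protocol overhead)."""
    return payload_gb / link_gb_per_s


low, high = (bandwidth_ratio(bw) for bw in NVLINK_GB_PER_S_RANGE)
print(f"NVLink vs PCIe: {low:.1f}x to {high:.1f}x")

# Hypothetical 8 GB payload moved between CPU and GPU memory.
payload_gb = 8.0
print(f"PCIe:   {transfer_seconds(payload_gb, PCIE_GB_PER_S):.2f} s")
print(f"NVLink: {transfer_seconds(payload_gb, NVLINK_GB_PER_S_RANGE[0]):.2f} s")
```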

Page 19

NVLink Enables Data Transfer At Speed of CPU Memory

[Diagram: CPU with DDR Memory (DDR4, 50-75 GB/s) connected over NVLink (80 GB/s) to TESLA GPU with Stacked Memory (HBM, 1 Terabyte/s)]

Page 20

Unified Memory Dramatically Lowers Developer Effort

Developer View Today: System Memory + GPU Memory

Developer View With Unified Memory: Unified Memory

Page 21

HPC over Cloud

Page 22

Page 23

GeForce GRID

[Diagram: cloud gaming pipeline. SERVER (Render, Capture, Encode; CPU, NIC) -> IP Network -> CLIENT (Decode, Render, Kybd/Mse). Latency labels: GeForce GRID 60 ms (4 Frames); Network 30 ms (2 Frames); GeForce GRID 30-60 ms (2 Frames)]

Page 24

GPU virtualization technologies

[Diagram comparing three approaches, each showing VMs (OS + NVIDIA Driver) on a hypervisor above an NVIDIA GPU:
• Direct-assigned GPU
• API intercept: OS API Intercept in the VM; Translation, Execution, Readback through the NVIDIA Driver
• NVIDIA Virtual GPU: Control Path through the Hypervisor, Fast Path from the VM's NVIDIA Driver to the GPU]

NMOS enabled

Page 25

Physics + CUDA programming + Visualization

Page 26

Page 27

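Design + Aerodynamics Simulation

The harder the air presses the car body down, the greater the friction between the tires and the road. Engine and brake force is then transferred to the road more efficiently, which improves both acceleration and deceleration.

D = 1/2 · ρ · V² · A · Cd, where D = aerodynamic drag, ρ = air density, V = vehicle speed, A = frontal projected area, and Cd = drag coefficient. Making the frontal projected area as small as possible reduces aerodynamic drag.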
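The drag formula on this slide, D = 1/2 · ρ · V² · A · Cd, is easy to evaluate directly. The sketch below is illustrative Python in SI units; the air density, speed, frontal area and drag coefficient are hypothetical values chosen only to show the scaling, not data from the slides.

```python
def drag_force(air_density: float, speed: float, frontal_area: float, drag_coefficient: float) -> float:
    """Aerodynamic drag D = 1/2 * rho * V^2 * A * Cd.

    air_density in kg/m^3, speed in m/s, frontal_area in m^2; result in newtons.
    """
    return 0.5 * air_density * speed ** 2 * frontal_area * drag_coefficient


# Hypothetical car: sea-level air, 30 m/s (108 km/h), 2.2 m^2 frontal area, Cd = 0.30.
base = drag_force(1.225, 30.0, 2.2, 0.30)
smaller_front = drag_force(1.225, 30.0, 2.0, 0.30)
double_speed = drag_force(1.225, 60.0, 2.2, 0.30)

print(f"baseline drag:       {base:.1f} N")
print(f"smaller frontal area: {smaller_front:.1f} N")
print(f"double the speed:    {double_speed:.1f} N")
```

Drag scales linearly with frontal area but with the square of speed, so doubling the speed quadruples the drag in this model.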

Page 28

Growing Requirements Across Diverse Industries

Finance Government Edu/Research Oil and gas Life Sciences Manufacturing

Seismic Processing

Reservoir Sim

Astrophysics

Molecular Dynamics

Weather / Climate

Signal Processing Satellite Imaging Video Analytics

Bio-chemistry

Bio-informatics Material Science

Genomics

Risk Analytics Monte Carlo

Options Pricing Insurance

Structural Mechanics

Computational Fluid Dynamics

Electromagnetics

Page 29

Image Processing

Page 30

What is Machine Vision?

“I’m seeing more and more machine vision companies use GPUs. Heuristic searches and training tasks that might have been impractical on a single processor; you might now be able to do things that weren’t possible, leading to a whole new class of algorithms”

Perry West

President of Automated Vision Systems Inc http://www.machinevisiononline.org/vision-resources-details.cfm?content_id=411

1. Capture images in manufacturing line

2. Process images and make decision on product quality

3. Take action on target product
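The three machine vision steps above (capture, process and decide, act) can be sketched as a tiny pipeline. This is illustrative Python only: the synthetic frame, the brightness threshold and the accept/reject actions are all hypothetical stand-ins, and a production system would run the processing step with a GPU-accelerated library rather than plain Python loops.

```python
from typing import List

Image = List[List[int]]  # grayscale pixel rows, values 0-255


def capture_image() -> Image:
    """Step 1 (stand-in): return a synthetic 4x4 frame with one bright blemish."""
    frame = [[10] * 4 for _ in range(4)]
    frame[2][1] = 250
    return frame


def inspect(frame: Image, threshold: int = 200, max_defect_pixels: int = 0) -> bool:
    """Step 2: pass the part if at most `max_defect_pixels` pixels exceed `threshold`."""
    defect_pixels = sum(1 for row in frame for pixel in row if pixel > threshold)
    return defect_pixels <= max_defect_pixels


def act(passed: bool) -> str:
    """Step 3: map the quality decision to an action on the product."""
    return "accept" if passed else "reject"


decision = act(inspect(capture_image()))
print(decision)  # reject
```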

Page 31

Machine Vision Solutions Available Now

ISV: MVTec
GPU Solution: HALCON 10
Description: Leading Machine Vision ISV with customers worldwide

ISV: Dalsa
GPU Solution: Sapera Nitrous
Description: Another leading Machine Vision ISV with 8-10% market share in fragmented market

Libraries to build custom solution

Library: CUDA Vision Workbench
GPU Solution: Computer Vision Workbench
Description: Application used primarily for demonstration, benchmarking and development of vision primitives implemented in CUDA

Library: NVIDIA Library
GPU Solution: NVIDIA Performance Primitives
Description: Library of functions for performing CUDA accelerated processing, with focus on imaging and video processing

Page 32

MVTec Halcon 10: 10x - 30x Faster

Speed-up: Tesla C2050 GPU vs Quad-core Intel Nehalem

Page 33

Defect analysis: a 1-hour job completed in 7 minutes

CT Scan & Reconstruction of Solder Ball Failure using CUDA (courtesy North Star Imaging)

Reconstruction time:
Dual Core2 Duo: 1 hr 20 mins
1 GPU: 18 minutes
2 GPUs: 7 minutes

xViewCT with 1536x1920 X-ray detector

1.2B voxels, 8GB raw data set

Case study of BGA ball defect analysis using GPUs: uses accelerated image-processing algorithms on X-ray or camera images
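The speedup behind this solder ball CT case can be worked out from the three reconstruction times on the slide. The following is a minimal Python sketch; it assumes the times pair with the configurations in the order they appear (2 GPUs: 7 minutes, 1 GPU: 18 minutes, Dual Core2 Duo: 1 hr 20 mins), and the helper name is illustrative.

```python
# Reconstruction times in minutes, as read from the slide (pairing is an assumption).
RECON_MINUTES = {
    "Dual Core2 Duo": 80,  # 1 hr 20 mins
    "1 GPU": 18,
    "2 GPUs": 7,
}


def speedups(minutes: dict, baseline: str) -> dict:
    """Return each configuration's speedup relative to the baseline configuration."""
    base = minutes[baseline]
    return {name: base / t for name, t in minutes.items()}


for name, factor in speedups(RECON_MINUTES, "Dual Core2 Duo").items():
    print(f"{name}: {factor:.1f}x")
```

Under that pairing, one GPU is a little over 4x faster than the CPU baseline and two GPUs are a little over 11x faster.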

Page 34

Target Industries for Machine Vision

Product Quality

Operation Efficiency

Return on Investment

Higher Revenue

Textile

Steel Security Semiconductors

Food Electronic Manuf. Paper

Flat Panel

Page 35

Intelligent Video Surveillance

Facial Recognition

Video and Imagery Search and Analysis

Computer Vision

Video Enhancement

Signal Processing

10x-100x Faster

Page 36

3D rendering through CT image reconstruction

Page 37

Enabling New Computation Solutions

[Diagram: GPU block with 192 CUDA Cores, Shared Mem, L1 TEX Cache, Texture Engine, L2 Cache, Tessellation Engine, Primitive Engine]

Applications: Face Recognition, Head Tracking, Object Recognition, Gesture Recognition, 3D Reconstruction, Augmented Reality

Perfect architecture for parallel algorithms

Page 38

Signal Processing

Page 39

QuadroPlex #1: Card #1&2 (SLI), Output #1: 0.1

QuadroPlex #2: Card #1, Output #1: 0.1

QuadroPlex #2: Card #2, Output #1: 0.2

Geological and oil-field exploration: shock waves are used to analyze geological and seabed structure with CUDA, and QuadroPlex renders the result as 8K high-resolution imagery

Shock Wave Simulation + Visualization

Page 40

Electromagnetics Simulation + Design

FDTD Acceleration using GPUs (Source: Acceleware)

Cell Phone Model Simulation, simulation size: 80 Mcells

Speed (Mcells/s):
Intel Xeon (2.6 GHz): 9.9 Mcells/s
4 GPUs (Tesla 8-series): 500.0 Mcells/s

FDTD Solvers
• Acceleware
• EM Photonics

Ongoing work
• Maxwell equation solver
• Ring Oscillator (FDTD)
• Particle beam dynamics simulator
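The FDTD throughput figures on this slide translate directly into a speedup and a time per sweep over the grid. The sketch below is illustrative Python; it assumes "Mcells/s" means millions of grid cells updated per second, so one pass over the 80 Mcell model takes size divided by rate. The function name is a hypothetical helper.

```python
SIM_SIZE_MCELLS = 80.0   # cell phone model size from the slide
CPU_RATE_MCELLS_S = 9.9   # Intel Xeon (2.6 GHz)
GPU_RATE_MCELLS_S = 500.0  # 4 GPUs (Tesla 8-series)


def seconds_per_sweep(size_mcells: float, rate_mcells_per_s: float) -> float:
    """Time to update every cell in the grid once at the given throughput."""
    if rate_mcells_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_mcells / rate_mcells_per_s


speedup = GPU_RATE_MCELLS_S / CPU_RATE_MCELLS_S
print(f"throughput speedup: {speedup:.1f}x")
print(f"CPU sweep: {seconds_per_sweep(SIM_SIZE_MCELLS, CPU_RATE_MCELLS_S):.2f} s")
print(f"GPU sweep: {seconds_per_sweep(SIM_SIZE_MCELLS, GPU_RATE_MCELLS_S):.2f} s")
```

An FDTD run repeats this sweep for every time step, so the per-sweep gap of roughly 50x carries through to the whole simulation under this assumption.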

Page 41

simulation

Page 42

Visualization, Satellite Image Analysis

GPU Use Cases in CWO

3.5-km GEOS-5 Simulated Clouds

dx =2km

dx =1km

Reality

NWP Acceleration

Page 43

ASUCA and NWP Achievement: 145 TFLOPS

ASUCA and NWP Simulation on Tsubame 2.0, TiTech Supercomputer:

Dr. Takayuki Aoki, GSIC, Tokyo Institute of Technology, Tokyo Japan

Tsubame 2.0 Tokyo Institute of Technology

1.19 Petaflops

4,224 Tesla M2050 GPUs

3990 Tesla M2050s

145.0 Tflops SP

76.1 Tflops DP

Page 44

Available Today

Product in 2013

Product Evaluation

Research Evaluation

GPU Status Structural Mechanics Fluid Dynamics Electromagnetics

ANSYS Mechanical

Abaqus/Standard

MSC Nastran

Marc

AFEA

AMLS, FastFRS

NX Nastran

HyperWorks OptiStruct

PAM-CRASH implicit

LS-DYNA implicit

RecurDyn

Adventure Cluster

ANSYS CFD (FLUENT)

Moldflow

Culises (OpenFOAM)

Particleworks

SpeedIT (OpenFOAM)

AcuSolve

Abaqus/CFD

LS-DYNA CFD

CFD++

FloEFD

STAR-CCM+

XFlow

LS-DYNA

Abaqus/Explicit

RADIOSS

PAM-CRASH

EMPro

CST MWS

XFdtd

SEMCAD X

FEKO

Nexxim

JMAG

CFD-ACE+

GPU Progress – Commercial CAE Software

Xpatch

HFSS

SCSK

Page 45

ANSYS Mechanical 14.5 GPU Acceleration

Results for Distributed ANSYS 14.5 with 8-Core CPUs and single GPUs

ANSYS Mechanical Number of Jobs Per Day (Higher is Better):
Westmere, Xeon X5690 3.47 GHz 8 Cores: 164 CPU Only, 341 CPU + GPU (Tesla C2075); C2075 = 2.1x Acceleration
Sandy Bridge, Xeon E5-2687W 3.10 GHz 8 Cores: 210 CPU Only, 395 CPU + GPU (Tesla K20); K20 = 1.9x Acceleration

V14sp-5 Model
• Turbine geometry
• 2,100,000 DOF
• SOLID187 FEs
• Static, nonlinear
• One iteration (final solution requires 25)
• Distributed ANSYS 14.5
• Direct sparse solver

Results from Supermicro X9DR3-F, 64GB memory
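The two acceleration factors quoted for ANSYS Mechanical 14.5 follow from the jobs-per-day values on the chart. The following is a minimal Python sketch; the pairing of the four chart values (164 and 341 for Westmere with the C2075, 210 and 395 for Sandy Bridge with the K20) is inferred from the quoted factors rather than stated in the transcript.

```python
# (CPU only, CPU + GPU) jobs per day; the pairing is an inference from the chart labels.
JOBS_PER_DAY = {
    "Westmere X5690 + Tesla C2075": (164, 341),
    "Sandy Bridge E5-2687W + Tesla K20": (210, 395),
}


def acceleration(cpu_only: float, cpu_plus_gpu: float) -> float:
    """Throughput gain from adding the GPU, as a ratio of jobs per day."""
    return cpu_plus_gpu / cpu_only


for system, (cpu_only, cpu_plus_gpu) in JOBS_PER_DAY.items():
    print(f"{system}: {acceleration(cpu_only, cpu_plus_gpu):.1f}x")
```

Rounded to one decimal place, these ratios reproduce the 2.1x and 1.9x figures shown on the slide.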

Page 46

Multi-GPU Acceleration of 16-Core ANSYS Fluent 15.0 (Preview) External Aero

ANSYS Fluent Solver Times for Sedan – 4 GPUs

16-Core Server Node: 8-Cores + 8-Cores, with GPUs G1 G2 G3 G4

Xeon E5-2667 + 4 x Tesla K20X GPUs

2.9X Solver Speedup (CPU Configuration vs CPU + GPU Configuration)

• 3.6 M Mixed cells
• Steady, k-e turbulence
• Coupled PBNS, DP
• AMG F-cycle on CPU
• AMG V-cycle on GPU

Page 47

MSC Nastran 2013 and GPU Performance: SMP + GPU acceleration of SOL101 and SOL103

Speedup over serial (Higher is Better):
Configuration  SOL101, 2.4M rows, 42K front  SOL103, 2.6M rows, 18K front
serial         1X                            1X
4c             2.7X                          1.9X
4c+1g          6X                            2.8X

Server node: Sandy Bridge E5-2670 (2.6GHz), Tesla K20X GPU, 128 GB memory

Lanczos solver (SOL 103)
• Sparse matrix factorization
• Iterate on a block of vectors (solve)
• Orthogonalization of vectors

Page 48

Rolls Royce: Abaqus 3.5x Speedup with 5M DOF

[Chart: Elapsed Time in seconds and Speed up relative to 8 core (1x) for 8c, 8c + 1g, 8c + 2g, 16c and 16c + 2g; surviving speedup labels: 2.42x, 2.11x]

Sandy Bridge + Tesla K20X Single Server

Server with 2x E5-2670, 2.6GHz CPUs, 128GB memory, 2x Tesla K20X, Linux RHEL 6.2, Abaqus/Standard 6.12-2

• 4.71M DOF (equations); ~77 TFLOPs
• Nonlinear Static (6 Steps)
• Direct Sparse solver, 100GB memory

Page 49

Bio Informatics

Page 50

Computation: 3rd Pillar of Scientific Research

Experimental Description of natural phenomena

Experimental methods and quantification

Theoretical Formulation of Newton’s laws, Maxwell’s equations …

Computational Simulation of complex phenomena

Data Distributed communities unifying theory, experiment and simulation with massive data sets from multiple sources and disciplines

1,000 years ago Last 500 years Last 50 years Today


Page 51

Computer graphics require billions to trillions of parallel computations per second.

Page 52

Scientific simulations can require quadrillions of parallel computations per second.

Page 53

Gene Sequencing

Sequence Analysis

Molecular Modeling

Diagnostic Imaging

GPUs Accelerate Life Sciences Pipeline

Page 54

BGI (Beijing) Crunches Through Genomics Data Deluge with GPUs

Petabytes of data

Equal 15,000 human genomes / year

Understand disease treatments

Study how individuals respond to bacteria, virus, drugs

Personalized Medicine

Page 55

A key path to drug discovery is determining the similarity of one molecule to another. OpenEye software uses Tesla GPUs to accelerate the process, enabling millions of molecules to be compared in seconds, rather than hours or days.

Page 56

UCSD team uses Tesla GPUs for CT scans. Reduces radiation dosage by up to 70 times.

Up to 28,000 Americans each year develop cancer due to radiation from CT scans

Page 57

Drug Discovery Process in “Wet Labs”

Synthesize new Chemical Compounds

Testing for Efficacy, Side Effects, Safety

Clinical Trials

FDA Approval Process

Robot-assisted screening

High Throughput Screening

Millions of Compounds

1000s of Drug Leads

Trial and Error

~5 years

Page 58

Computation-based Drug Discovery

Synthesize new Chemical Compounds

Testing for Efficacy, Side Effects, Safety

Clinical Trials

FDA Approval Process

Robot-assisted screening

High Throughput Screening

Millions of Compounds

1000s of Compounds

Virtual Screening: Check if compounds bind to target proteins

Computational Chemistry: Synthesize compounds based on similarity

Lead Optimization: Modify chemicals to improve efficacy

Page 59

Example of using Computational Methods

Inputs: 878 FDA-Approved Drugs, 2,787 Pharmaceutical Compounds, 246 Targets (Proteins, etc)

From MDDR database

Similarity Ensemble Approach (SEA): 6,928 Similar Pairs

Remove known associations: 3,832 Remaining Predictions

Tested 30 Predictions

23 New Drug-Target Associations

Confirmed one in animal

Predicting new molecular targets for known drugs, Keiser et al, Nature, 2009

Page 60

Why throughput capacity matters

2008 TeraGrid Usage By Discipline
• Molecular Biosciences 29%
• Physics 21%
• Chemistry 13%
• Astronomical Sciences 12%
• Materials Research 6%
• Advanced Scientific Computing 5%
• Earth Sciences 5%
• Chemical, Thermal Systems 4%
• All 15 Others 3%
• Atmospheric Sciences 2%

2008 TeraGrid Usage By Discipline, with Molecular Biosciences reduced
• Excess Capacity 25%
• Physics 21%
• Chemistry 13%
• Astronomical Sciences 12%
• Materials Research 6%
• Advanced Scientific Computing 5%
• Earth Sciences 5%
• Molecular Biosciences 4%
• Chemical, Thermal Systems 4%
• All 15 Others 3%
• Atmospheric Sciences 2%

What’s the value of adding 25% capacity?

Equivalent to reducing Molecular Bioscience usage by 7x
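The arithmetic behind the 25% capacity claim can be checked from the 2008 TeraGrid shares on this slide. The sketch below is illustrative Python; the percentages are copied from the pie charts, and the helper function is a hypothetical name introduced for this example.

```python
# 2008 TeraGrid usage shares in percent, as shown on the slide.
USAGE_2008 = {
    "Molecular Biosciences": 29,
    "Physics": 21,
    "Chemistry": 13,
    "Astronomical Sciences": 12,
    "Materials Research": 6,
    "Advanced Scientific Computing": 5,
    "Earth Sciences": 5,
    "Chemical, Thermal Systems": 4,
    "All 15 Others": 3,
    "Atmospheric Sciences": 2,
}


def freed_capacity(usage: dict, discipline: str, speedup: float) -> float:
    """Percentage points of capacity freed if one discipline's work runs `speedup` times faster."""
    share = usage[discipline]
    return share - share / speedup


freed = freed_capacity(USAGE_2008, "Molecular Biosciences", 7.0)
print(f"capacity freed by a 7x speedup: {freed:.1f} percentage points")
```

A 7x reduction takes the 29% share down to roughly 4%, which frees about 25 percentage points and matches the second pie chart.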

Page 61

Big Data

Page 62

Big Data ?

Big Data Market Size, 2015 ($ Billion)
Big Data Compute: 6.7
Enterprise Search: 2.4

Source: Wikibon and Frost & Sullivan

Page 63

Big Data Market Size, segment (‘13-’17), $Billion

Segment          2013  2014  2015  2016  2017
Compute             4     5     7     8     8
Application         2     3     5     6     7
Everything Else    13    19    26    30    33

Source: Wikibon, Wikibon.org

Note: For data related other segments, go to Appendix for reference

Page 64

Analyzing Big Data through accelerated pattern matching

Analyzing Twitter

Searching Audio (Shazam)

Image-based Search

Real-time Video Delivery

Page 65

“Now You Can Build Google’s $1M Artificial Brain on the Cheap” -Wired

Artificial Neural Network at a Fraction of the Cost with GPUs

GOOGLE BRAIN: 1,000 CPU Servers; 2,000 CPUs • 16,000 cores; 600 kWatts; $5,000,000

STANFORD AI LAB: 3 GPU-Accelerated Servers; 12 GPUs • 18,432 cores; 4 kWatts; $33,000

Fast Growing GTC topics
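The "fraction of the cost" claim can be quantified from the two configurations on this slide. The following is a minimal Python sketch; the figures are copied from the slide, while the dictionary keys and the helper name are illustrative.

```python
# Figures as quoted on the slide for the two neural network setups.
GOOGLE_BRAIN = {"servers": 1000, "cores": 16000, "kilowatts": 600, "cost_usd": 5_000_000}
STANFORD_AI_LAB = {"servers": 3, "cores": 18432, "kilowatts": 4, "cost_usd": 33_000}


def cpu_to_gpu_ratio(metric: str) -> float:
    """How many times larger the CPU cluster's value is than the GPU setup's for one metric."""
    return GOOGLE_BRAIN[metric] / STANFORD_AI_LAB[metric]


for metric in ("servers", "kilowatts", "cost_usd"):
    print(f"{metric}: {cpu_to_gpu_ratio(metric):.0f}x lower with GPUs")
```

Power and cost both come out at roughly 150x lower for the GPU-accelerated servers, while the core count is slightly higher.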

Page 66

Hadoop Framework

Sample Customers: NSA, JPMC, Chevron, Facebook, MTV Network, eBay

Customer Applications: Machine Learning, Search, Data Mining, SQL, Graph Analytics, Scientific Computing, Indexing

Tools and Applications: Mahout, Hive, Solr & Lucene, Giraph, Hama, Storm

Basic Platform: HDFS, MapReduce

Page 67

Key Algorithms for Applications

Scientific Computing

Matrix Multiplication

Giraph

N-Body Simulation

Data Warehouse Graphic Analytics

Page Rank

Hama Hive

Gzip

Bzip2

Mahout

Naïve Bayes Classifier

Fuzzy K-Means

Machine Learning

Recommenders

K-Means

Canopy

Decision Forests

Linear Regression

Frequent Itemset Mining

Collocations

Solr & Lucene

Similarity Score

String Match

Word Count

Search

Apriori

Bellman-Ford

Depth-first Search

Sparse Matrix-Vector Multiplication

Snappy

Not Fit with GPU Not Sure Computing Intensive

Page View

Rank/Count

Inverted Index

Relational Algebra

Nearest neighbor

Shared connections

Personalization-based Popularity

Priority-queue based traversals

Page 68

Growing Requirements Across Diverse Industries

Finance Government Edu/Research Oil and gas Life Sciences Manufacturing

Seismic Processing

Reservoir Sim

Astrophysics

Molecular Dynamics

Weather / Climate

Signal Processing Satellite Imaging Video Analytics

Bio-chemistry

Bio-informatics Material Science

Genomics

Risk Analytics Monte Carlo

Options Pricing Insurance

Structural Mechanics

Computational Fluid Dynamics

Electromagnetics

Page 69

KISTI-NVIDIA Joint Laboratory

Education & Training center

OpenACC & CUDA

Projects

GPU optimized ISVs

Future Architecture

Expand HPC users to Industry in MV & ML

Page 70

CUDA everywhere

2007 2008 2009 2010 2011 2012 2013 2014

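CUDA tour

CUDA workshop

CUDA contest

Seoul National University

Yonsei University

Korea University

Kyungpook National University

KAIST

GIST

POSTECH

Yonsei University

KAIST, Korea University

Dong-Eui University

KAIST

GIST

POSTECH

Round Table meeting @Yangjae

Seoul National University

Korea University

Kyungpook National University

KAIST

GIST

Pukyong National University

POSTECH

Kyungpook National University

Inje University

UNIST

GIST

Hanyang University

University of Seoul

Chungnam National University

GIST

Kyungpook National University

Tongmyong University

Gangchon, Anmyeondo, Deoksan, Seoul, Gonjiam

CUDA trainings

KISTI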

http://nvidiakoreapsc.com

Page 71

Thank you