The architecture of Deep Learning web services


TRANSCRIPT

Page 1:

The architecture of Deep Learning web services

Page 2:

Self-introduction

• Daisuke Nagao (長尾 太介)
• 2002 – 2015: Fuji Xerox. Developed a powder simulation tool; introduced and built HPC on AWS.
• 2015 – present: NVIDIA, cloud business development.

Page 3:

A learning system requires an HPC architecture.

An inference system requires a Web architecture.

My message (Deep Learning web services):

NVIDIA's deep learning strategy is aimed at large-scale data centers. If you want to build your services on a cloud such as AWS, there are other approaches.

(This is Nagao's personal opinion, though.)

Page 4:

From here: an overview of the Deep Learning environment that NVIDIA provides.

Page 5:

For example

container ship / mite / motor scooter / leopard

Image tagging...

Page 6:

[Diagram: Learning (HPC architecture) trains a neural network structure from big data and labels; the trained network is deployed to Inference (Web architecture), which serves it as an API microservice returning labels such as apple, orange, strawberry, banana. Learning requires many cores and big-data analysis; Inference requires real-time processing.]

Deep Learning web services require two systems.

Learning side:
• CPU and/or GPU load runs at nearly 100% for several days
• Supports multi-node and multi-GPU operation
• Compute nodes are used only while a job is running

Inference side:
• Provides a microservice (sketched below)
• ~250 µs response time
• Waits for incoming requests at all times
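To make the Inference side concrete, here is a minimal Go/net/http sketch of such an always-on microservice: it waits for requests and returns a label for each posted image. The /classify path, the fixed port, and the stub classify function are illustrative assumptions rather than code from the deck; a real service would invoke the trained network (for example through a Caffe or TensorRT binding).

```go
package main

import (
	"encoding/json"
	"io"
	"log"
	"net/http"
)

// classify stands in for the real model call; it returns a fixed label
// so that the sketch stays runnable without a GPU or a trained network.
func classify(image []byte) string {
	return "banana"
}

func classifyHandler(w http.ResponseWriter, r *http.Request) {
	img, err := io.ReadAll(r.Body) // raw image bytes sent by the client
	if err != nil {
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}
	// In production this path is what has to stay in the ~250 µs range
	// mentioned above.
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]string{"label": classify(img)})
}

func main() {
	http.HandleFunc("/classify", classifyHandler)
	log.Fatal(http.ListenAndServe(":8000", nil)) // waits for requests indefinitely
}
```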

Page 7:

[Same Learning / Inference diagram as Page 6.]

GPUs suited for Learning vs. GPUs suited for Inference.

Page 8:

[Same Learning / Inference diagram as Page 6.]

nvidia-docker

• There are many Deep Learning frameworks and versions.
• The trained network needs to be deployed to the inference server.

Page 9:

[Same Learning / Inference diagram as Page 6, with a Docker Registry added between the two sides: the Learning side PUSHes the containerized trained network to the registry, and the Inference side PULLs it to run the microservice.]

nvidia-docker

• There are many Deep Learning frameworks and versions.
• The trained network needs to be deployed to the inference server.

Page 10:

Page 11:

[Same Learning / Inference diagram as Page 6, with the Docker Registry feeding a GPU REST Engine on the Inference side.]

GPU REST Engine is a template, written in Go, for launching the microservice. The template starts a web server on a port number set by the administrator (a usage sketch follows the link below).

https://github.com/NVIDIA/gpu-rest-engine
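As a usage sketch, a small client could call such a microservice like this; the /classify path, the port, and the JSON response shape are carried over from the earlier sketch and are assumptions, not the actual gpu-rest-engine API.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
)

func main() {
	// Read a test image; the file name is only an example.
	img, err := os.ReadFile("motor_scooter.jpg")
	if err != nil {
		log.Fatal(err)
	}

	// Port 8000 stands in for whatever port the administrator configured.
	resp, err := http.Post("http://localhost:8000/classify",
		"application/octet-stream", bytes.NewReader(img))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(body)) // e.g. {"label":"motor scooter"}
}
```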

Page 12:

GPU REST Engine & nvidia-docker: software stack

[Diagram: on the host PC, the NVIDIA CUDA driver and Docker Engine sit above the physical GPUs (GPU0–GPU7). Each Docker container (1–3) runs one application on GPU REST Engine (Golang, net/http) and sees only its own subset of the host's GPUs.]

Page 13:

http://qiita.com/daikumatan/items/2efa5dbbf7276e1e017d

Page 14:

[Same Learning / Inference diagram as Page 6, with the Docker Registry, GPU REST Engine, and a Submit Job Daemon on the Learning side.]

Docker has finally arrived in HPC as well! How do you deploy your apps? Where does the Docker repository go?

Mesos can provide an abstraction of a data center that contains both HPC and web servers.

Page 15:

Deep learning services on AWS: what if the environment is built on AWS alone?

(Currently discussing this with Matsuo-san at AWS Japan.)

Page 16:

Draft architecture

[Same Learning / Inference diagram as Page 6, mapped onto AWS: the Inference (Web) side runs on AWS Elastic Beanstalk behind Amazon API Gateway; the Learning (HPC) side runs on a cfncluster-managed cluster; S3 buckets hold the data and Amazon DynamoDB holds the metadata.]

Docker has finally arrived in HPC as well! How do you deploy your apps? Where does the Docker repository go?

Submit Job Daemon: middleware for HPC on AWS
• Dynamic creation, deletion, and management of the HPC cluster
• Job scheduling

Page 17:

Draft architecture

[Same AWS draft diagram as Page 16.]

If learning is also to be started through the API, configure the API to kick off the job-management software (the Submit Job Daemon); a rough sketch follows below.
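As a rough illustration of that Submit Job Daemon idea, here is a hedged Go outline: an HTTP endpoint, such as one fronted by Amazon API Gateway, accepts a training-job request and hands it to a scheduler that would create or reuse a cfncluster-managed cluster and submit the job. The TrainingJob fields, the /jobs path, and the Scheduler interface are assumptions made for illustration; the real middleware would call AWS and the cluster's job scheduler.

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"
)

// TrainingJob is the metadata the API would persist (for example in
// Amazon DynamoDB) before the job is handed to the HPC side.
type TrainingJob struct {
	Name       string `json:"name"`
	DataBucket string `json:"data_bucket"` // S3 bucket holding the training data
	Labels     string `json:"labels"`
}

// Scheduler stands in for the middleware that drives cfncluster:
// create the cluster if needed, submit the job, tear the cluster down later.
type Scheduler interface {
	Submit(job TrainingJob) error
}

// logScheduler only logs what it would do, keeping the sketch runnable.
type logScheduler struct{}

func (logScheduler) Submit(job TrainingJob) error {
	log.Printf("would launch cfncluster and submit job %q on data in %s", job.Name, job.DataBucket)
	return nil
}

func submitHandler(s Scheduler) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		var job TrainingJob
		if err := json.NewDecoder(r.Body).Decode(&job); err != nil {
			http.Error(w, "invalid job request", http.StatusBadRequest)
			return
		}
		if err := s.Submit(job); err != nil {
			http.Error(w, "could not schedule job", http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusAccepted)
	}
}

func main() {
	http.HandleFunc("/jobs", submitHandler(logScheduler{}))
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```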

Page 18:

[Diagram: Amazon API Gateway in front of a cfncluster-managed cluster; training data, labels, and metadata flow into the cluster.]

cfncluster:
• Takes on the role of launching and terminating instances, and of scheduling
• Because of this, a large number of requests can be handled properly: GPU instances are used only when needed, and only as many as are needed

Page 19:

Summary

• HPC engineers need to understand Web architecture, and Web engineers need to understand HPC architecture.
• NVIDIA is a key player in Deep Learning, but its architecture appears to be aimed at data-center operators, which is only natural for a company that sells GPUs.
• If you put all of your services on AWS, for example, you should build an architecture suited to the environment you actually use.