brightics deep learning 플랫폼 · 2019-07-22 · 22 model architecture define model & input...

36
Brightics Deep Learning 플랫폼

Upload: others

Post on 28-Jan-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

Brightics Deep Learning 플랫폼

Page 2: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

1.Data 준비할때고생좀덜하면좋겠다

2. 모델선정시행착오를 줄이면좋겠다

3. GPU들을 공유하면 좋겠다 (Jupyter를그냥띄워놓은 상태에서)

1

2

3

Page 3: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

1

Page 4: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

[ 학습 데이터 부족 ] [ 불균형 학습 데이터 ]

데이터 수량

A B C A B C A B C A B C

필요 최소 데이터 기준

결함 구분

데이터 확보 필요 데이터 확보 필요

Page 5: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

Generator(CVAE-GAN)

ClassifierX1

ClassifierX

Step-1. Generator & Classifier training

Step-2. Training sample generation

Step-3. Second classifier training & Evaluation test

X2

Generated SamplesWith High Confidence

ClassifierX & X2

GeneratedSamples

X : Real Data for Training

X1 : Generated Samples

X2 : High Confidence X1

Xtest : Real Data for Test

Y1 & Y2 : Classification Result

C-1

Xtest Compare Y1 & Y2

C-1

C-2

C-2

C-1

XGenerator

(CVAE-GAN)

Y1

Y2

X

Page 6: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

Original

[ Pinhole ]

Original

[ Crater ]

Original

[ Pollution ]

Generated Generated Generated

Page 7: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

정확도

OriginalOnly

Original+ Aug.

(Vertical Flip, etc )

Original+ Aug + GAN 각 결함 別 추가 생성 데이터 수

0 2500 5000 10000

5% 향상

Page 8: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and
Page 9: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

(유사 이미검색 동영상 시연)

Page 10: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

(Labeling Tool 동영상시연)

Page 11: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

반쯤 학습한 모델에게 남은 unlabeled data를 추론해보라고 하고, 답이 불확실한 data만 labeling

100%

100%Labeling

모델

UnlabeledData

5%Labeling

95%

확실

불확실

90%

5%

학습

1

2

학습

추론

기존

Active

Learning

UnlabeledData

모델

UnlabeledData

5% 추가Labeling

3학습

목표 정확도달성시까지 반복

5%

10%

Page 12: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

다른 형태/분포의 유사 입력 데이터에 대한 적응력을 학습하는 방법

문자 인식 자율 주행

Page 13: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

상호 연관성을 가진 중간 형태의 데이터 도메인을 생성하여 적응 학습 과정을 수행

기존

중간Domain

생성

Page 14: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

학습 데이터 : Aligned X-ray Dataset

적용 데이터 : Unaligned X-ray Dataset

학습 데이터 대비 변형된 신규 데이터에 대한 적응력이 기존 방법 대비 향상

Ours

Aligned Unaligned

Page 15: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

2

Page 16: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

다양한 모델

성능

학습 시간

BEST 선정

결함 결함

결함 정상

결함 결함

결함 정상

결함 결함

결함 정상

결함 결함

결함 정상

데이터

AS-IS Best Model Proposal

글러브

게 뇌

비행기

사자

행성

글러브

게 뇌

비행기

사자

행성

Page 17: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

후보 모델 선발

비유사도 기반 평가 단기간 학습

Target Datafeature

Domain Discriminator

feature 특성분석

결함

정상

추론

학습

loss curve∫Ldt

학습속도분석

Target Data

CAM PCA

activation 분석

Target Data

𝒔𝒄𝒐𝒓𝒆𝒇𝒊𝒏𝒂𝒍 = 𝒇(𝑭𝒆𝒂𝒕𝒖𝒓𝒆, 𝑨𝒄𝒄𝒖𝒓𝒂𝒄𝒚, 𝑺𝒑𝒆𝒆𝒅)

Accuracy Speed

Page 18: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

3

Page 19: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and
Page 20: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and
Page 21: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

Domain Adaptation

Semi-Auto Labeling

Best Model Proposal

Environment Service

ExperimentService

ProductionService

Page 22: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

22

Define model & input functionModel Architecture

Start the model trainingTraining

• Set the Parameters Server, Master, and chief worker (to

coordinate the asynchronous training across workers)

• Set where and how to store the logs to view in TensorBoard

Distributed Training Prep

Distributed software,DevOps,..

Data scientists

• Specify CPU, GPU, memory resources of each

• Assign specs for Master, and workers, chief worker

• Use Kubernetes to launch, maintain, and monitor those pods

Cluster Provisioning

As-Is 상황

Page 23: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

23

Define model & input functionModel Architecture

Start the model trainingTraining

Data scientists

Experiment.run(

runconfig=RunConfig(no_of_workers=4))

To-Be 상황

Page 24: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and
Page 25: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and
Page 26: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and
Page 27: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and
Page 28: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

※ 지속적으로개선중인화면임

Page 29: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

※ 지속적으로개선중인화면임

Page 30: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

※ 지속적으로개선중인화면임

Page 31: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

4

Page 32: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

가지치기 알고리즘불필요한 신경망 가중치 제거

추론 가속기(Inference Acceleration

Algorithm)

Low-rank approximation

행렬 연산 차원 축소

정밀도 간소화Float32 → Int8

Tensor Fusion

GPU 대역폭 최적화

Dynamic Tensor Memory

메모리 사용 효율화

Parallel-Stream Processing

스트림 병렬 처리

• 정확도 유지하고 연산은 간소화느린 부동소수 연산을 INT로 양자화하여 처리량 극대화

• GPU ↔ GPU 메모리 간 대역폭 최적화, 병목 개선GPU 커널 노드를 융합, 직접 Handling

• 메모리 사용량 최적화(적게 사용)텐서(L x M x N, 3차원 행렬)을 위한 메모리 공간 배정

• 기존 단일 입력 → 병렬입력 구조로 개선, 속도개선단일 입력 스트림을병렬 입력 스트림으로 변환

• 중요하지 않은 가중치와 노드를 제거, 탐색속도 개선불필요한 신경노드 탐색최소화

• 행렬 연산 간소화, 고속화행렬연산 시 동일한차수 행렬부터 수행함으로써 행렬 연산 간소화

모델Tensor

• 대부분 모델에서 동일 문제 존재

• 원인

- 최적화 부족

- 다중 중첩(Wrapper) 구조

- Tensorflow → C 언어 변환 문제

- CPU ↔ GPU 병목

- 메모리 비효율적 사용

- Tensor(다차원행렬) 비효율 연산

- GPU/메모리 간접 핸들링

모델Tensor

모델구조변경없이간소화 수행

Classification/Detection 성능개선

→ 속도 2배이상개선,

GPU 메모리 50%이상절감

“느린추론속도“

[AS-IS 문제점]

Tensorflow, Pytorch 등개발/학습된 딥러닝 모델

Page 33: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

Tensorflow v1.13.0 연구소(FP16) 연구소(INT8)

BATCH FPS GPU(MB) FPS GPU(MB) FPS GPU(MB)

1 11 6015 15 3623 24 2759

5 14 6527 19 5111 31 2999

10 14 8575 18 9246 26 9143

0

200

400

600

800

1000

1200

1 50 100 200 400

FPS 비교

Tensorflow v1.13.0 TensorRT(FP16) TensorRT(INT8)

0

2000

4000

6000

8000

10000

12000

1 50 100 200 400

GPU 사용량 비교

Tensorflow v1.13.0 TensorRT(FP16) TensorRT(INT8)

Tensorflow v1.13.0 TensorRT FP16 TensorRT INT8

BATCH FPS GPU(MB) FPS GPU(MB) FPS GPU(MB)

1 82 8061 175 858 313 2448

50 317 8573 471 1834 960 2574

100 326 8575 470 2886 829 2702

200 332 8573 441 4986 761 5006

400 335 10793 435 9204 758 9614

※ NVIDIA Korea 최적화미적용 수치

Page 34: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

Tensorflow v1.13.0 TensorRT(FP16) TensorRT(INT8)

BATCH FPS GPU(MB) FPS GPU(MB) FPS GPU(MB)

1 11 6015 15 3623 24 2759

5 14 6527 19 5111 31 2999

10 14 8575 18 9246 26 9143

0

10

20

30

40

1 5 10

FPS 비교

Tensorflow v1.13.0 TensorRT(FP16) TensorRT(INT8)

0

2000

4000

6000

8000

10000

1 5 10

GPU 사용량 비교

Tensorflow v1.13.0 TensorRT(FP16) TensorRT(INT8)

※ NVIDIA Korea 최적화미적용 수치

Page 35: Brightics Deep Learning 플랫폼 · 2019-07-22 · 22 Model Architecture Define model & input function Training Start the model training • Set the Parameters Server, Master, and

• 추론 가속화 (Inference Acceleration - TensorRT)

Jetson Xavier 기반

GPU: 512-Core Volta

CPU: 8-Core ARM v8.2 64-Bit

Mem:16 GB

→ 10 FPS 성능 (LH-RCNN)

※ Human Body Detection에 적용예정