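[NDC2017] Creating Game Content with Deep Learning - Content Generation Techniques Using VAE...
TRANSCRIPT
Nexon New Development Division, Blast Team
김환희 (Hwanhee Kim)
A Case Study of Content Generation Techniques Using VAE
Creating Game Content with Deep Learning
About the Speaker
Game designer, 6th year
NDC2014 <When a Game Designer Meets Unity: Let's Build Our Own Tools>
NDC2016 <Do Neural Networks Dream of Automatic Content Generation?: A Case Study of SRPG Map Evaluation Using Deep Learning>
Notice
This presentation is a personal research project.
Contents
1. Motivation
2. Autoencoder
3. Variational Autoencoder
4. Generation Examples
1. Motivation
- Why I chose this topic
Last year's presentation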
https://www.slideshare.net/HwanheeKim2/ndc2016-61418391
Rapid progress in technology
http://fvae.ail.tokyo/
• <FACIAL VAE: Conditional VAE-GAN with Attribute Inference for Faces>
FaceApp
https://itunes.apple.com/us/app/faceapp-free-neural-face-transformations/id1180884341?mt=8
https://play.google.com/store/apps/details?id=io.faceapp&hl=ko
Rapid progress in technology
• <Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks>
https://www.semanticscholar.org/paper/Unsupervised-Representation-Learning-with-Deep-Radford-Metz/35756f711a97166df11202ebe46820a36704ae77
Rapid progress in technology
• <BEGAN: Boundary Equilibrium Generative Adversarial Networks>
https://github.com/Heumi/BEGAN-tensorflow
Personal need
• Not enough time or skill to produce content
Generative models using deep learning
• VAE (Variational Auto-Encoder)
• GAN (Generative Adversarial Networks)
• PixelRNN
https://blog.openai.com/generative-models/
GAN (Generative Adversarial Networks)
• Adversarial generative model
• A generator (G) that produces images and a discriminator (D) that judges whether an image is real or fake improve by competing with each other
https://medium.com/@devnag/generative-adversarial-networks-gans-in-50-lines-of-code-pytorch-e81b79659e3f
Pixel RNN (Recurrent Neural Networks)
• Learns recurrently, one pixel at a time
• Predicts the next state from limited information
https://arxiv.org/abs/1601.06759
Generative models using deep learning
• VAE (Variational Auto-Encoder)
• GAN (Generative Adversarial Networks)
• PixelRNN
2. Autoencoder
- Reproducing its own input
A typical deep learning model - classification
[Figure: classifier output for the labels cat, dog, horse, table, flower, with scores 93%, 46%, 12%, 20%, 18%]
X → f → Y
AutoEncoder
http://kvfrans.com/variational-autoencoders-explained/
X → f → X
Mathematically, this is similar to f(x) = x
https://ocw.mit.edu/ans7870/18/18.013a/textbook/HTML/chapter01/section04.html
In essence, it is compression
http://codingplayground.blogspot.kr/2015/10/what-are-autoencoders-and-stacked.html
In essence, it is compression
https://foundationsofvision.stanford.edu/chapter-8-multiresolution-image-representations/
Lossy compression
http://goelhardik.github.io/2016/06/04/mnist-autoencoder/
Training reduces the difference between the input image and the reconstructed image
• Image difference (error) = MSE = Mean Squared Error
https://blog.insightdatascience.com/isee-removing-eyeglasses-from-faces-using-deep-learning-d4e7d935376f
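As a rough illustration of the error term above, MSE averages the squared per-pixel difference between an input image and its reconstruction. The sketch below uses NumPy on made-up 28 x 28 arrays; the function name `mse` and the sample data are assumptions for illustration, not the presenter's code.

```python
import numpy as np

def mse(original, reconstructed):
    """Mean squared error: the average of squared per-pixel differences."""
    original = np.asarray(original, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    return float(np.mean((original - reconstructed) ** 2))

# Hypothetical data: a random "image" in [0, 1] and a slightly noisy copy of it.
rng = np.random.default_rng(0)
image = rng.random((28, 28))
noisy = np.clip(image + rng.normal(0.0, 0.05, size=image.shape), 0.0, 1.0)

print(mse(image, image))  # 0.0 for a perfect reconstruction
print(mse(image, noisy))  # a small positive value
```

An autoencoder that reconstructs its input perfectly would reach an error of 0; training pushes the error toward that value.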
3. Variational Autoencoder
- Reproducing its own input + a little variation
Autoencoder
http://kvfrans.com/variational-autoencoders-explained/
Autoencoder + α (variational)
http://kvfrans.com/variational-autoencoders-explained/
Compressed information = latent variable
https://www.slideshare.net/TJTorres1/deep-style-using-variational-autoencoders-for-image-generation
Unit Gaussian → latent variable
• A distribution described by mean μ, variance σ², and standard deviation σ
https://www.slideshare.net/TJTorres1/deep-style-using-variational-autoencoders-for-image-generation
Error = KL-div + MSE
https://www.slideshare.net/TJTorres1/deep-style-using-variational-autoencoders-for-image-generation
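The two error terms can be sketched in NumPy as follows. The KL term uses the standard closed form for the divergence between N(μ, σ²) and the unit Gaussian N(0, 1); summing the two terms without weights is an assumption here, and the function names are hypothetical.

```python
import numpy as np

def kl_to_unit_gaussian(mu, sigma):
    """KL divergence from N(mu, sigma^2) to N(0, 1), summed over latent dimensions:
    0.5 * sum(mu^2 + sigma^2 - log(sigma^2) - 1)
    """
    mu = np.asarray(mu, dtype=np.float64)
    sigma = np.asarray(sigma, dtype=np.float64)
    return float(0.5 * np.sum(mu ** 2 + sigma ** 2 - np.log(sigma ** 2) - 1.0))

def vae_error(x, x_hat, mu, sigma):
    """Error = KL-div + MSE (unweighted sum; the weighting is an assumption)."""
    x = np.asarray(x, dtype=np.float64)
    x_hat = np.asarray(x_hat, dtype=np.float64)
    reconstruction = float(np.mean((x - x_hat) ** 2))
    return kl_to_unit_gaussian(mu, sigma) + reconstruction

print(kl_to_unit_gaussian(np.zeros(800), np.ones(800)))  # 0.0
print(kl_to_unit_gaussian([1.0], [1.0]))                 # 0.5
```

The KL term is zero only when every latent dimension is already a unit Gaussian, so it pulls the encoder's μ toward 0 and σ toward 1 while the MSE term keeps reconstructions faithful.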
KL-div
• Kullback–Leibler divergence
- "Kullback–Leibler": names of people (Solomon Kullback, Richard Arthur Leibler)
- "divergence": difference
• Cross-entropy: difference in the amount of information
http://blog.evjang.com/2016/08/variational-bayes.html
http://blog.fastforwardlabs.com/2016/08/12/introducing-variational-autoencoders-in-prose-and.html
[Figure: MNIST dataset (handwriting), 28 x 28 → Latent Variable Space → Reconstructed, 28 x 28]
http://blog.fastforwardlabs.com/2016/08/12/introducing-variational-autoencoders-in-prose-and.html
Latent Space Walking
http://blog.fastforwardlabs.com/2016/08/22/under-the-hood-of-the-variational-autoencoder-in.html
Reconstructed
4. Generation Examples
- Dungeon Shape, Character Portrait, Sprite, etc.
Development Environment
• TensorFlow v1.0.1 (Windows 10)
• Python 3.5
• GTX1070
• Javascript (Web Demo)
Network Overview
• VAE Network + recurrent generation
• Convolution layers with stride=2 (skipping two cells at a time) instead of pooling layers
• Based on Kevin Frans's code
http://deeplearning.net/software/theano/tutorial/conv_arithmetic.html
Network - Encoder
Input: 64,64,c
Conv1: 32,32,64
Conv2: 16,16,128
Conv3: 8,8,256
Reshape: 16384
Dropout: 16384
Dense(μ): 800
Dense(σ): 800
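The encoder shapes above follow from halving the spatial size at each stride-2 convolution. The sketch below reproduces that arithmetic, assuming 'SAME'-style padding (output size = ceil(input size / stride)); the padding mode and the helper name `same_conv_output_size` are assumptions, not taken from the slides.

```python
import math

def same_conv_output_size(size, stride):
    """Spatial output size of a convolution with 'SAME'-style padding."""
    return math.ceil(size / stride)

size = 64                  # input is 64 x 64 x c
channels = [64, 128, 256]  # output channels of Conv1..Conv3 from the slide
shapes = []
for ch in channels:
    size = same_conv_output_size(size, stride=2)
    shapes.append((size, size, ch))

# Flattening the last feature map gives the Reshape size.
flattened = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
print(shapes)     # [(32, 32, 64), (16, 16, 128), (8, 8, 256)]
print(flattened)  # 16384
```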
Network - Encoder
Input: 64,64,c → Encoder → Dense(μ): 800, Dense(σ): 800
• Converts (compresses) an image into 800 Gaussian distributions
Network – Decoder
Dense(μ): 800
Dense(σ): 800
Random: 800
Sample: 800
Dense: 16384
Reshape: 8,8,256
Conv_T: 16,16,128
Conv_T: 32,32,64
Conv_T: 64,64,c
Network - Decoder
• Random = a random number drawn from the standard normal distribution
• Sample = μ + σ * Random(mean=0, std=1) = a normally distributed random number
>>> import numpy as np
>>> print(np.random.normal(0,1))
-0.3835458243630663
https://www.mathsisfun.com/data/images/standardizing.svg
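The sampling rule Sample = μ + σ * Random can be sketched for all 800 latent dimensions at once. The μ and σ values below are made-up stand-ins for encoder outputs, and `sample_latent` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(42)
latent_dim = 800  # size of Dense(μ) and Dense(σ) on the slides

# Hypothetical encoder outputs for a single image.
mu = rng.normal(0.0, 1.0, size=latent_dim)
sigma = np.abs(rng.normal(1.0, 0.1, size=latent_dim))

def sample_latent(mu, sigma, rng):
    """Sample = mu + sigma * Random, where Random ~ N(0, 1) per dimension."""
    random = rng.standard_normal(mu.shape)
    return mu + sigma * random

z = sample_latent(mu, sigma, rng)
print(z.shape)  # (800,)
```

Because the randomness enters only through `random`, the sample stays a simple function of μ and σ; with σ = 0 it collapses to μ exactly.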
Network – Decoder
Sample: 800 → Dense: 16384 → Reshape: 8,8,256 → Conv_T (x3) → Output: 64,64,c
Recurrent generation
Recurrent Generation
import math
import numpy as np
import matplotlib.pyplot as plt
# Jupyter notebook magic: show plots inline
%matplotlib inline

def sigmoid(a):
    return 1 / (1 + math.exp(-a))

N = 200
random_y = np.random.randn(N) / 5000  # tiny vertical jitter so points do not overlap
_x = np.random.randn(N)

# Panel "x": standard normal samples
plt.figure()
plt.xlim(-6, 6)
plt.scatter(_x, random_y)

# Panel "x * 6": the same samples scaled by 6
_x2 = _x * 6
plt.figure()
plt.xlim(-6, 6)
plt.scatter(_x2, random_y)

# Panel "sigmoid(x)"
sigmoid1 = [sigmoid(m) for m in _x]
plt.figure()
plt.xlim(-6, 6)
plt.ylim(0, 1)
plt.scatter(_x, sigmoid1, c='red')

# Panel "sigmoid(x * 6)"
sigmoid2 = [sigmoid(m) for m in _x2]
plt.figure()
plt.xlim(-6, 6)
plt.ylim(0, 1)
plt.scatter(_x2, sigmoid2, c='red')
[Figure panels: x, x * 6, sigmoid(x), sigmoid(x * 6)]
Network - Decoder
• Recurrent Generation
[Figure: Original vs. Generated (step=6)]
Network – Decoder
Dense(μ): 800, Dense(σ): 800, Random: 800 → Sample: 800 → Decoder → Output: 64,64,c
• Produces (reconstructs) image data from the sampled data
Play with Data
• Goals
- Latent variable exploration
- Shape morphing (e.g. Town → Maze)
- Online Demo
Dungeon Shape
• 64 x 64, Binary Image
[Figure: example shapes - roomsAndCorridor openCave diamondMine town division maze]
Dungeon Shape
• Procedural Generation
• N = 50,000
Dungeon Shape
• t-SNE
• High-dimensional → low-dimensional
• 800D → 2D
• t-SNE
• Zoomed in on one region
Dungeon Shape
roomsAndCorridor → Encoder Network (conv) → Latent Variable (z): z1 → Decoder Network (conv) → roomsAndCorridor
diamondMine → Encoder Network (conv) → Latent Variable (z): z2 → Decoder Network (conv) → diamondMine
Shape Morphing
z1, ..., zn = z1 + (z2 - z1) * n / step, ..., z2 → Decoder Network (conv) → ? (at each step)
roomsAndCorridor → Encoder Network (conv) → Latent Variable (z): mu1, std1 → z1 → Decoder Network (conv) → roomsAndCorridor
Shape Morphing
mun = mu1 + (mu2 - mu1) * n / step
stdn = std1 + (std2 - std1) * n / step
(mu1, std1) → z1, ..., (mun, stdn) → zn, ..., (mu2, std2) → z2 → Decoder Network (conv) → ?
Shape Morphing
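The interpolation rule zn = z1 + (z2 - z1) * n / step can be sketched as below. The 3-dimensional vectors are made-up stand-ins for the 800-dimensional latent variables, and `lerp_path` is a hypothetical helper; in the talk's setup each intermediate vector would be passed to the decoder network to produce one morphing frame.

```python
import numpy as np

def lerp_path(z1, z2, step):
    """Return z_n = z1 + (z2 - z1) * n / step for n = 0..step."""
    z1 = np.asarray(z1, dtype=np.float64)
    z2 = np.asarray(z2, dtype=np.float64)
    return [z1 + (z2 - z1) * n / step for n in range(step + 1)]

# Hypothetical latent vectors for two encoded shapes.
z1 = np.array([0.0, 1.0, -2.0])
z2 = np.array([4.0, -1.0, 2.0])

path = lerp_path(z1, z2, step=4)
for z in path:
    print(z)  # each z would be decoded into one intermediate shape
```

The same helper applies unchanged to the mu and std vectors, since both are interpolated with the same formula.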
Interpolation
• Linear Interpolation (Lerp)
• Spherical linear Interpolation (Slerp)
http://www.geeks3d.com/20140205/glsl-simple-morph-target-animation-opengl-glslhacker-demo/
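The difference between the two interpolation methods can be sketched with the standard slerp formula. This is a minimal NumPy version written for illustration, not the code used in the talk; the fallback to lerp for parallel vectors is a design choice of this sketch.

```python
import numpy as np

def lerp(p0, p1, t):
    """Linear interpolation: a straight line between p0 and p1."""
    return (1.0 - t) * p0 + t * p1

def slerp(p0, p1, t):
    """Spherical linear interpolation: follows the arc between p0 and p1."""
    cos_omega = np.dot(p0, p1) / (np.linalg.norm(p0) * np.linalg.norm(p1))
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    if np.isclose(np.sin(omega), 0.0):
        # Vectors are (anti-)parallel; fall back to linear interpolation.
        return lerp(p0, p1, t)
    return (np.sin((1.0 - t) * omega) * p0 + np.sin(t * omega) * p1) / np.sin(omega)

p0 = np.array([1.0, 0.0])
p1 = np.array([0.0, 1.0])
print(np.linalg.norm(lerp(p0, p1, 0.5)))   # about 0.707: the midpoint moves toward the origin
print(np.linalg.norm(slerp(p0, p1, 0.5)))  # about 1.0: the midpoint stays on the unit circle
```

With lerp the midpoint has a smaller norm than either endpoint, while slerp preserves it, which is the usual argument for slerp when latent vectors are sampled from a Gaussian.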
Shape Morphing
• Slerp
• Lerp
• random
• random
• random (threshold = 0.5)
Online Demo
• Based on code by mattya and carpedm20
https://github.com/mattya/chainer-DCGAN
https://github.com/carpedm20/DCGAN-tensorflow
Data2 : Character Portrait
• Face Set from Game Character Hub
• N = 50,000
• 128 px → 108 px (crop) → 64 px (resize)
• mean
• random
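The 128 px → 108 px (crop) → 64 px (resize) preprocessing can be sketched as below. The slides do not say where the crop is taken or which resampling filter is used, so the center crop and nearest-neighbor resize here are assumptions, as are the helper names.

```python
import numpy as np

def center_crop(image, size):
    """Cut a size x size window from the middle of the image (assumed crop position)."""
    h, w = image.shape[:2]
    top = (h - size) // 2
    left = (w - size) // 2
    return image[top:top + size, left:left + size]

def resize_nearest(image, size):
    """Nearest-neighbor resize to size x size (assumed resampling method)."""
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return image[rows][:, cols]

portrait = np.zeros((128, 128, 3), dtype=np.uint8)  # hypothetical 128 px RGB portrait
cropped = center_crop(portrait, 108)
small = resize_nearest(cropped, 64)
print(cropped.shape)  # (108, 108, 3)
print(small.shape)    # (64, 64, 3)
```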
Character Portrait
• t-SNE
• Single color, since there are no Y values (labels)
Character Portrait
• t-SNE
• Zoomed in on one region
• t-SNE
• t-SNE(x2)
• t-SNE(x2)
• t-SNE(x2)
Analogy
• Reasoning by analogy
• A : B = C : ?
• ? = (B + C) – A
<Sampling Generative Networks>, Tom White
J-Diagram
http://www.nibcode.com/en/psychometric-training/abstract-reasoning-test/
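The analogy rule ? = (B + C) – A is plain vector arithmetic on latent variables. The sketch below uses made-up 2-dimensional vectors in place of the 800-dimensional ones, and `analogy` is a hypothetical helper name; in practice A, B, and C would be encoder outputs and the result would be passed to the decoder.

```python
import numpy as np

def analogy(a, b, c):
    """Solve A : B = C : ? in latent space as ? = (B + C) - A."""
    return (b + c) - a

# Hypothetical latent vectors.
a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])
c = np.array([3.0, 0.0])

d = analogy(a, b, c)
print(d)  # [3. 1.]
```

The result differs from C by exactly the same offset that separates B from A, which is what the "A : B = C : ?" notation expresses.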
Analogy
A B
C ? = (B + C) – A
Online Demo
Data3 : Character Sprite
• Character Set from Game Character Hub: Portfolio Edition
• N = 450,000 = 50,000 x 9 (frames)
• 64 x 64
Mean Calculation
https://github.com/Newmu/dcgan_code/tree/gh-pages
Mean Calculation
[Figure: 3rd frame, 3rd frame mean, 5th frame mean → ? (guessed), shown next to the real frame]
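The operators in the figure did not survive in the transcript. One plausible reading, in the spirit of the vector arithmetic in the linked dcgan_code repository, is guessed = 3rd frame - (3rd frame mean) + (5th frame mean), computed on latent variables. The sketch below illustrates that reading on idealized toy data where each latent code is a character component plus a frame component; both the formula and the data are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim = 8  # kept small for illustration; the talk uses 800

# Hypothetical latent codes: 100 characters x 9 frames.
frame_offsets = rng.normal(size=(9, latent_dim))    # component shared by each frame index
identities = rng.normal(size=(100, 1, latent_dim))  # component unique to each character
codes = identities + frame_offsets                  # shape (100, 9, latent_dim)

mean_3rd = codes[:, 2].mean(axis=0)  # mean latent code of all 3rd frames
mean_5th = codes[:, 4].mean(axis=0)  # mean latent code of all 5th frames

# Guess character 0's 5th frame from its 3rd frame.
guessed = codes[0, 2] - mean_3rd + mean_5th
real = codes[0, 4]
print(np.allclose(guessed, real))  # True for this idealized additive toy data
```

Real latent codes are not perfectly additive, so the guessed frame would only approximate the real one.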
Experiment
[Figure: 3rd frame, followed by six panels labeled "+3rd frame mean" and six panels labeled "-3rd frame mean"]
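Online Demo
• https://greentec.github.io
Conclusion & Discussion
• Data such as level designs is usable in practice
• Image data is not yet practical → likely feasible in the near future (BEGAN, etc.)
• The more original data, the better
• This method needs at least N > 2,000
• There is plenty of room to improve how latent variables are explored
Further Reading
• VAE-GAN
• BEGAN
• PixelVAE
Q&A
Thank you.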
Unit Gaussian
https://experimentationground.wordpress.com/2016/10/01/intuition-and-math-behind-variational-autoencoder/
Data3 : Korean Font
• Uses 142 free fonts
• Uses 2,442 standard characters (including special characters)