Unsupervised Learning: Deep Auto-encoder
(Lecture slides, ML 2016, National Taiwan University: speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2016/Lecture/auto (v7).pdf)

TRANSCRIPT

Page 1:

Unsupervised Learning: Deep Auto-encoder

Page 2:

Auto-encoder

NN Encoder → code → NN Decoder

• The code is a compact representation of the input object. For 28 × 28 = 784-dim inputs, the code is usually < 784-dim.
• From the code, the decoder can reconstruct the original object.
• Encoder and decoder are learned together.

Page 3:

Starting from PCA

Input layer x → hidden layer (linear) → output layer x̂

encode: c = W x        decode: x̂ = Wᵀ c

Minimize ‖x − x̂‖², i.e. make x̂ as close as possible to x.

The hidden layer is the bottleneck layer; its output is the code c.
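As a minimal sketch of this view, the PCA solution of the linear auto-encoder above can be computed directly with an SVD; the data shapes (200 samples, 10 dims, 3-dim code) are illustrative choices, not from the slides:

```python
import numpy as np

# Linear auto-encoder view of PCA: code c = W x, reconstruction
# x_hat = W^T c, minimizing ||x - x_hat||^2. The optimal W has the
# top-k principal directions as its rows.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
X = X - X.mean(axis=0)                 # PCA assumes centered data

k = 3                                  # code (bottleneck) dimension
U, S, Vt = np.linalg.svd(X, full_matrices=False)
W = Vt[:k]                             # shape (k, 10): top-k directions

C = X @ W.T                            # encode: c = W x (hidden layer output)
X_hat = C @ W                          # decode: x_hat = W^T c
err = np.sum((X - X_hat) ** 2)         # reconstruction error ||x - x_hat||^2
```

The error equals the variance in the discarded directions, so it is always below the total variance of the data.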

Page 4:

Deep Auto-encoder

• Of course, the auto-encoder can be deep.

Input Layer → Layer → Layer → … → bottleneck (Code) → … → Layer → Layer → Output Layer

x → (W1) → (W2) → … code … → (W2ᵀ) → (W1ᵀ) → x̂, with x̂ as close as possible to x.

Tying the decoder weights to the transposed encoder weights (W1ᵀ, W2ᵀ, …) is one option, but symmetric is not necessary.

Initialize by RBM layer-by-layer.

Reference: Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. "Reducing the dimensionality of data with neural networks." Science 313.5786 (2006): 504-507.
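The layer structure can be sketched as a forward pass; the weights here are random (untrained), since the point is only the encoder/decoder shape, and tying decoder weights to the transposed encoder weights is deliberately not done, matching the note that symmetry is optional:

```python
import numpy as np

# Structural sketch of a deep auto-encoder: encoder
# 784 -> 1000 -> 500 -> 250 -> 30 and a mirrored decoder back to 784.
rng = np.random.default_rng(0)
enc_dims = [784, 1000, 500, 250, 30]
dec_dims = enc_dims[::-1]

def make_layers(dims):
    return [rng.standard_normal((d_in, d_out)) * 0.01
            for d_in, d_out in zip(dims[:-1], dims[1:])]

enc_W = make_layers(enc_dims)          # W1, W2, W3, W4
dec_W = make_layers(dec_dims)          # independent decoder weights
                                       # (not tied to enc_W transposes)

def forward(x, layers):
    h = x
    for W in layers:
        h = np.tanh(h @ W)             # nonlinearity between layers
    return h

x = rng.standard_normal((5, 784))      # a batch of 5 "images"
code = forward(x, enc_W)               # bottleneck code, shape (5, 30)
x_hat = forward(code, dec_W)           # reconstruction, shape (5, 784)
```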

Page 5:

Deep Auto-encoder

(Figure: MNIST reconstructions, comparing Original Image, PCA, and Deep Auto-encoder.)

PCA: 784 → 30 → 784
Deep auto-encoder: 784 → 1000 → 500 → 250 → 30 → 250 → 500 → 1000 → 784

Page 6: Unsupervised Learning - 國立臺灣大學speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2016/Lecture/auto (v7).pdf · Initialize by RBM layer-by-layer Reference: Hinton, Geoffrey E., and

78

4

78

4

78

4

10

00

50

0

25

0

2

2

25

0

50

0

10

00

78

4

Page 7:

Pokémon

• http://140.112.21.35:2880/~tlkagk/pokemon/pca.html
• http://140.112.21.35:2880/~tlkagk/pokemon/auto.html
• The code is modified from http://jkunst.com/r/pokemon-visualize-em-all/

Page 8:

Auto-encoder – Text Retrieval

Vector Space Model: represent each document and each query as a vector, and retrieve the documents whose vectors are closest to the query vector.

Bag-of-word for the word string "This is an apple":

this   1
is     1
a      0
an     1
apple  1
pen    0

Semantics are not considered.
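The bag-of-word encoding and the vector-space comparison can be sketched as follows; the tiny vocabulary is the one on the slide, and cosine similarity is a common (assumed, not slide-specified) choice of closeness:

```python
import numpy as np

# Bag-of-word: "This is an apple" over the toy vocabulary becomes a
# count vector; word order and semantics are ignored.
vocab = ["this", "is", "a", "an", "apple", "pen"]

def bag_of_words(text):
    words = text.lower().split()
    return np.array([words.count(w) for w in vocab], dtype=float)

doc = bag_of_words("This is an apple")   # [1, 1, 0, 1, 1, 0]
query = bag_of_words("an apple")         # [0, 0, 0, 1, 1, 0]

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

score = cosine(query, doc)               # vector-space-model match score
```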

Page 9:

Auto-encoder – Text Retrieval

Feed the bag-of-word vector of a document (or query) into a deep auto-encoder: 2000 → 500 → 250 → 125 → 2.

The documents talking about the same thing will have close codes. (Compare LSA, which projects documents onto 2 latent topics.)

Page 10:

Auto-encoder – Similar Image Search

Retrieved using Euclidean distance in pixel intensity space. (Images from Hinton's slides on Coursera.)

Reference: Krizhevsky, Alex, and Geoffrey E. Hinton. "Using very deep autoencoders for content-based image retrieval." ESANN. 2011.

Page 11:

Auto-encoder – Similar Image Search

Encode each 32 × 32 image with a deep auto-encoder: 8192 → 4096 → 2048 → 1024 → 512 → 256 (code).

(Crawl millions of images from the Internet.)

Page 12:

(Figure: retrieval results using the 256-dim codes vs. retrieval using Euclidean distance in pixel intensity space.)
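The two retrieval modes compared on these slides can be sketched as nearest-neighbour search; the random arrays below stand in for a real image database and for codes from a trained 256-dim bottleneck (sizes are illustrative assumptions):

```python
import numpy as np

# Rank a database by Euclidean distance to a query, either in raw pixel
# space or in the auto-encoder's code space.
rng = np.random.default_rng(0)
pixels = rng.random((1000, 32 * 32))       # database images, flattened
codes = rng.random((1000, 256))            # their (pretend) 256-dim codes

def retrieve(query_vec, database, top=5):
    d = np.linalg.norm(database - query_vec, axis=1)
    return np.argsort(d)[:top]             # indices of the nearest items

hits_pixel = retrieve(pixels[0], pixels)   # query with image 0 itself
hits_code = retrieve(codes[0], codes)      # same query in code space
```

Querying with a database item returns that item first in both spaces; the interesting difference on the slides is which *other* images rank high.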

Page 13:

Auto-encoder – Pre-training DNN

• Greedy Layer-wise Pre-training again

Target network: Input 784 → 1000 → 1000 → 500 → 10 (output).

Step 1: train an auto-encoder 784 → 1000 → 784 on the input x: encode with W1, decode with W1', and make x̂ as close as possible to x.

Page 14:

Auto-encoder – Pre-training DNN

• Greedy Layer-wise Pre-training again

Target network: Input 784 → 1000 → 1000 → 500 → 10 (output).

Step 2: fix W1 and compute a1, the output of the first hidden layer. Train an auto-encoder 1000 → 1000 → 1000 on a1: encode with W2, decode with W2', and make â1 as close as possible to a1.

Page 15:

Auto-encoder – Pre-training DNN

• Greedy Layer-wise Pre-training again

Target network: Input 784 → 1000 → 1000 → 500 → 10 (output).

Step 3: fix W1 and W2 and compute a2, the output of the second hidden layer. Train an auto-encoder 1000 → 500 → 1000 on a2: encode with W3, decode with W3', and make â2 as close as possible to a2.

Page 16:

Auto-encoder – Pre-training DNN

• Greedy Layer-wise Pre-training again

Step 4: use W1, W2, W3 to initialize the network 784 → 1000 → 1000 → 500; randomly initialize the output layer W4 (500 → 10); then fine-tune the whole network by backpropagation.
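The greedy layer-wise procedure can be sketched as a loop: train one layer's auto-encoder, keep its encoder weight, fix it, and move up. The tiny dimensions, linear layers, and plain gradient descent below are simplified stand-ins for the slides' 784-1000-1000-500 nonlinear network:

```python
import numpy as np

rng = np.random.default_rng(0)

def pretrain_layer(A, d_out, steps=200, lr=0.01):
    """Train encoder W / decoder V so that (A @ W) @ V ~ A; return W."""
    n, d_in = A.shape
    W = rng.standard_normal((d_in, d_out)) * 0.1
    V = rng.standard_normal((d_out, d_in)) * 0.1
    for _ in range(steps):
        H = A @ W                      # this layer's code for input A
        A_hat = H @ V                  # reconstruction of A
        G = 2 * (A_hat - A) / n        # dLoss/dA_hat, Loss = mean sq. error
        V -= lr * H.T @ G              # gradient step on the decoder
        W -= lr * A.T @ (G @ V.T)      # gradient step on the encoder
    return W

X = rng.standard_normal((100, 64))     # toy "images"
dims = [32, 16, 8]                     # stand-in for 1000, 1000, 500
weights, A = [], X
for d in dims:
    W = pretrain_layer(A, d)           # train this layer's auto-encoder
    weights.append(W)
    A = A @ W                          # fix W, feed representation upward
# `weights` now initialises the encoder stack; a final output layer
# (W4 on the slide) is added with random init, then everything is
# fine-tuned by backpropagation.
```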

Page 17:

Auto-encoder

• De-noising auto-encoder

Add noise to x to get x′; encode x′ into the code c; decode c into x̂; make x̂ as close as possible to the clean x.

Ref: Vincent, Pascal, et al. "Extracting and composing robust features with denoising autoencoders." ICML, 2008.

More: Contractive auto-encoder. Ref: Rifai, Salah, et al. "Contractive auto-encoders: Explicit invariance during feature extraction." Proceedings of the 28th International Conference on Machine Learning (ICML-11). 2011.
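The key step is that the loss compares against the clean input, not the corrupted one. A minimal sketch, with an untrained random network and an assumed noise level of 0.3:

```python
import numpy as np

# De-noising auto-encoder: corrupt x into x', encode/decode x',
# and measure reconstruction error against the CLEAN x.
rng = np.random.default_rng(0)
x = rng.random((8, 784))                           # clean inputs
x_noisy = x + 0.3 * rng.standard_normal(x.shape)   # "Add noise" step

W_enc = rng.standard_normal((784, 30)) * 0.01      # untrained encoder
W_dec = rng.standard_normal((30, 784)) * 0.01      # untrained decoder

c = np.tanh(x_noisy @ W_enc)           # code from the corrupted input x'
x_hat = c @ W_dec                      # reconstruction
loss = np.mean((x_hat - x) ** 2)       # compare with clean x, not x'
```

Training would minimize this loss, forcing the code to capture structure that survives the corruption.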

Page 18:

Learning More – Restricted Boltzmann Machine

• Neural networks [5.1]: Restricted Boltzmann machine – definition
  https://www.youtube.com/watch?v=p4Vh_zMw-HQ&index=36&list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH
• Neural networks [5.2]: Restricted Boltzmann machine – inference
  https://www.youtube.com/watch?v=lekCh_i32iE&list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH&index=37
• Neural networks [5.3]: Restricted Boltzmann machine – free energy
  https://www.youtube.com/watch?v=e0Ts_7Y6hZU&list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH&index=38

Page 19:

Learning More – Deep Belief Network

• Neural networks [7.7]: Deep learning – deep belief network
  https://www.youtube.com/watch?v=vkb6AWYXZ5I&list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH&index=57
• Neural networks [7.8]: Deep learning – variational bound
  https://www.youtube.com/watch?v=pStDscJh2Wo&list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH&index=58
• Neural networks [7.9]: Deep learning – DBN pre-training
  https://www.youtube.com/watch?v=35MUlYCColk&list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH&index=59

Page 20:

Auto-encoder for CNN

Encoder: Convolution → Pooling → Convolution → Pooling → code
Decoder: the mirror image, using Unpooling and Deconvolution to map the code back up, making the output as close as possible to the input.

Page 21:

CNN – Unpooling

When unpooling (e.g. 14 × 14 → 28 × 28), place each pooled value back at the position recorded during max pooling and fill the rest with zeros.

Source of image: https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/image_segmentation.html

Alternative: simply repeat the values.
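The "simply repeat the values" alternative is one line of NumPy; a 2 × 2 toy map stands in for the 14 × 14 feature map, and each entry is copied into a 2 × 2 block, doubling both spatial dimensions:

```python
import numpy as np

# Unpooling by repetition: every value fills its whole 2x2 region.
a = np.array([[1, 2],
              [3, 4]])
up = a.repeat(2, axis=0).repeat(2, axis=1)
# up == [[1, 1, 2, 2],
#        [1, 1, 2, 2],
#        [3, 3, 4, 4],
#        [3, 3, 4, 4]]
```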

Page 22:

CNN – Deconvolution

Actually, deconvolution is convolution: each input value spreads a scaled copy of the filter over the output, the overlapping contributions are added up, and the result is the same as an ordinary convolution on a zero-padded input.
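This can be checked in 1-D with a few lines; the overlap-add loop below is the "spread and sum" picture from the slide, and it reproduces NumPy's full convolution exactly (values and kernel are arbitrary illustrative choices):

```python
import numpy as np

# Stride-1 transposed convolution done "by hand": each input value
# adds a scaled copy of the kernel at its offset; overlaps sum up.
x = np.array([1.0, 2.0, 3.0])
k = np.array([1.0, 1.0, 1.0])

out = np.zeros(len(x) + len(k) - 1)
for i, v in enumerate(x):
    out[i:i + len(k)] += v * k       # overlap-add of scaled kernels

same = np.convolve(x, k)             # ordinary (full) convolution
# out == same == [1, 3, 6, 5, 3]
```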

Page 23:

Next …

• Can we use the decoder to generate something?

code → NN Decoder
