Conditional Generation by RNN & Attention

Hung-yi Lee 李宏毅

TRANSCRIPT

Page 1:

Conditional Generation by RNN & Attention

Hung-yi Lee

李宏毅

Page 2:

Outline

• Generation

• Attention

• Tips for Generation

• Pointer Network

Page 3:

Generation

Generating a structured object component-by-component

Page 4:

Generation

• Sentences are composed of characters/words

• Generating a character/word at each time step by RNN

(Diagram: the RNN generates one character per step. Given inputs <BOS>, w1, w2, w3, …, it outputs the distributions P(w1), P(w2|w1), P(w3|w1,w2), P(w4|w1,w2,w3), …, and each character is sampled (~) from its distribution, e.g. 床, 前, 明, 月.)

http://youtien.pixnet.net/blog/post/4604096-%E6%8E%A8%E6%96%87%E6%8E%A5%E9%BE%8D%E4%B9%8B%E5%B0%8D%E8%81%AF%E9%81%8A%E6%88%B2
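A minimal sketch of this sampling loop, assuming a toy GRU-based character model (the CharRNN class, the sizes, and the <BOS> id below are illustrative, not from the lecture):

```python
# A toy GRU character model; class name, sizes and ids are illustrative.
import torch
import torch.nn as nn

class CharRNN(nn.Module):
    def __init__(self, vocab_size, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRUCell(hidden, hidden)
        self.out = nn.Linear(hidden, vocab_size)

    def step(self, token, h=None):
        h = self.rnn(self.embed(token), h)   # one RNN time step
        return self.out(h), h                # logits for the next token, new state

def generate(model, bos_id, max_len=20):
    token, h, result = torch.tensor([bos_id]), None, []
    for _ in range(max_len):
        logits, h = model.step(token, h)
        probs = torch.softmax(logits, dim=-1)       # P(w_t | w_1, ..., w_{t-1})
        token = torch.multinomial(probs, 1)[:, 0]   # the "~" on the slide: sample
        result.append(token.item())
    return result
```

Sampling (rather than always taking the argmax) is what lets the model produce a different poem each run.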

Page 5:

Generation

• Images are composed of pixels

• Generating a pixel at each time step by RNN

(Diagram: consider the image as a "sentence" of pixel colors, e.g. blue red yellow gray ……, and train a language model on these "sentences". As on the previous page, the RNN outputs P(w1), P(w2|w1), P(w3|w1,w2), P(w4|w1,w2,w3), … and one pixel color is sampled (~) per step, e.g. red, blue, pink.)

Page 6:

Generation

• Images are composed of pixels

3 x 3 images

Page 7:

Generation

• Image
Aaron van den Oord, Nal Kalchbrenner, Koray Kavukcuoglu, Pixel Recurrent Neural Networks, arXiv preprint, 2016
Aaron van den Oord, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, Koray Kavukcuoglu, Conditional Image Generation with PixelCNN Decoders, arXiv preprint, 2016

• Video
Aaron van den Oord, Nal Kalchbrenner, Koray Kavukcuoglu, Pixel Recurrent Neural Networks, arXiv preprint, 2016

• Handwriting
Alex Graves, Generating Sequences With Recurrent Neural Networks, arXiv preprint, 2013

• Speech
Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu, WaveNet: A Generative Model for Raw Audio, 2016

Page 8:

Conditional Generation

• We don’t want to simply generate some random sentences.

• Generate sentences based on conditions:

Caption generation. Given condition: an image. Output: “A young girl is dancing.”

Chat-bot. Given condition: “Hello”. Output: “Hello. Nice to see you.”

Page 9:

Conditional Generation

• Represent the input condition as a vector, and use the vector as the input of the RNN generator

Image Caption Generation

(Diagram: the input image goes through a CNN, producing a vector; the vector conditions the RNN generator, which is fed <BOS> and then its own outputs, producing “A”, “woman”, …, “.” (period).)
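As a sketch, this conditioning could look like the following (the CaptionDecoder class, the dimensions, and the choice of feeding the image vector as the first RNN input are assumptions for illustration):

```python
# Hypothetical CaptionDecoder: the CNN image vector conditions the RNN
# by being fed as its first input, as in the slide's diagram.
import torch
import torch.nn as nn

class CaptionDecoder(nn.Module):
    def __init__(self, vocab_size, img_dim=512, hidden=512):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, hidden)   # CNN vector -> RNN input space
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRUCell(hidden, hidden)
        self.out = nn.Linear(hidden, vocab_size)

    def generate(self, img_vec, bos_id, eos_id, max_len=20):
        # img_vec: (1, img_dim) vector produced by the CNN
        h = self.rnn(self.img_proj(img_vec))      # condition on the image first
        token, words = torch.tensor([bos_id]), []
        for _ in range(max_len):
            h = self.rnn(self.embed(token), h)
            token = self.out(h).argmax(dim=-1)    # greedy here; sampling also works
            if token.item() == eos_id:            # stop at "." / end of sentence
                break
            words.append(token.item())
        return words
```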

Page 10:

Conditional Generation

• Represent the input condition as a vector, and use the vector as the input of the RNN generator

• E.g. machine translation / chat-bot

(Diagram: the encoder reads 機器學習 (“machine learning”); its final state carries the information of the whole sentence and initializes the decoder, which outputs “machine”, “learning”, “.” (period). Encoder and decoder are jointly trained.)

Sequence-to-sequence learning
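A compact sketch of such an encoder-decoder (shapes and names assumed; tgt_in is the teacher-forced target prefix, as is standard):

```python
# Encoder-decoder sketch; the encoder's final state summarizes the
# whole input sentence and initializes the decoder.
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, hidden=256):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.tgt_embed = nn.Embedding(tgt_vocab, hidden)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src, tgt_in):
        _, h = self.encoder(self.src_embed(src))        # h: whole-sentence information
        y, _ = self.decoder(self.tgt_embed(tgt_in), h)  # h initializes the decoder
        return self.out(y)   # per-step logits; both parts trained jointly
```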

Page 11:

Conditional Generation

(Diagram: a multi-turn dialogue. U: Hi, M: Hi / Hello, U: Hi, M: Hi / Hello; the reply should depend on what was said before.)

Serban, Iulian V., Alessandro Sordoni, Yoshua Bengio, Aaron Courville, and Joelle Pineau, “Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models”, 2015

https://www.youtube.com/watch?v=e2MpOmyQJw4

Need to consider longer context during chatting

Page 12:

Attention

Dynamic Conditional Generation

Page 13:

Dynamic Conditional Generation

(Diagram: as before, the encoder reads 機器學習, but the decoder now receives a different condition vector at each step: c1 when producing “machine”, c2 for “learning”, c3 for “.” (period), instead of one fixed vector summarizing the whole sentence.)

Page 14:

Machine Translation

• Attention-based model

(Diagram: the initial decoder state z0 is matched against each encoder hidden state h1, h2, h3, h4 of 機器學習; match(h1, z0) gives the score α_0^1, and so on.)

What is match?
➢ Cosine similarity of z and h
➢ A small NN whose input is z and h and whose output is a scalar
➢ α = h^T W z
Design it yourself; match is jointly learned with the other parts of the network.
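Sketches of those three match options (the dimension d and layer sizes are assumed; all parameters are trained jointly with the rest of the network):

```python
# The three match options named above; d and the layer sizes are assumed.
import torch
import torch.nn as nn

d = 256
W = nn.Parameter(torch.randn(d, d))   # learned jointly, like everything else
small_nn = nn.Sequential(nn.Linear(2 * d, d), nn.Tanh(), nn.Linear(d, 1))

def match_cosine(z, h):               # cosine similarity of z and h
    return torch.cosine_similarity(z, h, dim=-1)

def match_nn(z, h):                   # small NN: input z and h, output a scalar
    return small_nn(torch.cat([z, h], dim=-1)).squeeze(-1)

def match_bilinear(z, h):             # alpha = h^T W z
    return torch.einsum('bd,de,be->b', h, W, z)
```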

Page 15:

Machine Translation

• Attention-based model

(Diagram: the scores α_0^1, α_0^2, α_0^3, α_0^4 pass through a softmax to give α̂_0^1, …, α̂_0^4, e.g. 0.5, 0.5, 0.0, 0.0. The decoder input is the context vector

c0 = Σ_i α̂_0^i h^i = 0.5 h1 + 0.5 h2,

and from z0 and c0 the decoder moves to state z1 and outputs “machine”.)
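Numerically, one attention step is just score, softmax, weighted sum; the sketch below assumes a dot-product match for simplicity (the slide's match can be any of the three options above):

```python
import torch

def attend(z, H):
    # H: (T, d) encoder states h1..hT; z: (d,) current decoder state.
    scores = H @ z                           # one score alpha per position
    weights = torch.softmax(scores, dim=0)   # hat(alpha), sums to 1
    return weights @ H, weights              # c = sum_i hat(alpha)^i h^i

# With weights [0.5, 0.5, 0.0, 0.0], c equals 0.5*h1 + 0.5*h2 as on the slide.
```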

Page 16:

Machine Translation

• Attention-based model

(Diagram: the new state z1, reached after outputting “machine”, is matched against h1…h4 again, giving new scores α_1^1, ….)

Page 17:

Machine Translation

• Attention-based model

(Diagram: after softmax the weights α̂_1^1, …, α̂_1^4 are e.g. 0.0, 0.0, 0.5, 0.5, giving

c1 = Σ_i α̂_1^i h^i = 0.5 h3 + 0.5 h4;

c1 and z1 produce the state z2, which outputs “learning”.)

Page 18:

Machine Translation

• Attention-based model

(Diagram: z2 is matched against h1…h4 to get α_2^1, …, and the same process repeats until “.” (period) is generated.)

Page 19:

Speech Recognition

William Chan, Navdeep Jaitly, Quoc V. Le, Oriol Vinyals, “Listen, Attend and Spell”, ICASSP, 2016

Page 20:

Image Caption Generation

(Diagram: the CNN turns the input image into a set of filter outputs, one vector for each region; z0 is matched against each region vector, giving e.g. a score of 0.7 for one region.)

Page 21:

Image Caption Generation

(Diagram: with attention weights 0.7, 0.1, 0.1, 0.1, 0.0, 0.0 over the region vectors, their weighted sum is fed to the decoder, which moves from z0 to z1 and outputs word 1.)

Page 22:

Image Caption Generation

(Diagram: at the next step the weights become 0.0, 0.8, 0.2, 0.0, 0.0, 0.0; the new weighted sum is fed with z1 to produce z2 and word 2.)

Page 23:

Image Caption Generation

Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio, “Show, Attend and Tell: Neural Image Caption Generation with Visual Attention”, ICML, 2015

Page 24:

Image Caption Generation

Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio, “Show, Attend and Tell: Neural Image Caption Generation with Visual Attention”, ICML, 2015

Page 25:

Li Yao, Atousa Torabi, Kyunghyun Cho, Nicolas Ballas, Christopher Pal, Hugo Larochelle, Aaron Courville, “Describing Videos by Exploiting Temporal Structure”, ICCV, 2015

Page 26:

Memory Network

(Diagram: the document is stored as sentence vectors x1, x2, …, xN. The query q is matched against each xn, giving attention weights α1, α2, …, αN. The extracted information

Σ_{n=1}^{N} αn xn

is fed to a DNN, which outputs the answer. Sentence-to-vector encoding can be jointly trained.)

Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston, Rob Fergus, “End-To-End Memory Networks”, NIPS, 2015
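A sketch of this read step (dot-product match and tensor shapes assumed):

```python
import torch

def memory_read(q, X):
    # q: (d,) query vector; X: (N, d) sentence vectors x1..xN.
    alpha = torch.softmax(X @ q, dim=0)   # match each sentence against the query
    return alpha @ X                      # extracted information: sum_n alpha_n x_n
# The result is fed to a DNN that outputs the answer.
```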

Page 27:

Memory Network

(Diagram: a jointly learned variant with two vectors per sentence: xn is used to match the query q and compute αn, while a separate hn is used for extraction, so the extracted information is

Σ_{n=1}^{N} αn hn,

which goes to the DNN for the answer. The extraction can also be repeated several times: hopping.)

Page 28:

Memory Network

(Diagram of hopping: use the query q to compute attention and extract information from the memory; use the result to compute attention and extract information again; after several such hops, a DNN produces the answer a.)

Page 29:

Wei Fang, Juei-Yang Hsu, Hung-yi Lee, Lin-Shan Lee, “Hierarchical Attention Model for Improved Machine Comprehension of Spoken Content”, SLT, 2016

Page 30:

Neural Turing Machine

• von Neumann architecture

https://www.quora.com/How-does-the-Von-Neumann-architecture-provide-flexibility-for-program-development

A Neural Turing Machine not only reads from memory, but also modifies the memory through attention.

Page 31:

Neural Turing Machine

(Diagram of the retrieval process: the long-term memory holds vectors m_0^1, m_0^2, m_0^3, m_0^4 with read attention α̂_0^1, …, α̂_0^4. The read vector

r0 = Σ_i α̂_0^i m_0^i

is fed together with the input x1 into a network f (initial state h0), which outputs y1.)

Page 32:

Neural Turing Machine

(Diagram: f also outputs a key k1, an erase vector e1 and an add vector a1. New attention is computed from the key by content-based addressing,

α_1^i = cos(m_0^i, k1),

followed by a softmax to give α̂_1^1, …, α̂_1^4.)

Page 33:

Neural Turing Machine

• Real version

Page 34:

Neural Turing Machine

(Diagram: writing. Each memory cell is updated using the new attention α̂_1^i, the erase vector e1 (components in 0 ~ 1) and the add vector a1:

m_1^i = m_0^i − α̂_1^i e1 ⊙ m_0^i + α̂_1^i a1    (⊙: element-wise).)
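Putting the read and write of this simplified NTM together, one step could be sketched as follows (shapes assumed; e is assumed already squashed into 0 ~ 1, e.g. by a sigmoid):

```python
# Content addressing (cosine + softmax), read, then erase-and-add write.
import torch

def ntm_step(m, k, e, a):
    # m: (N, d) memory; k, e, a: (d,) key, erase and add vectors from f.
    alpha = torch.cosine_similarity(m, k.unsqueeze(0), dim=-1)  # alpha^i = cos(m^i, k)
    w = torch.softmax(alpha, dim=0)                             # hat(alpha)^i
    r = w @ m                                                   # read vector r
    m = m - w.unsqueeze(1) * e * m + w.unsqueeze(1) * a         # the write rule above
    return r, m
```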

Page 35:

Neural Turing Machine

(Diagram: the process over time. Reading m_0 with α̂_0 gives r0; r0 and x1 go through f (state h0) to output y1; writing with α̂_1 updates the memory to m_1; then r1 and x2 give y2 through f (state h1), with the memory updated to m_2 via α̂_2, and so on.)

Page 36:

Tips for Generation

Page 37:

Attention

Good attention: each input component receives approximately the same total attention weight over the whole generation.

Bad attention (diagram: attention weights α_t^i over four inputs w1 … w4, with superscript i indexing the component and subscript t the generation time): if the weights concentrate on the same component at several steps, the caption may repeat itself, e.g. “(woman) … (woman) ……”, and miss content such as “cooking”.

E.g. regularization term: Σ_i (τ − Σ_t α_t^i)^2, where the inner sum runs over the generation, the outer sum over the input components, and τ is a target value.

Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio, “Show, Attend and Tell: Neural Image Caption Generation with Visual Attention”, ICML, 2015
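That regularization term translates directly into code (tensor layout assumed):

```python
import torch

def attention_reg(alpha, tau=1.0):
    # alpha: (T, N), alpha[t, i] = attention on component i at time t.
    total = alpha.sum(dim=0)           # sum over the generation, per component
    return ((tau - total) ** 2).sum()  # sum_i (tau - sum_t alpha_t^i)^2
```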

Page 38:

Mismatch between Train and Test

• Training

(Diagram: given the condition, the reference is “A B B”. The RNN is fed <BOS>, A, B, i.e. the reference tokens, and trained to output A, B, B, minimizing the cross-entropy of each component: C = Σ_t C_t.)

Page 39:

Mismatch between Train and Test

• Generation

(Diagram: at test time we do not know the reference; the model is fed <BOS> and then its own previous outputs, e.g. producing B, A, A.)

Testing: the output of the model is the input of the next step.
Training: the inputs are the reference.

This mismatch is called exposure bias.

Page 40:

(Diagram: the tree of possible output sequences, branching into A or B at each step. Training only ever follows the reference path, so the other branches are never explored; at test time, one step wrong may make everything after it totally wrong. 一步錯,步步錯 (“one wrong step, and every step after goes wrong”).)

Page 41:

Modifying Training Process?

(Diagram: feed the model's own outputs as inputs during training, so that training is matched to testing, and compare against the reference “A B B”. In practice it is hard to train this way: when we try to decrease the loss for both step 1 and step 2, fixing step 1 changes the input that step 2 is trained on, so the two updates interfere.)

Page 42:

Scheduled Sampling

(Diagram: at each step, the input token comes either from the reference or from the model's own output; which one is used is decided at random, with the probability of using the reference gradually decreased during training.)

Page 43:

Scheduled Sampling

• Caption generation on MSCOCO

                        BLEU-4   METEOR   CIDEr
Always from reference    28.8     24.2     89.5
Always from model        11.2     15.7     49.7
Scheduled Sampling       30.6     24.3     92.1

Samy Bengio, Oriol Vinyals, Navdeep Jaitly, Noam Shazeer, Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks, arXiv preprint, 2015
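A sketch of the mixing rule (the model.step interface follows the earlier sketches; names are assumed):

```python
# Coin-flip mixing of reference and model inputs during training.
import random
import torch

def scheduled_inputs(model, reference, bos_id, p_ref):
    token, h, logits_per_step = torch.tensor([bos_id]), None, []
    for t in range(len(reference)):
        logits, h = model.step(token, h)
        logits_per_step.append(logits)
        if random.random() < p_ref:                  # input from the reference
            token = torch.tensor([reference[t]])
        else:                                        # input from the model
            token = logits.argmax(dim=-1)
    return logits_per_step  # trained with the usual per-step cross-entropy
# p_ref starts near 1 (pure teacher forcing) and is decayed during training.
```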

Page 44:

Beam Search

(Diagram: a search tree over outputs A/B with per-step probabilities. Greedy decoding follows the locally best branch, 0.6 → 0.6 → 0.6, but the green path, 0.4 → 0.9 → 0.9, has a higher overall score. It is not possible to check all the paths, since their number grows exponentially with length.)

Page 45:

Beam Search

(Diagram: keep the several best paths at each step; here the beam size = 2.)
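A minimal beam-search sketch (step_fn is an assumed callable yielding (token, log-probability) pairs for every possible next token given a partial path):

```python
import heapq

def beam_search(step_fn, bos_id, eos_id, beam_size=2, max_len=10):
    beams = [(0.0, [bos_id])]                 # (cumulative log-prob, path)
    for _ in range(max_len):
        candidates = []
        for score, path in beams:
            if path[-1] == eos_id:            # finished paths carry over unchanged
                candidates.append((score, path))
                continue
            for tok, logp in step_fn(path):   # expand each kept path
                candidates.append((score + logp, path + [tok]))
        beams = heapq.nlargest(beam_size, candidates, key=lambda c: c[0])
    return max(beams, key=lambda c: c[0])[1]  # best complete path
```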

Page 46:

Beam Search

https://github.com/tensorflow/tensorflow/issues/654#issuecomment-169009989

The beam size is 3 in this example.

Page 47:

Better Idea?

U: 你覺得如何? (“How do you feel about it?”)
M: 高興想笑 (“happy and want to laugh”) or 難過想哭 (“sad and want to cry”)

(Diagram: both references are valid, so at step 1 高興 (“happy”) and 難過 (“sad”) get similar scores, 高興 ≈ 難過, and at step 2 想笑 (“want to laugh”) ≈ 想哭 (“want to cry”). A mixed output such as 高興想哭 (“happy and want to cry”) can then receive a high score even though it is wrong.)

Page 48:

Object level v.s. Component level

• Minimizing the error defined at the component level is not equivalent to improving the generated objects

Cross-entropy of each step: C = Σ_t C_t

Ref: The dog is running fast
  A cat a a a
  The dog is is fast
  The dog is running fast

(The per-step cross-entropy ranks “The dog is is fast” well above “A cat a a a”, even though both are poor sentences; the component-level loss does not directly measure the quality of the whole object.)

Optimize an object-level criterion instead of the component-level cross-entropy. Object-level criterion: R(y, ŷ), where y is the generated utterance and ŷ is the ground truth.

Gradient descent? (R is not differentiable with respect to the network outputs, so plain gradient descent does not apply.)

Page 49:

Reinforcement learning?

(Diagram: the standard RL loop in a video game. Start with observation s1; take action a1: “right”, obtaining reward r1 = 0; then observation s2; take action a2: “fire” (kill an alien), obtaining reward r2 = 5; then observation s3; ….)

Page 50:

Reinforcement learning?

Marc'Aurelio Ranzato, Sumit Chopra, Michael Auli, Wojciech Zaremba, “Sequence Level Training with Recurrent Neural Networks”, ICLR, 2016

(Diagram: generation as RL. The hidden state is the observation, the words {A, B} form the action set, and the action taken influences the observation in the next step. Starting from <BOS>, the model takes actions, e.g. B, A, A, with reward r = 0 at the intermediate steps and a final reward R(“BAA”, reference) at the end.)
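In the spirit of this setup, a REINFORCE-style loss could be sketched as follows (model.step and reward_fn, e.g. BLEU against the reference, are assumed interfaces, not the paper's exact algorithm):

```python
import torch

def reinforce_loss(model, bos_id, reward_fn, reference, max_len=20):
    token, h, log_probs, tokens = torch.tensor([bos_id]), None, [], []
    for _ in range(max_len):
        logits, h = model.step(token, h)
        dist = torch.distributions.Categorical(logits=logits)
        token = dist.sample()                   # action: choose the next word
        log_probs.append(dist.log_prob(token))
        tokens.append(token.item())
    R = reward_fn(tokens, reference)            # reward only at the very end
    return -R * torch.stack(log_probs).sum()    # minimizing this ascends R
```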

Page 51:

(Figure: comparison of reinforcement learning and scheduled sampling.)

Page 52:

(Figure: results; DAD denotes scheduled sampling, MIXER denotes the reinforcement-learning approach.)

Page 53:

Pointer Network

Oriol Vinyals, Meire Fortunato, Navdeep Jaitly, “Pointer Networks”, NIPS, 2015

Pages 54–58: (diagram-only slides; no text extracted.)
Page 59:

Applications

Jiatao Gu, Zhengdong Lu, Hang Li, Victor O.K. Li, “Incorporating Copying Mechanism in Sequence-to-Sequence Learning”, ACL, 2016
Caglar Gulcehre, Sungjin Ahn, Ramesh Nallapati, Bowen Zhou, Yoshua Bengio, “Pointing the Unknown Words”, ACL, 2016

Machine Translation

Chat-bot

User: X寶你好,我是宏毅 (“Hello X-bao, I am Hung-yi”)

Machine: 宏毅你好,很高興認識你 (“Hello Hung-yi, nice to meet you”; the name 宏毅 is copied from the input)
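A hedged sketch of the copying idea behind these applications (this particular mixture follows pointer-generator-style models; all names, shapes, and p_gen are assumptions, not the exact formulation of either cited paper):

```python
# Mix the generator's vocabulary distribution with the attention
# distribution over input positions, so rare words can be copied.
import torch

def copy_distribution(vocab_probs, attn_weights, src_token_ids, p_gen):
    # vocab_probs: (V,) distribution over the vocabulary
    # attn_weights: (T,) attention over the T input tokens
    # src_token_ids: (T,) vocabulary ids of those input tokens
    # p_gen: scalar in [0, 1], probability of generating vs. copying
    out = p_gen * vocab_probs
    return out.index_add(0, src_token_ids, (1 - p_gen) * attn_weights)
# Mass placed on input positions lets a rare name like 宏毅 be copied verbatim.
```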