neural networks in natural language processingjiesutd.github.io/papers/201811nuist.pdf ·...

1 Neural Networks in Natural Language Processing -- with POS/NER as an example Jie YANG 杨杰 Singapore University of Technology and Design November 21, 2018 @Nanjing University of Information Science & Technology


Page 1: Neural Networks in Natural Language Processingjiesutd.github.io/papers/201811NUIST.pdf · 2020-06-03 · 2 Outline • Introduction • Machine Learning and Neural Networks Framework


Neural Networks in Natural Language Processing-- with POS/NER as an example

Jie YANG 杨杰

Singapore University of Technology and Design

November 21, 2018

@Nanjing University of Information Science & Technology


Outline

• Introduction

• Machine Learning and Neural Networks Framework

• Neural Network Models in Natural Language Processing (NLP)

• Overview of NN for NLP – example of POS/NER

• Conclusion


Introduction

Ø Natural Language Processing: [1]

§ Subfield of Artificial Intelligence (AI).

§ Interactions between computers and human (natural) languages.

§ How to program computers to process and analyze large amounts of natural language data.

[1]  https://en.wikipedia.org/wiki/Natural_language_processing


Ø Natural Language Processing: applications

Digital Speaker

Search Engine

Machine Translation

News Recommendation


Ø Neural Network: [2]

§ Inspired by the biological neural networks.

§ Connected neurons, with weights and nonlinear functions.

[2]  https://en.wikipedia.org/wiki/Artificial_neural_network

Figures: Biological Neuron, Neural Network


ML&NN

Ø Machine Learning: machine + learning

§ Machine: automatic, efficient, programmable

§ Learning: learning from data, rather than using rules

§ Supervised: large annotated data

§ Semi-supervised: small annotated data + large unannotated data

§ Unsupervised: unannotated data


Ø Supervised Learning:

§ What we have: annotated data, i.e. training data (𝑥, 𝑦), where 𝑥 is the input vector and 𝑦 is the given label.

§ What we want: a model 𝑓() that predicts the label 𝑦' for new input 𝑥' at decoding time, i.e. 𝑓(𝑥') = 𝑦'.

e.g. 𝑓(𝑥) = 0.5𝑥² + 2.1𝑥 + 0.3, where 0.5, 2.1 and 0.3 are the parameters.

§ Machine learning is the process of finding this 𝑓().


Ø Supervised Learning:

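§ Train: take a training pair (𝑥, 𝑦) → feed 𝑥 to the model 𝑓(𝑥) with its current parameters → get the predicted ŷ → compare ŷ with the real 𝑦 (loss function) → right or wrong → update the parameters with the loss → next 𝑥.

§ Decode: feed 𝑥' to the trained model 𝑓(𝑥') with its parameters → predict 𝑦'.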
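The train and decode steps on this slide can be sketched in a few lines of Python. This is only an illustration, not the speaker's code: the linear model 𝑓(𝑥) = 𝑤𝑥 + 𝑏, the squared-error loss, the learning rate, and the toy data are all assumptions chosen to keep the example small.

```python
# Minimal sketch of the train/decode loop, assuming a linear model
# f(x) = w * x + b, a squared-error loss, and plain gradient descent.
# The data, learning rate, and epoch count are illustrative assumptions.

def f(x, params):
    """Model: structure (a line) plus parameters (w, b)."""
    w, b = params
    return w * x + b

def train(data, lr=0.05, epochs=2000):
    w, b = 0.0, 0.0                      # initial parameters
    for _ in range(epochs):
        for x, y in data:                # next (x, y) pair
            y_hat = f(x, (w, b))         # predicted label
            error = y_hat - y            # compare with the real label
            # gradient of the loss 0.5 * error**2 with respect to w and b
            w -= lr * error * x          # update parameters with the loss
            b -= lr * error
    return w, b

# Toy training data generated from y = 2x + 1 (hypothetical).
train_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
params = train(train_data)

# Decode: apply the trained model to an unseen input x'.
y_prime = f(4.0, params)
print(round(y_prime, 2))
```

The structure of the loop matches the diagram: predict, compare, update, move to the next 𝑥. Only the model and the loss change when a neural network is used instead of a line.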


Ø Model 𝑓(𝑥) : structure + parameters

§ Linear Regression, SVM, Decision Tree, etc.

§ Neural Network: Feed-forward NN, Convolutional NN, Recurrent NN

https://www.learnopencv.com/understanding-feedforward-neural-networks/

http://colah.github.io/posts/2015-08-Understanding-LSTMs/


Ø Problem:

§ Machine Learning or Neural Network models are trained to build a fitting function, which is calculated using numbers.

For example: 𝑓 𝑥 = 0.2𝑥, + 3𝑥2 − 2.3𝑥45

if 𝑥 = 1, then 𝑦 = 𝑓(𝑥) = 𝑓(1) = 0.2 + 3 − 2.3 = 0.9

§ Language is represented with words/characters.

“我来到南京信息工程大学。” (“I came to Nanjing University of Information Science & Technology.”)

“Our neural sequence labeling framework contains three layers.”

§ How do we apply neural networks to language processing?


NN for NLP

Ø Word Representation:

§ Goal: map the words into numbers.

§ Method: word embeddings.

§ Format: distributed real numbers, vectors

The chairman of the Federal Reserve is Ben Bernanke

[0.4, ...,1.3, -0.6] [0.7, ...,3.2, 1.5] [0.2, -1.2, 6.1...]... ... ...
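A word-embedding lookup can be sketched as a table with one vector per vocabulary word. The sketch below uses the example sentence from this slide. The embedding dimension and the random initialisation are assumptions for illustration, and real systems usually start from pretrained vectors.

```python
import random

# Sketch of an embedding table: each vocabulary word maps to a row of
# real numbers. The dimension (4) and random initialisation are assumptions.
random.seed(0)

sentence = "The chairman of the Federal Reserve is Ben Bernanke".split()
vocab = {word: idx for idx, word in enumerate(sorted(set(sentence)))}

EMB_DIM = 4
embedding_table = [
    [random.uniform(-1.0, 1.0) for _ in range(EMB_DIM)] for _ in vocab
]

def embed(words):
    """Look up one vector per word."""
    return [embedding_table[vocab[w]] for w in words]

vectors = embed(sentence)
print(len(vectors), len(vectors[0]))
```

Each word in the sentence is now a list of numbers that a model 𝑓(𝑥) can take as input.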


Ø Word Embeddings:

§ Advantages: can be fine-tuned during training; captures word similarity (similar words get nearby vectors)

Word embeddings mapped in two dimensions

http://nlp.yvespeirsman.be/images/glove-word-embeddings-education.png
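Word similarity between embedding vectors is commonly measured with cosine similarity. The following sketch shows the idea. The three vectors are hand-picked, hypothetical values and do not come from any trained embedding.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 3-dimensional vectors, hand-picked for illustration only.
toy_embeddings = {
    "school": [0.9, 0.8, 0.1],
    "college": [0.8, 0.9, 0.2],
    "banana": [0.1, 0.0, 0.9],
}

sim_close = cosine_similarity(toy_embeddings["school"], toy_embeddings["college"])
sim_far = cosine_similarity(toy_embeddings["school"], toy_embeddings["banana"])
print(sim_close > sim_far)
```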


Ø NN models in NLP:

Embeddings of 𝑥' → model 𝑓(𝑥') with parameters (RNN/LSTM/GRU, CNN, Transformer) → predict 𝑦'

1. Design Challenges and Misconceptions in Neural Sequence Labeling

2. Attention Is All You Need


POS/NER task

Ø POS: Part-of-speech (POS) tagging

§ Assign each word to a class of words that share similar syntactic behavior or fit a specific type.

Ø NER: Named Entity Recognition

§ Identify the named entities in the input text and classify them into pre-defined entity categories.

Input: The complicated language in the huge new law has muddied the fight .
Output: DT VBN NN IN DT JJ JJ NN VBZ VBN DT NN .

Input: We ’re about to see if advertising works .
Output: PRP VBP IN TO VB IN NN VBZ .

Input: This time , the firms were ready .
Output: DT NN , DT NNS VBD JJ .

[Barack Obama] PER was born in [hawaii] LOC .

Rare [Hendrix] PER song draft sells for almost $ 17,000 .

[Volkswagen AG] ORG won 77,719 registrations .

[Burundi] LOC disqualification from [African Cup] MISC confirmed .

The bank is a division of [First Union Corp] ORG .

The chairman of the [Federal Reserve] ORG is [Ben Bernanke] PER .

[US] LOC President [Trump] PER and [KP] LOC leader [Kim] PER will meet in [Singapore] LOC .


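Input: 张三 来到 南京 信息 工程 大学 (“Zhang San came to Nanjing University of Information Science & Technology”)

NER (entities): 张三 = Person; 南京 信息 工程 大学 = Organization

POS: NR VV NR NN NN NN

NER (tags): B-PER O B-ORG I-ORG I-ORG I-ORG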
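The B-/I-/O tags encode entity spans token by token: B- begins an entity, I- continues it, and O marks a token outside any entity. A small sketch of turning the tag sequence back into entities, using the example from this slide, is given below. The function name `bio_to_spans` is hypothetical.

```python
def bio_to_spans(tokens, tags):
    """Collect (entity_type, start, end_exclusive, text) from BIO tags.

    A stray I- tag with no open entity is ignored in this sketch.
    """
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):      # sentinel closes the last span
        inside = tag.startswith("I-") and label == tag[2:]
        if start is not None and not inside:
            spans.append((label, start, i, " ".join(tokens[start:i])))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans

tokens = ["张三", "来到", "南京", "信息", "工程", "大学"]
tags = ["B-PER", "O", "B-ORG", "I-ORG", "I-ORG", "I-ORG"]
entities = bio_to_spans(tokens, tags)
print(entities)
```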

Overview

Ø Overview of the entire process:

Data Annotation

Model Design

Model Training

Model Evaluation


Ø Data Annotation:

§ Based on the task, manually annotate text as training data, i.e. build (𝑥, 𝑦).

Example of YEDDA annotation interface


Ø Data Annotation:

§ Data format example: NER data segment and POS data segment (figures)
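The data segments on this slide were images, so their exact layout is not recoverable here. A common convention for sequence labeling data is one token and its label per line, with a blank line between sentences. The sketch below assumes that layout, and the function name `read_column_data` is hypothetical. The sample text reuses sentences from the POS/NER examples earlier in the talk.

```python
def read_column_data(text):
    """Parse 'token label' lines; blank lines separate sentences.

    Assumes whitespace-separated columns with the token first and the
    label last (an assumed layout, not confirmed by the slides).
    """
    sentences, tokens, labels = [], [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            if tokens:
                sentences.append((tokens, labels))
                tokens, labels = [], []
            continue
        parts = line.split()
        tokens.append(parts[0])
        labels.append(parts[-1])
    if tokens:                       # last sentence without trailing blank line
        sentences.append((tokens, labels))
    return sentences

sample = """Barack B-PER
Obama I-PER
was O
born O
in O
hawaii B-LOC
. O

This DT
time NN
, ,
the DT
firms NNS
were VBD
ready JJ
. ."""

data = read_column_data(sample)
print(len(data))
```

Each parsed sentence is an (𝑥, 𝑦) pair: the token list is the input and the label list is the annotation.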



Ø Model Design:

§ Different tasks benefit from different models:

§ Parsing: Transition-based, Biaffine Attention, tree-LSTM

§ Translation: Encoder-Decoder + Attention, Transformer

§ Sequence Labeling tasks: LSTM+CRF


Ø LSTM+CRF model:

Character Representation, Word Representation

Example of NCRF++
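In an LSTM+CRF tagger, the CRF layer typically picks the best label sequence as a whole rather than the best label per token, usually with Viterbi decoding. The sketch below illustrates that decoding step only. It is not NCRF++ code: the scores are hand-written, hypothetical numbers standing in for LSTM outputs and learned transition scores.

```python
def viterbi(emissions, transitions, labels):
    """Return the highest-scoring label path.

    emissions: one {label: score} dict per token.
    transitions: {(prev_label, label): score}.
    """
    best = {lab: emissions[0][lab] for lab in labels}
    backpointers = []
    for scores in emissions[1:]:
        new_best, pointers = {}, {}
        for lab in labels:
            prev = max(labels, key=lambda p: best[p] + transitions[(p, lab)])
            new_best[lab] = best[prev] + transitions[(prev, lab)] + scores[lab]
            pointers[lab] = prev
        best = new_best
        backpointers.append(pointers)
    last = max(labels, key=lambda lab: best[lab])
    path = [last]
    for pointers in reversed(backpointers):     # follow pointers backwards
        path.append(pointers[path[-1]])
    return path[::-1]

labels = ["O", "B-PER", "I-PER"]
# Hypothetical per-token scores for the tokens "Ben" and "Bernanke".
emissions = [
    {"O": 1.0, "B-PER": 0.9, "I-PER": 0.0},
    {"O": 0.2, "B-PER": 0.1, "I-PER": 1.0},
]
# All transitions score 0 except O -> I-PER, which is penalised.
transitions = {(p, lab): 0.0 for p in labels for lab in labels}
transitions[("O", "I-PER")] = -5.0

greedy = [max(labels, key=lambda lab: e[lab]) for e in emissions]
decoded = viterbi(emissions, transitions, labels)
print(greedy, decoded)
```

Picking the best label per token gives the invalid sequence O, I-PER, while the path-level decoding returns B-PER, I-PER, which is one reason a CRF layer is a common choice for sequence labeling.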



Ø Model Training:

§ Initialize the model parameters: random or pretraining.

§ Feed the annotated data as input: (𝑥, 𝑦).

§ Update the model parameters to fit the annotated data.
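One common way to write the update step is gradient descent. The symbols below are assumptions introduced for illustration and do not appear on the slide: θ stands for the model parameters, η for the learning rate, and L for the loss function that compares the prediction with the annotated label.

```latex
\theta \leftarrow \theta - \eta \, \nabla_{\theta} \, L\bigl(f_{\theta}(x),\, y\bigr)
```

Repeating this step over the annotated pairs (𝑥, 𝑦) moves the parameters toward values that fit the training data.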



Ø Model Evaluation:

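§ After training, how do we evaluate the model's performance?

§ Different tasks require different evaluation metrics.

§ POS tagging: $\text{accuracy} = \frac{\text{correct tokens}}{\text{all tokens}}$

§ NER: $F_1 = \frac{2PR}{P + R}$, where $P$ is precision and $R$ is recall

§ Some tasks are hard to evaluate, e.g. translation:

Original: I hate you .
Translator 1: 我讨厌你 (“I hate you”)
Translator 2: 我厌恶你 (“I detest you”)
Translator 3: 我不喜欢你 (“I don't like you”)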
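The two metrics named on this slide, token accuracy for POS tagging and F1 for NER, can be computed as in the sketch below. The predicted tags and entities are made-up examples, and the entity representation as (type, start, end) tuples is an assumption of this sketch.

```python
def accuracy(gold_tags, pred_tags):
    """Token-level accuracy: correct tokens / all tokens."""
    correct = sum(g == p for g, p in zip(gold_tags, pred_tags))
    return correct / len(gold_tags)

def f1_score(gold_entities, pred_entities):
    """Entity-level F1; entities are sets of (type, start, end) tuples."""
    true_positives = len(gold_entities & pred_entities)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_entities)
    recall = true_positives / len(gold_entities)
    return 2 * precision * recall / (precision + recall)

# Gold POS tags for "This time , the firms were ready ." and a
# hypothetical prediction with one wrong tag (NNS predicted as NN).
gold_pos = ["DT", "NN", ",", "DT", "NNS", "VBD", "JJ", "."]
pred_pos = ["DT", "NN", ",", "DT", "NN", "VBD", "JJ", "."]
acc = accuracy(gold_pos, pred_pos)

# Gold entities and a hypothetical prediction with one wrong boundary.
gold_ner = {("PER", 0, 1), ("ORG", 2, 6)}
pred_ner = {("PER", 0, 1), ("ORG", 2, 5)}
f1 = f1_score(gold_ner, pred_ner)
print(acc, f1)
```

Here 7 of 8 tokens are correct, so accuracy is 0.875. One of two predicted entities matches one of two gold entities, so precision and recall are both 0.5 and F1 is 0.5.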


Ø Model Evaluation:

Figures: POS task evaluation, NER task evaluation


Ø Model Evaluation:

§ Use the trained model to decode the test data 𝑥 and get the result ŷ.

§ Evaluate the model by comparing ŷ with the gold label 𝑦.

§ If necessary, go back and refine the model design or training step, or even the data annotation step.

Data Annotation

Model Design

Model Training

Model Evaluation


Conclusion

Ø We have gone through the whole development process of neural-network-based NLP.

Ø Training a neural network model amounts to fitting a function to the given annotated data.

Ø Many neural network models can be used in NLP.

Ø Neural-network-based NLP uses embedding vectors to map words to numbers.

Ø Data, model, and evaluation are the three important parts in the development of NN-based NLP.


Thanks!

Q&A