Natural Language Processing 2010 (自然言語処理 2010)


Page 1: 自然言語処理 2010

Natural Language Processing 2010

Tokyo University of Technology, School of Computer Science

Hiroyuki Kameda

Page 2: 自然言語処理 2010

Today's Topics

• Taking on a research-level topic!
• Machine learning (an introduction to inductive logic programming)

2

Page 3: 自然言語処理 2010

Taking on a research-level topic!

• Let's try to understand a research presentation from PACLING 2009!
  – You can already understand material at this level.
  – Listen critically.
  – Take away your own ideas.
    (Once you have an idea, use it as the basis for your own NLP research!)

3

Page 4: 自然言語処理 2010

Unknown Word Acquisition by Reasoning Word Meanings

Incrementally and Evolutionarily

Hiroyuki Kameda, Tokyo University of Technology

Chiaki Kubomura, Yamano College of Aesthetics

Page 5: 自然言語処理 2010

Overview

1. Research background

2. Basic ideas of knowledge acquisition

3. Demonstrations

4. Conclusions

5

Page 6: 自然言語処理 2010

Exchange of Information & Knowledge

natural & smooth communication

Language etc.

6

Page 7: 自然言語処理 2010

A robot talking with a human

7

Page 8: 自然言語処理 2010

Dialogue

New product releases

New social events

Unknown words

( Various Topics )

Young people's words

Professional slang

natural, smooth and flexible communication

Highly upgraded NLP technology

Unknown word processing

8

Page 9: 自然言語処理 2010

Unknown Word Acquisition - Our Basic Ideas -

9

Page 10: 自然言語処理 2010

Information Processing Model

10

Page 11: 自然言語処理 2010

NLP System

Text: Dogs ran fast.

Syntactic structure:

  S-----NP-----N-----dogs
   |
   -----VP-----V-----ran
          |
          -----Adv----fast

11
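The mapping this slide describes (a token list in, a syntactic structure out) can be written directly as a Prolog DCG. The following is a minimal sketch of my own, not code from the presentation:

% Minimal DCG sketch (illustration only): parse "dogs ran fast" into a tree.
s(s(NP, VP))   --> np(NP), vp(VP).
np(np(N))      --> n(N).
vp(vp(V, Adv)) --> v(V), adv(Adv).
n(n(dogs))     --> [dogs].
v(v(ran))      --> [ran].
adv(adv(fast)) --> [fast].

% ?- phrase(s(Tree), [dogs, ran, fast]).
% Tree = s(np(n(dogs)), vp(v(ran), adv(fast))).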

Page 12: 自然言語処理 2010

Information Processing Model 2

12

Page 13: 自然言語処理 2010

Knowledge Acquisition Model

13

Page 14: 自然言語処理 2010

Knowledge Acquisition System

14

Page 15: 自然言語処理 2010

More concretely…

NLP Engine

Lexicon ( Domain Knowledge )

Program (source code) ( Knowledge of objects to be processed )

UW Acquisition

Processing rule acquisition

sentence

Internal representation

trigger

trigger

15

Page 16: 自然言語処理 2010

Even more concretely…

NLP Engine( Prolog Interpreter )

Lexicon( Domain Knowledge )

Rule-based grammar + α( Target Knowledge )

UW Acquisition

Syntactic rule acquisition( by ILP )

sentence

Internal representation

Failure-trigger

Batch

16

Page 17: 自然言語処理 2010

Even more concretely… (2)

NLP Engine( Prolog Interpreter )

Lexicon( Domain Knowledge )

Rule-based grammar + α( Target Knowledge )

UW Acquisition

Syntactic rule acquisition( by ILP )

sentence

Internal representation

Failure-trigger

Batch

17

Page 18: 自然言語処理 2010

Main topics

18

Page 19: 自然言語処理 2010

Let's consider a grammar.

G1 = { Vn, Vt, s, P },

where

Vn = {noun, verb, adverb, sentence},

Vt = {dogs, run, fast},

s = sentence,

P = { sentence → noun + verb + adverb,
      noun → dogs, verb → run, adverb → fast }.

○   dogs run fast. ×   cars run fast.

19

Page 20: 自然言語処理 2010

Let's consider a grammar.

Grammar G1 = { Vn, Vt, s, P },

where

Vn = {noun, verb, adverb, sentence},

Vt = {dogs, run, fast},

s = sentence,

P = { sentence → noun + verb + adverb,
      noun → dogs, verb → run, adverb → fast }.

○   dogs run fast. ×   cars run fast.

Cannot unify!
cars <=!=> noun

20

Page 21: 自然言語処理 2010

Our ideas

1. Processing modes

2. Processing strategies

21

Page 22: 自然言語処理 2010

Processing Modes

22

Page 23: 自然言語処理 2010

Processing Modes

Modes     Unknown words   Unknown syntactic rules
Mode 1    No              No
Mode 2    Yes             No
Mode 3    No              Yes
Mode 4    Yes             Yes

23

Page 24: 自然言語処理 2010

Processing Modes

Modes     Unknown words   Unknown syntactic rules
Mode 1    No              No
Mode 2    Yes             No
Mode 3    No              Yes                        ×
Mode 4    Yes             Yes                        ×

24

Page 25: 自然言語処理 2010

Processing Modes

Modes     Unknown words   Unknown syntactic rules
Mode 1    No              No
Mode 2    Yes             No
Mode 3    No              Yes                        ×
Mode 4    Yes             Yes                        ×

○   dogs run fast.
×   cars run fast.

25

Page 26: 自然言語処理 2010

Processing Strategies

26

Page 27: 自然言語処理 2010

Adopted Processing Strategies

1. First, parse the sentence in mode-1.

2. If parsing fails, switch the processing mode from mode-1 to mode-2 (a sketch follows below).

27
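A minimal Prolog sketch of this strategy, assuming the sentence/2 predicate of the grammar shown on the next slide; the mode/1 flag and the predicate names are my own illustration, not the system's actual code:

:- dynamic mode/1.
mode(1).                                  % start in mode-1

switch_to_mode2 :-                        % retract/assert changes the mode flag
    retract(mode(1)),
    assertz(mode(2)).

parse_with_strategy(Sentence) :-
    (   sentence(Sentence, [])            % step 1: try to parse in mode-1
    ->  true
    ;   switch_to_mode2,                  % step 2: parsing failed, go to mode-2
        sentence(Sentence, [])            % and parse again
    ).

Under this reading, a mode-2-only rule such as the extra noun clause on the next slide would carry a mode(2) guard so that it can fire only after the switch.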

Page 28: 自然言語処理 2010

Grammar G1 in Prolog

sentence(A,D) :- noun(A,B), verb(B,C), adverb(C,D).

noun([dogs|T],T).

verb([run|T],T).

adverb([fast|T],T).

noun([_|T],T) :- write('Unknown word found!').

Syntactic rule

Lexicon

New processing rule

28

G1 = { Vn, Vt, s, P },

where

Vn = {noun, verb, adverb, sentence},

Vt = {dogs, run, fast},

s = sentence,

P = { sentence → noun + verb + adverb,

noun → dogs, verb → run, adverb → fast }.
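With the Prolog clauses above loaded, the two example sentences from the earlier slides behave as follows (my own illustration of the expected queries, assuming the catch-all noun clause is reconstructed as shown):

% ?- sentence([dogs, run, fast], []).
% true.                                % every word is in the lexicon
%
% ?- sentence([cars, run, fast], []).
% Unknown word found!
% true.                                % the new processing rule consumed 'cars'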

Page 29: 自然言語処理 2010

References

• Kameda, Sakurai and Kubomura: ACAI'99 Machine Learning and Applications, Proceedings of Workshop W01: Machine Learning in Human Language Technology, pp. 62-67 (1999).

• Kameda and Kubomura: Proceedings of PACLING 2001, pp. 146-152 (2001).

29

Page 30: 自然言語処理 2010

Let's explain in more detail!

31

Page 31: 自然言語処理 2010

Example sentence

Tom broke the cup with the hammer. (Okada, 1991)

tom broke the cup with the hammer

32

Page 32: 自然言語処理 2010

Grammatical settings

G2 = < Vn, Vt, P, s >,  where s is the start symbol:

Vn = { s, np, vp, prpn, v, prp, pr, det, n },
Vt = { broke, cup, hammer, the, tom, with },
P  = { s -> np, vp.
       np -> prpn.
       vp -> v, np, prp.
       prp -> pr, np.
       np -> det, n.
       prpn -> tom.
       v -> broke.
       det -> the.
       n -> cup.
       pr -> with.
       n -> hammer. }

33

Page 33: 自然言語処理 2010

Prolog version of Grammar G2

/* Syntactic rules */

s(A, C, s(_np, _vp), a1(_act, _agt, _obj, _inst)) :-
    np(A, B, _np, sem(_agt)),
    vp(B, C, _vp, sem(_act, _agt, _obj, _inst)).

np(A, B, np(_prpn), sem(_)) :-
    prpn(A, B, _prpn, sem(_)).

vp(A, D, vp(_v, _np, _prp), sem(Act, Agt, Obj, Inst)) :-
    v(A, B, _v, sem(Act, Agt, Obj, Inst)),
    np(B, C, _np, sem(Obj)),
    prp(C, D, _prp, sem(Inst)).

vp(A, C, vp(_v, _np), sem(Act, Agt, Obj, Inst)) :-
    v(A, B, _v, sem(Act, Agt, Obj, Inst)),
    np(B, C, _np, sem(Obj)).

prp(A, C, prp(_pr, _np), sem(Z)) :-
    pr(A, B, _pr, sem(_)),
    np(B, C, _np, sem(Z)).

np(A, C, np(_det, _n), sem(W)) :-
    det(A, B, _det, sem(_)),
    n(B, C, _n, sem(W)).

/* Lexicon */

prpn([tom|T], T, prpn(tom), sem(human)).
v([broke|T], T, v1(broke), sem(change_in_shape, human, thing, tool)).
det([the|T], T, det(the), sem(_)).
n([cup|T], T, n(cup), sem(thing)).
pr([with|T], T, pr(with), sem(_)).
n([hammer|T], T, n(hammer), sem(tool)).

34

Page 34: 自然言語処理 2010

Demonstration ( Mode 1)

• Input 1: [tom,broke,the,cup,with,the,hammer]

• Input 2: [tom,broke,the,glass,with,the,hammer]

35
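Concretely, the Mode-1 demonstration corresponds to queries like the following against the Prolog version of G2 on the previous slide; the bindings shown are my reconstruction of the expected behaviour, not captured demo output:

% Input 1: every word is known, so parsing yields a tree and a semantic frame.
% ?- s([tom, broke, the, cup, with, the, hammer], [], Tree, Sem).
% Tree = s(np(prpn(tom)),
%          vp(v1(broke), np(det(the), n(cup)),
%             prp(pr(with), np(det(the), n(hammer))))),
% Sem  = a1(change_in_shape, human, thing, tool).
%
% Input 2: 'glass' has no lexicon entry, so the same query simply fails.
% ?- s([tom, broke, the, glass, with, the, hammer], [], Tree, Sem).
% false.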

Page 35: 自然言語処理 2010

Problem

• Parsing fails when a sentence contains unknown words.

36

Page 36: 自然言語処理 2010

Unknown word Processing

• Switching processing modes ( from Mode-1 to Mode-2 )

• When parsing fails, switch the processing mode from mode-1 to mode-2.

• Execute Prolog's assert predicate to change the mode (a sketch follows below).

37
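A minimal sketch of what executing assert can amount to for an unknown noun, using the G2 lexicon format as reconstructed above; register_unknown_noun/1 and the open meaning slot are my own illustration, not the predicates used in the paper:

:- dynamic n/4.          % must appear before the G2 lexicon clauses are loaded

% Mode-2 sketch: add a lexicon entry for the unknown word. Its meaning is
% left as an open variable, so the verb's expectations (e.g. sem(thing))
% can still unify with it during a re-parse.
register_unknown_noun(Word) :-
    assertz(n([Word|T], T, n(Word), sem(_Meaning))).

% ?- register_unknown_noun(glass).
% ?- s([tom, broke, the, glass, with, the, hammer], [], Tree, Sem).
% Sem = a1(change_in_shape, human, thing, tool).   % 'glass' now parses as a noun

Letting the verb's expectations fill in the unknown word's semantic class is in the spirit of the "reasoning word meanings" idea in the title, though the actual system acquires richer entries than this bare placeholder.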

Page 37: 自然言語処理 2010

Demonstration ( Mode2 )

• Input:
  – P1: [tom,broke,the,cup,with,the,hammer]
  – P2: [tom,broke,the,glass,with,the,hammer]
  – P3: [tom,broke,the,glass,with,the,stone]
  – P4: [tom,vvv,the,glass,with,the,hammer]

38

Page 38: 自然言語処理 2010

Problem

• Learning is sometimes imperfect.

• Learning order influences learning results.

• Solution: the influence of learning order is dealt with by introducing an evolutionary learning mechanism.

39

Page 39: 自然言語処理 2010

More Explanations

• All information about an unknown word should be guessed when the word is registered in the lexicon.

  – For example, spelling and POS can be guessed, but not pronunciation (imperfect knowledge).

• If the pronunciation can be guessed later, that information is added to the lexicon. → Evolutionary learning! (A sketch follows below.)

40
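A minimal sketch of that evolutionary step; lex/3, add_pronunciation/2 and the unknown marker are illustrative names of mine, not the system's own representation:

:- dynamic lex/3.                         % lex(Spelling, POS, Pronunciation)

lex(glass, noun, unknown).                % acquired entry, still imperfect

% Once the missing pronunciation can be guessed, revise the imperfect entry.
add_pronunciation(Word, Pron) :-
    retract(lex(Word, POS, unknown)),
    assertz(lex(Word, POS, Pron)).

% ?- add_pronunciation(glass, glaes).
% ?- lex(glass, POS, Pron).
% POS = noun, Pron = glaes.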

Page 40: 自然言語処理 2010

Solution

• Setting (some knowledge may be revised, but some must not):

  – A priori knowledge (initial knowledge):
    • must not change

  – Posterior knowledge (acquired knowledge):
    • must not change, if perfect
    • may change, if imperfect

41
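A small sketch of this setting; the bookkeeping facts below (knowledge_status/3) are my own illustration of how the revision policy could be encoded:

% knowledge_status(Entry, Origin, Completeness)
knowledge_status(lex(tom),   a_priori, perfect).     % initial knowledge
knowledge_status(lex(glass), acquired, imperfect).   % acquired, still incomplete

% Only acquired knowledge that is still imperfect may be revised;
% a priori knowledge and perfect acquired knowledge must not change.
may_revise(Entry) :-
    knowledge_status(Entry, acquired, imperfect).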

Page 41: 自然言語処理 2010

Demonstration of Final version

• Input:
  – P4: [tom,vvv,the,glass,with,the,hammer]
  – P2: [tom,broke,the,glass,with,the,hammer]
  – P3: [tom,broke,the,glass,with,the,stone]

43

Page 42: 自然言語処理 2010

Conclusions

1. Research background

2. Basic ideas of knowledge acquisition
   1. Some models
      1. Information processing model
      2. Unknown word acquisition model
   2. Modes and strategies

3. Demonstrations

44

Page 43: 自然言語処理 2010

Future Works

• Applying the approach to more real-world domains:
  – Therapeutic robots
  – Robots for schizophrenia rehabilitation

45

Page 44: 自然言語処理 2010

What's the next topic?

46

Page 45: 自然言語処理 2010

Machine Learning

• An introduction to ILP

47