Natural Language Processing 2010


Natural Language Processing 2010

School of Computer Science, Tokyo University of Technology

Hiroyuki Kameda

Today's Topics

• Take on a research-level topic!
• Machine learning (an introduction to inductive logic programming)

2

Take on a research-level topic!

• Let's try to understand a research presentation from Pacling2009!
– You can already understand material at this level.
– Listen critically.
– Come up with your own ideas.

(Once you have an idea, use it as the basis for NLP research of your own!)

3

Unknown Word Acquisition by Reasoning Word Meanings

Incrementally and Evolutionarily

Hiroyuki Kameda Tokyo University of Technology

Chiaki Kubomura Yamano College of Aesthetics

Overview

1. Research background

2. Basic ideas of knowledge acquisition

3. Demonstrations

4. Conclusions

5

Exchange of Information & Knowledge

natural & smooth communication

Language etc.

6

A robot talking with a human

7

Dialogue ( Various Topics )

New product releases, new social events, youngsters' words, professional slang → unknown words

natural, smooth and flexible communication

Highly upgraded NLP technology

Unknown word processing

8

Unknown Word Acquisition: Our Basic Ideas

9

Information Processing Model

10

NLP System

Text: Dogs ran fast.
Syntactic structure:
S --- NP --- N ---- dogs
   |- VP --- V ---- ran
         |- Adv --- fast

11

Information Processing Model 2

12

Knowledge Acquisition Model

13

Knowledge Acquisition System

14

More concretely…

NLP Engine

Lexicon( Domain Knowledge )

Program (source code)
( Knowledge of the objects to be processed )

UW Acquisition

Processing rule acquisition

sentence Internal representation

trigger

trigger

15

Even more concretely…

NLP Engine( Prolog Interpreter )

Lexicon( Domain Knowledge )

Rule-based grammar + α( Target Knowledge )

UW Acquisition

Syntactic rule acquisition( by ILP )

sentence Internal representation

Failure-trigger

Batch

16


Main topics

18

Let's consider a grammar.

G1 = { Vn, Vt, s, P },

where

Vn = {noun, verb, adverb, sentence},

Vt = {dogs, run, fast},

s = sentence,

P = { sentence → noun + verb + adverb,

noun→dogs, verb→run, adverb →fast}.

○   dogs run fast. ×   cars run fast.

19
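As an illustration of how G1 accepts "dogs run fast." but rejects "cars run fast.", here is a minimal sketch in Python (the talk itself uses Prolog; all names below are invented for illustration):

```python
# Minimal sketch of grammar G1: sentence -> noun + verb + adverb,
# with the three-word lexicon from the slide.
LEXICON = {"noun": {"dogs"}, "verb": {"run"}, "adverb": {"fast"}}

def parse_g1(words):
    """Accept exactly a noun, a verb, and an adverb from the lexicon."""
    categories = ["noun", "verb", "adverb"]
    return len(words) == 3 and all(
        w in LEXICON[c] for w, c in zip(words, categories)
    )

print(parse_g1(["dogs", "run", "fast"]))  # True:  accepted
print(parse_g1(["cars", "run", "fast"]))  # False: "cars" cannot unify with noun
```

Because "cars" is absent from the lexicon, the noun check fails, which is exactly the unification failure highlighted on the next slide.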

Let's consider a grammar.

Grammar G1 = { Vn, Vt, s, P },

where

Vn = {noun, verb, adverb, sentence},

Vt = {dogs, run, fast},

s = sentence,

P = { sentence → noun + verb + adverb, noun→dogs, verb→run, adverb →fast}.

○   dogs run fast. ×   cars run fast.

Cannot unify! cars <=!=> noun

20

Our ideas

1. Processing modes

2. Processing strategies

21

Processing Modes

22

Processing Modes

Modes    Unknown Words    Unknown Syntactic Rules
Mode 1   No               No
Mode 2   Yes              No
Mode 3   No               Yes
Mode 4   Yes              Yes

23

Processing Modes

Modes    Unknown Words    Unknown Syntactic Rules
Mode 1   No               No
Mode 2   Yes              No
Mode 3   No               Yes   ×
Mode 4   Yes              Yes   ×

24

Processing Modes

Modes    Unknown Words    Unknown Syntactic Rules
Mode 1   No               No
Mode 2   Yes              No
Mode 3   No               Yes   ×
Mode 4   Yes              Yes   ×

○ dogs run fast.   × cars run fast.

25

Processing Strategies

26

Adopted Processing Strategies

1. Parse a sentence in mode-1 first.

2. If parsing fails, then switch the processing mode from mode-1 to mode-2.

27

Grammar G1 in Prolog

sentence(A,D) :- noun(A,B), verb(B,C), adverb(C,D).

noun([dogs|T],T).

verb([run|T],T).

adverb([fast|T],T).

noun([A|T],T) :- write('Unknown word found!').

Syntactic rule

Lexicon

New processing rule

28
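The mode-1/mode-2 strategy and the catch-all clause above can be sketched as follows (a hedged Python illustration with invented names; the actual system is the Prolog program on this slide):

```python
# Sketch of the adopted strategy: parse strictly in mode 1 first; if that
# fails, switch to mode 2, whose catch-all (like the Prolog clause
# noun([A|T],T) :- write(...)) accepts and reports unknown words.
LEXICON = {"noun": {"dogs"}, "verb": {"run"}, "adverb": {"fast"}}

def match(category, word, mode):
    """Mode 1: lexicon lookup only. Mode 2: also fire the catch-all rule."""
    if word in LEXICON[category]:
        return True
    if mode == 2:
        print(f"Unknown word found! {word!r} assumed to be a {category}")
        return True
    return False

def parse(words, mode):
    cats = ["noun", "verb", "adverb"]
    return len(words) == 3 and all(match(c, w, mode) for c, w in zip(cats, words))

def parse_with_strategy(words):
    """Strategy: try mode 1; on failure, reparse in mode 2."""
    return parse(words, mode=1) or parse(words, mode=2)

print(parse_with_strategy(["cars", "run", "fast"]))  # True, after reporting "cars"
```

In the Prolog version the switch is done with assert; here a mode flag plays the same role.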

G1 = { Vn, Vt, s, P },

where

Vn = {noun, verb, adverb, sentence},

Vt = {dogs, run, fast},

s = sentence,

P = { sentence → noun + verb + adverb,

noun→dogs, verb→run, adverb →fast}.

References

• Kameda, Sakurai and Kubomura: ACAI'99 Machine Learning and Applications, Proceedings of Workshop W01: Machine Learning in Human Language Technology, pp. 62-67 (1999).

• Kameda & Kubomura: Proc. of Pacling2001, pp. 146-152 (2001).

29

Let's explain in more detail!

31

Example sentence

Tom broke the cup with the hammer. (Okada 1991)

tom broke the cup with the hammer

32

Grammatical settings

G2 = <Vn, Vt, P, s>
Vn = { s, np, vp, prpn, v, prp, pr, det, n },
Vt = { broke, cup, hammer, the, tom, with }
P = { s -> np, vp.      np -> prpn.     vp -> v, np, prp.
      prp -> pr, np.    np -> det, n.   prpn -> tom.
      v -> broke.       det -> the.     n -> cup.
      pr -> with.       n -> hammer. }

s : start symbol

33

Prolog version of Grammar G2

/* Syntactic rules */
s(A, C, s(_np, _vp), a1(_act, _agt, _obj, _inst)) :-
    np(A, B, _np, sem(_agt)),
    vp(B, C, _vp, sem(_act, _agt, _obj, _inst)).

np(A, B, np(_prpn), sem(X)) :-
    prpn(A, B, _prpn, sem(X)).

vp(A, D, vp(_v, _np, _prp), sem(Act, Agt, Obj, Inst)) :-
    v(A, B, _v, sem(Act, Agt, Obj, Inst)),
    np(B, C, _np, sem(Obj)),
    prp(C, D, _prp, sem(Inst)).

vp(A, C, vp(_v, _np), sem(Act, Agt, Obj, Inst)) :-
    v(A, B, _v, sem(Act, Agt, Obj, Inst)),
    np(B, C, _np, sem(Obj)).

prp(A, C, prp(_pr, _np), sem(Z)) :-
    pr(A, B, _pr, sem(_)),
    np(B, C, _np, sem(Z)).

np(A, C, np(_det, _n), sem(W)) :-
    det(A, B, _det, sem(_)),
    n(B, C, _n, sem(W)).

/* Lexicon */
prpn([tom|T], T, prpn(tom), sem(human)).
v([broke|T], T, v1(broke), sem(change_in_shape, human, thing, tool)).
det([the|T], T, det(the), sem(_)).
n([cup|T], T, n(cup), sem(thing)).
pr([with|T], T, pr(with), sem(_)).
n([hammer|T], T, n(hammer), sem(tool)).

34
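A hedged Python sketch (invented helper names; the talk's actual parser is the Prolog program above) of how parsing G2 can build the case frame a1(Act, Agt, Obj, Inst) for "tom broke the cup with the hammer". Only the first vp rule (v np prp) is modelled:

```python
# Sketch of G2 with case-frame semantics. The lexicon maps words to the
# semantic classes used on the slide (human, thing, tool, ...).
LEXICON = {
    "prpn": {"tom": "human"},            # proper noun -> semantic class
    "v":    {"broke": "change_in_shape"},
    "det":  {"the"},
    "n":    {"cup": "thing", "hammer": "tool"},
    "pr":   {"with"},
}

def parse_np(words):
    # np -> prpn | det n ; returns (remaining words, semantic class) or None
    if words and words[0] in LEXICON["prpn"]:
        return words[1:], LEXICON["prpn"][words[0]]
    if len(words) >= 2 and words[0] in LEXICON["det"] and words[1] in LEXICON["n"]:
        return words[2:], LEXICON["n"][words[1]]
    return None

def parse_s(words):
    # s -> np vp ;  vp -> v np prp ;  prp -> pr np
    subj = parse_np(words)
    if subj is None:
        return None
    rest, agt = subj
    if not rest or rest[0] not in LEXICON["v"]:
        return None
    act, rest = LEXICON["v"][rest[0]], rest[1:]
    obj_np = parse_np(rest)
    if obj_np is None:
        return None
    rest, obj = obj_np
    if not rest or rest[0] not in LEXICON["pr"]:
        return None
    inst_np = parse_np(rest[1:])
    if inst_np is None or inst_np[0]:
        return None                      # instrument NP must end the sentence
    return ("a1", act, agt, obj, inst_np[1])

print(parse_s("tom broke the cup with the hammer".split()))
# ('a1', 'change_in_shape', 'human', 'thing', 'tool')
```

The (rest, value) pairs play the role of the Prolog difference lists (A, B) in the clauses above.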

Demonstration ( Mode 1)

• Input 1 :[tom,broke,the,cup,with,the,hammer]

• Input 2 :[tom,broke,the,glass,with,the,hammer]

35

Problem

• Parsing fails when unknown words appear in a sentence.

36

Unknown word Processing

• Switching processing modes ( from Mode-1 to Mode-2 )

• When parsing fails, switch the processing mode from mode-1 to mode-2.

• Execute Prolog's assert predicate to change the mode.

37

Demonstration ( Mode2 )

• Input:
– P1: [tom,broke,the,cup,with,the,hammer]
– P2: [tom,broke,the,glass,with,the,hammer]
– P3: [tom,broke,the,glass,with,the,stone]
– P4: [tom,vvv,the,glass,with,the,hammer]

38
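A hedged sketch of what mode 2 does with these inputs (invented names; the real system uses Prolog and ILP): when a word fails to match the category the grammar expects, it is registered in the lexicon under that category, so later parses succeed.

```python
# Sketch of mode-2 unknown-word acquisition. PATTERN flattens the grammar's
# expected category sequence for the demo sentences (a simplification).
LEXICON = {"prpn": {"tom"}, "v": {"broke"}, "det": {"the"},
           "n": {"cup", "hammer"}, "pr": {"with"}}
PATTERN = ["prpn", "v", "det", "n", "pr", "det", "n"]

def parse_mode2(words):
    """Register each word that fails lexicon lookup; return what was learned."""
    acquired = []
    for word, cat in zip(words, PATTERN):
        if word not in LEXICON[cat]:
            LEXICON[cat].add(word)        # acquire the unknown word
            acquired.append((word, cat))
    return acquired

print(parse_mode2(["tom", "vvv", "the", "glass", "with", "the", "hammer"]))
# [('vvv', 'v'), ('glass', 'n')]
print(parse_mode2(["tom", "vvv", "the", "glass", "with", "the", "hammer"]))
# []  (both words are now known)
```

After P4 is processed, "vvv" is treated as a verb and "glass" as a noun, so P2 and P3 parse without further acquisition.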

Problem

• Learning is sometimes imperfect.

• Learning order influences learning results.

• Solution: the influence of learning order is mitigated by introducing an evolutionary learning function.

39

More Explanations

• All information about an unknown word should be guessed when the word is registered in the lexicon.

Spelling and POS are guessed, but not pronunciation (imperfect knowledge).

• If the pronunciation can be guessed later, that information will be added to the lexicon. → Evolutionary Learning!

40
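The evolutionary-learning idea above can be sketched as follows (a hypothetical representation, not the talk's Prolog): imperfect acquired entries are revised when better information arrives, while trusted knowledge is left alone.

```python
# Sketch of the revision policy: a priori knowledge and perfect acquired
# knowledge must not change; imperfect acquired knowledge may be revised.
lexicon = {
    "dogs": {"pos": "noun", "pron": "dogz", "a_priori": True},   # initial knowledge
    "vvv":  {"pos": "verb", "pron": None,   "a_priori": False},  # acquired, imperfect
}

def revise(word, field, value):
    entry = lexicon[word]
    if entry["a_priori"]:
        return False          # a priori knowledge: must not change
    if entry[field] is not None:
        return False          # acquired and already perfect: must not change
    entry[field] = value      # acquired and imperfect: may change
    return True

print(revise("vvv", "pron", "vee"))   # True:  the gap is filled in later
print(revise("dogs", "pron", "dox"))  # False: initial knowledge is protected
```

This is the sense in which learning is "evolutionary": entries improve as new evidence arrives, without ever overwriting trusted knowledge.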

Solution

• Setting (some knowledge may be revised, but some must not)
– a priori knowledge (Initial Knowledge):
must not change
– posterior knowledge (Acquired Knowledge):
• must not change, if perfect
• may change, if imperfect

41

Demonstration of Final version

• Input:
– P4: [tom,vvv,the,glass,with,the,hammer]
– P2: [tom,broke,the,glass,with,the,hammer]
– P3: [tom,broke,the,glass,with,the,stone]

43

Conclusions

1. Research background

2. Basic ideas of knowledge acquisition
   1. Some models
      1. Information processing model
      2. Unknown word acquisition model
   2. Modes and Strategies

3. Demonstrations

44

Future Works

• Applying to more real-world domains
– Therapeutic robots
– Robots for schizophrenia rehabilitation

45

What's the next topic?

46

Machine Learning

• An introduction to ILP

47
