Bayesian Subtree Alignment Model based on Dependency Trees. Toshiaki Nakazawa, Sadao Kurohashi (Kyoto University). 2011/11/11 @ IJCNLP2011



Page 1: Bayesian  Subtree  Alignment Model based on Dependency Trees

Bayesian Subtree Alignment Model based on Dependency Trees

Toshiaki Nakazawa, Sadao Kurohashi
Kyoto University

2011/11/11 @ IJCNLP2011

Page 2: Bayesian  Subtree  Alignment Model based on Dependency Trees

Outline

• Background
• Related Work
• Bayesian Subtree Alignment Model
• Model Training
• Experiments
• Conclusion

Page 3: Bayesian  Subtree  Alignment Model based on Dependency Trees

Background

• Alignment quality of GIZA++ is quite insufficient for distant language pairs
  – Wide-range reordering, many-to-many alignment

En: He is my brother .

Zh: 他 是 我 哥哥 。

Fr: Il est mon frère .

Ja: 彼 は 私 の 兄 です 。

Page 4: Bayesian  Subtree  Alignment Model based on Dependency Trees

Alignment Accuracy of GIZA++ (with combination heuristic)

Language pair       Precision   Recall   AER
French-English      87.28       96.30     9.80
English-Japanese    81.17       62.19    29.25
Chinese-Japanese    83.77       75.38    20.39

$$\mathrm{AER} = 1 - \frac{|A \cap S| + |A \cap P|}{|A| + |S|}$$

S: Sure alignment: clearly right
P: Possible alignment: reasonable to make but not so clear
A: Automatic (GIZA++) alignment
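The scores in this table can be computed from three sets of links, as in the minimal Python sketch below. The function name `alignment_scores` and the toy links are hypothetical, alignments are assumed to be sets of (source index, target index) pairs, and precision and recall follow the usual convention for Sure/Possible annotation (precision against P, recall against S, with every Sure link also counted as Possible), which the slide itself does not spell out.

```python
def alignment_scores(auto, sure, possible):
    """Return (precision, recall, aer) for sets of (src_idx, tgt_idx) links."""
    possible = possible | sure  # assumption: sure links also count as possible
    a_s = len(auto & sure)      # |A ∩ S|
    a_p = len(auto & possible)  # |A ∩ P|
    precision = a_p / len(auto)
    recall = a_s / len(sure)
    aer = 1.0 - (a_s + a_p) / (len(auto) + len(sure))
    return precision, recall, aer

# Toy example (made-up links, not data from the talk)
sure = {(0, 0), (1, 1)}
possible = {(2, 2)}
auto = {(0, 0), (2, 2), (3, 1)}
precision, recall, aer = alignment_scores(auto, sure, possible)
```

Note that a link that is only Possible raises precision without affecting recall, which is why AER can differ from one minus the F-measure.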

Page 5: Bayesian  Subtree  Alignment Model based on Dependency Trees

Background

• Alignment quality is quite insufficient for distant language pairs
  – Wide-range reordering, many-to-many alignment
  – English-Japanese, Chinese-Japanese: > 20% AER
  – Need to incorporate syntactic information

En: He is my brother .

Zh: 他 是 我 哥哥 。

Fr: Il est mon frère .

Ja: 彼 は 私 の 兄 です 。

Page 6: Bayesian  Subtree  Alignment Model based on Dependency Trees

Related Work

• Cherry and Lin (2003)
  – Discriminative alignment model using a source-side dependency tree
  – Allows one-to-one alignment only
• Nakazawa and Kurohashi (2009)
  – Generative model using dependency trees on both sides
  – Allows phrasal alignment
  – Degeneracy of acquiring incorrect larger phrases, derived from Maximum Likelihood Estimation

Page 7: Bayesian  Subtree  Alignment Model based on Dependency Trees

Related Work

• DeNero et al. (2008)
  – Incorporate prior knowledge about the parameters to avoid the degeneracy of the model
  – Place a Dirichlet Process (DP) prior over the phrase generation model
  – Simple distortion model: position-based
• This work
  – Takes advantage of both works, Nakazawa and Kurohashi (2009) and DeNero et al. (2008)

Page 8: Bayesian  Subtree  Alignment Model based on Dependency Trees

Related Work

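• Generative story of the (sequential) phrase-based joint probability model [DeNero et al., 2008]
  1. Choose a number of components $\ell$
  2. Generate each of the phrase pairs independently
     • Nonparametric Bayesian prior
  3. Choose an ordering for the phrases
• Model

$$P(\{\langle e,f\rangle\}, \mathbf{a}) = \underbrace{P_G(\ell; p_\$)}_{\text{Step 1}} \; \underbrace{\prod_{\langle e,f\rangle} P_M(\langle e,f\rangle)}_{\text{Step 2}} \; \underbrace{P(\mathbf{a} \mid \{\langle e,f\rangle\})}_{\text{Step 3}}$$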

Page 9: Bayesian  Subtree  Alignment Model based on Dependency Trees

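Example of the Model

$$P(\{\langle e,f\rangle\}, \mathbf{a}) = \underbrace{P_G(\ell; p_\$)}_{\text{Step 1}} \; \underbrace{\prod_{\langle e,f\rangle} P_M(\langle e,f\rangle)}_{\text{Step 2}} \; \underbrace{P(\mathbf{a} \mid \{\langle e,f\rangle\})}_{\text{Step 3}}$$

[Figure: "He is my brother" and "彼 は 私 の 兄 です" divided into four aligned components, C1 to C4. Step 3 uses simple position-based distortion.]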

Page 10: Bayesian  Subtree  Alignment Model based on Dependency Trees

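Proposed Model

$$P(\{\langle e,f\rangle\}, \mathbf{a}) = \underbrace{P_G(\ell; p_\$)}_{\text{Step 1}} \; \underbrace{\prod_{\langle e,f\rangle} P_M(\langle e,f\rangle)}_{\text{Step 2}} \; \underbrace{P(\mathbf{a} \mid \{\langle e,f\rangle\})}_{\text{Step 3}}$$

[Figure: the same example with components C1 to C4 placed on the dependency trees of "He is my brother" and "彼は 私の 兄 です". Step 3 uses dependency tree-based distortion.]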

Page 11: Bayesian  Subtree  Alignment Model based on Dependency Trees

Bayesian Subtree Alignment Model based on Dependency Trees

Page 12: Bayesian  Subtree  Alignment Model based on Dependency Trees

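Model Decomposition

$$P(\{\langle e,f\rangle\}, \mathbf{a}) = P_G(\ell; p_\$) \prod_{\langle e,f\rangle} P_M(\langle e,f\rangle) \; P(\mathbf{a} \mid \{\langle e,f\rangle\})$$

$$P_G(\ell; p_\$) = p_\$ \, (1 - p_\$)^{\ell - 1}$$

$$P_M(\langle e,f\rangle) = \underbrace{p_{\emptyset} \, \theta_N(\langle e,f\rangle)}_{\text{Null}} + \underbrace{(1 - p_{\emptyset}) \, \theta_J(\langle e,f\rangle)}_{\text{Non-null}}$$

$$P(\mathbf{a} \mid \{\langle e,f\rangle\}) = P(D \mid \{\langle e,f\rangle\}) = \prod_{\langle e,f\rangle} \theta_{ef}(R_e) \cdot \theta_{fe}(R_f)$$

D: dependency of phrases; R_e, R_f: dependency relations

cf. position-based distortion [DeNero et al., 2008]:

$$P(\mathbf{a} \mid \{\langle e,f\rangle\}) \propto \prod_{a \in \mathbf{a}} \delta(a_t = (j,k)) \, b^{|pos(e_j) - pos(f_k) \cdot s|}$$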
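As a rough illustration of how the three factors combine, the Python sketch below scores one configuration in log space. It assumes the geometric component prior and the null/non-null mixture of DeNero et al. (2008); the function names and all numbers are hypothetical, and the distortion term is passed in as a precomputed log probability rather than modelled.

```python
import math

def log_p_components(num_components, p_stop):
    """Geometric prior P_G(l; p_$) = p_$ * (1 - p_$)^(l - 1), in log space."""
    return math.log(p_stop) + (num_components - 1) * math.log(1.0 - p_stop)

def p_phrase_pair(p_null, theta_null, theta_joint):
    """Mixture P_M = p_null * theta_N + (1 - p_null) * theta_J for one pair."""
    return p_null * theta_null + (1.0 - p_null) * theta_joint

def log_joint(pair_probs, log_p_alignment, p_stop):
    """log P({<e,f>}, a) = log P_G + sum of log P_M + log P(a | {<e,f>})."""
    return (log_p_components(len(pair_probs), p_stop)
            + sum(math.log(p) for p in pair_probs)
            + log_p_alignment)

# Made-up numbers, for illustration only
pairs = [p_phrase_pair(0.1, 0.0, 0.5), p_phrase_pair(0.1, 0.2, 0.0)]
score = log_joint(pairs, math.log(0.25), p_stop=0.5)
```

Working in log space avoids underflow once a sentence pair contains many components.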

Page 13: Bayesian  Subtree  Alignment Model based on Dependency Trees

Dependency Relations

rel(“He”, “is”) = (Up, Down) = (1, 0)
  Up: # of steps for going up
  Down: # of steps for going down
rel(“brother”, “is”) = (1, 0)
rel(“my”, “brother”) = (1, 0)

[Figure: dependency trees of “He is my brother” and “彼は 私の 兄 です”]
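One way to read the (Up, Down) pair is as a path between two nodes of a dependency tree: steps up from the first node to their lowest common ancestor, then steps down to the second node. The Python sketch below follows that reading. The helper names and the child-to-parent dictionary are hypothetical, and the transcript does not state which tree the path is measured in for a given phrase pair, so the sketch simply takes a tree as input.

```python
def ancestors(node, parent):
    """Return [node, parent(node), ..., root] following the parent map."""
    chain = [node]
    while parent.get(chain[-1]) is not None:
        chain.append(parent[chain[-1]])
    return chain

def rel(src, dst, parent):
    """(Up, Down): steps up from src to the lowest common ancestor, then down to dst."""
    up_chain = ancestors(src, parent)
    down_chain = ancestors(dst, parent)
    for up, node in enumerate(up_chain):
        if node in down_chain:
            return up, down_chain.index(node)
    raise ValueError("nodes are not in the same tree")

# English tree from the slide, with "is" as the root
parent_en = {"is": None, "He": "is", "brother": "is", "my": "brother"}
he_is = rel("He", "is", parent_en)
my_brother = rel("my", "brother", parent_en)
```

For this tree, both calls give one step up and no steps down, matching the values on the slide.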

Page 14: Bayesian  Subtree  Alignment Model based on Dependency Trees

Dependency Relations

rel(“彼は”, “です”) = (1, 0)
rel(“私の”, “兄”) = (1, 0)
rel(“兄”, “です”) = (1, 0)

[Figure: dependency trees of “He is my brother” and “彼は 私の 兄 です”]

Page 15: Bayesian  Subtree  Alignment Model based on Dependency Trees

Dependency Relations

rel(“long”, “hair”) = (0, 1)
rel(“hair”, “she has”) = (1, 2)
rel(“髪が”, “長い”) = (0, 1)

[Figure: dependency trees with nodes She, has, hair, long and 彼女は, 髪が, 長い, plus a NULL node]

Page 16: Bayesian  Subtree  Alignment Model based on Dependency Trees

Dependency Relations

rel(“彼女”, “は”) = ?
rel(“彼女”, “長い”) = (0, 2)
N(“彼女”) = 1
  N: # of NULL words on the way to the non-null parent

[Figure: dependency trees with nodes She, has, hair, long and 彼女は, 髪が, 長い, plus a NULL node]
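The count N can be sketched as a walk up the tree that skips NULL-aligned words. In the Python sketch below, the function name, the child-to-parent dictionary, and the tree shape (彼女 attached to は, which attaches to 長い) are assumptions made for illustration, with は taken to be the NULL-aligned word.

```python
def non_null_parent(node, parent, null_nodes):
    """Return (nearest non-NULL ancestor, number of NULL-aligned nodes skipped)."""
    skipped = 0
    current = parent.get(node)
    while current is not None and current in null_nodes:
        skipped += 1
        current = parent.get(current)
    return current, skipped

# Assumed Japanese tree: 彼女 -> は -> 長い and 髪が -> 長い; は is NULL-aligned
parent_ja = {"長い": None, "は": "長い", "彼女": "は", "髪が": "長い"}
head, n = non_null_parent("彼女", parent_ja, null_nodes={"は"})
```

Under these assumptions the walk from 彼女 skips one NULL word and stops at 長い, in line with N(“彼女”) = 1 on the slide.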

Page 17: Bayesian  Subtree  Alignment Model based on Dependency Trees

Dependency Relation Probability

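• Assign probability to the tuple: $p(R_e = (N, rel) = (N, Up, Down)) \sim \theta_{ef}$

• Reordering model is decomposed as:

$$P(D \mid \{\langle e,f\rangle\}) = \prod_{\langle e,f\rangle} \theta_{ef}(R_e) \cdot \theta_{fe}(R_f)$$

$$\theta_{ef} \sim DP(M_{ef}, \alpha_{ef}), \qquad \theta_{fe} \sim DP(M_{fe}, \alpha_{fe})$$

$$M_{ef} = p_{ef} \, (1 - p_{ef})^{N + Up + Down - 1}, \qquad M_{fe} = p_{fe} \, (1 - p_{fe})^{N + Up + Down - 1}$$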
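A Dirichlet Process prior over relation tuples is commonly handled through its Chinese-restaurant-process predictive probability, which mixes observed counts with the base measure. The Python sketch below shows that estimate only. The function names, the concentration and base parameters, the counts, and in particular the geometric base measure over N + Up + Down are assumptions for illustration, not values or definitions confirmed by the transcript.

```python
from collections import Counter

def geometric_base(relation, p):
    """Assumed base measure over (N, Up, Down): geometric in N + Up + Down."""
    total = sum(relation)
    return p * (1.0 - p) ** (total - 1) if total >= 1 else 0.0

def dp_predictive(relation, counts, alpha, p):
    """Chinese-restaurant-process estimate: (count + alpha * M(x)) / (n + alpha)."""
    n = sum(counts.values())
    return (counts[relation] + alpha * geometric_base(relation, p)) / (n + alpha)

# Made-up counts of observed (N, Up, Down) tuples
counts = Counter({(0, 1, 0): 8, (0, 0, 1): 2})
prob_seen = dp_predictive((0, 1, 0), counts, alpha=1.0, p=0.5)
prob_new = dp_predictive((1, 1, 2), counts, alpha=1.0, p=0.5)
```

A frequently observed relation is dominated by its count, while an unseen relation falls back on the base measure, so long paths through the tree receive small but non-zero probability.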

Page 18: Bayesian  Subtree  Alignment Model based on Dependency Trees

Model Training


Page 19: Bayesian  Subtree  Alignment Model based on Dependency Trees

Model Training

• Initialization
  – Create heuristic phrase alignment like ‘grow-diag-final-and’ on dependency trees using results from GIZA++
  – Count phrase alignment and dependency relations
• Refine the model by Gibbs sampling
  – Operators: SWAP, TOGGLE, EXPAND
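The refinement step can be pictured as repeatedly choosing among a few alternative alignment states in proportion to their model scores. The Python sketch below shows only that sampling core for a TOGGLE-style move. The function names are hypothetical, the two scores stand in for the model probability of the state without and with the link, and the slides do not give the actual proposal or scoring details.

```python
import random

def sample_option(weights, rng):
    """Pick an index with probability proportional to its non-negative weight."""
    total = sum(weights)
    threshold = rng.random() * total
    running = 0.0
    for index, weight in enumerate(weights):
        running += weight
        if threshold < running:
            return index
    return len(weights) - 1  # guard against floating-point round-off

def toggle_step(p_without, p_with, rng):
    """TOGGLE-style move: keep the link (True) or drop it (False), sampled by score."""
    return sample_option([p_without, p_with], rng) == 1

rng = random.Random(0)
kept = [toggle_step(0.2, 0.8, rng) for _ in range(1000)]
```

SWAP and EXPAND moves could reuse `sample_option` with one weight per candidate state; only the set of candidates would differ.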

Page 20: Bayesian  Subtree  Alignment Model based on Dependency Trees

SWAP Operator

• Swap the counterparts of two alignments

[Figure: SWAP-1 and SWAP-2 examples, including NULL alignments]

Page 21: Bayesian  Subtree  Alignment Model based on Dependency Trees

TOGGLE Operator

• Remove existing alignment or add new alignment

[Figure: TOGGLE example, including NULL alignments]

Page 22: Bayesian  Subtree  Alignment Model based on Dependency Trees

EXPAND Operator

• Expand or contract an aligned subtree

[Figure: EXPAND-1 and EXPAND-2 examples, including NULL alignments]

Page 23: Bayesian  Subtree  Alignment Model based on Dependency Trees

Alignment Experiment

• Training: 1M for Ja-En, 678K for Ja-Zh
• Testing: about 500 hand-annotated parallel sentences (with Sure and Possible alignments)
• Measure: Precision, Recall, Alignment Error Rate
• Japanese Tools: JUMAN and KNP
• English Tool: Charniak’s nlparser
• Chinese Tools: MMA and CNP (from NICT)

Page 24: Bayesian  Subtree  Alignment Model based on Dependency Trees

Alignment Experiment

• Ja-En (paper abstract: 1M sentences)

                   Precision   Recall   AER
Initialization     82.39       61.82    28.99
Proposed           85.93       64.71    25.73
GIZA++ & grow      81.17       62.19    29.25
Berkeley Aligner   85.00       53.82    33.72

Page 25: Bayesian  Subtree  Alignment Model based on Dependency Trees

Alignment Experiment

• Ja-Zh (technical paper: 680K sentences)

                   Precision   Recall   AER
Initialization     84.71       75.46    19.90
Proposed           85.49       75.26    19.60
GIZA++ & grow      83.77       75.38    20.39
Berkeley Aligner   88.43       69.77    21.60

Page 26: Bayesian  Subtree  Alignment Model based on Dependency Trees

Improved Example

[Figure: alignment example, GIZA++ vs. Proposed]

Page 27: Bayesian  Subtree  Alignment Model based on Dependency Trees

[Figure: alignment example, GIZA++ vs. Proposed]

Page 28: Bayesian  Subtree  Alignment Model based on Dependency Trees

[Figure: alignment example, GIZA++ vs. Proposed]

Page 29: Bayesian  Subtree  Alignment Model based on Dependency Trees

Japanese-to-English Translation Experiment

• Baseline: Just run Moses and MERT

[Figure: chart comparing Baseline, Initialization, and Proposed; axis ticks from 24 to 26.5]

Page 30: Bayesian  Subtree  Alignment Model based on Dependency Trees

Japanese-to-English Translation Experiment

• Initialization: Use the result of initialization as the alignment result for Moses

[Figure: chart comparing Baseline, Initialization, and Proposed; axis ticks from 24 to 26.5]

Page 31: Bayesian  Subtree  Alignment Model based on Dependency Trees

Japanese-to-English Translation Experiment

• Proposed: Use the alignment result of the proposed model after a few iterations for Moses

[Figure: chart comparing Baseline, Initialization, and Proposed; axis ticks from 24 to 26.5]

Page 32: Bayesian  Subtree  Alignment Model based on Dependency Trees

[Figure: alignment example, GIZA++ vs. Proposed]

Page 33: Bayesian  Subtree  Alignment Model based on Dependency Trees

Conclusion

• Bayesian Tree-based Phrase Alignment Model
  – Better alignment accuracy than GIZA++ in distant language pairs
• Translation
  – Currently (temporarily), not improved
• Future work
  – Robustness for parsing errors
    • Using N-best parsing result or forest
  – Show improvement in translation
    • Tree-based decoder (Hiero?)