Yang Liu State Key Laboratory of Intelligent Technology and Systems Tsinghua National Laboratory for Information Science and Technology Department of Computer Science and Technology Tsinghua University, Beijing 100084, China ACL 2013

TRANSCRIPT

Page 1:

Yang Liu

State Key Laboratory of Intelligent Technology and Systems

Tsinghua National Laboratory for Information Science and Technology

Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China

ACL 2013

Page 2:

Introduction

Current statistical machine translation approaches fall broadly into two categories: phrase-based and syntax-based.

This work proposes a shift-reduce parsing algorithm that combines the advantages of both.

The translation unit is the string-to-dependency phrase pair.

A maximum entropy model is used to resolve conflicts.

Page 3:

Introduction

datasets: the NIST Chinese-English translation datasets

evaluation: BLEU & TER, compared against phrase-based and syntax-based results

Page 4:

Shift-Reduce Parsing for Phrase-based

String-to-Dependency Translation

Example: zongtong jiang yu siyue lai lundun fangwen

The President will visit London in April

GIZA++

Context-free grammar parser

Page 5:

Shift-Reduce Parsing for Phrase-based

String-to-Dependency Translation

Two broad categories:

well-formed: fixed; floating (left or right, according to the position of the head)

ill-formed

rule  source phrase        target phrase       dependency  category
r1    fangwen              visit               {}          fixed
r2    yu siyue             in April            {1 2}       fixed
r3    zongtong jiang       The President will  {2 1}       floating left
r4    yu siyue lai lundun  London in April     {2 3}       floating right
r5    zongtong jiang       President will      {}          ill-formed

Page 6:

shift-reduce algorithm - example

tuple <S, C>

Start from an empty state.

Terminate: when all source words have been translated and the stack contains a complete dependency tree.
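The loop described above can be sketched in Python. This is a minimal illustrative sketch, not the paper's implementation: the `Node` class, the phrase representation, and the head/dependent convention chosen for the two reduce actions are all assumptions made here.

```python
# Minimal sketch of the shift-reduce derivation over the tuple <S, C>:
# S is a stack of partial dependency structures, C is the set of
# covered source positions.

class Node:
    def __init__(self, word, deps=None):
        self.word = word
        self.deps = deps or []   # dependents of this head

def shift(stack, covered, phrase):
    """S: push a phrase pair's target structure, cover its source span."""
    src_span, head = phrase
    stack.append(Node(head))
    covered |= set(src_span)

def reduce_left(stack):
    """Rl: top of stack is taken as head; the item below attaches to it."""
    head = stack.pop()
    dep = stack.pop()
    head.deps.insert(0, dep)
    stack.append(head)

def reduce_right(stack):
    """Rr: item below the top is taken as head; the top attaches to it."""
    dep = stack.pop()
    head = stack.pop()
    head.deps.append(dep)
    stack.append(head)

# Toy derivation for "The President will visit London in April".
stack, covered = [], set()
shift(stack, covered, ((0, 1), "will"))    # zongtong jiang
shift(stack, covered, ((5,), "visit"))     # fangwen
reduce_left(stack)                         # "will" attaches under "visit"
shift(stack, covered, ((4,), "London"))    # lundun
reduce_right(stack)                        # "London" attaches under "visit"
shift(stack, covered, ((2, 3), "in"))      # yu siyue
reduce_right(stack)
# Terminate: all six source words covered and one tree left on the stack.
done = covered == {0, 1, 2, 3, 4, 5} and len(stack) == 1
print(done, stack[0].word)
```
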

Page 7:
Page 8:

A Maximum Entropy Based Shift-Reduce Parsing Model

h: fixed; l: left floating; r: right floating

Page 9:

A Maximum Entropy Based Shift-Reduce Parsing Model

maximum entropy model:

a ∈ {S, Rl, Rr}

c: a boolean value indicating whether all source words are covered

h(a, c, st, st-1): vector of binary features

θ: vector of feature weights
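The maximum entropy model scores each action a by exp(θ · h(a, c, st, st-1)), normalized over the three actions. A toy sketch; the feature function below is an invented stand-in, not the paper's actual feature set:

```python
import math

# Sketch of the maximum entropy action model:
# p(a | c, st, st-1) ∝ exp(θ · h(a, c, st, st-1)).

ACTIONS = ["S", "Rl", "Rr"]

def h(action, all_covered, top, below):
    """Toy binary feature vector over (action, coverage, stack tops)."""
    return [
        1.0 if action == "S" else 0.0,
        1.0 if all_covered else 0.0,
        1.0 if top is not None and below is not None else 0.0,
    ]

def action_probs(theta, all_covered, top, below):
    """Normalize exp(θ · h) over the three actions."""
    scores = {a: sum(t * f for t, f in zip(theta, h(a, all_covered, top, below)))
              for a in ACTIONS}
    z = sum(math.exp(s) for s in scores.values())
    return {a: math.exp(s) / z for a, s in scores.items()}

theta = [0.5, 1.0, -0.2]  # illustrative weights, not trained values
probs = action_probs(theta, all_covered=False, top="visit", below="will")
print(probs)
```
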

Page 10:

A Maximum Entropy Based Shift-Reduce Parsing Model

Page 11:

A Maximum Entropy Based Shift-Reduce Parsing Model

To train the model, we need a gold-standard action sequence for each training example.

To alleviate this problem: derivation graph

Page 12:
Page 13:

Decoding

linear model with the following features:

standard features:
- relative frequencies in two directions
- lexical weights in two directions
- phrase penalty
- distance-based reordering model
- lexicalized reordering model
- n-gram language model
- word penalty

Page 14:

Decoding (continued)

dependency features:
- ill-formed structure penalty
- dependency language model
- maximum entropy parsing model

Page 15:

Decoding

Page 16:

Decoding

During decoding, the context information inside the stack keeps changing (the dependency language model and maximum entropy model probabilities).

Hypergraph reranking (Huang and Chiang, 2007; Huang, 2008) is used, divided into two parts.

Page 17:

Decoding

To improve rule coverage, the ill-formed structures of Shen et al. (2008) are used.

If an ill-formed structure has a single root: treat it as a (pseudo) fixed structure.

Other ill-formed structures are split into a (pseudo) left floating structure and a (pseudo) right floating structure.
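The conversion above can be sketched as follows. The representation (a structure as (position, head) pairs, with head -1 meaning the head lies outside the phrase) and the split point chosen for the multi-root case are assumptions for illustration only:

```python
# Sketch of the rule-coverage heuristic: a single-root ill-formed
# structure becomes a (pseudo) fixed structure; any other ill-formed
# structure is split into a (pseudo) left floating and a (pseudo)
# right floating structure.

def convert_ill_formed(nodes):
    """nodes: list of (position, head_position); head -1 = external head."""
    roots = [pos for pos, head in nodes if head == -1]
    if len(roots) == 1:
        return [("pseudo-fixed", nodes)]
    # Split at the second root (the split criterion is assumed here):
    # everything before it goes left floating, the rest right floating.
    cut = roots[1]
    left = [n for n in nodes if n[0] < cut]
    right = [n for n in nodes if n[0] >= cut]
    return [("pseudo-left-floating", left), ("pseudo-right-floating", right)]

single = [(0, 1), (1, -1)]           # one root at position 1
multi = [(0, -1), (1, -1), (2, 1)]   # two roots
print(convert_ill_formed(single))
print(convert_ill_formed(multi))
```
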

Page 18:

Experiments

evaluated on Chinese-English translation

training data: 2.9M sentence pairs, containing 76.0M Chinese words and 82.2M English words

development set : 2002 NIST MT Chinese-English dataset

test sets: 2003-2005 NIST datasets

Page 19:

Experiments

The Stanford parser was used to obtain dependency trees for the English sentences.

train a 4-gram language model on the Xinhua portion of the GIGAWORD corpus, which contains 238M English words

A 3-gram dependency language model was trained on the English dependency trees.

Page 20:

Experiments

compare with:
- the Moses phrase-based decoder (Koehn et al., 2007)
- a re-implementation of the bottom-up string-to-dependency decoder (Shen et al., 2008)

beam limit: 100
phrase table limit: 20

Page 21:

Experiments

Moses shares the same feature set with our system except for the dependency features.

For the bottom-up string-to-dependency system, we included both well-formed and ill-formed structures in chart parsing.

Page 22:

Experiments

                                     Moses   dependency  This work
rule number                          103M    587M        124M
avg. decoding time (per sentence)    3.67 s  13.89 s     4.56 s

Page 23:

Experiments

Page 24:

Experiments

Page 25:

Conclusion

This work proposes a shift-reduce parsing algorithm for phrase-based string-to-dependency translation. The approach combines the advantages of the phrase-based and string-to-dependency models, and in Chinese-to-English translation experiments it outperforms both baselines (phrase-based and syntax-based).

Page 26:

Future work

Add more contextual information to the maximum entropy model to resolve conflicts; in addition, adapt the dynamic programming algorithm of Huang and Sagae (2010) to improve the string-to-dependency decoder.