agreement matters: challenges of translating into a mrl

108
Agreement matters: Challenges of translating into a MRL Yoav Goldberg Ben Gurion University


Post on 23-Feb-2016


TRANSCRIPT

Page 1: Agreement matters: Challenges of translating  into  a MRL

Agreement matters: Challenges of translating into a MRL

Yoav Goldberg, Ben Gurion University

Page 2: Agreement matters: Challenges of translating  into  a MRL

Why should we care about syntactic modeling of MRLs?

Page 4: Agreement matters: Challenges of translating  into  a MRL

A brief summary

• What Kevin said.

Page 10: Agreement matters: Challenges of translating  into  a MRL

Example: English → Hebrew

I wash the car
אני רוחץ את המכונית
(I wash the car) [Masculine]

I wash the floor
אני שוטפת את הרצפה
(I wash the floor) [Feminine]

Page 11: Agreement matters: Challenges of translating  into  a MRL

Hebrew Fact 1

Hebrew verbs are morphologically marked for gender (and number, and person, and tense...)

Page 13: Agreement matters: Challenges of translating  into  a MRL

Example: English → Hebrew

I wash the car
אני רוחץ את המכונית
(I wash the car) [Masculine]

I wash the floor
אני שוטפת את הרצפה
(I wash the floor) [Feminine]

Are these bad translations?

No. These are actually quite good.

• No gender information in the source.
• The target must indicate gender.
• The translator uses world knowledge.

Page 14: Agreement matters: Challenges of translating  into  a MRL

Let’s have some fun

Page 22: Agreement matters: Challenges of translating  into  a MRL

Language Models as Social Indicators

• I love her → אני אוהב אותה [masculine verb]
• I love him → אני אוהבת אותו [feminine verb]
• I love meat → אני אוהב בשר [masculine verb]
• I love vegetables → אני אוהבת ירקות [feminine verb]
• I love to eat → אני אוהב לאכול [masculine verb]
• I love to cook → אני אוהבת לבשל [feminine verb]
• I love hash → אני אוהב חשיש [masculine verb]
• I love marijuana → אני אוהבת מריחואנה [feminine verb]

Page 29: Agreement matters: Challenges of translating  into  a MRL

Language Models as Social Indicators

• I hate him. → אני שונאת אותו. [feminine verb]
• I hate her. → אני שונאת אותה. [feminine verb]
• I hate him → אני שונא אותו [masculine verb]
• I hate her → אני שונאת אותה [feminine verb]

Really? A dot!? Not very stable…

Hmm… is there a message here after all?

Page 31: Agreement matters: Challenges of translating  into  a MRL

Language Models as Social Indicators

• I love → אני אוהב [masculine verb]
• I hate → אני שונאת [feminine verb]

Page 32: Agreement matters: Challenges of translating  into  a MRL

Back to Machine Translation

Page 33: Agreement matters: Challenges of translating  into  a MRL

One English → Many Hebrew

love → אוהב, אוהבת, אוהבים, אוהבות

Need to acquire more knowledge
• … use larger parallel corpora
• … use dictionaries
• … use FSAs to model inflections
• Let’s assume this is solved

Page 39: Agreement matters: Challenges of translating  into  a MRL

One English → Many Hebrew

Hebrew → English: easy
• אני אוהבת → I love
• אני אוהב → I love
• אנו אוהבים → We love
• אנו אוהבות → We love

Don’t worry about it. Just say “love.” The reader will decide.

English → Hebrew: hard
I love → ?
• אני אוהב ?
• אני אוהבת ?
• אני אוהבים ?
• אני אוהבות ?

Which form to choose? Translator must decide. HOW??

Page 44: Agreement matters: Challenges of translating  into  a MRL

When translating into an MRL:

• Many possible word forms
  – Hard to acquire [but assume it's solved]
• Need to choose correct inflection

Page 46: Agreement matters: Challenges of translating  into  a MRL

One English → Many Hebrew

Hebrew → English: easy
• אני אוהבת → I love
• אני אוהב → I love
• אנו אוהבים → We love
• אנו אוהבות → We love

Don’t worry about it. Just say “love.” The reader will decide.

English → Hebrew: hard
I love → ?
• אני אוהב ?
• אני אוהבת ?
• אני אוהבים ?
• אני אוהבות ?

Which form to choose? Translator must decide. HOW??

Just choose one at random? In the worst case we’ll insult someone..

Page 47: Agreement matters: Challenges of translating  into  a MRL

Hebrew Fact 2

Hebrew verbs agree with the subject in gender and number

Page 51: Agreement matters: Challenges of translating  into  a MRL

Agreement dictates form

I [singular] love → ?
• אני [singular] אוהב [singular] ?
• אני [singular] אוהבת [singular] ?
• ✘ אני [singular] אוהבים [plural]
• ✘ אני [singular] אוהבות [plural]

Page 54: Agreement matters: Challenges of translating  into  a MRL

The girls [plural / fem] love → ?
• ✘ הבחורות [plural / fem] אוהב [sing / masc]
• ✘ הבחורות [plural / fem] אוהבת [sing / fem]
• ✘ הבחורות [plural / fem] אוהבים [plural / masc]
• הבחורות [plural / fem] אוהבות [plural / fem]

no missing information

Page 55: Agreement matters: Challenges of translating  into  a MRL

When translating into an MRL:

• Many possible word forms
  – Hard to acquire [but assume it's solved]
• Need to choose correct inflection
• Inflection is determined based on information which is external to the word
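
The selection rule from the agreement slides can be sketched in a few lines. This is only an illustration, not the system discussed in the talk: the `VERB_FORMS` table and the `pick_forms` helper are hypothetical, and a feature missing from the subject (such as the gender of "I") is assumed to be compatible with any value.

```python
# Hypothetical toy lexicon: inflected forms of "love" with their features.
VERB_FORMS = {
    "אוהב": {"gender": "masc", "number": "sing"},
    "אוהבת": {"gender": "fem", "number": "sing"},
    "אוהבים": {"gender": "masc", "number": "plural"},
    "אוהבות": {"gender": "fem", "number": "plural"},
}


def pick_forms(subject_features):
    """Return the verb forms whose features do not clash with the subject's.

    A feature absent from subject_features is treated as unknown,
    so it cannot rule any form out.
    """
    return [
        form
        for form, feats in VERB_FORMS.items()
        if all(subject_features.get(key, value) == value for key, value in feats.items())
    ]


# "I": number is known, gender is missing from the English source.
print(pick_forms({"number": "sing"}))
# "the girls": nothing is missing, so a single form survives.
print(pick_forms({"gender": "fem", "number": "plural"}))
```

For "I" two forms remain, which is exactly the case where the translator has to guess; for "the girls" the subject alone settles the choice.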

Page 61: Agreement matters: Challenges of translating  into  a MRL

Back to Google Translate

The boy washes the car
הנער רוחץ את המכונית
(הנער = young-man)

The girl washes the car
הילדה רוחצת את המכונית
(הילדה = child)

Good job Franz!

Good job Franz?

Page 65: Agreement matters: Challenges of translating  into  a MRL

Back to Google Translate

The boy with the sunglasses washes the floor
הנער עם משקפי השמש שוטפת את הרצפה
(הנער = young-man; the subject is masculine but the verb שוטפת is feminine)

The girl with the sunglasses washes the car
הבחורה עם משקפי השמש שוטף את המכונית
(הבחורה = chick; the subject is feminine but the verb שוטף is masculine)

Good job Franz?

Page 67: Agreement matters: Challenges of translating  into  a MRL

What happened?
• Long distance agreement
• Can’t be represented in phrase-table
• Can’t be represented in n-gram LM

Local “semantic” information from LM/phrase-table → bad translation (ungrammatical)

It’s not Franz’s fault, but the system’s
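
A minimal sketch of why the n-gram LM cannot see the subject once a modifier intervenes. The helper `subject_in_ngram_context` is hypothetical, tokenization is assumed to be plain whitespace splitting, the subject is assumed to precede the verb, and the Hebrew sentences are the grammatical versions of the examples.

```python
def subject_in_ngram_context(tokens, subject, verb, n):
    """True if the subject is among the n-1 words an n-gram LM conditions on
    when it scores the verb (assumes the subject precedes the verb)."""
    subj_idx = tokens.index(subject)
    verb_idx = tokens.index(verb)
    context_start = max(0, verb_idx - (n - 1))
    return context_start <= subj_idx < verb_idx


# "The boy washes the floor": subject and verb are adjacent.
short = "הנער שוטף את הרצפה".split()
# "The boy with the sunglasses washes the floor": three words intervene.
long_ = "הנער עם משקפי השמש שוטף את הרצפה".split()

print(subject_in_ngram_context(short, "הנער", "שוטף", n=3))
print(subject_in_ngram_context(long_, "הנער", "שוטף", n=3))
print(subject_in_ngram_context(long_, "הנער", "שוטף", n=5))
```

With a trigram model the subject of the longer sentence falls outside the window, so the verb form is chosen from local context only; a 5-gram window would reach it here, but estimating such long contexts reliably is a separate problem.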

Page 70: Agreement matters: Challenges of translating  into  a MRL

When translating into an MRL:

• Many possible word forms
  – Hard to acquire [but I assume it's solved]
• Need to choose correct inflection
• Inflection is determined based on information which is external to the word and frequently far away from it

Distance from Verb to Subject in the Hebrew Dependency Treebank (news domain)

S-V Dep-Length   Count   Percent
1                3218    42%
2                1504    19%
3                 914    12%
4                 405     5%
5                 297     4%
>5               1322    17%

2 words apart is already tough for reliably estimating in an n-gram based system!
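
The counts in the table can be turned into cumulative shares with a short script. This is a sketch that uses only the six counts shown; percentages recomputed this way may differ by a point from those on the slide, whose rounding and exact total are not stated.

```python
# Subject-verb dependency lengths and their counts, copied from the table.
counts = {"1": 3218, "2": 1504, "3": 914, "4": 405, "5": 297, ">5": 1322}
total = sum(counts.values())

cumulative = 0
for distance, count in counts.items():
    cumulative += count
    print(f"{distance:>3}  {count:5d}  {count / total:6.1%}  cumulative {cumulative / total:6.1%}")

# Share of subjects that sit more than two words away from their verb.
beyond_two = 1 - (counts["1"] + counts["2"]) / total
print(f"more than 2 words apart: {beyond_two:.1%}")
```

Under these counts, well over a third of the subjects are more than two words from the verb, which is the range the remark on the slide calls tough for an n-gram based system.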

Page 72: Agreement matters: Challenges of translating  into  a MRL

When translating into an MRL:

• Many possible word forms
  – Hard to acquire [but I assume it's solved]
• Need to choose correct inflection
• Inflection is determined based on information which is external to the word and frequently far away from it

Phrase based + N-gram LM can’t do it

Page 74: Agreement matters: Challenges of translating  into  a MRL

What if both languages are MRLs?
• Gender/number marked on both sides
• No need for word-external information
• We can translate word → word again
• MRL → MRL is easy!

Wrong!

Page 75: Agreement matters: Challenges of translating  into  a MRL

What if both languages are MRLs?
• Gender/number marked on both sides
• But:
  – agreement patterns differ between languages
  – gender information differs between languages

Page 81: Agreement matters: Challenges of translating  into  a MRL

What if both languages are MRLs?

Example:
– Spanish and Hebrew have adjective-noun agreement
– new shirt
  • חולצה חדשה
  • nueva camisa
– new car
  • מכונית חדשה
  • nuevo automovil
– new computer
  • מחשב חדש
  • nueva computadora

nuevo, nueva ↔ חדש, חדשה

- Many-to-many mapping
- Correct form still depends on external information
- More chances for error
- Acquiring all the pairs from parallel corpora is harder
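
The many-to-many mapping can be made concrete with the three nouns from the example. This is a sketch under stated assumptions: the `NOUNS` and `NEW` tables are hypothetical toy lexicons holding only the words and genders that appear on the slide.

```python
from collections import defaultdict

# Noun translations and genders, taken from the three examples on the slide.
NOUNS = {
    "shirt": {"he": ("חולצה", "fem"), "es": ("camisa", "fem")},
    "car": {"he": ("מכונית", "fem"), "es": ("automovil", "masc")},
    "computer": {"he": ("מחשב", "masc"), "es": ("computadora", "fem")},
}
# Forms of "new" in each language, indexed by the gender of the noun.
NEW = {
    "he": {"masc": "חדש", "fem": "חדשה"},
    "es": {"masc": "nuevo", "fem": "nueva"},
}

# Collect which Hebrew adjective forms each Spanish form gets paired with.
pairs = defaultdict(set)
for noun in NOUNS.values():
    es_form = NEW["es"][noun["es"][1]]
    he_form = NEW["he"][noun["he"][1]]
    pairs[es_form].add(he_form)

for es_form, he_forms in sorted(pairs.items()):
    print(es_form, "->", sorted(he_forms))
```

Three nouns already produce three of the four possible form pairs; a noun that is masculine in both languages would supply the fourth. The adjective form on the source side therefore does not determine the form on the target side.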

Page 82: Agreement matters: Challenges of translating  into  a MRL

Must consider (at least) syntax

Phrase-tables and n-grams still can’t do it

Page 83: Agreement matters: Challenges of translating  into  a MRL

When Translating into an MRL:

• MT systems must be aware of gender/number

• Should have a notion of agreement

• Use syntax to enforce agreement

Page 89: Agreement matters: Challenges of translating  into  a MRL

Source-side Syntax

[Figure: dependency tree of "The boy with the sunglasses washes the floor". The root is "washes", with "boy" as subj and "floor" as obj.]

Translation candidates for "boy":
• נער [masc/sing]
• ילד [masc/sing]
• בחור [masc/sing]

Translation candidates for "washes":
• שוטף [masc/sing]
• שוטפת [fem/sing]
• שוטפים [masc/plural]
• שוטפות [fem/plural]

Agree!

Problems:
- How to obtain gender/number information?
- How to decode efficiently?
- Agreement behavior is not always that simple
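
A rough sketch of the source-side idea: find the subject of the verb in the source dependency tree, look up the features of its translation, and pick the verb form that agrees. Everything here is hypothetical: the dependency triples are written by hand (only `subj` and `obj` appear on the slide; `prep` and `pobj` are assumed labels), and `NOUN_LEXICON` and `VERB_LEXICON` stand in for resources a real system would have to acquire.

```python
# Hand-written (head, relation, dependent) triples for
# "The boy with the sunglasses washes the floor".
SOURCE_DEPS = [
    ("washes", "subj", "boy"),
    ("washes", "obj", "floor"),
    ("boy", "prep", "with"),
    ("with", "pobj", "sunglasses"),
]

# Toy lexicons: target noun with (gender, number), and verb forms per feature pair.
NOUN_LEXICON = {
    "boy": ("נער", "masc", "sing"),
    "girl": ("בחורה", "fem", "sing"),
}
VERB_LEXICON = {
    "washes": {
        ("masc", "sing"): "שוטף",
        ("fem", "sing"): "שוטפת",
        ("masc", "plural"): "שוטפים",
        ("fem", "plural"): "שוטפות",
    }
}


def inflect_verb(verb, deps):
    """Pick the target verb form that agrees with the translated source subject."""
    subject = next(dep for head, rel, dep in deps if head == verb and rel == "subj")
    _, gender, number = NOUN_LEXICON[subject]
    return VERB_LEXICON[verb][(gender, number)]


print(inflect_verb("washes", SOURCE_DEPS))
```

The intervening "with the sunglasses" no longer matters, because the subject is found through the tree rather than through a fixed window. The sketch sidesteps all three problems listed on the slide: the lexicons are given, there is no decoding, and agreement is assumed to be a simple feature match.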

Page 94: Agreement matters: Challenges of translating  into  a MRL

Target-side Syntax

xLNT transducers can model agreement

Gender and number information encoded in the lexical rules:
• NN[masc/sing](ילד) → boy
• VB[masc/sing](שוטף) → washes
• VB[fem/sing](שוטפת) → washes

Agreement information encoded in the grammar:
• NP[masc/sing](x0:DT x1:NN[masc/sing] x2:PP) → x0 x1 x2
• VP[masc/sing](x0:VB[masc/sing] x1:NP) → x0 x1
• S[masc/sing](x0:NP[masc/sing] x1:VP[masc/sing]) → x0 x1

Problems:
- How to obtain gender/number information?
- Grammar is going to be huge (can we make it smaller?)
- How are we going to obtain the grammar?

Efficiently encoding morphological processes in a treebank grammar is an open research question
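
To get a feel for why the grammar "is going to be huge", the sketch below expands a few unannotated rule templates into one rule per gender/number combination. It is an illustration only: the templates are simplified context-free rules rather than full xLNT rules, and the `[gender/number]` suffix is an assumed plain-text notation for the feature annotation.

```python
from itertools import product

GENDERS = ("masc", "fem")
NUMBERS = ("sing", "plural")

# Unannotated rule templates; "{f}" marks symbols that must share one feature bundle.
TEMPLATES = [
    "S{f} -> NP{f} VP{f}",
    "NP{f} -> DT NN{f} PP",
    "VP{f} -> VB{f} NP",
]


def expand(templates):
    """Instantiate every template once per gender/number combination."""
    rules = []
    for template, (gender, number) in product(templates, product(GENDERS, NUMBERS)):
        rules.append(template.format(f=f"[{gender}/{number}]"))
    return rules


rules = expand(TEMPLATES)
print(len(TEMPLATES), "templates ->", len(rules), "annotated rules")
print(rules[0])
```

With two genders and two numbers every agreeing rule is copied four times. Adding person and tense, or annotating further symbols such as the object NP, multiplies the count again, which is the size problem raised on the slide.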

Page 101: Agreement matters: Challenges of translating  into  a MRL

On The Parsing Side of Things

• Most work on parsing MRLs:
  – consider morphology to be a lexicon-level issue
• Many inflections → high OOV rate
  – Ignoring morphology at syntax-level
  – PCFGLA works frustratingly well
• Recently:
  – Smarter modeling of morphology at syntax level
  – Using morphological agreement to improve parsing
  → Modest benefits to parsing accuracy
  → PCFGLA still better

PCFGLA is not modeling agreement

Page 105: Agreement matters: Challenges of translating  into  a MRL

Rebels without a cause?

• Syntax-based MT:
  – Neat!
  – Only marginally better than phrase-based

English grammaticality is relatively easy to capture using local information

• Syntax-level morphology in parsing:
  – Neat!
  – Only marginally better than ignoring it

• I should work harder.
• Not many agreement mistakes to begin with.
• Agreement is a generation issue more than a parsing one.

Page 106: Agreement matters: Challenges of translating  into  a MRL

Rebels without a cause?

• Syntax-based MT:
  – Neat!
  – Only marginally better than phrase-based

• Syntax-level morphology in parsing:
  – Neat!
  – Only marginally better than ignoring it

Both are crucial for translating into MRLs!

Page 108: Agreement matters: Challenges of translating  into  a MRL

To Conclude

• Translating into MRLs brings new challenges
• Syntax is crucial
  – If you are not looking into syntax, you should!
  – If you are looking into syntax, look deeper!
• Plenty of interesting work to be done
  – Finishing up my PhD on parsing
  – Looking for a postdoc position for next year