


Page 1: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

Streamlining MT for Asian Languages

ISE MT Project 2016

Wakabayashi, ISE Nakamura, Electrosuisse Japan

2016/4/19 Information System Engineering, Electrosuisse Co. Ⓒ 2016 1

Page 2: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

日本語スタイルガイド 第3版 (Japanese Writing Style-Guideline, 3rd edition), from JTCA

• Adding a fifth chapter to it: Writing rules for t7n

Page 3: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

About ISE

A leading company that pioneers new areas in technical communication technologies

• Date of establishment
  • October, 1979

• Business sites
  • Tokyo (Headquarters), Osaka, Kobe
  • Beijing, Shanghai, Switzerland

• Affiliated business
  • Electrosuisse Japan (Kobe)

• Our Business
  • Technical Communication
  • Interface Design
  • Systems Design & Development
  • Technical Consulting

• Customer Fields
  • Japanese Governmental Agencies, Educational/Research Institutions
  • Financial Institutions, Trading, Manufacturing, Information Services


Page 4: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

Project Phases

1. From English data: utilizing TMs in MT to make L10n more effective and efficient, that is, faster and cheaper

   1. For European languages
   2. For Asian languages

2. From Japanese data: JA-EN MT, utilizing MT to facilitate communication, including SNS and customer support
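One common way to combine a TM with MT is to reuse the stored target when a segment closely matches a TM source and to send everything else to MT. The sketch below illustrates that idea only; it is an assumption, not the project's actual pipeline. The function names, the 0.75 threshold, the one-entry TM (built from the case-study sentence pair shown later in this deck), and the unrelated test segment are all hypothetical.

```python
from difflib import SequenceMatcher


def best_tm_match(segment, tm):
    """Return (stored_source, similarity) for the closest TM entry."""
    best_src, best_score = None, 0.0
    for src in tm:
        score = SequenceMatcher(None, segment.lower(), src.lower()).ratio()
        if score > best_score:
            best_src, best_score = src, score
    return best_src, best_score


def route_segment(segment, tm, threshold=0.75):
    """Reuse the TM target for close matches; otherwise flag the segment for MT."""
    src, score = best_tm_match(segment, tm)
    if src is not None and score >= threshold:
        return "tm", tm[src]
    return "mt", None


# Hypothetical one-entry TM (EN -> zh-CN) built from the deck's own example pair.
TM = {
    "To use this safely and comfortably, carefully read precautions "
    "and install in an appropriate location.":
        "若要安全舒适地使用本机,请仔细阅读以下注意事项并将本机安装在适当的位置。",
}

# A near-duplicate of the stored source (only "the" is added).
near = ("To use this safely and comfortably, carefully read the precautions "
        "and install in an appropriate location.")
print(route_segment(near, TM)[0])

# A hypothetical, unrelated segment.
print(route_segment("Replace the toner cartridge.", TM)[0])
```

With this toy TM, the near-duplicate is served from the TM and the unrelated segment is routed to MT. A real setup would tune the threshold per language pair and content type.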


Page 5: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

from EN to Asian

• Source Language: EN

• Target Language: CN, IN, JA

• Domain: ICT Equipment

• Document Type: Manual


Page 6: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan


Page 7: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

Analytical Assessments

English source text I:

• To use this safely and comfortably, carefully read precautions and install in an appropriate location.

• German: Um diese Maschine sicher zu benutzen und bequem, lesen Sie sorgfältig die folgenden Vorkehrungen und installieren Sie die Maschine in einen passenden Standort.

• French: Pour utiliser cette machine sans risque et lisez confortablement, soigneusement les précautions suivantes et installez la machine dans un emplacement approprié.

• Italian: Per utilizzare sicuro questa macchina e legga confortevolmente, con attenzione le seguenti precauzioni ed installi la macchina in una posizione appropriata.

• Spanish: Para utilizar esta máquina con seguridad y lea comfortablemente, cuidadosamente las precauciones siguientes e instale la máquina en una ubicación apropiada.


Page 8: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

Case Studies: Comparisons of MT and Human Generated English to Simplified Chinese Translations

English source text I:

To use this safely and comfortably, carefully read precautions and install in an appropriate location.

MT translation: 为了安全地使用本机和舒适地,请仔细阅读以下注意事项并在合适的地方安装本机。

Human translation: 若要安全舒适地使用本机,请仔细阅读以下注意事项并将本机安装在适当的位置。


Page 9: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

English Source Text I: Analysis

This sentence uses the infinitive “to use.” MT renders “to” as wèile (为了), and the human translator renders it as ruòyào (若要). Both convey the meaning of “to,” but the contextual nuance of ruòyào, the human translator’s choice, suits this particular sentence better.

MT splits the adverb pair “safely and comfortably” into ānquánde...shūshìde (安全地...舒适地), whereas the human translator renders it as one succinct lexical unit, ānquánshūshìde (安全舒适地).

The word jiāng (将), highlighted in green in the human translation, is a particle that marks direct objects. It is absent from the MT translation, and the syntactic structures of the two sentences differ.

MT renders “appropriate location” as “suitable place,” héshìdedìfāng (合适的地方), whereas the human translator renders it as “appropriate location,” shìdāngdewèizhì (适当的位置).
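The differences described above can also be surfaced mechanically. The following is a minimal sketch, not part of the original assessment, that uses Python's standard `difflib` to compare the two Chinese sentences character by character; the variable names are illustrative.

```python
from difflib import SequenceMatcher

mt = "为了安全地使用本机和舒适地,请仔细阅读以下注意事项并在合适的地方安装本机。"
human = "若要安全舒适地使用本机,请仔细阅读以下注意事项并将本机安装在适当的位置。"

matcher = SequenceMatcher(None, mt, human, autojunk=False)

# Longest run of characters that both translations share.
m = matcher.find_longest_match(0, len(mt), 0, len(human))
longest_shared = mt[m.a:m.a + m.size]
print("longest shared run:", longest_shared)

# Character-level edit operations that turn the MT output into the human one.
for tag, i1, i2, j1, j2 in matcher.get_opcodes():
    if tag != "equal":
        print(f"{tag:8} MT[{mt[i1:i2]}] -> human[{human[j1:j2]}]")

print("similarity:", round(matcher.ratio(), 2))
```

The middle clause (“carefully read the following precautions and”) is identical in both versions. The edits cluster around the opening purpose clause and the final installation clause, which is where the analysis locates the differences.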

Page 10: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

Case Studies: MT Generated English to Indonesian with LSP Comments

English source text I: To use this safely and comfortably, carefully read precautions and install in an appropriate location.

• MT translation: Untuk menggunakan mesin ini secara aman dan nyaman, berhati-hati membaca pencegahan berikut dan menginstal mesin dalam lokasi yang tepat.

• LSP Comments: The message is understandable, but the language is very poor.

• Improvement Strategy (next step): Terminology, MT Tuning

Page 11: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

Analytical Assessments

English source text II: This is classified as a Class 1 under ZZZ XXXXX-X:XXXX, ENXXXXX-X:XXXX.

• German: Diese Maschine wird als Laser-Produkt der Klassen-X unter ZZZ XXXXX-X klassifiziert: XXXX, ENXXXXX-X: XXXX.

• French: Cette machine est classifiée comme produit de laser de la classe 1 sous le CEI XXXXX-X : XXXX, ENXXXXX-X : XXXX.

• Italian: Questa macchina è classificata come prodotto del laser della classe A 1 nell'ambito dell'ZZZXXXXX-X: XXXX, ENXXXXX-X: XXXX.

• Spanish: Esta máquina se clasifica como producto del laser de la clase 1 bajo ZZZ XXXXX-X: XXXX, ENXXXXX-X: XXXX.


Page 12: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

Case Studies: Comparisons of MT and Human Generated English to Simplified Chinese Translations

English source text II:

• This is classified as a Class 1 under ZZZ XXXXX-Y:XXXX, ZZXXXXX-Y:XXXX.

• MT translation: 本机被列为ZZZ XXXXX-Y下的1类激光产品:XXXX、ZZXXXXX-Y:XXXX.

• Human translation: 根据 ZZZ XXXXX-Y: XXXX, ZZXXXXX-Y: XXXX,本机被归类为 1 类激光产品。


Page 13: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

English Source Text II: Analysis

• The phrase “is classified as” is rendered in a semantically correct way in both translations, although the lexical choices differ: MT uses bèilièwéi (被列为), whereas the human translator uses bèiguīlèiwéi (被归类为).

• For the word “under,” MT uses xiàde (下的), whereas the human translator uses gēnjù (根据), which means “according to.”

• MT does not keep the International Electrotechnical Commission (ZZZ) numbers in their original sequence, whereas the human translator keeps the sequence intact.
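The broken standard number in the last point is the kind of error a simple automatic check can flag before a human reviews the output. The sketch below is an assumption about how such a check could look, not a tool used in the project. The helper name `broken_identifiers` is hypothetical, and the identifiers are the anonymized placeholders from the slide.

```python
import re


def broken_identifiers(identifiers, translation):
    """Return the identifiers that do not survive as one contiguous string."""
    # Ignore whitespace so "Y: XXXX" and "Y:XXXX" count as the same identifier.
    squeezed = re.sub(r"\s+", "", translation)
    return [i for i in identifiers if re.sub(r"\s+", "", i) not in squeezed]


# Placeholder identifiers exactly as they appear in source text II.
ids = ["ZZZ XXXXX-Y:XXXX", "ZZXXXXX-Y:XXXX"]

mt = "本机被列为ZZZ XXXXX-Y下的1类激光产品:XXXX、ZZXXXXX-Y:XXXX."
human = "根据 ZZZ XXXXX-Y: XXXX, ZZXXXXX-Y: XXXX,本机被归类为 1 类激光产品。"

print("MT   :", broken_identifiers(ids, mt))
print("Human:", broken_identifiers(ids, human))
```

The check reports the first identifier as broken in the MT output, because MT inserted the product phrase between the number and its year suffix. It reports nothing for the human translation. It is a plain substring test, so it only shows that an identifier was split, not where it ended up.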


Page 14: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

Case Studies: MT Generated English to Indonesian with LSP Comments

English source text II: This is classified as a Class 1 under ZZZ XXXXX-Y:XXXX, ZZXXXXX-Y:XXXX.

• MT translation: Mesin ini bisa diklasifikasikan sebagai Kelas 1 di bawah ZZZ Produk Laser XXXXX-Y: XXXX, ZZXXXXX-Y: ZZZZ.

• LSP Comments: Understandable, but because it is very poorly presented, misunderstanding may arise.

• Improvement Strategy (next step): Terminology, Building Translation Model and MT Tuning


Page 15: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

Analytical Assessments

English source text III:

• The beam is confined in the unit by a cover, so there is no danger of the beam escaping during normal operation.

• German: Das Laserstrahl wird in der Laserlesegeräteinheit durch eine Abdeckung, so dort ist keine Gefahr des Laserstrahlentgehens während der normalen Rechneroperation begrenzt.

• French: L'à rayon laser est confiné dans l'unité de module de balayage à laser par une couverture, tellement là n'est aucun danger de l'évasion à rayon laser pendant l'opération de machine normale.

• Italian: Il raggio laser è limitato nell'unità dell'analizzatore di laser da una copertura, così là non è il pericolo del raggio laser che sfugge durante l'operazione a macchina normale.

• Spanish: El de rayo láser es confinada en la unidad del escáner de laser por una cubierta, tan allí no es ningún peligro del escape de rayo láser durante la operación de máquina normal.


Page 16: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

Case Studies: Comparisons of MT and Human Generated English to Simplified Chinese Translations

English source text III:

• The beam is confined in the unit by a cover, so there is no danger of the beam escaping during normal operation.

• MT translation: 激光在激光扫描仪单元中被盖板,没有任何激光束泄漏的危险在本机的正常操作。

• Human translation: 激光束由盖板约束在激光扫描仪单元中,因此在本机正常操作期间没有激光束泄漏的危险。


Page 17: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

English Source Text III: Analysis

• MT translates “is confined” as bèigàibǎn (被盖板), which means “covered,” whereas the human translator renders it as the split phrase yóu...yuēshū (由...约束), meaning “by...confined.”

• MT renders the word “no” as méiyǒurènhé (没有任何), which means “not any,” whereas the human translator renders it as méiyǒu (没有), meaning “no.”

• The word “so” is missing from the MT translation, whereas the human translator uses yīncǐ (因此). In addition, MT does not translate “during,” whereas the human translator renders it as qījiān (期间).

• Overall, the syntactic structures of the two translations are quite different.


Page 18: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

Case Studies: MT Generated English to Indonesian with LSP Comments

English source text III: The beam is confined in the unit by a cover, so there is no danger of the beam escaping during normal operation.

• MT translation: Sinar laser terbatas dalam scanner laser unit dengan penutup, jadi tidak ada bahaya dari sinar laser melarikan diri selama operasi mesin normal.

• LSP Comments: Understandable, but because it is very poorly presented, misunderstanding may arise.

• Improvement Strategy (next step): Terminology, Building Translation Model and MT Tuning


Page 19: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

EN->JP

Nakamura, Electrosuisse Japan

Page 20: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

Recent Situations for MT in Japan

• Decision makers at clients: management in charge of technical development

• More than 20 years ago, when I worked for a manufacturer, a managing director recommended that I make l10n more efficient by using MT

  • but I declined because of the quality at that time

  • I suppose he and others at his level tried …

  • And they doubt …

  • Because they tried MT by themselves in the past without any preparation and got awful, traumatic results: garbage?

Page 21: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

Jokes Around Us

• Indiscriminate misuse by people who don’t have enough knowledge about MT

• Albert Einstein: His Life and Universe (アインシュタイン その生涯と宇宙)

  • which drove the publisher into bankruptcy

  • Amazing…, which fetched a premium price of more than twenty thousand yen because of its uniqueness

Page 22: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

From Albert Einstein: His Life and Universe

• 彼は,時には,やかましくこっこっと鳴って,終わりに全体の出来事


• … he cackled loudly at times and at the end pronounced the entire event “most amusing.”

• metaphor: cackle

Page 23: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

From Albert Einstein: His Life and Universe

• ボルンの妻のヘートヴィヒに最大限にしてください。(そのヘートヴィヒは,彼の家族に関する彼の処理,今や説教された頃,彼が「自分がそのかなり不幸な回答に駆り立てられるのを許容していないべきでない」と自由に彼に叱った)。以上は,彼が目立つべきであり,彼女が言ったのを「科学の人里離れている寺」に尊敬します。」(p.41)

• Max Born’s wife, Hedwig, who had freely scolded Einstein about his treatment of his family, now lectured, “[You should] not have allowed yourself to be goaded into that rather unfortunate reply.” He should show more respect, she said, for “the secluded temple of science.”

• term, part of speech: “Max” parsed as a transitive verb (vt) in “Max Born’s wife”

• word order: ,she said,

• metaphor: secluded temple of science

Page 24: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

From Albert Einstein: His Life and Universe

• アルバート・アインシュタインの爆発するようなグローバルな名声と芽生え始めたシオニズムは科学の歴史の中でもユニークで,どんな分野にも,本当に,顕著であった出来事のため一九二二年春に集中。一種の大規模狂乱を喚起して,ツアーしているロックスターをぞくぞくさせるへつらいを押す東とmidwestern合衆国を通る壮大な二ヶ月の行列式書。(p.45)

• Albert Einstein’s exploding global fame and budding Zionism came together in the spring of 1921 for an event that was unique in the history of science, and indeed remarkable for any realm: a grand two-month processional through the eastern and midwestern United States that evoked the sort of mass frenzy and press adulation that would thrill a touring rock star.

• terms

• metaphor

Page 25: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

From Albert Einstein: His Life and Universe

• 世界は,以前一度も見たことがなくて,おそらくユダヤ人のための再




• The world had never before seen, and perhaps never will again, such a scientific celebrity superstar, one who also happened to be a gentle icon of humanist values and a living patron saint for Jews.

Page 26: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

Actually Garbage?

• No, you can get reasonable results from MT now, if you

  • limit the categories to technical documents in a certain field

  • prepare an appropriate terminology dictionary in the field

  • and add some spices

• But clients in Japan are still hesitant

• The quality level has improved even without creating a translation model

• It’s time to diffuse

  • We will show you some
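One way to put a field-specific terminology dictionary to work is to check MT output against it and flag segments that ignore an approved term. The sketch below assumes a tiny EN -> zh-CN termbase. Its two entries are the terms the human translator used for source text I earlier in this deck, and the names `TERMBASE` and `term_violations` are hypothetical.

```python
# Hypothetical EN -> zh-CN termbase; both target terms are the ones the human
# translator used for source text I in this deck.
TERMBASE = {
    "appropriate location": "适当的位置",
    "precautions": "注意事项",
}


def term_violations(source, translation, termbase=TERMBASE):
    """Return {source_term: expected_target} for terms the translation ignores."""
    lowered = source.lower()
    return {
        term: target
        for term, target in termbase.items()
        if term in lowered and target not in translation
    }


source = ("To use this safely and comfortably, carefully read precautions "
          "and install in an appropriate location.")
mt = "为了安全地使用本机和舒适地,请仔细阅读以下注意事项并在合适的地方安装本机。"

print(term_violations(source, mt))
```

For this sentence the check flags “appropriate location,” which MT rendered as 合适的地方, and passes “precautions.” The same dictionary could instead be fed to the MT engine up front, where the engine supports that.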

Page 27: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

• To use this safely and comfortably, carefully read the precautions and install in an appropriate location.

• この機械を安全かつ安心して使用するためには、注意深く次の注意を読み、適切な位置に機械を取付けて下さい。

Page 28: Streamlining MT for Asian Languages, by Natsuki Wakabayashi, ISE and Tetsuzo Nakamura, Electrosuisse Japan

日本語スタイルガイド 第3版 (Japanese Writing Style-Guideline, 3rd edition), from JTCA

• Adding a fifth chapter to it: Writing rules for t7n