MT Domain Customization – Conditions and Benefits. Chris Wendt (Microsoft)

Posted on 22-Jan-2018

TRANSCRIPT

  1. 1. Translator
  2. 2. Slide 3
  3. 3. Slide 5 Learn word and phrase alignments from parallel data
  4. 4. Slide 6
  5. 5. Slide 7 f → e*. $e^* = \arg\max_e P(e \mid f)$; $P(e \mid f) = P(f \mid e)\,P(e) / P(f)$; $\arg\max_e P(e \mid f) = \arg\max_e P(f \mid e)\,P(e)$. $P(f \mid e)$: channel translation model; $P(e)$: language model.
  6. 6. Slide 8 Start with / Build these components. Parallel sentences: Translation Model. Monolingual data: Language Model P(E). Decoding algorithm: Decoder.
  7. 7. Slide 9 Microsoft's vast language knowledge: Translation Model, Target Language Model, Other Models. Your and your community's language knowledge: Translation Model, Target Language Model. Your test and tuning documents: Lambda weight vector. Translator service and API; Your Applications.
  8. 8. Slide 10 Your site or application; Translator Service; Supply Corrections; Consume Translations; Collaborative Translations Store; Microsoft Translator Hub; Custom Models; Generic Models; Your own, previously translated documents; Supply Documents; Build custom models; Import Corrections for training.
  9. 9. Slide 11 Your site or application; Translator Service; Supply Corrections; Consume Translations; Collaborative Translations Store; Microsoft Translator Hub; Custom Models; Generic Models; Your own, previously translated documents; Supply Documents; Build custom models; Import Corrections for training. API methods: Translate(), AddTranslation(), GetTranslations(), GetUserTranslations(), Speak(), Detect(), BreakSentences(). Thorough customization: retrain every 2 months, or 20,000 segments. Continuous Improvement.
  10. 10. Slide 12 What goes in / What it does / Rules to follow. (1) Calculates the BLEU score just for you. Rules: be strict; compose them to be optimally representative of what you are going to translate in the future. (2) Dictionaries: force the given translation with a probability of 1. Rules: be restrictive; safe to use only for compound nouns and named entities; better not to use them and let the system learn. (3) Builds the translation model, aka phrase table; teaches how to translate. Rules: be liberal; any in-domain human translation is better than MT; add and remove documents as you go and try to improve the score. (4) Builds the target language model; improves grammar and fluency. Rules: be liberal; use any in-domain target language material you can get.
  11. 11. Slide 13 Humans can easily detect 0.5 to 1.0 points. Faster post-editing. Higher document comprehension. Small domain: higher improvement within the domain. Large domain: better suited for input variability; better exploitation of training docs. Better to build a larger domain (lower BLEU delta).
  12. 12. Slide 14
  13. 13. Slide 15 Quality, Speed, Price: you can only have two. P3
  14. 14. Slide 16 Post-Editing. Goal: human translation quality. Increase human translators' productivity. In practice: 0% to 25% productivity increase. Varies by content, style and language. Raw publishing. Goals: good enough for the purpose, speed, cost. Publish the output of the MT system directly to the end user. Best with bilingual UI. Good results with technical audiences. Cost-effective way for inbound material. Triage. Analysis and classification. P3: Post-Publish Post-Editing. Know what you are human translating, and why. Make use of community: domain experts, enthusiasts, employees, professional translators. Best of both worlds: fast, better than raw, always current.
  15. 15. Slide 17 Assimilation; Dissemination; Post-Edit. Use customized machine translation. Never miss a chance to collect a human edit. Make the source visible on demand. Show the source. Show domain-relevant dictionaries. Apply TM with 100%. Apply TM with 80%. Reveal alternatives. Publish raw first, collect human feedback. Use modern, collaborative TM systems (e.g. MemSource).
  16. 16. Slide 18
  17. 17. Slide 19 Deep Neural Networks (>30% in ASR); Recurrent Neural Networks (1-6 BLEU); Filtering, domain adaptation.
  18. 18. Slide 20
  19. 19. Slide 21
  20. 20. Slide 22
  21. 21. blogs.msdn.com/translator twitter.com/MSTranslator facebook.com/MicrosoftTranslator linkedin.com/company/Microsoft-Translator microsoft.com/translator
  22. 22. Slide 24
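
Slide 7 states the noisy-channel decision rule: pick the target sentence e that maximizes P(f | e) P(e). A minimal sketch of that argmax follows. The probability tables are invented toy values and `decode` is a hypothetical helper name; a real decoder searches a vastly larger space than this lookup.

```python
import math

# Hypothetical toy probabilities, invented purely for illustration.
# translation_model[(f, e)] = P(f | e); language_model[e] = P(e).
translation_model = {
    ("das haus", "the house"): 0.6,
    ("das haus", "house the"): 0.6,
    ("das haus", "the home"): 0.3,
}
language_model = {
    "the house": 0.05,
    "house the": 0.0001,
    "the home": 0.02,
}


def decode(f):
    """Return e* = argmax_e P(f | e) * P(e), scored in log space."""
    best_e, best_score = None, -math.inf
    for (source, e), p_f_given_e in translation_model.items():
        if source != f or e not in language_model:
            continue
        # P(f) is constant for a given f, so it drops out of the argmax.
        score = math.log(p_f_given_e) + math.log(language_model[e])
        if score > best_score:
            best_e, best_score = e, score
    return best_e


print(decode("das haus"))
```

In this toy data the translation model cannot tell "the house" from "house the"; the language model term P(e) is what breaks the tie in favor of the fluent word order.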
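
Slide 9 mentions a lambda weight vector derived from your test and tuning documents. In phrase-based statistical MT such a vector usually weights the individual model scores in a log-linear combination; whether the Translator service combines its models exactly this way is an assumption here. The feature names, values and weights below are hypothetical.

```python
def loglinear_score(features, weights):
    """Weighted sum of model log-scores: sum_i lambda_i * h_i."""
    return sum(weights[name] * value for name, value in features.items())


def best_candidate(candidates, weights):
    """Pick the candidate translation with the highest combined score."""
    return max(candidates, key=lambda c: loglinear_score(c["features"], weights))


# Hypothetical candidates with log-probabilities from two models:
# "tm" = translation model, "lm" = target language model.
candidates = [
    {"text": "A", "features": {"tm": -1.0, "lm": -4.0}},
    {"text": "B", "features": {"tm": -1.5, "lm": -2.0}},
]

# Two hypothetical lambda vectors; tuning chooses between such settings.
trusts_lm = {"tm": 1.0, "lm": 0.5}
trusts_tm = {"tm": 1.0, "lm": 0.1}

print(best_candidate(candidates, trusts_lm)["text"])
print(best_candidate(candidates, trusts_tm)["text"])
```

The same two candidates rank differently under the two weight vectors, which is why the tuning documents should be representative of what will be translated later.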
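
Slide 11 gives a retraining cadence of every 2 months, or 20,000 segments. A small sketch of that trigger is below, reading the rule as "whichever comes first", which is an assumption. `should_retrain` is a hypothetical helper, and 61 days stands in for "2 months".

```python
from datetime import date, timedelta

# Assumptions: "2 months" is approximated as 61 days, and the two
# conditions are alternatives (retrain when either one is met).
RETRAIN_INTERVAL = timedelta(days=61)
RETRAIN_SEGMENTS = 20_000


def should_retrain(last_trained, today, new_segments):
    """True when enough time has passed or enough new segments arrived."""
    time_due = (today - last_trained) >= RETRAIN_INTERVAL
    data_due = new_segments >= RETRAIN_SEGMENTS
    return time_due or data_due


print(should_retrain(date(2018, 1, 1), date(2018, 1, 15), 20_000))
```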
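
Slide 17 lists "Apply TM with 100%" and "Apply TM with 80%". Read as translation-memory match thresholds (an interpretation, not stated on the slide), the lookup could be sketched as follows. The similarity measure is Python's `difflib.SequenceMatcher` at character level, a stand-in for whatever fuzzy-match metric a real TM system uses; the memory entries and `tm_lookup` are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: source segment -> human translation.
memory = {
    "Click the Save button.": "Klicken Sie auf die Schaltfläche Speichern.",
    "Open the File menu.": "Öffnen Sie das Menü Datei.",
}


def tm_lookup(segment, memory, threshold):
    """Return (source, translation, score) for the best match at or above
    the threshold, or None when nothing in the memory is close enough."""
    best = None
    for source, translation in memory.items():
        score = SequenceMatcher(None, segment, source).ratio()
        if score >= threshold and (best is None or score > best[2]):
            best = (source, translation, score)
    return best


# Exact match: reusable as-is at the 100% threshold.
print(tm_lookup("Click the Save button.", memory, 1.0))
# Near match (missing period): only found at the looser 80% threshold.
print(tm_lookup("Click the Save button", memory, 0.8))
```

A 100% hit can be reused directly, while an 80% hit is a suggestion for the post-editor to adapt; segments below the threshold fall through to machine translation.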