Kotaro Hara and Shamsi Iqbal
makeability lab
Effect of Machine Translation in Interlingual Conversation
Successful intercultural communication is important in:
Education
Business
Health-care
Yet, a language barrier can be a challenge
“The infrequent use of
interpreters* in the
delivery ward was
among the most
important reasons for
the reduced quality of
[health] care.”
Kale and Syed (2010)
* Interpreter: a person who translates
the words that someone is speaking
into a different language
(Merriam-Webster)
Speech Recognition & Machine Translation Performance
Fügen et al. (2007) noted that “automatic systems can
already provide usable information for people.”
2015
2015
Little research has paid attention to how people interact with
spoken language translation technology
Hamon O., et al., End-to-End Evaluation in Simultaneous Translation, ECACL2009
Hello!
Speech
Text
Spoken Language Translation System
Spoken Language Translation
Automatic speech recognition Machine translation
Speech synthesis
a あ
Hamon O., et al., End-to-End Evaluation in Simultaneous Translation, ECACL2009
Hello!
Speech
Text
Spoken Language Translation System
Spoken Language Translation
Automatic speech recognition Machine translation
Speech synthesis
a あ
Danger.
Do not enter
Effect of machine translation on text-based communication Gao et al. 2014; Yamashita et al. 2009; Yamashita and Ishida 2006
Hello!
Focuses on automatic speech recognition, but not machine translation.
…
Native English Speaker Non-native English Speaker
Hello!
Effect of speech recognition on interlingual conversation Gao et al. CHI2014
Effect of spoken language translation system to
interlingual communication is under-explored.
Key Point
Evaluation of NESPOLE!
Constantini et al. 2002
• No detailed discussion on
how people adapted in
using the system
• Push-to-talk interface made
it hard to analyze turn-taking
behavior
• Examined usability of
spoken language
translation system
Explore how spoken language translation
affects a natural conversation between
people speaking different languages
Goal
Video to show how the system work:
One side speaks in English, another
side in German
Translator Tool | Demo
Skype Translator Demo, English-German Conversation
Video to show how the system work:
One side speaks in English, another
side in German
Translator Tool | Demo
Video to show how the system work:
One side speaks in English, another
side in German
Video chat interface
Translator Tool | Interface Components
Video to show how the system work:
One side speaks in English, another
side in German
Closed Caption (CC)
Translator Tool | Interface Components
Video to show how the system work:
One side speaks in English, another
side in German
Translated Text-to-Speech (TTS) Audio
Translator Tool | Interface Components
Video to show how the system work:
One side speaks in English, another
side in German
Headset
Microphone
Translator Tool | Interface Components
Study Method | Participants
8 French-German pairs
(N=16 participants)
15 English-German pairs
(N=30 participants)
Study Method | Participants
8 French-German pairs
(N=16 participants)
15 English-German pairs
(N=30 participants)
No common language With common language (English)
All participants could speak
English reasonably well.
Study Method | Participants
8 French-German pairs
(N=16 participants)
15 English-German pairs
(N=30 participants)
With common language
(English)
I will mainly talk about
this study
No common language
Study Method | Conversation Task
Olivia trainierte fur den
Tanzwettbewerb. (Olivia was practicing for the
dance-off )
Conversation Task
German Speaker French Speaker
A starting sentence was
provided to the participant
Study Method | Conversation Task
Olivia trainierte fur den
Tanzwettbewerb. (Olivia was practicing for the
dance-off )
Conversation Task
German Speaker French Speaker
Translate
Study Method | Conversation Task
Olivia trainierte fur den
Tanzwettbewerb. (Olivia was practicing for the
dance-off ) Je ne savais pas Olivia dansé.
Combien de temps elle s'y
adonne ? (I didn’t know Olivia danced. How long
has she been practicing?)
Conversation Task
German Speaker
Translate
French Speaker
Study Method | Conversation Task
Olivia trainierte fur den
Tanzwettbewerb. (Olivia was practicing for the
dance-off )
Seit sechs Jahren (For six years)
C'est une longue période. (That’s a long time.)
Conversation Task
German Speaker
…
French Speaker
Je ne savais pas Olivia dansé.
Combien de temps elle s'y
adonne ? (I didn’t know Olivia danced. How long
has she been practicing?)
We asked participants to perform 9 conversation tasks
(~3 min ea.) in their respective languages and an
additional conversation task in English
The goal of each task was to collaboratively construct
a coherent story
Study Method | Conversation Task
Tasks were conducted using three different settings of
the translator tool
Conversation Task
Study Method | Interface Settings
Closed Caption &
Text-to-Speech (CC & TTS)
CC
Closed Caption only (CC)
CC
Text-to-Speech only (TTS)
Interface Settings
Closed Caption & Text-to-Speech Closed Caption Text-to-Speech
With Closed Caption and Text-to-Speech
CC CC
Closed Caption and
Text-to-Speech
CC
Closed Caption
CC
Text-to-Speech
Round 1
Study Method | Interface Settings
Interface Settings
Order was permuted for each pair
for counter balancing
CC CC Round 1
Study Method | Interface Settings
Interface Settings
CC CC
CC CC
Round 2
Round 3
9 tasks with different interface settings and
different starting sentences
CC CC Round 1
Study Method | Interface Settings
Interface Settings
CC CC
CC CC
Round 2
Round 3
Eng A baseline English task to compare
with/without translation settings
CC CC Round 1
Study Method | Data
Data
CC CC
CC CC
Round 2
Round 3
Post-task survey (after every task)
E.g., “We had a successful conversation.” Strongly
Agree
Strongly
Disagree
Eng
Eng
CC CC Round 1
Study Method | Data
Data
CC CC
CC CC
Round 2
Round 3
Post-round survey
(after every 3 tasks)
Interface setting preference ranking
CC & TTS: __________, CC: __________, TTS: __________ Best Neutral Worst
CC CC Round 1
Study Method | Data
Data
CC CC
CC CC
Round 2
Round 3
Eng Post-session survey
and Interview
Translator
Tool
Study
Method
Quantitative
Analysis
Content
Analysis
• Overall interface
setting preferences
• Whether people became
used to using the
translator tool
Three interface settings
Three rounds (nine tasks)
Two languages
3 x 3 x 2 mixed design study
With-in subject
With-in subject
Between subject
Quantitative Analysis | Analysis Method
Analysis Method
Transformed ordinal data in survey responses with
aligned rank transformation
We analyzed the data with restricted maximum
likelihood model
Quantitative Analysis | Analysis Method
Analysis Method
Interface Preference Results
Quantitative Analysis | Interface Preference Results
Which interface setting did you favor the most and least?
Closed Caption & Text-to-Speech Closed Caption only Text-to-Speech only
CC CC
59%
35%
6%
0%
25%
50%
75%
100%
Closed Caption & Text-to-Speech Closed Caption only Text-to-Speech only
Quantitative Analysis | Interface Preference Results
Most Favored Interface Setting
by French-German Group
CC CC
Interface Settings
Percentage of people who
favored the interface setting
59%
35%
6%
0%
25%
50%
75%
100%
Closed Caption & Text-to-Speech Closed Caption only Text-to-Speech only
Quantitative Analysis | Interface Preference Results
Most Favored Interface Setting
by French-German Group
Interface settings with closed
caption were preferred
CC CC
59%
35%
6%
0%
25%
50%
75%
100%
Closed Caption & Text-to-Speech Closed Caption only Text-to-Speech only
Quantitative Analysis | Interface Preference Results
Most Favored Interface Setting
by French-German Group
CC CC
24% more people preferred
to have text-to-speech
59%
35%
6%
0%
25%
50%
75%
100%
Closed Caption & Text-to-Speech Closed Caption only Text-to-Speech only
Quantitative Analysis | Interface Preference Results
Most Favored Interface Setting
by French-German Group
CC CC
Set the Closed Caption & Text-to-Speech to the default,
but allow users to turn on/off setting
2.2 2.2
2.7
4.8
1
2
3
4
5
Round 1 Round 2 Round 3 English
Overall Conversation Quailty
Perceived Conversation Quality over Rounds
Round
Average 5-point
Likert scale rating
Quantitative Analysis | Perceived Conversation Quality
5-point Likert scale rating (higher is better)
Rating
2.2 2.2
2.7
4.8
1
2
3
4
5
Round 1 Round 2 Round 3 English
Overall Conversation Quailty
p < 0.01
p < 0.01
Participants felt that
their conversation
quality improved
Perceived Conversation Quality over Rounds
Quantitative Analysis | Perceived Conversation Quality
F2,112=6.275, p<0.01; Error bars are standard errors
5-point Likert scale score (higher is better)
Rating
2.2 2.2
2.7
4.8
1
2
3
4
5
Round 1 Round 2 Round 3 English
Overall Conversation Quailty
Perceived conversation
quality is not as good as
that of English task
Perceived Conversation Quality over Rounds
Quantitative Analysis | Perceived Conversation Quality
5-point Likert scale score (higher is better)
Rating
Translator
Tool
Study
Method
Quantitative
Analysis
Content
Analysis
• Why people were satisfied or
dissatisfied with the experience
• How people adapted to using
the translator tool
Interview transcripts and survey responses were
coded with a content analysis method; recurring
themes were extracted and grouped together
Post-session interviews were transcribed by authors
Content Analysis | Method
Method
62.5% of the participants noted that
proper noun recognition
errors hindered the
conversation.
Content Analysis | Proper Noun Recognition Errors
Proper Noun Recognition Errors
‘Ava’, there was no way it was getting ‘Ava’ no matter how
many times I tried. So in German, when you say ‘but,’ it
sounds very similar. So that’s what it was picking up.
French-German Pair 4, German speaker
Content Analysis | Proper Noun Recognition Errors
‘Ava’, there was no way it was getting ‘Ava’ no matter how
many times I tried. So in German, when you say ‘but,’ it
sounds very similar. So that’s what it was picking up.
French-German Pair 4, German speaker
Content Analysis | Proper Noun Recognition Errors
Lack of an adaptation technique
Content Analysis | Grammar and Word Order Errors
37.5% of the participants noted that
grammatical errors and
misplacement of words were
problematic.
Grammar and Word Order Errors
French-German Pair 3, German speaker
Because [...] German sentence structure is so different,
[translated] sentence wouldn’t make any sense when
[synthesized speech] said it, but when you see it on the
screen, you can kind of reorganize the words in the way it
supposed to go. And then it makes sense.
Content Analysis | Grammar and Word Order Errors
French-German Pair 3, German speaker
Because [...] German sentence structure is so different
[translated] sentence wouldn’t make any sense when
[synthesized speech] said it, but when you see it on the
screen, you can kind of reorganize the words in the way it
supposed to go. And then it makes sense.
Content Analysis | Grammar and Word Order Errors
People could find a
fallback strategy
Content Analysis | Information from Original Speech
31.3% of the participants noted that
original speech was useful.
Information from Original Speech
[Having partner’s original voice] humanizes the
relationships as well, because if you only get the robotic
voice, I think it just cuts you off from the person you are
engaged with. Whereas you talk, then it feels much more
human relationship.
French-German Pair 1, French speaker
Content Analysis | Information from Original Speech
An original speech can be useful
even when a person do not know
the language
Content Analysis | Forgiving
31.3% of the participants mentioned
that they forgave the errors of
the translator tool
Forgiving
My sentence became shorter, I pronounced more clearly,
I was speaking ... I mean, if I were speaking to someone
who’s not a native German, and if I didn’t have a
translator, then I would also slow down, obviously.
French-German Pair 5, German speaker
Content Analysis | Forgiving
My partner and I would begin to speak then the
translation from the previous statement would catch up,
translation and speaker would then all be talking at the
same time.
English-German Pair 4, English speaker
Content Analysis | Speech Overlap
Content Analysis | Speech Overlap
Speech Overlap
25.0% of the participants mentioned
that they speech overlap was
a problem
French-German Group
Content Analysis | Speech Overlap
Speech Overlap
25.0%
50.0%
Participants from the English-German group perceived
speech overlap as a more severe problem
French-German Group English-German Group
This was partly because German
speakers could respond without
seeing/hearing translation.
Design Guidelines
• Support Users’ Adaptation for Speech Production
• Offer Fallback Strategies with Non-verbal Input
• Support Comprehension of Messages
• Support Users’ Turn Taking
Summary
We used Skype Translator to elicit interaction
problems in using spoken language translation for
interlingual conversation.
Our analysis revealed how people adjusted
(or could not adjust) their behaviors to overcome
the problems.
makeability lab
makeability lab
Image and Icon Credit
- Friendlies by Mo Riza (https://www.flickr.com/photos/moriza/2565606353)
- 133/2011 euroscola by rosipaw (https://www.flickr.com/photos/rosipaw/5768035647/)
- Japantimes (http://www.japantimes.co.jp/news/2014/05/08/world/social-issues-world/births-fall-economies-may-falter)
- Windows8core.com (http://www.windows8core.com/bing-translator-app-for-windows-8rt-lands-in-store-review/)
- Jigsaw puzzle by James Petts
- Speech bubble from Wikimedia Commons (http://commons.wikimedia.org/wiki/File:Speech_bubble.svg)
- Head silhouette from Channelweb (http://www.channelweb.co.uk/crn-uk/news/2244637/good-week-bad-week)
- Microphone from Uxrepo.com (http://uxrepo.com/icon/mic-by-mfg-labs)
- Document from IconArchive (http://www.iconarchive.com/show/mono-general-2-icons-by-custom-icon-design/document-icon.html)
- Funny Signs from Fanpop (http://www.fanpop.com/clubs/picks/images/1771904/title/funny-signs-photo)
- Compass Zoomed by Gwgs (https://www.flickr.com/photos/gwgs/346806882)
- Beaker by Emily van den Heever (https://thenounproject.com/EmvdHeever/)