Emotion Oriented Intelligent Interface
by
Kazuya Mera
DISSERTATION
Submitted to Tokyo Metropolitan Institute of Technology in partial fulfillment of the requirements for the degree of Doctor of Philosophy
March 2003
ACKNOWLEDGEMENTS
The author wishes to express sincere gratitude to his supervisor Professor T. Yamashita,
whose guidance has been greatly helpful in completing this dissertation.
To Professor S. Fukuda, Professor T. Yamaguchi, and Professor T. Takagi, he is
indebted for their continuous support and their valuable comments and suggestions during
the preparation of this dissertation.
The author would also like to thank Professor N. Okada, Professor T. Aizawa, and Dr. T.
Ichimura for their continuous support and their valuable comments and suggestions
during the preparation of this dissertation.
He further wishes to thank Mr. Y. Takehara, Mr. K. Ono, and Mr. S. Kawamoto, fellow
graduate students of the master's course, and Ms. M. Fujisawa, Mr. T. Takagi, Mr. S. Shimada,
Mr. A. Teramoto, Ms. T. Suehiro, Mr. M. Yoshie, Mr. Y. Sato, Mr. T. Inoue, Mr. F. Kurose,
Mr. M. Iemori, Mr. S. Ryu and Mr. S. Hamano, fellow graduate students of the bachelor's
course.
Special thanks are also due to Mr. D. Edward for correcting the language mistakes.
To the colleagues in Yamashita Laboratory (Tokyo Institute of Technology), Natural
Language Processing Laboratory (Hiroshima City University), and SENCE (Hiroshima
City University), the author expresses his thanks for their collaboration.
CONTENTS

CHAPTER 1 INTRODUCTION ____ 1
CHAPTER 2 EMOTION GENERATING CALCULATIONS ____ 6
2.1 Agent Model ____ 6
2.1.1 Agent Model of AIBO ____ 7
2.1.2 MaC Model ____ 8
2.1.3 Agent Model in This Study ____ 9
2.2 Process of Emotion Generating Calculations ____ 10
2.3 Case Frame Representation ____ 13
2.3.1 Case Frame Representation in Japanese ____ 13
2.3.2 Classification of Event Type ____ 13
2.3.3 Construction of Case Frame Representation ____ 14
2.4 Favorite Value Database ____ 16
2.4.1 Favorite Value ____ 16
2.4.2 Default Favorite Value ____ 17
2.4.3 Favorite Value Learning Method ____ 18
2.5 Equations of Emotion Generating Calculations ____ 19
2.5.1 Equations for Event ____ 20
2.5.2 Equations for Attribute ____ 23
2.5.3 Equations for is-a Relationship ____ 24
2.5.4 Favorite Value of Predicate with Negative Aspect ____ 25
2.5.5 Favorite Value of Modified Noun ____ 25
2.6 Emotion Strength Calculation ____ 34
2.7 Example of EGC Method ____ 36
2.8 Experimental Result ____ 36
2.9 Emotion Distinguishing Method based on Emotional Space ____ 40
2.10 Future Work ____ 42
2.11 Conclusion ____ 43
CHAPTER 3 COMPLICATED EMOTION ALLOCATING METHOD BASED ON EMOTION ELICITING CONDITION THEORY ____ 44
3.1 Emotion Discrimination ____ 44
3.2 Emotion Eliciting Condition Theory ____ 49
3.3 Complicated Emotion Allocating Method based on Emotion Eliciting Condition Theory using Emotion Generating Calculation ____ 53
3.3.1 Fortunes of the Others ____ 53
3.3.2 Prospect-Based Emotions ____ 56
3.3.3 Confirmation ____ 58
3.3.4 Well-Being ____ 61
3.3.5 Attribution ____ 62
3.3.6 Well-Being / Attribution ____ 64
3.4 Dependency among Emotion Groups ____ 64
3.5 Example of Complicated Emotion Allocating Method ____ 66
3.6 Experimental Results ____ 71
3.6.1 Experimentation 1 ____ 71
3.6.2 Experimentation 2 ____ 74
3.7 Future Works ____ 75
3.8 Conclusion ____ 77
CHAPTER 4 ANALYSIS OF AFFIRMATIVE/NEGATIVE INTENTIONS FROM USER’S ANSWERS TO YES-NO QUESTIONS ____ 78
4.1 Intention Analyzing Method from the Utterance ____ 78
4.1.1 Understanding Intentions of the Indirect Speech-Act in Natural Language Interfaces ____ 78
4.1.2 Recognizing User Communicative Intention in a Dialogue-Based Consultant System ____ 81
4.2 An Overview of the Affirmative/Negative Intention Analyzing Method ____ 83
4.2.1 Web-based Analytical System of Health Service Needs among Healthy Elderly ____ 83
4.2.2 Affirmative/Negative Intention Analyzing Method for Web-based Analytical System of Health Service ____ 87
4.3 Affirmative/Negative Element ____ 89
4.3.1 Affirmative/Negative Description for the Yes-No Question ____ 89
4.3.2 Direct Expression of Intention in the Response ____ 90
4.3.3 Indirect Expression of Intention in the Response ____ 90
4.3.4 Data Structure Description in Question ____ 93
4.4 Affirmation Value ____ 93
4.4.1 Affirmative Value of the Interjection ____ 93
4.4.2 Affirmative Value of the Verb ____ 94
4.5 Affirmative Value Changing Scale ____ 94
4.5.1 “Affirmative Value Changing Scale” Affected by the Adverb ____ 94
4.5.2 Affirmative Value Change by Modality ____ 97
4.6 Analyzing Intention from Plural Sentences ____ 102
4.7 Example of Our Method ____ 103
4.8 Application of Affirmative/Negative Intention Analyzing Method ____ 104
4.9 Experimental Result ____ 106
4.10 Future Works ____ 109
4.11 Conclusion ____ 111
CHAPTER 5 EMOTION ORIENTED INTERACTION SYSTEMS — FACEMAIL & FACECHAT — ____ 113
5.1 Facial Expression Generating Method by Parallel Sand Glass Type Neural Network ____ 113
5.1.1 Sand Glass Type Neural Network ____ 113
5.1.2 Facial Training Data ____ 117
5.1.3 Learning Experimental Results ____ 120
5.2 JavaFaceMail ____ 127
5.2.1 System Overview ____ 127
5.2.2 Assign Rules to the Facial Expressions ____ 134
5.2.3 Mental Effects by Outputting Facial Expressions ____ 135
5.3 JavaFaceChat ____ 136
5.4 Emotion Oriented Interactive Interface for Raising Students’ Awareness in a Group Lesson ____ 140
5.4.1 System Overview ____ 142
5.4.2 Assign Rules to Detect Variances in the Student’s Awareness ____ 142
5.5 Conclusion ____ 144
CHAPTER 6 CONCLUSION ____ 145
REFERENCES ____ 146
CHAPTER 1 INTRODUCTION
Although we have recently become able to access various media through computer networks, some
problems are noticeable in the interface tools that support communication between human and
computer or between human and human. To achieve natural communication, the development of
interactive communication tools that consider the human mind is expected [1].
Ekman classified body actions relating to communication into verbal information,
paralanguage, and non-verbal information. Verbal information is expressed by strings obtained from
sentences, utterances, and so on. Paralanguage is expressed by rhythm, intonation, and so on.
Non-verbal information is expressed by facial expressions, gestures, blinks, and so on [2]. The
importance of non-verbal information in communication and conversation is especially well known.
Mehrabian, a psychologist, reported that in a conversation, affect reaches the companion through
verbal information (with a weight of 7%), paralanguage (38%), and non-verbal information
(55%) [3].
Various methods have been proposed for perceiving emotions from non-verbal information.
Harashima proposed a method whereby one machine extracts and encodes the variation factors of a
facial expression from the user’s facial image, and another machine located far away receives and
decodes the factors and reconstructs the facial expression by computer graphics [4, 5]. Methods
to analyze emotions from voice information, such as rhythm, frequency, and sentence length,
have been proposed by Uwatoko [6], Kawanami [7], Shigenaga [8], Kadotani [9], et al.
However, people sometimes show facial expressions that differ from their real emotions; for
example, a person may smile even when displeased. In such situations, if the systems above
recognize the user’s emotion as happiness from the smile, the system will never gain the user’s confidence.
Therefore, a method that analyzes the user’s emotions based not only on non-verbal information but
also on verbal information is required.
It is also important to recognize the user’s intention from his/her utterances. If the
flexibility and usability of the interface are inadequate, the user’s confidence will be
spoiled even if the system handles emotions well.
An interface using natural language dialogue, especially voice, is very effective for communicating
the user’s intention and the contents of the dialogue. Voice is suitable for input on a portable
terminal, including situations where the user’s eyes and hands are occupied with another purpose, because
using the voice requires no large devices and leaves the user’s eyes and hands free. There are therefore
many studies of vocal interfaces, such as the DARPA Communicator project, which deals with the
issues of supporting a travel plan [10, 11, 12], Arise, which guides train schedules [13], and many other
studies on conversing with robots [14, 15]. However, in our proposed system, if the system considers
the user’s emotions and the user feels confident in the system, the user will use much informal
language, as in conversations between two people.
The technology to recognize informal spoken language is still far from a practical level. A
voice recognition project in the U.S. developed a method that can recognize 90% or more of
task-dependent speech, but even they have not been able to raise the recognition rate for
informal speech [16]. Furthermore, we will meet many more difficulties with voices that include
hoarseness, cracks, tremors, and dialects, because we are going to apply our system to elderly people.
For these reasons, we propose a method to recognize the user’s mind (emotions and intentions)
from the user’s utterances in natural language conversation in this paper.
First, we propose a method to analyze the user’s emotion concerning the contents of the
utterances.
Generally, utterances in dialogue are represented in sentence form, and they mainly
express events and attribute evaluations. Although there are various types, we employ the 11
event types and 6 attribute types that Okada classified [17], and define a calculation for each type.
These calculations distinguish pleasure/displeasure based on Arnold’s definition of emotion [18]:
“Pleasure/displeasure are aroused from actions relating to the approach/avoidance of liked/disliked
objects.” We use the taste information (Favorite Value: FV) for each object. This calculation method
extracts pleasure and displeasure, and the emotion’s degree is calculated from the synthetic vector
among the FVs of three case elements [19, 20].
However, the classified pleasure/displeasure is too coarse to use in processes that
consider the user’s emotions. There are many kinds of emotions in human society, like “relief,”
“expectation,” “envy,” and so on. Many psychologists have proposed criteria of emotion
evaluation to distinguish such various emotions from simple emotions like pleasure/displeasure.
For example, Wundt proposed three dimensions, namely “pleasure vs. displeasure,” “calmness vs.
tension,” and “relaxation vs. excitement” [21]. Schlosberg proposed three other dimensions: “pleasure
vs. displeasure,” “attention vs. rejection,” and “strong vs. weak activation” [22]. Plutchik proposed
eight primary emotions (anger, disgust, sadness, surprise, fear, joy, acceptance, and
anticipation) and degrees for each emotion [23]. Although these studies are effective for classifying
emotions from the viewpoint of the person who feels them, they are not effective for guessing
another person’s emotions, because they require the degree of the other’s perceptions, such as
excitement and attention.
In this paper, we employ the “emotion eliciting condition theory” proposed by Elliott [24, 25] to classify
the simple emotions (pleasure/displeasure) into complex emotions. The “emotion eliciting condition
theory” was developed for an agent system called the Affective Reasoner, based on Ortony’s theory [26, 27]. We
classify pleasure/displeasure into 20 kinds of complex emotions by checking the emotion eliciting
conditions against the grammatical features of the user’s utterance.
In this theory, we check five conditions: “pleased/displeased about an event,”
“desirable/undesirable event for another,” “prospective event,” “confirmed/unconfirmed event,” and
“approved/disapproved event.” We judge “pleased/displeased about an event” from the
pleasure/displeasure extracted by the method described in the previous paragraph, and
“desirable/undesirable event for another” from the result of the same method using the other
person’s taste information. The remaining conditions are judged from grammatical features such as
adverbs, tenses, aspects, and the subject of the sentence. This method can extract multiple emotions
from one sentence at the same time [28].
On the other hand, there are many studies which analyze the user’s intention from his/her
utterance.
Mima proposed a method for understanding the intention of “indirect speech-acts” in a natural
language interface for operating a computer system [29]. This method detects the user’s
demands about operations by converting the surface structure of the user’s input sentence. However,
it is difficult to apply this method to natural language dialogue, because Japanese tends
to be ambiguous and is tolerant of omission and inversion, while this method has to define the used
words and their order strictly to analyze the intention.
Kumamoto also proposed a method to recognize a user’s communicative intention (CI) from
natural language dialogue in order to support the usage of a computer. This method extracts
“function words,” which indicate the features that determine the CI type, and the user’s intention is
guessed using the determined CI type and grammatical features.
In this paper, we propose a method to analyze a user’s affirmative/negative intention from the
response to yes-no questions, based on these methods [30, 31, 32]. The affirmative/negative intention is
expressed not as a binary value but as a real value in the range [0.0, 1.0], to capture the ambiguity of
the user’s intention. We analyze the intention by extracting the words that indicate
affirmation/negation (affirmative/negative elements), similar to Kumamoto’s function words. There are
three types of affirmative/negative elements: “affirmative/negative description for the yes-no question,”
“direct expression of intention in the response,” and “indirect expression of intention in the
response.” “Affirmative/negative descriptions for the yes-no question” have no independent
meaning, but they show the intention by referring to the content of the question. “Direct expressions
of intention in the response” are derivatives of the verb and auxiliary verb in the question.
Although there are many types of “indirect expression of intention in the response” [33], we employ
three of them: “indirect information addition,” “non-standard reason addition,” and “standard reason
addition.” Their role is to let the system guess the intention from the reasons mentioned for it.
Each affirmative/negative element has a degree of affirmation/negation (an affirmative value) in the
range [0.0, 1.0]. The total affirmative/negative intention of the user’s utterance is then calculated
as the average of the affirmative values of the extracted affirmative/negative elements. This method
can deal not only with complete sentences but also with sentences containing omissions, incomplete
expressions, and wrong voice recognition results, because it refers not to the overall
grammar of the sentence but to partial word units.
As described in the former paragraphs, we recognize the user’s mind (emotions and intentions)
from the user’s utterances in natural language conversation. The extracted intentions of the user
will be sent from our proposed interface to the body of the system, but we also have to consider
how to express the extracted emotions.
As agent systems that express emotions, many pet robots have been developed, such as AIBO [34, 35] by
Sony, the MaC model [36, 37] by OMRON, PaPeRo [38, 39] by NEC, and ComoComo [40] by Toshiba. All these
agents recognize the present environment through their own sensors and generate emotions
from their own viewpoint.
We aim for smooth human-computer communication, like human-to-human communication.
Although expressing only the agent’s own emotions is enough for a robot to be treated as the user’s
pet and loved, it is not enough for a computer interface to obtain the user’s reliance.
In order to be believed and relied on by the user, we propose a method to express emotions
synchronized with the user’s emotions.
Heider proposed the P-O-X theory, which is expressed as follows. P stands for a person, O is
another person, and X is an impersonal entity (topic, subject, event) that P and O have an opinion
about. When P and O have the same opinion (approval/disapproval) about X, they will approve
of each other [41, 42].
In this study, P is the user, O is the computer system, and X is the topic in the
conversation. If the computer analyzes the user’s emotion about the topic based on the user’s taste
information and expresses that emotion by facial expression, the user will come to like the computer.
We aim to obtain an affinity with the user.
We employ facial expressions to express the analyzed emotions. Many researchers point to the
importance of non-verbal information conveyed by the face through the sense of sight [43].
Facial expressions play important roles in expressing emotions, even more important than
verbal information. We can express emotions by facial expressions alone, and the basic facial
expressions are common throughout the world. There are many studies that analyze and express facial
expressions and apply them to dialogue systems [1, 44].
Ichimura proposed a method to generate a facial expression image based on six emotions (anger,
happiness, sadness, surprise, fear, and disgust) using a “parallel sand glass type neural network.” We
propose a method to generate the user’s facial expressions based on the emotions extracted by the
EGC method. This method uses a “sand glass type neural network” trained on real facial images.
First, we classify the emotions for facial expressions as “happiness,” “sadness,” “disgust,” “anger,”
“fear,” and “surprise,” as proposed by Ekman [44]. By training the neural network on these types
of facial expressions, each emotion is partitioned on the two-dimensional emotional space
constructed by the outputs of the third layer of the neural network.
To employ the emotional space, we assign the EGC output (20 kinds of emotions) to the
input of the two-dimensional emotional space (6 kinds of emotions), as described in Section 5.2.2.
A point on the two-dimensional emotional space is then determined from the assigned
emotions [45, 46, 47].
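As a hedged illustration of this assignment, the sketch below maps a few complex emotions onto the six basic facial-expression emotions and derives a point on the two-dimensional space. Both tables are hypothetical placeholders: the actual assignment rules appear in Section 5.2.2, and the real coordinates come from the trained sand glass type neural network.

```python
# Hypothetical mapping from EGC's complex emotions to the six basic emotions.
EGC_TO_BASIC = {"joy": "happiness", "distress": "sadness",
                "fears-confirmed": "fear", "reproach": "anger"}
# Hypothetical per-emotion coordinates on the 2-D emotional space.
BASIC_COORDS = {"happiness": (0.8, 0.6), "sadness": (-0.7, -0.4),
                "fear": (-0.5, 0.7), "anger": (-0.8, 0.3)}

def emotional_space_point(egc_emotions):
    """Map extracted emotions to basic emotions and take the centroid."""
    pts = [BASIC_COORDS[EGC_TO_BASIC[e]] for e in egc_emotions
           if e in EGC_TO_BASIC]
    if not pts:
        return (0.0, 0.0)                               # neutral face
    xs, ys = zip(*pts)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

print(emotional_space_point(["joy", "fears-confirmed"]))  # (0.15, 0.65)
```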
We applied this method to mail software and a chat system. The mail software (JavaFaceMail)
calculates the emotions from the content of the mail, generates a facial expression image of the
sender, and sends the mail with the facial image. The chat system (JavaFaceChat) also generates a
facial expression image, like JavaFaceMail. Furthermore, it analyzes the variance of emotions for
each user and invites two users to a new closed chat room when their tendencies of variance are
alike.
CHAPTER 2 EMOTION GENERATING CALCULATIONS
Although human-computer interfaces are still constructed with only the machine’s circumstances
in mind, we have many opportunities to deal with computers. Such interfaces are
inconvenient, especially for elderly people and the handicapped, because they often do not know
how to operate them and cannot manipulate the tools, such as inputting data by keyboard or clicking
small buttons with a mouse. We consider that advanced interfaces, such as communicating
by natural language, are convenient even for people in normal health. However, to understand the
intention in an utterance and to achieve natural communication between a human and the system, we
have to consider human emotion, which is important. The concept of the emotional computer
used to be unfamiliar, but recently it has become more popular, as exemplified by AIBO, made
by Sony [34, 35].
We propose an agent model that expresses its own emotions and calculates the
user’s emotions. When the agent recognizes an event inferred from stimuli outside the agent,
it calculates emotions by evaluating the event based on the individual’s likes and dislikes.
However, it is very difficult to evaluate all events in the world. There are two processes, emotion
generation and emotion analysis, in emotion processing. We believe our method can deal with both
processes; however, we limit ourselves to the emotion analysis process in this paper, as generating
the computer’s own emotions has not yet gained consensus regarding the problems of validity and
morality. In this paper, we restrict the stimulus to the natural language of the sentence, as we are
going to adopt the agent model for a natural language interface.
In this section, we propose a method by which the agent calculates the emotions expected to arise in
the human with regard to an event that the agent recognizes. The agent calculates the emotions by
substituting the values of the words’ impressions about likes/dislikes (FVs) into the equation prepared
for each event type. The strength of the emotion is calculated from the diagonal length of the rectangular
solid consisting of all the terms, which are mainly Subject, Object, and Predicate, in the equation.
Furthermore, the procedures to construct the default FV database and to learn FVs from the dialogue are
described. The calculated emotions are expressed by facial expressions and responses; these methods
are presented in Chapters 5 and 6. We evaluated the method’s validity by comparing the emotions
extracted by the system against the responses of various individuals.

2.1 Agent Model
Our purpose is to realize natural dialogue between human and computer by considering
human emotion. In this section, we propose an agent model that can make adequate responses and
express adequate emotions for the verbal input of a human. First, we introduce
some emotional agents that generate and express emotions in response to stimuli from the external
world, such as Sony’s AIBO [34, 35] and OMRON’s MaC model [36, 37]. Next, we explain the
structure of our agent model.
2.1.1 Agent Model of AIBO

Sony made an entertainment robot called AIBO. It is an autonomous walking machine in the real
world, and it recognizes the environment through external/internal sensors such as a camera, a
microphone, touch sensors, a battery level sensor, and so on. AIBO generates instincts from the state
of the sensors and elicits emotions based on the instincts. It then calculates its emotion from the
instincts and expresses its emotions as gestures, light signals, and sounds.
AIBO has five instincts: to sleep, to be petted, to charge, to explore, and to play with someone. It
acts on its instincts and uses expressive gestures to tell you its desires. AIBO has six emotions: joy,
sadness, anger, surprise, fear, and discontent. It expresses these emotions through its horn lights,
sounds, and gestures [34, 35].
Figure 2.1 shows an agent model of AIBO, drawn by the author based on the above
explanation.
Figure 2.1 Agent model of AIBO (diagram: outer stimuli from the external world reach the external sensors and inner stimuli reach the internal sensors; these feed the instinct domain (to sleep, to be a pet, charge, explore, and play with someone) and the emotion domain (joy, sadness, anger, surprise, fear, and discontent), which drives the actuators producing gestures, sounds, and light signals)
2.1.2 MaC Model

Ushida et al. propose an emotion model, the MaC (Mind and Consciousness) model, for life-like
agents with emotions and motivations, as shown in Figure 2.2. This model consists of reactive and
deliberative mechanisms. The former mechanism covers direct mapping from sensors to effectors.
The deliberative mechanism has two processes: the cognitive and the emotional process. The
cognitive process executes recognition, decision-making, and planning. The emotional process
generates emotions according to cognitive appraisals.
The process of emotion generation is divided into two steps, as shown in Figure 2.3. In the first step,
emotional factors (i.e., desirability, praiseworthiness, and appealingness) are computed. The levels of
the emotional factors are obtained using the emotion eliciting condition rules. This model uses seven
emotional factors: goal success level (GSL), goal failure level (GFL), blameworthy level
(BWL), pleasant feeling level (PFL), unpleasant feeling level (UFL), unexpected level (UEL), and
goal crisis level (GCL). The second step computes emotion intensities, which are
obtained using the emotional factors, time decay, and other emotions. Emotional factors influence the
intensities through production rules [36, 37].
Figure 2.2 Conceptual model of the mind and consciousness (diagram: sensors feed a reflex path directly to effectors, and a deliberative path in which the cognitive process (recognition, decision making, planning) produces a cognitive appraisal for the emotional process (desirability, praiseworthiness, appealingness), whose generated emotion acts on the environment through the effectors)
2.1.3 Agent Model in This Study
The agent models shown in the former sections gain stimuli from the external world and
generate emotions based on those stimuli. They then express the emotions using effectors and actuators;
to express emotions, AIBO and the MaC model use actions. We propose an agent model
that can deal with natural language and emotion to communicate with a human being naturally, as shown
in Figure 2.4. The natural language processing in the agent model consists of three main parts:
(1) Analysis of the input utterance (sentence analysis domain),
(2) Decision of what-to-say (dialogue planning domain),
(3) Decision of how-to-say (sentence generation domain).
When a user’s utterance comes from the “external world,” the “sentence analysis domain” analyzes the
utterance content and extracts the user’s intention. Next, the “dialogue planning domain” composes the
response content based on the user’s intention, the conversation log, and the present feeling. Then
the “sentence generation domain” produces the response and outputs it to the “external world.”
These domains transfer their data through the “internal world domain,” which works as the short-term
memory and manages the agent model, the user model, and the current situation. The “memory
management domain,” on the other hand, works as the long-term memory and accumulates knowledge and experience.
Figure 2.3 Framework for emotion generation (diagram: situation features such as the current goal, the current object, the distance to the object, the object’s contribution to the goal, and others’ actions feed the emotion eliciting condition rules — e.g., “If getting an object succeeds & its contribution degree to a goal is high & the goal’s importance is high, then the GSL is high” — which set the emotional factors GSL, GFL, BWL, PFL, UFL, UEL, and GCL; fuzzy inference then maps the factors to the emotions happiness, anger, sadness, disgust, fear, and surprise)
In the emotional process within the model, the “emotion generation domain” observes the output of
the “sentence analysis domain” through the “internal world” and extracts emotions from the various
events expressed in the user’s utterances. The “face selection domain” receives these emotions,
selects an adequate facial expression using a neural network, and draws the facial
expression on the display. The extracted emotions also influence domains other than the “face
selection domain.”
2.2 Process of Emotion Generating Calculations

We present the Emotion Generating Calculations (EGC) method, which extracts emotions from
utterances. This method is constructed by focusing on the similarities between grammatical
structures and the semantic structures within the case frame representation. The input of our agent
model is the sentence of the user’s utterance, and the outputs are responses by utterance and facial
expression. Figure 2.5 shows the procedure of our EGC method.
First, the user’s utterances are transcribed into the case frame representation based on the results
of morphological analysis and parsing, because the input form of our proposed method is the case
frame representation.
Figure 2.4 Agent model of our system (diagram: utterances from the external world enter the sentence analysis domain; the dialogue planning domain, sentence generation domain, emotion generation domain, and face selection domain exchange data through the internal world, supported by the memory management domain; the agent outputs an utterance and a facial expression to the external world)
Next, the agent extracts pleasure/displeasure and its strength from an event described by the
case frame representation. In the psychological field, “unpleasure” is often used as the opposite of
“pleasure”; however, we use “displeasure,” because an explicit intention of “unhappy” should be
indicated. The agent performs morphological analysis and parsing on the input utterance before this process.
Then, the agent calculates the degree of pleasure/displeasure from the diagonal length of the rectangular
solid consisting of all the terms in the EGC equation. EGC uses eight types of equations for the 12 event
types classified by Okada [17]. The agent substitutes the word concepts’ impression degrees about
likes/dislikes (FVs) into the equations. The equations consist of two or three terms, which mainly
mean subject, object, and predicate. The method also calculates the degree of the extracted emotion
(Emotion Value: EV) using the FVs of these terms. Furthermore, negatives and noun phrases are
also used in these calculations.
Then, the agent divides this simple emotion (pleasure/displeasure) into 20 various emotions based
on Elliott’s “emotion eliciting condition theory.” Elliott’s theory requires judging conditions
such as “feeling for another,” “prospect and confirmation,” and “approval/disapproval.”
“Feeling for another” means someone else’s emotion about the utterance’s content, and it is
judged from EGC’s result using the other person’s taste information. The method extracts aspects
and adverbs about the tense to judge “prospect and confirmation.” “Approval/disapproval” is judged
from the utterance’s case frame representation with a transitive verb.
This method calculates not the agent’s own emotions, as AIBO and the MaC model do, but the user’s
emotions. This enables adequate facial expressions that sympathize with the user’s emotions, and
helps the agent avoid utterances that cause displeasure.
Figure 2.5 Procedure of the EGC method (flow: input utterance → morphological analyzing → parsing → case frame representation → EGC method, which refers to the Favorite Value database and the EGC equations database → pleasure/displeasure → classification of the distinguished pleasure/displeasure using tense and aspects → 20 types of emotions, passed to the dialogue planning domain and the face selection domain)
2.3 Case Frame Representation

2.3.1 Case Frame Representation in Japanese

The case frame structure is based on the predicate phrase and the other case elements that connect to
the predicate phrase. There are two types of case frame structure: the “surface structure,” based on the word
string of a sentence, and the “deep structure,” based on the content of a sentence. Since we deal with
Japanese in this study, it is very difficult to analyze the surface structure, because Japanese sentences
can be formed without a subject or object, and particles with various functions, such as “WA,” are
often used [48]. In this study, the deep structure is used in order to avoid the ambiguities that exist
in Japanese.
2.3.2 Classification of Event Type
We meet an infinite number of events in this world, and it is impossible to propose
emotion generating rules for each event. We therefore have to classify the events.
Okada classified event concepts into “simple event concepts,” which are represented by a
connection of case elements, and “combined event concepts,” which are represented by combinations
of simple event concepts. In this paper, we deal only with “simple event concepts,” because
“combined event concepts” can be dealt with later once a method for “simple event concepts” is established.
Okada presented the following case element types to express an event: Subject, Object, Object-From,
Object-To, Object-Mutual, Object-Source, Object-Content, Implement, Location, Time, Reason, and
Degree [17]. Okada also defined seven essential elements (Subject, Object, Object-From, Object-To,
Object-Mutual, Object-Content, and Implement) as the minimum elements necessary for recognizing the
event. Okada classified the simple event concepts recorded in a classified vocabulary chart [49] into
12 types based on these minimum necessary elements. Table 2.1 shows all the types and
their examples. Types I-V are intransitive verbs, types VI-XI are transitive verbs, and type XII holds
the remaining events.
We presume that the events within an event type share the same semantic structure as
recognized by humans. For example, although “Smoke goes up a chimney.” and “The man left
town.” feel like completely different events, both have the common form “the Subject’s place changes
from one location to another.” We propose a method to generate emotion for each event type [19, 20].
Table 2.1 Event types
Type Event type Example sentence
I V(S) I run.
II V(S, OF) The man left town.
III V(S, OT) He goes to school.
IV V(S, OM) The beer is mixed with water.
V V(S, OS) The child disobeys his parents.
VI V(S, O) He bends a branch.
VII V(S, O, OF) The driver unloaded the baggage from the car.
VIII V(S, O, OT) He put the book into his bag.
IX V(S, O, OM) The man bumped the enemy’s head against a wall.
X V(S, O, I) He carved wood with his knife.
XI V(S, O, OC) I feel the wind refreshingly.
XII Others
2.3.3 Construction of Case Frame Representation

In order to transcribe the user’s utterances into the case frame representation, we first apply
morphological analysis and parsing to the input sentence. We use JUMAN as the morphological
analyzer and KNP as the parser. ChaSen is a popular morphological analyzer for Japanese, but we
adopted JUMAN because KNP requires it. Both JUMAN and KNP were developed at
Kyoto University [50].
Figure 2.6 shows an example of the JUMAN and KNP processes. In this example, the sentence
“KARE GA WATASHI NO KURUMA WO KOWASHITA. (He broke my car.)” is input. First,
JUMAN separates the sentence into seven morphemes. Next, KNP analyzes their relationships and
constructs the grammatical structure.
This process yields the grammatical structure of the sentence. However, its result
is expressed as a surface structure, while our method needs the deep structure. Next, all case elements are
classified based on their particles. KNP outputs the surface case frame structure, whose
case elements are named by their particle names, such as GA-case and WO-case. We therefore propose
the translation rules shown in Table 2.2, which translate these case names into deep-structure names
based on the usage of the particles [51].
Figure 2.6 Example of morphological analyzing and parsing

Input sentence: 「彼が私の車を壊した。」 (“KARE GA WATASHI NO KURUMA WO KOWASHITA.” He broke my car.)
Morphological analyzing (JUMAN): the sentence is separated into seven morphemes: 彼 (KARE), が (GA), 私 (WATASHI), の (NO), 車 (KURUMA), を (WO), 壊した (KOWASHITA).
Parsing (KNP): the dependency structure among the phrases 彼が (KARE GA), 私の (WATASHI NO), 車を (KURUMA WO), and 壊した (KOWASHITA) is constructed (raw JUMAN/KNP output omitted).
Table 2.2 Translation rules from particle case element names to deep case element names

Particle: GA → Subject; KARA → Object-From; NI/E/MADE → Object-To; NI → Object-Mutual; TO → Object-Source; WO → Object; DE → Implement; WA/MO → Subject/Object
In these rules, the NI-case can be classified as either “Object-To” or “Object-Mutual”; it is
disambiguated by the event type of the predicate in the sentence. Similarly, the WA-case and MO-case
can be classified as “Subject” or “Object”: when the sentence has no “Subject,” the case becomes
“Subject”; when the sentence has no “Object,” the case becomes “Object”; and when neither
exists, it is provisionally treated as “Subject.”
The case frame representation also presents tense and aspect information, which is
extracted from the auxiliary verbs in the predicate phrase. We limit the considered tense/aspect
to past, future, and negation, because these are effective in generating emotion. The sentence in
Figure 2.6 is transcribed into the following deep case frame structure:
Predicate: KOWASU
Subject: KARE
Object: WATASHI NO KURUMA
Tense: past
Aspect: nothing
2.4 Favorite Value Database

2.4.1 Favorite Value

We calculate pleasure/displeasure about an event by substituting the value that expresses the degree of
like/dislike (FV) into the EGC equations. We give positive numbers to objects that the user
likes, and negative numbers to objects that the user dislikes. Each FV is
predefined as a real number in the range [–1.0, 1.0].
There are two types of FV: the personal FV and the default FV. A personal FV is stored in a personal
database for each person whom the agent knows well, and it shows the degree of like/dislike for an
object from that person’s viewpoint. A default FV, on the other hand, shows the common degree of
like/dislike for an object as the agent feels it. Generally, it is generated based on the agent’s own taste
information according to the results of questionnaires. Both personal and default FVs are stored
in each user’s Favorite Value database. An object’s FV is retrieved by the following procedure
(Figure 2.7):
1. Retrieve the object in the personal Favorite Value database.
2. Retrieve the upper concept’s FV in the default value database.
3. Retrieve a further upper concept’s FV in the default value database.
4. Retrieve the object or an upper concept in the default Favorite Value database.
5. Give the object the value 0 as its FV when there is no information in any database.
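A minimal sketch of this retrieval follows: look the object up in the personal FV database first, then climb the is-a hierarchy into the default FV database, and fall back to 0. The small databases and hierarchy below are illustrative assumptions taken from the example in Figure 2.7.

```python
# Hypothetical databases; values follow the Figure 2.7 example.
PERSONAL_FV = {"Pochi (Taro's dog)": -0.8, "Dachshund": 0.7}
DEFAULT_FV = {"Dog": -0.5}
UPPER_CONCEPT = {"This dog": "Spitz", "Spitz": "Dog",
                 "Doberman": "Dog", "Dachshund": "Dog"}

def favorite_value(obj):
    if obj in PERSONAL_FV:                  # step 1: personal database
        return PERSONAL_FV[obj]
    node = obj
    while node is not None:                 # steps 2-4: climb to upper concepts
        if node in DEFAULT_FV:
            return DEFAULT_FV[node]
        node = UPPER_CONCEPT.get(node)
    return 0.0                              # step 5: no information anywhere

print(favorite_value("This dog"))           # -0.5, inherited from "Dog"
```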
2.4.2 Default Favorite Value

Default FVs are predefined based on a corpus from the field to which the system is applied. Objects
(nouns), event cores (verbs), and attributes (adjectives) have FVs.
First, we predefined the attributes’ FVs based on the “Dictionary about Usage of Present-day
Adjectives [52].” This book contains a list of adjective images: the positive/negative images of
1,010 adjectives and their degrees, listed in the range [–3, +3]. However, some adjectives carry
both images. For example, when the word “cool” is used about temperature, it means “not so cold,
comfortable”; when it is used about eagerness, it means “not eager, lacking
will.” In this paper, we did not deal with such words, because identifying the difference in
meaning is very difficult.
Next, we predefined the favorite degrees of the event cores. In the EGC method, pleasure/displeasure
is extracted based on the approach/avoidance of a likable/dislikable object. We therefore gave positive
numbers to verbs related to “gain” and negative numbers to verbs related to “lose.”
The FVs of the objects were gained from a questionnaire. For this purpose, we constructed a
favorite-data collecting system on the WWW. It shows some nouns, each with an input frame for its FV,
and the subjects input a real number in the range [–1.0, 1.0] for each word. We adopted the average of
all subjects’ reply values as an object’s default FV. However, there are countless objects in the world. In
this paper, we limited the objects that have a default FV to the words that frequently appear in dialogue
about the field where our method is applied.

Figure 2.7 Retrieving a further upper concept’s favorite value (hierarchy diagram: “Dog” FV = −0.5; its subconcepts “Spitz” FV = null, “Doberman” FV = −0.5, “Dachshund” FV = 0.7; “Pochi (Taro’s dog)” FV = −0.8; “This dog” FV = null inherits −0.5 from “Dog”)
2.4.3 Favorite Value Learning Method

The EGC method needs objects’ FVs, and the values are predefined from a questionnaire on the WWW as
described in Section 2.4.2. However, tastes differ greatly among people, so a common predefined FV
database does not suit every user; even for the same person, preferences for objects can change easily.
People therefore generally guess a partner’s taste information from the dialogue.
We propose four FV learning methods that learn the user’s taste information from the dialogue
using grammatical knowledge and already-known FVs [53]:
1) Direct expression of like/dislike
2) Favorite Value changing situations
3) Associating displeasure with an object
4) Backward calculation from the emotional expression
1) Direct expression of like/dislike

When we guess a person’s taste information, we pay attention to the words “like” and “dislike,”
which are used to tell one’s impression of something. In this method, when the sentence’s
nominative is the person and its predicate is like/dislike, the word in the object frame is regarded as
liked/disliked by the person.
Some adjectives also have good/bad images. These images are identified using the “standard
good/bad image of adjectives table [52],” and adjectives that have good/bad images are dealt with
in the same way as “like” and “dislike.”
For example, when the agent hears the sentence “I like apples,” the agent recognizes that the user’s
taste for apples is “like”; when the sentence is “My sister is shameless,” the agent can guess
that the user does not like his/her sister, because the word “shameless” has a bad image.
2) Favorite Value changing situations

An FV naturally increases when an object does something useful or pleasing to the agent; it decreases,
on the other hand, when an object does something harmful or unfavorable. The current FV for the
predicate of an event is assigned a predetermined numerical value.
In this approach, the FV of an object is calculated, from the agent’s knowledge structure, based on
the situations that will influence its FV. Such situations are called “Favorite Value changing
situations” and are defined by the following three slots: Condition (events), Situation (the situation
represented by the condition), and Favorite Value change (increase or decrease). Here is an example.
This example situation indicates that the agent dislikes a person who dates two people at the same
time: the situation in which a person P1 is dating two persons causes the FV for P1 to decrease.
Condition: (date ((Subj P1)(Obj-M P2))) and (date ((Obj-M P1)(Goal P3)))
Situation: P1 is dating two persons at the same time.
Favorite Value change: FV for P1 decreases
3) Associating displeasure with an object

An object that often participates in something displeasing tends to be disliked, because it is
associated with past displeasing events: when a person encounters unpleasant events, he will come
to hate the objects that participate in them. Therefore, we decrease the FV of every object that
appears in a displeasing utterance. We define the degree of FV change by this method as the smallest
among our four FV learning methods, because the effect arises only from the recurrence of similar
situations.
For example, when the agent hears the sentences “I was struck by my brother with a bat
yesterday.” and “I was scolded by the captain for forgetting my bat.”, the agent comes to guess that
the user somewhat hates bats.
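A minimal sketch of this learning method follows: every object appearing in an utterance judged displeasing has its FV decreased by a small step. The step size and the clipping to [−1.0, 1.0] are assumptions; the text only requires that this method change FVs least among the four learning methods.

```python
STEP = 0.05                          # deliberately small, assumed decrement

def associate_displeasure(fv_db, objects_in_utterance):
    """Decrease the FV of every object that appeared in a displeasing event."""
    for obj in objects_in_utterance:
        fv_db[obj] = max(-1.0, fv_db.get(obj, 0.0) - STEP)
    return fv_db

fv_db = {"bat": 0.0}
associate_displeasure(fv_db, ["bat", "brother"])   # "struck ... with a bat"
associate_displeasure(fv_db, ["bat", "captain"])   # "... forgetting my bat"
print(fv_db["bat"])                                # -0.1 after two events
```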
4) Backward calculation from the emotional expression

This method guesses the user’s impression of the utterance’s content from the emotional
expression shown while speaking. For example, the user probably feels pleasant about the utterance’s
content when he smiles or his voice sounds pleasant. Guessing the user’s impression is possible using
not only non-verbal expression but also cause-and-effect relations such as “I am sad because …” and
“…, however, I feel happy.” [54]
If a sentence contains a word whose FV is not defined and the agent has already guessed the
user’s impression of the sentence, the undefined word’s FV can be guessed by calculating EGC
backward.
2.5 Equations of Emotion Generating Calculations

Arnold defined “emotions” as tendencies of activation toward the approach/avoidance of good/bad
objects [18]. We define the EGC equations for each of the event types described in Section 2.3.2. These
equations are used for detecting the user’s pleasure/displeasure. First, we assume the following
conditions for extracting pleasure, based on Arnold’s definition. The conditions for extracting
displeasure are the opposite of the pleasure conditions.
1. A favorite agent gains a benefit. / A detestable agent suffers a loss.
2. The condition of a favorite/detestable agent becomes better/worse.
3. A favorite/detestable agent gets a good/bad evaluation.
4. A favorite/detestable agent has a favorite/detestable attribute.
In this section, we propose the equations for each case frame type. We define the following
variables in the equations, based on Okada’s classification [17] described in Section 2.3.2.
fS : FV of Subject
fO : FV of Object
fOF : FV of Object-From
fOT : FV of Object-To
fOM : FV of Object-Mutual
fOS : FV of Object-Source
fOC : FV of Object-Content
fP : FV of Predicate
2.5.1 Equations for Event

We define the equations for each event type shown in Table 2.1.

Type I: An event of type I expresses “Subject (S) does Predicate (P), and the influence reaches S.”
The relationship between these elements’ FVs and the generated emotion is shown in Table 2.3, based
on the pleasure extracting condition “A favorite agent gains a benefit. / A detestable agent suffers a
loss.” The equation for this event type is therefore expressed as the product of fS and fP. The pleasure
when “a detestable agent suffers a loss” means “it serves him/her right.”

EV = fS × fP    (1)
Table 2.3 Relationship between FVs and generated emotion

Event Core (Predicate) \ Subject   Like (+)          0   Dislike (–)
Benefit (+)                        Pleasure (+)      0   Displeasure (–)
Suffer (–)                         Displeasure (–)   0   Pleasure (+)
Type II and III: The events of types II and III express “the state of Subject (S), which has a
relation to Predicate (P), changes from Object-From (OF) to Object-To (OT).”
When the event means a “change of position or quantity,” like “go” and “stray,” we judge whether
the present state is becoming better or worse from the difference between fOT and fOF. We then
calculate pleasure/displeasure as the product of fS and (fOT – fOF), in the same way as type I. We give
fOT in type II and fOF in type III the value 0 as a default position when they are not stated explicitly in
the dialogue.

EV = fS × (fOT – fOF)    (2)
However, event type III also contains events that express a “change of mind or feeling,” like “aspire”
and “be suited.” For these, we give fOF the value 0, because only the content of Object-To affects the
emotion. “Changing for the better/worse” is then expressed by the sign of the Predicate’s FV.

EV = fS × (fOT – 0) × fP    (3)

We therefore define the following equation, which combines (2) and (3), for event types II and III. When
the event means a “change of position or quantity,” we give the event core’s FV a positive number, in
keeping with equation (2).

EV = fS × (fOT – fOF) × fP    (4)
Type IV: The events of type IV express “Subject (S) and Object-Mutual (OM) have a relation to
Predicate (P).” Figure 2.8 shows the relationship between the FVs of S, OM, and P and the generated
emotion. In this figure, the signs beside the arrows are the FVs of the Subject or Object-Mutual. The FVs
of predicates that mean closeness are positive, and the FVs of predicates that mean avoidance are
negative. We can then obtain the EV as the product of fS, fOM, and fP, based on the relationships in
Figure 2.8.

EV = fS × fOM × fP    (5)
Figure 2.8 Relation between the favorite values of S, OM, P and the generated emotion (diagram: pleasure events are (+, +, closeness), (–, –, closeness), and (+, –, avoidance); displeasure events are (+, –, closeness), (+, +, avoidance), and (–, –, avoidance))
Type V: The events of type V express “Subject (S) and Object-Source (OS) do Predicate (P) at
the same time.” For example, the event “The child disobeys his parents.” includes two viewpoints,
“the child is defiant” and “his parents are disobeyed.” So even if the agent does not care about the
child, the agent will feel sorry for the parents if the agent likes them. When we calculate the emotion
for the opposite viewpoint, we reverse the sign of the Predicate’s FV, because the meaning of the
predicate is also reversed.

EV = (fS × fP) + (fOS × (–fP)) = (fS – fOS) × fP    (6)
There are other predicates in type V, like “adhere” and “originate.” We do not define any equation for
them, because there are no examples that arouse pleasure/displeasure.

Type VI: There are two subtypes in type VI: paying attention to the Subject’s action, as with “like” and
“dance,” and paying attention to the Object’s action, as with “bake” and “turn over.” The former events
express “Subject (S) does Predicate (P) on Object (O).” We define the equation as the product of
not only fS and fP but also fO, because the Object also has a large effect on the action.

EV = fS × fO × fP    (7)
The latter events, on the other hand, express “Object has Predicate done to it by Subject.” An event
belongs to this subtype when its predicate is a transitive verb. In this case, we focus on “Object is done
Predicate.” We use the FVs of “do” and “be done” properly, because these two predicates are treated
as different words in Japanese. The variable fP in the following equation means the FV of “the action
of the Predicate is done.”

EV = fO × fP    (8)
Type VII and VIII: We define an equation based on the same idea as for event types II and III.
However, the agent of the event is not the Subject but the Object in these event types, because their
predicates are transitive verbs, as in the latter subtype of type VI. We therefore replace fS with fO in
equation (4).

EV = fO × (fOT – fOF) × fP    (9)
Type IX: The events of type IX express “Object (O) and Object-Mutual (OM) are given a relation
to Predicate (P) by Subject (S),” as in event type IV. We therefore replace fS with fO in equation (5),
as for types VI, VII, and VIII.

EV = fO × fOM × fP    (10)
However, some predicates of this type express “exchange.” In that case, who owns the objects is
important for understanding the benefit or loss. For example, for the event “Taro
substituted the rolls of bills for those of counterfeit bills,” nobody can tell who gains a benefit
or suffers a loss, because there is no information about who owns the bills or the
counterfeit bills. To obtain the owner information, we would have to consult not only the
content of the event but also the conversation log, the concept of the object, common knowledge, and
so on. We therefore define no equation for “exchange” predicates, as our present agent model deals
only with emotion generation from a single event.
Type X: Because the case element Implement affects only the degree of the generated emotion, the
meaning of a type X event is similar to that of the latter subtype of type VI. The equation for this
event type is therefore the same as equation (8).

EV = fO × fP
Type XI: The core content of this type’s event is the part “Object (O) is Object-Content (OC).” We
regard the emotion generated from this part as the emotion from the whole event.

EV = fO × fOC    (11)
We were not able to define any equation for type XII (others), because the features of its
predicates are too various to unify their concepts.
2.5.2 Equations for Attribute

Okada also classified the attribute concepts recorded in the classified vocabulary chart [49] into
seven types based on the minimum elements necessary for the attribute concepts. Table 2.4 shows
all the types and their examples. In this table, there is a new case element type, C (Comparative-object),
and “A” means Attribute. Furthermore, in the following equations, fC means the FV of the
Comparative-object and fP means the FV of the Attribute [19, 20].
Table 2.4 Types of attribute concept

Type   Attribute type   Example sentence
I      A (S, C)         He is taller than her.
II     A (S, OF, C)     Japan is farther from Europe than America.
III    A (S, OT, C)     Japan is closer to Europe than America.
IV     A (S, OM, C)     (no example)
V      A (S, OS, C)     Taro is more knowledgeable about chemistry than mathematics.
VI     A (S, O, C)      Hanako likes oranges better than apples.
VII    Others           A is equal to B.
Type I to V: Because OF, OT, OM, OS, and C are used only to express the degree of the attribute,
only the case elements Subject and Attribute relate to pleasure/displeasure. The relationship between
these elements’ FVs and the generated emotion is shown in Table 2.5, based on the pleasure extracting
condition “A favorite agent gains a benefit. / A detestable agent suffers a loss.” The equation for
these attribute types is therefore expressed as the product of fS and fP.

EV = fS × fP
Table 2.5 Relationship between FVs and generated emotion

Attribute (Predicate) \ Subject   Like (+)          0   Dislike (–)
Favorite (+)                      Pleasure (+)      0   Displeasure (–)
Detestable (–)                    Displeasure (–)   0   Pleasure (+)
Type VI: In this type, the Attribute evaluates not the Subject but the Object. We therefore give this
type the same equation (8), since the Object and the Attribute are related.

EV = fO × fP

We were not able to define any equation for type VII (others), because the features of its
predicates are too various to unify their concepts.
2.5.3 Equations for is-a Relationship
There are three concept types expressed in a sentence: the event concept using a verb, the attribute concept using an adjective, and the is-a concept using “is.” The form of the is-a relationship concept is mainly “Subject (S) is Noun (N),” and there are the following three types of relations between S and N [51].
1. S is a kind of N. (e.g. “Hamlet is a story written in the medieval times.”)
2. S and N are the same object. (e.g. “Shakespeare is a writer of Hamlet.”)
3. There is no direct relationship between S and N. (e.g. “Boku wa Hamlet da.” (in Japanese))
The third type of expression does not exist in English. In Japanese, the meaning of this expression depends on the topic. However, all three types of expressions mean that there is a relationship between S and N. Then, the equation for this concept type is expressed as the product of f_S and f_P (the FV of the Noun).
EV = f_S × f_P
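To make the dispatch concrete, the following Python sketch shows one way the sign part of the EGC calculation could be organized under the equations above. It is only an illustration: the function name, the type labels, and the dictionary-based case frame are our own, not part of the dissertation's implementation.

# Minimal sketch of the sign part of the EGC calculation for some of the
# concept types above; FVs are floats in [-1.0, 1.0].  All names are
# illustrative.

def egc_sign(concept_type, fv):
    """Return the raw EV product for a case frame, or None if undefined."""
    if concept_type in ("V(S)", "A(S,C)", "N(S)"):   # product of f_S and f_P
        return fv["S"] * fv["P"]
    if concept_type == "V(S,OM)":                    # equation (5)
        return fv["S"] * fv["OM"] * fv["P"]
    if concept_type in ("V(S,O,I)", "A(S,O,C)"):     # equation (8)
        return fv["O"] * fv["P"]
    if concept_type == "V(S,O,OM)":                  # equation (10)
        return fv["O"] * fv["OM"] * fv["P"]
    if concept_type == "V(S,O,OC)":                  # equation (11)
        return fv["O"] * fv["OC"]
    return None                                      # types XII and VII: undefined

# A positive product means pleasure, a negative one displeasure:
print(egc_sign("V(S,OM)", {"S": 1.0, "OM": 0.9, "P": 0.6}))  # 0.54 (pleasure)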
2.5.4 Favorite Value of Predicate with Negative Aspect
The words in an event have FVs. However, these words often appear with a modifier or aspect. For example, nouns are modified into noun phrases or noun clauses, and verbs and adjectives are modified by adverbs, tenses, and aspects. These modifications influence their FVs. The judgment of like/dislike can change with a modifier, as in “apple” versus “rotten apple.”
We propose calculation methods for the FV with modification. In this section, we explain the predicate case, and we explain the noun phrase and noun clause in the next section.
When a predicate has a negative aspect, we reverse the sign of the FV of the predicate because the meaning of the predicate becomes the opposite. There are various aspects in dialogue besides the negative aspect; however, we do not handle these other aspects because they have no influence on distinguishing likes/dislikes.
2.5.5 Favorite Value of Modified Noun
The structures of noun modification are classified as shown in Table 2.6. We explain how to calculate the FV of the noun phrase and the noun clause. A noun phrase modifies a noun by a word such as a noun, adjective, or pronoun. On the other hand, a noun clause modifies a noun by a clause (i.e., a sentence).

Table 2.6 Structure of Noun Modification

Structure                 Example
Pronoun + Noun            His story
Adjective + Noun          Sad story
Noun + particle + Noun    The story of Denmark
Noun + Clause             The story that Shakespeare wrote
2.5.5.1 Favorite Value of Noun Phrase
The FV of a noun phrase is defined as the product of the FV of the modifier and that of the modified word, because the modified word is given some information about the owner, attribute, and so on by the modifier. When the modifier is a pronoun and the content of the pronoun can be guessed, we calculate the FV by supplying the omitted word.
FV of Noun Phrase = FV of modifier × FV of the modified word (12)
Furthermore, when the modified word is a proper noun, we use the FV of the modified word as the value of the whole noun phrase, because the concept of a proper noun is not limited by the modifier.
FV of Noun Phrase = FV of the proper noun (13)
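A minimal sketch of equations (12) and (13), together with the sign reversal for a negated predicate from Section 2.5.4, could look as follows; the function names are illustrative, not taken from the dissertation's system.

# Sketch of the FV rules for negated predicates and noun phrases.

def fv_negated_predicate(fv_predicate):
    """A negative aspect reverses the sign of the predicate's FV."""
    return -fv_predicate

def fv_noun_phrase(fv_modifier, fv_modified, modified_is_proper_noun=False):
    """Equation (13): a proper noun keeps its own FV; otherwise equation (12):
    the FV is the product of the modifier's and the modified word's FVs."""
    if modified_is_proper_noun:
        return fv_modified
    return fv_modifier * fv_modified

print(fv_noun_phrase(0.9, 0.6))   # "my" (+0.9) × "car" (+0.6) = 0.54, as in Section 2.8
print(fv_negated_predicate(0.6))  # negated predicate: -0.6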
2.5.5.2 Favorite Value of Noun Clause
There are three types of noun clause structures as follows [51].
1. Content clause (e.g. The story that Romeo loves Juliet)
2. Modified clause by supplementary word
2.1. Limited modification (e.g. The story that Shakespeare wrote)
2.2. Unlimited modification (e.g. Shakespeare who wrote “Romeo and Juliet”)
3. Modified clause for relative noun (e.g. The day that Shakespeare wrote “Hamlet”)
Figure 2.9 Truth value of the proposition “((e) is A) is τ” (τ: a little true)
We propose a common method based on the idea of the “limitation of fuzzy truth value” for evaluating the FV of a noun clause.
The “limitation of fuzzy truth value” is explained as follows [55]:
First, we consider
((x) is A) is τ
as a fuzzy predicate with a “linguistic truth value.” “A” is a fuzzy set on the universal set “U,” and “x” is a variable bound to an element of “U.” When we fix the variable “x” to an element “e” of “U,”
((e) is A) is τ
becomes a kind of fuzzy proposition and only one truth value is obtained, as shown in Figure 2.9.
The truth value of the proposition “(e) is A” is described as μ_A(e). When we define μ_A(e) as “a,” the fuzzy proposition “((e) is A) is τ” changes into a new fuzzy proposition, “(a) is τ.” Therefore, the truth value of this proposition becomes μ_τ(a), and the truth value of the proposition “((e) is A) is τ” is given as follows:
τ = μ_τ(a) = μ_τ(μ_A(e)).
We apply the “limitation of fuzzy truth value” method to evaluating the FV of a noun clause. In the “limitation of fuzzy truth value” method, the truth value of a fuzzy proposition is enhanced/deflated by a “linguistic truth value” after the proposition. We consider that the relationship between the “truth value of a fuzzy proposition” and the “linguistic truth value” is the same as that between a modified word and its modifier clause; i.e., the impression (FV) of the modified word is enhanced/deflated by the action or attribute described in the modifier clause. When the impressions of the modified word and the content of the modifier clause are the same (both good or both bad), the “linguistic truth value” enhances the FV of the modified word. On the other hand, when the impressions of the modified word and the content of the modifier clause are different, the “linguistic truth value” deflates the FV of the modified word. We can already obtain the FV of the modified word in the range [–1.0, 1.0]. Then, we realize the “limitation of fuzzy truth value” for modifier clauses by defining a membership function for the “linguistic truth value.”
Figure 2.10 Transition function for modifier clause
We propose a transition function for the modifier clause, as shown in Figure 2.10, based on the idea that “objects which do not have a concrete evaluation are affected more strongly in their FV.” We have to adjust the maximum value of the effect so that it does not exceed 1.0. We define the maximum effect on the FV as EV_mdc/α and set α to 2.0, because the maximum of the EV is √3, where EV_mdc means the “EV of the modifier clause.”
We apply a kind of fuzzy method to the calculation of the FV and EV; however, the FV and EV are in the range [–1.0, 1.0]. Therefore, we give two truth values to an FV. For example, the FV +0.5 has a “truth value of like” of 0.5 and a “truth value of dislike” of 0.0. On the other hand, the FV –0.3 has a “truth value of like” of 0.0 and a “truth value of dislike” of 0.3. We write the “truth value of like” as TV_like and the “truth value of dislike” as TV_dislike:
a) FV_mdw > 0:
   TV_like = FV_mdw,  TV_dislike = 0
b) FV_mdw < 0:
   TV_like = 0,  TV_dislike = –FV_mdw
c) FV_mdw = 0:
   TV_like = 0,  TV_dislike = 0
We explain our method for each pattern based on the sign of FV_mdw (the FV of the modified word), in order to apply the transition function shown in Figure 2.10 to the fuzzy sets.
a) Favorite value of the modified word is positive (FV_mdw > 0)
When the content of the modifier clause is favorable (EV_mdc > 0), FV_mdw is enhanced by the favorable event or attribute. TV_dislike is always 0.0 because the effect stays in the positive fuzzy set, as shown in Figure 2.11.
TV'_like = TV_like + (EV_mdc / α)(1 – TV_like)
On the other hand, when the content of the modifier is unfavorable (EV_mdc < 0), FV_mdw is deflated by the unfavorable event or attribute, as shown in Figure 2.12. When the absolute value of EV_mdc is large enough to remove the effect of FV_mdw, the liking degree of the object becomes 0.0. Furthermore, the disliking degree of the object increases from 0.0, as shown in Figure 2.13. The calculation is as follows:
TV'_dislike = –(TV_like + (EV_mdc / α)(1 – TV_like))
Figure 2.11 Membership function for TV_like (EV_mdc > 0)
Figure 2.12 Membership function for TV_like (EV_mdc < 0)
Figure 2.13 Membership function for TV_dislike (EV_mdc < 0)
Then, we define the following membership functions for the condition FV_mdw > 0.
i) TV_like + (EV_mdc / α)(1 – TV_like) ≥ 0:
   TV'_like = TV_like + (EV_mdc / α)(1 – TV_like),  TV'_dislike = 0
ii) TV_like + (EV_mdc / α)(1 – TV_like) < 0:
   TV'_like = 0,  TV'_dislike = –(TV_like + (EV_mdc / α)(1 – TV_like))
b) Favorite value of the modified word is negative (FV_mdw < 0)
TV_dislike is deflated by a favorable event or attribute and enhanced by an unfavorable one, in the opposite direction from TV_like. Then, we define the membership functions by reversing the sign of EV_mdc, as shown in Figures 2.14 and 2.15.
i) TV_dislike – (EV_mdc / α)(1 – TV_dislike) ≥ 0:
   TV'_dislike = TV_dislike – (EV_mdc / α)(1 – TV_dislike),  TV'_like = 0
ii) TV_dislike – (EV_mdc / α)(1 – TV_dislike) < 0:
   TV'_dislike = 0,  TV'_like = –(TV_dislike – (EV_mdc / α)(1 – TV_dislike))
c) Favorite value of the modified word is 0 (FV_mdw = 0)
In this case, we take the effect of the modifier itself as TV_like or TV_dislike.
i) EV_mdc ≥ 0:
   TV'_like = EV_mdc / α,  TV'_dislike = 0
ii) EV_mdc < 0:
   TV'_like = 0,  TV'_dislike = –EV_mdc / α
A code sketch after Figures 2.14 and 2.15 illustrates these calculations.
Figure 2.14 Membership function for TV_dislike (EV_mdc > 0)
Figure 2.15 Membership function for TV_like (EV_mdc < 0)
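Putting the pieces of this section together, a minimal sketch of the modifier-clause calculation, assuming the piecewise formulas above with α = 2.0, could look as follows. The names are illustrative; the signed return value encodes TV'_like as positive and TV'_dislike as negative.

# Sketch of the noun-clause FV calculation of Section 2.5.5.2, assuming the
# piecewise formulas above with alpha = 2.0.  All names are illustrative.

ALPHA = 2.0

def modified_fv(fv_mdw, ev_mdc):
    """FV of the modified word after applying the modifier clause's EV."""
    effect = ev_mdc / ALPHA
    if fv_mdw > 0:
        # case a): TV'_like = TV_like + (EV_mdc/alpha)(1 - TV_like);
        # a negative result is the overflow into TV'_dislike (Figure 2.13)
        return fv_mdw + effect * (1.0 - fv_mdw)
    if fv_mdw < 0:
        # case b): the same calculation with the sign of EV_mdc reversed
        tv_dislike = -fv_mdw
        return -(tv_dislike - effect * (1.0 - tv_dislike))
    # case c): FV_mdw = 0, the modifier clause alone decides
    return effect

print(modified_fv(0.5, 1.0))    # favorable clause enhances liking: 0.75
print(modified_fv(0.1, -1.6))   # a weakly liked word flips to dislike: -0.62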
We apply this method to calculating the FV of a modified noun using the EV of the modifier clause. However, a case element is often omitted in the “modifier clause by supplementary word,” because the element is used as the modified word. When we apply our method to such modified clauses, we give the FV of the modified word a default value of +0.5. In order to avoid multiplying the effect of the modified word, the given FV is not the word's own FV.
2.6 Emotion Strength Calculation
The EGC method extracts pleasure/displeasure about an event through two main processes: one identifies pleasure/displeasure by the equations shown in the former sections, and the other calculates the degree of the emotion. We substitute the real-number FVs in the range [–1.0, 1.0] into the equations; then we obtain not only the positive/negative sign but also a real number from the calculation. The EV calculated by our method coincides with the human feeling that “the degree of the emotion increases with the degree of attention to the case elements,” because this value is proportional to the FVs of the case elements. Then, we consider the degree of the EV as the strength of the generated emotion. However, we cannot directly compare the output values of the equations, because there are eight types of equations, both quadratic and cubic, and the averages of the output values differ between the quadratic and the cubic equations.
In this paper, we model the emotional space as a three-dimensional space. We consider the length of the synthetic vector of the FVs as the pleasure/displeasure degree of the event. Table 2.7 shows the correspondence between the case elements in the EGC equations and the axes in the three-dimensional model. We designed this calculation method for the cubic equations. Therefore, when we calculate the degree of the EV for quadratic equations such as types I, V, VI, and XI, we supply a dummy FV β as the third element. We tentatively defined the value as 0.5, as it does not affect our method. A code sketch after Figure 2.16 illustrates the calculation.
Figure 2.16 is an example of the emotion strength for event type VI. There are three elements, Subject, Object, and Predicate, in event type VI, and the orthogonal vectors of these elements construct a rectangular solid. We regard the length of its diagonal as the degree of the EV for a type VI event.
Table 2.7 Correspondence between the case elements and the axes

Event type                                    Equation                      f1        f2          f3
V (S), A (S, C), A (S, OF, C), A (S, OT, C),
A (S, OM, C), A (S, OS, C), N (S)             EV = fS × fP                  fS        –           fP
V (S, OF), V (S, OT)                          EV = fS × (fOT – fOF) × fP    fS        fOT – fOF   fP
V (S, OM)                                     EV = fS × fOM × fP            fS        fOM         fP
V (S, OS)                                     EV = (fS – fOS) × fP          fS – fOS  –           fP
V (S, O)                                      EV = fS × fO × fP             fS        fO          fP
V (S, O)                                      EV = fO × fP                  fO        –           fP
V (S, O, OF), V (S, O, OT)                    EV = fO × (fOT – fOF) × fP    fO        fOT – fOF   fP
V (S, O, OM)                                  EV = fO × fOM × fP            fO        fOM         fP
V (S, O, I)                                   EV = fO × fP                  fO        |fI|        fP
V (S, O, OC)                                  EV = fO × fOC                 fO        –           fOC
A (S, O, C)                                   EV = fO × fP                  fO        –           fP
Figure 2.16 Example of emotion strength of event type VI (the orthogonal vectors fS, fO, fP span a rectangular solid on the axes f1, f2, f3)
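As an illustration of the strength calculation of Section 2.6, the following sketch computes the signed diagonal length; the function name and the keyword handling of the dummy FV β are our own choices, not the dissertation's code.

# Sketch of the emotion-strength calculation: the degree of the EV is the
# diagonal length of the rectangular solid spanned by the FVs of Table 2.7;
# beta = 0.5 is the dummy third element for quadratic types.

import math

BETA = 0.5

def ev_degree(f1, f3, f2=None, sign=+1):
    """Signed length of the synthetic vector (f1, f2, f3)."""
    f2 = BETA if f2 is None else f2
    return sign * math.sqrt(f1 * f1 + f2 * f2 + f3 * f3)

# "Romeo dates with Juliet" (type V(S, OM)): |(1.0, 0.9, 0.6)| = 1.47
print(round(ev_degree(1.0, 0.6, f2=0.9), 2))       # 1.47
# "My leg has become swollen" (type V(S)): displeasure of strength 0.95
print(round(ev_degree(0.54, -0.6, sign=-1), 2))    # -0.95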
2.7 Example of EGC Method
We show the event “Romeo dates with Juliet” as a calculation example. When the event is given to the EGC method, the calculation proceeds as follows. In this example, we assume the agent is Romeo, and that he likes dating and loves Juliet.
Event: “Romeo dates Juliet.”
Predicate (P) = “date with” : +0.6
Subject (S) = “Romeo” : +1.0
Object-Mutual (OM) = “Juliet” : +0.9
The event type of the predicate “date with” is V, and we substitute the FVs of the case elements into equation (5). Then, we obtain the positive number 0.54. This result shows that the agent (Romeo) feels pleasure about the event “Romeo dates with Juliet.”
Event Type: “date with” V(S, OM)
Sign of EV = f_S(Romeo) × f_OM(Juliet) × f_P(date with)
           = (+1.0) × (+0.9) × (+0.6)
           = +0.54 : Positive number (Pleasure)
Next, we calculate the degree of pleasure for this event. We calculate the diagonal length of the rectangular solid constructed from the orthogonal vectors of the elements Subject, Object-Mutual, and Predicate. Then, we regard this length as the degree of the EV.
Degree of EV = |(f_S, f_OM, f_P)| × (Sign of EV)
             = |(+1.0, +0.9, +0.6)| × (+1)
             = (+1.47) × (+1) = +1.47
When we compare the degrees of the generated emotions between an event including an impressive object (whose FV is large, e.g., (1.0, 0.1, 0.1)) and an event consisting of common objects (e.g., (0.5, 0.5, 0.5)), the generated emotion for the former event is larger than that for the latter. This result corresponds with the human feeling that an event involving an interesting object is more impressive.
2.8 Experimental Result
We extracted some responses of the users from the conversation log and applied the EGC method to them. We assumed the agent is the user (the speaker). The EGC method extracted the same pleasure/displeasure as human feeling for 55 of 80 utterances, as shown in Table 2.8.
Table 2.8 Number of the Example Generated Emotion (EGC/Human)

Event Type   I      II   III    IV   V    VI     VII  VIII  IX   X    XI   XII
EGC/Human    19/31  0/0  11/11  3/3  4/4  17/27  0/1  0/2   0/0  1/1  0/0  0/0
We show some examples as follows:
Event: “My leg has become swollen.”
Predicate (P) = “swell” : –0.6
Subject (S) = “my leg”
= (+0.9) * (+0.6)
= +0.54
Event Type: “swell” V(S)
Sign of EV = f_S(my leg) × f_P(swell)
           = (+0.54) × (–0.6)
           = –0.32 : Negative number (Displeasure)
Degree of EV = |(f_S, β, f_P)| × (Sign of EV)
             = |(+0.54, 0.5, –0.6)| × (–1)
             = (0.95) × (–1) = –0.95
Event: “I’m useful to my family.”
Predicate (P) = “useful” : +0.6
Subject (S) = “I” : +1.0
Event Type: “useful” A(S, C)
Sign of EV = f_S(I) × f_P(useful)
           = (+1.0) × (+0.6)
           = +0.60 : Positive number (Pleasure)
Degree of EV = |(f_S, β, f_P)| × (Sign of EV)
             = |(+1.0, 0.5, +0.6)| × (+1)
             = (+1.27) × (+1) = +1.27
The following examples show the difference among the degrees of the generated emotions based on the FVs of the elements.
Event: “He scratched my car.”
Predicate (P) = “scratch” : –0.2
Object (O) = “my car”
= (+0.9) * (+0.6)
= +0.54
Event Type: “scratch” V(S, O)
Sign of EV = f_O(my car) × f_P(scratch)
           = (+0.54) × (–0.2)
           = –0.11 : Negative number (Displeasure)
Degree of EV = |(f_O, β, f_P)| × (Sign of EV)
             = |(+0.54, 0.5, –0.2)| × (–1)
             = (0.76) × (–1) = –0.76
Event: “He crashed my car.”
Predicate (P) = “crash” : –0.8
Object (O) = “my car”
= (+0.9) * (+0.6)
= +0.54
Event Type: “crash” V(S, O)
Sign of EV = f_O(my car) × f_P(crash)
           = (+0.54) × (–0.8)
           = –0.43 : Negative number (Displeasure)
Degree of EV = |(f_O, β, f_P)| × (Sign of EV)
             = |(+0.54, 0.5, –0.8)| × (–1)
             = (1.09) × (–1) = –1.09
2.9 Emotion Distinguishing Method based on Emotional Space
We presented a method to extract pleasure/displeasure from an event by two processes: distinguishing pleasure/displeasure using the EGC method, and calculating the strength of the emotion by measuring the length of the synthetic vector of the three FVs in the emotional space. The EGC method is based on whether the FV of each element is positive or negative. On the other hand, the synthetic vector lies in an area that is partitioned by the three axes. Therefore, we present a method to distinguish pleasure/displeasure for an event not by using the EGC equations but by judging which area the synthetic vector is in.
Table 2.9 is the corresponding chart between the sign of each axis and the generated pleasure/displeasure. When the vector is on an axis, i.e., the value along some axis is zero, the event does not arouse any emotion. A code sketch after the table illustrates this judgment.

Table 2.9 Distinguishing pleasure/displeasure using the sign of each axis

Area  F1  F2  F3  Emotion
I     +   +   +   Pleasure
II    –   +   +   Displeasure
III   –   –   +   Pleasure
IV    +   –   +   Displeasure
V     +   +   –   Displeasure
VI    –   +   –   Pleasure
VII   –   –   –   Displeasure
VIII  +   –   –   Pleasure
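Note that Table 2.9 is equivalent to taking the product of the three axis signs, so the judgment reduces to a one-line check. The following sketch is an illustration with hypothetical names, not the dissertation's code.

# Sketch of the sign-of-axis method of Table 2.9: the product of the three
# axis signs decides pleasure (+) or displeasure (-); a zero on any axis
# means no emotion is raised.

def distinguish(f1, f2, f3):
    """Return 'pleasure', 'displeasure', or None from the synthetic vector."""
    product = f1 * f2 * f3
    if product > 0:
        return "pleasure"      # areas I, III, VI, VIII
    if product < 0:
        return "displeasure"   # areas II, IV, V, VII
    return None                # vector lies on an axis: no emotion

print(distinguish(1.0, 0.9, 0.6))    # area I  -> pleasure
print(distinguish(0.54, 0.5, -0.6))  # area V  -> displeasure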
We applied this new method to the examples shown in Sections 2.7 and 2.8.
Event: “Romeo dates with Juliet.”
Predicate (P) = “date with” : +0.6
Subject (S) = “Romeo” : +1.0
Object-Mutual (OM) = “Juliet” : +0.9
Event Type: “date with” V(S, OM)
Degree of EV = |(f_S, f_OM, f_P)|
             = |(+1.0, +0.9, +0.6)| = 1.47
Distinguish pleasure/displeasure: (f_S, f_OM, f_P) = (+1.0, +0.9, +0.6) → Area I (Pleasure)
Generated Emotion = (Degree of EV) × (pleasure/displeasure)
                  = (+1.47) × (+1) = +1.47
Event: “My leg has become swollen.”
Predicate (P) = “swell” : –0.6
Subject (S) = “my leg”
= (+0.9) × (+0.6)
= +0.54
Event Type: “swell” V(S)
Degree of EV = |(f_S, β, f_P)|
             = |(+0.54, 0.5, –0.6)| = 0.95
Distinguish pleasure/displeasure: (f_S, β, f_P) = (+0.54, 0.5, –0.6) → Area V (Displeasure)
Generated Emotion = (Degree of EV) × (pleasure/displeasure)
                  = (0.95) × (–1) = –0.95
Event: “I’m useful to my family.”
Predicate (P) = “useful” : +0.6
Subject (S) = “I” : +1.0
Event Type: “useful” A(S, C)
Degree of EV = |(f_S, β, f_P)|
             = |(+1.0, 0.5, +0.6)| = 1.27
Distinguish pleasure/displeasure: (f_S, β, f_P) = (+1.0, 0.5, +0.6) → Area I (Pleasure)
Generated Emotion = (Degree of EV) × (pleasure/displeasure)
                  = (+1.27) × (+1) = +1.27
2.10 Future Work
We found two types of problems with the EGC method from the experimental result described in Section 2.8.
1. Inadequate pleasure against an unpopular person.
2. Pleasure/displeasure obtained by guessing situations from aspects in the utterance.
First, whether a negative emotion is generated against an unpopular person depends on the individual. Although the EGC method always detects a negative emotion toward an unpopular person, some people occasionally feel sorry for an unlucky person even though they do not like him/her. There were 11 such counterexamples in the experiment. We found that the reaction depends on the degree of interest in the individual. In order to handle this, we have to give objects not only an FV but also other attribute parameters such as interest.
The next problem concerns aspects like “have to.” An expression with the aspect “have to” often implies a duty, as in “Although the speaker does not want to do something, he/she has to do it.” We therefore consider that the speaker will generate displeasure for the forced event. The EGC method should be developed to consider the effects of aspects, not only “have to” but also “can't,” “take the trouble to,” and so on.
The study of the “Favorite Value learning method” described in Section 2.4.3 is also proceeding. In particular, we are studying the “Favorite Value changing situations” method and the “Backward calculation from the emotional expressions” method.
The “Favorite Value changing situations” method extracts the situations which influence an object's FV from the agent's knowledge structure. We are now studying how to construct this knowledge from each utterance using a Truth Maintenance System while avoiding contradictions [53].
The “Backward calculation from the emotional expressions” method attempts to extract FVs from utterances containing objects whose values are undefined. The objects' FVs are extracted by calculating the EGC backward. We show an example for “Romeo felt sad because Mercutio was killed.” In this sentence, two clauses are connected by the subordinating conjunction “because.” The independent clause means “Romeo feels displeasure,” and the dependent clause shows the reason. The EGC result for the dependent clause is shown as follows:
Event: “Mercutio was killed.”
Predicate (P) = “be killed” : –0.6
Subject (S) = “Mercutio” : ?
Event Type: “be killed” V(S)
Distinguish pleasure/displeasure: (f_S, β, f_P) = (?, 0.5, –0.6) → (Displeasure)
The agent does not know the FV of “Mercutio,” but the agent knows that the EGC result is “displeasure.” Then, we calculate the FV of “Mercutio” using the relationship between the FVs and the emotion shown in Table 2.9. We infer that the sign of F1 (Mercutio's FV) is positive based on the table, because the sign of F2 is positive, the sign of F3 is negative, and the output emotion is “displeasure” (Area V). We consider that the result of this calculation is correct based on the story of Romeo and Juliet.
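A minimal sketch of this backward inference, assuming only the sign relationship of Table 2.9, could look as follows; the names are illustrative.

# Sketch of the "backward calculation" idea: given the observed emotion and
# the known FVs, infer the sign of the unknown FV from Table 2.9 (the sign
# of the product of the three axes must match the emotion).

def infer_unknown_sign(emotion, f2, f3):
    """Return +1 or -1 for the unknown F1, or None if undecidable."""
    if f2 == 0 or f3 == 0 or emotion is None:
        return None
    target = 1 if emotion == "pleasure" else -1
    known = (1 if f2 > 0 else -1) * (1 if f3 > 0 else -1)
    return target * known  # sign(F1) * known must equal target

# "Mercutio was killed" displeases Romeo: F2 = beta (+0.5), F3 = -0.6,
# so the sign of Mercutio's FV must be positive.
print(infer_unknown_sign("displeasure", 0.5, -0.6))  # 1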
2.11 Conclusion
In this chapter, we presented an emotion-handling dialogue model in order to facilitate comfortable interaction with users. We proposed the Emotion Generating Calculations (EGC) method to generate a pleasure/displeasure emotion from an event in an utterance. We also proposed how to calculate the degree of the pleasure/displeasure from the diagonal length of a rectangular solid constructed from the terms in the EGC equations. EGC uses eight types of equations for 12 event types, two types of equations for seven attribute types, and an equation for the noun phrase. The FVs of objects are used in these calculations. Furthermore, we applied these calculations to the negative aspect and the modified noun.
To verify the effectiveness of the proposed method, we applied it to 80 events in conversation, and the calculated emotions largely corresponded to the emotions generated by humans.
CHAPTER 3 COMPLICATED EMOTION ALLOCATING METHOD BASED ON EMOTION ELICITING CONDITION THEORY
We proposed the EGC method, which calculates pleasure/displeasure from events. However, expressing emotion only by pleasure/displeasure is too vague. Humans usually recognize many emotions such as hope, shame, love, anxiety, gratitude, anger, and so on.
In this chapter, we propose a method to refine the simple emotion (pleasure/displeasure) generated by the EGC method into 20 emotions based on Elliott's “Emotion Eliciting Condition Theory.” Elliott's theory requires judgments on the following conditions: “feeling for another,” “prospect and confirmation,” and “approval/disapproval.” “Feeling for another” means someone else's emotion (not one's own) about the event, and it is judged based on the EGC result using the other person's FV information. We extract aspects and adverbs related to the tense to judge “prospect and confirmation.” “Approval/disapproval” is judged by the event's case frame structure with a transitive verb.
To verify the effectiveness of the proposed method, we report the results of some questionnaires.
3.1 Emotion Discrimination
The word “emotion” described here includes various emotion types: basic and common emotions that other animals also have, like “pleasure,” “sadness,” and “anger,” and emotions based on the social and cultural background, like “contempt,” “pride,” “jealousy,” and “shame.” Furthermore, some emotions are close to each other while others are independent. Grasping the whole relationship among the various emotions is the starting point of an emotion study.
In psychology, some models of the relationships among emotions have been presented [56]. They plot the emotions in an N-dimensional space constructed from a finite number of emotional dimensions. For example, Wundt considered emotions as states varying along a few dimensions. He proposed three dimensions, namely pleasure vs. displeasure (Lust vs. Unlust), excitement vs. calmness (Erregung vs. Beruhigung), and tension vs. relaxation (Spannung vs. Lösung), as shown in Figure 3.1. Schlosberg presented a “three-dimensional model of emotion,” pleasure vs. displeasure, attention vs. rejection, and activation, based on facial expressions. The dimension of activation is basic to the behavior of living organisms, as shown in Figure 3.2 [22].
Figure 3.1 Three-Dimensional Model of Wundt (axes: pleasure–displeasure, excitement–calmness, tension–relaxation)
Figure 3.2 Three-Dimensional Model of Schlosberg (axes: pleasure–displeasure, attention–rejection, activation)
These ideas are effective for classifying various emotions; however, they do not mention the active aspects and functions of the emotions. For example, they rarely mention how each emotion is generated or its original function (e.g., “horror” lets the agent avoid danger and “anger” lets him/her fight).
Recent emotion studies in psychology follow four major theoretical traditions in terms of the definition, study, and explanation of emotion: the Darwinian perspective, the Jamesian perspective, the cognitive perspective, and the social constructivist perspective, as shown in Table 3.1 [57].
Table 3.1 Four Major Theoretical Traditions in Psychology

Perspective            Principal Thought                                   Classic Study       Present Study
Darwinian              Emotions have adaptive functions and are general.   Darwin (1872/1965)  Ekman (1987)
Jamesian               Emotion = Body Action                               James (1884)        Levinson (1990)
Cognitive              Emotions are based on evaluations.                  Arnold (1960a)      Smith and Lazarus (1993)
Social constructivist  Emotions are social constructs that contribute      Averill (1980a)     Smith and Kleinman (1989)
                       to social purposes.
Plutchik constructed a new three-dimensional model that includes the Darwinian and Jamesian perspectives. His model describes the relations among emotion concepts, as shown in Figure 3.3. The cone's vertical dimension represents the intensity of the emotions, and the circle represents degrees of similarity among the emotions by their positions on it. The eight sectors are designed to indicate that there are eight primary emotion dimensions defined by the theory, arranged as four pairs of opposites (Joy–Sadness, Trust–Disgust, Fear–Rage, Surprise–Anticipation). This model is similar to the multi-dimensional models above. However, the emotions this model presents are based on “the original active patterns (take in, reject, protect, destroy, breed, adapt, and investigate) of living things.” Furthermore, the model deals with complex emotions like contempt and shame as combinations of a few simple emotions.
However, in our agent model, emotions are created not by perception and action but by the content of the utterance. Therefore, we consider that the cognitive perspective is the most effective for our usage. Lazarus stressed the importance of the cognitive process for emotion generation, because cognitive appraisal of an environmental stimulus is essential to generating emotion, and the cognitive process precedes the emotion generation process. He presented the view that a human applies a two-step cognitive appraisal in dealing with a situation: primary appraisal and secondary appraisal [57].
1. Primary appraisal refers to the issue of whether the situation has relevance for personal
well-being. During primary appraisal, individuals implicitly ask themselves the question: "Am I in
trouble or am I benefiting, now or in the future, and in what way?"
2. Secondary appraisal focuses on the possible ways of coping with the situation, and judges the
extent of available personal and environmental resources for dealing with it. The secondary appraisal
process can be translated into the implicit question: "What if anything can be done about the
situation (or about the way it will make me feel)?"
If the primary appraisal process determines that the environmental element poses a threat but the secondary appraisal process determines that the person has immediate and direct control over the environmental element, then the person's suffering will become less. For example, if a person hears loud, raucous music from the next apartment while he/she is trying to study, but also knows that the neighbor will graciously turn off the stereo if asked, the person will experience little distraction or stress, or any of the negative psychological effects of non-control [58].
Lazarus also considered the relationship between the differences among emotions and the cognitive processes. The primary appraisal process assesses only whether the situation is beneficial/harmful; for example, it does not distinguish between negative emotions like anger, fear, disappointment, and sadness. The secondary appraisal process then distinguishes between such negative emotions based on how the subject can behave in the situation. Lazarus presented a “core relational theme” related to each specific emotion, based on the appraisal that a situation is beneficial/harmful and on what the agent can do about the situation when faced with it, as shown in Table 3.2. For example, the core relational theme of anger is “A demeaning offence against me and mine” [59].
Figure 3.3 Three-Dimensional Cone Model of Plutchik
Table 3.2 Emotions and Their Core Relational Themes [59]
Emotion Core relational theme
Anger A demeaning offence against me and mine
Anxiety Facing uncertain, existential threat
Fright Facing an immediate, concrete, and overwhelming physical danger
Guilt Having transgressed a moral imperative
Shame Having failed to live up to an ideal of the ego
Sadness Having experienced an irrevocable loss
Envy Wanting what someone else has
Jealousy Resenting a third party for loss or threat to another’s affection
Disgust Taking in or being too close to an indigestible object or idea (metaphorically speaking)
Happiness Making reasonable progress toward the realization of a goal
Pride Enhancement of one's ego-identity by taking credit for a valued object or achievement, either one's own or that of some group with whom we identify
Relief A distressing goal-incongruent condition that has changed for the better or gone away
Hope Fearing the worst but yearning for better
Love Desiring or participating in affection, usually but not necessarily reciprocated
Compassion Being moved by another’s suffering and wanting to help
3.2 Emotion Eliciting Condition Theory
As shown in Section 3.1, Lazarus proposed two processes to analyze generated emotions, based on the cognitive perspective: referring to whether the situation has relevance for personal well-being, and evaluating whether the situation is avoidable. Ortony proposed a theory of the cognitive structure of emotions which views emotions as valenced reactions to events, agents and their actions, and objects. The theory specifies a total of 22 emotion types, as shown in Table 3.3 [60, 61].
The emotion types are essentially just classes of eliciting conditions, but each emotion type is labeled with a word or phrase, generally an English emotion word corresponding to a relatively neutral example of an emotion fitting the type. The simplest emotions are the “well-being” emotions such as joy and distress. These are an individual's positive and negative reactions to desirable or undesirable events. Eliciting these “well-being” emotions corresponds to Lazarus's primary appraisal, and the other emotion types correspond to the secondary appraisal.
The “fortunes-of-others” group covers four emotion types: happy-for, gloating, resentment, and
sorry-for. Each type in this group is a combination of pleasure or displeasure over an event further
categorized as being presumed to be desirable or undesirable for another person.
The “prospect-based” group includes six emotion types: hope, satisfaction, relief, fear, fears-confirmed, and disappointment. Each type is a reaction to a desirable or undesirable event that is still pending or that has been confirmed or unconfirmed.
The “attribution” group covers four types: pride, admiration, shame, and reproach. Each
attribution emotion type is a positive or negative reaction to either one’s own or another’s action.
The “attraction” group is a structureless group of reactions to objects. The two emotions in this
group are the momentary feelings (as opposed to stable dispositions) of liking or disliking.
The final group comprises four compounds of the “well-being/attribution” emotion types. These compound emotions do not correspond to the co-occurrence of their component emotions. Rather, each compound's eliciting conditions are the union of the components' eliciting conditions. For example, the eliciting conditions for anger combine the eliciting conditions for reproach with those for distress [61].
Elliott used Ortony's emotion eliciting condition rules for the strong-theory reasoning component of the Affective Reasoner, which supports four requirements:
(1) a simulated world which is rich enough to test the many subtle variations a treatment of
emotion reasoning requires,
(2) agents capable of (a) a wide range of affective states, (b) an interesting array of interpretations
of situations leading to those states and (c) a reasonable set of reactions to those states,
(3) a way to capture a theory of emotions, and
(4) a way for agents to interact and to reason about the affective states of one another.
Elliott used the extended and adapted twenty-four-emotion-type version of the “Emotion Eliciting Condition Theory” for the agents in the Affective Reasoner, as shown in Table 3.4. The descriptions of the twenty-four emotion types are extended in order to refer to the situation as an event [24, 25].
This “Emotion Eliciting Condition Theory” requires pleasure/displeasure about an event and some information about the situation of the event (i.e., affection of another, prospect, confirmation, approval, and attraction). We appraise pleasure/displeasure by the EGC result and extract the situation information from the tense and aspect in the utterance.
However, detecting the concepts of likes/dislikes requires personal taste information, one's own experiences, perceptions, and so on. Therefore, in this paper, we deal with 20 emotion types, excluding “liking,” “disliking,” “love,” and “hate.”
Table 3.3 “Emotion Eliciting Condition Theory” by Ortony

Group                   Specification                      Types (name)
Well-being              Appraisal of an event              pleased about an event (joy);
                                                           displeased about an event (distress)
Fortunes-of-others      Presumed value of an event         pleased about an event desirable for another (happy-for);
                        affecting another                  pleased about an event undesirable for another (gloating);
                                                           displeased about an event desirable for another (resentment);
                                                           displeased about an event undesirable for another (sorry-for)
Prospect-based          Appraisal of a prospective event   pleased about a prospective desirable event (hope);
                                                           pleased about a confirmed desirable event (satisfaction);
                                                           pleased about an unconfirmed undesirable event (relief);
                                                           displeased about a prospective undesirable event (fear);
                                                           displeased about a confirmed undesirable event (fears-confirmed);
                                                           displeased about an unconfirmed desirable event (disappointment)
Attribution             Appraisal of an agent's action     approving of one's own action (pride);
                                                           approving of another's action (admiration);
                                                           disapproving of one's own action (shame);
                                                           disapproving of another's action (reproach)
Attraction              Appraisal of an object             liking an appealing object (love);
                                                           disliking an unappealing object (hate)
Well-being/Attribution  Compound emotions                  admiration + joy → gratitude;
                                                           reproach + distress → anger;
                                                           pride + joy → gratification;
                                                           shame + distress → remorse
Table 3.4 “Emotion Eliciting Condition Theory” by Elliott

Group                   Specification                             Types (name)
Well-being              Appraisal of a situation as an event      pleased about an event (joy);
                                                                  displeased about an event (distress)
Fortunes-of-others      Presumed value of a situation as an       pleased about an event desirable for another (happy-for);
                        event affecting another                   pleased about an event undesirable for another (gloating);
                                                                  displeased about an event desirable for another (resentment);
                                                                  displeased about an event undesirable for another (sorry-for)
Prospect-based          Appraisal of a situation as a             pleased about a prospective desirable event (hope);
                        prospective event                         displeased about a prospective undesirable event (fear)
Confirmation            Appraisal of a situation as confirming    pleased about an unconfirmed undesirable event (relief);
                        or unconfirming an expectation            pleased about a confirmed desirable event (satisfaction);
                                                                  displeased about a confirmed undesirable event (fears-confirmed);
                                                                  displeased about an unconfirmed desirable event (disappointment)
Attribution             Appraisal of a situation as an            approving of one's own action (pride);
                        accountable act of some agent             approving of another's action (admiration);
                                                                  disapproving of one's own action (shame);
                                                                  disapproving of another's action (reproach)
Attraction              Appraisal of a situation as containing    finding an appealing object (liking);
                        an attractive or unattractive object      finding an unappealing object (disliking)
Well-being/Attribution  Compound emotions                         admiration + joy → gratitude;
                                                                  reproach + distress → anger;
                                                                  pride + joy → gratification;
                                                                  shame + distress → remorse
Attraction/Attribution  Compound emotion extensions               admiration + liking → love;
                                                                  reproach + disliking → hate
3.3 Complicated Emotion Allocating Method based on Emotion Eliciting Condition Theory using Emotion Generating Calculation
This “Emotion Eliciting Condition Theory” requires pleasure/displeasure about an event and some information about the situation of the event (i.e., affection of another, prospect, confirmation, approval, and attraction). We propose methods to appraise pleasure/displeasure by the EGC results and to extract the situation information from the grammatical features of the utterance.
3.3.1 Fortunes of the Others
The emotions that belong to the “Fortunes-of-Others” group are elicited from the emotion that another person feels. This group contains “happy-for,” “gloating,” “resentment,” and “sorry-for.”
The EGC method calculates pleasure/displeasure concerning the event from the user's viewpoint using FVs. The FVs that have been defined are based on the user's preferences. However, some emotions like “sorry-for” and “happy-for” are aroused based on the other person's emotions. We give the conditions for the emotions about “fortunes of others” as follows.
Happy-for : pleased about an event desirable for another
Gloating : pleased about an event undesirable for another
Resentment : displeased about an event desirable for another
Sorry-for : displeased about an event undesirable for another
In order to appraise the factor that an event is desirable/undesirable for a person in these conditions, we take the condition to be that the event pleases/displeases the person. However, different reactions (pleased/displeased) occur for the same type of event. Therefore, we translated these conditions as follows: 1) when the user likes the individual who is pleased about an event, the user feels happy for him/her, and 2) when the user dislikes the individual who is displeased about an event, the user gloats over his/her misfortune. In order to confirm the translation, the adequacy of the translated conditions for generating these emotions was investigated by questionnaire. As a result, it was found that the preference for the person and the event's impression on the person are important factors for these emotions.
Therefore, the method has to judge whether the person is favorable/hateful for the user and whether the event is desirable/undesirable from the person's viewpoint. The EGC is used for appraising these conditions.
First, whether an individual is favorable or hateful from the user's viewpoint is detected. The FV of the target person is used for this check, because the user's preferences are already expressed by the FVs. When the FV is positive, the user likes the person. On the other hand, when the FV is negative, the user dislikes the person.
Next, whether the event is pleasure or displeasure for the other person is detected. The EGC method can calculate pleasure/displeasure from the user's viewpoint based on the user's FVs. Then, if we use not the user's FVs but the other person's FVs, the obtained EV indicates the emotion from that person's viewpoint. Therefore, we use the “FVs from the other's viewpoint” for the EGC, and we consider the output EV as the emotion of “the other” about the event. When the value is positive, the event pleases the individual. On the other hand, when the value is negative, the event displeases the individual. The person's FV database is managed in the same way as the user's. The FV retrieving process is the same as shown in Section 2.4.1. When the individual's Favorite Value database does not exist, the default database is adopted.
Table 3.5 shows the relationship between the preference for an individual, the emotion about the event from the individual's viewpoint, and the generated emotion. In this table, 'A' means an individual other than the user. We describe the FV of 'A' from the user's viewpoint as “A (user),” and the FV of 'B' from C's viewpoint as “B (C).” The EV of the event is described in the same way; for example, “EV (A)” means the EV of the event from A's viewpoint.

Table 3.5 Generated emotions for the preference of an individual and his/her emotion about the event

                        EV (A)
A (user)        Pleasure        0   Displeasure
Like (+)        Happy-for 'A'   0   Sorry-for 'A'
0               0               0   0
Dislike (–)     Resentment      0   Gloating
Figure 3.4 shows the procedure to extract the emotions of “Fortunes-of-Others.” First, the EV of the event is calculated using the Favorite Value database of an individual 'A'. This value means the emotion about the event from the viewpoint of 'A'. When the EV is not 0, i.e., 'A' feels pleasure/displeasure about the event, the FV of 'A' is checked in the Favorite Value database of the user. It shows how the user feels about 'A'. Then, an emotion in this group is extracted based on EV (A) and A (user). When 'A' does not feel any pleasure/displeasure about the event, or when the user does not care about 'A', no emotion is extracted from the event. A code sketch after the figure illustrates this branching.
Figure 3.4 Procedure to extract the emotions of “Fortunes-of-Others”
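The following sketch is an illustration only, with hypothetical names: ev_a stands for the EGC output computed with A's FV database, and fv_a_for_user for A's FV in the user's database.

# Sketch of the "Fortunes-of-Others" procedure of Figure 3.4 / Table 3.5.

def fortunes_of_others(ev_a, fv_a_for_user):
    """Return the fortunes-of-others emotion, or None."""
    if ev_a == 0 or fv_a_for_user == 0:
        return None                     # A feels nothing, or the user ignores A
    if ev_a > 0:
        return "happy-for" if fv_a_for_user > 0 else "resentment"
    return "sorry-for" if fv_a_for_user > 0 else "gloating"

print(fortunes_of_others(ev_a=+1.2, fv_a_for_user=+0.9))  # happy-for (Juliet)
print(fortunes_of_others(ev_a=-0.9, fv_a_for_user=+0.5))  # sorry-for (Lord Montague)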
3.3.2 Prospect-Based Emotions
The “Prospect-based” group contains “hope” and “fear.” The condition for these emotions is being “pleased/displeased about a prospective desirable/undesirable event.” We can already check whether the event is desirable/undesirable using the EGC method, as described in Section 3.3.1. However, we need a method to check whether the event is prospective or not.
Although people generally use reasoning to predict future events, our study does not utilize such a reasoning process. People do not always reason completely, either; when they cannot reason about an event, they occasionally refer to its grammatical features.
Therefore, we extract the information about “prospects” from the aspect in the case frame representation. When there is an aspect of “inference (will)” or “intention (be going to),” the event refers to a future event.
When we apply the EGC to the prospective event and its EV is positive/negative, we consider that the agent feels “hope/fear,” based on the “Emotion Eliciting Condition Theory” shown in Table 3.4. An event which will happen in the future is recorded in the “prospective event list” in order to confirm later whether the prospective event has happened, as described in Section 3.3.3.
Figure 3.5 shows the procedure to extract the “Prospect-based” emotions. First, whether an aspect meaning “inference” or “intention” exists in the event case frame is checked. When the event has such an aspect, the content of the event is prospective, and the event is accumulated in the prospective event list. Next, the EV of the prospective event is calculated by the EGC from the user's viewpoint. When the value is positive (i.e., the user feels pleasure for the event), the user feels “hope.” On the other hand, when the value is negative (i.e., the user feels displeasure for the event), the user feels “fear.” A code sketch after the figure illustrates this procedure.
Figure 3.5 Procedure to extract the emotions of “Prospect-based”
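A minimal sketch of Figure 3.5 under the same conventions; the names are illustrative, and the aspect labels stand for the case-frame aspects “inference” and “intention.”

# Sketch of the "Prospect-based" procedure of Figure 3.5.

PROSPECTIVE_ASPECTS = {"inference", "intention"}  # "will", "be going to"

def prospect_based(aspects, ev_user, prospective_event_list, event):
    """Return 'hope', 'fear', or None; record prospective events."""
    if not PROSPECTIVE_ASPECTS & set(aspects):
        return None
    prospective_event_list.append(event)  # remembered for Section 3.3.3
    if ev_user > 0:
        return "hope"
    if ev_user < 0:
        return "fear"
    return None

plist = []
print(prospect_based(["intention"], +0.8, plist, "event"))  # hope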
3.3.3 Confirmation
The “Confirmation” group contains “satisfaction,” “relief,” “fears-confirmed,” and “disappointment.” The conditions for these emotions are as follows.
Relief : pleased about an unconfirmed undesirable event
Satisfaction : pleased about a confirmed desirable event
Fears-confirmed : displeased about a confirmed undesirable event
Disappointment : displeased about an unconfirmed desirable event
We can already check whether the event is desirable/undesirable using the EGC method. However, we need a method to check whether the event is a confirmed one or not.
To recognize that an event is confirmed, the event has to have been prospected in advance, and it has to actually happen. To recognize that an event is unconfirmed, the event has to have been prospected in advance, and it has to be confirmed that the event will no longer happen. In order to check these conditions, we consider “whether the event was prospected or not” and “whether the event is confirmed/unconfirmed/unknown.” Prospected events have already been recorded in the prospective event list, as described in Section 3.3.2. We now propose a confirmation method for the prospected event as follows.
First, we inspect events with the past aspect in order to confirm the realization of the prospective events. When there is an event with the same content as one shown before, we consider that “we had predicted the event and it happened.” The effect of the negative aspect is shown in Table 3.6.
Next, we extract the four emotions “satisfaction,” “relief,” “fears-confirmed,” and “disappointment” using the result of the confirmation and the EGC output, based on the conditions for the emotions. Table 3.7 shows the relationship between the result of the confirmation, the EGC output, and the generated emotion.
Table 3.6 Relationship between the affirmative/negative expressions of the prospective and confirmed events

                        Confirmed Event
Prospective Event    Affirmative     Negative
Affirmative          Happened        Not Happened
Negative             Not Happened    Happened
Table 3.7 Generated emotions for the result of confirmation and EGC output

                     Confirmation
EGC Result      Happened          Not Happened
Pleasure        Satisfaction      Disappointment
0               0                 0
Displeasure     Fears-confirmed   Relief
Figure 3.6 shows the procedure to extract the emotions of “Confirmation.” First, whether the event is finished or not is checked according to the existence of the past aspect. Next, the event is retrieved from the Prospective Event List, which accumulates the expected events. When the event exists in the Prospective Event List, we compare the affirmative/negative expression of the input event with that of the expected event based on Table 3.6. Then, the user's emotion for the prospective event is calculated by the EGC. When the event happened and it pleases the user, the user feels satisfaction. The other emotions are extracted in the same way, as shown in Table 3.7. A code sketch after the figure illustrates this procedure.
There is another process to extract emotions about confirmation. If an adverb which suggests a “predicted result” exists in the event, we consider that the event was expected, because we can guess that the event had already been expected even though the user did not inform us about the prospect. On the other hand, we consider that the event was not expected when there is an adverb which suggests an “unpredicted result” in the event.
Some adverbs like YAPPARI (as expected) and ANGAI (unexpectedly) suggest confirmation of whether the prospective event was due to happen or not. The following 17 adverbs suggesting confirmation are taken from the “Present-day adjective usage dictionary” [52].
Predicted Result: SASUGANI (as may be expected), TSUINI, YATTO, YOUYAKU (at last,
finally), ANNOJOU, YAPPARI (as one expected), NANNAKU (without difficulty), NANTOKA
(somehow or other)
Unpredicted Result: IKKOUNI (no progress at all), IGAINI, ANGAI (unexpectedly), KAETTE
(on the contrary), KEKKOU (quite, very well), TSUI, TSUITSUI (unintentionally, unconsciously),
DOUSITEMO (at any cost), NAKANAKA (not easily)
Figure 3.6 Procedure to extract the emotions of “Confirmation”
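A minimal sketch of the polarity matching of Table 3.6 and the allocation of Table 3.7; the names are illustrative, and ev_user stands for the EGC output for the prospective event.

# Sketch of the "Confirmation" procedure of Figure 3.6 and Tables 3.6/3.7.

def confirmation(ev_user, prospected_polarity, confirmed_polarity):
    """Polarities are 'affirmative' or 'negative' (Table 3.6)."""
    happened = prospected_polarity == confirmed_polarity
    if ev_user > 0:
        return "satisfaction" if happened else "disappointment"
    if ev_user < 0:
        return "fears-confirmed" if happened else "relief"
    return None

# A displeasing prospect ("won't be cured", negative polarity) followed by
# the affirmative "was cured" did not happen, so the user feels relief:
print(confirmation(-0.9, "negative", "affirmative"))  # relief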
3.3.4 Well-Being
The emotions in the “Well-Being” group are aroused when the user feels pleasure or displeasure about the event. When the user feels pleasure about an event, the user feels “joy,” and when the user feels displeasure about an event, the displeasure means “distress.” The “Emotion Eliciting Condition Theory” (Table 3.4) suggests that joy is elicited when one is pleased about an event. However, this condition is also implied by other emotions such as happy-for, gloating, hope, satisfaction, and relief, as shown in Table 3.8.
Eliciting joy is judged by applying the EGC output about the event. Furthermore, when the user elicits happy-for, gloating, hope, satisfaction, or relief, the user elicits joy, too, because the eliciting conditions of these emotions also meet the condition of joy. The condition for distress is dealt with in the same way, as shown in Table 3.9. A sketch after Table 3.9 illustrates this derivation.
If an event elicits opposite emotions at the same time, the situation is called a conflict. For example, for an event such as “my son was jilted by a bimbo,” the speaker is sorry for his son but feels relief at the same time. No special process is supplied for a conflict; we simply extract the two opposite emotions.
Table 3.8 Comparison amongst the emotion eliciting conditions relating to the “pleasure” emotion

Emotion          Emotion Eliciting Condition
Joy              Pleased about an event
Happy-for        Pleased about an event desirable for another
Gloating         Pleased about an event undesirable for another
Hope             Pleased about a prospective desirable event
Satisfaction     Pleased about a confirmed desirable event
Relief           Pleased about an unconfirmed undesirable event
Table 3.9 Comparison amongst the emotion eliciting conditions relating to the “displeasure” emotion

Emotion          Emotion Eliciting Condition
Distress         Displeased about an event
Sorry-for        Displeased about an event undesirable for another
Resentment       Displeased about an event desirable for another
Fear             Displeased about a prospective undesirable event
Fears-confirmed  Displeased about a confirmed undesirable event
Disappointment   Displeased about an unconfirmed desirable event
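Following Tables 3.8 and 3.9, a well-being emotion can be derived either directly from the EGC output or from the other groups' results; a minimal sketch with illustrative names follows.

# Sketch of the well-being derivation of Tables 3.8 and 3.9: joy/distress
# follow from the EGC output or from any already-extracted emotion whose
# eliciting condition implies being pleased/displeased about the event.

PLEASANT = {"happy-for", "gloating", "hope", "satisfaction", "relief"}
UNPLEASANT = {"sorry-for", "resentment", "fear", "fears-confirmed",
              "disappointment"}

def well_being(ev_user, other_emotions):
    result = []
    if ev_user > 0 or PLEASANT & set(other_emotions):
        result.append("joy")
    if ev_user < 0 or UNPLEASANT & set(other_emotions):
        result.append("distress")
    return result  # both at once is a conflict, which is left as-is

print(well_being(0.0, ["sorry-for", "relief"]))  # ['joy', 'distress'] (conflict)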
3.3.5 Attribution
The “Attribution” group contains “pride,” “admiration,” “shame,” and “reproach.” The condition for these emotions is “approving/disapproving of one's own/another's action.” We propose methods to judge whether the event is approved or not and who caused the event.
First, we propose a method to judge whether the event is approved or not. The event is approved/disapproved of by one's own judgment. There are various criteria for this judgment based on many factors, like the user's sense of values, experience, living environment, social environment, and so on. However, it is impossible to compare every event with all the moral values whenever the agent recognizes an event, because there are countless different values in the world, and we would need a complex reasoning system to confirm that the event is in keeping with the values in each case. Furthermore, there is no knowledge database which can deal with such complex reasoning.
Therefore, we deal with only one moral value, “An event that gives me pleasure is a good thing,” because it is the simplest and most instinctive moral. “An event that gives me pleasure” is defined as “an event whose EGC result is pleasure.”
Next, we propose a method to check who caused the event. The actor of the event is also needed for detecting the attribution emotions in the “Emotion Eliciting Condition Theory.” We adopt this method only when the verb is transitive, because the concept of the actor is expressed as the subject of such an event. The event types with transitive verbs are types VI to XI.
Then we classify the emotions based on “whether the actor of the event is oneself or not” and “whether the event is pleasure for me,” as shown in Table 3.10.
Figure 3.7 shows the procedure to extract the emotions of “Attribution.” First, whether the predicate of the event is a transitive verb or not is checked. An event with a transitive verb means the situation is “the object has been affected by someone.” Next, the emotion about the event is calculated by the EGC. When the user feels pleasure or displeasure about the event, we pay attention to the actor, i.e., who brought about the event. When the EV of the event is pleasure and the event is caused by the user, the user feels pride; when it is caused by another, the user feels admiration. On the other hand, when the EV of the event is displeasure and it is caused by the user, the user feels shame; when it is caused by another, the user feels reproach. A code sketch after the figure illustrates this classification.
Table 3.10 Relationship between the actor, EGC result, and generated emotion

                        Actor
EGC Result       One's own   Another
Pleasure         Pride       Admiration
0                0           0
Displeasure      Shame       Reproach
Figure 3.7 Procedure to extract the emotions of “Attribution”
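A minimal sketch of Figure 3.7 and Table 3.10 with illustrative names follows.

# Sketch of the "Attribution" procedure: only events with a transitive verb
# are considered; the single moral used is "an event that gives me pleasure
# is a good thing."

def attribution(is_transitive, ev_user, actor_is_user):
    """Return the attribution emotion, or None."""
    if not is_transitive or ev_user == 0:
        return None
    if ev_user > 0:
        return "pride" if actor_is_user else "admiration"
    return "shame" if actor_is_user else "reproach"

print(attribution(True, +0.77, actor_is_user=False))  # admiration (Example 2)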
3.3.6 Well-Being / Attribution
The emotions in the “Well-being/Attribution” group are elicited as compound emotions. There are four emotions in this group: gratitude, anger, gratification, and remorse. As shown in Table 3.11, these emotions are compounded from “Well-being” emotions and “Attribution” emotions based on the “Emotion Eliciting Condition Theory” shown in Table 3.4. Some conflicts appear in this table; however, no special process is supplied for conflicts. A sketch follows Table 3.11.
Table 3.11 Emotion compound rules from “Well-being” emotions and “Attribution” emotions

                              Emotion of Attribution
Emotion of Well-being   Admiration   Reproach   Pride           Shame
Joy                     Gratitude    Conflict   Gratification   Conflict
Distress                Conflict     Anger      Conflict        Remorse
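The compound rules can be written as a small lookup; the following sketch is an illustration only, with hypothetical names.

# Sketch of the compound rules of Table 3.11: a well-being emotion and an
# attribution emotion combine into one of four compound emotions, and the
# remaining pairings are conflicts.

COMPOUND = {
    ("joy", "admiration"): "gratitude",
    ("joy", "pride"): "gratification",
    ("distress", "reproach"): "anger",
    ("distress", "shame"): "remorse",
}

def compound(well_being_emotion, attribution_emotion):
    return COMPOUND.get((well_being_emotion, attribution_emotion), "conflict")

print(compound("joy", "admiration"))  # gratitude (Example 2)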
3.4 Dependency among Emotion Groups
We consider the dependency among the emotion groups, as shown in Figure 3.8, based on the eliciting condition of each emotion.
First, we calculate the user's pleasure/displeasure concerning the event using the EGC method from the user's viewpoint. When the event is prospective, an emotion in the “Prospect-based” group is extracted. When the prospective event is confirmed or unconfirmed, an emotion in the “Confirmation” group is extracted. Furthermore, the EGC is also applied to the event from the other person's viewpoint, and when the other feels pleasure or displeasure, an emotion in the “Fortunes-of-others” group is extracted. The emotions in the “Prospect-based,” “Confirmation,” and “Fortunes-of-others” groups are aroused when the user is pleased or displeased about the event. Therefore, an emotion in the “Well-being” group is extracted when these emotions are extracted or when the user feels pleasure or displeasure about the event. On the other hand, the output of the EGC also indicates the moral value. When a moral is approved/disapproved of by the event, we can extract an emotion in the “Attribution” group.
Finally, an emotion in the “Well-being/Attribution” group is compounded from the “Well-being” and “Attribution” emotions, as shown in Table 3.11.
Figure 3.8 Dependency among emotion groups
3.5 Example of Complicated Emotion Allocating Method
Example 1: “Romeo dates with Juliet.”
Emotion Generating Calculations method
Event: “Romeo dates with Juliet.”
Predicate (P) = “date with” : +0.6
Subject (S) = “Romeo” : +1.0
Object-Mutual (OM) = “Juliet” : +0.9
Event Type: “date with” V(S, OM)
Degree of EV = f(f_S, f_OM, f_P) = f(+1.0, +0.9, +0.6) = 1.47
Distinguish pleasure/displeasure: f(+1.0, +0.9, +0.6) → Area I (Pleasure)
Generated Emotion = (Degree of EV) × (pleasure/displeasure) = (+1.47) × (+1) = +1.47
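The EGC equations themselves are defined in Chapter 2. Purely as a reading aid, the degrees in the worked examples of this section (1.47, 0.77, 0.93) coincide with the Euclidean norm of the three FV magnitudes, so the following Python sketch reproduces the printed figures under that assumption; it is not presented as the dissertation's actual implementation.

import math

def degree_of_ev(fvs):
    """Degree of EV, assuming (only for this sketch) the magnitude of
    the favorite-value vector; this reproduces the worked figures."""
    return math.sqrt(sum(v * v for v in fvs))

print(round(degree_of_ev([1.0, 0.9, 0.6]), 2))  # 1.47 (Example 1)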
Complicated Emotion Allocating Method:
(1) Fortunes-of-others (Section 3.3.1)
(a) Fortunes-of-“Juliet”
Predicate (P) = “date with” : +0.7
Subject (S) = “Romeo” : +0.9
Object-Mutual (OM) = “Juliet (myself)” : +1.0
Event Type: “date with” V(S, OM)
Distinguish pleasure/displeasure: f(f_S, f_OM, f_P) = f(+0.9, +1.0, +0.7) → Area I (Pleasure)
EV of the event from "Juliet's" viewpoint = Pleasure
&
Romeo likes "Juliet"
(FV of "Juliet" from Romeo's viewpoint: +0.9)
Happy for "Juliet" (Table 3.5)
(b) Fortunes-of-“Lord Montague (Romeo’s father)”
Predicate (P) = “date with” : +0.3
Subject (S) = “Romeo” : +0.8
Object-Mutual (OM) = “Juliet” : –0.5
Event Type: “date with” V(S, OM)
Distinguish pleasure/displeasure: f(f_S, f_OM, f_P) = f(+0.8, -0.5, +0.3) → Area IV (Displeasure)
EV of the event from "Lord Montague's" viewpoint = Displeasure
&
Romeo likes "Lord Montague"
(FV of "Lord Montague" from Romeo's viewpoint: +0.5)
Sorry for "Lord Montague" (Table 3.5)
(2) Well-being (Section 3.3.4)
(a) "Happy for Juliet" is generated by the event
Joy about the event (Table 3.8)
(b) "Sorry for Lord Montague" is generated by the event
Distress about the event (Table 3.8)
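The "Fortunes-of-others" decisions used in these examples can be sketched as follows. Table 3.5 itself is not reproduced in this section, so the "resentment" branch (the other's pleasure combined with a disliked other) is our assumption based on the emotion list of Table 3.15; the other three branches and the FV = 0.0 case follow the worked examples of this section (including Example 2 below).

def fortunes_of_others(ev_other: float, fv_other: float):
    """ "Fortunes-of-others" rule as used in the worked examples:
    the other's pleasure/displeasure (EGC from the other's viewpoint)
    is combined with the FV held toward that other. FV = 0.0 yields
    no emotion, as in the "a mother" case of Example 2."""
    if fv_other == 0.0 or ev_other == 0.0:
        return None
    if ev_other > 0:  # the other feels pleasure
        return "happy-for" if fv_other > 0 else "resentment"
    return "sorry-for" if fv_other > 0 else "gloating"

print(fortunes_of_others(+1.0, +0.9))  # happy-for  (Juliet)
print(fortunes_of_others(-1.0, +0.5))  # sorry-for  (Lord Montague)
print(fortunes_of_others(-1.0, -0.5))  # gloating   (her noisy child)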
Example 2: “Yesterday, a mother scolded her noisy child.”
Emotion Generating Calculations (EGC) method
Event: “Yesterday, a mother scolded her noisy child.”
Predicate (P) = “scold” : –0.3
Subject (S) = “a mother” : 0.0
Object (O) = “her noisy child” : –0.5
Event Type: “scold” V(S, O)
Degree of EV = f(0.5, -0.5, -0.3) = 0.77
Distinguish pleasure/displeasure: f(0.5, -0.5, -0.3) → Area VI (Pleasure)
Generated Emotion = (Degree of EV) × (pleasure/displeasure) = (+0.77) × (+1) = +0.77
Complicated Emotion Allocating Method:
(1) Fortunes-of-others (Section 3.3.1)
(a) Fortunes-of-“a mother”
Predicate (P) = “scold” : –0.4
Subject (S) = “a mother (myself)” : +1.0
Object (O) = “her noisy child” : +0.8
Event Type: “scold” V(S, O)
Distinguish pleasure/displeasure: f(f_S, f_O, f_P) = f(+1.0, +0.8, -0.4) → Area V (Displeasure)
EV of the event from “a mother’s” viewpoint = Displeasure
&
The user does not have any impression of "a mother"
(FV of "a mother" from the user's viewpoint: 0.0)
No emotion (Table 3.5)
(b) Fortunes-of-“her noisy child”
Predicate (P) = “scold” : –0.3
Subject (S) = “a mother” : +0.8
Object (O) = “her noisy child (myself)” : +1.0
Event Type: “scold” V(S, O)
Distinguish pleasure/displeasure: f(f_S, f_O, f_P) = f(+0.8, +1.0, -0.3) → Area V (Displeasure)
EV of the event from “her noisy child’s” viewpoint = Displeasure
&
The user dislikes "her noisy child"
(FV of "her noisy child" from the user's viewpoint: -0.5)
Gloating over "her noisy child" (Table 3.5)
(2) Well-being (Section 3.3.4)
Gloating is generated by the event
Joy about the event (Table 3.8)
(3) Attribution (Section 3.3.5)
The predicate of the event (scold) is a transitive verb.
&
EV about the event from the user’s viewpoint = Pleasure
Admiration for "a mother's" act (Table 3.10)
(4) Well-being / Attribution (Section 3.3.6)
Joy is generated about the event
&
Admiration is generated about the event
Gratitude to "a mother" (Table 3.11)
Example 3: “My friend, suffering a disease, was cured.”
(The event “My friend, suffering a disease won’t be cured” was recognized before.)
Emotion Generating Calculations (EGC) method
Event: "My friend, suffering a disease, was cured."
Predicate (P) = “be cured” : +0.6
Subject (S) = “my friend, suffering a disease” : +0.5
Event Type: “be cured” V(S)
Degree of EV = f(0.5, 0.5, 0.6) = 0.93
Distinguish pleasure/displeasure: f(0.5, 0.5, 0.6) → Area I (Pleasure)
Generated Emotion = (Degree of EV) × (pleasure/displeasure) = (+0.93) × (+1) = +0.93
Complicated Emotion Allocating Method:
(1) Fortunes-of-others
(a) Fortunes-of-“my friend”
Predicate (P) = “be cured” : +0.4
Subject (S) = “my friend, suffering a disease (myself)” : +1.0
Event Type: "be cured" V(S)
Distinguish pleasure/displeasure: f(1.0, 0.5, 0.4) → Area I (Pleasure)
EV of the event from “my friend’s” viewpoint = Pleasure
&
The user likes "my friend"
(FV of “my friend” from the user’s viewpoint: +0.5)
Happy-for "my friend" (Table 3.5)
(2) Confirmation (Section 3.3.3)
The prospective event was "My friend, suffering a disease, won't be cured."
Event: "My friend, suffering a disease, won't be cured."
Predicate (P) = “not be cured” : –0.6
Subject (S) = “my friend, suffering a disease” : +0.5
Event Type: “be cured” V(S)
Distinguish pleasure/displeasure: f(0.5, 0.5, -0.6) → Area V (Displeasure)
The prospective event is displeasure for the user.
The prospective event did not happen. (Table 3.6)
Relief about the prospective event (Table 3.7)
(3) Well-being
Happy-for and Relief are generated about the event
Joy about the event (Table 3.8)
3.6 Experimental Results
In this section, the adequacy of the emotions generated by our proposed method is reviewed by comparing the emotions generated by the system with the results of a questionnaire. First, we applied our method to a dialogue corpus and extracted 30 sentences from the corpus. Then, we asked 15 university students which emotions were aroused by the content of the 30 sentences, as shown in Table 3.12.
3.6.1 Experimentation 1
In this experimentation, we evaluated whether the system generates emotions similar to those aroused in the subjects. First, we showed the 30 sentences to 15 subjects, and they selected adequate emotions from the 20 emotions that our system can generate. Table 3.13 shows the comparative results between the system output and the subjects' answers. The system extracted all the emotions that all subjects selected, and it extracted 75% of the emotions that most of the subjects (80%) selected.
We consider two reasons why the system could not extract some common emotions.
First, some FVs of predicates are inadequate. We define the FVs of predicates based on
whether the predicate means approach or avoidance. However, when the system analyzes the event "I don't know the reason for my disease," the subject is "I," the object is "the reason for my disease," the predicate is "not know," and the event type is V(S, O). We give the predicate "know" a positive image because "know" means "gain some knowledge," so the system generated "pleasure" about the event. However, the subjects' aroused emotions were negative ones like "fear (86.7%)," "distress (73.3%)," and so on. There is no doubt that "the reason for my disease" is not preferable, but we also guess further disadvantages like "the disease won't be cured" if the reason is unclear. We should define the FVs of predicates considering such situations, too. We have to add a reasoning system to solve this problem completely.
Predicate (P) = "not know" : -0.4
Subject (S) = "I" : +1.0
Object (O) = "the reason for my disease" : -0.3
Event Type: "not know" V(S, O)
Distinguish pleasure/displeasure: f(f_S, f_O, f_P) = f(+1.0, -0.3, -0.4) → Area VIII (Pleasure)
Table 3.12 Sample sentences for experimentation
1. An old people's home won't accept me.
2. A player for the Giants hits a slump.
3. I can't withdraw my savings at the bank.
4. I'm bored.
5. My grandson will be born next month.
6. Unexpectedly, my daughter didn't come to meet me.
7. I live with my family.
8. I hurt my grandson.
9. I always relax at home.
10. My friend has an incurable disease.
11. At last, my friend's disease was incurable.
12. My friend recovered from his disease.
13. My son will come home tomorrow.
14. My son didn't come home.
15. My son came home this evening.
16. As I expected, the Carp lost to the Swallows.
17. I don't know the reason for my disease.
18. My family manages my money instead of me.
19. I remember all of the "one hundred poems" cards.
20. Motorcycle gang members haven't been arrested by the police.
21. I have a family G.P.
22. I'm suffering from kidney failure.
23. My friend was injured.
24. The Carp often loses to the Giants.
25. I'm terribly forgetful.
26. My daughter takes trouble to check my change.
27. I can make a lot of household objects.
28. My friend didn't take me to a hot spring at that time.
29. Yesterday, the house owner argued with my noisy neighbor.
30. I often give advice to my family.

Table 3.13 Reappearance Rate
Agreement rate (over *%)    100     90     80     70     60
Selected number             2       4      20     29     47
Reappearance number         2       3      15     20     31
Reappearance rate (%)       100.0   75.0   75.0   69.0   66.0
The second problem is extracting tacit intentions from "aspects." Some "aspects" imply "hope," "possibility," and so on. For example, "DEKINAI (cannot)" implies "I want to do something, but it has been prevented," and "SHITEKURERU (take trouble to do)" implies "someone took the trouble to help me." In this experimentation, the subjects aroused "gratitude (66.7%)" about the event "my daughter takes trouble to check my change," and "distress (53.3%)" about the event "I can't withdraw my savings at the bank." We have to gather such implicit expressions by analyzing the corpus.
3.6.2 Experimentation 2
In this experimentation, how adequately the system generates emotions is reviewed. We showed the emotions generated by the system to 15 subjects, and they rated the adequacy of the emotions on a five-grade scale: "TOTEMO-DATOU (exactly)," "YAYA-DATOU (adequate)," "DOCHIRADEMONAI (so-so)," "YAYA-FUTEKISETSU (inadequate)," and "TOTEMO-FUTEKISETSU (wrong)." Then, we assigned the values 1.0, 0.75, 0.5, 0.25 and 0.0 to their answer patterns. We considered the adequacy rate as the average of the answer values.

Adequacy Rate = Σ (subject's answer value) / (the number of subjects)
Table 3.14 shows the result. Half (47.0%) of the emotions generated by the system were evaluated as exactly correct because their adequacy rates were over 0.8. Furthermore, most (86.4%) of the generated emotions were evaluated as relatively correct because their adequacy rates were over 0.5.

Table 3.14 Adequacy Rate
Adequacy Rate (over *)                       0.8     0.6     0.5
Agreed Rate in the evaluated emotions (%)    47.0    75.8    86.4
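As a small illustration of the adequacy rate defined above, the following sketch averages hypothetical answer values; the English grade labels and the sample data are invented for the example.

def adequacy_rate(answers):
    """Adequacy rate: mean of the subjects' answer values, where the
    five grades map to 1.0, 0.75, 0.5, 0.25 and 0.0."""
    grade_value = {"exactly": 1.0, "adequate": 0.75, "so-so": 0.5,
                   "inadequate": 0.25, "wrong": 0.0}
    values = [grade_value[a] for a in answers]
    return sum(values) / len(values)

# e.g. ten subjects rating one generated emotion (illustrative data)
print(adequacy_rate(["exactly"] * 6 + ["adequate"] * 3 + ["so-so"]))  # 0.875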
We consider two reasons why some of the generated emotions were evaluated as inadequate.
The first problem concerns the dependency of emotions. In our method, the system always generates "joy" when the EGC outputs "pleasure." However, there are some exceptions in this experimentation. For example, our method generates "gratitude," "joy," and so on about the event "My family manages my money instead of me." The subjects agreed with "gratitude (68.3%)," but they did not agree with "joy (33.3%)," which was derived from "gratitude." We are re-investigating the conditions of "well-being" by analyzing such paradoxical responses.
The next problem is the relationship among competing emotions. For the event "my friend was hurt," the system generated not only "sorry-for" but also "reproach" and "anger." The latter emotions are caused by the displeasure about the situation "my friend's act upset me." Most of the subjects agreed with "sorry-for (98.3%)"; however, only a few subjects agreed with "reproach (26.7%)" and "anger (23.3%)." We have to consider that the tendency "some people do not arouse aggressive emotions against a person that they feel sorry for" causes such situations. We will investigate these relationships among competing emotions.
3.7 Future Works
Using our method, we can extract emotions from sentence-form expressions. However, we can guess the speaker's emotion not only from the sentence but also from some words like "BANZAI," "ARIGATOU (Thank you)," "GOMEN (I'm sorry)," "CHIKUSHOU (damn)," and so on. Furthermore, we do not always deal with complete sentences, because the hearer occasionally mishears a few words of the utterance and the speaker sometimes omits some words. Therefore, we focus on extracting emotion from words that carry an emotional feeling. This method picks up small emotional expressions and complements the EGC method.
We found many affective words by retrieving emotion words in the "EDR Japanese Word Dictionary [62]," as shown in Table 3.15.
When there are affective words in the event, the degree of the corresponding emotion is increased a little. This method is easy and effective; however, it cannot handle the negative aspect. Thus, we give restrained values to this method.
By adding this method to the EGC method, we can extract not only the former 20 emotions but also the emotions of "Attraction" and "Attraction/Attribution" synthetically.
Table 3.15 Affective words for each emotion
Emotion Examples of affective words
Joy ASOBI (play, amusement), SHOUMI (relish), AIKAN (joys and sorrows)
Distress ITAMASHII (painful), KUNOU (suffering), SHIREN (trial)
Happy-for ANTAI (peace, welfare), IWAU (congratulate), KANSEI (cheer)
Gloating ETSUBO (act of gloating), KOKIMIYOI (smart, neat)
Resentment FUKUSHUUSURU (revenge), ENKON (bitter feeling)
Sorry-for KAIKON (remorse, regret), MOUSHIWAKENAI (I’m very sorry)
Hope INORU (pray), KOUMYOU (gleam of hope), KIBOU (hope)
Fear AKUMU (nightmare), AWADATSU (get goosebumps)
Relief HOTTO (feel relieved), IKITSUKU (catch the breath)
Satisfaction AKIRU (have enough), YOI (good), ENMAN (perfect, harmonious)
Fears-confirmed —
Disappointment MUNASHII (empty, ineffectual), GAKKARI (be discouraged)
Pride OHON (cough with pride), IKIYOUYOU (be in high spirits)
Admiration APPARE (bravo, well done), KANSHIN (admire, be deeply impressed)
Shame AKAPPAJI (shame, disgrace), OTEN (stain, blot)
Reproach URAMESHII (reproachful, resentful), SAINAMU (reproach, torment)
Liking SUKI (like), NINKI (popularity), YOROSHIKU (Give my regards to)
Disliking MUKATSUKU (be irritated, get angry), GUUTARA (lazybones)
Gratitude ARIGATOU (thank you), OKURIMONO (gift), KANSHA (gratitude)
Anger KATTO (fly into a rage), IKIDOORASU (be angry, resent)
Gratification SIYOKU (gratification of desire)
Remorse —
Love AISURU (love), ADOKENAI (artless, innocent), ATATAKAI (warmhearted)
Hate URAMI (bitter feeling, grudge), IMAWASHII (disgusting, detestable)
3.8 Conclusion
In this chapter, we presented a method to classify the simple emotion (pleasure/displeasure) generated by the EGC method into 20 various emotions based on Elliott's "Emotion Eliciting Condition Theory." Elliott's theory requires judging conditions such as "feeling for another," "prospect and confirmation," and "approval/disapproval." We defined the rules to check the condition of "feeling for another" based on the EGC's result using another's FVs. We can judge "prospect and confirmation" by extracting some aspects and adverbs. "Approval/disapproval" is judged by the event's case frame structure with the transitive verb.
To verify the effectiveness of the proposed method, we compared the generated emotions with the emotions aroused in humans. As a result, 75% of the emotions that most (80%) of the subjects aroused were reproduced by this method, and half of the emotions generated by the system were evaluated as exactly correct because their adequacy rates were over 0.8. Furthermore, most (86.4%) of the generated emotions were evaluated as relatively correct because their adequacy rates were over 0.5.
The adequacy of our proposed method for generating various emotions was thus confirmed by questionnaire. However, there are many more factors than those we adopted for connecting events with emotions. We are investigating emotion generating rules by clustering the relationship between emotions and events using a tree structure. We have to compare the result with our proposed method and unite them.
Next, in this study, we do not supply any special processes for conflicts, for example, when "distress" and "admiration" are aroused for an event at the same time, because there are various reactions against such conflicts. We are going to investigate these reactions and their conditions based on psychology, and realize these processes.
CHAPTER 4 ANALYSIS OF AFFIRMATIVE/NEGATIVE INTENTIONS FROM USER'S ANSWERS TO YES-NO QUESTIONS
To achieve natural communication between a human and a computer, we presented a method to calculate a user's emotion from the content of the user's utterances in Chapters 2 and 3. However, when the computer feels and expresses emotions, people will use unrestrained expressions toward the computer, just as they do toward a person. Furthermore, people occasionally use ambiguous expressions when their language may cause the hearer's or their own displeasure, or when they do not have clear intentions. Even in such situations, the natural language dialogue system has to recognize the user's tacit intention from the expressions.
In this chapter, we propose a method to analyze the user's affirmative/negative intention from his/her utterances in the dialogue [30, 31, 32]. First, we extract some elements based on the surface structures of the responses and a concept of the question, and calculate a value (affirmative/negative value) corresponding to the degree of affirmation/negation. There are three types of elements: "affirmative/negative description for the yes-no question," "direct expression of intention in the response," and "indirect expression of intention in the response." The affirmative values are defined based on questionnaires. Furthermore, a function that calculates changes in the affirmation value is defined according to the aspects of the verb. Finally, the total affirmative/negative intention toward a question in the dialogue is calculated.
We applied this method to the "Web-based analytical system of health service." This system asks questions to the user and analyzes the user's intention through conversation.
To verify the validity of our proposed method, we apply it to 50 responses in a dialogue corpus, which was obtained by conversing with five subjects, and evaluate the results.
4.1 Intention Analyzing Method from the Utterance
4.1.1 Understanding Intentions of the Indirect Speech-Act in Natural Language Interfaces
Mima et al. proposed a method for understanding the intention of indirect speech-acts in a natural language interface for the operation of a computer system [29]. This method detects the user's demand for commands by converting the surface concept of the user's input sentence based on its intention.
First, the surface concept and the intention of the input sentence are extracted based on the result of the morphological analysis. This method deals with five kinds of intention: "REFUSAL," "REVERSAL," "RESTRICTION," "BENEFIT," and "DISABILITY." In order to decide the intention, they use many morphological and syntactic rules, as shown in Table 4.1. These rules are
specialized for Japanese grammar. Next, the system anticipates the user's demand using the knowledge representation for concepts of operations (Figure 4.1) and the intention links on the operation knowledge base (Figure 4.2).
As an example of the process used by this method, we consider "I don't want to show the file named letter." First, a surface concept "show the file named letter" and an intention "REFUSAL" are extracted. Successively, the demand is anticipated using the concept, the intention, and the intention link. Then, the system anticipates the demand as "request to forbid reading the file named letter."
Table 4.1 Morphological and syntactic rules for deciding the intentions
<*REFUSAL> ::= <N><relationship expression1><predicative expression1>
  | <N><relationship expression1><V><passive><predicative expression1>
  | <V><predicative expression1><N><verb that means existence>
  | <V><passive><predicative expression1><N><verb that means existence>
<*REVERSAL> ::= <N><relationship expression1><V><predicative expression2>
  | ...
<relationship expression1> ::= <subject> | <starting point> | <degree1> | <repeat> | <term> | ...
<verb that means existence> ::= <GA ARU> | <GA ARIMASU> | <GA SONZAI SURU> | <GA SONZAI SHIMASU>
...
Figure 4.1 Illustrations of knowledge representations for concepts of operations: (a) <LOAD>, connected by "plan" links to <LOOK>, <READ>, and <DISPLAY> (MIRU, KAKUNIN, YOMIKOMU, YOMU, SIMESU, HYOUJI); (b) <RESTORE>, connected to <UNDELETE>, <STORE>, <STORE-PAGE_NUM>, and <UNDELETE-FILE> (FUKKI, FUKKATSU, TSUKERU).
Figure 4.2 Illustrations of the intention links on the operation knowledge base: operations such as <DELETE>, <CREATE>, <RESTORE>, <TRANSFER>, <FORBID-WRITE>, <FORBID-READ>, <LOAD>, and <PRINT> are connected by "REFUSAL" and "REVERSAL" links.
4.1.2 Recognizing User Communicative Intention in a Dialogue-Based Consultant System
Kumamoto et al. proposed a method to recognize a user's communicative intention (CI) from natural language dialogue in order to support the usage of a computer [63]. They consider eight CI types: "method," "attribute value," "concept," "yes-no value," "goal," "belief," "end of dialogue," and "start of dialogue."
Figure 4.3 shows the process of communicative intention recognition. First, the system extracts some features to determine the CI type according to the result of the morphological analysis of the input sentence. There are four types of features: "function word," "information about parts of speech," "conjugate information," and "original form information." Table 4.2 is the list of the function words. Next, the CI type is determined by pattern matching between the extracted features and the "pattern-CI type translation table," as shown in Table 4.3. Successively, the system generates the action frame of the sentence and outputs the CI description by combining the CI type and the generated action frame.
Figure 4.3 Flow of communicative intention recognition: input of user utterance sentence → morphological analysis → feature extraction → CI type determination and frame generation → output of CI description.
Table 4.2 List of function words
No.  Type of function word        Examples
1    Dialogue starting signal     SUMIMASEN, ANO
2    Dialogue ending signal       WAKARIMASITA
3    HOW phrase                   DOUSURU, DOUYARU, DOU
4    Predicate about teaching     OSHIERU
5    Predicate about benefit      MORAU, KURERU
6    Predicate about wish         HOSHII
7    Interrogative pronoun        DARE, DOKO (without NANI)
8    NANI                         NANI
:    :                            :
19   Predicate about knowledge    SHIRU, OBOERU, WAKARU
Table 4.3 Pattern-CI type translation table
(((Dialogue starting signal)) Starting dialogue)
(((Dialogue ending signal)) Ending dialogue)
(((HOW phrase)) Method)
(((Predicate about teaching)) Attribute value)
(((Predicate about benefit)) Goal)
(((Predicate about wish)) Goal)
(((Interrogative pronoun)(OK type predicate)) Attribute value: OK)
(((NANI MO)(OK type predicate)) Yes-no value: OK)
: :
(((end of sentence)) Belief)
As an example of the process used by this method, we consider "SHUURYOU WO OSEBA IIN DESUNE? (It is OK if I just click the "SHUURYOU" button, isn't it?)" First, the system extracts (SA-HEN type noun, predicate about movement, condition form, OK type predicate, question type particle, end of sentence) as the features to determine the CI type, and the CI type "yes-no value: OK" is selected according to the "pattern-CI type translation table." Next, the action frame "click the "SHUURYOU" button" is generated at the frame generation stage. Therefore, the CI description (yes-no value: OK, "click the "SHUURYOU" button") is obtained.
4.2 An Overview of the Affirmative/Negative Intention Analyzing Method
4.2.1 Web-based Analytical System of Health Service Needs among Healthy Elderly
The increase of elderly people creates not only a need for medical care but also a need for health services. Japanese society has already built up medical insurance and medical care systems. However, the health service does not cater for the "healthy elderly." Yoshida proposed the "web-based analytical system of health service needs of the elderly" [64]. The basic questionnaire consists of 50 items. These items are related to QOL (Quality of Life), life-satisfaction, life-style, mental stress, and social concern. The subject answers these questionnaire sheets on the homepage, and the answers are sent to the web server, as shown in Figure 4.4(a). Subsequently, the system checks whether the questionnaires fulfill all the conditions, and they are stored in a queue for reasoning. Based on such calculations, the analytical results and a health counseling comment are presented to the subject's browser, as shown in Figure 4.4(b).
The system was developed to analyze the population-based health service needs for the official health center. A health service for elderly people will be offered based on these results. The system is expected to classify the health service needs of the elderly in his/her home.
Although personal computer diffusion in Japan reaches about 30% according to the statistical data of the Economic Planning Agency of Japan, quite a few people feel difficulty using a computer, especially among the elderly. For example, they tire of answering 50 questions, gazing at a display for a long time, and reacting to the monotonous computer system. The problems are caused by the distance between conversation and the usability of the computer. That is, they hope for a computer equipped with human-like interfaces that enables easy conversation, including greetings and so on. Besides verbal messages, human face-to-face communication usually involves nonverbal messages such as facial expressions, vocal inflection, speaking rate, pauses, hand gestures, body movements, posture, and so on.
To improve this weak point of the developed "web-based health service system," we propose a method to analyze the user's tacit intention even if the user replies with an ambiguous response, and to reply with natural responses for comfortable conversation.
4.2.2 Affirmative/Negative Intention Analyzing Method for the Web-based Analytical System of Health Service
The system asks the user 50 questions about QOL (quality of life) and displays a counseling comment about the user's health, as shown in Figure 4.4. All the questions are yes-no questions that the user can answer with "yes" or "no," and the questions are fixed. However, people cannot always determine their intentions clearly, and their intentions occasionally become clear through the conversation.
We propose a method to analyze a user's affirmative/negative intention from responses to the questions asked on the Web [30, 31, 32].
Figure 4.5 shows the process to analyze the user's intention. First, the system asks one of the prepared yes-no questions, and the user responds with an answer using a natural language expression. Next, the response is analyzed morphologically using ChaSen [65], a Japanese morphological analyzer. Then, the system extracts some words that imply the affirmative/negative intention of the user (affirmative/negative elements) from the output of the morphological analysis, based on the concept data of the question. We define three types of affirmative/negative elements: affirmative/negative description of the yes-no question, direct expression of intention in the answer, and indirect expression of intention in the answer, as shown in Section 4.3. Successively, the system calculates the affirmation values. All the affirmative/negative elements have affirmative values that indicate the degree of affirmation/negation. However, some adverbs and modalities affect the affirmative values of the verbs. We show their calculation methods in Sections 4.4, 4.5 and 4.6. Finally, the system calculates the total affirmative intention of the user from the extracted affirmative/negative elements. The dialogue about the question continues until the system obtains enough affirmative/negative elements to guess the user's intention, or until the system judges that the same topic has continued too long.
We limit the scope of this study to Japanese, because this method is applied to the interface in the "Web-based analytical system of the Health Service for the Elderly [64]."
Figure 4.5 Overview of the intention extracting method: the question and the user's response undergo morphological analysis and parsing; affirmative/negative elements are extracted using the question concept database; the affirmative values of the elements are calculated, taking modality and adverbs into account; and the total affirmative intention of the subject is calculated from the utterance intention list.
4.3 Affirmative/Negative Element
The speaker's intention about a yes-no question is guessed by extracting affirmative/negative elements from the conversation. We categorize the affirmative/negative elements into three types based on their relationship with the content of the question, as follows:
- Affirmative/negative description for the yes-no question
- Direct expression of intention in the response
  - Affirmative/negative expression using the main verb in the question
  - Affirmative/negative expression using the auxiliary verb in the question
- Indirect expression of intention in the response
  - Indirect information addition
  - Non-standard reason addition
4.3.1 Affirmative/Negative Description for the Yes-No Question
The following example is a dialogue that includes an affirmative/negative element of the "affirmative/negative description for the yes-no question" type. This question has been used in the "Web-based analytical system of the Health Service for the Elderly." The questions in Sections 4.3.2 and 4.3.3 have been used in the web-based analytical system, too.

Q1: Do you sometimes become nervous?
A1: Yes, I do.

We limit the speaker's intention to "Yes" or "No," because our purpose is to guess whether the speaker agrees or disagrees with the content of the yes-no question. The most standard response to the question is "Yes" or "No." Furthermore, "That's right" and "It's wrong" are also used as responses to the question.
In this paper, we use the following interjections, as described in "Basic Japanese grammar [51]," as affirmative/negative elements.

Affirmative interjections: HAI, EE, AA, UN, HAA, SOUDA
Negative interjections: IIE, IYA, CHIGAU

When two or more interjections appear in a sentence, we use the first interjection of the sentence, because the other interjections, which appear in the middle of the sentence, sometimes indicate the opposite intention to the first interjection. In Japanese, answers to negative questions like "Can't you…?" sometimes cause confusion. However, the system does not supply such negative questions.
4.3.2 Direct Expression of Intention in the Response

Q2: Do you lose your way?
A2: Yes. I often lose my way.
Q3: Can you fill out the pension forms?
A3: Of course, I can.

The user expresses his/her affirmative/negative intention using the verb in the question. Furthermore, some auxiliary verbs, which indicate capability and continuation, occasionally represent affirmation/negation. We consider that the sentence has an affirmative intention when there is a verb or an auxiliary verb that appears in the question and it does not have a negative aspect. We predefine the following verbs and auxiliary verbs as affirmative/negative elements, because the system always asks the same 50 questions and their expressions are always the same.

Main verbs: write, read, lose, think, feel, talk, guide, have, interest, satisfy, inconvenience, socialize, live, remember, calculate, hang, help, join, take, eat, walk, endeavor, smoke, desire
Auxiliary verbs: can, have
4.3.3 Indirect Expression of Intention in the Response

Q4: Can you fill out the pension forms?
A4: They aren't so difficult for me.
Q5: Can you fill out the pension forms?
A5: My daughter guides me.

People occasionally do not want to reply clearly: when they feel coy about the content of the response, when they have not settled their intentions yet, when the response may cause displeasure to the hearer, and so on. In such situations, they use indirect expressions that make the hearer guess the intention. Yamada classified indirect responses into 12 categories, as shown in Table 4.4 [33].
We limit the extracted intentions to those that let us guess the speaker's affirmation/negation, that is, "Indirect information addition" and "Non-standard reason addition."
A reasoning process is needed to guess the speaker's purpose, intention, demand, and so on from his/her surrounding situation. However, this is too difficult to judge from a natural language utterance.
Therefore, we define some verbs that are likely to be used in the indirect representations for each question, and we consider them as affirmative elements. These affirmative elements are selected based on the question-answer dialogue corpus. They are similar/same words and upper/lower concepts of the question's verb, attributes of the question's context, and so on. For example, we select the following verbs as affirmative/negative elements for the question "Can you fill out the pension forms?"

Sentence: Can you fill out the pension forms?
Affirmative matter: write, easy, move
Negative matter: ask, tired, hard
Table 4.4 Classification of cooperative responses [33]
1. Precondition responses
▪Correction precondition When the precondition about the question’s intention is
wrong, the response corrects it.
▪Notice precondition The response notices the precondition of the question’s
intention even if it is not satisfied.
▪Confirmation precondition The respondent sometimes asks about the precondition
when he/she does not know whether it is satisfied or not.
2. Providing information responses
▪ Providing wanted information Provide additional information about what the speaker
wants by guessing his/her tacit intention in the question.
▪ Providing Supportive
information
Provide supportive information to accomplish the
speaker’s purpose.
▪ Providing indirect information When the respondent does not know the answer but
he/she has incomplete information to guess the answer, it
is provided.
▪ Providing substitute information When the responses are negative, the substitute
information is provided to accomplish the purpose.
3. Providing reason responses
▪ Providing contrary expectation
reasons
Provide a reason why the respondent cannot meet the
speaker’s expectation of the question.
▪ Providing a non-standard reason Provide the reason for the response, when it is contrary to
general reasoning.
▪ Providing a standard reason Provide the reason for the response even though there is a reason that implies a non-standard one.
4. Counter questions to the question
▪Cooperating question Asking for more information to achieve a purpose.
▪Intention correcting question Ask the real purpose of the question when the respondent
cannot guess the speaker’s intention.
4.3.4 Data Structure Description in the Question
We can apply the "affirmative/negative description for the yes-no question" to all questions. However, the "direct/indirect expression of intention" depends on the expression of the question. Therefore, the affirmative/negative elements for the "direct/indirect expression of intention" have to be predefined for each question.
Each question has three kinds of data frames: direct answer verb, affirmative matter, and negative matter. The direct answer verb is defined from the main verb and the auxiliary verb in the question, as shown in Section 4.3.2. On the other hand, the indirect answer verbs are classified into "affirmative matter" and "negative matter," and they are extracted from the corpus for each question, as shown in Section 4.3.3. The following example is the data structure description of the question "Can you fill out the pension forms?"
Sentence: Can you fill out the pension forms?
Direct answer verb: can fill out, can
Affirmative matter: write, do, easy, move
Negative matter: ask, leave, tired, hard
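The three data frames can be pictured as a small record. The class and field names in this Python sketch are hypothetical, since the dissertation specifies the frames but not a concrete implementation.

from dataclasses import dataclass
from typing import List

@dataclass
class QuestionData:
    """Data frames attached to each yes-no question (Section 4.3.4)."""
    sentence: str
    direct_answer_verbs: List[str]
    affirmative_matter: List[str]
    negative_matter: List[str]

pension_form = QuestionData(
    sentence="Can you fill out the pension forms?",
    direct_answer_verbs=["can fill out", "can"],
    affirmative_matter=["write", "do", "easy", "move"],
    negative_matter=["ask", "leave", "tired", "hard"],
)
print(pension_form.direct_answer_verbs)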
4.4 Affirmation Value
In order to calculate the user's affirmative degree toward the question, we prepare an affirmative value for each affirmative element. We define the affirmative values in the range [0.0, 1.0], where 1.0 means the strongest affirmation and 0.0 means the strongest negation; 0.5 means neither affirmation nor negation. We defined the affirmative values of the interjections and verbs from the results of a questionnaire, an example of which is shown in Figure 4.6.
4.4.1 Affirmative Value of the Interjection
We investigated the affirmative degrees of the typical Japanese interjections HAI (yes) and IIE (no) using an 11-grade evaluation questionnaire completed by 14 university students (male: 10, female: 4). The degrees are marked on a number line. Figure 4.6 shows an example of the questionnaire.
According to the results of the questionnaire, all subjects answered with an affirmative value toward HAI over 0.8 and a value toward IIE under 0.2. We then defined the averages of the answers as their affirmative values: the value of HAI is 0.94, and the value of IIE is 0.06. We did not give the absolute values 1.0 and 0.0 to them, because when Japanese people say "yes" it sometimes includes a little "no," and "no" sometimes includes a little "yes" [66].
We give the other affirmative interjections, like EE and UN in Section 4.3.1, the same affirmation value 0.94, and the other negative interjections, like IYA and CHIGAU in the same section, the affirmation value 0.06. We call these pairs of interjections and their affirmative values "interjection data."
Q: Can you fill out the pension forms?
A1: Yes.
A2: No.
Affirmative degree for A1: (marked on a number line from absolute negation, through neither affirmation nor negation, to absolute affirmation)
Affirmative degree for A2: (marked on the same number line)

Figure 4.6 Example of the questionnaire

4.4.2 Affirmative Value of the Verb
We investigated the affirmative degrees of the direct expressions (KAKERU (can fill out), DEKIRU (can)) and the indirect expressions (KAKU (write), YARU (do), TANOMU (ask), MAKASERU (leave)) for the question "Can you fill out the pension forms?" using an 11-grade evaluation questionnaire completed by the same 14 university students (male: 10, female: 4) as in Section 4.4.1.
We defined the value of the verbs of the direct expressions as 0.91, the value of the verbs of the indirect affirmative expressions as 0.82, and the value of the verbs of the indirect negative expressions as 0.25, based on the averages of the questionnaire's results.
4.5 Affirmative Value Changing Scale
4.5.1 "Affirmative Value Changing Scale" Affected by the Adverb
Adverbs modify verbs, and the modification affects the affirmative degrees of the verbs. We define the degree of the effect that changes the affirmative degrees of the verbs and adjectives as the "affirmative value changing scale."
4.5.1.1 Classification of the adverb
Adverbs are classified into many types; there are mainly four: the "state adverb," which
indicates the state of an action; the "degree adverb," which shows the degree of a statement, feeling, or change; the "quantity adverb," which shows the quantity of the objects and people related to the action; and the "tense and aspect adverb," which shows the time, occurrence, frequency and development of the event. We consider only the "degree adverbs" because they affect the affirmative degree of the modified verb.
In order to define the “affirmative value changing scale” of the adverbs, we classified both the
affirmative adverbs and the negative adverbs into three groups based on “DAIJIRIN [67]” and
“Adverbs’ meaning and usage [68]” as shown in Table 4.5.
Table 4.5 Classification of the degree adverbs [67, 68]
Group A: HIJOUNI, TOTEMO, ZUIBUN (greatly, extremely), KEKKOU, KANARI, DAIBU (quite, very well), TOTEMO (very, really), JUUBUN, YOKU (enough, sufficiently), SOUTOU (considerable)
Group B: WARITO, WARIAI, WARINI (comparatively, relatively), MAAMAA (moderate, so-so)
Group C: SUKOSHI (slightly), CHOTTO, SHOUSHOU, TASHOU (a little), IKURAKA (somewhat)
Group D: (no modification by an adverb)
Group E: MATTAKU, ZENZEN (quite, entirely), SUKOSHIMO, SAPPARI, CHITTOMO (not at all), TOUTEI (hardly), METTANI (rarely)
Group F: TOTEMOTOTEMO (not at all)
Group G: SONNANI, AMARI, ANMARI (not very), SAHODO (not so much), TAISHITE (not very much)
Group H: (no modification by an adverb)
4.5.1.2 Calculation of the "affirmative value changing scale"
We investigated the effects of the adverbs in each group on the affirmative degree using a questionnaire completed by 30 subjects, based on Analytic Hierarchy Process (AHP) theory [69]. We compared the three affirmative adverb groups and the expression without any adverb. We selected one adverb from each group (group A: "HIJOUNI," group B: "WARITO," group C: "SUKOSHI") and calculated their priorities for the affirmative degree of each adverb's expression by paired comparison. In a similar way, we compared the three negative adverb groups and an expression without any adverb, too, selecting one adverb from each group (group E: "MATTAKU," group F: "TOTEMOTOTEMO," group G: "SONNANI").
A paired comparison in AHP evaluates the ratio of each alternative's priority. The subjects gave the value 1.0 to the situation "adverb A has the same strength as adverb B," the value 3.0 to "adverb A is slightly stronger than adverb B," the value 5.0 to "adverb A is stronger than adverb B," the value 7.0 to "adverb A is clearly stronger than adverb B," and the value 9.0 to "adverb A is absolutely stronger than adverb B." When the relationship between adverbs A and B is reversed, the values change to 1.0, 1/3, 1/5, 1/7 and 1/9, respectively. We did not consider any hierarchies for AHP because we just need the relative strengths among the expressions modified by the adverbs. We constructed a four-dimensional square matrix A = [a_ij] based on the results of the paired comparisons, where a_ij is the result of the paired comparison between expressions i and j. We calculated the eigenvalues and eigenvectors of the matrix, normalized the principal eigenvector, and then obtained the four priorities of the expressions w = [w_1, w_2, w_3, w_4], as shown in Table 4.6.
However, the affirmative degree of the no-adverb expression would also be changed if these priority values were used directly as the "affirmative value changing scales." Therefore, the priorities are normalized so that the "affirmative value changing scale" of the no-adverb expression becomes 1.00, by the following expression:

u_i = w_i / 0.178

The denominator is the priority of the no-adverb expression, w_i is the priority of group i, and u_i is the "affirmative value changing scale" of group i. In the same way, we calculated the "affirmative value changing scales" for the negative adverb groups.
We consider groups E and F to be the same group, because some subjects answered that group E is stronger than group F, and others answered that group F is stronger than group E.
Table 4.6 shows each group's resulting "affirmative value changing scale." We call the pairing of an adverb and its "affirmative value changing scale" "adverb data." The affirmative value calculation method for a predicate modified by an adverb, using the "affirmative value changing scale," is presented in Section 4.5.2.4.
Table 4.6 Priority and “affirmative value changing scale” of each adverb group
Group Priority Affirmative value changing scale
A (HIJOUNI) 0.654 3.67
B (WARITO) 0.111 0.62
C (SUKOSHI) 0.057 0.32
D (nothing) 0.178 1.00
E (MATTAKU) 0.404 2.93
F (TOTEMOTOTEMO) 0.404 2.93
G (SONNANI) 0.054 0.39
H (nothing) 0.138 1.00
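The priority computation can be sketched with a standard principal-eigenvector AHP routine. The 4×4 matrix below is invented for illustration (the dissertation reports only the resulting priorities, not the raw comparison data); the normalization u_i = w_i / w_D follows the expression above.

import numpy as np

def ahp_priorities(pairwise: np.ndarray) -> np.ndarray:
    """Priorities from a paired-comparison matrix: the principal
    eigenvector of A = [a_ij], normalized to sum to 1 (standard AHP)."""
    eigvals, eigvecs = np.linalg.eig(pairwise)
    principal = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
    return principal / principal.sum()

# Illustrative matrix for groups A, B, C and D (no adverb); not the
# dissertation's actual questionnaire data.
A = np.array([[1.0, 7.0, 9.0, 5.0],
              [1/7, 1.0, 3.0, 1/2],
              [1/9, 1/3, 1.0, 1/3],
              [1/5, 2.0, 3.0, 1.0]])
w = ahp_priorities(A)
u = w / w[3]  # normalize so the no-adverb group D scales to 1.00
print(np.round(w, 3), np.round(u, 2))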
4.5.2 Affirmative Value Change by Modality
Some suffixes can change the meaning of a verb or adjective. For example, when the suffix "not" appears with the verb "write," the meaning of "not write" is the reverse of that of "write." Not only the negative modality but also the past modality and the double negative modality have such effects. In this section, we propose functions that express the change of the affirmative value under various modalities.
4.5.2.1 Negative modality
In order to clarify the relationship between the affirmative degree of a normal expression and that of the same expression with negative modality, we investigated the relationship by questionnaire. We presented six normal sentences and their negative sentences to 14 subjects (university students), and they answered with affirmative degrees within the range [0.0, 1.0]. Table 4.7 shows the sentences presented in this questionnaire.
Table 4.7 Affirmative and negative expressions used for the questionnaire
Q: "NENKIN NO SHORUI WO HITORI DE KAKEMASUKA?" (Can you fill out the pension forms by yourself?)
A1: "KAKEMASU." (I can fill them out.)
A2: "KAKEMASEN." (I can't fill them out.)
A3: "DEKIMASU." (I can.)
A4: "DEKIMASEN." (I can't.)
A5: "YARIMASU." (I do them.)
A6: "YARIMASEN." (I don't do them.)
A7: "TANONDE IMASU." (I always ask someone to do them.)
A8: "TANONDE IMASEN." (I don't ask anyone to do them.)
A9: "MAKASETE IMASU." (I leave them to someone else.)
A10: "MAKASETE IMASEN." (I don't leave them to someone else.)
Figure 4.7 shows the result of the questionnaire for investigating the function of negative modality. The horizontal axis is the affirmative degree of the normal sentence, and the vertical axis is that of the sentence with negative modality.

Figure 4.7 Function of the negative modality
This graph shows that the affirmative degrees of the sentences change to values symmetrical about the middle value 0.5. We define a common function mapping the affirmative expression to the negative expression. We calculated both an approximate linear function and a second-order polynomial. The shape of the polynomial is almost the same as that of the linear function, because the curve of the polynomial (y = b + C_1·x + C_2·x²) is mild and C_2, the coefficient of x², is very small. Furthermore, the spread around the regression line is not large, because the correlation coefficient of the linear fit (-0.886) is close to -1.0. We therefore approximate the effect of negative modality by the following linear function, where x is the affirmative value of the normal expression and y is the affirmative value of the expression with negative modality.

y = -x + 1.0    (4.1)
The following example shows the change of the affirmative value of a verb under negative modality.

Q: "NENKIN NO SHORUI WO KAKEMASUKA?" (Can you fill out the pension forms?)
A: "MOU KAITEINAIKARA DEKINAIDESUNE." (I can't do them because I no longer write them.)

The affirmative value of the verb phrase "KAITEINAI" changes to 0.18 from the value 0.82 of its normal expression "KAKU (write)," and the value of the verb phrase "DEKINAI" changes to 0.09 from the value 0.91 of its normal expression "DEKIRU (can)."
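A one-line Python sketch of equation (4.1), checked against the values in this example:

def negate(av: float) -> float:
    """Equation (4.1): affirmative value of an expression with
    negative modality."""
    return -av + 1.0

print(round(negate(0.82), 2))  # 0.18  "KAKU" -> "KAITEINAI"
print(round(negate(0.91), 2))  # 0.09  "DEKIRU" -> "DEKINAI"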
4.5.2.2 Double negative modality
We investigated the relationship between normal expressions and double negative expressions with a questionnaire that presented two normal sentences and their double negative sentences to the same 14 subjects as in Section 4.5.2.1. Table 4.8 shows the sentences presented in this questionnaire.

Table 4.8 Normal and double negative expressions used for the questionnaire
Q: "NENKIN NO SHORUI WO HITORI DE KAKEMASUKA?" (Can you fill out the pension forms by yourself?)
A1: "KAKEMASU." (I can fill them out.)
A2: "KAKENAI KOTOMO NAIDESU." (I don't think that I can't fill them out.)
A3: "DEKIMASU." (I can.)
A4: "DEKINAI KOTOMO NAIDESU." (I don't think that I can't.)

Figure 4.8 shows the result of the questionnaire for investigating the function of double negative modality. The horizontal axis is the affirmative degree of the normal sentence, and the vertical axis is that of the sentence with double negative modality.
Figure 4.8 Function of the double negative modality
This graph shows that the affirmative values of the sentences with double negative modality are uneven when the value of the normal sentence is 1.0; however, we can see a pattern in which the degrees of affirmation/negation generally decrease a little. We therefore approximate the effect of double negative
modality by the following linear function, where x is the affirmative value of the normal expression and y is the affirmative value of the expression with double negative modality.

y = 0.4x + 0.3    (4.2)

The following example shows the change of the affirmative value of a verb under double negative
modality.
Q: “NENKIN NO SHORUI WO KAKEMASUKA?” (Can you fill out the pension forms?)
A: “KAKENAI KOTOMO NAI DESU.” (I don’t think I can’t write it.)
The affirmative value of the verb phrase "KAKENAI KOTOMO NAI" drops to 0.66 from the value 0.91 of the normal expression "KAKERU (can write)."
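Equation (4.2) in the same style, reproducing the 0.66 of the example above:

def double_negate(av: float) -> float:
    """Equation (4.2): affirmative value under double negative modality."""
    return 0.4 * av + 0.3

print(round(double_negate(0.91), 2))  # 0.66  "KAKERU" -> "KAKENAI KOTOMO NAI"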
4.5.2.3 Past modality
We investigated the relationship between normal expressions and past expressions using a questionnaire that presented three normal sentences and their past-tense sentences to the same 14 subjects as in Section 4.5.2.1. Table 4.9 shows the sentences presented in this questionnaire.
Table 4.9 Normal and past expressions used for the questionnaire
Q: "NENKIN NO SHORUI WO HITORI DE KAKEMASUKA?" (Can you fill out the pension forms by yourself?)
A1: “KAKEMASU.” (I can fill them out.)
A2: “KAKEMASHITA.” (I could fill them out.)
A3: “DEKIMASU.” (I can.)
A4: “DEKIMASHITA.” (I could.)
A5: “KAKIMASU.” (I write.)
A6: “KAKIMASHITA.” (I wrote.)
Figure 4.9 shows the result of the questionnaire for investigating the function of past modality. The horizontal axis is the affirmative degree of the normal sentence, and the vertical axis is that of the sentence with past modality.
Figure 4.9 Function of the past modality
However, the subjects' answers were spread between affirmation and negation. We guess the reason as follows: when subjects hear about the speaker's past experiences, some subjects guess "Then, he still can do it," and others guess "He can't do it now, because if he still could, he wouldn't have to use the past modality." Therefore, we do not consider past modality in this study.
4.5.2.4 Modification by an adverb
When the extracted predicate is modified by an adverb, the affirmative value of the modified predicate is calculated by multiplying the affirmative value of the predicate by the "affirmative value changing scale" of the adverb. An adverb whose "affirmative value changing scale" is over 1.0 emphasizes the predicate's affirmative value, and an adverb whose "affirmative value changing scale" is under 1.0 lowers the predicate's affirmative value. We define the following equations for modification by an adverb.
y = (x - 0.50) × u + 0.50    (4.3)

y_new = 0.00 (if y < 0.00);  y_new = y (if 0.00 ≤ y ≤ 1.00);  y_new = 1.00 (if y > 1.00)    (4.4)

x is the affirmative value of the normal expression, u is the "affirmative value changing scale," and y is the affirmative value of the expression modified by the adverb. The conditions in equation (4.4) keep the affirmative value, after multiplication by the "affirmative value changing scale," within the range [0.0, 1.0]. We consider y_new as the affirmative value of the expression modified
by the adverb.
The following example shows the change of the affirmative value of a verb modified by an adverb.

Q: "NENKIN NO SHORUI WO KAKEMASUKA?" (Can you fill out the pension forms?)
A: "WARIAI KAKEMASU." (I can fill them out comparatively well.)

The affirmative value of the verb phrase "WARIAI KAKEMASU" drops to 0.75 when the affirmative value 0.91 of the verb "KAKERU (can write)" is combined with the "affirmative value changing scale" 0.62 of the adverb "WARIAI (comparatively well)."
We calculate the affirmative value of the modified predicate from the predicate and the adverb. However, an adverb sometimes appears without any predicate in spoken language; for example, we respond with "MAAMAA (so-so)," "ZENZEN (entirely)," and so on. When we calculate the affirmative value of such an expression, we supply the predicate from the question, and then we apply our method.

Q: "NENKIN NO SHORUI WO KAKEMASUKA?" (Can you fill out the pension forms?)
A: "MAAMAA DESU." (I might be able to.)

In this example, we fill in the question's verb "KAKERU (can write)" in the response. This operation matches the human feeling that the complete expression of the response is "MAAMAA KAKEMASU (I can fill them out so-so)." Then, the affirmative value of the verb "KAKEMASU" drops to 0.75 by "MAAMAA," the same as in the former example.
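Equations (4.3) and (4.4) together, as a small sketch reproducing the WARIAI/MAAMAA examples:

def apply_adverb(av: float, scale: float) -> float:
    """Equations (4.3) and (4.4): scale the distance from the neutral
    value 0.5 by the adverb's "affirmative value changing scale" and
    clip the result into [0.0, 1.0]."""
    y = (av - 0.50) * scale + 0.50   # (4.3)
    return min(max(y, 0.00), 1.00)   # (4.4)

print(round(apply_adverb(0.91, 0.62), 2))  # 0.75  "WARIAI KAKEMASU"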
4.6 Analyzing Intention from Plural Sentences
We proposed methods to extract affirmative/negative elements and to calculate affirmative values in the former sections. We now propose a method to calculate the total affirmative intention over the user's plural utterances using these affirmative elements.
We consider the weighted average of the affirmative values, weighted by the priorities of the extracted affirmative/negative elements, as the total affirmative intention of the user's plural utterances. The equation is given as follows.
z = ( Σ_{i=1}^{n} x_i × w_i ) / ( Σ_{i=1}^{n} w_i )    (4.5)
where x_i is the affirmative value of the i-th affirmative/negative element, w_i is the priority of each affirmative/negative element, z is the total affirmative intention value of the user's plural utterances, and n is the number of extracted affirmative/negative elements. In this equation, a newer element has a stronger effect, because the priority of a newer element is defined to be larger than that of an older one.
4.7 Example of Our Method
Let us show an example of analyzing the intention of the following utterance.

Q: "KODOKUKAN WO KANJI MASUKA?" (Do you feel any loneliness?)
A: "IIE, ANMARI." (No, not very.)

First, the system performs morphological analysis on the response "IIE, ANMARI" that the user inputs. Next, the system extracts affirmative/negative elements from the parsing result and calculates the affirmative values of the extracted elements using the interjection data, the adverb data, and the question data. In this example, two affirmative/negative elements are extracted: "IIE" from the interjection data and "ANMARI" from the adverb data.
"ANMARI" is used as a single adverb, and the system guesses that the question's verb "KANJIRU (affirmative value: 0.91)" is omitted, as shown in Section 4.5.2.4. We fill in the negation of the question's verb, "KANJINAI," because "ANMARI" has a negative aspect. Then, the affirmative value of "ANMARI" is calculated as follows. (In this calculation, AV and AVCS denote "affirmative value" and "affirmative value changing scale," respectively.)
(1) Employ equation (4.1) to calculate the affirmation value of "KANJINAI," because it is the negative expression of "KANJIRU":

AV of "KANJINAI" = -(AV of "KANJIRU") + 1.0 = -(0.91) + 1.0 = 0.09

(2) Employ equations (4.3) and (4.4) to consider the effect of the adverb "ANMARI":

AV of "ANMARI" = {AV of "KANJINAI" - 0.50} × AVCS of "ANMARI" + 0.50 = (0.09 - 0.50) × 0.39 + 0.50 = 0.34
Then, the system obtains the affirmative values "IIE (affirmative value: 0.06)" and "ANMARI (affirmative value: 0.34)." Successively, the system calculates the total affirmative intention of the user's utterances by substituting these affirmative values into equation (4.5):

Total Affirmative Intention = (0.06 × 1 + 0.34 × 1) / (1 + 1) = 0.20
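Putting the pieces of Sections 4.4 to 4.6 together, this example can be traced end to end in a short sketch. The dictionary names are ours, and the constants are the ones quoted in the text; the control flow is a simplified stand-in for the morphological-analysis pipeline.

# End-to-end sketch of the Section 4.7 example "IIE, ANMARI."
INTERJECTION_AV = {"HAI": 0.94, "IIE": 0.06}  # interjection data (Sec. 4.4.1)
AVCS = {"ANMARI": 0.39}                       # adverb data, group G (Table 4.6)
QUESTION_VERB_AV = 0.91                       # "KANJIRU" (Sec. 4.4.2)

av_iie = INTERJECTION_AV["IIE"]
av_kanjinai = -QUESTION_VERB_AV + 1.0                      # (4.1) negative aspect
av_anmari = (av_kanjinai - 0.50) * AVCS["ANMARI"] + 0.50   # (4.3)/(4.4)
total = (av_iie * 1 + av_anmari * 1) / (1 + 1)             # (4.5), equal priorities
print(round(av_anmari, 2), round(total, 2))                # 0.34 0.2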
4.8 Application of Affirmative/negative Intention Analyzing Method We applied this affirmative/negative intention analyzing method into the Web-based analytical
system as described in 4.2.
This system consists of a server system and client systems connected by Internet. The main
process of “Web-based analytical system” works on the server. The clients supply an interface by
natural language and a display to show facial images and the result of the “Web-based analytical
system.” The natural language interface will decrease the user’s stress caused by using a computer,
because the user will not need to deal with a keyboard and a mouse.
First, the system gives the user a question about QOL and obtains his/her response through natural
language interface. The server receives the response and do morphological analyze and parsing.
Analyzing results are sent to “emotion generating domain,” “intention analyzing domain” and
“response generating domain.”
The system applies the EGC method to the user’s utterance and calculates some emotions from
the user’s viewpoint. The emotions are used in “facial expression selecting domain” in order to
select an appropriate facial expression from the user’s face data. Generating facial image is
displayed on the client’s display. The process for generating facial image is described in Chapter 5.
“Intention analyzing domain” calculates the user’s affirmative/negative intention against the
question from his/her utterances. When the system obtains an ambiguous intention, the system
continues the conversation in order to clear the user’s intention, because the method can deal with
plural utterances.
Therefore, the “response generating domain” generates appropriate responses based on the
grammatical features of the user’s utterance in order to continue the conversation. We also propose a
method to increase the variation of response expressions and develop techniques to select an
appropriate response expression, in order to make it easier to talk with a computer. We deal with
three kinds of response types: “simple response,” “repeating,” and “showing hearer’s interest.”
A “simple response” can be used anytime in a break between phrases. “Repeating” is mainly used just
after the question to confirm the user’s utterance. “Showing hearer’s interest” is used to show that the
system understands the user’s intention. The system then makes a response expression based on the
content of the user’s utterance and responds with it [70].
The system collects the affirmative/negative intentions toward all the QOL questions through the
conversation. These results are used to evaluate the analytical results and to generate health counseling comments.
Figure 4.10 Overview of the extended Web-based analytical system of health service
(The server comprises the question-list asking, analyzing, intention analyzing, emotion generating, response generating, QOL evaluating, and facial expression selecting domains, together with the facial expression database and the QOL result; the client provides the natural language interface and the display.)
Figure 4.11 Interface tool for the “Web-based system of health service needs among healthy elderly”
4.9 Experimental Result
We applied our method to a 50-dialogue corpus (10 Q&A pairs for each of 5 people) and evaluated
the results using a questionnaire with seven-grade evaluations. However, when a subject
replies with an ambiguous response, time is wasted confirming his/her true intention,
because he/she has not made the intention clear and a unique intention cannot be determined.
Therefore, we used 32 different subjects (university students; male: 23, female: 9) to evaluate the
intentions objectively.
Table 4.10 shows the comparison between the average of the affirmative values that the
users gave in the questionnaire and the affirmative values that our system calculated. A5 in Table
4.10(b) is the result for the example in Section 4.7.
We consider that the system calculates adequate values because both values are similar.
Table 4.10 Comparison between the average of the affirmative values given by the users and the
affirmative values that our system calculated
Table 4.10 (a) Comparison results where 90% or more of the subjects selected the same intention
Sample No. | Average value of questionnaire’s result (X) | Affirmative value that system calculates (Y) | |X − Y|
A1 0.974 0.940 0.034
A4 0.120 0.090 0.030
A6 0.780 0.754 0.034
A8 0.729 0.500 0.229
A9 0.964 0.970 0.006
A10 0.953 0.940 0.013
B1 0.963 0.910 0.053
B2 1.000 0.925 0.075
B3 0.816 0.910 0.094
B9 1.000 0.925 0.075
B10 0.906 0.850 0.056
C1 1.000 0.925 0.075
C2 0.031 0.455 0.424
C3 0.031 0.060 0.029
C4 0.906 0.910 0.004
C5 0.031 0.060 0.029
C7 0.990 0.925 0.065
C8 0.837 0.910 0.073
C9 1.000 0.940 0.060
C10 0.995 0.940 0.055
D1 0.774 0.940 0.166
D3 0.905 0.880 0.025
D5 0.078 0.060 0.018
D7 0.990 0.910 0.080
D9 0.979 0.955 0.024
D10 0.835 0.754 0.081
E4 0.885 0.920 0.035
E6 0.844 0.940 0.094
E8 0.905 0.820 0.085
Table 4.10 (b) Comparison results where 80%-90% of the subjects selected the same intention
Sample No. | Average value of questionnaire’s result (X) | Affirmative value that system calculates (Y) | |X − Y|
A5 0.252 0.200 0.052
B5 0.303 0.340 0.037
D4 0.688 0.771 0.083
D6 0.678 0.533 0.145
E1 0.141 0.820 0.679
E3 0.723 0.607 0.116
Table 4.10 (c) Comparison results where 80% or less of the subjects selected the same intention
Sample No. | Average value of questionnaire’s result (X) | Affirmative value that system calculates (Y) | |X − Y|
A2 0.395 0.910 0.515
A3 0.823 0.625 0.198
A7 0.989 0.940 0.049
B4 0.376 0.395 0.019
B6 0.637 0.832 0.195
B7 0.647 0.754 0.107
B8 0.516 0.455 0.061
C6 0.568 0.170 0.398
D2 0.583 0.535 0.048
D8 0.568 0.607 0.039
E2 0.569 0.925 0.356
E5 0.376 0.090 0.286
E7 0.749 0.500 0.249
E9 0.683 0.500 0.183
E10 0.661 0.500 0.161
4.10 Future Works
We compared the average of the questionnaire’s affirmative values and the affirmative values that
our system calculated for all samples, as shown in Table 4.11. We consider that there is not much
difference, because the average difference is 0.138 and the standard deviation is 0.185. The sample
with the maximum difference (0.754) is caused by a failure of morphological analysis. We
extracted the samples for which 80% or more of the subjects selected the same intention (affirmation/
negation/neither-affirmation-nor-negation) and considered them as standard samples. We did not
treat as standard the samples that did not reach a consensus, that is, those where 20% or more of the
subjects disagreed with the majority opinion. Table 4.12 shows some of the examples where 80% of
the subjects selected the same intention, and Table 4.13 shows examples where the subjects supposed
various situations.
Table 4.11 Differences between the average value of questionnaire and system output
Average of differences Standard deviation Maximum difference Minimum difference
0.138 0.185 0.754 0.004
Table 4.12 Examples where 80% of the subjects selected the same intention
Q:
“BASU YA DENSHA WO TUKATTE HITORI DE GAISHUTU DEKIMASUKA?” (Can you go
out using bus or train by yourself?)
A: “HAI.” (Yes.)
Q: “KODOKUKAN WO KANJI MASUKA?” (Do you feel any loneliness?)
A: “ANMARI NAIDESUNE.” (Not so much.)
Q: “IRAIRA SURU KOTO WA ARIMASUKA?” (Have you sometimes been irritated?)
A: “MAA, SOU IRAIRA YUUKOTOMO NAIDESUGANE, YAPPARI, SORYA KINKYUU NO BAAI
NIWANE, SUKOSHI WA ARIMASUYO.” (Well, I haven’t been irritated so often, but, of course,
when there is a problem, I feel a little irritated.)
Table 4.13 Examples where the subjects supposed various situations
Q:
“HITO NO NAMAE YA KOTOBA GA SUGU NI DETEKONAI KOTO GA ARIMASUKA?”
(Are there any situations where you can’t remember a person’s name or a word?)
A: “METTANI NAIDESUGANE. TOKINIWA ARYAA, DOUIU HITO DATTAKANA TO OMOU
KOTO WA ARIMASUYO.” (No, rarely. But sometimes, I think “who is he?”)
Q: “NENKIN NADO NO SHORUI WO HITORI DE KAKEMASUKA?” (Can you fill out pension
forms and so on?)
A: “MAA IMAMADEWA YATTEMASHITAGANE. KONDO KARAWA DOUNARUYARA.
YOMESAN NI TANOMU YARA, WAKARANAI DESU.” (Well, I have done it. How will I do it
next time? Will I ask my wife? I don’t know.)
Q: “KODOKUKAN WO KANJI MASUKA?” (Do you feel any loneliness?)
A: “U-N, BETSU NI FUJIYUU WO KANJIMASEN GANE. YOMESAN MAKASE DE SHITE
KURERUKOTO. EE. IIKKOU NI YATTE KUREMASUYO.” (Well, I don’t feel any
inconvenience. I leave everything to my wife. Yes, she does everything well.)
Figure 4.12 is a scatter plot for all samples, and Table 4.14 shows the correct rate for the
standard samples. We considered a sample correct when the difference between the questionnaire’s
result and the system output was under 0.1.
In the standard samples, 82.8% are correct. We consider that our system can
analyze the speaker’s affirmative/negative intention for the standard samples, because the correlation
coefficient in Figure 4.12 is 0.822.
When the samples in which the subjects supposed various situations are included, the correct rate
decreases to 68.0% and the correlation coefficient is 0.736. We analyzed the 16 incorrect samples. Six of
them are caused by a lack of interjection data, adverb data, and question data, and by not
considering individual variations, the gender gap, the generation gap, and so on. Eight of them need a
process of semantic analysis and reasoning. One sample is the speaker asking himself/herself “How do I …?”;
however, the system misunderstands it as an intentional response to the question. A similar misunderstanding
occurs with utterances whose content does not relate to the question. To solve these problems, we have
to add other types of affirmative/negative elements, a reasoning process, and rules to exclude
utterances that do not relate to the affirmation/negation of the question.
Figure 4.12 Scatter plot for all samples (horizontal axis: average value of questionnaire’s result; vertical axis: affirmative value that the system calculates; the plot distinguishes examples where 80% of the subjects selected the same intention from examples where the subjects supposed various situations)
Table 4.14 Correct rate in the examples where 80% of the subjects selected the same intention
Number of samples | Number of samples where the difference between the questionnaire’s result and the system output is under 0.1 | Correct rate (%)
35 | 29 | 82.8
4.11 Conclusion
In this chapter, we proposed a method to analyze the user’s affirmative/negative intention from
his/her utterances in a dialogue. First, we extracted elements based on the surface structures
of the responses and a concept of the question, and calculated an affirmative/negative value
corresponding to the degree of affirmation/negation. We defined three types of elements:
“affirmative/negative description for the yes-no question,” “direct expression of intention in the
response,” and “indirect expression of intention in the response.” Furthermore, we defined
functions that calculate changes of the affirmative value according to the aspects of verbs and adverbs.
Finally, we calculated the total affirmative/negative intention toward a question in the dialogue.
To verify the validity of this method, we applied it to a 50-dialogue corpus and evaluated
the results using the questionnaire. We consider that the system calculates adequate values,
according to the comparison between the questionnaire’s results and the system’s output.
This method is effective in a web-based system; therefore, we applied it to the
interface of the “Web-based analytical system of the Health Service for the Elderly.” The interface
system can analyze the user’s affirmative/negative intention in his/her utterance even if he/she is
separate from the system.
With this method we can estimate the user’s intention toward a yes-no question; furthermore, the
interface system can converse with the user while considering his/her intention.
To apply our method beyond yes-no questions, analyzing the intention of
wh-questions and extracting spontaneous requests are needed. In order to realize such functions,
we are going to extend our proposed method.
CHAPTER 5 EMOTION ORIENTED INTERACTION SYSTEMS — FACEMAIL & FACECHAT —
We developed application software that can analyze the user’s emotions and represent the
emotions with facial expressions. The application requires two main functions: one
is analyzing emotions from sentences, and the other is displaying the emotional faces. The
emotion analyzing part is based on the EGC method described in Chapters 2 and 3.
In this section, we propose a method to generate the user’s facial expressions based on the
extracted emotions by the EGC method. This method uses a “sand glass type neural network” trained
by real facial images. First, we classify the emotions for the facial expressions as “happiness,”
“sadness,” “disgust,” “anger,” “fear” and “surprise” as proposed by Ekman [44]. By training the
neural network based on such types of facial expressions, each emotion is partitioned on the
two-dimensional emotional space constructed by the outputs of the third layer in the neural network.
In order to employ the emotional space, we assign the EGC output (20 kinds of emotions) to the
input of the two-dimensional emotional space (6 kinds of emotions) as described in Section 5.2.2.
Next, one point on the two-dimensional emotional space is determined from the assigned emotions.
We applied this method to mail software and a chat system. The mail software (JavaFaceMail)
calculates the emotions from the content of the mail, generates one facial expression image of the
sender, and sends the mail with the facial image. The chat system (JavaFaceChat) also generates a
facial expression image like JavaFaceMail. Furthermore, it analyzes the variances of emotions for
each user, and invites two users to a new closed chat room when their tendencies of variances are
alike.
5.1 Facial Expression Generating Method by Parallel Sand Glass Type Neural Network
5.1.1 Sand Glass Type Neural Network
There are some studies that clarify the relationship between emotions and the features of facial
expressions by analyzing those features [71, 72]. In particular, some researchers
employ neural networks to relate facial expressions and emotions. One effective method
classifies facial expressions according to emotions using the sand glass type neural network
shown in Figure 5.1 [73].
The sand glass type neural network is a kind of hierarchical neural network. The features of the
neural network are that the number of input neurons is equal to the number of output neurons and
that the number of neurons in the middle layer is much less than the number of the input and output
neurons. Back Propagation (BP) learning is employed to train the network by giving teaching signals
to output the same patterns as the input patterns. This method can extract the features of
the input data in the middle layer when the training is finished [73].
Figure 5.1 Overview of sand glass type neural network
However, the standard sand glass type neural network can only learn the facial expressions of one
person, because it is difficult to deal with multiple data sets simultaneously on one network. Therefore,
an extended sand glass type neural network was proposed. It connects two five-layer
neural networks at the third layer in order to deal with different data simultaneously, as shown in
Figure 5.2 [74]. The data are input to the connected networks n (n = 1, …, N) simultaneously, and
the network is trained to output the same patterns as the input patterns. After the training, the features
of the input data are compressed and integrated at the third layer.
Figure 5.2 Overview of extended sand glass type neural network
Ichimura constructed a “two-dimensional emotional space” by learning the facial expressions of
two people (a man and a woman) using the extended sand glass type neural network [75]. Although
this model learned the features of two people’s facial expressions, it could not classify three or
more people’s facial expressions for each emotion. Therefore, we propose a parallel sand
glass type neural network, which connects N five-layer neural networks at the third
layer as shown in Figure 5.3 [45]. This network can deal with N kinds of data simultaneously.
Figure 5.3 Overview of parallel sand glass type neural network
BP learning is employed to train the network. However, in the third layer and the fifth layer, we
apply a linear function instead of a sigmoid function as the activation function, and do not use
threshold values θ to represent prominent weights of the incoming links. In this paper, we use
eq. (1) as the sigmoid function and eq. (2) as the linear function:

f(x) = 1 / (1 + exp(−x))    (1)

g(x) = x    (2)

Let ω21 be the weight matrix between the first layer and the second layer; the output activation
of the second layer, x2, is

x2 = f(ω21 x1 + θ21),    (3)

where x1 is the output activation vector of the first layer.
In the third layer, we use the following function:

x3 = g(ω32 x2)    (4)

In a similar way, we use the following functions in the fourth layer and the fifth layer:

x4 = f(ω43 x3 + θ43)    (5)

x5 = g(ω54 x4)    (6)
(In Figure 5.3, each of the N connected networks, one per person, has 255 neurons in its input and output layers and 40 neurons in its second and fourth layers; the networks share 2 neurons in the third layer.)
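As a minimal sketch of the forward pass defined by eqs. (1)-(6), the following Python fragment uses the layer sizes from Figure 5.3 (255-40-2-40-255 per connected network) with random placeholder weights; in the actual system the weights are obtained by BP training.

    import numpy as np

    def f(x):
        """Eq. (1): sigmoid activation for the 2nd and 4th layers."""
        return 1.0 / (1.0 + np.exp(-x))

    def g(x):
        """Eq. (2): linear activation for the 3rd and 5th layers (no threshold)."""
        return x

    rng = np.random.default_rng(0)
    n_in, n_hid, n_mid = 255, 40, 2          # layer sizes from Figure 5.3
    w21 = rng.normal(size=(n_hid, n_in))     # 1st -> 2nd layer weights
    th21 = rng.normal(size=n_hid)            # 2nd layer thresholds
    w32 = rng.normal(size=(n_mid, n_hid))    # 2nd -> 3rd layer weights
    w43 = rng.normal(size=(n_hid, n_mid))    # 3rd -> 4th layer weights
    th43 = rng.normal(size=n_hid)            # 4th layer thresholds
    w54 = rng.normal(size=(n_in, n_hid))     # 4th -> 5th layer weights

    x1 = rng.random(n_in)    # one 2D-DCT feature vector (Section 5.1.2)
    x2 = f(w21 @ x1 + th21)  # eq. (3)
    x3 = g(w32 @ x2)         # eq. (4): a point on the 2D emotional space
    x4 = f(w43 @ x3 + th43)  # eq. (5)
    x5 = g(w54 @ x4)         # eq. (6): reconstructed feature vector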
5.1.2 Facial Training Data
We use emotional facial expressions of several people as teaching signals. For each individual, there
are six basic kinds of emotional faces, “happiness,” “sadness,” “disgust,” “anger,” “fear,” and “surprise,”
and a neutral one [44, 76]. Two facial images are prepared for each emotion; therefore, we have 13
pictures per individual.
To normalize the facial images in position, using the size of internal facial features as reference,
we use an affine transformation [77] to extract normalized target images. First, we determine three
reference points, Er, El, and M, as the center points of the regions that correspond to the two eyes and
the mouth, as shown in Figure 5.4. We define the parameters shown in Figure 5.4 as c1 = c2 = 0.8d,
c3 = 0.4d, and c4 = 1.2d. Then, we obtain a standard window of 128 by 128 pixels to form the target
images. Figure 5.5 shows the six kinds of emotional basic faces and a neutral face of one subject.
These pictures are transformed into 8-bit grayscale format.
Figure 5.4 A target image defined by three points
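As an illustration of this normalization, the sketch below maps the three measured reference points to fixed positions in a 128 × 128 window with OpenCV; the destination coordinates are our own assumption, since they depend on the exact window geometry defined by c1-c4.

    import cv2
    import numpy as np

    def normalize_face(image, e_r, e_l, m):
        """Warp `image` so that the eye centers e_r, e_l and the mouth center m
        land on fixed positions in a 128 x 128 target window."""
        src = np.float32([e_r, e_l, m])
        dst = np.float32([[32, 40], [96, 40], [64, 96]])  # assumed canonical points
        a = cv2.getAffineTransform(src, dst)              # affine from 3 point pairs
        return cv2.warpAffine(image, a, (128, 128))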
Figure 5.5 Sample of six emotional basic faces and a neutral face
Next, we convert these images into the frequency domain to obtain training data. As the
transformation technique, we use the two-dimensional DCT, a well-known method in digital
signal processing. The variation of a facial image is reflected directly in the frequency domain, and most
of the energy is concentrated in its low-frequency part [78]. We use the low-frequency part, shown
by the gray square in Figure 5.6 and obtained by the 2D-DCT, as training data.
Figure 5.6 Low frequency region transformed by 2D-DCT
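A small sketch of this feature extraction with SciPy’s DCT follows; the retained block size (16 × 16 here) is our assumption, since the text only indicates the low-frequency part. (Incidentally, 255 input neurons per network would correspond, for example, to a 16 × 16 block without the DC coefficient, but the exact choice is not specified here.)

    import numpy as np
    from scipy.fftpack import dct

    def dct2(image):
        """Two-dimensional DCT: apply the 1D DCT along rows, then columns."""
        return dct(dct(image, axis=0, norm="ortho"), axis=1, norm="ortho")

    def low_frequency_features(image, block=16):
        """Keep only the low-frequency corner of the 2D-DCT coefficients
        (the gray square in Figure 5.6) as a training vector."""
        coeffs = dct2(image.astype(np.float64))
        return coeffs[:block, :block].ravel()

    # Example: a 128 x 128 grayscale face yields a block*block feature vector.
    face = np.random.default_rng(0).random((128, 128))
    print(low_frequency_features(face).shape)  # (256,)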
5.1.3 Learning Experimental Results
5.1.3.1 Ekman’s emotion circle
Ekman et al. described the relationship between emotions and facial images in their
detailed examinations [79]. They described how we can judge six basic emotions, that is, “happiness,”
“sadness,” “disgust,” “anger,” “fear,” and “surprise,” from facial expressions, as shown in Figure 5.7.
Moreover, Schlosberg et al. said that there is an emotional circle with the order “Love-
Surprise-Fear-Anger-Dislike-Contempt.” Based on these two ideas, there have been many studies
with respect to the six basic emotions ordered on the emotional circle. They reported that the
boundaries between emotions are fuzzy, but that emotions separated on the emotional circle are clearly distinguishable [79].
We construct an emotional circle of facial expressions with the parallel sand glass type neural
network described in Section 5.1.1. The network is trained on several people’s facial expressions,
and we extract a feature of each emotion’s facial expression in the third layer of the trained
network.
Figure 5.7 Emotion circle
5.1.3.2 Learning results using four people’s emotional faces
We prepared four people’s facial images and trained a parallel sand glass type neural network. Each
network is trained on the target images of one individual so that the emotional information appears in the
third layer. If the number of networks is less than the number of people, the information in the third
layer instead reflects personal facial characteristics. Therefore, we used a parallel sand glass type neural
network consisting of four neural networks, each connected to the others by
two neurons in the third layer. We considered that the information in the output activities of these two
neurons is represented in a two-dimensional emotional space, which makes the space formed by the
neurons’ output activities easy to interpret.
Figure 5.8 shows the error convergence of this network. The network required 50,996 iterations to
converge, that is, until the mean square error fell below 0.01. The output activities in the third layer
represent prominent characteristics of the emotions because eqs. (4) and (6) are applied without the
sigmoid function.
After training the network, we investigated the output activities in the third layer. Figure 5.9
shows the neuron output activities in the third layer. It represents the distributions of the emotions
of the facial images on the two-dimensional emotional space; the horizontal axis shows the output
activity of the first neuron in the third layer, and the vertical axis shows the output activity of the
second neuron. We define this two-dimensional map based on the activities of the third layer as the
“two-dimensional emotional space.”
Each emotion area shown in Figure 5.9 is the group of points where the error at
the output layer is less than 0.25 × N when the emotional space is partitioned into a 20 × 20 grid
and the center point of each grid cell is given as the input to the fourth layer.
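This partitioning can be sketched as follows; here `decode` stands for the trained mapping from the third layer to the output layer (eqs. (5) and (6)), `teacher` holds the teaching vectors per emotion, and both the unit-square coordinate range of the grid and the squared-error measure are our assumptions.

    import numpy as np

    def label_grid(decode, teacher, n=20, threshold=0.25):
        """Assign an emotion label to each cell of an n x n grid over the
        emotional space when the output-layer error is below threshold * N
        (N = number of output neurons)."""
        labels = {}
        for i in range(n):
            for j in range(n):
                z = np.array([(i + 0.5) / n, (j + 0.5) / n])  # grid-cell center
                out = decode(z)                               # 4th/5th layer pass
                errs = {emo: np.sum((out - t) ** 2)           # squared error
                        for emo, t in teacher.items()}
                emo, err = min(errs.items(), key=lambda kv: kv[1])
                if err < threshold * out.size:                # 0.25 * N criterion
                    labels[(i, j)] = emo
        return labels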
Figure 5.9 indicates that each emotion is separated on the two-dimensional space and distributed
like a circle, as in Figure 5.7. This figure is almost equal to the circumplex model of emotions by
Russell [79]. However, the “disgust” area is distributed near the “fear” area, though it should be
distributed between the “sadness” and “anger” areas. The reason is that the subjects had difficulty
making different facial expressions for “disgust” and “fear.” In psychology, Ekman explained that
the facial expressions of “disgust” and “fear” are more difficult to distinguish than any other facial
expressions [44]. We obtained almost the same result from the emotional space constructed by the
third layer using the experimental facial expressions. Moreover, the facial images obtained from the
output activities of the network were almost the same as the experimental facial images. From these
results, we confirmed that not only teaching data but also other data can be restored as facial images.
Furthermore, the network outputs 400 facial images when we input the 400 points allocated to the
grid cells shown in Figure 5.9. This indicates that a facial image can be created for any point, and that
points which do not belong to any emotion area also yield facial images. We plotted the facial images
at the positions of the corresponding points on the emotional space, as shown in Figure 5.10. In this
figure, we can see that the emotional facial expressions appear in almost the same areas as in the
emotional space of Figure 5.9. Furthermore, we can confirm that intermediate facial expressions
among the input facial images are interpolated, similar to a complex emotion composed of several
basic emotions. As a result, we can confirm that the emotional space is constructed at the third layer
of the parallel sand glass type neural network so that each emotion is distributed sequentially and in
a constant order.
Figure 5.8 Error convergence of sand glass type neural network
Figure 5.10 Relation map between emotional space and output activity
5.1.3.3 Relationship between the number of networks and the emotional space
We investigate the adequate number N of connected networks needed to reproduce the order of the
emotion circle that Ekman proposed.
Figure 5.11 shows the emotional spaces when the number N of connected parallel sand glass type
neural networks is changed from 1 to 7. Figure 5.11(a) is the emotional space when the number of
connected networks N = 1. In this case, the emotional space is not constructed adequately, because
the areas of “anger” and “sadness” overlap widely and the order of the emotion areas differs
from the standard one. Figure 5.11(b) is the emotional space when N = 2. The distribution of the
“disgust” area differs from the case of N = 1; however, the order of the emotion areas still differs
from the standard order. In the case of N = 3, the distribution of the emotion areas is closer to the
standard one than in the former cases. However, the “disgust” area and the “neutral” area overlap.
The emotional spaces for N = 4 or more are shown in Figure 5.11(d), (e), (f), and (g). Each emotion
area is distributed radially from the “fear” area, and the orders of the emotion areas are all the same.
From these results, the constructed emotional spaces do not differ much among the cases where four
or more people’s facial images are used. We therefore consider that at least four subjects are needed
to construct an adequate standard emotional space.
(a) Number of connected networks N = 1 (b) Number of connected networks N = 2
(c) Number of connected networks N = 3 (d) Number of connected networks N = 4
(e) Number of connected networks N = 5 (f) Number of connected networks N = 6
(g) Number of connected networks N = 7
Figure 5.11 Two-dimensional emotional spaces at each number of connected networks
5.2 JavaFaceMail
We expect to make computers equipped with human-like interfaces that enable us to
have easy conversations, including greetings and so on. Besides verbal messages, human
face-to-face communication includes nonverbal messages such as facial expressions, vocal inflection,
speaking rate, pauses, hand gestures, body movements, posture, and so on. Therefore, we developed
new mail software representing facial expressions, called JavaFaceMail, as a new kind of computer
interaction tool [80]. The current version of JavaFaceMail, 1.0.1b, is available on the web site [81].
5.2.1 System Overview
This mail software can analyze and express the emotions that the user would generate from the
content of an E-mail. Figure 5.12 shows the overview of the system. E-mail is input using the
interface on the client system and sent to the server through a network. The server has six
processes: the mail receiving process, the morphological analyzing and parsing process, the case
frame extracting process, the EGC process, the facial expression selecting process, and the mail
sending process. The client system has not only the functions of a general E-mail tool but also a
function for displaying the facial image.
Figure 5.12 Overview of JavaFaceMail
5.2.1.1 Behavior of Server
The morphological analyzing and parsing process analyzes the sentences in the mail
morphologically and parses them; the process is applied to each sentence. We used JUMAN as the
morphological analyzer and KNP as the parser, as described in Section 2.3.3. Then, the case frame
representations are made from the result of KNP, a process also described in Section 2.3.3.
Next, the EGC method described in Chapters 2 and 3 is applied to the case frame representations
in the EGC process. The process sometimes does not generate any emotions, namely when there are
no like/dislike objects in the sentence or the event type of the predicate is not registered. We consider
that such sentences have no effect on the overall emotion toward the mail content.
Although the EGC method outputs 20 kinds of emotions, the facial expression is selected from
six kinds of emotions, as described in Section 5.1. Therefore, we have to define “assign rules” from the
emotion types of the EGC to the facial emotion types, and we have to determine one point on the
emotional space that represents the compound emotional facial expression. Section 5.2.2 describes the
emotion assigning rules and the point determining method.
The server delivers the received mail, together with the emotion analyzing result, to the mail’s address.
When the sender’s facial images have already been registered in the server, the server attaches the
facial image output by the trained neural network to the mail. When the sender has not registered
his/her facial images, the server attaches a default facial picture as shown in Figure
5.13. The sender can select the type of facial image when he/she sends an E-mail message.
Mail data consist of two parts: a “header,” which has information about the sending
route, the name of the mail software, and so on, and a “body,” which includes the input sentences. This
system attaches the type of the facial image and the analyzing result to the header area. Accordingly, in
order to display the facial images, the user has to use special mail software, as described in the next
section for JavaFaceMail.
(a) Surprise (b) Anger (c) Fear
(d) Happiness (e) Disgust (f) Sadness
Figure 5.13 (a) Example of original six face types (Takeshi)
(a) Surprise (b) Anger (c) Fear
(d) Happiness (e) Disgust (f) Sadness
Figure 5.13 (b) Example of original six face types (Mika)
5.2.1.2 Behavior of Client
To display the facial images, special mail software with functions to display them is needed. We
named this mail software JavaFaceMail, and it is developed with Java Swing. Java is an
object-oriented language, and Java programs can run on any OS where JDK 1.3 [82] works.
When the user starts JavaFaceMail, a menu window appears as shown in Figure 5.14. When the
user uses this system for the first time, he/she has to register his/her information with the
JavaFaceMail server, because the server distinguishes the face type by the user’s mail address and
retrieves the facial images. The information about face images is managed using PostgreSQL, a kind
of database software. All the server software can be installed on a standard UNIX OS.
Next, the user has to configure the client system. Because JavaFaceMail has the usual mail functions
via the POP and SMTP protocols, the user registers an FQDN (Fully Qualified Domain Name) for each
server, as in standard mail software. Furthermore, the user registers his/her name, password, the location
of the mail spool, and the face type previously registered with the server, as shown in Figure 5.15.
Although the face type of the sender could in principle be retrieved from the E-mail address and the
face type registered by the user, the “From” information in a mail can be changed freely in mail
software. Therefore, the system explicitly requires the face type.
The mail software also has an address book, a tool to convert the sender’s name into the
mail address, as shown in Figure 5.16. Although the items for the face type do not appear in the figure,
the address book contains them. This address book can receive information from the
server directly.
Figure 5.17 shows a window for sending E-mail. The server cannot analyze mail sent from
other mail software because such mail lacks the information needed for generating the facial image.
Received mails carry information about the face type and the value that indicates the emotion in the
header area. The client system displays an adequate facial image using this information. However,
the facial image itself is not attached to the mail; it is downloaded from the server directly. All
the downloaded images are accumulated on the client machine. Figure 5.18 shows the analyzing
result of the sentences shown in Figure 5.17.
Figure 5.14 Menu window of JavaFaceMail
Figure 5.15 Configuration window of JavaFaceMail
Figure 5.16 Address book in JavaFaceMail
5.2.2 Assign Rules to the Facial Expressions
We define assign rules from the emotion types of the EGC to the facial emotion types as shown in Table
5.1.
First, the emotion “Fear” appears both in the EGC output and among the facial emotion
types; therefore, the EGC emotion “Fear” is mapped to the facial emotion type “Fear.” In the same way,
the emotion “Anger” appears in both. “Resentment” and “Reproach” are also assigned to the “Anger”
group, because these emotions indicate aggression toward someone. Next, the emotions aroused by
pleasure (Joy, Happy-for, Gloating, Hope, Relief, and Satisfaction) correspond to the facial emotion type
“Happiness.” However, we only assign “Joy,” “Happy-for,” and “Gloating” to the “Happiness” group,
because the concept of “Joy” embraces those of “Hope,” “Relief,” and “Satisfaction.” In the same way,
the EGC emotions “Distress,” “Sorry-for,” and “Resentment” correspond to the facial emotion type
“Sadness.” The emotion “Surprise” arises from being assailed unexpectedly or attacked without
warning. Emotions such as “Relief” and “Disappointment” are assigned to the “Surprise” group,
because these emotions are generated when a prospective event is not confirmed. “Disgust” is
caused when the situation is completely unacceptable. There are four types of emotions (Reproach,
Anger, Shame, and Remorse) relating to “unacceptable” in the EGC output. We assign “Shame” and
“Remorse” to the “Disgust” group, because “Reproach” and “Anger” are already assigned to the
“Anger” group.
Each EV is added to the corresponding value attached to its facial emotion type. Each facial
expression’s value is calculated from the EVs present in the whole mail. Furthermore,
some special words (e.g. “happy,” “die,” etc.) affect the facial expression without the emotion
calculation, as shown in Section 3.4. We increase each facial expression’s degree based on the number
of these words.
Table 5.1 Assign rules to the facial expression
Facial expression Emotion by the EGC method
Fear Fear
Anger Anger, Resentment, Reproach
Happiness Joy, Happy-for, Gloating
Sadness Distress, Sorry-for, Resentment
Surprise Relief, Disappointment
Disgust Shame, Remorse
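Table 5.1 amounts to a many-to-one lookup, and the summing of the EVs per facial type can be sketched directly. Note that the source lists “Resentment” under both “Anger” (in the text) and “Sadness” (in the table); the dictionary below follows the text and keeps it in the “Anger” group.

    # Assign rules of Table 5.1 as a lookup table (EGC emotion -> facial type).
    ASSIGN_RULES = {
        "Fear": "Fear",
        "Anger": "Anger", "Resentment": "Anger", "Reproach": "Anger",
        "Joy": "Happiness", "Happy-for": "Happiness", "Gloating": "Happiness",
        "Distress": "Sadness", "Sorry-for": "Sadness",
        "Relief": "Surprise", "Disappointment": "Surprise",
        "Shame": "Disgust", "Remorse": "Disgust",
    }

    def facial_values(egc_output):
        """Sum the emotion values (EVs) of a whole mail into the six facial types."""
        totals = {t: 0.0 for t in ("Fear", "Anger", "Happiness",
                                   "Sadness", "Surprise", "Disgust")}
        for emotion, ev in egc_output:
            if emotion in ASSIGN_RULES:
                totals[ASSIGN_RULES[emotion]] += ev
        return totals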
Next, a point on the emotional space should be determined from these six emotion values for the
content of the mail. These values are input to the two neurons in the third layer of the trained sand
glass type neural network.
We define the points that output the facial image teaching data for each emotion as the centers
of the emotion types on the emotional space. Successively, the sums of the emotion values for each
emotion are set on the emotional space as vectors, and their center of gravity is calculated as shown
in Figure 5.19. Then, the network obtains the facial expression for the given input emotion types and
values. This facial image indicates the facial expression for the whole mail content.
However, this method sometimes generates a neutral facial expression when there are two
conflicting emotions (e.g. happiness and sadness). In our study, we do not consider such conflicting
situations because there are many variations of the reactions to the conflicting situation [63] and the
facial expressions appear according to the reactions.
Figure 5.19 Center of each emotion vector
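One reading of this point determination is a weighted center of gravity, sketched below; the emotion-type centers would be obtained by feeding each emotion’s teaching images through the trained encoder, so the coordinates used here are placeholders.

    import numpy as np

    def emotional_space_point(facial_values, centers):
        """Center of gravity of the emotion vectors: each facial type pulls toward
        its center on the 2D emotional space with weight equal to its summed EV."""
        total = sum(facial_values.values())
        if total == 0.0:
            return np.zeros(2)  # no emotion generated: a neutral point
        return sum(v * np.asarray(centers[e], dtype=float)
                   for e, v in facial_values.items()) / total

    # Placeholder centers for illustration only.
    centers = {"Fear": (0.2, 0.8), "Anger": (0.8, 0.8), "Happiness": (0.9, 0.2),
               "Sadness": (0.1, 0.3), "Surprise": (0.5, 0.9), "Disgust": (0.3, 0.7)}
    point = emotional_space_point({"Happiness": 2.0, "Sadness": 1.0, "Fear": 0.0,
                                   "Anger": 0.0, "Surprise": 0.0, "Disgust": 0.0},
                                  centers)
    # `point` is then given to the two third-layer neurons to generate the face.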
5.2.3 Mental Effects by Outputting Facial Expressions
We evaluated the mental effects of displaying facial expressions by our method with the
following questionnaire. The subjects read an E-mail message in three types of mail windows:
(A) displaying only the text, (B) inserting face marks for each sentence, and (C) displaying
the facial expression image with the text. Then, they evaluated these three methods from the
following viewpoints on a five-grade scale:
1. It gives you pleasure to read the mail.
2. It helps to communicate the sender’s emotion well to you.
3. You feel familiar with the output of the mail software.
The subjects were 37 university students (men: 34, women: 3). We presented Figures 5.20(a), (b), and
5.18 as conditions (A), (B), and (C), respectively.
Table 5.2 shows the average ratings and the standard deviations for each item. There were
significant differences at the 1% level among the conditions by one-factor ANOVA.
For the first item, “It gives you pleasure to read the mail,” the average ratings of (B) and (C)
were significantly higher than that of (A), and there was no significant difference between (B) and (C)
by the Tukey test. This indicates that displaying facial expressions with JavaFaceMail
increases the fun of reading E-mail messages, similarly to inserting face marks.
For the second item, “It helps to communicate the sender’s emotion well to you,” the average
rating of (B) was significantly higher than that of (A) at p < 0.01, and the average rating of (C) was
significantly higher than that of (A) at p < 0.05. This indicates that both (B) and (C) are effective, but
that inserting face marks is more effective than JavaFaceMail. A possible explanation is that face
marks are inserted for each sentence in an E-mail message, whereas JavaFaceMail
displays only one facial expression for the total content of an E-mail. Therefore, the displayed facial
expression can be neutralized if a number of emotions are aroused simultaneously in an E-mail.
There is no problem when the content of the E-mail is short. However, we have to improve our
method for analyzing E-mails with long content, for example by displaying a facial expression for
each paragraph that arouses the same emotion, or by displaying a facial expression just after each
sentence that arouses an emotion.
For the third item, “You feel familiar with the output of the mail software,” there were
significant differences among all the conditions. The average rating of (C) was the highest, and (B)
was higher than (A). For the effect of familiarity, JavaFaceMail was significantly more effective
than the face marks. This indicates that the JavaFaceMail system is effective for introducing
face-to-face communication into human-computer interaction.
Table 5.2 Average ratings and the standard deviations for each displaying method
Item | (A) Only sentences | (B) With face marks | (C) With facial expression | One-factor ANOVA
1. It gives you pleasure to read the mail. | 2.57 (0.83) | 3.57 (1.01) | 3.51 (0.93) | p < 0.01
2. It helps to communicate the sender’s emotion well to you. | 2.86 (0.98) | 3.81 (0.94) | 3.49 (0.96) | p < 0.01
3. You feel familiar with the output of the mail software. | 1.81 (1.00) | 3.22 (1.25) | 3.95 (1.18) | p < 0.01
5.3 JavaFaceChat
We created a “chat system” called JavaFaceChat, which represents facial expressions
corresponding to the user’s emotions. “Chat rooms” are one of the most popular communication tools
used on the Internet. Chat systems extending to Internet TV phones have become popular because
connections such as xDSL are very fast and inexpensive. Mobile phones now allow us to
communicate via sound and sight, through the use of images. Although we are surprised at the
progress of such technology, we are still unsatisfied with its sophistication.
Figure 5.21 Overview of JavaFaceChat System
JavaFaceChat supplies the usual functions of “chat rooms.” Figure 5.21 shows an overview of our
developed JavaFaceChat. We apply the EGC method to analyze all messages, and all the
generated emotion types are condensed into six kinds of emotions. The server calculates
the facial image from the six emotion types and values as described in Section 5.2.2.
This facial image indicates a facial expression for each single utterance of each user. The
system sends the sentence and a facial image to all users upon receiving every new message.
The JavaFaceChat server has multiple agents. An agent receives the message from a user and sends the
message and the face type obtained by the EGC method to the other users. Each agent has the user’s facial
images for the six kinds of emotions.
The procedure of JavaFaceChat when a user ‘A’ types something is described as follows. The
JavaFaceChat system analyzes the other users’ sentences based on its own user’s thoughts and feelings,
and guesses the variances of the others’ intentions and emotions, like a person who uses a standard chat
room.
• Start system
1. Download six types of emotional facial images for each chat participant when a chat session
starts.
• Do the following processes at the mobile terminal of ‘A’
1. Send an input sentence to the server.
2. Calculate Emotion Value EA by applying the EGC method to the sentence, and send EA to the
server.
3. Generate one facial expression image based on the EGC result as described in Section 5.2.2.
4. Display the sentence and the facial image of ‘A’ on the display of A’s terminal.
• Do the following processes at the mobile terminals other than A’s terminal
1. Receive A’s sentence from the server.
2. Calculate Emotion Value by applying the EGC method to the sentence. Favorite Value
Database accumulated in each terminal is employed for the EGC method.
3. Generate one facial expression image based on the EGC result as described in Section 5.2.2.
4. Display the sentence and the facial image of ‘A’ on the display of each terminal.
• Do the following processes at the server
1. Send the sentence from ‘A’ to all the chat participants.
2. Check the variances of the output emotions that are sent from all the chat participants. When
there are variances, the server calculates their norms and checks whether the norms are close enough.
3. Ask participants 1 and 2 whether they want to go to another chat room when |E1 − E2| ≤ ε is
satisfied, where E1 means the norm of the generated emotions of participant 1 and E2 means the
norm of the generated emotions of participant 2 (a sketch of this check follows the list).
4. Create a new chat room if both participants agree to it. Information about generated emotions
is duplicated into the new chat room.
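The matching check in server step 3 can be sketched as follows; ε and the representation of the emotion variances (per-utterance six-dimensional emotion vectors for each participant) are our assumptions, since the text specifies only that the norms of the variances must be close.

    import numpy as np

    def should_invite(history1, history2, eps=0.1):
        """Invite two participants to a closed room when the norms of their
        emotion-variance vectors are close: |E1 - E2| <= eps."""
        e1 = np.linalg.norm(np.var(np.asarray(history1), axis=0))
        e2 = np.linalg.norm(np.var(np.asarray(history2), axis=0))
        return abs(e1 - e2) <= eps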
In the initial stage of the chat, all agents attend the same chat room, which is opened for a purpose or
a common topic. If an agent wants to talk only with a specific agent, and that agent agrees,
the system can open a new closed chat room for the two agents, like computer dating. The
system then copies the agents’ attributes from Room 1 to Room 2 as shown in Figure 5.22. Agent A
and Agent B in Figure 5.22 are still in Chat Room 1, but they can enjoy their own conversations in
Chat Room 2, too.
Figure 5.22 Agents attend two chat rooms simultaneously
Figure 5.23 Window of JavaFaceChat
Figure 5.24 Window of closed chat room
5.4 Emotion Oriented Interactive Interface for Raising Students’ Awareness in a Group Lesson
The types of classrooms differ in architecture according to the design of the building. In the layout
shown in Figure 5.25, all the students look at the teacher or a blackboard in front of them. In the layout
shown in Figure 5.26, students can see neither the teacher nor a blackboard in front of them while
looking at their computer displays. In this case, it seems difficult for them to understand the teacher’s
explanation well enough.
A teacher may meet a situation in a group lesson with an exercise, such as computer language
programming, where students talk to each other about programming techniques. If one student simply
asks another student for the answer itself, the teacher has to consider stopping their conversation,
because the asking student cannot obtain any knowledge or experience from merely receiving the answer.
However, if his/her knowledge can be improved through the conversation, the communication is valuable
in the group lesson. To realize such an idea, we improved JavaFaceChat as described in Section 5.3.
5.4.1 System Overview
There are two parts in this system, the “chat part” and the “observation part.” The “chat part” is
based on JavaFaceChat as described in Section 5.3, but the function for computer dating is not
supplied. Students use this chat system to solve their problems by asking and teaching each other. The
“observation part,” on the other hand, runs on the server, and only the teacher uses it. The “observation
part” analyzes not only the students’ emotions but also their awareness from their utterances. The
awareness is judged by classifying the generated emotions as described in Section 5.4.2. The
teacher can thus easily identify students with a low level of awareness and support them.
Furthermore, when a student is not at a low level of awareness but has felt bad for a long
time, the teacher can respond to such a situation as soon as possible.
Figures 5.27 and 5.28 show the dialogue windows for the students and the teacher, respectively. In
Figure 5.27, a student asks someone about what she does not understand, and another student replies
with an answer, each shown with their facial images. Figure 5.28 is the dialogue window for the teacher
and shows the contents of the conversation among the students. The right part shows the degree of each
student’s motivation or volition together with his/her facial image, so the teacher can find a student with a
low level of motivation or volition.
5.4.2 Assign Rules to Detect Variances in the Student’s Awareness
In the field of psychology, many factors for motivation have been proposed: instinct, drive, arousal,
incentive, cognitive factors, self-actualization, and so on [83]. We employ the incentive, cognitive, and
self-actualization factors, because instinct, drive, and arousal are aroused by perceptions.
The first rule for incentive is that a pleasant/unpleasant event causes an approach-avoidance
action. The second rule is that a neutral event also causes an approach-avoidance action when it
appears together with a pleasant/unpleasant event. Therefore, we define that emotions related to
“pleasure” for oneself enhance motivation, and that “displeasure” emotions for oneself deflate
motivation.
Next, motivation is confirmed when the agent meets something new, expects an event, or feels fear
as a result of an event. We therefore define that the emotions relating to the future (Hope, Fear)
enhance motivation, and that the emotions relating to finished events (Satisfaction, Disappointment,
Fear-confirmed, Relief) deflate motivation.
We consider that “Pride” and “Shame” also enhance motivation from the viewpoint of
self-actualization, because these emotions are generated by appraising one’s own actions.
Table 5.3 shows the classification rules for emotions related to motivation.
Table 5.3 Classification about emotions related to motivation
Enhancing the motivation Joy, Hope, Fear, Pride, Shame, Liking, Gratification
Deflating the motivation Distress, Relief, Satisfaction, Disappointment, Fear-confirmed,
Disliking, Remorse
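Table 5.3 translates into a simple scoring rule; the aggregation of the EVs into a single awareness score below is our assumption for illustration.

    ENHANCING = {"Joy", "Hope", "Fear", "Pride", "Shame", "Liking", "Gratification"}
    DEFLATING = {"Distress", "Relief", "Satisfaction", "Disappointment",
                 "Fear-confirmed", "Disliking", "Remorse"}

    def awareness_score(egc_output):
        """Add the EVs of motivation-enhancing emotions and subtract the EVs of
        motivation-deflating ones (Table 5.3); other emotions are ignored."""
        score = 0.0
        for emotion, ev in egc_output:
            if emotion in ENHANCING:
                score += ev
            elif emotion in DEFLATING:
                score -= ev
        return score

    # e.g. a student expressing strong Hope and mild Distress:
    print(awareness_score([("Hope", 0.8), ("Distress", 0.2)]))  # approximately 0.6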
5.5 Conclusion
In this chapter, we proposed a method to represent emotions with facial expressions and
introduced some human-computer interface applications.
To generate a human facial expression from sentences, we input the emotions into
the trained “sand glass type neural networks” and extracted a facial expression image from an
emotional space. We defined the assign rules from the EGC emotion types to the facial emotion
types, and we obtained emotional spaces similar to Ekman’s emotion circle from the trained “sand
glass type neural networks.”
We applied the method to applications called JavaFaceMail and JavaFaceChat. JavaFaceMail
has regular mail functions and displays a facial expression image according to the content of the
received mail. JavaFaceChat, on the other hand, analyzes all messages and displays every user’s
facial expression image for the messages. Furthermore, we applied the system to raising students’
awareness in a group lesson.
CHAPTER 6 CONCLUSION
This thesis presented a method to generate some emotions from the user’s viewpoint, and a
method to analyze affirmative/negative intention from the user’s utterances.
First, we proposed the “Emotion Generating Calculations” (EGC). This method generates a
pleasure/displeasure emotion from an event, an attribute, or an is-a relationship in an utterance using
the user’s taste information (Favorite Values). Furthermore, the method calculates the degree of
the pleasure/displeasure.
Second, the EGC method was extended to classify the generated emotions into 20 emotion types
based on the “Emotion Eliciting Condition Theory.” The method distinguishes emotions
based on pleasure/displeasure and some additional conditions.
Third, we proposed a method to analyze the user’s affirmative/negative intention from the user’s
utterances. This method calculates not only the affirmative/negative intention but also the degree
of affirmation/negation by considering adverbs, modalities and interjections.
Fourth, we proposed a method to represent emotions with facial expressions. The emotions extracted
by the EGC method are input into trained “sand glass type neural networks,” and one facial image
is calculated from an emotional space similar to Ekman’s emotion circle.
We constructed a mail tool and a chat tool as applications of the EGC method and the facial
expression selecting method. Furthermore, we applied the methods and the affirmative/negative
intention analyzing method to the interface of the “Web-based analytical system of health
service needs among healthy elderly.”
It is particularly hoped that this thesis will contribute to future natural human-computer interfaces
that consider human emotions. Furthermore, we hope our method will help to construct a good
relationship between humans and computers.
REFERENCES
[1] Osamu Hasegawa, Shigeo Morishima and Masahide Kaneko, “Processing of Facial Information
by Computer”, IEICE Trans., Vol.J80-D-II, No.8, pp.2047-2065 (in Japanese) (1997)
[2] P. Ekman, W. V. Friesen, “The repertoire of nonverbal behavior”, Semiotica, Vol.1, pp.49-98
(1969)
[3] Mehrabian, A., “Nonverbal Communication”, Aldine Atherton (1972)
[4] Hiroshi Harashima, “Intelligent image coding and intelligent communications”, J.ITE, Vol.42,
No.6, pp.519-525 (in Japanese) (1988)
[5] Hiroshi Harashima, K. Aizawa and T. Saito, “Model-based analysis synthesis coding of
videotelephone images-conception and basic study of intelligent image coding”, IEICE Trans.,
Vol.E72, No.5, pp.452-459 (1989)
[6] H. Uwakoto, Y. Kobayashi and Y. Niimi, “Acoustic analysis and modeling of emotional
expressions in speech”, IEICE Technical Report, SP92-131, pp.65-72 (in Japanese) (1993)
[7] H. Kawanami and K. Hirose, “Considerations on the Prosodic Features of Utterances with
Attitudes and Emotions”, IEICE Technical Report, SP97-67, pp.73-80 (in Japanese) (1997)
[8] M. Shigenaga, “Characteristic Features of Emotionally Uttered Speech Revealed by Discriminant
Analysis (III)”, IEICE Technical Report, SP97-66, pp.65-72 (in Japanese) (1997)
[9] Nobuaki Kadotani, Hirotomo Aso, Motoyuki Suzuki and Shozo Makino, “An investigation on
discrimination among emotion expressions contained in speech”, IPSJ SIG-SLP 34-8, pp.43-48 (in
Japanese) (2000)
[10] http://www.darpa.mil/ito/research/com/
[11] E. Levin, S. Narayanan, R. Pieraccini, K. Biatov, E. Bocchieri, G. DiFabbrizio, W. Eckert, S.
Lee, A. Pokrovsky, M. Rahim, P. Ruscitti and M. Walker, “The AT&T-DARPA Communicator
mixed initiative spoken dialogue system”, Proc. International Conference on Spoken Language
Processing (2000)
[12] A. Rudnicky, C. Bennett, A. Black, A. Chotomongcol, K. Lenzo, A. Oh and Singh, R., “Task
and domain specific modeling in the Carnegie Mellon communicator system”, Proc. International
Conference on Spoken Language Processing (2000)
[13] E. D. Os, L. Boves, L. Lamel and P. Baggia, “Overview of the arise project”, Proc. European
Conference on Speech Technology, Eurospeech99, pp.1527-1530 (1999)
[14] H. Asoh, T. Matsui, John Fry, F. Asano and S. Hayamizu, “A spoken dialog system for a mobile
office robot”, Proc. Eurospeech99, pp.1139-1142 (1999)
[15] S. Hashimoto, S. Narita, H. Kasahara, A. Takanishi, S. Sugano, K. Shirai, T. Kobayashi, H.
Takanobu, T. Kurata, K. Fujiwara, T. Matsuno, T. Kawasaki and K. Hoashi, “Humanoid
Robot—Development of an Information Assistant Robot Hadaly—”, 6th IEEE International
Workshop on Robot and Human Communication (RO-MAN’97) (1997)
[16] Tetsunori Kobayashi, “Trend of Spoken Dialogue Research”, Journal of Japan Society for
Artificial Intelligence, Vol.17, No.3, pp.266-279 (in Japanese) (2002)
[17] Naoyuki OKADA, “Representation and accumulation of the concepts of words,” IEICE
Publishers (in Japanese) (1991)
[18] Arnold, M.B., “Emotion and Personality”, New York: Columbia University Press (1960)
[19] Kazuya MERA, Takumi ICHIMURA, Teruaki AIZAWA and Toshiyuki YAMASHITA,
“Invoking Emotions in a Dialog System based on Word-Impressions,” Journal of Japan Society
for Artificial Intelligence, Vol.17, No.3, pp. 186-195 (in Japanese) (2002)
[20] Takumi ICHIMURA, Kazuya MERA and Toshiyuki YAMASHITA, “Construction of a Dialog
System with Emotions for Elderly Persons by Neural Networks”, Proc. of IEEE International
Conference on IEEE SMC (SMC2000), pp. 3594-3599 (in Japanese) (2000)
[21] Wundt, W., “Outlines of Psychology”, Leipzig: Wilhelm Engelmann (1897)
[22] H. Schlosberg, “Three dimensions of emotion,” The Psychological Review, Vol. 61,
No. 2, pp.81-88 (1954)
[23] Plutchik, R., “The emotions”, New York: Random House (1962)
[24] Clark Elliott, “The Affective Reasoner: A process model of emotions in a multi-agent system,”
Ph.D thesis, Northwestern University, The Institute for the Learning Sciences, Technical Report
No. 32 (1992)
[25] Clark Elliott, “Components of two-way emotion communication between humans and
computers using a broad, rudimentary, model of affect and personality,” Bulletin of the Japanese
Cognitive Science Society (in Japanese) (1994)
[26] Ortony, A., Clore, G.L., & Collins, A., “The cognitive structure of emotions,” New York:
Cambridge University Press (1988)
[27] Paul O’Rorke, Andrew Ortony, “Explaining Emotions,” Cognitive Science, Vol.11, pp. 283-323
(1994)
[28] Kazuya Mera, Takumi Ichimura, Toshiyuki Yamashita and Katsumi Yoshida, “Complicated
Emotion Allocating Method based on Emotional Eliciting Condition Theory”, Memoirs of Tokyo
Metropolitan Institute of Technology, Vol.16, pp. 11-16 (in Japanese)
[29] Hideki Mima, Masao Fuketa, Yoshitaka Hayashi and Jun-ichi Aoe, “A Method for
Understanding Intentions of Indirect Speech-Act in Natural Language Interfaces”, the transaction
of the Institute of Electronics, Information and Communication Engineers (IEICE), Vol. J78-D-II,
No. 5, pp. 803-810 (in Japanese) (1995)
[30] Makoto Yoshie, Kazuya Mera, Takumi Ichimura, Toshiyuki Yamashita, Teruaki Aizawa and
Katsumi Yoshida, “Analysis of affirmative/negative intentions of the answers to yes-no questions
and its application to a web-based interface”, Journal of Japan Society for Fuzzy Theory and
Systems, Vol.14, No.4, pp.393-403 (in Japanese) (2002)
[31] Kazuya Mera, Shinji Kawamoto, Kenji Ono, Takumi Ichimura, Toshiyuki Yamashita and
Teruaki Aizawa, “A learning method of individual's taste information”, Proc. of the 5th
International Conference on Knowledge-Based Intelligent Engineering Systems & Allied
Technologies (KES2001), Vol.2, pp.1217-1221 (2001)
[32] Kazuya MERA, Takumi Ichimura and Toshiyuki Yamashita, “Analysis of User Communicative
Intention from Affirmative/Negative Elements by Fuzzy Reasoning and Its Application to
WWW-based Health Service System for Elderly”, Proc. of the 6th Intl. Conf. on Soft Computing
(IIZUKA2000), pp.971-976 (2000)
[33] Koichi Yamada, Riichiro Miziguchi and Naoki Harada, “User’s Utterance Model and
Cooperative Answering for Question-answering Systems”, Journal of IPSJ, Vol.35, No.11,
pp.2265-2275 (in Japanese) (1994)
[34] http://www.aibo.com/
[35] Fujita, M. and Kageyama, K., “An Open Architecture for Robot Entertainment”, Proceedings of
the First International Conference on Autonomous Agents, pp.435-442 (1997)
[36] Hirohide USHIDA, Yuji HIRAYAMA, Hiroshi NAKAJIMA, “Emotion Model for Life-like
Agent and Its Evaluation,” Proceedings of the 15th National Conference on Artificial Intelligence
(AAAI-98), pp. 62-69 (1998)
[37] Hirohide USHIDA, Hiroshi NAKAJIMA, “Software Systems with Emotion,” Journal of Japan
Society for Fuzzy Theory and Systems, Vol. 12, No. 6, pp. 762-769 (in Japanese) (2000)
[38] Shin-ichi Ohnaka, Tomohito Ando and Toru Iwasawa, “The introduction of the personal robot
PaPeRo”, Journal of IPSJ, Vol.37, No.7, pp.37-42 (in Japanese) (2001)
[39] Yoshihiro Fujita, “Personal Robot R100”, Journal of the Robotics Society of Japan, Vol.18,
No.2, p.40 (in Japanese) (2000)
[40] Kaoru Suzuki and Hiroshi Kanazawa, “Pet Robot Using Emotion Triggered Learning Model”,
Toshiba review, Vol.56, No.9, pp.37-40 (in Japanese) (2001)
[41] Heider, F., “Attitudes and Cognitive Organization”, Journal of Psychology, Vol.21 (1946)
[42] Heider, F., “The Psychology of Interpersonal Relations”, New York: Wiley (1958).
[43] Takao Kurokawa, “Nonverbal interface”, Ohmsha (1994).
[44] P.Ekman and W.V.Friesen, “Unmasking the Face: A Guide to Recognizing Emotions from
Facial Clues”, N.J.: Prentice-Hall (1975)
[45] Takumi Ichimura, Hitoshi Ishida, Kazuya Mera, Shinichi Oeda, Akihiro Sugihara and
Toshiyuki Yamashita, “Approach to emotion oriented intelligent system by parallel sand glass
type neural networks and emotion generating calculations”, Journal of Human Interface Society,
Vol.3, No.4, pp.225-238 (in Japanese) (2001)
[46] Takumi ICHIMURA, Kazuya MERA, Hitoshi ISHIDA, Shinichi OEDA, Akihiro SUGIHARA,
Toshiyuki YAMASHITA, “An Emotional Interface with Facial Expression by Sand Glass Type
Neural Network and Emotion Generating Calculations Method”, Proc. of The International
Symposium on Measurement, Analysis and Modeling of Human Functions, pp.275-280 (2001)
[47] Takumi ICHIMURA, Kazuya MERA and Toshiyuki YAMASHITA, “Construction of a Dialog
System with Emotions for Elderly Persons by Neural Networks”, Proc. of IEEE Intl. Conf. on
IEEE SMC (SMC2000), pp. 3594-3599 (2000)
[48] Takehiro KANAYA, “Particle ”WA” is not needed in Japanese”, Kodansha (in Japanese) (2002)
[49] The National Institute for Japanese Language, “Classified vocabularies chart”, Shuuei press (in Japanese) (1972)
[50] Tetsuya Nasukawa et al., “Easy to Use Practical Freeware for Natural Language Processing,” IPSJ Magazine, Vol.41, No.11, pp.1201-1238 (in Japanese) (2000)
[51] T. Masuoka and Y. Takubo, “Basic Japanese Grammar –revised edition–,” Kuroshio Shuppan (in Japanese) (1992)
[52] Yoshifumi Hida and Hideko Asada, “Dictionary of Present-day Adjective Usage,” Tokyodo Publishers (in Japanese) (1991)
[53] Kazuya Mera, Shinji Kawamoto, Takumi Ichimura, Toshiyuki Yamashita and Teruaki Aizawa, “A learning method of individual’s taste information,” Proc. of the 5th International Conference on Knowledge-Based Intelligent Engineering Systems & Allied Technologies (KES2001), Vol.2, pp.1217-1221 (2001)
[54] Ryoko Tokuhisa, Kentaro Inui, et al., “Two Complementary Case Studies for Emotion Tagging in Text Corpora,” Technical Report of JSAI, SIG-SLUD-A003-2, pp.9-14 (in Japanese) (2001)
[55] Japan Society for Fuzzy Theory and Systems, “Fuzzy Logic –Course of Fuzzy Vol. 4–”, Japan Society for Fuzzy Theory and Systems (in Japanese) (1993)
[56] Akihiro Okada and Jun-ichi Abe, “Emotion Research in Psychology: Past and Current Trends,” Journal of Japan Society for Fuzzy Theory and Systems, Vol.12, No.6, pp.730-740 (in Japanese) (2000)
[57] Randolph R. Cornelius, “The science of emotion: Research and tradition in the
psychology of emotions,” Seishin Shobo (in Japanese) (1999)
[58] Lazarus, R. S. & Folkman, S., “Stress, appraisal, and coping”, New York: Springer (1984)
[59] Lazarus, R. S., “Emotion and Adaptation”, New York: Oxford University Press (1991)
[60] Ortony, A., Clore, G.L., & Collins, A., “The cognitive structure of emotions,” New York:
Cambridge University Press (1988)
[61] Paul O’Rorke and Andrew Ortony, “Explaining Emotions,” Cognitive Science, Vol.18, pp.283-323 (1994)
[62] Japan Electronic Dictionary Research Institute, LTD., “Japanese Word Dictionary” in the EDR
Electronic Dictionary version 2.0 (1998)
[63] Tadahiko Kumamoto, Akira Ito and Tsuyoshi Ebina, “Recognizing User Communicative Intention in a Dialogue-Based Consultant System ---A Statistical Approach Based on the Analysis of Spoken Japanese Sentences---”, Transactions of the Institute of Electronics, Information and Communication Engineers (IEICE), Vol.J77-D-II, No.6, pp.1144-1123 (in Japanese) (1994)
[64] Katsumi Yoshida, Takumi Ichimura, Hiroki Sugimori, Takashi Izuno and H. Inada, “Analytical
System of Health Service needs among Healthy Elderly by using Internet”, Proc. of
Gerontechnology Third Intl. Conf. (1999)
[65] Y. Matsumoto, A. Kitauchi, T. Yamashita, Y. Hirano, H. Matsuda and M. Asahara, “Japanese
Morphological Analysis System Chasen Version 2.0”, http://cl.aist-nara.ac.jp/lab/nlt/chasen.html
(1999)
[66] Ken Murasugi, “Likert’s Method of Job Satisfaction Measurement and an Application of Fuzzy Theory on Morale Survey”, Journal of Japan Industrial Management Association, Vol.44, No.2, pp.94-101 (in Japanese) (1993)
[67] Akira Matsumura (ed.), “Daijirin, 2nd edition”, Sanseido (in Japanese) (1995)
[68] The National Institute for Japanese Language, “Adverbs’ meaning and usage”, Printing Bureau, Ministry of Finance (in Japanese) (1991)
[69] Kaoru Tone, “Making a Decision, Feeling Like Playing a Game”, Nikkagiren Press (in Japanese) (1986)
[70] Kazuya Mera, Makoto Yoshie, Takumi Ichimura, Toshiyuki Yamashita, Teruaki Aizawa and
Katsumi Yoshida, “Response generating method and its application to web-based health care
service”, Proc. of the 6th International Conference on Knowledge-Based Intelligent Engineering
Systems & Allied Technologies (KES2002), Vol.1, pp.688-692 (2002)
[71] M. Rosenblum, Y. Yacoob and L. Davis, “Human emotion recognition from motion using a radial basis function network architecture”, Proc. of the Workshop on Motion of Non-Rigid and Articulated Objects, pp.15-25 (1996)
[72] S. Morishima, “Modeling of facial expression and emotion for human communication system”,
Displays, Vol.17, pp.43-49 (1994)
[73] B. Irie and M. Kawato, “Acquisition of Internal Representation by Multi-Layered Perceptrons”, IEICE Trans., Vol.J73-D-II, No.8, pp.1173-1178 (in Japanese) (1990)
[74] N. Fukumura, Y. Uno and R. Suzuki, “Learning of Many-to-Many Relation between Different Kinds of Sensory Information Using a Neural Network Model for Recognizing Grasped Objects”, The Brain & Neural Networks, Vol.5, No.2, pp.65-71 (in Japanese) (1998)
[75] Hitoshi Ishida, Takumi Ichimura, Mutuhiro Terauchi, Tetsuyuki Takahama and Yoshinori Isomichi, “Classification of Facial Expressions using Sandglass-type Neural
Networks”, Proc. of the 10th Fuzzy, Artificial Intelligence, Neural Networks and Soft Computing
(FAN’00), pp.201-204 (in Japanese) (2000)
[76] N. Ueki, S. Morishima, H. Yamada and H. Harashima, “Expression Analysis/Synthesis System Based on Emotional Space Constructed by Multi-Layered Neural Network”, IEICE Trans., Vol.J77-D-II, No.3, pp.573-582 (in Japanese) (1993)
[77] S. Akamatsu, T. Sasaki, H. Fukamachi and Y. Suenaga, “Automatic extraction of target images for face identification using the sub-space classification method”, IEICE Trans., Vol.E76-D, No.10, pp.1190-1198 (1993)
[78] Y. Xiao, N. P. Chandrasiri, Y. Tadokoro and M. Oda, “Recognition of Facial Expressions Using 2-D DCT and Neural Network”, IEICE Trans., Vol.J81-A, No.7, pp.1077-1086 (in Japanese) (1998)
[79] J. A. Russell and M. Bullock, “Multidimensional Scaling of Emotional Facial Expressions: Similarity From Preschoolers to Adults”, Journal of Personality and Social Psychology, Vol.48, No.5, pp.1290-1298 (1985)
[80] Takumi Ichimura, Kazuya Mera, Yoshiaki Miki and Toshiyuki Yamashita, “Emotional Interface
for Human Feelings by Mobile Phone”, Proc. of the 6th International Conference on
Knowledge-Based Intelligent Engineering Systems & Allied Technologies (KES2002), Vol.1,
pp.708-712 (2002)
[81] Web Site of JavaFaceMail (in Japanese),
http://facemail.chi.its.hiroshima-cu.ac.jp/
[82] Web Site of Java™ 2 Platform, http://java.sun.com/j2se/1.3/
[83] Heibonsha, “Dictionary of Psychology”, Heibonsha (in Japanese) (1981)