
Doctoral Dissertation, Tokyo Metropolitan Institute of Technology

Emotion Oriented Intelligent Interface

March 2003

Kazuya Mera

Emotion Oriented Intelligent Interface

Kazuya Mera

2003

Emotion Oriented Intelligent Interface

by

Kazuya Mera

DISSERTATION

Submitted to Tokyo Metropolitan Institute of Technology in partial fulfillment of the requirements for the degree of Doctor of Philosophy

March 2003

ACKNOWLEDGEMENTS

The author wishes to express sincere gratitude to his supervisor, Professor T. Yamashita, whose guidance has been greatly helpful in completing this dissertation.

To Professor S. Fukuda, Professor T. Yamaguchi, and Professor T. Takagi, he is indebted for their continuous support and their valuable comments and suggestions during the preparation of this dissertation.

The author would also like to thank Professor N. Okada, Professor T. Aizawa, and Dr. T. Ichimura for their continuous support and their valuable comments and suggestions during the preparation of this dissertation.

He further wishes to thank Mr. Y. Takehara, Mr. K. Ono, and Mr. S. Kawamoto, fellow graduate students in the master's course, and Ms. M. Fujisawa, Mr. T. Takagi, Mr. S. Shimada, Mr. A. Teramoto, Ms. T. Suehiro, Mr. M. Yoshie, Mr. Y. Sato, Mr. T. Inoue, Mr. F. Kurose, Mr. M. Iemori, Mr. S. Ryu, and Mr. S. Hamano, fellow graduate students in the bachelor's course.

Special thanks are also due to Mr. D. Edward for correcting the language mistakes.

To the colleagues in Yamashita Laboratory (Tokyo Institute of Technology), Natural

Language Processing Laboratory (Hiroshima City University), and SENCE (Hiroshima

City University), the author expresses his thanks for their collaboration.

CONTENTS

CHAPTER 1 INTRODUCTION

CHAPTER 2 EMOTION GENERATING CALCULATIONS
2.1 Agent Model
2.1.1 Agent Model of AIBO
2.1.2 MaC Model
2.1.3 Agent Model in This Study
2.2 Process of Emotion Generating Calculations
2.3 Case Frame Representation
2.3.1 Case Frame Representation in Japanese
2.3.2 Classification of Event Type
2.3.3 Construction of Case Frame Representation
2.4 Favorite Value Database
2.4.1 Favorite Value
2.4.2 Default Favorite Value
2.4.3 Favorite Value Learning Method
2.5 Equations of Emotion Generating Calculations
2.5.1 Equations for Event
2.5.2 Equations for Attribute
2.5.3 Equations for is-a Relationship
2.5.4 Favorite Value of Predicate with Negative Aspect
2.5.5 Favorite Value of Modified Noun
2.6 Emotion Strength Calculation
2.7 Example of EGC Method
2.8 Experimental Result
2.9 Emotion Distinguishing Method based on Emotional Space
2.10 Future Work
2.11 Conclusion

CHAPTER 3 COMPLICATED EMOTION ALLOCATING METHOD BASED ON EMOTION ELICITING CONDITION THEORY
3.1 Emotion Discrimination
3.2 Emotion Eliciting Condition Theory
3.3 Complicated Emotion Allocating Method based on Emotion Eliciting Condition Theory using Emotion Generating Calculation
3.3.1 Fortunes of the Others
3.3.2 Prospect-Based Emotions
3.3.3 Confirmation
3.3.4 Well-Being
3.3.5 Attribution
3.3.6 Well-Being / Attribution
3.4 Dependency among Emotion Groups
3.5 Example of Complicated Emotion Allocating Method
3.6 Experimental Results
3.6.1 Experimentation 1
3.6.2 Experimentation 2
3.7 Future Works
3.8 Conclusion

CHAPTER 4 ANALYSIS OF AFFIRMATIVE/NEGATIVE INTENTIONS FROM USER'S ANSWERS TO YES-NO QUESTIONS
4.1 Intention Analyzing Method from the Utterance
4.1.1 Understanding Intentions of the Indirect Speech-Act in Natural Language Interfaces
4.1.2 Recognizing User Communicative Intention in a Dialogue-Based Consultant System
4.2 An Overview of the Affirmative/Negative Intention Analyzing Method
4.2.1 Web-based Analytical System of Health Service Needs among Healthy Elderly
4.2.2 Affirmative/Negative Intention Analyzing Method for Web-based Analytical System of Health Service
4.3 Affirmative/Negative Element
4.3.1 Affirmative/Negative Description for the Yes-No Question
4.3.2 Direct Expression of Intention in the Response
4.3.3 Indirect Expression of Intention in the Response
4.3.4 Data Structure Description in Question
4.4 Affirmation Value
4.4.1 Affirmative Value of the Interjection
4.4.2 Affirmative Value of the Verb
4.5 Affirmative Value Changing Scale
4.5.1 "Affirmative Value Changing Scale" Affected by the Adverb
4.5.2 Affirmative Value Change by Modality
4.6 Analyzing Intention from Plural Sentences
4.7 Example of Our Method
4.8 Application of Affirmative/Negative Intention Analyzing Method
4.9 Experimental Result
4.10 Future Works
4.11 Conclusion

CHAPTER 5 EMOTION ORIENTED INTERACTION SYSTEMS — FACEMAIL & FACECHAT —
5.1 Facial Expression Generating Method by Parallel Sand Glass Type Neural Network
5.1.1 Sand Glass Type Neural Network
5.1.2 Facial Training Data
5.1.3 Learning Experimental Results
5.2 JavaFaceMail
5.2.1 System Overview
5.2.2 Assign Rules to the Facial Expressions
5.2.3 Mental Effects by Outputting Facial Expressions
5.3 JavaFaceChat
5.4 Emotion Oriented Interactive Interface for Raising Students' Awareness in a Group Lesson
5.4.1 System Overview
5.4.2 Assign Rules to Detect Variances in the Student's Awareness
5.5 Conclusion

CHAPTER 6 CONCLUSION

REFERENCES


CHAPTER 1 INTRODUCTION

Although we have recently been able to access various media through computer networks, some problems remain in the interface tools that support communication between human and computer or between humans. In order to achieve natural communication, the development of interactive communication tools that consider the human mind is expected [1].

Ekman classified body actions relating to communication into three categories: verbal information, paralanguage, and non-verbal information. Verbal information is expressed by strings obtained from sentences, utterances, and so on. Paralanguage is expressed by rhythm, intonation, and so on. Non-verbal information is expressed by facial expressions, gestures, blinks, and so on [2]. It is especially well known that non-verbal information plays an important role in communication and conversation. Mehrabian, a psychologist, reported that "affect reaches the companion through verbal information (weight 7%), paralanguage (weight 38%), and non-verbal information (weight 55%) in a conversation" [3].

Various methods have been proposed for perceiving emotions from non-verbal information. Harashima proposed a method whereby one machine extracts and encodes the variation factors of facial expression from the user's facial image, and another machine located far away receives and decodes the factors and reconstructs the facial expression by computer graphics [4, 5]. Methods to analyze emotions from voice information, such as rhythm, frequency, and sentence length, have been proposed by Uwatoko [6], Kawanami [7], Shigenaga [8], Kadotani [9], et al.

However, people sometimes show facial expressions that differ from their real emotions. For example, a person may smile even if he/she is displeased. In this situation, if the systems above recognize the user's emotion as happiness from the smile, the system will never gain the user's confidence. Therefore, a method to analyze the user's emotions based on not only non-verbal information but also verbal information is required.

It is also important to recognize the user's intention from his/her utterances. If the flexibility and usability of the interface are inadequate, the user's confidence will be spoiled even if the system handles emotions well.

An interface based on natural language dialogue, especially voice, is very effective in communicating the intention and contents of the dialogue to the user. Voice is well suited to input on a portable terminal, including situations where the user's eyes and hands are occupied with another purpose, because using the voice requires no large devices nor the user's eyes or hands. Therefore, there are many studies of vocal interfaces, such as the DARPA Communicator project, which deals with the issues of supporting a travel plan [10, 11, 12], Arise, which guides a train schedule [13], and many other studies on conversing with robots [14, 15]. However, if the system considers the user's emotions and the user feels confident in the system, the user will use much informal language, as in a conversation between two people.

The technology to recognize the voice of informal language is still far from the practical level. A voice recognition project in the U.S. developed a method that can recognize 90% or more of task-dependent voices, but even they have not been able to increase the recognition rate for the voice of informal language [16]. Furthermore, we will meet many more difficulties with voices including hoarseness, cracks, tremors, and dialects, because we are going to apply our system to elderly people.

For these reasons, in this paper we propose a method to recognize the user's mind (emotions and intentions) from the user's utterances in natural language conversation.

First, we propose a method to analyze the user’s emotion concerning the contents of the

utterances.

Generally, the utterances in a dialogue are represented in sentence forms, and they mainly express events and attribute evaluations. Although there are various types, we employ the 11 event types and 6 attribute types that Okada classified [17], and define a calculation for each type. These calculations distinguish pleasure/displeasure based on Arnold's definition of emotion [18]: "Pleasure/displeasure are aroused from the actions relating to appearance/avoidance of liked/disliked objects." We use taste information (Favorite Value: FV) for each object. This calculation method extracts pleasure and displeasure, and the emotion's degree is calculated from the synthetic vector among the FVs of three case elements [19, 20].

However, the classified pleasure/displeasure is too ambiguous to apply to processes that consider the user's emotions. There are many kinds of emotions in human society, like "relief," "expectation," "envy," and so on. Many psychologists have proposed criteria of emotion evaluation to distinguish such various emotions from simple emotions like pleasure/displeasure. For example, Wundt proposed three dimensions, namely "pleasure vs. displeasure," "calmness vs. tension," and "relaxation vs. excitement" [21]. Schlosberg proposed three other dimensions: "pleasure vs. displeasure," "attention vs. rejection," and "strong vs. weak activation" [22]. Plutchik proposed eight kinds of primary emotions (anger, disgust, sadness, surprise, fear, joy, acceptance, and anticipation) and degrees of each emotion [23]. Although these studies are effective for classifying emotions from the viewpoint of the person who experiences them, they are not effective for guessing another person's emotions, because they require the degree of the other's perceptions, such as excitement and attention.

In this paper, we employ the "emotion eliciting condition theory" proposed by Elliott [24, 25] to classify the simple emotions (pleasure/displeasure) into complex emotions. The "emotion eliciting condition theory" was developed for an agent system called the Affective Reasoner, based on Ortony's theory [26, 27]. We classify pleasure/displeasure into 20 kinds of complex emotions by checking the emotion eliciting conditions based on the grammatical features of the user's utterance.

In this theory, we check five conditions: "pleased/displeased about an event," "desirable/undesirable event for another," "prospective event," "confirmed/unconfirmed event," and "approved/disapproved event." We judge "pleased/displeased about an event" based on the pleasure/displeasure extracted by the method described in the previous paragraph, and "desirable/undesirable event for another" is judged from the result of the method using the other person's taste information. The remaining conditions are judged based on grammatical features such as adverbs, tenses, aspects, and the subject of the sentence. This method can extract multiple emotions from one sentence at the same time [28].

On the other hand, there are many studies that analyze the user's intention from his/her utterance.

Mima proposed a method for understanding the intention of the "indirect speech-act" in a natural language interface for operating a computer system [29]. This method detects the user's demand about operations by converting the surface structure of the user's input sentence. However, it is difficult to apply this method to natural language dialogue, because the Japanese language tends to be ambiguous and is tolerant of omission and inversion. This method has to define the words used and their order strictly in order to analyze the intention.

Kumamoto also proposed a method to recognize a user's communicative intention (CI) from natural language dialogue in order to support the usage of a computer. This method extracts "function words," which indicate the features that determine the CI type, and the user's intention is guessed using the determined CI type and grammatical features.

In this paper, we propose a method to analyze a user's affirmative/negative intention from the response to yes-no questions based on these methods [30, 31, 32]. The affirmative/negative intention is expressed not by a binary value but by a real value in the range [0.0, 1.0], to indicate the ambiguity of the user's intention. We analyze the intention by extracting the words that indicate affirmation/negation (affirmative/negative elements), like Kumamoto's function words. There are three types of affirmative/negative words: "affirmative/negative description for the yes-no question," "direct expression of intention in the response," and "indirect expression of intention in the response." "Affirmative/negative descriptions for the yes-no question" do not have any independent meaning, but they show the intention by referring to the content of the question. "Direct expressions of intention in the response" are derivatives of the verb and auxiliary verb in the question. Although there are many types of "indirect expressions of intention in the response" [33], we employ three of them: "indirect information addition," "non-standard reason addition," and "standard reason addition." Their role is to suggest the intention by mentioning the reasons for it.

Each affirmative/negative element has a degree of affirmation/negation (affirmative value) in the range [0.0, 1.0]. The total affirmative/negative intention of the user's utterances is then calculated as the average of the affirmative values of the extracted affirmative/negative elements. This method can deal with not only complete sentences but also sentences including omissions, incomplete expressions, and erroneous voice recognition results, because it refers not to the overall grammar of the sentence but to partial word units.
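As a minimal sketch of this averaging step (Python), where the lexicon entries are illustrative placeholders, not the actual affirmative values defined in Chapter 4:

# Minimal sketch of the averaging step. The lexicon entries below are
# illustrative placeholders for the thesis's affirmative value databases.

AFFIRMATIVE_VALUES = {
    "hai": 1.0,    # "yes": strongly affirmative interjection
    "iie": 0.0,    # "no": strongly negative interjection
    "maa": 0.6,    # "well...": weakly affirmative (hypothetical value)
}

def intention_score(extracted_elements):
    """Average the affirmative values of the extracted elements.

    Returns a real value in [0.0, 1.0]; 0.5 (neutral) when nothing is found.
    """
    values = [AFFIRMATIVE_VALUES[e] for e in extracted_elements
              if e in AFFIRMATIVE_VALUES]
    if not values:
        return 0.5
    return sum(values) / len(values)

print(intention_score(["hai", "maa"]))   # -> 0.8

Because the score is a plain average over whatever elements were found, a response with a missing or misrecognized word simply contributes fewer elements rather than breaking the analysis.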

As described in the preceding paragraphs, we recognize the user's mind (emotions and intentions) from the user's utterances in natural language conversation. The extracted intentions of the user are sent from our proposed interface system to the body of the system, but we also have to consider how to express the extracted emotions.

As agent systems that express emotions, many pet robots have been developed, such as AIBO [34, 35] by Sony, the MaC model [36, 37] by OMRON, PaPeRo [38, 39] by NEC, and ComoComo [40] by Toshiba. All these agents recognize the present environment through their own sensors and generate emotions from their own viewpoint.

We aim for smooth human-computer communication that resembles human-to-human communication. Although expressing only the agent's own emotions is enough for the agent to be treated as the user's pet and to be loved, it is not enough to obtain the user's reliance on a computer interface.

In order to be believed and relied on by the user, we propose a method to express emotions synchronized with the user's emotions.

Heider proposed the P-O-X theory, which is expressed as follows: P stands for a person, O is another person, and X is an impersonal entity (topic, subject, event) that P and O have an opinion about. When P and O have the same opinion (approval/disapproval) about X, they will approve of each other [41, 42].

In this study, P is the user, O is the computer system, and X is the topic of the conversation. If the computer analyzes the user's emotion about the topic based on the user's taste information and expresses that emotion by a facial expression, the user will come to prefer the computer. We are aiming to obtain an affinity with the user.

We employ facial expressions to express the analyzed emotions. Many researchers point to the importance of the non-verbal information conveyed by the face through the faculty of sight [43]. Facial expressions play important roles in expressing emotions, more important than verbal information. We can express emotions from facial expressions alone, and the basic facial expressions are common throughout the world. There are many studies that analyze and express facial expressions and apply them to dialogue systems [1, 44].

Ichimura proposed a method to generate a single facial expression image based on six emotions (anger, happiness, sadness, surprise, fear, and disgust) using a "parallel sand glass type neural network." We propose a method to generate the user's facial expressions based on the emotions extracted by the EGC method. This method uses a "sand glass type neural network" trained with real facial images.

First, we classify the emotions for facial expressions into "happiness," "sadness," "disgust," "anger," "fear," and "surprise," as proposed by Ekman [44]. By training the neural network on these types of facial expressions, each emotion is partitioned on a two-dimensional emotional space constructed from the outputs of the third layer of the neural network.

In order to employ the emotional space, we assign the EGC output (20 kinds of emotions) to the input of the two-dimensional emotional space (6 kinds of emotions), as described in Section 5.2.2. Subsequently, a point on the two-dimensional emotional space is determined from the assigned emotions [45, 46, 47].

We applied this method to mail software and a chat system. The mail software (JavaFaceMail) calculates the emotions from the content of the mail, generates a facial expression image of the sender, and sends the mail with the facial image. The chat system (JavaFaceChat) also generates a facial expression image, like JavaFaceMail. Furthermore, it analyzes the variance of emotions for each user and invites two users into a new closed chat room when their tendencies of variance are alike.


CHAPTER 2 EMOTION GENERATING CALCULATIONS

Although human-computer interfaces have been constructed considering only the machine's circumstances, we now have many opportunities to deal with computers. Such interfaces are inconvenient, especially for elderly people and the handicapped, because they often do not know how to operate them and cannot manipulate the tools, for example inputting data with a keyboard or clicking a small button with a mouse. We consider that advanced interfaces, such as communication by natural language, are convenient even for people in normal health. However, to understand the intention of an utterance and to achieve natural communication between a human and the system, we have to consider human emotion, which is important. The concept of the emotional computer used to be unfamiliar; recently, however, it has become more popular, for example with AIBO made by SONY [34, 35].

We propose an agent model that expresses its own emotions and calculates the user's emotions. When the agent recognizes an event inferred from stimuli outside itself, the agent calculates emotions by evaluating the event based on the individual's likes and dislikes. However, it is very difficult to evaluate all the events in the world. Emotion processing involves two processes, emotion generation and emotion analysis. We believe our method can deal with both; however, we limit ourselves to the emotion analysis process in this paper, as generating the computer's own emotions has not yet gained consensus regarding problems of validity and morality. In this paper, we restrict the stimulus to natural language sentences, as we are going to adopt the agent model for a natural language interface.

In this section, we propose a method by which the agent calculates the emotions expected to arise in a human with regard to an event that the agent recognizes. The agent calculates the emotions by substituting the values of the words' impressions of likes/dislikes (FVs) into the equation prepared for each event type. The strength of the emotion is calculated from the length of the diagonal of a rectangular solid whose edges are the terms (Subject, Object, and Predicate) in the equation. Furthermore, the procedures to construct the default FV database and to learn FVs from the dialogue are described. The calculated emotions are expressed by facial expression and response; this method is presented in Chapters 5 and 6. We evaluated this method's validity by comparing the emotions extracted by the system against the responses of various individuals.

2.1 Agent Model

Our purpose is to realize natural dialogue between human and computer by considering human emotion. In this section, we propose an agent model that can make adequate responses and express adequate emotions in reaction to verbal input from a human. First, we introduce some emotional agents that generate and express emotion in response to stimuli from the external world, such as Sony's AIBO [34, 35] and OMRON's MaC model [36, 37]. Next, we explain the structure of our agent model.

2.1.1 Agent Model of AIBO

Sony made an entertainment robot called AIBO. It is an autonomous walking machine in the real world, and it recognizes the environment by external/internal sensors such as a camera, a microphone, touch sensors, a battery level sensor, and so on. AIBO generates instincts from the states of these sensors and elicits emotions based on the instincts. It then calculates its emotion from the instincts and expresses its emotions as gestures, light signals, and sounds.

AIBO has five instincts: to sleep, to be a pet, to charge, to explore, and to play with someone. It acts on its instincts and uses expressive gestures to tell you its desires. AIBO has six emotions: joy, sadness, anger, surprise, fear, and discontent. AIBO expresses these emotions through its horn lights, sounds, and gestures [34, 35].

Figure 2.1 shows the agent model of AIBO. This figure was drawn by the author based on the above explanation.

Figure 2.1 Agent model of AIBO: external sensors (light, microphone) and internal sensors receive outer and inner stimuli from the external world; the instinct domain (to sleep, to be a pet, charge, explore, and play with someone) feeds the emotion domain (joy, sadness, anger, surprise, fear, and discontent), and the actuators express the result as gestures, sounds, and light signals.

2.1.2 MaC Model

Ushida et al. propose an emotion model, the MaC (Mind and Consciousness) model, for life-like agents with emotions and motivations, as shown in Figure 2.2. The model consists of reactive and deliberative mechanisms. The former covers direct mapping from sensors to effectors. The deliberative mechanism has two processes, the cognitive and the emotional. The cognitive process executes recognition, decision-making, and planning. The emotional process generates emotions according to the cognitive appraisals.

The process of emotion generation is divided into two steps, as shown in Figure 2.3. In the first step, emotional factors (i.e., desirability, praiseworthiness, and appealingness) are computed. The levels of the emotional factors are obtained using emotion eliciting condition rules. This model uses seven emotional factors: Goal success level (GSL), Goal failure level (GFL), Blameworthy level (BWL), Pleasant feeling level (PFL), Unpleasant feeling level (UFL), Unexpected level (UEL), and Goal crisis level (GCL). The second step is to compute emotion intensities, which are obtained using emotional factors, time decay, and other emotions. Emotional factors influence the intensities through production rules [36, 37].

Figure 2.2 Conceptual model of the mind and consciousness: sensors feed the cognitive process (recognition, decision making, planning), whose cognitive appraisal drives the emotional process (desirability, praiseworthiness, appealingness) that generates emotions; a reflex path maps sensors directly to effectors in the environment.

2.1.3 Agent Model in This Study

The agent models shown in the former sections receive stimuli from the external world and generate emotions based on those stimuli. They then express the emotions using effectors and actuators. To express emotions, AIBO and the MaC model use actions. We propose an agent model that can deal with natural language and emotion to communicate with a human being naturally, as shown in Figure 2.4. Natural language processing in the agent model consists mainly of three parts:

(1) Analysis of input utterance (Sentence analysis domain),

(2) Decision of what-to-say (Dialogue planning domain),

(3) Decision of how-to-say (Sentence generation domain).

When a user’s utterance comes from “External world,” “Sentence analysis domain” analyzes the

utterance content and extracts the user’s intention. Next, “Dialogue planning domain” makes the

response content based on the user’s intention, the conversation’s log and the present feeling. Then

“Sentence generation domain” makes the responses and outputs it to “External world.”

These domains transfer their data through the "Internal world domain," which works as short-term memory and manages the agent model, the user model, and the current situation. The "Memory management domain," on the other hand, works as long-term memory and accumulates knowledge and experience.

Figure 2.3 Framework for emotion generation: a situation (current goal, current object, distance to object, object's contribution to the goal, other's action) is evaluated by emotion eliciting condition rules (e.g., "If getting an object succeeds and its contribution degree to a goal is high and the goal's importance is high, then the GSL is high.") to yield the emotional factors (GSL, GFL, BWL, PFL, UFL, UEL, GCL), from which fuzzy inference computes the emotions (happiness, anger, sadness, disgust, fear, surprise).

In the emotional process within the model, the "Emotion generation domain" observes the output of the "Sentence analysis domain" through the "Internal world" and extracts emotions from the various events expressed in the user's utterances. The "Face selection domain" receives these emotions and selects an adequate facial expression using a neural network, and the facial expression is drawn on the display. The extracted emotions also influence other domains, not just the "Face selection domain."

2.2 Process of Emotion Generating Calculations

We present the Emotion Generating Calculation (EGC) method, which extracts emotions from utterances. The method is constructed by focusing on the similarities between grammatical structures and the semantic structures within the case frame representation. The input of our agent model is the sentence of the user's utterance, and the outputs are responses by utterance and facial expression. Figure 2.5 shows the procedure of our EGC method.

First, the user's utterances are transcribed into the case frame representation based on the results of morphological analysis and parsing, because the input form of our proposed method is the case frame representation.

Figure 2.4 Agent model of our system: utterances arrive from the external world at the sentence analysis domain; the dialogue planning, sentence generation, emotion generation, and face selection domains exchange data through the internal world, backed by the memory management domain, and the agent outputs an utterance and a facial expression.

Next, the agent extracts pleasure/displeasure and its strength from an event described by the case frame representation. In the psychological field, "unpleasure" is often used as the opposite of "pleasure"; we use "displeasure," however, because an explicit intention of "unhappy" should be indicated. Before this process, the agent performs morphological analysis and parsing on the input utterance. The agent then calculates the degree of pleasure/displeasure from the length of the diagonal of a rectangular solid whose edges are the terms of the EGC equation. EGC uses eight types of equations for the 12 types of events classified by Okada [17]. The agent substitutes the word concepts' impression degrees of likes/dislikes (FVs) into the equations. The equations consist of two or three terms, and the terms mainly denote the subject, object, and predicate. The method also calculates the degree of the extracted emotion (Emotion Value: EV) using the FVs of these terms. Furthermore, negatives and noun phrases are also used in these calculations.
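As a plausible reading of this "diagonal of a rectangular solid" description (the exact formula appears in Section 2.6, which is outside this excerpt), the strength of a three-term emotion would be the Euclidean length of its FV terms:

|EV| = √(f_S^2 + f_O^2 + f_P^2)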

The agent then divides this simple emotion (pleasure/displeasure) into 20 various emotions based on Elliott's "Emotion Eliciting Condition Theory." Elliott's theory requires judging conditions such as "feeling for another," "prospect and confirmation," and "approval/disapproval." "Feeling for another" means someone else's emotion about the utterance's content, and it is judged from the EGC result using the other person's taste information. The method extracts aspects and adverbs relating to tense to judge "prospect and confirmation." "Approval/disapproval" is judged from the utterance's case frame representation with a transitive verb.

Unlike AIBO and the MaC model, this method calculates not the agent's own emotions but the user's emotions. This enables an adequate facial expression that sympathizes with the user's emotion, and it helps the agent avoid utterances that would cause displeasure.

Figure 2.5 Procedure of the EGC method: the input utterance undergoes morphological analysis and parsing and is transcribed into a case frame representation; the EGC method, consulting the Favorite Value database and the EGC equations database, extracts pleasure/displeasure, which is classified into 20 types of emotions using tense and aspect information and passed to the dialogue planning and face selection domains.

2.3 Case Frame Representation

2.3.1 Case Frame Representation in Japanese

The case frame structure is based on the predicate phrase and the other case elements that connect to it. There are two types of case frame structure: the "surface structure," based on the word string of a sentence, and the "deep structure," based on the content of a sentence. Since we deal with Japanese in this study, analyzing the surface structure is very difficult, because Japanese sentences can be formed without a subject or an object, and particles with various functions, such as "WA," are used frequently [48]. In this study, the deep structure is used in order to avoid the ambiguities that exist in Japanese.

2.3.2 Classification of Event Type

We meet an infinite number of events in this world, and it is impossible to propose emotion generating rules for every individual event. We therefore have to classify the events.

Okada classified event concepts into "simple event concepts," which are represented by connections of case elements, and "combined event concepts," which are represented by combinations of simple event concepts. In this paper, we deal only with "simple event concepts," because "combined event concepts" can be dealt with later once a method for "simple event concepts" is established.

Okada presented the following case element types to express an event: Subject, Object, Object-From, Object-To, Object-Mutual, Object-Source, Object-Content, Implement, Location, Time, Reason, and Degree [17]. Okada also defined seven essential elements (Subject, Object, Object-From, Object-To, Object-Mutual, Object-Content, and Implement) as the minimum elements necessary for recognizing an event. Okada classified the simple event concepts recorded in a classified vocabulary chart [49] into 12 types based on the minimum necessary elements of the event. Table 2.1 shows all the types and their examples. Types I-V are intransitive verbs, types VI-XI are transitive verbs, and type XII covers the remaining events.

We presume that the events within an event type are processed with the same semantic structure that humans recognize. For example, although "Smoke goes up a chimney." and "The man left town." feel like completely different events, both share the common form "Subject's place changes from one location to another." We propose a method to generate emotion for each event type [19, 20].


Table 2.1 Event types

Type Event type Example sentence

I V(S) I run.

II V(S, OF) The man left town.

III V(S, OT) He goes to school.

IV V(S, OM) The beer is mixed with water.

V V(S, OS) The child disobeys his parents.

VI V(S, O) He bends a branch.

VII V(S, O, OF) The driver unloaded the baggage from the car.

VIII V(S, O, OT) He put the book into his bag.

IX V(S, O, OM) The man bumped the enemy’s head against a wall.

X V(S, O, I) He carved wood with his knife.

XI V(S, O, OC) I feel the wind refreshingly.

XII Others

2.3.3 Construction of Case Frame Representation

In order to transcribe the user's utterances into the case frame representation, we first apply morphological analysis and parsing to the input sentence. We use JUMAN as the morphological analyzer and KNP as the parser. ChaSen is another popular morphological analyzer for Japanese; however, we adopted JUMAN because it is required by KNP. Both JUMAN and KNP were developed at Kyoto University [50].

Figure 2.6 shows an example of the JUMAN and KNP processes. In this example, the sentence "KARE GA WATASHI NO KURUMA WO KOWASHITA. (He broke my car.)" is input. First, JUMAN separates the sentence into seven morphemes. Next, KNP analyzes their relationships and constructs the grammatical structure.

This process yields the grammatical structure of the sentence. However, the result is expressed as a surface structure, while our method needs the deep structure. Next, all case elements are classified based on their particles. KNP outputs the surface case frame structure, with case elements named after their particles, such as GA-case and WO-case. We therefore propose the translation rules shown in Table 2.2, which translate these case names into the deep structure based on the usage of the particles [51].

Figure 2.6 Example of morphological analysis and parsing. The input sentence is 「彼が私の車を壊した。」 ("KARE GA WATASHI NO KURUMA WO KOWASHITA."; He broke my car.). JUMAN separates it into seven morphemes (彼 / が / 私 / の / 車 / を / 壊した, plus the final period), and KNP builds the dependency structure: 彼が (KARE GA) and 車を (KURUMA WO) depend on 壊した (KOWASHITA), while 私の (WATASHI NO) modifies 車を (KURUMA WO).

Table 2.2 Translation rules from particle case element names to deep case element names

Particle        Case Element
GA              Subject
KARA            Object-From
NI / E / MADE   Object-To
NI              Object-Mutual
TO              Object-Source
WO              Object
DE              Implement
WA / MO         Subject / Object

Under these rules, the NI-case can be classified as either "Object-To" or "Object-Mutual"; it is disambiguated based on the event type of the predicate in the sentence. The WA-case and MO-case can likewise be classified as "Subject" or "Object": when the sentence has no "Subject," the case becomes "Subject"; when the sentence has no "Object," it becomes "Object"; and when neither exists, it is provisionally "Subject."

The case frame representation also carries tense and aspect information, which is extracted from the auxiliary verbs in the predicate phrase. We limit the tense/aspect information under consideration to past, future, and negation, because these are effective in generating emotion. The sentence in Figure 2.6 is transcribed into the following deep case frame structure:

Predicate: KOWASU
Subject: KARE
Object: WATASHI NO KURUMA
Tense: past
Aspect: nothing
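To make this pipeline concrete, the following is a minimal Python sketch of the particle-to-deep-case translation of Table 2.2; the simplified (word, particle) input format and the disambiguation shortcuts are assumptions for illustration, not the actual JUMAN/KNP interface.

# Sketch of the Table 2.2 translation rules. A real implementation would
# consume KNP's output rather than pre-split (word, particle) pairs.

PARTICLE_TO_CASE = {
    "GA": "Subject",
    "KARA": "Object-From",
    "NI": "Object-To",       # NI can also be Object-Mutual, depending on
    "E": "Object-To",        # the event type of the predicate
    "MADE": "Object-To",
    "TO": "Object-Source",
    "WO": "Object",
    "DE": "Implement",
    "WA": "Subject/Object",  # filled into whichever slot is still empty
    "MO": "Subject/Object",
}

def to_deep_frame(predicate, phrases, tense="present", aspect=None):
    """Translate (word, particle) pairs into a deep case frame dict."""
    frame = {"Predicate": predicate, "Tense": tense, "Aspect": aspect}
    for word, particle in phrases:
        case = PARTICLE_TO_CASE.get(particle)
        if case == "Subject/Object":
            case = "Subject" if "Subject" not in frame else "Object"
        if case:
            frame[case] = word
    return frame

# "KARE GA WATASHI NO KURUMA WO KOWASHITA." (He broke my car.)
print(to_deep_frame("KOWASU",
                    [("KARE", "GA"), ("WATASHI NO KURUMA", "WO")],
                    tense="past"))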

2.4 Favorite Value Database

2.4.1 Favorite Value

We calculate pleasure/displeasure about an event by substituting the value that represents the degree of like/dislike (FV) into the EGC equation. We give positive numbers to objects the user likes and negative numbers to objects the user dislikes. The FV is predefined as a real number in the range [−1.0, 1.0].

There are two types of FVs: the personal FV and the default FV. A personal FV is stored in a personal database for each person whom the agent knows well, and it shows the degree of like/dislike for an object from that person's viewpoint. The default FV, on the other hand, shows the common degree of like/dislike for an object as the agent feels it. Generally, it is generated from the agent's own taste information according to the results of questionnaires. Both personal and default FVs are stored in each user's Favorite Value database. An object's FV is retrieved by the following procedure (Figure 2.7; a code sketch follows the list):

1. Retrieve the object in personal Favorite Value database.

2. Retrieve the upper concept’s FV in default value database.

3. Retrieve further upper concept’s FV in default value database.

4. Retrieve the object or the upper concept in default Favorite Value database.

5. Give the object the value 0 as the FV when there is no information in any database.
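A minimal Python sketch of this lookup, with a toy is-a hierarchy and FV entries standing in for the actual databases (the values mirror the example in Figure 2.7):

PERSONAL_FV = {"Pochi (Taro's dog)": -0.8}
DEFAULT_FV = {"Dog": -0.5, "Dachshund": 0.7}
UPPER_CONCEPT = {"This dog": "Dog", "Spitz": "Dog",
                 "Doberman": "Dog", "Dachshund": "Dog"}

def favorite_value(obj):
    """Steps 1-5 above: personal database first, then climb the is-a
    hierarchy through the default database."""
    if obj in PERSONAL_FV:                    # step 1: personal database
        return PERSONAL_FV[obj]
    concept = obj
    while concept is not None:
        if concept in DEFAULT_FV:             # steps 2-4: default database
            return DEFAULT_FV[concept]
        concept = UPPER_CONCEPT.get(concept)  # move to the upper concept
    return 0.0                                # step 5: no information anywhere

print(favorite_value("This dog"))   # -> -0.5, inherited from "Dog"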

2.4.2 Default Favorite Value

Default FVs are predefined based on a corpus in the field where the system is applied. Objects (nouns), event cores (verbs), and attributes (adjectives) have FVs.

First, we predefined the attributes' FVs based on the "Dictionary about Usage of Present-day Adjectives [52]." This book contains a list of adjective images; the positive/negative images and their degrees for 1,010 adjectives are listed in the range [−3, +3]. However, some of the adjectives carry both images. For example, when the word "cool" is used about temperature, it means "not so cold, comfortable," whereas when it is used about eagerness, it means "not eager, lacking will." In this paper, we do not deal with such words, because identifying the difference in meaning is very difficult.

Next, we predefined the favorite degrees of the event cores. In the EGC method, pleasure/displeasure is extracted based on the approach/avoidance of a likable/dislikable object. We therefore gave positive numbers to verbs related to "gain" and negative numbers to verbs related to "lose."

The FVs of the objects were obtained from a questionnaire. For this purpose, we constructed a favorite-data collecting system on the WWW. It shows nouns with an input frame for the FV, and each subject inputs a real number in the range [−1.0, 1.0] for each word. We adopted the average of all subjects' reply values as the objects' default FVs. However, there are countless objects in the world. In this paper, we limited the objects that have a default FV to the words that frequently appear in dialogues about the field where our method is applied.

Figure 2.7 Retrieving a further upper concept's favorite value (example hierarchy: "Dog" FV = −0.5; "Spitz" FV = null; "Doberman" FV = −0.5; "Dachshund" FV = 0.7; "This dog" FV = null; "Pochi (Taro's dog)" FV = −0.8; the unknown "This dog" inherits FV = −0.5 from "Dog").

2.4.3 Favorite Value Learning Method

The EGC method needs the objects' FVs, and the values are predefined from a questionnaire on the WWW, as described in Section 2.4.2. However, tastes differ greatly among people, so a common predefined FV database cannot fit every user; even for the same person, preferences for objects can change easily. People therefore generally infer a partner's taste information during dialogue.

We propose four FV learning methods to learn the user's taste information from the dialogue, using grammatical knowledge and already-known FVs [53]:

1) Direct expression about like/dislike

2) Favorite Value changing situations

3) Association of displeasure with the object

4) Backward calculation from the emotional expression

1) Direct expression about like/dislike

When we guess a person's taste information, we pay attention to the words "like" and "dislike," which are used to state one's impression of something. In this method, when the sentence's nominative is the person and its predicate is like/dislike, the word in the object frame is regarded as liked/disliked by that person.

Some adjectives also carry good/bad images, which are identified using the "standard good/bad image of adjectives table [52]." Adjectives that carry good/bad images are then treated in the same way as "like" and "dislike."

For example, when the agent hears the sentence "I like apples," the agent recognizes that the user's taste for apples is "like"; when the sentence is "My sister is shameless," the agent can guess that the user does not like his/her sister, because the word "shameless" has a bad image.

2) Favorite Value changing situations

An FV naturally increases when an object does something useful or pleasing to the agent, and decreases when an object does something harmful or unfavorable. The current FV for the predicate of an event is assigned a pre-determined numerical value.

In this approach, the FV for an object is calculated, from the agent's knowledge structure, based on the situations that will influence the FV. Such situations are called "Favorite Value Changing Situations," and each is defined by three parts: Condition (events), Situation (the situation represented by the condition), and Favorite Value change (increase or decrease). Here is an example.


This example situation indicates that the agent dislikes a person who dates two people at the same time. The situation in which a person P1 is dating two persons causes the FV for P1 to decrease.

Condition: (date ((Subj P1) (Obj-M P2))) and (date ((Obj-M P1) (Goal P3)))

Situation: P1 is dating two persons at the same time.

Favorite Value change: FV for P1 decreases
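As an illustration, such a rule can be represented as data plus a matching predicate. The following Python sketch encodes only the dating example, and the decrement of -0.1 is a hypothetical value, since the text states only that the FV decreases:

# A "Favorite Value Changing Situation" as data plus a matching predicate.
DATING_RULE = {
    "situation": "P1 is dating two persons at the same time.",
    "fv_change": -0.1,   # hypothetical decrement
}

def fires_for(events):
    """Return the persons P1 who are dating two different partners."""
    partners = {}
    for e in events:
        if e.get("Predicate") != "date":
            continue
        if e.get("Subject"):                   # (date (Subj P1) (Obj-M P2))
            p1, other = e["Subject"], e.get("Object-Mutual")
        else:                                  # (date (Obj-M P1) (Goal P3))
            p1, other = e.get("Object-Mutual"), e.get("Goal")
        partners.setdefault(p1, set()).add(other)
    return [p1 for p1, others in partners.items() if len(others) >= 2]

events = [
    {"Predicate": "date", "Subject": "P1", "Object-Mutual": "P2"},
    {"Predicate": "date", "Object-Mutual": "P1", "Goal": "P3"},
]
print(fires_for(events))   # -> ['P1']: decrease P1's FV by DATING_RULE["fv_change"]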

3) Association of displeasure with the object

An object that often participates in something displeasing tends to be disliked, because it is associated with past displeasing events. When a person encounters unpleasant events, he comes to hate the objects that participate in them. We therefore decrease the FV of every object that appears in a displeasing utterance. We define the degree of FV change by this method as the smallest of our four FV learning methods, because the effect arises only from the recurrence of similar situations.

For example, when the agent hears the sentences "I was struck by my brother with a bat yesterday." and "I was scolded by the captain for forgetting my bat.", the agent comes to guess that the user somewhat hates bats.

4) Backward calculation from the emotional expression

This method guesses the user's impression of the utterance's contents from the emotional expression accompanying the speech. For example, the user probably feels pleased about the utterance's contents when he smiles or his voice sounds pleasant. Guessing the user's impression is possible using not only non-verbal expression but also cause-and-effect relations such as "I am sad because ..." and "..., however, I feel happy." [54]

If a sentence contains a word whose FV is not defined and the agent has already guessed the user's impression of the sentence, the undefined word's FV can be estimated by calculating the EGC backward.
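As a worked illustration with assumed numbers: for a type I event (whose equation EV = f_S × f_P is defined in Section 2.5.1), if the user's smile suggests EV ≈ +0.4 and f_S = 0.8 is already known, the unknown FV follows by division:

f_P = EV / f_S = 0.4 / 0.8 = 0.5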

2.5 Equations of Emotion Generating Calculations

Arnold defined "emotions" as tendencies of activation regarding the approach to or avoidance of good/bad objects [18]. We define the EGC equations for each of the event types described in Section 2.3.2. These equations are used for detecting the user's pleasure/displeasure. First, we assumed the following conditions for extracting pleasure, based on Arnold's definition; the conditions for extracting displeasure are the opposites of the pleasure conditions.

1. Favorite agent gains a benefit. / Detestable agent suffers a loss.

2. The condition of favorite/detestable agent becomes better/worse.


3. Favorite/detestable agent gets good/bad evaluation.

4. Favorite/detestable agent has a favorite/detestable attribute.

In this section, we propose the equations for each case frame type. We define the following

variables in the equations based on Okada’s classification [17] as described in Section 2.3.2.

f_S : FV of Subject
f_O : FV of Object
f_OF : FV of Object-From
f_OT : FV of Object-To
f_OM : FV of Object-Mutual
f_OS : FV of Object-Source
f_OC : FV of Object-Content
f_P : FV of Predicate

2.5.1 Equations for Event

We define the equations for each event type shown in Table 2.1.

Type I: An event of type I expresses "Subject (S) does Predicate (P), and the influence reaches S." The relationship between these elements' FVs and the generated emotion is shown in Table 2.3, based on the pleasure extracting condition "Favorite agent gains a benefit. / Detestable agent suffers a loss." The equation for this event type is therefore the product of f_S and f_P. The pleasure felt when "a detestable agent suffers a loss" means "It serves him/her right."

EV = f_S × f_P    (1)

Table 2.3 Relationship between FVs and generated emotion

                                 Subject
                            Like (+)         0   Dislike (−)
Event Core   Benefit (+)    Pleasure (+)     0   Displeasure (−)
(Predicate)  0              0                0   0
             Suffer (−)     Displeasure (−)  0   Pleasure (+)

Type II and III: The events in types II and III express "The state of Subject (S), which has a relation to Predicate (P), changes from Object-From (OF) to Object-To (OT)."

When the event means a "change of position or quantity," like "go" and "stray," we judge whether the present state is becoming better or worse from the difference between f_OT and f_OF. We then calculate pleasure/displeasure as the product of f_S and (f_OT − f_OF), in the same manner as type I. We give f_OT in type II and f_OF in type III the value 0 as the default position when they are not indicated clearly in the dialogue.

EV = f_S × (f_OT − f_OF)    (2)

However, event type III also contains events that express a "change of mind or feeling," like "aspire" and "be suited." Here we give f_OF the value 0, because only the content of Object-To affects the emotion. "Changing for the better/worse" is then expressed by the sign of the Predicate's FV.

EV = f_S × (f_OT − 0) × f_P    (3)

We therefore define the following equation, which combines (2) and (3), for event types II and III. When the event means a "change of position or quantity," we give the event core's FV a positive number, reducing the equation to (2).

EV = f_S × (f_OT − f_OF) × f_P    (4)

Type IV: The events in type IV express "Subject (S) and Object-Mutual (OM) have a relation to Predicate (P)." Figure 2.8 shows the relationship between the FVs of S, OM, and P and the generated emotion. In the figure, the signs beside the arrows are the FVs of Subject or Object-Mutual. The FVs of predicates that mean closeness are positive, and the FVs of predicates that mean avoidance are negative. We can then obtain the EV as the product of f_S, f_OM, and f_P, based on the relationships in Figure 2.8.

EV = f_S × f_OM × f_P    (5)

Figure 2.8 Relation between favorite values of S, OM, P and generated emotion (pleasure events: S+/OM+ with closeness, S−/OM− with closeness, S+/OM− with avoidance; displeasure events: S+/OM− with closeness, S+/OM+ with avoidance, S−/OM− with avoidance).

Type V: The events in type V express "Subject (S) and Object-Source (OS) do Predicate (P) at the same time." For example, the event "The child disobeys his parents." includes two viewpoints: "the child is defiant" and "his parents are disobeyed." So even if the agent does not care about the child, if the agent likes the parents, the agent will feel sorry for them. We reverse the sign of the Predicate's FV when we calculate the emotion for the opposite viewpoint, because the meaning of the predicate is also reversed.

EV = (f_S × f_P) + (f_OS × (−f_P)) = (f_S − f_OS) × f_P    (6)

There are other verbs in type V, like "adhere" and "originate." We do not define any equation for them, because there is no example in which they arouse pleasure/displeasure.

Type VI: There are two kinds of events in type VI: those that pay attention to the Subject's action, like "like" and "dance," and those that pay attention to the Object, like "bake" and "turn over." The former events express "Subject (S) does Predicate (P) to Object (O)." We define the equation as the product of not only f_S and f_P but also f_O, because the Object also has a large effect on the action.

EV = f_S × f_O × f_P    (7)

The latter events, on the other hand, express "The Object has the Predicate done to it by the Subject." An event belongs to this kind when its predicate is a transitive verb. In this case, we focus on "the Object has the Predicate done to it." We use the FVs of "do" and "be done" separately, because these two predicates are treated as different words in Japanese. The variable f_P in the following equation means the FV of "the action of the Predicate is done."

EV = f_O × f_P    (8)

Type VII and VIII: We define an equation based on the same idea as for event types II and III. However, the agent of the event is not the Subject but the Object in these event types, because their predicates are transitive verbs, the same as in the latter kind of type VI. We therefore replace f_S with f_O in equation (4).

EV = f_O × (f_OT − f_OF) × f_P    (9)


Type IX: The events in type IX express "Object (O) and Object-Mutual (OM) have a relation to Predicate (P) through Subject (S)," like event type IV. We therefore replace f_S with f_O in equation (5), as in types VI, VII, and VIII.

EV = f_O × f_OM × f_P    (10)

However, some predicates belonging to this type relate to "exchange." In this case, knowing who owns the objects is important for understanding the benefit or loss. For example, for the event "Taro substituted the roll of bills for the roll of counterfeit bills," nobody can tell who gains a benefit or who suffers a loss, because there is no information about who owns the bills or the counterfeit bills. To obtain information about the owner, we would have to consult not only the content of the event but also the conversation log, the concept of the object, common knowledge, and so on. We therefore do not define any equation for "exchange" predicates, as our present agent model can deal only with emotion generation from a single event.

Type X: Because the case element Implement affects only the degree of the generated emotion, the meaning of a type X event is similar to that of the latter kind of type VI event. The equation for this event type is therefore the same as equation (8).

EV = f_O × f_P

Type XI: The content of this type's event contains the part "Object (O) is Object-Content (OC)." We regard the emotion generated from this part as the emotion from the whole event.

EV = f_O × f_OC    (11)

We were not able to define any equation for type XII (others), because the features of their predicates are too varied to unify their concepts.
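Collecting the equations above, the following is a compact Python sketch of the per-type EV calculation; the "VI-a"/"VI-b" labels for the two readings of type VI and the numeric values in the demonstration are illustrative assumptions.

def ev(event_type, f):
    """Emotion value for one event; f maps term names to favorite values.
    Missing Object-From/Object-To terms default to 0 (cf. types II/III)."""
    s, o, p = f.get("S", 0.0), f.get("O", 0.0), f.get("P", 0.0)
    of, ot = f.get("OF", 0.0), f.get("OT", 0.0)
    om, os_, oc = f.get("OM", 0.0), f.get("OS", 0.0), f.get("OC", 0.0)
    table = {
        "I":    s * p,                   # equation (1)
        "II":   s * (ot - of) * p,       # equation (4), combining (2) and (3)
        "III":  s * (ot - of) * p,       # equation (4)
        "IV":   s * om * p,              # equation (5)
        "V":    (s - os_) * p,           # equation (6)
        "VI-a": s * o * p,               # equation (7), Subject's action
        "VI-b": o * p,                   # equation (8), Object is acted on
        "VII":  o * (ot - of) * p,       # equation (9)
        "VIII": o * (ot - of) * p,       # equation (9)
        "IX":   o * om * p,              # equation (10)
        "X":    o * p,                   # same as equation (8)
        "XI":   o * oc,                  # equation (11)
    }
    return table[event_type]

# "He broke my car." (type VI-b): f_O("my car") = 0.8, f_P("be broken") = -0.9
print(ev("VI-b", {"O": 0.8, "P": -0.9}))   # -> -0.72 (displeasure)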

2.5.2 Equations for Attribute

Okada also classified the attribute concepts recorded in the classified vocabulary chart [49] into seven types based on the minimum necessary elements of the attribute concepts. Table 2.4 shows all the types and their examples. In this table, there is a new case element type C (Comparative-object), and "A" means Attribute. Furthermore, in the following equations, f_C means the FV of the Comparative-object and f_P means the FV of the Attribute [19, 20].

Table 2.4 Types of attribute concepts

Type   Attribute type   Example sentence
I      A (S, C)         He is taller than her.
II     A (S, OF, C)     Japan is farther from Europe than America is.
III    A (S, OT, C)     Japan is closer to Europe than America is.
IV     A (S, OM, C)     (no example)
V      A (S, OS, C)     Taro knows more about chemistry than about mathematics.
VI     A (S, O, C)      Hanako likes oranges more than apples.
VII    Others           A is equal to B.

Type I to V: Because OF, OT, OM, OS, and C are used only for expressing the degree of the attribute, only the case elements Subject and Attribute relate to pleasure/displeasure. The relationship between these elements' FVs and the generated emotion is shown in Table 2.5, based on the pleasure-extracting condition “a favorite agent gains a benefit / a detestable agent suffers a loss.” The equation of this event type is therefore expressed as the product of f_S and f_P.

$$EV = f_S \times f_P$$

Table 2.5 Relationship between FVs and generated emotion

                         Subject
Attribute (Predicate)    Like (+)          0    Dislike (–)
Favorite (+)             Pleasure (+)      0    Displeasure (–)
0                        0                 0    0
Detestable (–)           Displeasure (–)   0    Pleasure (+)

Type VI: In this type, the Attribute evaluates not the Subject but the Object. We therefore give this type the same equation (8), since the Object and the Attribute are the related elements.

$$EV = f_O \times f_P$$

We were not able to define an equation for type VII (others), because the features of its predicates are too varied to unify into a single concept.

2.5.3 Equations for is-a Relationship

There are three concept types expressed in a sentence: the event concept using a verb, the attribute concept using an adjective, and the is-a concept using “is.” The form of the is-a relationship concept is mainly “Subject (S) is Noun (N),” and there are the following three types of relations between S and N [51].

1. S is a kind of N. (e.g., “Hamlet is a story written in medieval times.”)

2. S and N are the same object. (e.g. “Shakespeare is a writer of Hamlet.”)

3. There is no direct relationship between S and N. (e.g. “Boku wa Hamlet da.” (in Japanese))

The third type of expression does not exist in English. In Japanese, the meaning of this expression depends on the topic. However, all three types of expression mean that there is a relationship between S and N. The equation of this event type is therefore expressed as the product of f_S and f_P (the FV of the Noun).

$$EV = f_S \times f_P$$

2.5.4 Favorite Value of Predicate with Negative Aspect

The words in an event have FVs. However, these words often appear with a modifier or an aspect. For example, nouns are modified into noun phrases or noun clauses, and verbs and adjectives are modified by adverbs, tenses, and aspects. These modifications influence their FVs. The judgement of like/dislike can change with a modifier, as with “apple” and “rotten apple.”

We propose calculation methods for the FV with modification. In this section, we explain the predicate; we explain the noun phrase and noun clause in the next section.

When a predicate has a negative aspect, we reverse the sign of the FV of the predicate, because the meaning of the predicate becomes the opposite. There are not only negative aspects but also various other aspects in dialogue; however, we do not deal with those aspects because they have no influence on distinguishing likes/dislikes.

2.5.5 Favorite Value of Modified Noun

The structures of noun modification are classified as shown in Table 2.6. We explain how to calculate the FV of the noun phrase and the noun clause. A noun phrase modifies a noun by a word such as a noun, adjective, or pronoun. On the other hand, a noun clause modifies a noun by a clause (i.e., a sentence).

Table 2.6 Structure of Noun Modification

Structure                 Example
Pronoun + Noun            His story
Adjective + Noun          Sad story
Noun + particle + Noun    The story of Denmark
Noun + Clause             The story that Shakespeare wrote


2.5.5.1 Favorite Value of Noun Phrase

The FV of a noun phrase is defined as the product of the FV of the modifier and that of the modified word, because the modifier gives the modified word some information about the owner, attribute, and so on. When the modifier is a pronoun and the content of the pronoun can be guessed, we calculate the FV by supplying the omitted word.

FV of Noun Phrase = FV of modifier * FV of the modified word (12)

Furthermore, when the modified word is a proper noun, we use the FV of the modified word as the value of the whole noun phrase, because the concept of a proper noun is not limited by the modifier.

FV of Noun Phrase = FV of the proper noun (13)
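As a quick illustration, here is a minimal sketch of equations (12) and (13) together with the negative-aspect rule of Section 2.5.4. The flag name and the word/FV pairs are our illustrative assumptions.

```python
# A sketch of the noun-phrase FV rules (eqs. (12), (13)) and the
# negative-aspect sign reversal of Section 2.5.4.

def fv_noun_phrase(fv_modifier, fv_head, head_is_proper_noun=False):
    """Favorite Value of a noun phrase such as "my car" or "rotten apple"."""
    if head_is_proper_noun:
        # eq. (13): a proper noun's concept is not limited by its modifier
        return fv_head
    # eq. (12): product of the modifier's FV and the head word's FV
    return fv_modifier * fv_head

def fv_predicate(fv, negative_aspect=False):
    """Reverse a predicate's FV under a negative aspect (Section 2.5.4)."""
    return -fv if negative_aspect else fv

print(fv_noun_phrase(0.9, 0.6))                   # "my" (+0.9) * "car" (+0.6) = +0.54
print(fv_predicate(0.6, negative_aspect=True))    # negated predicate -> -0.6
```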

2.5.5.2 Favorite Value of Noun Clause

There are three types of noun clause structures as follows [51].

1. Content clause (e.g. The story that Romeo loves Juliet)

2. Modified clause by supplementary word

2.1. Limited modification (e.g. The story that Shakespeare wrote)

2.2. Unlimited modification (e.g. Shakespeare who wrote “Romeo and Juliet”)

3. Modified clause for relative noun (e.g. The day that Shakespeare wrote “Hamlet”)

Figure 2.9 Truth value of the proposition “((e) is A) is τ”: the membership function μ_A on the universal set U gives a = μ_A(e), and the linguistic truth value τ (“a little true”) gives μ_τ(a).


We propose a common method based on the idea of the “limitation of fuzzy truth value” for evaluating the FV of a noun clause.

The “limitation of fuzzy truth value” is explained as follows [55]. First, we consider

$$((x)\ \text{is}\ A)\ \text{is}\ \tau$$

as a fuzzy predicate with a “linguistic truth value.” “A” is a fuzzy set on the universal set “U,” and “x” is a variable bound to an element of “U.” When we fix the variable “x” to an element “e” of “U,”

$$((e)\ \text{is}\ A)\ \text{is}\ \tau$$

becomes a kind of fuzzy proposition, and only one truth value is obtained, as shown in Figure 2.9. The truth value of the proposition “(e) is A” is described as $\mu_A(e)$. When we define $\mu_A(e)$ as “a,” the fuzzy proposition “((e) is A) is τ” changes into a new fuzzy proposition, “(a) is τ.” Therefore, the truth value of that proposition becomes $\mu_\tau(a)$, and the truth value of the proposition “((e) is A) is τ” is given as follows:

$$\mu_\tau(a) = \mu_\tau\!\left(\mu_A(e)\right).$$

We apply the “limitation of fuzzy truth value” method to evaluating the FV of a noun clause. In this method, the truth value of a fuzzy proposition is enhanced/deflated by a “linguistic truth value” attached to the proposition. We consider that the relationship between the “truth value of the fuzzy proposition” and the “linguistic truth value” is the same as that between a modified word and its modifier clause, i.e., the impression (FV) of the modified word is enhanced/deflated by the action or attribute described in the modifier clause. When the impressions of the modified word and the content of the modifier clause agree (both good or both bad), the “linguistic truth value” enhances the FV of the modified word. On the other hand, when the impressions of the modified word and the content of the modifier clause differ, the “linguistic truth value” deflates the FV of the modified word. We already have the FV of the modified word in the range [–1.0, 1.0]. We therefore realize the “limitation of fuzzy truth value” for modifier clauses by defining a membership function for the “linguistic truth value.”


Figure 2.10 Transition function for modifier clause

We propose a transition function for the modifier clause, as shown in Figure 2.10, based on the idea that “objects which do not have a concrete evaluation are affected more strongly by the modifier.” We have to adjust the maximum value of the effect so that it does not exceed 1.0. We define the maximum effect on the FV as EV_mdc/α, with α = 2.0, because the maximum of the EV is √3; here EV_mdc means the “EV of the modifier clause.”

We apply a kind of fuzzy method to the calculation of the FV and EV; however, the FV and EV are in the range [–1.0, 1.0]. Therefore, we give two truth values for an FV. For example, the FV +0.5 has a “truth value of like” of 0.5 and a “truth value of dislike” of 0.0. On the other hand, the FV –0.3 has a “truth value of like” of 0.0 and a “truth value of dislike” of 0.3. We write the “truth value of like” as TV_like and the “truth value of dislike” as TV_dislike.

a) When $FV_{mdw} > 0$: $TV_{like} = FV_{mdw}$, $TV_{dislike} = 0$

b) When $FV_{mdw} < 0$: $TV_{like} = 0$, $TV_{dislike} = -FV_{mdw}$

c) When $FV_{mdw} = 0$: $TV_{like} = 0$, $TV_{dislike} = 0$

(In Figure 2.10, the horizontal axis is FV_mdw in [–1.0, 1.0], the vertical axis is the resulting FV_mdr, and EV_mdc/α parameterizes the transition.)

We explain our method for each pattern, based on the sign of FV_mdw (the FV of the modified word), in order to apply the transition function shown in Figure 2.10 to the fuzzy sets.

a) Favorite value of the modified word is positive ($FV_{mdw} > 0$)

When the content of the modifier clause is favorable ($EV_{mdc} > 0$), $FV_{mdw}$ is enhanced by the favorable event or attribute. $TV_{dislike}$ is always 0.0 because the effect stays within the positive fuzzy set, as shown in Figure 2.11:

$$\mu_{like}(TV_{like}) = TV_{like} + \frac{EV_{mdc}}{\alpha}\,(1 - TV_{like})$$

On the other hand, when the content of the modifier is unfavorable ($EV_{mdc} < 0$), $FV_{mdw}$ is deflated by the unfavorable event or attribute, as shown in Figure 2.12. When the absolute value of $EV_{mdc}$ is large enough to remove the effect of $FV_{mdw}$, the liking degree of the object becomes 0.0, and the disliking degree of the object increases from 0.0, as shown in Figure 2.13. The calculation is as follows:

$$\mu_{dislike}(TV_{like}) = -\left(TV_{like} + \frac{EV_{mdc}}{\alpha}\,(1 - TV_{like})\right)$$

Figure 2.11 Membership function μ_like(TV_like) (EV_mdc > 0)

Figure 2.12 Membership function μ_like(TV_like) (EV_mdc < 0)

Figure 2.13 Membership function μ_dislike(TV_like) (EV_mdc < 0)

Then, we define the following membership functions for the condition $FV_{mdw} > 0$. Writing the adjusted liking degree as

$$v = TV_{like} + \frac{EV_{mdc}}{\alpha}\,(1 - TV_{like}),$$

i) when $v \ge 0$:

$$\mu_{like}(TV_{like}) = v, \qquad \mu_{dislike}(TV_{like}) = 0$$

ii) when $v < 0$:

$$\mu_{like}(TV_{like}) = 0, \qquad \mu_{dislike}(TV_{like}) = -v$$


b) Favorite value of the modified word is negative ($FV_{mdw} < 0$)

$TV_{dislike}$ is deflated by a favorable event or attribute and enhanced by an unfavorable one, in the direction opposite to $TV_{like}$. We therefore define the membership functions by reversing the sign of $EV_{mdc}$, as shown in Figures 2.14 and 2.15.

Writing the adjusted disliking degree as

$$w = TV_{dislike} - \frac{EV_{mdc}}{\alpha}\,(1 - TV_{dislike}),$$

i) when $w \ge 0$:

$$\mu_{dislike}(TV_{dislike}) = w, \qquad \mu_{like}(TV_{dislike}) = 0$$

ii) when $w < 0$:

$$\mu_{dislike}(TV_{dislike}) = 0, \qquad \mu_{like}(TV_{dislike}) = -w$$

c) Favorite value of the modified word is 0 )0( �mdwFV

In this case, we consider the effect of the modifier as likeTV or dislikeTV .

0) �mdcEVi

� �

� � 00

0

dislike

mdclike

EV

��

0) �mdcEVii

� �

� ��

mdcdislike

like

EV��

0

00
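The following Python sketch implements the noun-clause adjustment under the unified reconstruction used above: the modifier clause shifts the modified word's FV by EV_mdc/α scaled by (1 − |FV_mdw|), and the sign of the result selects TV_like or TV_dislike. The function name, clipping, and example values are our assumptions.

```python
# A sketch of the modifier-clause adjustment, assuming the reconstruction
# above. alpha = 2.0 keeps the maximum shift below 1.0 (max EV is sqrt(3)).

ALPHA = 2.0

def adjust_fv(fv_mdw, ev_mdc):
    """Return (tv_like, tv_dislike) of the modified word after the clause."""
    shifted = fv_mdw + (ev_mdc / ALPHA) * (1.0 - abs(fv_mdw))
    shifted = max(-1.0, min(1.0, shifted))   # keep inside [-1.0, 1.0]
    if shifted >= 0.0:
        return shifted, 0.0                  # mu_like, mu_dislike
    return 0.0, -shifted

# "story" (+0.5) modified by a pleasant clause (EV_mdc = +1.0):
print(adjust_fv(0.5, 1.0))      # like is enhanced: (0.75, 0.0)
# a weakly liked word (+0.2) modified by a very unpleasant clause:
print(adjust_fv(0.2, -1.47))    # like removed, dislike appears: (0.0, 0.388)
```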


Figure 2.14 Membership function μ_dislike(TV_dislike) (EV_mdc < 0)

Figure 2.15 Membership function μ_like(TV_dislike) (EV_mdc > 0)


We apply this method to calculating the FV of a modified noun using the EV of the modifier clause. However, a case element is often omitted in the “modifier clause by supplementary word,” because that element is used as the modified word. When we apply our method to such modifier clauses, we give the FV of the modified word the default value +0.5. The word's own FV is not used, in order to avoid multiplying the effect of the modified word.

2.6 Emotion Strength Calculation

The EGC method extracts pleasure/displeasure about an event and has mainly two processes: one identifies pleasure/displeasure by the equations shown in the previous sections, and the other calculates the degree of the emotion. We substitute the real-valued FVs in the range [–1.0, 1.0] into the equations; the calculation then yields not only a positive/negative sign but also a real number. The EV calculated by our method coincides with the human feeling that “the degree of the emotion increases with the degree of attention paid to the case elements,” because this value is proportional to the FVs of the case elements. We therefore consider the magnitude of the EV as the strength of the generated emotion. However, we cannot directly compare output values across the equations, because there are eight types of equations, including both quadratic and cubic forms, and the averages of the output values differ between the quadratic and the cubic equations.

In this paper, we model the emotional space as a three-dimensional space. We consider the length of the synthetic vector of the FVs as the pleasure/displeasure degree of the event. Table 2.7 shows the correspondence between the case elements in the EGC equations and the axes in the three-dimensional model. We designed this calculation method for the cubic equations; when we calculate the degree of the EV for a quadratic equation, as in types I, V, VI, and XI, we supply a dummy FV β as the third element. We tentatively defined its value as 0.5 so that it does not bias the method. Figure 2.16 is an example of the emotion strength of event type VI. There are three elements, Subject, Object, and Predicate, in event type VI, and the orthogonal vectors of these elements construct a rectangular solid. We then regard the length of its diagonal as the degree of the EV for a type VI event, as sketched below.
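The following is a minimal Python sketch of the strength calculation: the vector length of the (up to three) case-element FVs gives the degree, with β = 0.5 supplied as the dummy third element for the quadratic types. The function name and call forms are our illustrative assumptions.

```python
# A sketch of the emotion-strength step described above.

import math

BETA = 0.5   # dummy FV for quadratic equation types (I, V, VI, XI)

def degree_of_ev(f1, f2, f3):
    """Length of the synthetic vector (f1, f2, f3) in the emotional space."""
    if f2 is None:
        f2 = BETA                         # supply the dummy third element
    return math.sqrt(f1 * f1 + f2 * f2 + f3 * f3)

# "Romeo dates with Juliet": (f_S, f_OM, f_P) = (1.0, 0.9, 0.6)
print(round(degree_of_ev(1.0, 0.9, 0.6), 2))    # 1.47
# "My leg has become swollen": (f_S, beta, f_P) = (0.54, 0.5, -0.6)
print(round(degree_of_ev(0.54, None, -0.6), 2)) # 0.95
```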


Table 2.7 Correspondence between the case elements and the axes

Event type                                         Equation                      f1           f2            f3
V (S), A (S, C), A (S, OF, C), A (S, OT, C),
A (S, OM, C), A (S, OS, C), N (S)                  f_S × f_P                     f_S          —             f_P
V (S, OF), V (S, OT)                               f_S × (f_OT − f_OF) × f_P     f_S          f_OT − f_OF   f_P
V (S, OM)                                          f_S × f_OM × f_P              f_S          f_OM          f_P
V (S, OS)                                          (f_S − f_OS) × f_P            f_S − f_OS   —             f_P
V (S, O) (subject acts)                            f_S × f_O × f_P               f_S          f_O           f_P
V (S, O) (object is affected)                      f_O × f_P                     f_O          —             f_P
V (S, O, OF), V (S, O, OT)                         f_O × (f_OT − f_OF) × f_P     f_O          f_OT − f_OF   f_P
V (S, O, OM)                                       f_O × f_OM × f_P              f_O          f_OM          f_P
V (S, O, I)                                        f_O × f_P                     f_O          | f_I |       f_P
V (S, O, OC)                                       f_O × f_OC                    f_O          —             f_OC
A (S, O, C)                                        f_O × f_P                     f_O          —             f_P

Figure 2.16 Example of the emotion strength of event type VI: the FVs f_S, f_O, and f_P are placed on the axes f1, f2, and f3, and the diagonal of the resulting rectangular solid gives the degree of the EV.


2.7 Example of EGC Method

We show the event “Romeo dates with Juliet” as a calculation example. When the event is given to the EGC method, the calculation proceeds as follows. In this example, we assume the agent is Romeo, that he likes dating, and that he loves Juliet.

Event: “Romeo dates with Juliet.”

Predicate (P) = “date with” : +0.6

Subject (S) = “Romeo” : +1.0

Object-Mutual (OM)= “Juliet” : +0.9

The predicate “date with” takes the form V(S, OM), and we substitute the FVs of the case elements into equation (5). We then obtain the positive number 0.54. This result shows that the agent (Romeo) feels pleasure about the event “Romeo dates with Juliet.”

Event Type: “date with” V(S, OM)

Sign of EV = f_S(Romeo) × f_OM(Juliet) × f_P(date with)
           = (+1.0) × (+0.9) × (+0.6)
           = +0.54 → positive number (Pleasure)

Next, we calculate the degree of pleasure for this event. We calculate the length of the diagonal of the rectangular solid constructed from the orthogonal vectors of the elements Subject, Object-Mutual, and Predicate, and regard this length as the degree of the EV.

Degree of EV = |(f_S, f_OM, f_P)| × (Sign of EV)
             = |(1.0, 0.9, 0.6)| × (+1)
             = +1.47

When we compare the degrees of the generated emotions between an event that includes an impressive object (one whose FV is large, e.g., (1.0, 0.1, 0.1)) and an event consisting of common objects (e.g., (0.5, 0.5, 0.5)), the generated emotion for the former event is stronger than for the latter. This result corresponds with the human feeling that an event involving an interesting object is more impressive.

2.8 Experimental Result

We extracted some responses of the users from the conversation log and applied the EGC method to them. We assumed that the agent is the user (the speaker). The EGC method then extracted the same pleasure/displeasure as human feeling for 55 of the 80 utterances, as shown in Table 2.8.

Table 2.8 Number of the Example Generated Emotion (EGC/Human)

Event Type   I      II   III    IV   V    VI     VII  VIII  IX   X    XI   XII
EGC/Human    19/31  0/0  11/11  3/3  4/4  17/27  0/1  0/2   0/0  1/1  0/0  0/0

We show some examples as follows:


Event: “My leg has become swollen.”

Predicate (P) = “swell” : –0.6

Subject (S) = “my leg”

= (+0.9) * (+0.6)

= +0.54

Event Type: “swell” V(S)

Sign of EV = f_S(my leg) × f_P(swell)
           = (+0.54) × (–0.6)
           = –0.32 → negative number (Displeasure)

Degree of EV = |(f_S, β, f_P)| × (Sign of EV)
             = |(0.54, 0.5, –0.6)| × (–1)
             = –0.95

Event: “I’m useful to my family.”

Predicate (P) = “useful” : +0.6

Subject (S) = “I” : +1.0

Event Type: “useful” A(S, C)

Sign of EV = f_S(I) × f_P(useful)
           = (+1.0) × (+0.6)
           = +0.60 → positive number (Pleasure)

Degree of EV = |(f_S, β, f_P)| × (Sign of EV)
             = |(1.0, 0.5, 0.6)| × (+1)
             = +1.27


The following examples show the difference among the degrees of the generated emotions depending on the FVs of the elements.

Event: “He scratched my car.”

Predicate (P) = “scratch” : –0.2

Object (O) = “my car”

= (+0.9) * (+0.6)

= +0.54

Event Type: “scratch” V(S, O)

Sign of EV = f_O(my car) × f_P(scratch)
           = (+0.54) × (–0.2)
           = –0.11 → negative number (Displeasure)

Degree of EV = |(f_O, β, f_P)| × (Sign of EV)
             = |(0.54, 0.5, –0.2)| × (–1)
             = –0.76

Event: “He crashed my car.”

Predicate (P) = “crash” : –0.8

Object (O) = “my car”

= (+0.9) * (+0.6)

= +0.54

Event Type: “crash” V(S, O)

Sign of EV = f_O(my car) × f_P(crash)
           = (+0.54) × (–0.8)
           = –0.43 → negative number (Displeasure)

Degree of EV = |(f_O, β, f_P)| × (Sign of EV)
             = |(0.54, 0.5, –0.8)| × (–1)
             = –1.09


2.9 Emotion Distinguishing Method based on Emotional Space

We presented a method to extract pleasure/displeasure from an event by two processes: distinguishing pleasure/displeasure using the EGC method, and calculating the strength of the emotion by measuring the length of the synthetic vector of the three FVs in the emotional space. However, the EGC method is based on whether the FV of each element is positive or negative, while the synthetic vector lies in an area partitioned by the three axes. Therefore, we present a method to distinguish pleasure/displeasure from an event not by using the EGC equations but by judging which area the synthetic vector is in.

Table 2.9 is the correspondence chart between the sign of each axis and the generated pleasure/displeasure. When the vector lies on an axis plane and the value of an axis is zero, the event does not arouse any emotion.

Table 2.9 Distinguishing pleasure/displeasure using the sign of each axis

Area   F1   F2   F3   Emotion
I      +    +    +    Pleasure
II     –    +    +    Displeasure
III    –    –    +    Pleasure
IV     +    –    +    Displeasure
V      +    +    –    Displeasure
VI     –    +    –    Pleasure
VII    –    –    –    Displeasure
VIII   +    –    –    Pleasure
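A minimal Python sketch of this area-based judgement follows. Reading Table 2.9, an area yields pleasure exactly when the product of the three signs is positive, which is consistent with the EGC equations being products of FVs; this compact sign test is our observation, not the dissertation's wording.

```python
# A sketch of the area-based pleasure/displeasure judgement of Table 2.9.

def distinguish(f1, f2, f3):
    """Return 'pleasure', 'displeasure', or 'none' from the synthetic vector."""
    product = f1 * f2 * f3
    if product > 0:
        return "pleasure"      # areas I, III, VI, VIII
    if product < 0:
        return "displeasure"   # areas II, IV, V, VII
    return "none"              # the vector lies on an axis plane

print(distinguish(1.0, 0.9, 0.6))     # Area I  -> pleasure
print(distinguish(0.54, 0.5, -0.6))   # Area V  -> displeasure
```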


We applied this new method to the examples shown in Sections 2.7 and 2.8.

Event: “Romeo dates with Juliet.”

Predicate (P) = “date with” : +0.6

Subject (S) = “Romeo” : +1.0

Object-Mutual (OM) = “Juliet” : +0.9

Event Type: “date with” V(S, OM)

Degree of EV = |(f_S, f_OM, f_P)| = |(1.0, 0.9, 0.6)| = 1.47

Distinguish pleasure/displeasure: (f_S, f_OM, f_P) = (+1.0, +0.9, +0.6) → Area I (Pleasure)

Generated Emotion = (Degree of EV) × (pleasure/displeasure) = 1.47 × (+1) = +1.47

Event: “My leg has become swollen.”

Predicate (P) = “swell” : –0.6

Subject (S) = “my leg”

= (+0.9) × (+0.6)

= +0.54

Event Type: “swell” V(S)

Degree of EV = |(f_S, β, f_P)| = |(0.54, 0.5, –0.6)| = 0.95

Distinguish pleasure/displeasure: (f_S, β, f_P) = (+0.54, +0.5, –0.6) → Area V (Displeasure)

Generated Emotion = (Degree of EV) × (pleasure/displeasure) = 0.95 × (–1) = –0.95


Event: “I’m useful to my family.”

Predicate (P) = “useful” : +0.6

Subject (S) = “I” : +1.0

Event Type: “useful” A(S, C)

Degree of EV = |(f_S, β, f_P)| = |(1.0, 0.5, 0.6)| = 1.27

Distinguish pleasure/displeasure: (f_S, β, f_P) = (+1.0, +0.5, +0.6) → Area I (Pleasure)

Generated Emotion = (Degree of EV) × (pleasure/displeasure) = 1.27 × (+1) = +1.27

2.10 Future Work

We found two types of problems with the EGC method from the experimental result described in Section 2.8:

1. Inadequate pleasure against an unpopular person.

2. Pleasure/displeasure obtained by guessing situations from aspects in the utterance.

First, whether negative emotion is generated against an unpopular person depends on the individual. Although the EGC method always detects negative emotion toward an unpopular person, some people occasionally feel sorry for an unlucky person even if they do not like him/her. There were 11 counterexamples in the experiment. We found that this depends on the interest in the individual. In order to model it, we have to give objects not only an FV but also other attribute parameters, such as interest.

The next problem concerns aspects like “have to.” An expression with the aspect “have to” often implies a duty, as in “although the speaker does not want to do something, he/she has to do it.” We therefore consider that the speaker will generate displeasure about the forced event. The EGC method should be developed to consider the effects of aspects, not only “have to” but also “can't,” “take the trouble to,” and so on.

The study of the “Favorite Value learning method” described in Section 2.4.3 is also proceeding. In particular, we are studying the “Favorite Value changing situations” method and the “backward calculation from emotional expressions” method.

The “Favorite Value changing situations” method extracts the situations which influence an object's FV from the agent's knowledge structure. We are now studying how to construct this knowledge from each utterance using a Truth Maintenance System while avoiding contradictions [53].

The “backward calculation from emotional expressions” method attempts to extract FVs from utterances containing objects whose values are undefined. The objects' FVs are extracted by calculating the EGC backward. We show an example for “Romeo felt sad because Mercutio was killed.” In this sentence, two clauses are connected by the subordinating conjunction “because.” The independent clause means “Romeo feels displeasure,” and the dependent clause shows the reason. The EGC result for the dependent clause is as follows:

Event: “Mercutio was killed.”

Predicate (P) = “be killed” : –0.6

Subject (S) = “Mercutio” : ?

Event Type: “be killed” V(S)

Distinguish pleasure/displeasure: (f_S, β, f_P) = (?, +0.5, –0.6) → (Displeasure)

The agent does not know the FV of “Mercutio,” but the agent knows that the EGC result is “displeasure.” We can then infer the FV of “Mercutio” using the relationship between the FVs and the emotion shown in Table 2.9. We guess that the sign of F1 (Mercutio's FV) is positive based on the table, because the sign of F2 is positive, the sign of F3 is negative, and the output emotion is “displeasure” (Area V). We consider the result of this calculation correct based on the story of Romeo and Juliet.
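As a sketch of this backward calculation, and under the sign rule of Table 2.9 (the distinguished emotion follows the sign of f1 × f2 × f3), the unknown element's sign is whatever makes the product match the observed emotion. The helper name and inputs below are our illustrative assumptions.

```python
# A sketch of backward FV-sign inference from an observed emotion.

def infer_fv_sign(emotion, known_f2, known_f3):
    """Guess the sign of the unknown FV f1 from the observed emotion."""
    target = 1 if emotion == "pleasure" else -1
    known = (1 if known_f2 > 0 else -1) * (1 if known_f3 > 0 else -1)
    return "+" if target * known > 0 else "-"

# "Mercutio was killed" displeases Romeo: f2 = beta (+0.5), f3 = -0.6,
# so Mercutio's FV must be positive.
print(infer_fv_sign("displeasure", 0.5, -0.6))   # "+"
```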

2.11 Conclusion

In this chapter, we presented an emotion-handling dialogue model intended to facilitate comfortable interaction with users. We proposed the Emotion Generating Calculations (EGC) method to generate a pleasure/displeasure emotion from an event in an utterance. We also proposed how to calculate the degree of pleasure/displeasure from the length of the diagonal of the rectangular solid formed by all the terms in the EGC. EGC uses eight types of equations for 12 event types, two types of equations for seven attribute types, and an equation for the noun phrase; the FVs of objects are used in these calculations. Furthermore, we extended these calculations to the negative aspect and the modified noun.

To verify the effectiveness of the proposed method, we applied it to 80 events in conversation and calculated emotions which corresponded closely to the emotions generated by humans.


CHAPTER 3 COMPLICATED EMOTION ALLOCATING METHOD BASED ON EMOTION ELICITING CONDITION THEORY

We proposed the EGC method, which calculates pleasure/displeasure from events. However, expressing emotion only by pleasure/displeasure is too vague. Humans usually recognize many emotions, such as hope, shame, love, anxiety, gratitude, anger, and so on.

In this chapter, we propose a method to refine the simple emotion (pleasure/displeasure) generated by the EGC method into 20 distinct emotions based on Elliott's “Emotion Eliciting Condition Theory.” Elliott's theory requires judgements on the following conditions: “feeling for another,” “prospect and confirmation,” and “approval/disapproval.” “Feeling for another” means someone else's emotion (not mine) about the event, and it is judged from the EGC result computed with the other's FV information. We extract aspects and adverbs relating to tense to judge “prospect and confirmation.” “Approval/disapproval” is judged from the event's case frame structure with a transitive verb.

To verify the effectiveness of the proposed method, we report the result of some questionnaires.

3.1 Emotion Discrimination

The word “emotion” as used here includes various emotion types: basic and common emotions that other animals also have, like “pleasure,” “sadness,” and “anger,” and emotions based on the social and cultural background, like “contempt,” “pride,” “jealousy,” and “shame.” Furthermore, some emotions are close to each other while others are independent. Showing how to grasp the whole relationship among the various emotions is the starting point of any study of emotion.

In psychology, several models of the relationships among emotions have been presented [56]. They plot the emotions in an N-dimensional space constructed from a finite number of emotional dimensions. For example, Wundt considered emotions as states varying along a few dimensions. He proposed three dimensions, namely pleasure vs. displeasure (Lust vs. Unlust), calmness vs. tension (Beruhigung vs. Erregung), and relaxation vs. excitement (Lösung vs. Spannung), as shown in Figure 3.1. Schlosberg presented a “three-dimensional model of emotion,” pleasure vs. displeasure, attention vs. rejection, and activation, based on facial expressions. The dimension of activation is basic to the behavior of living organisms, as shown in Figure 3.2 [22].


Figure 3.1 Three-Dimensional Model of Wundt (axes: pleasure–displeasure, calmness–tension, relaxation–excitement)

Figure 3.2 Three-Dimensional Model of Schlosberg (axes: pleasure–displeasure, attention–rejection, activation)


These ideas are effective for classifying various emotions; however, they do not address the active aspect and function of the emotions. For example, they rarely mention how each emotion is generated or its original function (e.g., “horror” makes the agent avoid danger and “anger” makes him/her fight).

Recent emotion studies in psychology therefore follow four major theoretical traditions in terms of the definition, study, and explanation of emotion: the Darwinian perspective, the Jamesian perspective, the cognitive perspective, and the social constructivist perspective, as shown in Table 3.1 [57].

Table 3.1 Four Major Theoretical Traditions in Psychology

Perspective             Principal Thought                                              Classic Study        Present Study
Darwinian               Emotions have adaptive functions and are universal.            Darwin (1872/1965)   Ekman (1987)
Jamesian                Emotion = body action                                          James (1884)         Levinson (1990)
Cognitive               Emotions are based on evaluations.                             Arnold (1960a)       Smith and Lazarus (1993)
Social constructivist   Emotions are social constructs that serve social purposes.     Averill (1980a)      Smith and Kleinman (1989)

Plutchik constructed a new three-dimensional model that incorporates the Darwinian and Jamesian perspectives. His model describes the relations among emotion concepts as shown in Figure 3.3. The cone's vertical dimension represents the intensity of the emotions, and the circle represents degrees of similarity among the emotions by their positions on it. The eight sectors indicate that there are eight primary emotion dimensions defined by the theory, arranged as four pairs of opposites (joy–sadness, trust–disgust, fear–rage, surprise–anticipation). This model is similar to the former multi-dimensional models. However, the emotions this model presents are based on “the original action patterns of living things (take in, reject, protect, destroy, breed, adapt, and investigate).” Furthermore, the model deals with complex emotions like contempt and shame as combinations of a few simple emotions.

However, in our agent model, emotions are created not by perception and action but by the content of the utterance. Therefore, we consider the cognitive perspective the most effective for our purposes. Lazarus stressed the importance of the cognitive process for emotion generation, because cognitive appraisal of an environmental stimulus is essential to generating emotion, and the cognitive process precedes the emotion generation process. He proposed that a human applies a two-step cognitive appraisal in dealing with a situation: primary appraisal and secondary appraisal [57].


1. Primary appraisal refers to the issue of whether the situation has relevance for personal well-being. During primary appraisal, individuals implicitly ask themselves: “Am I in trouble or am I benefiting, now or in the future, and in what way?”

2. Secondary appraisal focuses on the possible ways of coping with the situation and judges the extent of the available personal and environmental resources for dealing with it. The secondary appraisal process can be translated into the implicit question: “What, if anything, can be done about the situation (or about the way it will make me feel)?”

If the primary appraisal process determines that an environmental element poses a threat, but the secondary appraisal process determines that the person has immediate and direct control over that element, then the person's suffering becomes smaller. For example, if a person hears loud, raucous music from the next apartment while trying to study, but knows that the neighbor will graciously turn off the stereo if asked, the person will experience little distraction or stress, or any of the negative psychological effects of non-control [58].

Lazarus also considered the relationship between differences among emotions and the cognitive processes. The primary appraisal process assesses only whether the situation is beneficial/harmful; for example, it does not distinguish between negative emotions like anger, fear, disappointment, and sadness. The secondary appraisal process then distinguishes between such negative emotions based on how the subject can act on the situation. Lazarus presented a “core relational theme” for each specific emotion based on the appraisal that a situation is beneficial/harmful and on what the agent can do when faced with the situation, as shown in Table 3.2. For example, the core relational theme of anger is “a demeaning offence against me and mine” [59].

Figure 3.3 Three-Dimensional Cone Model of Plutchik (sector labels include ecstasy, vigilance, rage, loathing, grief, amazement, terror, and admiration, with their milder forms toward the base)


Table 3.2 Emotions and Their Core Relational Themes [59]

Emotion      Core relational theme
Anger        A demeaning offence against me and mine
Anxiety      Facing uncertain, existential threat
Fright       Facing an immediate, concrete, and overwhelming physical danger
Guilt        Having transgressed a moral imperative
Shame        Having failed to live up to an ego ideal
Sadness      Having experienced an irrevocable loss
Envy         Wanting what someone else has
Jealousy     Resenting a third party for loss of, or threat to, another's affection
Disgust      Taking in or being too close to an indigestible object or idea (metaphorically speaking)
Happiness    Making reasonable progress toward the realization of a goal
Pride        Enhancement of one's ego-identity by taking credit for a valued object or achievement, either one's own or that of a group with whom one identifies
Relief       A distressing goal-incongruent condition that has changed for the better or gone away
Hope         Fearing the worst but yearning for better
Love         Desiring or participating in affection, usually but not necessarily reciprocated
Compassion   Being moved by another's suffering and wanting to help


3.2 Emotion Eliciting Condition Theory

As shown in Section 3.1, Lazarus proposed two processes for analyzing generated emotions from the cognitive perspective: referring to whether the situation has relevance for personal well-being, and evaluating whether the situation is avoidable. Ortony proposed the theory of the cognitive structure of emotions, which views emotions as valenced reactions to events, agents and their actions, and objects. The theory specifies a total of 22 emotion types, as shown in Table 3.3 [60, 61].

The emotion types are essentially classes of eliciting conditions, but each emotion type is labeled with a word or phrase, generally an English emotion word corresponding to a relatively neutral example of an emotion fitting the type. The simplest emotions are the “well-being” emotions such as joy and distress. These are an individual's positive and negative reactions to desirable or undesirable events. Eliciting these “well-being” emotions corresponds to Lazarus's primary appraisal, and the other emotion types correspond to the secondary appraisal.

The “fortunes-of-others” group covers four emotion types: happy-for, gloating, resentment, and sorry-for. Each type in this group is a combination of pleasure or displeasure over an event further categorized as presumed to be desirable or undesirable for another person.

The “prospect-based” group includes six emotion types: hope, satisfaction, relief, fear, fears-confirmed, and disappointment. Each type is a reaction to a desirable or undesirable event that is still pending, or that has been confirmed or disconfirmed.

The “attribution” group covers four types: pride, admiration, shame, and reproach. Each attribution emotion type is a positive or negative reaction to either one's own or another's action.

The “attraction” group is a structureless group of reactions to objects. The two emotions in this group are the momentary feelings (as opposed to stable dispositions) of liking or disliking.

The final group comprises four compounds of “well-being/attribution” emotion types. These compound emotions do not correspond to the co-occurrence of their component emotions. Rather, each compound's eliciting conditions are the union of the components' eliciting conditions. For example, the eliciting conditions for anger combine the eliciting conditions for reproach with those for distress [61].

Elliott used Ortony's emotion eliciting condition rules for the strong-theory reasoning component of the Affective Reasoner, which supports four requirements:

(1) a simulated world rich enough to test the many subtle variations that a treatment of emotion reasoning requires,

(2) agents capable of (a) a wide range of affective states, (b) an interesting array of interpretations of situations leading to those states, and (c) a reasonable set of reactions to those states,

(3) a way to capture a theory of emotions, and

(4) a way for agents to interact and to reason about the affective states of one another.

Elliott used an extended and adapted twenty-four-emotion-type version of the “Emotion Eliciting Condition Theory” for the agents in the Affective Reasoner, as shown in Table 3.4. The descriptions of the twenty-four emotion types are extended so as to refer to the situation as an event [24, 25].

This “Emotion Eliciting Condition Theory” requires pleasure/displeasure about an event and some information about the situation of the event (i.e., the affection of another, prospect, confirmation, approval, and attraction). We appraise pleasure/displeasure by the EGC result and extract situation information from the tense and aspect in the utterance.

However, detecting the concepts for likes/dislikes needs personal taste information, one's own experiences, perceptions, and so on. Therefore, in this paper, we deal with 20 types of emotion, excluding “liking,” “disliking,” “love,” and “hate.”


Table 3.3 “Emotion Eliciting Condition Theory” by Ortony

Group                    Specification                      Types (name)
Well-being               Appraisal of an event              pleased about an event (joy)
                                                            displeased about an event (distress)
Fortunes-of-others       Presumed value of an event         pleased about an event desirable for another (happy-for)
                         affecting another                  pleased about an event undesirable for another (gloating)
                                                            displeased about an event desirable for another (resentment)
                                                            displeased about an event undesirable for another (sorry-for)
Prospect-based           Appraisal of a prospective event   pleased about a prospective desirable event (hope)
                                                            pleased about a confirmed desirable event (satisfaction)
                                                            pleased about an unconfirmed undesirable event (relief)
                                                            displeased about a prospective undesirable event (fear)
                                                            displeased about a confirmed undesirable event (fears-confirmed)
                                                            displeased about an unconfirmed desirable event (disappointment)
Attribution              Appraisal of an agent's action     approving of one's own action (pride)
                                                            approving of another's action (admiration)
                                                            disapproving of one's own action (shame)
                                                            disapproving of another's action (reproach)
Attraction               Appraisal of an object             liking an appealing object (love)
                                                            disliking an unappealing object (hate)
Well-being/Attribution   Compound emotions                  admiration + joy → gratitude
                                                            reproach + distress → anger
                                                            pride + joy → gratification
                                                            shame + distress → remorse


Table 3.4 “Emotion Eliciting Condition Theory” by Elliott

Group                    Specification                             Types (name)
Well-being               Appraisal of a situation as an event      pleased about an event (joy)
                                                                   displeased about an event (distress)
Fortunes-of-others       Presumed value of a situation as an       pleased about an event desirable for another (happy-for)
                         event affecting another                   pleased about an event undesirable for another (gloating)
                                                                   displeased about an event desirable for another (resentment)
                                                                   displeased about an event undesirable for another (sorry-for)
Prospect-based           Appraisal of a situation as a             pleased about a prospective desirable event (hope)
                         prospective event                         displeased about a prospective undesirable event (fear)
Confirmation             Appraisal of a situation as confirming    pleased about an unconfirmed undesirable event (relief)
                         or disconfirming an expectation           pleased about a confirmed desirable event (satisfaction)
                                                                   displeased about a confirmed undesirable event (fears-confirmed)
                                                                   displeased about an unconfirmed desirable event (disappointment)
Attribution              Appraisal of a situation as an            approving of one's own action (pride)
                         accountable act of some agent             approving of another's action (admiration)
                                                                   disapproving of one's own action (shame)
                                                                   disapproving of another's action (reproach)
Attraction               Appraisal of a situation as containing    finding an appealing object (liking)
                         an attractive or unattractive object      finding an unappealing object (disliking)
Well-being/Attribution   Compound emotions                         admiration + joy → gratitude
                                                                   reproach + distress → anger
                                                                   pride + joy → gratification
                                                                   shame + distress → remorse
Attraction/Attribution   Compound emotion extensions               admiration + liking → love
                                                                   reproach + disliking → hate


3.3 Complicated Emotion Allocating Method based on Emotion Eliciting Condition Theory using Emotion Generating Calculation

This “Emotion Eliciting Condition Theory” requires pleasure/displeasure about an event and some information about the situation of the event (i.e., the affection of another, prospect, confirmation, approval, and attraction). We propose methods to appraise pleasure/displeasure from the EGC results and to extract situation information from the grammatical features of the utterance.

3.3.1 Fortunes of the Others

The emotions that belong to the “Fortunes-of-others” group are elicited from the emotion that affects another person. This group contains “happy-for,” “gloating,” “resentment,” and “sorry-for.”

The EGC method calculates pleasure/displeasure concerning the event from the user's viewpoint using FVs. The FVs that have been defined are based on the user's preferences. However, some emotions, like “sorry-for” and “happy-for,” are aroused based on the other's emotions. We give the conditions for the emotions about the fortunes of others as follows:

Happy-for : pleased about an event desirable for another
Gloating : pleased about an event undesirable for another
Resentment : displeased about an event desirable for another
Sorry-for : displeased about an event undesirable for another

In order to appraise the factor that an event is desirable/undesirable for a person in these conditions, we first assumed the condition to be that the event pleases/displeases the person. However, different reactions (pleased/displeased) occur for the same type of event. Therefore, we translated the conditions as follows: 1) when the user likes the individual who is pleased about an event, the user feels happy for him/her, and 2) when the user dislikes the individual who is displeased about an event, the user gloats over his/her misfortune. In order to confirm this translation, the adequacy of the translated conditions for generating these emotions was investigated by questionnaire. As a result, it was found that the preference for the person and the event's impression on that person are the important factors for these emotions.

Therefore, the method has to judge whether the person is favorable/hateful from the user's viewpoint and whether the event is desirable/undesirable from the person's viewpoint. The EGC is used for appraising these conditions.

First, it is detected whether an individual is liked or disliked from the user's viewpoint. The FV of the target person is used to check this, because the user's preference is already expressed by the FVs. When the FV is positive, the user likes the person; when the FV is negative, the user dislikes the person.


Next, it is detected whether the event is pleasant/unpleasant for the other person. The EGC method can calculate pleasure/displeasure from the user's viewpoint based on the user's FVs. If we use not the user's FVs but the other person's FVs, the obtained EV indicates the emotion from that person's viewpoint. Therefore, we use “FVs from the other's viewpoint” for the EGC, and we regard the output EV as the emotion of the other person about the event. When the value is positive, the event pleases the individual; when the value is negative, the event displeases the individual. The person's FV database is managed in the same way as the user's. The retrieval process for FVs is the same as that shown in Section 2.4.1. When the individual's Favorite Value database does not exist, the default database is adopted.

Table 3.5 shows the relationship between the preference for an individual, the emotion about the event from the individual's viewpoint, and the generated emotion. In this table, ‘A’ means an individual other than the user. We describe the FV of ‘A’ from the user's viewpoint as “A (user),” and the FV of ‘B’ from C's viewpoint as “B (C).” The EV of the event is described in the same way; for example, “EV (A)” means the EV of the event from A's viewpoint.

Table 3.5 Generated emotions for the preference for an individual and his/her emotion about the event

                        EV (A)
A (user)       Pleasure         0    Displeasure
Like (+)       Happy-for ‘A’    0    Sorry-for ‘A’
0              0                0    0
Dislike (–)    Resentment       0    Gloating

Figure 3.4 shows the procedure to extract the emotions of “Fortunes-of-others.” First, the EV of the event is calculated using the Favorite Value database of the individual ‘A’. This value means the emotion about the event from the viewpoint of ‘A’. When the EV is not 0, i.e., ‘A’ feels pleasure/displeasure about the event, the FV of ‘A’ is checked in the Favorite Value database of the user. It shows how the user feels about ‘A’. An emotion in this group is then extracted based on EV (A) and A (user). When ‘A’ does not feel any pleasure/displeasure about the event, or when the user does not care about ‘A’, no emotion is extracted from the event. A minimal sketch of this decision follows.
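The sketch below implements the decision of Table 3.5 and Figure 3.4. Here ev_for_other is the EGC output computed with the other person's FV database, and fv_of_other is that person's FV in the user's database; the function name and thresholds are our illustrative assumptions.

```python
# A sketch of the "Fortunes-of-others" decision (Table 3.5, Figure 3.4).

def fortunes_of_others(ev_for_other, fv_of_other):
    """Return the elicited emotion toward another person 'A', or None."""
    if ev_for_other == 0 or fv_of_other == 0:
        return None                       # 'A' feels nothing / user ignores 'A'
    if ev_for_other > 0:
        return "happy-for" if fv_of_other > 0 else "resentment"
    return "sorry-for" if fv_of_other > 0 else "gloating"

# Juliet is pleased by the date (EV > 0) and Romeo likes her (+0.9):
print(fortunes_of_others(1.0, 0.9))    # happy-for
```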


Figure 3.4 Procedure to extract the emotions of “Fortunes-of-others”


3.3.2 Prospect-Based Emotions

The “Prospect-based” group contains “hope” and “fear.” The condition for these emotions is “pleased/displeased about a prospective desirable/undesirable event.” We can already check whether the event is desirable/undesirable using the EGC method, as described in Section 3.3.1, but we have to give a method to check whether the event is prospective or not.

Although people generally use reasoning to predict future events, our study does not employ such a reasoning process. People do not always carry out complete reasoning either; when they cannot reason about the event, they occasionally refer to its grammatical features.

Therefore, we extract the information about “prospects” from the aspect in the case frame representation. When there is an aspect of “inference (will)” or “intention (be going to),” the event is a future event.

Figure 3.5 is the procedure to extract the “Prospect-based” emotions. At first, whether the aspect

that means “inference” or “intention” in the event case frame exists or not, is checked. When the

event has the appropriate aspect, the content of the event is prospective and the event accumulates in

the prospective event list. Next, the EV of the prospect event is calculated by EGC from the user’s

viewpoint. Therefore, when the value is positive (i.e. the user feels pleasure for the event), the user

arouses “hope.” On the other hand, when the value is negative (i.e. the user feels displeasure for the

event), the user arouses “fear.”
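A minimal sketch of this procedure follows; the aspect markers and list handling are simplified assumptions on top of Figure 3.5.

```python
# A sketch of the "Prospect-based" procedure (Figure 3.5).

prospective_event_list = []   # confirmed later by the method of Section 3.3.3

def prospect_based(event, aspects, ev_for_user):
    """Return 'hope'/'fear' for a prospective event, or None."""
    if not ({"inference", "intention"} & set(aspects)):
        return None                        # not a future event
    prospective_event_list.append(event)   # remember it for confirmation
    if ev_for_user > 0:
        return "hope"
    if ev_for_user < 0:
        return "fear"
    return None

print(prospect_based("pass the exam", ["intention"], 1.2))   # hope
```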


Figure 3.5 Procedure to extract the “Prospect-based” emotions


3.3.3 Confirmation

The “Confirmation” group contains “satisfaction,” “relief,” “fears-confirmed,” and “disappointment.” The conditions for these emotions are as follows:

Relief : pleased about an unconfirmed undesirable event
Satisfaction : pleased about a confirmed desirable event
Fears-confirmed : displeased about a confirmed undesirable event
Disappointment : displeased about an unconfirmed desirable event

We can already check whether the event is desirable/undesirable using the EGC method, but we have to give a method to check whether the event is confirmed or not.

To recognize that an event is confirmed, the event has to have been prospected in advance and it has to actually happen. To recognize that an event is disconfirmed, the event has to have been prospected in advance and it has to be established that the event will not happen any more. In order to check these conditions, we consider “whether the event was prospected or not” and “whether the event is confirmed/disconfirmed/unknown.” Prospected events have already been recorded in the prospective event list, as described in Section 3.3.2. We now propose a confirmation method for the prospected event as follows.

First, we inspect events with a past aspect in order to confirm the realization of the prospective events. When there is an event with the same content as one shown before, we consider that “we had predicted the event and it happened.” The effect of the negative aspect is shown in Table 3.6. Next, we extract the four emotions “satisfaction,” “relief,” “fears-confirmed,” and “disappointment” using the result of the confirmation and the EGC output, based on the conditions for the emotions. Table 3.7 shows the relationship between the result of the confirmation, the EGC output, and the generated emotion.

Table 3.6 Effect of the negative aspect on confirming a prospective event

                           Confirmed Event
Prospective Event   Affirmative      Negative
Affirmative         Happened         Not Happened
Negative            Not Happened     Happened


Table 3.7 Generated emotions for the result of confirmation and EGC output

                          Confirmation
EGC Result     Happened           Not Happened
Pleasure       Satisfaction       Disappointment
0              0                  0
Displeasure    Fears-confirmed    Relief

Figure 3.6 shows the procedure to extract the emotions of “Confirmation.” First, whether the event has finished or not is checked according to the existence of the past aspect. Next, the event is retrieved from the Prospective Event List, which accumulates the expected events. When the event exists in the Prospective Event List, we compare the affirmative/negative expression of the input event with that of the expected event based on Table 3.6. Then the user's emotion about the prospective event is calculated by the EGC. When the event has happened and it pleases the user, the user feels satisfaction. The other emotions are extracted in the same way, as shown in Table 3.7.

There is another process for extracting emotions about confirmation. If an adverb which suggests a “predicted result” exists in the event, we consider that the event was expected, because we can guess that the event had already been expected even though the user did not inform us about the prospect. On the other hand, we consider that the event was not expected when there is an adverb which suggests an “unpredicted result” in the event.

Some adverbs, like YAPPARI (as expected) and ANGAI (unexpectedly), suggest confirmation of whether the prospective event was due to happen or not. The following 17 adverbs suggest confirmation in the “Present-day adjective usage dictionary” [52]. A sketch of the whole confirmation step follows the lists.

Predicted result: SASUGANI (as may be expected), TSUINI, YATTO, YOUYAKU (at last, finally), ANNOJOU, YAPPARI (as one expected), NANNAKU (without difficulty), NANTOKA (somehow or other)

Unpredicted result: IKKOUNI (no progress at all), IGAINI, ANGAI (unexpectedly), KAETTE (on the contrary), KEKKOU (quite, very well), TSUI, TSUITSUI (unintentionally, unconsciously), DOUSITEMO (at any cost), NAKANAKA (not easily)
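The sketch below condenses the confirmation judgement of Tables 3.6 and 3.7. The affirmative/negative matching is a simplified assumption on top of the procedure in Figure 3.6.

```python
# A sketch of the "Confirmation" judgement (Tables 3.6 and 3.7).

def confirmation(prospected_negative, confirmed_negative, ev_for_user):
    """Return the emotion for a past event that was prospected earlier."""
    # Table 3.6: the event "happened" when both aspects agree.
    happened = (prospected_negative == confirmed_negative)
    if ev_for_user > 0:
        return "satisfaction" if happened else "disappointment"
    if ev_for_user < 0:
        return "fears-confirmed" if happened else "relief"
    return None

# A dreaded event ("I will fail") that did not happen in the end:
print(confirmation(False, True, -1.0))   # relief
```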


Figure 3.6 Procedure to extract the emotions of “Confirmation”


3.3.4 Well-Being

The emotions in the “Well-being” group are aroused when the user feels pleasure or displeasure about the event. When the user feels pleasure about an event, the user feels “joy”; when the user feels displeasure about an event, the displeasure means “distress.” The “Emotion Eliciting Condition Theory” (Table 3.4) suggests that joy is elicited when one is pleased about an event. However, this condition is also entailed by other emotions such as happy-for, gloating, hope, satisfaction, and relief, as shown in Table 3.8.

For these events, eliciting joy is judged by adopting the EGC output about the event. Furthermore, when the user elicits happy-for, gloating, hope, satisfaction, or relief, the user elicits joy too, because the eliciting conditions of these emotions also meet the condition for joy. The condition for distress is dealt with in the same way, as shown in Table 3.9.

If an event elicits opposite emotions at the same time, the situation is called a conflict. For example, for an event such as “my son was jilted by a bimbo,” the speaker is sorry for his son but feels relief at the same time. No special process is supplied for a conflict; we just extract the two opposite emotions.

Table 3.8 Comparison among the emotion eliciting conditions relating to the “pleasure” emotion

Emotion        Emotion Eliciting Condition
Joy            Pleased about an event
Happy-for      Pleased about an event desirable for another
Gloating       Pleased about an event undesirable for another
Hope           Pleased about a prospective desirable event
Satisfaction   Pleased about a confirmed desirable event
Relief         Pleased about an unconfirmed undesirable event

Table 3.9 Comparison among the emotion eliciting conditions relating to the “displeasure” emotion

Emotion           Emotion Eliciting Condition
Distress          Displeased about an event
Sorry-for         Displeased about an event undesirable for another
Resentment        Displeased about an event desirable for another
Fear              Displeased about a prospective undesirable event
Fears-confirmed   Displeased about a confirmed undesirable event
Disappointment    Displeased about an unconfirmed desirable event


3.3.5 Attribution

The “Attribution” group contains “pride,” “admiration,” “shame,” and “reproach.” The condition for these emotions is “approving/disapproving of one's own/another's action.” We propose methods to judge whether the event is approved and who brought about the event.

First, we propose a method to judge whether the event is approved or not. The event is approved/disapproved of by one's own judgement. There are various criteria for this judgement, based on many factors like the user's sense of values, experience, living environment, social environment, and so on. However, it is impossible to compare every event with all moral values whenever the agent recognizes an event, because there are countless different values in the world, and we would need a complex reasoning system to confirm that an event is in keeping with the values in each case. Furthermore, there is no knowledge database which can support such complex reasoning.

Therefore, we deal with only one moral value, “an event that gives me pleasure is a good thing,” because it is the simplest and most instinctive moral. “An event that gives me pleasure” is defined as “an event whose EGC result is pleasure.”

Next, we propose a method to check who brought about the event. The actor of the event is also needed for detecting the attribution emotions in the “Emotion Eliciting Condition Theory.” We adopt this method only for transitive verbs, because the concept of the actor is expressed as the subject of such an event. The event types with transitive verbs are types VI to XI.

We then classify the emotions based on “whether the actor of the event is oneself or not” and “whether the event is pleasant for me,” as shown in Table 3.10.

Figure 3.7 shows the procedure to extract the emotions of “Attribution,” sketched after Table 3.10. First, whether the predicate of the event is a transitive verb or not is checked. An event with a transitive verb means the situation “the object has been affected by someone.” Next, the emotion about the event is calculated by the EGC. When the user feels pleasure or displeasure about the event, we pay attention to the actor, i.e., who brings about the event. When the EV of the event is pleasure and the event is created by the user, the user feels pride; when it is created by another, the user feels admiration. On the other hand, when the EV of the event is displeasure and the event is created by the user, the user feels shame; when it is created by another, the user feels reproach.

Table 3.10 Relationship between the actor, the EGC result, and the generated emotion

                          Actor
EGC Result     One's own    Another
Pleasure       Pride        Admiration
0              0            0
Displeasure    Shame        Reproach
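A minimal sketch of this decision follows; the transitivity test and actor check are taken from the procedure in Figure 3.7, and the argument names are our illustrative assumptions.

```python
# A sketch of the "Attribution" decision (Table 3.10, Figure 3.7).

def attribution(is_transitive, actor_is_user, ev_for_user):
    """Return pride/admiration/shame/reproach, or None."""
    if not is_transitive or ev_for_user == 0:
        return None            # no accountable actor, or no moral judgement
    if ev_for_user > 0:
        return "pride" if actor_is_user else "admiration"
    return "shame" if actor_is_user else "reproach"

print(attribution(True, False, -0.5))   # another harmed the user -> reproach
```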


Figure 3.7 Procedure to extract the emotions of “Attribution”


3.3.6 Well-Being / Attribution

The emotions in the “Well-being/Attribution” group are elicited as compound emotions. There are four emotions in this group: gratitude, anger, gratification, and remorse. As shown in Table 3.11, these emotions are compounded from “Well-being” emotions and “Attribution” emotions based on the “Emotion Eliciting Condition Theory” shown in Table 3.4. Some conflicts appear in this table; however, no special process is supplied for a conflict.

Table 3.11 Emotion compound rules from “Well-being” emotions and “Attribution” emotions

                               Emotion of Attribution
Emotion of Well-being   Admiration   Reproach   Pride           Shame
Joy                     Gratitude    Conflict   Gratification   Conflict
Distress                Conflict     Anger      Conflict        Remorse

3.4 Dependency among Emotion Groups

We consider the dependency among emotion groups, as shown in Figure 3.8, based on the eliciting condition of each emotion.

First, we calculate the pleasure/displeasure of the user concerning the event using the EGC method from the user's viewpoint. When the event is prospective, an emotion in the "Prospect-based" group is extracted. When the prospective event is confirmed or disconfirmed, an emotion in the "Confirmation" group is extracted. Furthermore, the EGC is also applied to the event from another's viewpoint, and when the other feels pleasure or displeasure, an emotion in the "Fortunes-of-others" group is extracted. The emotions in the "Prospect-based," "Confirmation," and "Fortunes-of-others" groups are aroused when the user is pleased or displeased about the event. Therefore, an emotion in the "Well-being" group is extracted when these emotions are extracted or when the user feels pleasure or displeasure about the event. On the other hand, the output of EGC also reflects the moral value; when a moral is approved or disapproved of by the event, we can extract an emotion in the "Attribution" group.

Finally, the emotions in the "Well-being/Attribution" group are compounded from "Well-being" emotions and "Attribution" emotions as shown in Table 3.4.

[Figure 3.8 Dependency among emotion groups: Emotion Generating Calculations (EV) are applied to events from the viewpoint of the user and of others, and to predicted events from the viewpoint of the user; the results feed the "Prospect-based," "Confirmation" (via the prospective event list), "Fortunes-of-others," "Well-being," and "Attribution" (via approving or disapproving of a moral) groups, and "Well-being" and "Attribution" are combined into "Well-being/Attribution."]

3.5 Example of Complicated Emotion Allocating Method

Example 1: "Romeo dates with Juliet."

Emotion Generating Calculations (EGC) method

Event: "Romeo dates with Juliet."
Predicate (P) = "date with" : +0.6
Subject (S) = "Romeo" : +1.0
Object-Mutual (OM) = "Juliet" : +0.9
Event Type: "date with" V(S, OM)

Degree of EV = f(f_S, f_OM, f_P) = f(+1.0, +0.9, +0.6) = 1.47

Distinguish pleasure/displeasure: f(f_S, f_OM, f_P) = f(+1.0, +0.9, +0.6) → Area I (Pleasure)

Generated Emotion = (Degree of EV) × (pleasure/displeasure) = (1.47) × (+1) = +1.47
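As a numerical check, every "Degree of EV" in this chapter's worked examples (1.47, 0.77, 0.93) equals the Euclidean norm of the favorite-value vector, so the degree part of the calculation can be sketched as below. This is our reading of the worked examples, not a restatement of the full EGC equations of Chapter 2; the area-based pleasure/displeasure judgement is passed in as a precomputed flag.

    import math

    def egc_degree(fvs):
        """Degree of EV; e.g. egc_degree([1.0, 0.9, 0.6]) -> 1.4731 (about 1.47, Example 1)."""
        return math.sqrt(sum(v * v for v in fvs))

    def generated_emotion(fvs, is_pleasure):
        """Signed emotion strength: degree times +1 (pleasure) or -1 (displeasure)."""
        return egc_degree(fvs) * (1.0 if is_pleasure else -1.0)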

Complicated Emotion Allocating Method:

(1) Fortunes-of-others (Section 3.3.1)

(a) Fortunes-of-"Juliet"
Predicate (P) = "date with" : +0.7
Subject (S) = "Romeo" : +0.9
Object-Mutual (OM) = "Juliet (myself)" : +1.0
Event Type: "date with" V(S, OM)

Distinguish pleasure/displeasure: f(+0.9, +1.0, +0.7) → Area I (Pleasure)

EV of the event from "Juliet's" viewpoint = Pleasure
&
Romeo likes "Juliet" (FV of "Juliet" from Romeo's viewpoint: +0.9)
Happy for "Juliet" (Table 3.5)


(b) Fortunes-of-"Lord Montague (Romeo's father)"
Predicate (P) = "date with" : +0.3
Subject (S) = "Romeo" : +0.8
Object-Mutual (OM) = "Juliet" : −0.5
Event Type: "date with" V(S, OM)

Distinguish pleasure/displeasure: f(+0.8, −0.5, +0.3) → Area IV (Displeasure)

EV of the event from "Lord Montague's" viewpoint = Displeasure
&
Romeo likes "Lord Montague" (FV of "Lord Montague" from Romeo's viewpoint: +0.5)
Sorry for "Lord Montague" (Table 3.5)

(2) Well-being (Section 3.3.4)

(a) "Happy for Juliet" is generated by the event
Joy about the event (Table 3.8)

(b) "Sorry for Lord Montague" is generated by the event
Distress about the event (Table 3.8)


Example 2: "Yesterday, a mother scolded her noisy child."

Emotion Generating Calculations (EGC) method

Event: "Yesterday, a mother scolded her noisy child."
Predicate (P) = "scold" : −0.3
Subject (S) = "a mother" : 0.0
Object (O) = "her noisy child" : −0.5
Event Type: "scold" V(S, O)

Degree of EV = f(−0.5, −0.5, −0.3) = 0.77

Distinguish pleasure/displeasure: f(−0.5, −0.5, −0.3) → Area VI (Pleasure)

Generated Emotion = (Degree of EV) × (pleasure/displeasure) = (0.77) × (+1) = +0.77

Complicated Emotion Allocating Method:

(1) Fortunes-of-others (Section 3.3.1)

(a) Fortunes-of-"a mother"
Predicate (P) = "scold" : −0.4
Subject (S) = "a mother (myself)" : +1.0
Object (O) = "her noisy child" : +0.8
Event Type: "scold" V(S, O)

Distinguish pleasure/displeasure: f(+1.0, +0.8, −0.4) → Area V (Displeasure)

EV of the event from "a mother's" viewpoint = Displeasure
&
The user does not have any impression of "a mother" (FV of "a mother" from the user's viewpoint: 0.0)
No emotion (Table 3.5)


(b) Fortunes-of-"her noisy child"
Predicate (P) = "scold" : −0.3
Subject (S) = "a mother" : +0.8
Object (O) = "her noisy child (myself)" : +1.0
Event Type: "scold" V(S, O)

Distinguish pleasure/displeasure: f(+0.8, +1.0, −0.3) → Area V (Displeasure)

EV of the event from "her noisy child's" viewpoint = Displeasure
&
The user dislikes "her noisy child" (FV of "her noisy child" from the user's viewpoint: −0.5)
Gloating at "her noisy child" (Table 3.5)

(2) Well-being (Section 3.3.4)

Gloating is generated by the event
Joy about the event (Table 3.8)

(3) Attribution (Section 3.3.5)

The predicate of the event (scold) is a transitive verb.
&
EV of the event from the user's viewpoint = Pleasure
Admiration for "a mother's" act (Table 3.10)

(4) Well-being / Attribution (Section 3.3.6)

Joy is generated about the event
&
Admiration is generated about the event
Gratitude to "a mother" (Table 3.11)

Example 3: "My friend, suffering a disease, was cured."
(The event "My friend, suffering a disease, won't be cured" was recognized before.)

Emotion Generating Calculations (EGC) method

Event: "My friend, suffering a disease, was cured."
Predicate (P) = "be cured" : +0.6
Subject (S) = "my friend, suffering a disease" : +0.5
Event Type: "be cured" V(S)

Degree of EV = f(+0.5, +0.5, +0.6) = 0.93

Distinguish pleasure/displeasure: f(+0.5, +0.5, +0.6) → Area I (Pleasure)

Generated Emotion = (Degree of EV) × (pleasure/displeasure) = (0.93) × (+1) = +0.93

Complicated Emotion Allocating Method:

(1) Fortunes-of-others

(a) Fortunes-of-"my friend"
Predicate (P) = "be cured" : +0.4
Subject (S) = "my friend, suffering a disease (myself)" : +1.0
Event Type: "be cured" V(S)

Distinguish pleasure/displeasure: f(+1.0, +0.5, +0.4) → Area I (Pleasure)

EV of the event from "my friend's" viewpoint = Pleasure
&
The user likes "my friend" (FV of "my friend" from the user's viewpoint: +0.5)
Happy-for "my friend" (Table 3.5)


(2) Confirmation (Section 3.3.3)

The prospective event: "My friend, suffering a disease, won't be cured."

Event: "My friend, suffering a disease, won't be cured."
Predicate (P) = "not be cured" : −0.6
Subject (S) = "my friend, suffering a disease" : +0.5
Event Type: "be cured" V(S)

Distinguish pleasure/displeasure: f(+0.5, +0.5, −0.6) → Area V (Displeasure)

The prospective event is displeasure for the user.
&
The prospective event did not happen. (Table 3.6)
Relief about the prospective event (Table 3.7)

(3) Well-being

Happy-for and Relief are generated about the event
Joy about the event (Table 3.8)

3.6 Experimental Results

In this section, the adequacy of the emotions generated by the proposed method described in this chapter is reviewed by comparing the emotions generated by the system with the results of a questionnaire. First, we applied our method to a dialogue corpus and extracted 30 sentences from the corpus. Then, we asked 15 university students which emotions were aroused by the content of the 30 sentences shown in Table 3.12.

3.6.1 Experimentation 1

In this experimentation, we evaluated how similar the emotions generated by the system are to those aroused in the subjects. First, we showed the 30 sentences to 15 subjects, who selected adequate emotions from the 20 emotions that our system can generate. Table 3.13 shows the comparative results between the system output and the subjects' answers. The system extracted all the emotions that all subjects selected, and it extracted 75% of the emotions that most of the subjects (80%) selected.

We consider two reasons why the system cannot extract the common emotions.

First, a few FVs of predicates are inadequate. We defined the FVs of predicates based on whether the predicate means approach or avoidance. However, when the system analyzes the event "I don't know the reason of my disease," the subject is "I," the object is "the reason of my disease," the predicate is "not know," and the event type is V(S, O). We gave the predicate "know" a positive image because "know" means "gain some knowledge," so the system generated "pleasure" about the event. However, the subjects aroused negative emotions like "fear (86.7%)," "distress (73.3%)," and so on. There is no doubt that "the reason of my disease" itself is not preferable, but humans also infer further disadvantages, like "the disease won't be cured," when the reason is unclear. We should define the FVs of predicates considering such situations, too. A reasoning system would have to be added to solve this problem completely.

Predicate (P) = "not know" : −0.4
Subject (S) = "I" : +1.0
Object (O) = "the reason of my disease" : −0.3
Event Type: "not know" V(S, O)

Distinguish pleasure/displeasure: f(+1.0, −0.3, −0.4) → Area VIII (Pleasure)


Table 3.12 Sample sentences for experimentation

1. An old peoples home won't accept me.
2. A player for the Giants hits a slump.
3. I can't withdraw my savings at the bank.
4. I'm bored.
5. My grandson will be born next month.
6. Unexpectedly, my daughter didn't come to meet me.
7. I live with my family.
8. I hurt my grandson.
9. I always relax at home.
10. My friend has an incurable disease.
11. At last, my friend's disease was incurable.
12. My friend recovered from his disease.
13. My son will come home tomorrow.
14. My son didn't come home.
15. My son came home this evening.
16. As I expected, the Carp lost to the Swallows.
17. I don't know the reason for my disease.
18. My family manages my money instead of me.
19. I remember all of the "one hundred poems" cards.
20. Motorcycle gang members haven't been arrested by the police.
21. I have a family G.P.
22. I'm suffering from kidney failure.
23. My friend was injured.
24. The Carp often loses to the Giants.
25. I'm terribly forgetful.
26. My daughter takes trouble to check my change.
27. I can make a lot of household objects.
28. My friend didn't take me to a hot spring at that time.
29. Yesterday, the house owner argued with my noisy neighbor.
30. I often give advice to my family.

Table 3.13 Reappearance Rate

  Agreement rate (over *%)    100    90    80    70    60
  Selected number               2     4    20    29    47
  Reappearance number           2     3    15    20    31
  Reappearance rate (%)     100.0  75.0  75.0  69.0  66.0


The second problem is extracting tacit intention from the “aspects.” Some “aspects” imply

“hope,” “possibility,” and so on. For example, “DEKINAI (cannot)” implies “I want to do something,

but it has been prevented,” and “SHITEKURERU (take trouble to do)” implies “someone took

trouble to help me.” In this experimentation, the subjects aroused “gratitude (66.7%)” about the

event “my daughter takes trouble to check my change,” and “distress (53.3%)” about the event “I

can’t withdraw my savings at the bank.” We have to gather such implicit expressions by analyzing

the corpus.

3.6.2 Experimentation 2

In this experimentation, we reviewed how adequately the system generates emotions. We showed the emotions generated by the system to 15 subjects, who rated the adequacy of each emotion on a five-grade scale: "TOTEMO-DATOU (exactly)," "YAYA-DATOU (adequate)," "DOCHIRADEMONAI (so-so)," "YAYA-FUTEKISETSU (inadequate)," and "TOTEMO-FUTEKISETSU (wrong)." Then, we assigned the values 1.0, 0.75, 0.5, 0.25 and 0.0 to these answer grades. We define the adequacy rate as the average of the answer values:

Adequacy Rate = Σ (subject's answer value) / (the number of subjects)
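As a minimal sketch of this computation (the grade-to-value mapping is exactly the one above; the English keys are ours):

    GRADE_VALUES = {"exactly": 1.0, "adequate": 0.75, "so-so": 0.5,
                    "inadequate": 0.25, "wrong": 0.0}

    def adequacy_rate(answers):
        """Average the subjects' five-grade answers for one generated emotion."""
        return sum(GRADE_VALUES[a] for a in answers) / len(answers)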

Table 3.14 shows the result. Half (47.0%) of the emotions generated by the system were evaluated as exactly correct, since their adequacy rates were over 0.8. Furthermore, most (86.4%) of the generated emotions were evaluated as relatively correct, since their adequacy rates were over 0.5.

Table 3.14 Adequacy Rate

  Adequacy Rate (over *)                      0.8   0.6   0.5
  Agreed Rate in the evaluated emotions (%)  47.0  75.8  86.4

We consider two reasons why some generated emotions were judged inadequate.

The first problem concerns the dependency of emotions. In our method, the system always generates "joy" when the EGC outputs "pleasure." However, there were some exceptions in this experimentation. For example, our method generates "gratitude," "joy," and so on about the event "My family manages my money instead of me." The subjects agreed with "gratitude (68.3%)," but they did not agree with "joy (33.3%)," which was derived from "gratitude." We are re-investigating the conditions of "well-being" by analyzing such paradoxical responses.

The next problem is the relationship among competing emotions. For the event "my friend was hurt," the system generated not only "sorry-for" but also "reproach" and "anger." The latter emotions are caused by the displeasure about the situation "my friend's act upset me." Most of the subjects agreed with "sorry-for (98.3%)"; however, only a few subjects agreed with "reproach (26.7%)" and "anger (23.3%)." We have to consider that the tendency "some people do not arouse aggressive emotions against a person that they are feeling sorry for" causes such situations. We will investigate these relationships among competing emotions.

3.7 Future Works

Using our method, we can extract emotion from sentence-form expressions. However, we can guess the speaker's emotion not only from the sentence but also from single words like "BANZAI," "ARIGATOU (Thank you)," "GOMEN (I'm sorry)," "CHIKUSHOU (damn)," and so on. Furthermore, we do not always deal with complete sentences, because the hearer occasionally mishears a few words of the utterance and the speaker sometimes omits some words. Therefore, we focus on extracting emotion from words that carry an emotional feeling. This method picks up such small emotional expressions and works alongside the EGC method.

We collected many affective words by retrieving emotion words from the "EDR Japanese Word Dictionary [62]," as shown in Table 3.15.

When such affective words appear in the event, the degree of the corresponding emotion is increased a little. This method is easy and effective; however, it cannot handle negative aspects. Thus, we assign it only restrained values.

By adding this method to the EGC method, we can extract not only the former 20 emotions but also the emotions about "Attraction" and "Attraction/Attribution" synthetically.


Table 3.15 Affective words for each emotion

Emotion Examples of affective words

Joy ASOBI (play, amusement), SHOUMI (relish), AIKAN (joys and sorrows)

Distress ITAMASHII (painful), KUNOU (suffering), SHIREN (trial)

Happy-for ANTAI (peace, welfare), IWAU (congratulate), KANSEI (cheer)

Gloating ETSUBO (act of gloating), KOKIMIYOI (smart, neat)

Resentment FUKUSHUUSURU (revenge), ENKON (bitter feeling)

Sorry-for KAIKON (remorse, regret), MOUSHIWAKENAI (I’m very sorry)

Hope INORU (pray), KOUMYOU (gleam of hope), KIBOU (hope)

Fear AKUMU (nightmare), AWADATSU (get goosebumps)

Relief HOTTO (feel relieved), IKITSUKU (catch the breath)

Satisfaction AKIRU (have enough), YOI (good), ENMAN (perfect, harmonious)

Fears-confirmed —

Disappointment MUNASHII (empty, ineffectual), GAKKARI (be discouraged)

Pride OHON (cough with pride), IKIYOUYOU (be in high spirits)

Admiration APPARE (bravo, well done), KANSHIN (admire, be deeply impressed)

Shame AKAPPAJI (shame, disgrace), OTEN (stain, blot)

Reproach URAMESHII (reproachful, resentful), SAINAMU (reproach, torment)

Liking SUKI (like), NINKI (popularity), YOROSHIKU (Give my regards to)

Disliking MUKATSUKU (be irritated, get angry), GUUTARA (lazybones)

Gratitude ARIGATOU (thank you), OKURIMONO (gift), KANSHA (gratitude)

Anger KATTO (fly into a rage), IKIDOORASU (be angry, resent)

Gratification SIYOKU (gratification of desire)

Remorse —

Love AISURU (love), ADOKENAI (artless, innocent), ATATAKAI (warmhearted)

Hate URAMI (bitter feeling, grudge), IMAWASHII (disgusting, detestable)


3.8 Conclusion

In this chapter, we presented a method to classify the simple emotions (pleasure/displeasure) generated by the EGC method into 20 various emotions based on Elliott's "Emotion Eliciting Condition Theory." Elliott's theory requires judging conditions such as "feeling for another," "prospect and confirmation," and "approval/disapproval." We defined rules to check the condition of "feeling for another" based on the EGC's result using another's FVs. We judge "prospect and confirmation" by extracting aspects and adverbs. "Approval/disapproval" is judged by the event's case frame structure with a transitive verb.

To verify the effectiveness of the proposed method, we compared the generated emotions with emotions aroused in humans. As a result, 75% of the emotions that most (80%) of the subjects aroused were reproduced by this method, and half of the emotions generated by the system were evaluated as exactly correct, since their adequacy rates were over 0.8. Furthermore, most (86.4%) of the generated emotions were evaluated as relatively correct, since their adequacy rates were over 0.5.

The adequacy of our proposed method for generating various emotions was thus confirmed by questionnaire. However, there are many more factors connecting events with emotions than the ones we adopted. We are investigating emotion generating rules by clustering the relationships between emotions and events using a tree structure. We have to compare the result with our proposed method and unite them.

Next, in this study, we do not supply any special processes for conflicts, for example, when "distress" and "admiration" are aroused for an event at the same time, because there are various reactions against such conflicts. We are going to investigate these reactions and their conditions based on psychology, and realize these processes.


CHAPTER 4 ANALYSIS OF AFFIRMATIVE/NEGATIVE INTENTIONS FROM USER’S ANSWERS TO YES-NO QUESTIONS

To achieve natural communication between a human and the computer, we presented a method to calculate a user's emotion from the content of the user's utterances in Chapters 2 and 3. However, when the computer feels and expresses emotions, people will use unrestrained expressions with the computer just as with a person. Furthermore, people occasionally use ambiguous expressions when their language may cause the hearer's or their own displeasure, or when they do not have clear intentions. Even in such situations, a natural language dialogue system has to recognize the user's tacit intention from these expressions.

In this chapter, we propose a method to analyze the user's affirmative/negative intention from his/her utterances in the dialogue [30, 31, 32]. First, we extract some elements based on the surface structures of the responses and a concept of the question, and calculate a value (affirmative/negative value) corresponding to the degree of affirmation/negation. There are three types of elements: "affirmative/negative description for the yes-no question," "direct expression of intention in the response," and "indirect expression of intention in the response." The affirmative values are defined based on questionnaires. Furthermore, a function for calculating affirmative value changes is defined according to the aspects of the verb. Finally, the total affirmative/negative intention toward a question in the dialogue is calculated.

We applied this method to the "Web-based analytical system of health service." This system asks the questions to the user and analyzes the user's intention through conversation.

To verify the validity of our proposed method, we apply it to 50 responses in a dialogue corpus obtained by conversing with five subjects, and evaluate the results.

4.1 Intention Analyzing Method from the Utterance

4.1.1 Understanding Intentions of the Indirect Speech-Act in Natural Language Interfaces

Mima et al. proposed a method for understanding the intention of an indirect speech-act in a natural language interface for the operation of a computer system [29]. This method detects the user's demand for commands by converting the surface concept of the user's input sentence based on its intention.

First, the surface concept and the intention of the input sentence are extracted based on the result of morphological analysis. This method deals with five kinds of intention: "REFUSAL," "REVERSAL," "RESTRICTION," "BENEFIT," and "DISABILITY." In order to decide the intention, they use many morphological and syntactic rules, as shown in Table 4.1. These rules are specialized for Japanese grammar. Next, the system anticipates the user's demand using the knowledge representation for concepts of operations (Figure 4.1) and the intention links on the operation knowledge base (Figure 4.2).

As an example of the process used by this method, consider "I don't want to show the file named letter." First, a surface concept "show the file named letter" and an intention "REFUSAL" are extracted. Successively, the demand is anticipated using the concept, the intention, and the intention link. Then, the system anticipates the demand as a "request to forbid reading the file named letter."

Table 4.1 Morphological and syntactic rules for deciding the intentions

<*REFUSAL> ::= <N><relationship expression1><predicative expression1>
             | <N><relationship expression1><V><passive><predicative expression1>
             | <V><predicative expression1><N><verb that means existence>
             | <V><passive><predicative expression1><N><verb that means existence>
<*REVERSAL> ::= <N><relationship expression1><V><predicative expression2>
             | ...
<relationship expression1> ::= <subject> | <starting point> | <degree1> | <repeat> | <term> | ...
<verb that means existence> ::= <GA ARU> | <GA ARIMASU> | <GA SONZAI SURU> | <GA SONZAI SHIMASU>
...


[Figure 4.1 Illustrations of knowledge representations for concepts of operations: (a) <LOAD>, where Japanese expressions such as MIRU, KAKUNIN, YOMU, YOMIKOMU, SIMESU and HYOUJI are linked through plan links to <LOOK>, <READ> and <DISPLAY>; (b) <RESTORE>, where FUKKI, FUKKATSU and TSUKERU are linked to <UNDELETE>, <STORE>, <UNDELETE-FILE> and <STORE-PAGE_NUM> through plan and effect links.]

[Figure 4.2 Illustrations of the intention links on the operation knowledge base: links labeled "REFUSAL" and "REVERSAL" connect operations such as <DELETE>, <CREATE>, <RESTORE>, <TRANSFER>, <FORBID-WRITE>, <FORBID-READ>, <LOAD> and <PRINT>.]

4.1.2 Recognizing User Communicative Intention in a Dialogue-Based Consultant System

Kumamoto et al. proposed a method to recognize a user's communicative intention (CI) from natural language dialogue in order to support the usage of a computer [63]. They consider eight CI types: "method," "attribute value," "concept," "yes-no value," "goal," "belief," "end of dialogue," and "start of dialogue."

Figure 4.3 shows the process of communicative intention recognition. First, the system extracts some features to determine the CI type according to the result of the morphological analysis of the input sentence. There are four types of features: "function word," "information about parts of speech," "conjugate information," and "original form information." Table 4.2 is the list of the function words. Next, the CI type is determined by pattern matching between the extracted features and the "pattern-CI type translation table" shown in Table 4.3. Successively, the system generates the action frame of the sentence and outputs the CI description by combining the CI type and the generated action frame.


[Figure 4.3 Flow of communicative intention recognition: input of user utterance sentence → morphological analysis → feature extraction → CI type determination and frame generation → output of CI description.]

Table 4.2 List of function words

   #  Type of function word       Examples
   1  Dialogue starting signal    SUMIMASEN, ANO
   2  Dialogue ending signal      WAKARIMASITA
   3  HOW phrase                  DOUSURU, DOUYARU, DOU
   4  Predicate about teaching    OSHIERU
   5  Predicate about benefit     MORAU, KURERU
   6  Predicate about wish        HOSHII
   7  Interrogative pronoun       DARE, DOKO (without NANI)
   8  NANI                        NANI
   :  :                           :
  19  Predicate about knowledge   SHIRU, OBOERU, WAKARU


Table 4.3 Pattern-CI type translation table

  (((Dialogue starting signal))                   Starting dialogue)
  (((Dialogue ending signal))                     Ending dialogue)
  (((HOW phrase))                                 Method)
  (((Predicate about teaching))                   Attribute value)
  (((Predicate about benefit))                    Goal)
  (((Predicate about wish))                       Goal)
  (((Interrogative pronoun)(OK type predicate))   Attribute value: OK)
  (((NANI MO)(OK type predicate))                 Yes-no value: OK)
  :                                               :
  (((end of sentence))                            Belief)

As an example of the process used by this method, consider "SHUURYOU WO OSEBA IIN DESUNE? (It is OK if I just click the "SHUURYOU" button, isn't it?)" First, the system extracts (SA-HEN type noun, predicate about movement, condition form, OK type predicate, question type particle, end of sentence) as the features to determine the CI type, and the CI type "yes-no value: OK" is selected according to the "pattern-CI type translation table." Next, the action frame "click the "SHUURYOU" button" is generated at the frame generation stage. Therefore, the CI description (yes-no value: OK, "click the "SHUURYOU" button") is obtained.

4.2 An Overview of the Affirmative/Negative Intention Analyzing Method

4.2.1 Web-based Analytical System of Health Service Needs among the Healthy Elderly

The increase in the elderly population creates a need not only for medical care but also for health services. Japanese society has already built up medical insurance and medical care systems. However, the health service does not cater for the "healthy elderly." Yoshida proposed the "web-based analytical system of health service needs of the elderly" [64]. The basic questionnaire consists of 50 items. These items are related to QOL (Quality of Life), life-satisfaction, life-style, mental stress, and social concern. The subject answers these questionnaire sheets on the homepage, and the answers are sent to the web server, as shown in Figure 4.4(a). Successively, the system checks whether the questionnaires fulfill all the conditions, and they are stored in a queue for reasoning. Based on such calculations, the analytical results and a health counseling comment are presented to the subject's browser, as shown in Figure 4.4(b).

The system was developed to analyze the population-based health service needs for the official health center. A health service for elderly people will be offered based on these results. The system is expected to classify the health service needs of the elderly in his/her home.

Although personal computer diffusion in Japan reaches about 30% according to statistical data from the Economic Planning Agency of Japan, some people find it difficult to use a computer, and we often see this among the elderly. For example, they tire of answering 50 questions, gazing at a display for a long time, and reacting to a monotonous computer system. These problems are caused by the distance between natural conversation and the usability of the computer. That is, they hope for a computer equipped with human-like interfaces that enables easy conversation, including greetings and so on. Besides verbal messages, human face-to-face communication usually involves nonverbal messages such as facial expressions, vocal inflection, speaking rate, pauses, hand gestures, body movements, posture, and so on.

To improve this weak point of the developed "web-based health service system," we propose a method to analyze the user's tacit intention even if the user replies with an ambiguous response, and to reply with natural responses for a comfortable conversation.


[Figure 4.4 Web-based analytical system of health service: (a) Questionnaire; (b) Results]

4.2.2 Affirmative/Negative Intention Analyzing Method for Web-based Analytical System of Health Service

The system asks the user 50 questions about QOL (quality of life) and displays a counseling comment about the user's health, as shown in Figure 4.4. All the questions are yes-no questions that the user should answer with "yes" or "no," and the questions are fixed. However, people cannot always determine their intentions clearly, and their intentions occasionally become clear only through the conversation.

We propose a method to analyze a user's affirmative/negative intention from responses to the questions asked on the Web [30, 31, 32].

Figure 4.5 shows the process of analyzing the user's intention. First, the system asks one of the prepared yes-no questions, and the user responds with an answer using a natural language expression. Next, the response is analyzed morphologically using ChaSen [65], a Japanese morphological analyzer. Then, the system extracts words that imply the affirmative/negative intention of the user (affirmative/negative elements) from the output of the morphological analysis, based on the concept data of the question. We define three types of affirmative/negative elements, as shown in Section 4.3: affirmative/negative description of the yes-no question, direct expression of intention in the answer, and indirect expression of intention in the answer. Successively, the system calculates the affirmative values. All the affirmative/negative elements have affirmative values that indicate the degree of affirmation/negation. However, some adverbs and modalities affect the affirmative values of the verbs. We show their calculation method in Sections 4.4, 4.5 and 4.6. Finally, the system calculates the total affirmative intention of the user from the extracted affirmative/negative elements. The dialogue about the question continues until the system obtains enough affirmative/negative elements to guess the user's intention or the system judges that the same topic has continued too long.

We limit the scope of this study to Japanese, because this method is applied to the interface of the "Web-based analytical system of the Health Service for the Elderly [64]."


[Figure 4.5 Overview of the intention extracting method: the question and the user's response undergo morphological analysis and parsing; elements of intention are extracted with the question concept database; the affirmative values of the extracted affirmative/negative elements are calculated, taking modality and adverbs into account; and the total affirmative intention of the subject is calculated from the utterance intention list.]

4.3 Affirmative/Negative Element

The speaker's intention about a yes-no question is guessed by extracting affirmative/negative elements from the conversation. We categorize the affirmative/negative elements into three types based on their relationship with the content of the question, as follows:

- Affirmative/negative description for the yes-no question
- Direct expression of intention in the response
  - Affirmative/negative expression using the main verb in the question
  - Affirmative/negative expression using the auxiliary verb in the question
- Indirect expression of intention in the response
  - Indirect information addition
  - Non-standard reason addition

4.3.1 Affirmative/Negative Description for the Yes-No Question

The following example is a dialogue which includes an affirmative/negative element of the "affirmative/negative description for the yes-no question" type. This question has been used in the "Web-based analytical system of the Health Service for the Elderly." The questions in Sections 4.3.2 and 4.3.3 have also been used in the web-based analytical system.

Q1: Do you sometimes become nervous?

A1: Yes, I do.

We limit the speaker's intention to "Yes" or "No," because our purpose is to guess whether the speaker agrees or disagrees with the content of the yes-no question. The most standard response to the question is "Yes" or "No." Furthermore, "That's right" and "It's wrong" are also used as responses to the question.

In this paper, we use the following interjections, as described in "Basic Japanese Grammar [51]," as affirmative/negative elements.

Affirmative interjections: HAI, EE, AA, UN, HAA, SOUDA
Negative interjections: IIE, IYA, CHIGAU

When two or more interjections appear in a sentence, we use the first interjection of the sentence, because the other interjections, which appear in the middle of the sentence, sometimes indicate the opposite intention to the first one. In Japanese, the answers to negative questions like "Can't you…?" sometimes cause confusion. However, the system does not supply such negative questions.


4.3.2 Direct Expression of Intention in the Response

Q2: Do you lose your way?

A2: Yes. I often lose my way.

Q3: Can you fill out the pension forms?

A3: Of course, I can.

The user conveys his/her affirmative/negative intention using the verb in the question. Furthermore, some auxiliary verbs which indicate capability and continuation occasionally represent affirmation/negation. We consider that the sentence has an affirmative intention when it contains a verb or an auxiliary verb that appears in the question and does not have a negative aspect. We predefine the following verbs and auxiliary verbs as affirmative/negative elements, because the system always asks the same 50 questions and their expressions are always the same.

Main verbs: write, read, lose, think, feel, talk, guide, have, interest, satisfy, inconvenience, socialize, live, remember, calculate, hang, help, join, take, eat, walk, endeavor, smoke, desire

Auxiliary verbs: can, have

4.3.3 Indirect Expression of Intention in the Response

Q4: Can you fill out the pension forms?

A4: They aren’t so difficult for me.

Q5: Can you fill out the pension forms?

A5: My daughter guides me.

People occasionally do not want to reply clearly, for example, when they feel coy about the content of the response, when their intentions are not settled yet, or when the response may cause displeasure to the hearer. In such situations, they use indirect expressions that make the hearer guess the intention. Yamada classified indirect responses into 12 categories, as shown in Table 4.4 [33].

We limit the intentions extracted for guessing the speaker's affirmation/negation to "indirect information addition" and "non-standard reason addition."

A reasoning process is needed to guess the speaker's purpose, intention, demand, and so on from his/her surrounding situation. However, this is too difficult to judge from a natural language utterance.


Therefore, we define some verbs that may be used in the indirect representations for each question, and we consider them as affirmative/negative elements. These elements are selected based on a question-answer dialogue corpus. They are similar/same words and upper/lower concepts of the question's verb, attributes of the question's context, and so on. For example, we select the following verbs as affirmative/negative matters for the question "Can you fill out the pension forms?"

Sentence: Can you fill out the pension forms?
Affirmative matter: write, easy, move
Negative matter: ask, tired, hard


Table 4.4 Classification of cooperative responses [33]

1. Precondition responses
  - Correction precondition: When the precondition about the question's intention is wrong, the response corrects it.
  - Notice precondition: The response notices the precondition of the question's intention even if it is not satisfied.
  - Confirmation precondition: The respondent sometimes asks about the precondition when he/she does not know whether it is satisfied or not.

2. Providing information responses
  - Providing wanted information: Provide additional information about what the speaker wants by guessing his/her tacit intention in the question.
  - Providing supportive information: Provide supportive information to accomplish the speaker's purpose.
  - Providing indirect information: When the respondent does not know the answer but he/she has incomplete information to guess the answer, it is provided.
  - Providing substitute information: When the response is negative, substitute information is provided to accomplish the purpose.

3. Providing reason responses
  - Providing contrary expectation reasons: Provide a reason why the respondent cannot meet the speaker's expectation of the question.
  - Providing a non-standard reason: Provide the reason for the response when it is contrary to general reasoning.
  - Providing a standard reason: Provide the reason for the response, even though there is a reason to imply a non-standard one.

4. Counter questions to the question
  - Cooperating question: Asking for more information to achieve a purpose.
  - Intention correcting question: Ask the real purpose of the question when the respondent cannot guess the speaker's intention.


4.3.4 Data Structure Description in the Question

We can apply the "affirmative/negative description for the yes-no question" to all questions. However, the "direct/indirect expression of intention" depends on the expression of the question. Therefore, the affirmative/negative elements for the "direct/indirect expression of intention" have to be predefined for each question.

Each question has three kinds of data frames: direct answer verb, affirmative matter, and negative matter. The direct answer verb is defined from the main verb and the auxiliary verb in the question, as shown in Section 4.3.2. On the other hand, the indirect answer verbs are classified into "affirmative matter" and "negative matter," and they are extracted from the corpus for each question, as shown in Section 4.3.3. The following example is the data structure description for the question "Can you fill out the pension forms?"

Sentence: Can you fill out the pension forms?
Direct answer verb: can fill out, can
Affirmative matter: write, do, easy, move
Negative matter: ask, leave, tired, hard
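Such a question frame can be represented as a small record; the following Python sketch is illustrative only (the field names are ours).

    from dataclasses import dataclass, field

    @dataclass
    class QuestionFrame:
        """Predefined data frames for one yes-no question (Section 4.3.4)."""
        sentence: str
        direct_answer_verbs: list = field(default_factory=list)
        affirmative_matter: list = field(default_factory=list)
        negative_matter: list = field(default_factory=list)

    PENSION_QUESTION = QuestionFrame(
        sentence="Can you fill out the pension forms?",
        direct_answer_verbs=["can fill out", "can"],
        affirmative_matter=["write", "do", "easy", "move"],
        negative_matter=["ask", "leave", "tired", "hard"],
    )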

4.4 Affirmation Value

In order to calculate the user's affirmative degree toward the question, we prepare an affirmative value for each affirmative/negative element. We define the affirmative values in the range [0.0, 1.0], where 1.0 means the strongest affirmation and 0.0 means the strongest negation; 0.5 means neither affirmation nor negation. We defined the affirmative values of the interjections and verbs from the result of a questionnaire, as shown in Figure 4.6.

4.4.1 Affirmative Value of the Interjection

We investigated the affirmative degrees of the typical Japanese interjections HAI (yes) and IIE (no) using an 11-grade questionnaire given to 14 university students (male: 10, female: 4). The degrees are marked on a number line. Figure 4.6 shows an example of the questionnaire.

According to the results of the questionnaire, all subjects gave HAI an affirmative value over 0.8 and IIE a value under 0.2. We then defined the averages of the answers as their affirmative values: the value of HAI is 0.94, and the value of IIE is 0.06. We did not give them the absolute values 1.0 and 0.0, because when Japanese say "yes" it sometimes includes a little "no," and "no" sometimes includes a little "yes" [66].

We give the other affirmative interjections like EE and UN in Section 4.3.1 the same affirmative value 0.94, and the other negative interjections like IYA and CHIGAU in the same section the affirmative value 0.06. We call these pairs of interjections and their affirmative values "interjection data."
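A direct encoding of this interjection data (the values are exactly those defined above):

    # Affirmative values of interjections (Section 4.4.1).
    INTERJECTION_DATA = {
        "HAI": 0.94, "EE": 0.94, "AA": 0.94, "UN": 0.94, "HAA": 0.94, "SOUDA": 0.94,
        "IIE": 0.06, "IYA": 0.06, "CHIGAU": 0.06,
    }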

[Figure 4.6 Example of the questionnaire:
Q: Can you fill out the pension forms?
A1: Yes. / A2: No.
Affirmative degree for A1: ___ / Affirmative degree for A2: ___
(each marked on a line running from "Absolute negation" through "Neither affirmation nor negation" to "Absolute affirmation")]

4.4.2 Affirmative Value of the Verb

We investigated the affirmative degrees of the direct expressions (KAKERU (can fill out), DEKIRU (can)) and the indirect expressions (KAKU (write), YARU (do), TANOMU (ask), MAKASERU (leave)) for the question "Can you fill out the pension forms?" using the same 11-grade questionnaire with 14 university students (male: 10, female: 4) as in Section 4.4.1.

Based on the averages of the questionnaire results, we defined the value of the verbs of the direct expressions as 0.91, the value of the verbs of the indirect affirmative expressions as 0.82, and the value of the verbs of the indirect negative expressions as 0.25.

4.5 Affirmative Value Changing Scale

4.5.1 "Affirmative Value Changing Scale" Affected by the Adverb

Adverbs modify verbs, and this modification affects the affirmative degrees of the verbs. We define the degree of the effect that changes the affirmative degrees of verbs and adjectives as the "affirmative value changing scale."

4.5.1.1 Classification of the adverb

Adverbs are classified into many types; there are mainly four: the "state adverb," which indicates the state of an action; the "degree adverb," which shows the degree of a statement, feeling, or change; the "quantity adverb," which shows the quantity of the objects and people related to the action; and the "tense and aspect adverb," which shows the time, occurrence, frequency and development of an event. We consider only the "degree adverbs," because they affect the affirmative degree of the modified verb.

In order to define the "affirmative value changing scale" of the adverbs, we classified both the affirmative adverbs and the negative adverbs into three groups each, based on "DAIJIRIN [67]" and "Adverbs' meaning and usage [68]," as shown in Table 4.5.

Table 4.5 Classification of the degree adverbs [67, 68]

  Group  Adverbs
  A      HIJOUNI, TOTEMO, ZUIBUN (greatly, extremely), KEKKOU, KANARI, DAIBU (quite, very well), TOTEMO (very, really), JUUBUN, YOKU (enough, sufficiently), SOUTOU (considerable)
  B      WARITO, WARIAI, WARINI (comparatively, relatively), MAAMAA (moderate, so-so)
  C      SUKOSHI (slightly), CHOTTO, SHOUSHOU, TASHOU (a little), IKURAKA (somewhat)
  D      (no modified expression by adverb)
  E      MATTAKU, ZENZEN (quite, entirely), SUKOSHIMO, SAPPARI, CHITTOMO (not at all), TOUTEI (hardly), METTANI (rarely)
  F      TOTEMOTOTEMO (not at all)
  G      SONNANI, AMARI, ANMARI (not very), SAHODO (not so much), TAISHITE (not very much)
  H      (no modified expression by adverb)

H (no modified expression by adverb)

4.5.1.2 Calculation of “affirmative value changing scale”

We investigated the effect of the adverbs in each group on the affirmative degree using a questionnaire completed by 30 subjects, based on Analytic Hierarchy Process (AHP) theory [69]. We compared the three affirmative adverb groups and the expression without any adverb. We selected one adverb from each group (group A "HIJOUNI," group B "WARITO," group C "SUKOSHI") and calculated their priorities for the affirmative degree of each adverb's expression by paired comparison. In a similar way, we also compared the three negative adverb groups and the expression without any adverb, selecting one adverb from each group (group E "MATTAKU," group F "TOTEMOTOTEMO," group G "SONNANI").

A paired comparison in AHP evaluates the ratio of each alternative's priority. The subjects gave the value 1.0 to the situation "adverb A has the same strength as adverb B," the value 3.0 to "adverb A is slightly stronger than adverb B," the value 5.0 to "adverb A is stronger than adverb B," the value 7.0 to "adverb A is clearly stronger than adverb B," and the value 9.0 to "adverb A is absolutely stronger than adverb B." When the relationship between adverbs A and B is reversed, the values change to 1.0, 1/3, 1/5, 1/7 and 1/9, respectively. We did not consider any hierarchies for AHP because we just need the relative strength among the expressions modified by the adverbs. We constructed a four-dimensional square matrix A = [a_ij] based on the results of the paired comparisons, where a_ij is the result of the paired comparison between groups i and j. We calculated the eigenvalues and eigenvectors of the matrix, normalized the principal eigenvector, and then obtained the four priorities of the expressions w = [w_1, w_2, w_3, w_4], as shown in Table 4.6.

However, if these priority values were used directly as the "affirmative value changing scales," the affirmative degree of the no-adverb expression would also be changed. Therefore, the priorities are normalized so that the "affirmative value changing scale" of the no-adverb expression becomes 1.00, by the following expression:

u_i = w_i / 0.178

The denominator is the priority of the no-adverb expression, w_i is the priority of group i, and u_i is the "affirmative value changing scale" of group i. In the same way, we calculated the "affirmative value changing scales" for the negative adverb groups (whose no-adverb priority is 0.138).

We consider groups E and F to be the same group, because some subjects answered that group E is stronger than group F, while others answered that group F is stronger than group E.

Table 4.6 shows each group's "affirmative value changing scale." We call the pairing of an adverb and its "affirmative value changing scale" "adverb data." The affirmative value calculation method for a modified predicate using the "affirmative value changing scale" is presented in Section 4.5.2.4.

Table 4.6 Priority and "affirmative value changing scale" of each adverb group

  Group               Priority   Affirmative value changing scale
  A (HIJOUNI)         0.654      3.67
  B (WARITO)          0.111      0.62
  C (SUKOSHI)         0.057      0.32
  D (nothing)         0.178      1.00
  E (MATTAKU)         0.404      2.93
  F (TOTEMOTOTEMO)    0.404      2.93
  G (SONNANI)         0.054      0.39
  H (nothing)         0.138      1.00
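For reference, the eigenvector step can be sketched with NumPy. This is a generic AHP priority computation following the stated procedure, not the dissertation's own code, and the pairwise matrix below is a placeholder, not the subjects' actual data.

    import numpy as np

    def ahp_priorities(pairwise):
        """Priorities from the principal eigenvector of an AHP pairwise matrix."""
        eigvals, eigvecs = np.linalg.eig(pairwise)
        principal = eigvecs[:, np.argmax(eigvals.real)].real
        return principal / principal.sum()     # normalize to sum to 1

    # Placeholder 4x4 reciprocal matrix for groups A, B, C and D (no adverb):
    pairwise = np.array([[1.0, 5.0, 7.0, 3.0],
                         [1/5, 1.0, 3.0, 1/3],
                         [1/7, 1/3, 1.0, 1/5],
                         [1/3, 3.0, 5.0, 1.0]])
    w = ahp_priorities(pairwise)
    u = w / w[3]      # scale so the no-adverb group D gets exactly 1.00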


4.5.2 Affirmative Value Change by Modality

Some suffixes can change the meaning of a verb or adjective. For example, when the suffix "not" appears with the verb "write," the meaning of "not write" is the reverse of that of "write." Not only the negative modality but also the past modality and the double negative modality have such effects. In this section, we propose functions that express the change of the affirmative value under various modalities.

4.5.2.1 Negative modality

In order to clarify the relationship between the affirmative degree of a normal expression and that of the same expression with negative modality, we investigated the relationship by questionnaire. We presented the normal sentences and their negative counterparts shown in Table 4.7 to 14 subjects (university students), who answered with affirmative degrees within the range [0.0, 1.0].

Table 4.7 Affirmative and negative expressions used for the questionnaire

Q: "NENKIN NO SHORUI WO HITORI DE KAKEMASUKA?" (Can you fill out the pension forms by yourself?)
A1: "KAKEMASU." (I can fill them out.)
A2: "KAKEMASEN." (I can't fill them out.)
A3: "DEKIMASU." (I can.)
A4: "DEKIMASEN." (I can't.)
A5: "YARIMASU." (I do them.)
A6: "YARIMASEN." (I don't do them.)
A7: "TANONDE IMASU." (I always ask someone to do them.)
A8: "TANONDE IMASEN." (I don't ask anyone to do them.)
A9: "MAKASETE IMASU." (I leave them to someone else.)
A10: "MAKASETE IMASEN." (I don't leave them to someone else.)

Figure 4.7 is the result of the questionnaire for investigating the function of negative modality.

The horizontal axis is the affirmative degree of the normal sentence, and the vertical axis is that of

the sentence with negative modality.


Figure 4.7 Function of the negative modality

The graph shows that the sentences' affirmative degrees are changed to values symmetrical about the middle value 0.5. We define a common function from the affirmative expression to the negative expression. We calculated both an approximate linear function and a linear polynomial expression. The shape of the polynomial function is almost the same as that of the approximate linear function, because the curve of the polynomial function (y = b + C_1·x + C_2·x²) is mild and C_2, the coefficient of x², is very small. Furthermore, the spread around the regression line is not so large, because the correlation coefficient of the approximate linear function (−0.886) is close to −1.0. We therefore approximate the effect of negative modality by the following linear function, where x is the affirmative value of the normal expression and y is the affirmative value of the expression with negative modality:

y = −x + 1.0    (4.1)

The following example shows the change of the affirmative value of a verb by negative modality.

Q: "NENKIN NO SHORUI WO KAKEMASUKA?" (Can you fill out the pension forms?)
A: "MOU KAITEINAIKARA DEKINAIDESUNE." (I can't do them because I haven't written them recently.)

The affirmative value of the verb phrase "KAITEINAI" changes to 0.18 from the value 0.82 of its normal expression "KAKU (write)," and the value of the verb phrase "DEKINAI" changes to 0.09 from the value 0.91 of its normal expression "DEKIRU (can)."


4.5.2.2 Double negative modality

We investigated the relationship between the normal expression and the double negative expression using a questionnaire that presented two normal sentences and their double negative counterparts to the same 14 subjects as in Section 4.5.2.1. Table 4.8 shows the sentences presented in this questionnaire.

Table 4.8 Normal and double negative expressions used for the questionnaire

Q: “NENKIN NO SHORUI WO HITORI DE KAKEMASUKA?” (Can you fill out the pension forms?)

A1: “KAKEMASU.” (I can fill them out.)

A2: “KAKENAI KOTOMO NAIDESU.” (I don’t think that I can’t fill them out.)

A3: “DEKIMASU.” (I can.)

A4: “DEKINAI KOTOMO NAIDESU.” (I don’t think that I can’t.)

Figure 4.8 is the result of the questionnaire for investigating the function of double negative

modality. The horizontal axis is the affirmative degree of the normal sentence, and the vertical axis

is that of the sentence with the double negative modality.

Figure 4.8 Function of the double negative modality

The graph shows that the affirmative values of the sentences with double negative modality are uneven when the value of the normal sentence is 1.0; however, we can see a general pattern that the degrees of affirmation/negation decrease a little. We therefore approximate the effect of double negative modality by the following linear function, where x is the affirmative value of the normal expression and y is the affirmative value of the expression with double negative modality:

y = 0.4x + 0.3    (4.2)

The following example shows the change of the affirmative value of a verb by double negative modality.

Q: “NENKIN NO SHORUI WO KAKEMASUKA?” (Can you fill out the pension forms?)

A: “KAKENAI KOTOMO NAI DESU.” (I don’t think I can’t write it.)

The affirmative value of the verb phrase "KAKENAI KOTOMO NAI" drops to 0.66 from the value 0.91 of its normal expression "KAKERU (can write)."

4.5.2.3 Past modality

We investigated the relationship between the normal expression and the past expression using a questionnaire that presented normal sentences and their past counterparts to the same 14 subjects as in Section 4.5.2.1. Table 4.9 shows the sentences presented in this questionnaire.

Table 4.9 Normal and past expressions used for the questionnaire

Q: “NENKIN NO SHORUI WO HITORI DE KAKEMASUKA?” (Can you fill out the pension forms?)

A1: “KAKEMASU.” (I can fill them out.)

A2: “KAKEMASHITA.” (I could fill them out.)

A3: “DEKIMASU.” (I can.)

A4: “DEKIMASHITA.” (I could.)

A5: “KAKIMASU.” (I write.)

A6: “KAKIMASHITA.” (I wrote.)

Figure 4.9 is the result of the questionnaire for investigating the function of past modality. The

horizontal axis is the affirmative degree of the normal sentence, and the vertical axis is that of the

sentence with the past modality.


Figure 4.9 Function of the past modality

However, the subjects' answers were spread between affirmation and negation. We guess the reason as follows: when subjects hear about the speaker's past experiences, some subjects guess "then he still can do it," while others guess "he can't do it now, because if he still could, he would not have to use the past modality." Therefore, we do not consider past modality in this study.

4.5.2.4 Modification by adverb

When an extracted predicate is modified by an adverb, the affirmative value of the modified predicate is calculated by multiplying the affirmative value of the predicate by the "affirmative value changing scale" of the adverb. An adverb whose "affirmative value changing scale" is over 1.0 emphasizes the predicate's affirmative value, and an adverb whose "affirmative value changing scale" is under 1.0 reduces it. We define the following equations for modification by the adverb.

y = (x − 0.50) × u + 0.50    (4.3)

y_new = 0.00          (y < 0.00)
      = y             (0.00 ≤ y ≤ 1.00)    (4.4)
      = 1.00          (1.00 < y)

Here, x is the affirmative value of the normal expression, u is the "affirmative value changing scale," and y is the affirmative value of the expression modified by the adverb. The conditions in equation (4.4) keep the affirmative value, after multiplication by the "affirmative value changing scale," within the range [0.0, 1.0]. We consider y_new to be the affirmative value of the expression modified by the adverb.

The following example shows the change of the affirmative value of a verb by an adverb.

Q: "NENKIN NO SHORUI WO KAKEMASUKA?" (Can you fill out the pension forms?)
A: "WARIAI KAKEMASU." (I can fill them out comparatively well.)

The affirmative value of the verb phrase "WARIAI KAKEMASU" drops to 0.75, obtained by multiplying the affirmative value 0.91 of the verb "KAKERU (can write)" by the "affirmative value changing scale" 0.62 of the adverb "WARIAI (comparatively well)."

We calculate the affirmative value of a modified predicate from the predicate and the adverb. However, an adverb sometimes appears without any predicate in spoken language; for example, we respond with "MAAMAA (so-so)," "ZENZEN (entirely)," and so on. When we calculate the affirmative value of such an expression, we load the predicate from the question and then apply our method.

Q: "NENKIN NO SHORUI WO KAKEMASUKA?" (Can you fill out the pension forms?)
A: "MAAMAA DESU." (I might be able to.)

In this example, we fill the question's verb "KAKERU (can write)" into the response. This operation matches the human feeling that the complete expression of the response is "MAAMAA KAKEMASU (I can fill them out so-so)." Then, the affirmative value of the verb "KAKEMASU" drops to 0.75 by "MAAMAA," the same as in the former example.
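The functions of equations (4.1) through (4.4) are small enough to state directly; the following Python sketch mirrors them (the function names are ours). For instance, apply_adverb(0.91, 0.62) returns about 0.75, reproducing the "WARIAI KAKEMASU" example above.

    def negate(x):
        """Negative modality, equation (4.1): y = -x + 1.0."""
        return -x + 1.0

    def double_negate(x):
        """Double negative modality, equation (4.2): y = 0.4x + 0.3."""
        return 0.4 * x + 0.3

    def apply_adverb(x, u):
        """Adverb modification, equations (4.3) and (4.4)."""
        y = (x - 0.50) * u + 0.50
        return min(max(y, 0.0), 1.0)    # clamp y_new into [0.0, 1.0]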

4.6 Analyzing Intention from Plural Sentences

We have proposed methods to extract affirmative/negative elements and to calculate their affirmative values in the former sections. Here, we propose a method to calculate the total affirmative intention over the user's plural utterances using these elements.

We take the average of the affirmative values of the extracted affirmative/negative elements, weighted by their priorities, as the total affirmative intention of the user's plural utterances. The equation is given as follows:

z = ( Σ_{i=1}^{n} x_i · w_i ) / ( Σ_{i=1}^{n} w_i )    (4.5)

where x_i is the affirmative value of an affirmative/negative element, w_i is the priority of each affirmative/negative element, z is the total affirmative intention value of the user's plural utterances, and n is the number of extracted affirmative/negative elements. In this equation, a newer element has a stronger effect, because the priority of a newer element is defined to be larger than that of an older one.
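Equation (4.5) is a weighted average. A minimal sketch follows; the text does not specify the numerical recency weights, so equal weights are used by default, which matches the worked example in Section 4.7.

    def total_affirmative_intention(values, weights=None):
        """Equation (4.5): priority-weighted average of affirmative values."""
        if weights is None:
            weights = [1.0] * len(values)    # equal priorities by default
        return sum(x * w for x, w in zip(values, weights)) / sum(weights)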

4.7 Example of Our Method

Let us show an example of analyzing the intention of the following utterance.

Q: "KODOKUKAN WO KANJI MASUKA?" (Do you feel any loneliness?)
A: "IIE, ANMARI." (No, not very.)

First, the system applies morphological analysis to the response "IIE, ANMARI" that the user inputs. Next, the system extracts affirmative/negative elements from the parsing result and calculates the affirmative values of the extracted elements using the interjection data, the adverb data, and the question data. In this example, two affirmative/negative elements are extracted: "IIE" from the interjection data and "ANMARI" from the adverb data.

"ANMARI" is used as a single adverb, so the system guesses that the question's verb "KANJIRU (affirmative value: 0.91)" is omitted, as shown in Section 4.5.2.4. We fill in the negation of the question's verb, "KANJINAI," because "ANMARI" has a negative aspect. The affirmative value of "ANMARI" is then calculated as follows. (In this calculation, AV and AVCS denote "affirmative value" and "affirmative value changing scale," respectively.)

(1) Employ equation (4.1) to calculate the affirmative value of "KANJINAI," because it is the negative expression of "KANJIRU":

AV of "KANJINAI" = 1.0 - AV of "KANJIRU" = 1.0 - 0.91 = 0.09

(2) Employ equations (4.3) and (4.4) to consider the effect of the adverb "ANMARI":

AV of "ANMARI" = {AV of "KANJINAI" - 0.50} x AVCS of "ANMARI" + 0.50
               = (0.09 - 0.50) x 0.39 + 0.50 = 0.34

Then, the system obtains the affirmative values "IIE (affirmative value: 0.06)" and "ANMARI (affirmative value: 0.34)." Successively, the system calculates the total affirmative intention of the user's utterances by substituting these affirmative values into equation (4.5):

TotalAffirmativeIntention = (0.06 x 1 + 0.34 x 1) / (1 + 1) = 0.20
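Assuming the two sketches above are in scope, the whole worked example can be reproduced as follows (equal priorities of 1 are used, as in the example):

    av_kanjinai = 1.0 - 0.91                              # eq. (4.1) -> 0.09
    av_anmari = modify_by_adverb(av_kanjinai, 0.39)       # eqs. (4.3)/(4.4) -> 0.34
    z = total_affirmative_intention([0.06, av_anmari], [1, 1])
    print(round(z, 2))                                    # -> 0.20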

4.8 Application of the Affirmative/Negative Intention Analyzing Method

We applied this affirmative/negative intention analyzing method to the Web-based analytical system described in Section 4.2.

This system consists of a server system and client systems connected via the Internet. The main process of the "Web-based analytical system" works on the server. The clients supply a natural language interface and a display that shows facial images and the results of the "Web-based analytical system." The natural language interface should decrease the user's stress in using a computer, because the user does not need to deal with a keyboard and a mouse.

First, the system gives the user a question about QOL and obtains his/her response through the natural language interface. The server receives the response and performs morphological analysis and parsing. The analyzing results are sent to the "emotion generating domain," the "intention analyzing domain," and the "response generating domain."

The system applies the EGC method to the user's utterance and calculates some emotions from the user's viewpoint. The emotions are used in the "facial expression selecting domain" in order to select an appropriate facial expression from the user's face data. The generated facial image is displayed on the client's display. The process for generating the facial image is described in Chapter 5.

The "intention analyzing domain" calculates the user's affirmative/negative intention toward the question from his/her utterances. When the system obtains an ambiguous intention, it continues the conversation in order to clarify the user's intention, because the method can deal with plural utterances.

Therefore, the "response generating domain" generates appropriate responses based on the grammatical features of the user's utterance in order to continue the conversation. We also propose a method to increase the variation of response expressions and develop techniques to select an appropriate response expression, in order to make it easier to talk with a computer. We deal with three kinds of response types: "simple response," "repeating," and "showing hearer's interest." A "simple response" can be used at any break between phrases. "Repeating" is mainly used just after the question to confirm the user's utterance. "Showing hearer's interest" is used to show that the system understands the user's intention; the system makes a response expression based on the content of the user's utterance and responds with it [70].

The system collects the affirmative/negative intentions for all the QOL questions through the conversation. These results are used to evaluate the analytical results and the health counseling comment.

Figure 4.10 Overview of the extended Web-based analytical system of health service

[Block diagram showing the server (question list, asking domain, analyzing domain, intention analyzing domain, emotion generating domain, response generating domain, QOL evaluating domain, facial expression selecting domain, facial expression database, QOL result) and the client (natural language interface, display).]


Figure 4.11 Interface tool for the "Web-based system of health service needs among healthy elderly"

4.9 Experimental Result

We applied our method to a corpus of 50 dialogues (10 Q&A pairs for each of 5 people) and evaluated the results using a questionnaire with seven-grade evaluations. However, when a subject replies with an ambiguous response, asking him/her to confirm his/her true intention wastes time, because he/she does not make his/her intention clear and cannot determine a unique intention. We therefore used 32 different subjects (university students; male: 23, female: 9) to evaluate the intentions objectively.

Table 4.10 shows the comparison between the average of the affirmative values that the users reported in the questionnaire and the affirmative values that our system calculates. A5 in Table 4.10(b) is the result for the example in Section 4.7. We consider that the system calculates adequate values because both values are similar.

Table 4.10 Comparison between the average of the affirmative value that the users replied and the affirmative value that our system calculated

Table 4.10 (a) Comparison results where 90% or more of the subjects selected the same intention

Sample No.   Average value of questionnaire's result (X)   Affirmative value that system calculates (Y)   |X - Y|
A1    0.974   0.940   0.034
A4    0.120   0.090   0.030
A6    0.780   0.754   0.034
A8    0.729   0.500   0.229
A9    0.964   0.970   0.006
A10   0.953   0.940   0.013
B1    0.963   0.910   0.053
B2    1.000   0.925   0.075
B3    0.816   0.910   0.094
B9    1.000   0.925   0.075
B10   0.906   0.850   0.056
C1    1.000   0.925   0.075
C2    0.031   0.455   0.424
C3    0.031   0.060   0.029
C4    0.906   0.910   0.004
C5    0.031   0.060   0.029
C7    0.990   0.925   0.065
C8    0.837   0.910   0.073
C9    1.000   0.940   0.060
C10   0.995   0.940   0.055
D1    0.774   0.940   0.055
D3    0.905   0.880   0.025
D5    0.078   0.060   0.018
D7    0.990   0.910   0.080
D9    0.979   0.955   0.024
D10   0.835   0.754   0.081
E4    0.885   0.920   0.035
E6    0.844   0.940   0.094
E8    0.905   0.820   0.085

Table 4.10 (b) Comparison results where 80%-90% of the subjects selected the same intention

Sample No.   Average value of questionnaire's result (X)   Affirmative value that system calculates (Y)   |X - Y|
A5    0.252   0.200   0.052
B5    0.303   0.340   0.037
D4    0.688   0.771   0.083
D6    0.678   0.533   0.145
E1    0.141   0.820   0.679
E3    0.723   0.607   0.116

Table 4.10 (c) Comparison results where 80% or less of the subjects selected the same intention

Sample No.   Average value of questionnaire's result (X)   Affirmative value that system calculates (Y)   |X - Y|
A2    0.395   0.910   0.515
A3    0.823   0.625   0.198
A7    0.989   0.940   0.049
B4    0.376   0.395   0.019
B6    0.637   0.832   0.195
B7    0.647   0.754   0.107
B8    0.516   0.455   0.061
C6    0.568   0.170   0.398
D2    0.583   0.535   0.048
D8    0.568   0.607   0.039
E2    0.569   0.925   0.356
E5    0.376   0.090   0.286
E7    0.749   0.500   0.249
E9    0.683   0.500   0.183
E10   0.661   0.500   0.161

4.10 Future Works

We compared the average of the affirmative values and the affirmative values that our system calculated for all samples, as shown in Table 4.11. We consider that there is not much difference, because the average difference is 0.138 and the standard deviation is 0.185. The sample with the maximum difference (0.754) is caused by a failure of the morphological analysis. We extracted the samples in which 80% of the subjects selected the same intention (affirmation / negation / neither affirmation nor negation) and regarded them as standard samples. We did not regard as standard the samples that did not reach a consensus, that is, those in which 20% or more of the subjects objected to the majority opinion. Table 4.12 shows some of the examples in which 80% of the subjects selected the same intention, and Table 4.13 shows examples in which the subjects supposed various situations.

Table 4.11 Differences between the average value of the questionnaire and the system output

Average of differences   Standard deviation   Maximum difference   Minimum difference
0.138                    0.185                0.754                0.004

Table 4.12 Examples in which 80% of the subjects selected the same intention

Q: "BASU YA DENSHA WO TUKATTE HITORI DE GAISHUTU DEKIMASUKA?" (Can you go out using a bus or train by yourself?)

A: “HAI.” (Yes.)

Q: “KODOKUKAN WO KANJI MASUKA?” (Do you feel any loneliness?)

A: “ANMARI NAIDESUNE.” (Not so much.)

Q: “IRAIRA SURU KOTO WA ARIMASUKA?” (Have you sometimes been irritated?)

A: “MAA, SOU IRAIRA YUUKOTOMO NAIDESUGANE, YAPPARI, SORYA KINKYUU NO BAAI

NIWANE, SUKOSHI WA ARIMASUYO.” (Well, I haven’t been irritated so often, but, of course,

when there is a problem, I feel a little irritated.)


Table 4.13 Examples in which the subjects supposed various situations

Q: "HITO NO NAMAE YA KOTOBA GA SUGU NI DETEKONAI KOTO GA ARIMASUKA?" (Are there any situations where you can't remember a person's name or a word?)

A: “METTANI NAIDESUGANE. TOKINIWA ARYAA, DOUIU HITO DATTAKANA TO OMOU

KOTO WA ARIMASUYO.” (No, rarely. But sometimes, I think “who is he?”)

Q: “NENKIN NADO NO SHORUI WO HITORI DE KAKEMASUKA?” (Can you fill out pension

forms and so on?)

A: “MAA IMAMADEWA YATTEMASHITAGANE. KONDO KARAWA DOUNARUYARA.

YOMESAN NI TANOMU YARA, WAKARANAI DESU.” (Well, I have done it. How will I do it

next time? Will I ask my wife? I don’t know.)

Q: “KODOKUKAN WO KANJI MASUKA?” (Do you feel any loneliness?)

A: “U-N, BETSU NI FUJIYUU WO KANJIMASEN GANE. YOMESAN MAKASE DE SHITE

KURERUKOTO. EE. IIKKOU NI YATTE KUREMASUYO.” (Well, I don’t feel any

inconvenience. I leave everything to my wife. Yes, she does everything well.)

Figure 4.12 is a scatter plot for all samples, and Table 4.14 gives the correct rate for the standard samples. We regarded a sample as correct when the difference between the questionnaire's result and the system output was under 0.1. 82.8% of the standard samples are correct. We consider that our system can analyze the speaker's affirmative/negative intention for the standard samples, because the correlation coefficient in Figure 4.12 is 0.822.
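The two evaluation criteria (the 0.1 correctness threshold and the correlation coefficient) can be computed with a short script. The sketch below assumes the (X, Y) pairs have been loaded from Table 4.10; only five pairs from Table 4.10(a) are shown here for brevity:

    import numpy as np

    # (X, Y) pairs: questionnaire average vs. system output (Table 4.10(a), excerpt)
    X = np.array([0.974, 0.120, 0.780, 0.729, 0.964])
    Y = np.array([0.940, 0.090, 0.754, 0.500, 0.970])

    correct_rate = np.mean(np.abs(X - Y) < 0.1) * 100   # "correct" if |X - Y| < 0.1
    correlation = np.corrcoef(X, Y)[0, 1]               # Pearson correlation
    print(f"correct rate: {correct_rate:.1f}%, correlation: {correlation:.3f}")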

When the samples in which the subjects supposed various situations are included, the correct rate decreases to 68.0% and the correlation coefficient is 0.736. We analyzed the 16 incorrect samples. Six of them are caused by gaps in the interjection data, the adverb data, and the question data, and by not considering individual variation, the gender gap, the generation gap, and so on. Eight of them need a process of meaning analysis and reasoning. One sample is a speaker asking himself "How will I do it?"; the system misunderstood it as an intentional response to the question. A similar misunderstanding occurs for utterances whose content does not relate to the question. To solve these problems, we have to add other types of affirmative/negative elements, a reasoning process, and rules to exclude utterances that do not relate to the affirmation/negation of the question.


Figure 4.12 Scatter plot diagram for all samples (horizontal axis: average value of the questionnaire's result; vertical axis: affirmative value that the system calculates; the plot distinguishes examples where 80% of the subjects selected the same intention from examples where the subjects supposed various situations)

Table 4.14 Correct rate in the examples where 80% of the subjects selected the same intention

Number of samples   Samples where the difference between questionnaire's result and system output is under 0.1   Correct rate (%)
35                  29                                                                                           82.8

4.11 Conclusion

In this chapter, we proposed a method to analyze the user's affirmative/negative intention from his/her utterances in a dialogue. First, we extracted elements based on the surface structures of the responses and a concept of the question, and calculated an affirmative/negative value corresponding to the degree of affirmation/negation. We defined three types of elements: "affirmative/negative description for the yes-no question," "direct expression of intention in the response," and "indirect expression of intention in the response." Furthermore, we defined functions that calculate changes of the affirmative value according to the aspects of verbs and adverbs. Finally, we calculated the total affirmative/negative intention toward a question in the dialogue.

To verify the validity of this method, we applied it to a corpus of 50 dialogues and evaluated the results using a questionnaire. According to the comparison between the questionnaire's results and the system's output, we consider that the system calculates adequate values.

This method is effective in a web-based system; therefore, we applied it to the interface of the "Web-based analytical system of the Health Service for the Elderly." The interface system can analyze the user's affirmative/negative intention in his/her utterance even if he/she is


separate from the system.

With this method we can guess the user's intention toward a yes-no question; furthermore, the interface system can converse with the user while considering his/her intention.

To apply our method beyond yes-no questions, analyzing the intention of wh-questions and extracting spontaneous requests are needed. In order to realize such functions, we are going to extend our proposed method.


CHAPTER 5 EMOTION ORIENTED INTERACTION SYSTEMS — FACEMAIL & FACECHAT —

We developed application software that can analyze the user's emotions and represent those emotions with facial expressions. The application requires two major functions: one is analyzing emotion from sentences, and the other is displaying the emotional faces. The emotion analyzing part relies on the EGC method described in Chapters 2 and 3.

In this chapter, we propose a method to generate the user's facial expressions based on the emotions extracted by the EGC method. This method uses a "sand glass type neural network" trained on real facial images. First, we classify the emotions for the facial expressions as "happiness," "sadness," "disgust," "anger," "fear," and "surprise," as proposed by Ekman [44]. By training the neural network on such types of facial expressions, each emotion is partitioned on the two-dimensional emotional space constructed from the outputs of the third layer of the neural network. In order to employ the emotional space, we assign the EGC output (20 kinds of emotions) to the input of the two-dimensional emotional space (6 kinds of emotions) as described in Section 5.2.2. Next, one point on the two-dimensional emotional space is determined from the assigned emotions.

We applied this method to mail software and a chat system. The mail software (JavaFaceMail) calculates the emotions from the content of the mail, generates a facial expression image of the sender, and sends the mail with the facial image. The chat system (JavaFaceChat) also generates a facial expression image, like JavaFaceMail. Furthermore, it analyzes the variances of emotions for each user and invites two users to a new closed chat room when their tendencies of variance are alike.

5.1 Facial Expression Generating Method by Parallel Sand Glass Type Neural Network

5.1.1 Sand Glass Type Neural Network

There are several studies that clarify the relationship between emotions and the features of facial expressions by analyzing those features [71, 72]. In particular, some researchers employ neural networks to relate facial expressions to emotions. One of the effective methods classifies facial expressions based on emotions using the sand glass type neural network shown in Figure 5.1 [73].

The sand glass type neural network is a kind of hierarchical neural network. Its characteristic features are that the number of input neurons equals the number of output neurons, and that the number of neurons in the middle layer is much smaller than the number of input and output neurons. Back propagation (BP) learning is employed to train the network with teaching signals so that it outputs the same patterns as the input patterns. When the training is finished, this method can extract the features of the input data in the middle layer [73].

Figure 5.1 Overview of sand glass type neural network

However, the standard sand glass type neural network learns the facial expressions of only one person, because it is difficult to deal with multiple data sets simultaneously on the network. Therefore, an extended sand glass type neural network was proposed. It connects two five-layer neural networks at the third layer in order to deal with different data sets simultaneously, as shown in Figure 5.2 [74]. The data are inputted to the connected networks n (n = 1, ..., N) simultaneously, and the network is trained to output the same patterns as the input patterns. After the training, the features of the input data are compressed and integrated at the third layer.


Figure 5.2 Overview of the extended sand glass type neural network

Ichimura constructed a "two-dimensional emotional space" by learning the facial expressions of two people (a man and a woman) using the extended sand glass type neural networks [75]. Although this model learned the features of the two people's facial expressions, it could not classify three or more people's facial expressions based on each emotion. Therefore, we propose a parallel sand glass type neural network, which connects N five-layer neural networks at the third layer as shown in Figure 5.3 [45]. This network can deal with N kinds of data simultaneously.


Figure 5.3 Overview of parallel sand glass type neural network

BP learning is employed to train the network. However, in the third layer and the fifth layer, we apply a linear function instead of a sigmoid function as the activation function, and we do not use threshold values θ, so as to represent the prominent weights of the incoming links. In this paper, we use eq. (1) as the sigmoid function and eq. (2) as the linear function:

f(x) = 1 / (1 + exp(-x))                                          (1)

g(x) = x                                                          (2)

Let ω_21 be the weight matrix between the first layer and the second layer; the output activation of the second layer, x_2, is

x_2 = f(ω_21 x_1 + θ_21),                                         (3)

where x_1 is the output activation vector of the first layer. In the third layer, we use the following function:

x_3 = g(ω_32 x_2)                                                 (4)

In a similar way, we use the following functions in the fourth layer and the fifth layer:

x_4 = f(ω_43 x_3 + θ_43)                                          (5)

x_5 = g(ω_54 x_4)                                                 (6)
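The forward computation of eqs. (1) through (6) for one branch of the network can be sketched as follows; the weights here are random placeholders and the BP training loop is omitted:

    import numpy as np

    def f(x):                       # eq. (1): sigmoid
        return 1.0 / (1.0 + np.exp(-x))

    def g(x):                       # eq. (2): linear (3rd and 5th layers)
        return x

    rng = np.random.default_rng(0)  # untrained placeholder weights
    w21, th21 = rng.normal(size=(40, 255)), rng.normal(size=40)
    w32 = rng.normal(size=(2, 40))
    w43, th43 = rng.normal(size=(40, 2)), rng.normal(size=40)
    w54 = rng.normal(size=(255, 40))

    def forward(x1):
        x2 = f(w21 @ x1 + th21)     # eq. (3)
        x3 = g(w32 @ x2)            # eq. (4): point on the 2-D emotional space
        x4 = f(w43 @ x3 + th43)     # eq. (5)
        x5 = g(w54 @ x4)            # eq. (6): reconstructed face features
        return x3, x5

    point, reconstruction = forward(rng.normal(size=255))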

[Figure 5.3 layout: N five-layer networks, one per person (Person 1, ..., Person N), each with an input layer of 255 neurons, a second layer of 40 neurons, a fourth layer of 40 neurons, and an output layer of 255 neurons, all joined at a shared third layer of 2 neurons.]

5.1.2 Facial Training Data

We use the emotional facial expressions of several people as teaching signals. For each individual, there are six basic kinds of emotional faces, "happiness," "sadness," "disgust," "anger," "fear," and "surprise," plus a neutral one [44, 76]. Two facial images are prepared for each emotion; therefore, we have 13 pictures per individual.

To normalize the facial images in position and to use the size of internal facial features as reference, we use an affine transformation [77] to extract normalized target images. First, we determine three reference points, E_r, E_l, and M, as the center points of the regions that correspond to the two eyes and the mouth, as shown in Figure 5.4. We define the parameters shown in Figure 5.4 as c_1 = c_2 = 0.8d, c_3 = 0.4d, and c_4 = 1.2d. We then obtain a standard window of 128 by 128 pixels to form the target images. Figure 5.5 shows the six kinds of emotional basic faces and the neutral face of one subject. These pictures are transformed into 8-bit gray-scale format.

Figure 5.4 A target image defined by three points

Figure 5.5 Sample of the six emotional basic faces (anger, disgust, fear, happiness, sadness, surprise) and a neutral face

Next, we must convert these images into the frequency domain as training data. As the transformation technique, we use the two-dimensional DCT, a well-known method in digital signal processing. The variation of the facial image is reflected directly in the frequency domain, and most of the energy is concentrated in its low-frequency part [78]. We use the low-frequency part, shown by the gray square in Figure 5.6, as the training data transformed by the 2D-DCT.

Figure 5.6 Low frequency region transformed by 2D-DCT
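A sketch of this preprocessing step is given below; it assumes a normalized 128 x 128 gray-scale face image, and the size of the retained low-frequency block (16 x 16 coefficients) is our assumption, since the thesis only specifies "the low frequency part":

    import numpy as np
    from scipy.fft import dctn

    face = np.random.randint(0, 256, size=(128, 128)).astype(float)  # placeholder image
    coefficients = dctn(face, norm="ortho")      # 2D-DCT of the whole image
    low_freq = coefficients[:16, :16].ravel()    # keep the low-frequency block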


5.1.3 Learning Experimental Results

5.1.3.1 Ekman's emotion circle

Ekman et al. described the relationship between emotions and facial images in their detailed examinations [79]. They described how we can judge six basic emotions, that is, "happiness," "sadness," "disgust," "anger," "fear," and "surprise," from facial expressions, as shown in Figure 5.7. Moreover, Schlosberg et al. stated that there is an emotional circle with the order "Love-Surprise-Fear-Anger-Dislike-Contempt." Based on these two ideas, there have been many studies of the six basic emotions ordered in the emotional circle. They reported that the emotions' boundaries are opaque, but that the separated emotions in the emotional circle are clearly distinguishable [79].

We construct an emotional circle with facial expressions by the parallel sand glass type neural network described in Section 5.1.1. The network is trained on some people's facial expressions, and we extract a feature of each emotion's facial expression in the third layer of the trained network.

Figure 5.7 Emotion circle (happiness, sadness, disgust, anger, fear, surprise)

5.1.3.2 Learning results using four people's emotional faces

We prepared four people's facial images and trained a "sand glass type neural network." Each network is trained on the target images of one individual so as to display the emotional information in the third layer. If the number of networks is less than the number of people, the information in the third layer reflects the person's facial characteristics. Therefore, we used a "sand glass type neural network" consisting of four neural networks, each connected to the others by two neurons in the third layer. We consider that the information in the output activities of the two neurons is represented in the two-dimensional emotional space. This emotional space, formed by the behaviors of the neurons' output activities, is easy to interpret.

Figure 5.8 shows the error convergence of this network. 50,996 iterations were needed until the mean square error converged below 0.01. The network represents prominent characteristics of the emotions by the output activities in the third layer; accordingly, eq. (4) and eq. (6) are applied without the sigmoid function.

After training the network, we investigated the output activities in the third layer. Figure 5.9 shows the neuron output activities in the third layer. It represents the distributions of the emotions for the facial images on the two-dimensional emotional space: the horizontal axis shows the output activity of the first neuron in the third layer, and the vertical axis shows the output activity of the second neuron. We define this two-dimensional map based on the activities of the third layer as the "two-dimensional emotional space."

The area of each emotion shown in Figure 5.9 is the group of points where the error at the output layer is less than 0.25 x N, when the emotional space is partitioned into a 20 x 20 grid and the center point of each grid cell is given as the input to the fourth layer.

Figure 5.9 indicates that each emotion is separated on the two-dimensional space and distributed like a circle, as in Figure 5.7. This figure is almost equal to the circumplex model of emotions by Russell [79]. However, the "disgust" area is distributed near the "fear" area, though it should be distributed between the "sadness" and "anger" areas. The reason is that the subjects were perplexed about making different facial expressions for "disgust" and "fear." In psychology, Ekman explained that the facial expressions of "disgust" and "fear" are more difficult to distinguish than any other facial expressions [44]. We obtained almost the same result from the emotional space constructed at the third layer using the experimental facial expressions. Moreover, the facial images obtained from the output activities of the network were almost the same as the experimental facial images. From these results, we confirmed that not only the teaching data but also other data can be restored as facial images.

Furthermore, the network outputs 400 facial images when we input the 400 points allocated to the grid cells shown in Figure 5.9. This indicates that a facial image can be created for any point, and that points which do not belong to any emotion area also have corresponding facial images.


We plotted the facial images at the positions of the corresponding points on the emotional space, as shown in Figure 5.10. In this figure, we can see that the emotional facial expressions appear in almost the same areas as in the emotional space of Figure 5.9. Furthermore, we can confirm that the intermediate facial expressions among the inputted facial images are complemented, similar to the way a complex emotion is composed of some basic emotions. As a result, we can confirm that the emotional space is constructed at the third layer of the parallel sand glass type neural network so that each emotion is distributed sequentially and in a constant order.

Figure 5.8 Error convergence of sand glass type neural network


Figure 5.9 Neuron Activities in the third layer


Figure 5.10 Relation map between emotional space and output activity

5.1.3.3 Relationship between the number of networks

We investigate the adequate number N of training networks required to correspond with the order of the emotion circle that Ekman proposed.

Figure 5.11 shows the emotional spaces when the number N of parallel sand glass type neural networks is changed from 1 to 7. Figure 5.11(a) is the emotional space when the number of connected networks is N = 1. In this case, the emotional space is not constructed adequately, because the areas of "anger" and "sadness" overlap widely and the order of the emotion areas differs from the standard one. Figure 5.11(b) is the emotional space when N = 2. The distribution of the "disgust" area differs from the case of N = 1, but the order of the emotion areas is still different from the standard order. In the case of N = 3, the distribution of the emotion areas is closer to the standard one than in the former cases; however, the "disgust" area and the "neutral" area overlap. When the connected number N is 4 or more, the emotional spaces are as shown in Figure 5.11(d), (e), (f), and (g): each emotion area is distributed radially from the "fear" area, and the orders of the emotion areas are all the same. From these results, the constructed emotional spaces do not differ much among the cases where four or more people's facial images are used. We therefore consider that at least four subjects are needed to construct an adequate standard emotional space.

(a) Number of connected networks N = 1 (b) Number of connected networks N = 2

(c) Number of connected networks N = 3 (d) Number of connected networks N = 4


(e) Number of connected networks N = 5 (f) Number of connected networks N = 6

(g) Number of connected networks N = 7

Figure 5.11 Two-dimensional emotional spaces at each number of connected networks


5.2 JavaFaceMail

We expect to build a computer equipped with human-like interfaces that enables us to have easy conversations, including greetings and so on. Besides verbal messages, human face-to-face communication includes nonverbal messages such as facial expressions, vocal inflection, speaking rate, pauses, hand gestures, body movements, posture, and so on. Therefore, we developed new mail software representing facial expressions, called JavaFaceMail, a new kind of computer interaction tool [80]. The current version of JavaFaceMail is 1.0.1b, available on the web site [81].

5.2.1 System Overview

This mail software can analyze and express the emotions that the user would generate from the content of an E-mail message. Figure 5.12 shows an overview of the system. An E-mail message is inputted using the interface on the client system and sent to the server through the network. The server has six processes: a mail receiving process, a morphological analyzing and parsing process, a case frame extracting process, the EGC process, a facial expression selecting process, and a mail sending process. The client system has not only the functions of general E-mail tools but also a function for displaying the facial image.

Figure 5.12 Overview of JavaFaceMail

[Block diagram: receive E-mail from user -> morphological analysis and parsing (dictionary database) -> case frame extracting -> emotion generating calculations (favorite value database) -> facial expression by neural network -> send E-mail to user.]

5.2.1.1 Behavior of Server

The morphological analyzing and parsing process analyzes each sentence in the mail morphologically and parses it. We used JUMAN as the morphological analyzer and KNP as the parser, as described in Section 2.3.3. The case frame representations are then made from the result of KNP, a process also described in Section 2.3.3.

Next, the EGC method described in Chapters 2 and 3 is applied to the case frame representations in the EGC process. This process sometimes generates no emotions, namely when there is no liked/disliked object in the sentence or the event type of the predicate is not registered. We do not consider that such sentences have any effect on the overall emotion toward the mail content.

Although the EGC method outputs 20 kinds of emotions, the facial expression is selected based on the six kinds of emotions described in Section 5.1. Therefore, we must define "assign rules" from the emotion types output by the EGC to the facial emotion types, and we must determine one point on the emotional space that represents the compound emotional facial expression. Section 5.2.2 presents the emotion assigning rules and the point determining method.

The server delivers the received mail, together with the emotion analyzing result, to the destination address. When the sender's facial images have already been registered on the server, the server attaches the facial image output by the trained neural network to the mail. When the sender has not registered his/her facial images on the server, the server attaches a default facial picture as shown in Figure 5.13. The sender can select the type of facial image when he/she sends an E-mail message.

E-mail data consist of two parts: a "header," which carries information such as the sending route and the name of the mail software, and a "body," which includes the inputted sentences. This system attaches the type of facial image and the analyzing result to the header area. Accordingly, in order to display the facial images, the user has to use the special mail software, JavaFaceMail, described in the next section.
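As an illustration only, the header-based transport could look like the following sketch; the field names X-FaceType and X-EmotionPoint are hypothetical, since the thesis does not specify the header format:

    from email.message import EmailMessage

    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = "receiver@example.com"
    msg["Subject"] = "Hello"
    msg["X-FaceType"] = "Takeshi"         # hypothetical: registered face type
    msg["X-EmotionPoint"] = "0.62,0.41"   # hypothetical: point on the emotional space
    msg.set_content("The inputted sentences go in the body.")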


(a) Surprise (b) Anger (c) Fear

(d) Happiness (e) Disgust (f) Sadness

Figure 5.13 (a) Example of original six face types (Takeshi)

(a) Surprise (b) Anger (c) Fear

(d) Happiness (e) Disgust (f) Sadness

Figure 5.13 (b) Example of original six face types (Mika)


5.2.1.2 Behavior of Client

To display the facial images, special mail software with functions to display them is needed. We named this mail software JavaFaceMail; it is developed with Java Swing. Java is an object-oriented language, and the software can be executed on any OS where JDK 1.3 [82] works.

When the user starts JavaFaceMail, a menu window appears as shown in Figure 5.14. When the user uses this system for the first time, he/she has to register his/her information with the JavaFaceMail server, because the server distinguishes the face type from the user's mail address and retrieves the facial images. The information about face images is managed using PostgreSQL, a kind of database software. All the server software can be installed on a standard UNIX OS.

Next, the user has to configure the client system. Because JavaFaceMail has the usual mail functions via the POP and SMTP protocols, the user registers an FQDN (Fully Qualified Domain Name) for each server, as with standard mail software. Furthermore, the user registers his/her name, password, the location of the mail spool, and the face type previously registered with the server, as shown in Figure 5.15. Although the face type of the sender could be retrieved from the E-mail address and the registered face type, the "From" information in a mail message can be changed freely in mail software; therefore, the system requires the face type explicitly.

The mail software also has an address book, a tool to convert the sender's name into the mail address, as shown in Figure 5.16. Although the items for the face type do not appear in the figure, they exist in the address book. This address book can receive information from the server directly.

Figure 5.17 shows a window for sending E-mail. The server cannot analyze mail sent from other mail software, because the information needed for generating the facial image is lacking.

Received mail carries the face type and the value indicating the emotion in its header area. The client system displays an adequate facial image using this information. However, the facial image itself is not attached to the mail; it is downloaded from the server directly. All downloaded images are accumulated on the client machine. Figure 5.18 shows the analyzing result for the sentences shown in Figure 5.17.


Figure 5.14 Menu window of JavaFaceMail

Figure 5.15 Configuration window of JavaFaceMail

Figure 5.16 Address book in JavaFaceMail


Figure 5.17 Mail sender part in JavaFaceMail


Figure 5.18 Mail reader part in JavaFaceMail


5.2.2 Assign Rules to the Facial Expressions

We define assign rules from the emotion types output by the EGC to the facial emotion types, as shown in Table 5.1.

First, the emotion "Fear" appears in both the EGC output and the facial emotion types; therefore, the emotion "Fear" is equated with the facial emotion type "Fear." In the same way, the emotion "Anger" appears in both the EGC output and the facial emotion types. "Resentment" and "Reproach" are also assigned to the "Anger" group, because these emotions indicate aggression toward someone. Next, the emotions aroused by "pleasure" (Joy, Happy-for, Gloating, Hope, Relief, and Satisfaction) correspond to the facial emotion type "Happiness." However, we only assign "Joy," "Happy-for," and "Gloating" to the "Happiness" group, because the concept of "Joy" embraces those of "Hope," "Relief," and "Satisfaction." In the same way, the emotions "Distress," "Sorry-for," and "Resentment" from the EGC correspond to the facial emotion type "Sadness." The emotion "Surprise" arises when something happens unexpectedly or without warning; emotions such as "Relief" and "Disappointment" are assigned to the "Surprise" group, because they are generated when a prospective event is not confirmed. "Disgust" is caused when the situation is completely unacceptable. There are four types of emotions (Reproach, Anger, Shame, and Remorse) relating to "unacceptable" in the EGC output. We assign "Shame" and "Remorse" to the "Disgust" group, because "Reproach" and "Anger" are already assigned to the "Anger" group.

Each EV is added to the corresponding value attached to each facial emotion type. Each facial expression's value from the EGC is calculated from the EVs present in the whole mail. Furthermore, some special words (e.g. "happy," "die," etc.) affect the facial expression without any emotion calculation, as shown in Section 3.4; we increase each facial expression's degree based on the number of these words.

Table 5.1 Assign rules to the facial expressions

Facial expression   Emotion by the EGC method
Fear                Fear
Anger               Anger, Resentment, Reproach
Happiness           Joy, Happy-for, Gloating
Sadness             Distress, Sorry-for, Resentment
Surprise            Relief, Disappointment
Disgust             Shame, Remorse

Next, a point on the emotional space is determined from these six emotions in the content of the mail. These values are inputted to the two neurons in the third layer of the trained sand glass type neural network.

We define the points that output the facial image teaching data of each emotion as the centers of the emotion types on the emotional space. Successively, the sums of the emotion values for each emotion are set on the emotional space as vectors, and their center of gravity is calculated as shown in Figure 5.19. The network then produces the facial expression for the given input emotion types and values. This facial image indicates the facial expression for the whole mail content.

However, this method sometimes generates a neutral facial expression when there are two conflicting emotions (e.g. happiness and sadness). In this study, we do not consider such conflicting situations, because there are many variations of reactions to a conflicting situation [63] and the facial expressions appear according to those reactions.

Figure 5.19 Center of each emotion vector
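The point determination can be sketched as a weighted center of gravity. In the following minimal Python sketch, the six center coordinates are placeholders; in the system they are the third-layer points that reproduce each emotion's teaching image:

    import numpy as np

    CENTERS = {                      # placeholder coordinates on the emotional space
        "happiness": (0.8, 0.2), "sadness": (0.2, 0.8), "disgust": (0.3, 0.3),
        "anger": (0.7, 0.7), "fear": (0.4, 0.5), "surprise": (0.6, 0.4),
    }

    def emotion_point(emotion_values):
        """Weight each emotion's center by its summed emotion value and return
        the center of gravity, i.e. the input to the third layer."""
        total = sum(emotion_values.values())
        if total == 0.0:
            return np.array([0.5, 0.5])          # assumed neutral point
        return sum(v * np.array(CENTERS[e])
                   for e, v in emotion_values.items()) / total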

5.2.3 Mental Effects of Outputting Facial Expressions

We evaluated the mental effects of displaying facial expressions with our method using the following questionnaire. The subjects read an E-mail message in three types of mail windows: (A) displaying only the text, (B) inserting face marks for each sentence, and (C) displaying the facial expression image with the text. They then evaluated the three methods from the following viewpoints on five-grade scales:

1. It gives you pleasure to read the mail.

2. It helps to communicate the sender's emotion well to you.

3. You feel familiar with the output of the mail software.

The subjects were 37 university students (men: 34, women: 3). We presented Figures 5.20(a), 5.20(b), and 5.18 as conditions (A), (B), and (C), respectively.

Table 5.2 shows the average ratings and the standard deviations for each item. One-factor ANOVA showed significant differences at the 1% level for each item.

For the first item, "It gives you pleasure to read the mail," the average ratings of (B) and (C) were significantly higher than that of (A), and there was no significant difference between (B) and (C) by the Tukey test. This indicates that displaying facial expressions with JavaFaceMail increases the fun of reading E-mail messages, similarly to inserting face marks.

For the second item, "It helps to communicate the sender's emotion well to you," the average rating of (B) was significantly higher than (A) (p < 0.01), and the average rating of (C) was significantly higher than (A) (p < 0.05). This indicates that both (B) and (C) are effective, but that inserting face marks is more effective than JavaFaceMail. A possible explanation is that face marks are inserted for each sentence in an E-mail message, whereas JavaFaceMail displays only one facial expression for the total content of an E-mail. Therefore, the displayed facial expression can be neutralized if a number of emotions are aroused simultaneously in one E-mail. This is no problem when the content of the E-mail is short. However, we have to improve our method for analyzing E-mail with longer content, for example by displaying a facial expression for each paragraph that arouses the same emotion, or by displaying the facial expression just after each sentence that arouses an emotion.

For the third item, "You feel familiar with the output of the mail software," there were significant differences among all the conditions. The average rating of (C) was the highest, and (B) was higher than (A). For the effect of familiarity, JavaFaceMail was significantly more effective than the face marks. This indicates that the JavaFaceMail system is effective for introducing face-to-face communication into human-computer interaction.

Table 5.2 Average ratings and standard deviations for each displaying method

Item                                                           (A) Only sentences   (B) With face marks   (C) With facial expression   One-factor ANOVA
1. It gives you pleasure to read the mail.                     2.57 (0.83)          3.57 (1.01)           3.51 (0.93)                  p < 0.01
2. It helps to communicate the sender's emotion well to you.   2.86 (0.98)          3.81 (0.94)           3.49 (0.96)                  p < 0.01
3. You feel familiar with the output of the mail software.     1.81 (1.00)          3.22 (1.25)           3.95 (1.18)                  p < 0.01

5.3 JavaFaceChat

We created a "chat system" called JavaFaceChat, which represents facial expressions corresponding to the user's emotions. "Chat rooms" are among the most popular communication tools used on the Internet. "Chat rooms" extended to Internet TV phones have become popular because connections such as xDSL are very fast and inexpensive. Mobile phones now allow us to communicate via sound and sight, through the use of images. Although we are surprised at the progress of such technology, we are still unsatisfied with its sophistication.

Figure 5.21 Overview of JavaFaceChat System

JavaFaceChat supplies the usual functions of "chat rooms." Figure 5.21 shows an overview of our developed JavaFaceChat. We apply the EGC method to analyze all messages, and all the generated emotion types are concentrated into six kinds of emotions. The system on the server calculates the facial image from the six emotion types and values as described in Section 5.2.2.


This facial image indicates a facial expression for one utterance by each user, respectively. The system sends the sentence and a facial image to all users upon receiving every new message. The JavaFaceChat server has plural agents. An agent receives a message from a user and sends the message and the face type determined using the EGC method to the other users. Each agent has the user's facial images for the six kinds of emotions.

The procedure of JavaFaceChat when a user 'A' types something is described as follows. The JavaFaceChat system analyzes the other users' sentences based on one's own thoughts and feelings, and guesses the variances of the others' intentions and emotions, like a person who uses a standard chat room. (A sketch of the server-side similarity check follows this list.)

- Start system
1. Download the six types of emotional facial images for each chat participant when a chat session starts.

- Processes at the mobile terminal of 'A'
1. Send an input sentence to the server.
2. Calculate the Emotion Value E_A by applying the EGC method to the sentence, and send E_A to the server.
3. Generate one facial expression image based on the EGC result, as described in Section 5.2.2.
4. Display the sentence and the facial image of 'A' on the display of A's terminal.

- Processes at the mobile terminals other than A's
1. Receive A's sentence from the server.
2. Calculate the Emotion Value by applying the EGC method to the sentence. The Favorite Value Database accumulated in each terminal is employed for the EGC method.
3. Generate one facial expression image based on the EGC result, as described in Section 5.2.2.
4. Display the sentence and the facial image of 'A' on the display of each terminal.

- Processes at the server
1. Send the sentence from 'A' to all the chat participants.
2. Check the variances of the output emotions that are sent from all the chat participants. When there are some variances, the server calculates their norms and checks whether the difference is less than a threshold.
3. Ask participants 1 and 2 whether they want to go to another chat room when |E_1 - E_2| <= θ is satisfied, where E_1 is the norm of the emotions generated by participant 1, E_2 is the norm of the emotions generated by participant 2, and θ is the threshold.
4. Create a new chat room if both participants agree. Information about the generated emotions is duplicated into the new chat room.
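A minimal sketch of the server-side check in steps 2 and 3 follows; the emotion vectors and the threshold value are illustrative assumptions:

    import numpy as np

    THRESHOLD = 0.5                  # hypothetical threshold value

    def should_invite(emotions_1, emotions_2):
        """Invite two participants to a closed room when the norms of their
        emotion vectors differ by no more than the threshold."""
        e1 = np.linalg.norm(emotions_1)
        e2 = np.linalg.norm(emotions_2)
        return abs(e1 - e2) <= THRESHOLD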


In the initial stage of the chat, all agents attend the same chat room, which is opened for a purpose or a common topic. If an agent wants to talk only with a specified agent, and the specified agent agrees, the system can open a new closed chat room for the two agents, like computer dating. The system then copies the agents' attributes from Room 1 to Room 2, as shown in Figure 5.22. Agent A and Agent B in Figure 5.22 are still in Chat Room 1, but they can enjoy their own conversation in Chat Room 2, too.

Figure 5.22 Agents attend two chat rooms simultaneously

Figure 5.23 Window of JavaFaceChat


Figure 5.24 Window of closed chat room

5.4 Emotion Oriented Interactive Interface for Raising Students’ Awareness in a Group Lesson

The layout of a classroom differs according to the design of the building. In the classroom shown in Figure 5.25, all the students look at the teacher and the blackboard in front of them. In the classroom shown in Figure 5.26, students can see neither the teacher nor the blackboard in front of them while looking at their computer displays. In the latter case, it seems difficult for them to understand the teacher's explanation well enough.

A teacher may meet the following situation in a group lesson with an exercise, such as computer language programming, where students talk with each other about programming techniques. If one student is simply asking another student for the answer itself, the teacher has to consider stopping their conversation, because the asking student cannot obtain any knowledge or experience from the answer alone. On the other hand, if his/her knowledge can be improved through the conversation, the communication is important for the group lesson. To realize such an idea, we improved the JavaFaceChat described in Section 5.3.


Figure 5.25 Classroom scenery (1)

Figure 5.26 Classroom scenery (2)


5.4.1 System Overview

This system has two parts, the "chat part" and the "observation part." The "chat part" is based on the JavaFaceChat described in Section 5.3, but the computer-dating function is not supplied. Students use this chat system to solve their problems by asking and teaching each other. The "observation part" runs on the server, and only the teacher uses it. It analyzes not only the students' emotions but also their awareness from their utterances. The awareness is judged by classifying the generated emotions as described in Section 5.4.2. The teacher can thus easily identify students with a low level of awareness and support them. Furthermore, when a student is not at a low level of awareness but has felt bad for a long time, the teacher can respond to such situations as soon as possible.

Figures 5.27 and 5.28 show the dialogue windows for students and for the teacher, respectively. In Figure 5.27, one student asks about something she does not understand, and another student replies with an answer accompanied by his facial images. Figure 5.28 is the dialogue window for the teacher and shows the contents of the conversation among students. The right part shows the degree of each student's motivation or volition together with his/her facial images, so the teacher can find a student with a low level of motivation or volition.

5.4.2 Assign Rules to Detect Variances in the Students' Awareness

In the field of psychology, many factors of motivation have been proposed: instinct, drive, arousal, incentive, cognitive factors, self-actualization, and so on [83]. We employ the incentive, cognitive, and self-actualization factors, because instinct, drive, and arousal are aroused by perceptions.

The first rule for incentive is that a pleasant/unpleasant event causes an approach-avoidance action. The second rule is that a neutral event also causes an approach-avoidance action when it appears together with a pleasant/unpleasant event. Therefore, we define that the emotions related to "pleasure" for oneself enhance the motivation, and that the "displeasure" emotions for oneself deflate the motivation.

Next, motivation is reinforced when the agent meets something new, or when he/she expects an event or fears the result of an event. We therefore define that the emotions relating to the future (Hope, Fear) enhance the motivation, and that the emotions relating to finished events (Satisfaction, Disappointment, Fear-confirmed, Relief) deflate the motivation.

We consider that "Pride" and "Shame" also enhance the motivation from the viewpoint of self-actualization, because these emotions are generated by appraising one's own actions.

Table 5.3 shows the classification rules for the emotions related to motivation.


Figure 5.27 Dialog window for students

Figure 5.28 Dialog window for the teacher


Table 5.3 Classification of emotions related to motivation

Enhancing the motivation:  Joy, Hope, Fear, Pride, Shame, Liking, Gratification
Deflating the motivation:  Distress, Relief, Satisfaction, Disappointment, Fear-confirmed, Disliking, Remorse
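A minimal sketch of the awareness judgement based on Table 5.3 follows; the simple +1/-1 scoring per generated emotion is our assumption:

    ENHANCING = {"Joy", "Hope", "Fear", "Pride", "Shame", "Liking", "Gratification"}
    DEFLATING = {"Distress", "Relief", "Satisfaction", "Disappointment",
                 "Fear-confirmed", "Disliking", "Remorse"}

    def motivation_score(generated_emotions):
        """Positive score suggests enhanced motivation; negative, deflated."""
        return (sum(e in ENHANCING for e in generated_emotions)
                - sum(e in DEFLATING for e in generated_emotions))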

5.5 Conclusion

In this chapter, we proposed a method to represent emotions with facial expressions and introduced some human-computer interface applications.

To generate the facial expression of a human from sentences, we inputted the emotions into the trained "sand glass type neural networks" and extracted a facial expression image from the emotional space. We defined the assign rules from the emotion types output by the EGC to the facial emotion types, and we obtained emotional spaces similar to Ekman's emotion circle from the trained "sand glass type neural networks."

We applied the method to applications called JavaFaceMail and JavaFaceChat. JavaFaceMail has a regular mail function and attaches a facial expression image according to the contents of the received mail. JavaFaceChat, on the other hand, analyzes all messages and displays all users' facial expression images for the messages. Furthermore, we showed how the system can aid students' awareness in a group lesson.


CHAPTER 6 CONCLUSION

This thesis presented a method to generate emotions from the user's viewpoint and a method to analyze affirmative/negative intention from the user's utterances.

First, we proposed the "Emotion Generating Calculations." This method generates a pleasure/displeasure emotion from an event, an attribute, or an is-a relationship in an utterance using the user's taste information (Favorite Value). Furthermore, the method calculates the degree of the pleasure/displeasure.

Second, the EGC method was developed to distinguish the generated emotions into 20 emotion types based on the "Emotion Eliciting Condition Theory." The method distinguishes emotions based on the pleasure/displeasure and some conditions.

Third, we proposed a method to analyze the user's affirmative/negative intention from the user's utterances. This method calculates not only the affirmative/negative intention but also the degree of affirmation/negation by considering adverbs, modalities, and interjections.

Fourth, we proposed a method to represent emotions with facial expressions. The emotions extracted by the EGC method are inputted into trained "sand glass type neural networks," and one facial image is calculated from an emotional space similar to Ekman's emotion circle.

We constructed a mail tool and a chat tool as applications of the EGC method and the facial expression selecting method. Furthermore, we applied these methods and the affirmative/negative intention analyzing method to the interface of the "Web-based analytical system of health service needs among healthy elderly."

It is particularly hoped that this thesis will contribute to future natural human-computer interfaces that consider human emotions. Furthermore, we hope our methods will help to build a good relationship between humans and computers.


REFERENCES

[1] Osamu Hasegawa, Shigeo Morishima and Masahide Kaneko, “Processing of Facial Information

by Computer”, IEICE Trans., Vol.J80-D-II, No.8, pp.2047-2065 (in Japanese) (1997)

[2] P. Ekman, W. V. Friesen, “The repertoire of nonverbal behavior”, Semiotica, Vol.1, pp.49-98

(1969)

[3] Mehrabian, A., “Nonverbal Communication”, Aldine Atherton (1972)

[4] Hiroshi Harashima, “Intelligent image coding and intelligent communications”, J.ITE, Vol.42,

No.6, pp.519-525 (in Japanese) (1988)

[5] Hiroshi Harashima, K. Aizawa and T. Saito, “Model-based analysis synthesis coding of

videotelephone images-conception and basic study of intelligent image coding”, IEICE Trans.,

Vol.E72, No.5, pp.452-459 (1989)

[6] H. Uwakoto, Y. Kobayashi and Y. Niimi, “Acoustic analysis and modeling of emotional

expressions in speech”, IEICE Technical Report, SP92-131, pp.65-72 (in Japanese) (1993)

[7] H. Kawanami and K. Hirose, “Considerations on the Prosodic Features of Utterances with

Attitudes and Emotions”, IEICE Technical Report, SP97-67, pp.73-80 (in Japanese) (1997)

[8] M. Shigenaga, “Characteristic Features of Emotionally Uttered Speech Revealed by Discriminant

Analysis (III)”, IEICE Technical Report, SP97-66, pp.65-72 (in Japanese) (1997)

[9] Nobuaki Kadotani, Hirotomo Aso, Motoyuki Suzuki and Shozo Makino, “An investigation on

discrimination among emotion expressions contained in speech”, IPSJ SIG-SLP 34-8, pp.43-48 (in

Japanese) (2000)

[10] http://www.darpa.mil/ito/research/com/

[11] E. Levin, S. Narayanan, R. Pieraccini, K. Biatov, E. Bocchieri, G. DiFabbrizio, W. Eckert, S.

Lee, A. Pokrovsky, M. Rahim, P. Ruscitti and M. Walker, “The AT&T-DARPA Communicator

mixed initiative spoken dialogue system”, Proc. International Conference on Spoken Language

Processing (2000)

[12] A. Rudnicky, C. Bennett, A. Black, A. Chotomongcol, K. Lenzo, A. Oh and Singh, R., “Task

and domain specific modeling in the Carnegie Mellon communicator system”, Proc. International

Conference on Spoken Language Processing (2000)

[13] E. D. Os, L. Boves, L. Lamel and P. Baggia, “Overview of the arise project”, Proc. European

Conference on Speech Technology, Eurospeech99, pp.1527-1530 (1999)

[14] H. Asoh, T. Matsui, John Fry, F. Asano and S. Hayamizu, “A spoken dialog system for a mobile

office robot”, Proc. Eurospeech99, pp.1139-1142 (1999)

[15] S. Hashimoto, S. Narita, H. Kasahara, A. Takanishi, S. Sugano, K. Shirai, T. Kobayashi, H.

Takanobu, T. Kurata, K. Fujiwara, T. Matsuno, T. Kawasaki and K. Hoashi, “Humanoid

Robot—Development of an Information Assistant Robot Hadaly—”, 6th IEEE International Workshop on Robot and Human Communication (RO-MAN’97) (1997)

[16] Tetsunori Kobayashi, “Trend of Spoken Dialogue Research”, Journal of Japan Society for

Artificial Intelligence, Vol.17, No.3, pp.266-279 (in Japanese) (2002)

[17] Naoyuki OKADA, “Representation and accumulation of the concepts of words,” IEICE

Publishers (in Japanese) (1991)

[18] Arnold, M.B., “Emotion and Personality”, New York: Columbia University Press (1960)

[19] Kazuya MERA, Takumi ICHIMURA, Teruaki AIZAWA and Toshiyuki YAMASHITA,

“Invoking Emotions in a Dialog System based on Word-Impressions,” Journal of Japan Society

for Artificial Intelligence, Vol.17, No.3, pp. 186-195 (in Japanese) (2002)

[20] Takumi ICHIMURA, Kazuya MERA and Toshiyuki YAMASHITA, “Construction of a Dialog

System with Emotions for Elderly Persons by Neural Networks”, Proc. of IEEE International

Conference on IEEE SMC (SMC2000), pp. 3594-3599 (in Japanese) (2000)

[21] Wundt, W., “Outlines of Psychology”, Leipzig: Wilhelm Engelmann (1897)

[22] H. Schlosberg, “Three dimensions of emotion,” The Psychological Review, Vol. 61, No. 2, pp.81-88 (1954)

[23] Plutchik, R., “The emotions”, New York: Random House (1962)

[24] Clark Elliott, “The Affective Reasoner: A process model of emotions in a multi-agent system,”

Ph.D thesis, Northwestern University, The Institute for the Learning Sciences, Technical Report

No. 32 (1992)

[25] Clark Elliott, “Components of two-way emotion communication between humans and computers using a broad, rudimentary, model of affect and personality,” Bulletin of the Japanese Cognitive Science Society (in Japanese) (1994)

[26] Ortony, A., Clore, G.L., & Collins, A., “The cognitive structure of emotions,” New York: Cambridge University Press (1988)

[27] Paul O’Rorke, Andrew Ortony, “Explaining Emotions,” Cognitive Science, Vol.18, pp.283-323 (1994)

[28] Kazuya Mera, Takumi Ichimura, Toshiyuki Yamashita and Katsumi Yoshida, “Complicated Emotion Allocating Method based on Emotional Eliciting Condition Theory”, Memoirs of Tokyo Metropolitan Institute of Technology, Vol.16, pp.11-16 (in Japanese)

[29] Hideki Mima, Masao Fuketa, Yoshitaka Hayashi and Jun-ichi Aoe, “A Method for Understanding Intentions of Indirect Speech-Act in Natural Language Interfaces”, Transactions of the Institute of Electronics, Information and Communication Engineers (IEICE), Vol. J78-D-II, No. 5, pp.803-810 (in Japanese) (1995)

[30] Makoto Yoshie, Kazuya Mera, Takumi Ichimura, Toshiyuki Yamashita, Teruaki Aizawa and Katsumi Yoshida, “Analysis of affirmative/negative intentions of the answers to yes-no questions and its application to a web-based interface”, Journal of Japan Society for Fuzzy Theory and Systems, Vol.14, No.4, pp.393-403 (in Japanese) (2002)

[31] Kazuya Mera, Shinji Kawamoto, Kenji Ono, Takumi Ichimura, Toshiyuki Yamashita and Teruaki Aizawa, “A learning method of individual's taste information”, Proc. of the 5th International Conference on Knowledge-Based Intelligent Engineering Systems & Allied Technologies (KES2001), Vol.2, pp.1217-1221 (2001)

[32] Kazuya MERA, Takumi ICHIMURA and Toshiyuki YAMASHITA, “Analysis of User Communicative Intention from Affirmative/Negative Elements by Fuzzy Reasoning and Its Application to WWW-based Health Service System for Elderly”, Proc. of the 6th Intl. Conf. on Soft Computing (IIZUKA2000), pp.971-976 (2000)

[33] Koichi Yamada, Riichiro Mizoguchi and Naoki Harada, “User’s Utterance Model and Cooperative Answering for Question-answering Systems”, Journal of IPSJ, Vol.35, No.11, pp.2265-2275 (in Japanese) (1994)

[34] http://www.aibo.com/

[35] Fujita, M. and Kageyama, K., “An Open Architecture for Robot Entertainment”, Proceedings of the First International Conference on Autonomous Agents, pp.435-442 (1997)

[36] Hirohide USHIDA, Yuji HIRAYAMA, Hiroshi NAKAJIMA, “Emotion Model for Life-like Agent and Its Evaluation,” Proceedings of the 15th National Conference on Artificial Intelligence (AAAI-98), pp.62-69 (1998)

[37] Hirohide USHIDA, Hiroshi NAKAJIMA, “Software Systems with Emotion,” Journal of Japan Society for Fuzzy Theory and Systems, Vol. 12, No. 6, pp.762-769 (in Japanese) (2000)

[38] Shin-ichi Ohnaka, Tomohito Ando and Toru Iwasawa, “The introduction of the personal robot PaPeRo”, Journal of IPSJ, Vol.37, No.7, pp.37-42 (in Japanese) (2001)

[39] Yoshihiro Fujita, “Personal Robot R100”, Journal of the Robotics Society of Japan, Vol.18, No.2, p.40 (in Japanese) (2000)

[40] Kaoru Suzuki and Hiroshi Kanazawa, “Pet Robot Using Emotion Triggered Learning Model”, Toshiba Review, Vol.56, No.9, pp.37-40 (in Japanese) (2001)

[41] Heider, F., “Attitudes and Cognitive Organization”, Journal of Psychology, Vol.21 (1946)

[42] Heider, F., “The Psychology of Interpersonal Relations”, New York: Wiley (1958)

[43] Takao Kurokawa, “Nonverbal interface”, Ohmsha (in Japanese) (1994)

[44] P. Ekman and W. V. Friesen, “Unmasking the Face: A Guide to Recognizing Emotions from Facial Clues”, N.J.: Prentice-Hall (1975)

[45] Takumi Ichimura, Hitoshi Ishida, Kazuya Mera, Shinichi Oeda, Akihiro Sugihara and Toshiyuki Yamashita, “Approach to emotion oriented intelligent system by parallel sand glass type neural networks and emotion generating calculations”, Journal of Human Interface Society, Vol.3, No.4, pp.225-238 (in Japanese) (2001)

[46] Takumi ICHIMURA, Kazuya MERA, Hitoshi ISHIDA, Shinichi OEDA, Akihiro SUGIHARA and Toshiyuki YAMASHITA, “An Emotional Interface with Facial Expression by Sand Glass Type Neural Network and Emotion Generating Calculations Method”, Proc. of The International Symposium on Measurement, Analysis and Modeling of Human Functions, pp.275-280 (2001)

[47] Takumi ICHIMURA, Kazuya MERA and Toshiyuki YAMASHITA, “Construction of a Dialog System with Emotions for Elderly Persons by Neural Networks”, Proc. of IEEE Intl. Conf. on IEEE SMC (SMC2000), pp.3594-3599 (2000)

[48] Takehiro KANAYA, “Particle ‘WA’ is not needed in Japanese”, Kodansha (in Japanese) (2002)

[49] The National Institute for Japanese Language, “Classified vocabularies chart”, Shuuei Press (in Japanese) (1972)

[50] Tetsuya NASUKAWA et al., “Easy to Use Practical Freeware for Natural Language Processing,” IPSJ Magazine, Vol. 41, No. 11, pp.1201-1238 (in Japanese) (2000)

[51] T. MASUOKA and Y. TAKUBO, “Basic Japanese Grammar –revised edition–,” Kuroshio Shuppan (in Japanese) (1992)

[52] Yoshifumi HIDA, Hideko ASADA, “Present-day adjective usage dictionary,” Tokyo Dou Publishers (in Japanese) (1991)

[53] Kazuya MERA, Shinji KAWAMOTO, Takumi ICHIMURA, Toshiyuki YAMASHITA, and Teruaki AIZAWA, “A learning method of individual’s taste information,” Proc. of the 5th International Conference on Knowledge-Based Intelligent Engineering Systems & Allied Technologies (KES2001), Vol. 2, pp.1217-1221 (2001)

[54] Ryoko TOKUHISA, Kentaro INUI et al., “Two Complementary Case Studies for Emotion Tagging in Text Corpora,” Technical Report of JSAI, SIG-SLUD-A003-2, pp.9-14 (in Japanese) (2001)

[55] Japan Society of Fuzzy Theory and Systems, “Fuzzy Logic –Course of Fuzzy Vol. 4”, Japan Society of Fuzzy Theory and Systems (1993)

[56] Akihiro OKADA, Jun-ichi ABE, “Emotion Research in Psychology: Past and Current Trends,” Journal of Japan Society for Fuzzy Theory and Systems, Vol. 12, No. 6, pp.730-740 (in Japanese) (2000)

[57] Randolph R. Cornelius, “The science of emotion: Research and tradition in the psychology of emotions,” Seishin Shobo (in Japanese) (1999)

[58] Lazarus, R. S. & Folkman, S., “Stress, appraisal, and coping”, New York: Springer (1984)

[59] Lazarus, R. S., “Emotion and Adaptation”, New York: Oxford University Press (1991)

[60] Ortony, A., Clore, G.L., & Collins, A., “The cognitive structure of emotions,” New York: Cambridge University Press (1988)

[61] Paul O’Rorke, Andrew Ortony, “Explaining Emotions,” Cognitive Science, Vol.18, pp.283-323 (1994)

[62] Japan Electronic Dictionary Research Institute, Ltd., “Japanese Word Dictionary” in the EDR Electronic Dictionary version 2.0 (1998)

[63] Tadahiko Kumamoto, Akira Ito and Tsuyoshi Ebina, “Recognizing User Communicative Intention in a Dialogue-Based Consultant System –A Statistical Approach Based on the Analysis of Spoken Japanese Sentences–”, Transactions of the Institute of Electronics, Information and Communication Engineers (IEICE), Vol. J77-D-II, No. 6, pp. 1144-1123 (in Japanese) (1994)

[64] Katsumi Yoshida, Takumi Ichimura, Hiroki Sugimori, Takashi Izuno and H. Inada, “Analytical System of Health Service Needs among Healthy Elderly by using Internet”, Proc. of Gerontechnology Third Intl. Conf. (1999)

[65] Y. Matsumoto, A. Kitauchi, T. Yamashita, Y. Hirano, H. Matsuda and M. Asahara, “Japanese Morphological Analysis System Chasen Version 2.0”, http://cl.aist-nara.ac.jp/lab/nlt/chasen.html (1999)

[66] Ken MURASUGI, “Likert’s Method of Job Satisfaction Measurement and an Application of Fuzzy Theory on Morale Survey”, Journal of Japan Industrial Management Association, Vol.44, No.2, pp.94-101 (in Japanese) (1993)

[67] Akira MATSUMURA (ed.), “DAIJIRIN 2nd edition”, Sanseido (in Japanese) (1995)

[68] The National Institute for Japanese Language, “Adverbs’ meaning and usage”, Printing Bureau, Ministry of Finance (in Japanese) (1991)

[69] Kaoru TONE, “Making a Decision, Feeling Like Playing a Game”, Nikkagiren Press (in Japanese) (1986)

[70] Kazuya Mera, Makoto Yoshie, Takumi Ichimura, Toshiyuki Yamashita, Teruaki Aizawa and Katsumi Yoshida, “Response generating method and its application to web-based health care service”, Proc. of the 6th International Conference on Knowledge-Based Intelligent Engineering Systems & Allied Technologies (KES2002), Vol.1, pp.688-692 (2002)

[71] M. Rosenblum, Y. Yacoob, L. Davis, “Human emotion recognition from motion using a radial basis function network architecture”, Proc. of the Workshop on Motion of Non-Rigid and Articulated Objects, pp.15-25 (1996)

[72] S. Morishima, “Modeling of facial expression and emotion for human communication system”, Displays, Vol.17, pp.43-49 (1994)

[73] B. Irie and M. Kawato, “Acquisition of Internal Representation by Multi-Layered Perceptrons”, IEICE Trans., Vol.J73-D-II, No.8, pp.1173-1178 (in Japanese) (1990)

[74] N. Fukumura, Y. Uno, and R. Suzuki, “Learning of Many-to-Many Relation between Different Kinds of Sensory Information Using a Neural Network Model for Recognizing Grasped Objects”, The Brain & Neural Networks, Vol.5, No.2, pp.65-71 (in Japanese) (1998)

[75] Hitoshi ISHIDA, Takumi ICHIMURA, Mutuhiro TERAUCHI, Tetsuyuki TAKAHAMA, and Yoshinori ISOMICHI, “Classification of Facial Expressions using Sandglass-type Neural Networks”, Proc. of the 10th Fuzzy, Artificial Intelligence, Neural Networks and Soft Computing (FAN’00), pp.201-204 (in Japanese) (2000)


[76] N. Ueki, S. Morishima, H. Yamada, and H. Harashima, “Expression Analysis/Synthesis System Based on Emotional Space Constructed by Multi-Layered Neural Network”, IEICE Trans., Vol.J77-D-II, No.3, pp.573-582 (in Japanese) (1994)

[77] S. Akamatsu, T. Sasaki, H. Fukamachi, and Y. Suenaga, “Automatic extraction of target images for face identification using the sub-space classification method”, IEICE Trans., Vol.E76-D, No.10, pp.1190-1198 (1993)

[78] Y. Xiao, N. P. Chandrasiri, Y. Tadokoro, and M. Oda, “Recognition of Facial Expressions Using 2-D DCT and Neural Network”, IEICE Trans., Vol.J81-A, No.7, pp.1077-1086 (in Japanese) (1998)

[79] J. A. Russell and M. Bullock, “Multidimensional Scaling of Emotional Facial Expressions: Similarity From Preschoolers to Adults”, Journal of Personality and Social Psychology, Vol.48, No.5, pp.1290-1298 (1985)

[80] Takumi Ichimura, Kazuya Mera, Yoshiaki Miki and Toshiyuki Yamashita, “Emotional Interface for Human Feelings by Mobile Phone”, Proc. of the 6th International Conference on Knowledge-Based Intelligent Engineering Systems & Allied Technologies (KES2002), Vol.1, pp.708-712 (2002)

[81] Web Site of JavaFaceMail (in Japanese), http://facemail.chi.its.hiroshima-cu.ac.jp/

[82] Web Site of Java™ 2 Platform, http://java.sun.com/j2se/1.3/

[83] Heibonsha, “Dictionary of Psychology”, Heibonsha (in Japanese) (1981)