
The Task Selection Mechanism for Interactive Robots: Application to the Intelligent Life Supporting System

Seung-Min Baek,1,* Daisuke Tachibana,2,† Fumihito Arai,3,‡ Toshio Fukuda,3,§ Takayuki Matsuno3,¶

1 Sungkyunkwan University, 300 Chunchun-dong, Jangan-gu, Suwon 440-746, Korea
2 Toyota Motor Corporation, 1 Toyota-cho, Toyota City, Aichi Prefecture 471-8571, Japan
3 Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8603, Japan

The essential challenge in the future ubiquitous networks is to make information available to people not only at any time, at any place, and in any form, but with the right thing at the right time in the right way by inferring the users' situations. Several psychological experiments show that there are some associations between each user's situations, including the user's emotions, and each user's task selection. Utilizing those results, this article presents a situation-based task selection mechanism that enables a life-supporting robot system to perform tasks based on the user's situation. Stimulated by interactions between the robot and the user, this mechanism constructs and updates the association between the user's situation and tasks so that the robot can adapt to the user's behaviors related to the robot's tasks effectively. For the user adaptation, Radial Basis Function Networks (RBFNs) and associative learning algorithms are used. The proposed mechanism is applied to the CRF3 (Character Robot Face 3) system to prove its feasibility and effectiveness. © 2006 Wiley Periodicals, Inc.

*Author to whom all correspondence should be addressed: e-mail: [email protected].
†e-mail: [email protected].
‡e-mail: [email protected].
§e-mail: [email protected].
¶e-mail: [email protected].

INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, VOL. 21, 973–1004 (2006). © 2006 Wiley Periodicals, Inc. Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/int.20172

1. INTRODUCTION

Due to the dramatic improvement in information and network technologies, in the near future people will be surrounded by many home digital appliances and computers, both of which connect and communicate via the network at home. The main challenge of such an environment, called the "ubiquitous network," is how to manage those networked appliances and computers easily and effectively. More home appliances are being introduced, and people will need more effort to understand how to use them. Needing too much effort in understanding generally makes the user frustrated. To reduce the user's burdens, it is necessary that the system have some capabilities to perceive the user's situation and adapt to the user. Toward these challenges, the interactive robot is one of the potential interfaces, offering miscellaneous communication channels such as verbal, visual, and tactile ones. The conventional approaches for developing interactive robots are principally twofold: the first approach is developing the hardware for interactive robots such as the robot face1 and humanoid2 and creature-like robots,3 and the second approach is implementing multimodal communication channels to apply to real applications in schools,4 museums,5 hospitals,6 nursing homes,7,8 offices,9 and so on. Considering the real applications in the home environment, the abilities of not only communicating with the user multimodally and emotionally10 but also of referring to the user's conventional activities, recognizing human intentions as to what he or she is likely to do, and, accordingly, performing some tasks useful to the user at the right time in the right way are required.



Related work has been done by several groups. However, there are no approaches considering both the association between the user situation and the robot behaviors and the interaction between the user and the robot during the construction and updating of associations between routine human activities and (time-based) situations. Hence, the main interest of this article is to analyze the user's situation for understanding how the robot should act and execute tasks or services through human–robot interactions, so that the robot adapts to the user behavior related to the robot tasks effectively and efficiently.

2. RELATED WORKS

2.1. The Ubiquitous Network

There are many issues and challenges to realizing the Ubiquitous Network.11

• Scalability: The future Ubiquitous Networking environment will likely face a proliferation of users, applications, networked devices, and their interactions on a scale never experienced before. As the number of devices grows, it might be unmanageable to maintain system configurations and their applications.

• Heterogeneity: Now, applications are typically developed for specific device classes or system platforms, leading to separate versions of the same application for handhelds, desktops, and cluster-based servers. If such heterogeneity increases, developing applications that run across all platforms will become exceedingly difficult.

• Integration: As the number of devices and applications increases, integration becomes more complex.

• Invisibility: The environment and the objects in it must be able to tune themselves without distracting users at a conscious level.

• Context awareness: Most computing systems and devices nowadays cannot sense their environments and therefore cannot make timely, context-sensitive decisions. The Ubiquitous Network, however, requires systems and devices that perceive contexts. Moreover, once a pervasive computing system can perceive the current context, it must have the means of using its perceptions effectively. Richer interactions with users will require a deeper understanding of the physical space.

To tackle those issues and challenges, there is much research related to the Ubiquitous Network. Because of so many issues, research in the Ubiquitous Network is quite diverse, as the field itself has not yet been clearly defined. The research originates from many different areas such as mobile computing, distributed systems, human computer interaction, AI, design, embedded systems, processor design and computer architecture, materials science, civil engineering, and architecture. Therefore the research on the Ubiquitous Network has generally been done with a project style that involves much component research. Here are some representative projects:

(1) Academia

• Aura, Carnegie Mellon University. Aura projects aim to scale the computing system, demonstrating a "personal information aura" that spans wearable, handheld, desktop, and infrastructure computers. Aura includes much individual research. "Darwin" is an intelligent network at Aura's core. "Coda" is a distributed file management system that supports nomadic file access, and "Odyssey" provides operating system support for resource adaptation.

• Endeavour, University of California at Berkeley. Endeavour's main technological capabilities are to seamlessly support fluid software. It includes processing, storage, and data management functionality to arbitrarily and automatically distribute itself among pervasive devices and along paths through scalable computing platforms that are integrated with the ubiquitous networking infrastructure. The system can compose itself from preexisting hardware and software components to satisfy a service request while advertising the services it can provide to others.

• Oxygen, Massachusetts Institute of Technology. The Oxygen project rests on an infrastructure of mobile and stationary devices connected by a self-configuring network. This infrastructure supplies abundant computation and communication, harnessed through system, perceptual, and software technologies to meet user needs. This project focuses on eight environment-enabling technologies, which emphasize understanding what turns an otherwise dormant environment into an empowered one to which users shift parts of their tasks.

• Aware Home, Georgia Institute of Technology. The Aware Home project focuses on the context awareness or computing needs in human everyday lives, specifically, that part of our lives that is not centered around work or the office.12 The project includes a vision-based sensor system to track multiple individuals in an environment and a smart floor interface to identify and track people walking across a large area.

(2) Industry

• TRON, University of Tokyo and others. TRON (The Real-time Operating system Nucleus) focuses on developing an open-source embedded operating system that can be implemented in every device in order to share the same protocol and specification, solving heterogeneity and scalability issues. TRON is the name of the computer architecture proposed by Dr. Ken Sakamura,13 which is a kind of embedded operating system that enables the control of digital communication. The specifications of TRON are open, which has resulted in a big consortium that includes most of the Japanese electronics companies and others.

• WebSphere, IBM. WebSphere work focuses on applications and the middleware that extends its software platform. Different from TRON, IBM is spearheading consortia and initiatives for open standards to support ubiquitous computing applications. Collaborating with hardware vendors such as Palm, Symbol Technologies, and Handspring, IBM is developing a new generation of devices.

• Cooltown, Hewlett-Packard. Cooltown focuses on extending Web technology, wireless networks, and portable devices to create a virtual bridge between mobile users and physical entities and electronic services. Cooltown uses URLs for addressing, physical beaconing, and sensing of URLs for discovery. And it uses localized Web servers for directories to create a location-aware system that supports nomadic users. It leverages Internet connectivity on top of this infrastructure to support communication devices.

The issues of scalability, heterogeneity, or integration have been tackled mainly by industrial projects; this might be because such issues are strongly related to hardware specifications or protocol standardization, and heterogeneous groups need to be involved to solve them. On the other hand, much of the research regarding context awareness is taken up by academia, because its challenges are more complex and less clearly defined than those tackled by industry, and this field covers many areas in addition to the engineering field (e.g., psychology, sociology, ethics, etc.). The research position of this article is also in context awareness, approached from robotics and human–computer interaction interfaces, so the next section explains this in more detail.

2.2. Context Awareness

Humans have a quite high capability to convey ideas to each other and react appropriately. This is due to many factors: the richness of the language they share, the common understanding of how the world works, and an implicit understanding of everyday situations. When humans talk with humans, they can use implicit situational information, or "context," to increase the conversational bandwidth. Unfortunately, this ability to convey ideas does not transfer well to humans interacting with computers. In traditional interactive computing, users have an impoverished mechanism for providing input to computers. By improving the computer's access to contexts, we increase the richness of communication in human–computer interactions and make it possible to produce more useful computational services. Such sensing of and adapting to context is generally called "context awareness." Context awareness generally defines the ability of a computing device or program to sense, react to, or adapt to the environment in which it is running.14

2.2.1. Context

Here we used the word context, but this term is widely used with very different meanings. In the field of the Ubiquitous Network, context is generally defined as any information that can be used to characterize the situation of an entity, which is a person, place, or object that is considered relevant to the interaction between a user and an application, including the user and the applications themselves. Almost any information available at the time of an interaction can be seen as context information. Some examples are as follows15:


• Identity
• Spatial information, for example, location, orientation, speed, and acceleration
• Time, for example, time of the day, date, and season of the year
• Environmental information, for example, temperature, air quality, and light or noise level
• Social situation, for example, who you are with and people that are nearby
• Resources that are nearby, for example, accessible devices and hosts
• Availability of resources, for example, battery, display, network, and bandwidth
• Physiological measurements, for example, blood pressure, heart rate, respiration rate, muscle activity, and tone of voice
• Activity, for example, talking, reading, walking, and running
• Schedules and agendas.

In this article, context information is summarized as time-, identity-, location-, activity-, and human emotion-based, and a situation is defined as a human state scripted by a set of contexts.

2.2.2. Classification of the Context Awareness Applications

There have been several explorations around context awareness applications. The most widespread classification of context awareness applications is the one suggested by Dey.16 Modifying his classification, we can classify these as below:

• Adapting the behavior to the user context (e.g., context adaptation when presenting content)

• Using a situation description for tagging (virtual or virtual representations of physical) objects as a way of classifying and relating them to each other (e.g., creation context as automatic metadata)

• Providing situation awareness assistance (e.g., carry out certain tasks, offer certain services, or provide needed information resources automatically/proactively)

• Supporting collaborative work regarding the personal and shared context elements (e.g., automatic detection/calculation of resources subject to sharing, aggregation over the overlapping personal context elements).

2.2.3. Conventional Research on Context Awareness

Historically, major applications of conventional research were office or meeting tools. The reason for this is that most computers are used in an office environment, and it is also easier to obtain context information, such as location, in a limited and controllable area such as an office.

• Active Badge, Olivetti Research Laboratory. The Active Badge system17 is generally considered to be one of the first context-aware applications. Each member of the personnel attached a badge that transmitted IR signals in the office. A network of sensors placed around the office building picked up the signals, and a central location server polled these sensors. The telephone receptionist could find out where a person was and direct the call to an appropriate phone.

• ParcTab, Xerox Palo Alto Research Center. The ParcTab system18 is based on palm-sized wireless ParcTab computers and an infrared communication system that links them to each other and to desktop computers through a local area network (LAN). The ParcTab worked as a mobile personal digital office assistant. There were dozens of applications developed for and tested with it. For instance:
  • Presenting information about the room the user was in (e.g., when the user was at the library, information about the library was displayed).
  • Helping the user find the most convenient local resource (e.g., the nearest printer).
  • Attaching a certain UNIX directory to a certain room. When the user enters the room, all the files in the directory are shown.
  • Locating other persons carrying a ParcTab. The location of the persons was shown on a map on a desktop computer.
  • Using the ParcTab as a remote control with different control choices in different rooms.

• Technology for Enabling Awareness (TEA), Starlab. TEA projects19 aim to adapt mobile (or wearable) devices by transforming sensor readings into context profiles. Based on self-organizing maps, the system associates sensor data with context profiles. For instance, in the phone scenario, the profiles of the mobile phone are selected automatically based on the recognized context. The phone chooses to ring, vibrate, adjust the ring volume, or keep silent, depending on whether the phone is in hand, on a table, in a suitcase, or outside. Those context profiles are associated with the sensor data from accelerometers, photodiodes, temperature sensors, touch sensors, pressure sensors, microphones, infrared sensors, and carbon monoxide sensors.

Most of the existing applications actually use only a few context values. Generally, location-, identity-, or time-based contexts have been used. The reason for this probably lies in the difficulty for computer systems to obtain context information and to process it. Most of the applications are prototypes developed in research laboratories and the academic world. There do not yet exist many commercial solutions. The main areas where such commercial solutions exist are different kinds of location-based services and guides.

If the ubiquitous network system of these conventional projects could reliably recognize the user's context and the executing task that the context is related to, context-awareness research could propose many possible applications to the market. Unfortunately, there are some crucial challenges in using context awareness.

The first crucial challenge is that uncertain inferences and decisions are compounded.20 There are many contexts in which the automatic system might succeed in optimizing and, as a result, fail in doing the right thing. No matter how hard the system designer tries to program contingency plans for all possible contexts, invariably the system will sometimes frustrate the user and perform unexpected or undesirable tasks, especially when such an inference is based on layers of ambiguous interpretation and input or requires a level of intelligence that even humans would find difficult.

For instance, on a day when the temperature is predicted to shift from warm to cool, the home ubiquitous network system might determine that the optimal cooling strategy is to shut down the air conditioner and automatically open a set of blinds and windows so as to create an efficient cross breeze. This scenario is relatively simple compared with the others. However, even in this simple scenario, there are many situations in which the automatic system might succeed in optimizing temperature comfort and yet fail in "doing the right thing": something noisy is occurring outside, someone is smoking at the outside window, someone in the home is allergic to pollen and the pollen count is high, it is raining outside, it is too quiet for a person reading when the hum of the air conditioner is off, someone did not want the blinds open because this throws glare on a computer screen, and so on.

The second one is privacy.21 Mechanisms such as location tracking, smart spaces, and the use of surrogates monitor and act on an almost continuous basis. As a user becomes more dependent on a ubiquitous network system, the system becomes more knowledgeable about the user's movements, behavior patterns, and habits. The potential for a serious loss of privacy may deter people from using a ubiquitous network system. Hence the infrastructure needs to be confident of the user's identity and authorization level before responding to the user's requests.

The third one is the communication between the user and the network system. People may feel annoyed if the system controls every operation of tasks or performs unexpected or undesirable tasks without enough communication with the user. Hence, the system should not strip users of their sense of control, so that they feel comfortable and do not develop ill feeling.

Therefore, to design a context awareness system, it is necessary to discern which functions of the smart home are possible with limited inference and which are possible only through inference. Users, on the other hand, necessarily have to understand the system's expected behavior in the face of this condition and the system's facilities for detecting or inferring this condition, because systems that rely on inference will never be right all of the time.

2.3. Socially Interactive Robot

There are a number of systems from different fields of research that are designed to interact with people. Many of these systems target different application domains such as computer interfaces, Web agents, synthetic characters for entertainment, or robots for physical labor. For social interaction with humans, software-based embodied avatars are also an alternative means, and there are a number of graphic-based systems that combine natural language with an embodied avatar. However, in this section, only a few socially interactive robots related to our research are summarized. Based on the approach, we can categorize conventional research into three sections: design, multimodal interaction, and socially situated learning.

2.3.1. Design

The approach of the "Design" category mainly aims to design robots that express their internal states, such as emotions or homeostatic states, and to achieve humanlike motions, gestures, and postures.

• Face. There are several projects that focus on the development of robot faces. "Kismet"1 is a prototype robot, developed by Breazeal at the Massachusetts Institute of Technology, whose sole purpose is face-to-face social interaction. It uses facial expressions and vocalizations to indicate its emotions and guide people's interaction with it. Kismet is specially designed to be childlike, engaging people in the types of exchanges that occur between an infant and its caregiver. Kobayashi et al.22 at the Science University of Tokyo have developed humanlike robotic faces named "SAYA" (typically resembling a Japanese woman) that incorporate hair, teeth, silicone skin, and a large number of control points. Each control point maps to a facial action unit of a human face. The facial action units characterize how each facial muscle (or combination of facial muscles) adjusts the skin and facial features to produce human expressions and facial movements.23 Using a camera mounted in the left eyeball, the robot can recognize and produce a predefined set of emotive facial expressions (corresponding to anger, fear, disgust, happiness, sorrow, and surprise). A number of simpler expressive faces have been developed at Waseda University, one of which, called "WE-4,"24 can adjust its amount of eye opening and its neck posture in response to light intensity. Depending on its emotion and homeostatic state, this robot can also show six basic expressions with arm motions: anger, fear, disgust, happiness, sadness, and surprise.

• Humanoid. The number of humanoid robotic projects under way is growing. Honda's ASIMO is a bipedal walker using "prediction motion control," which predicts the next move and shifts the center of gravity accordingly,25 and changes direction, stops, and accelerates walking smoothly. Sony's Qrio is a smaller walking robot specialized for entertainment uses. In addition to dynamic walking, Qrio can dance dynamically, sit up, or keep its balance to stand.26 HRP2,27 and a future project at the National Institute of Advanced Industrial Science and Technology (AIST) in Japan, aim to develop a robot similar to ASIMO or Qrio that can walk on uneven surfaces, control tipping over, and get up from a fallen position. However, HRP is more focused on developing practical applications (e.g., a human cooperatively transports and installs an exterior wall panel for a simple prefab building in the open air). NASA is developing an upper-torso humanoid robot, "Robonaut," that can function as an Extravehicular Activity (EVA) astronaut equivalent.28 Because Robonaut's major applications are EVAs, Robonaut has advanced hands and arms. Robonaut's hand, which has 14 degrees of freedom (DOFs), can fit into all the required places and operate EVA tools like a tether hook. In the academic field, there are several research projects to design and develop humanoid robots. For instance, Kuniyoshi et al. at the University of Tokyo have developed a robot that can do a "roll-and-rise" motion, in which the robot stands up in one action from a flat lying state.29

• Creature. An increasing number of entertainment, personal, and toy robots have been designed to imitate living creatures. The most common designs are inspired by household animals such as dogs and cats, with the objective of creating robotic companions. Sony's dog robot Aibo can perceive a few simple visual and auditory features that allow it to interact with a color ball or bone.30 It is mechanically very sophisticated, able to locomote, to get up if it falls down, and to perform an assortment of tricks. Paro, a seal-pup robot, was developed to provide some therapeutic benefit for its users.31 In fact, Paro has won a Guinness Book of World Records award as the "world's most soothing robot." It responds to stroking. Paro appears to be about 45 cm long and is covered in white fur.

2.3.2. Multimodal Interaction

The approach of the "multimodal interaction" category mainly aims to communicate between socially interactive robots (SIRs) and humans with high-level dialogue and natural cues such as gazing, gestures, and posture.

This category includes many service robots deployed in hospitals,6 museums,5 office buildings,9 departments,32 and so on. RHINO and Minerva33 are service robots that were deployed as tour guides in the Deutsches Museum Bonn, Germany, and the National Museum of American History, Washington, D.C., in 1997 and 1998. Their research is mainly dedicated to solving navigation and planning problems for autonomous mobile robots using Partially Observable Markov Decision Processes (POMDPs), rather than to multimodal human–robot interactions. Recent similar tour guide robots are HERMES and Jijo-2. XAVIER34 is one of the first mobile robots controllable via a Web interface. The long-term experiment with this robot was carried out from December 1995 to December 1998 in an indoor environment. During that term, XAVIER received 30,000 requests from users via the Internet and executed 4,700 tasks for them. Recently, following XAVIER, a personality-embedded service robot, Vikia, was introduced by the Social Robots Project at Carnegie Mellon University (CMU).

Robovie is a humanlike mobile robot that has relatively higher perceptual abilities than other SIRs through various sensors: omnidirectional vision sensors, microphones, skin sensors, tactile sensors, ID tags, and so on. Robovie interacts and communicates with the user in accordance with "situated modules" that execute a particular task in a particular situation and "episode rules" that represent their partial execution order. Robovie was used experimentally in the classroom to evaluate whether this robot could construct long-term relationships and whether it is good for education uses. As a result, even though there are some difficulties in obtaining a high degree of interaction because of its limited ability to process sensory data, the fact that those kinds of SIRs have some effects in education has been shown.35 SIG is a humanoid robot being developed by The ERATO Kitano Symbiotic Systems Project.36 SIG has stereo hearing and vision. The robot has voice and face recognition systems. Real-time auditory and visual multiple-object tracking is used to point the head at the speaker. The robot was designed to recognize and locate voices in the environments typical of a receptionist or museum guide. Robita at Waseda University is a communication robot that can participate and take part in group conversations with eye points, gestures, simple facial expressions, and posture (pointing). Those conversations are conducted according to situation-based scripts depending on the content's priority.37 Robita also has a sophisticated conversation mechanism; Robita learns the relation between the user's unknown expressions and intentions by asking the user throughout the interaction, based on associative learning between the utterance expression node (UEN) and the retrieval keyword node (RKN).

2.3.3. Socially Situated Learning

The research categorized as "socially situated learning" is mainly concentrated on how robots learn their behaviors. In socially situated learning, an individual interacts with his social environment to acquire new competency. In recent years, there has been some research to analyze how social learning can occur through human–robot interactions. The "Cog" project, including "Kismet," at the Massachusetts Institute of Technology analyzes the process of socially situated learning via a "bottom-up" approach: (1) saliency results from a combination of inherent object qualities, contextual influences, and the model's attention, (2) similar physical morphologies are utilized to simplify the task of body mapping and recognizing success, and then (3) the structures of social interactions are exploited. Based on this approach, the robot mimicked a swing action after a human taught the swing action.38 As well as the Cog project, some research into socially situated learning is related to teaching by imitation. Imitation has been used as a mechanism for learning simple motor skills from observation, such as block stacking39 or pendulum balancing.40 Imitation has also been applied to the learning of sensor-motor associations41 and for constructing task representations.42 The CyberHuman project at the Advanced Telecommunication Research Institute also is dedicated to developing an algorithm of imitation learning.43 Based on a nonlinear model-based adaptive controller, the humanoid robot DB learns human motions such as swings, playing air hockey, and dancing from perception by three-dimensional visual tracking or the SenSuit exoskeleton that records the 17 joint angles of the human upper torso.

Some research focuses on the learning process from a human teacher to a robot via social interactions. Through the interactions with the physical and social environments, the Infanoid project44 at the Keihanna Human Info-Communication Research Center focuses on designing a "mirror system" mapping between someone's behavior (how he or she feels and acts) and the self's behavior (how I feel and act) for imitative learning. An improved approach has been developed by Nagai et al. at Osaka University. They proposed a mechanism by which a robot acquires sensorimotor coordination for joint attention through "bootstrap learning"45: a process by which a learner acquires higher capabilities through interactions with its environment based on embedded lower capabilities even if the learner does not receive any external evaluation or the environment is controlled.

Other socially situated learning research learns robots' behavior strategies (when, where, and how the robot does a specific task or coordinates executable tasks) through interactions based on understanding the user's situation or the environment around the robot. Pearl is a robotic assistant for elderly people living in their homes, developed by Thrun et al. at CMU. Pearl is equipped with interactive systems: SICK laser range finders, sonar sensors, microphones for speech recognition, speakers for speech synthesis, touch-sensitive graphical displays, actuated head units, and stereo camera systems. The central interacting mechanism of Pearl's behavior is based on a "personal cognitive orthotic" (PCO), which identifies robotic activities based on their importance and the likelihood of being forgotten, determines effective times to issue them, and adapts to environmental changes.46

2.4. Research Positioning

Our research objective is to propose a robotic task selection mechanism based on the user's situation in accordance with understanding the user's situations. The main challenges here are, first, how we design a learning algorithm adapting to the user's activities and demands, and, second, how the robot interacts or makes suggestions according to its understanding of the user's activities and demands. Therefore our research is highly related to socially situated learning. Particularly, similar to the Pearl project, our approach is to design an algorithm to learn a robot's behavior strategies (when, where, and how the robot does a specific task or coordinates executable tasks) through interactions based on understanding the user's situation or the environment around the robot. However, it is also necessary to think about the robot's capabilities to interact with natural cues in order to adapt to the user: perceiving the user's activities and the user's environment and communicating multimodally, like human–human communication (see Figure 1).

Figure 1. Research positioning.

3. SITUATION-BASED TASK SELECTION MECHANISM

In this section, first we describe the necessity of considering a user's situations to enhance human–robot interaction. Then a computational model of the situation–task association mechanism is described.

3.1. User’s Action and Situation

As the previous section illustrated, the essential challenge in future ubiquitous networks is not only to make information available to people at any time, at any place, and in any form, but specifically to say the right thing at the right time in the right way. Such on-demand interaction and support must bring benefits in terms of effectiveness, efficiency, and acceptability by means of analyzing users' situations. A user situation is a collection of contexts. For instance, the situation "He comes home at 7 p.m. from his office" can be decomposed into several contexts like "he," "come home," "7 p.m.," and "from his office." Each context itself has no meaning or value for the system. However, it will have a meaning and compose a situation when it is associated with the user's activities or tasks, for instance, by understanding the trend with which the user generally interacts with the system in this context or does the task in that context. As sociologist Suchman points out in her proposed Situated Action theory, user actions are dictated by the surrounding situation.47 Accordingly, by observing the association between user activity and a situation, for some pairs of tasks and situations, the system can identify the association and consequently can infer the user's trend of activities in the situation. In this research, we are focusing on activities related to the tasks that the user commands the robot to perform and on how the robot understands the tendencies of the association between a user's situation and the robot's performing of the task.

3.2. Overall System Structure

Figure 2 shows the overall diagram of a task selection mechanism based on a user's situation. As shown, the situation-based task selection mechanism has four main component systems: the task–situation association system, the interaction system, the task selector, and the context awareness system that understands the user's situation. Stimulated by the user's direct commands and by context awareness, this mechanism associates a user's situation with a robot-performed task and accordingly selects the tasks that are appropriate for the user efficiently and effectively. The constraint of this mechanism is that it only considers the situation of interaction between one robot and one user. The robot keeps a profile of this mechanism for each user and does not share one profile among several users.

Figure 2. Functional block diagram of overall structure.
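To make the data flow among these four components concrete, the following is a minimal Python sketch of how they could be wired together. The class and method names, the placeholder return values, and the confirmation threshold are our illustrative assumptions, not interfaces from the CRF3 implementation.

```python
# Minimal sketch of the four-component structure of Figure 2.
# All names and interfaces are illustrative assumptions, not the paper's.

class ContextAwarenessSystem:
    def current_situation(self):
        # In the real system this would come from RFID, vision, the clock,
        # and the emotion inference performed by the interaction system.
        return {"identity": "user1", "hour": 8, "day": "Sat",
                "state": "exit", "activity": "nothing", "emotion": "neutral"}

class TaskSituationAssociation:
    def scores(self, situation):
        # Placeholder for the three-layer RBFN/associative network of Section 3.3.
        return {"item checking": 1.0, "web browsing": 0.0}

class InteractionSystem:
    def confirm(self, task):
        # The robot would ask the user multimodally; here we pretend the user accepts.
        print(f"Robot: shall I do '{task}'?")
        return True

class TaskSelector:
    def __init__(self, association, interaction):
        self.association = association
        self.interaction = interaction

    def step(self, situation):
        # Pick the most strongly associated task and confirm it with the user.
        task, score = max(self.association.scores(situation).items(),
                          key=lambda kv: kv[1])
        if score >= 0.5 and self.interaction.confirm(task):
            return task
        return None

selector = TaskSelector(TaskSituationAssociation(), InteractionSystem())
print(selector.step(ContextAwarenessSystem().current_situation()))
```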

3.2.1. Context Awareness System

The context awareness system perceives the user's contexts in terms of identity, location, time, activity, and emotion. Identity is a context that illustrates the user's profile: who the user is, what his or her attributes are, and what his or her condition is. Location is a context that describes the user's position: where the user is, what the environment around the user is, and what kinds of objects are near the user. Time is a context that records the timing of the processing tasks or the user's activities: when the user is doing something, how often the user is doing something, and how long the user is doing something. Activity is a context that illustrates the user's action: what the user is doing and how the user is doing it. The activity context includes current robot tasks that the user commanded or that the robot is performing automatically after getting the user's permission. Emotion is a context of human emotion that is inferred by sensors, interactions, or other contexts.

3.2.2. Task–Situation Association System

The task–situation association system is the core system of this mechanism. According to the theoretical basis, the robot observes the tendency of direct commands to perform a task as well as the situation (set of contexts) in which the task is performed. If the robot finds a high association between the situation and the task, this association information is forwarded to the interaction system to confirm whether selecting the task really is highly associated with the situation or not. According to feedback from the user, the association between the task and the situation is modified, either strengthened or weakened. This strength of the association between a task and a situation is continuously updated.

3.2.3. Interaction System

The interaction system is a mediating system between the robot and the user. In accordance with the task–situation association system, the interaction system asks the user about the adequacy of an association, and once the task is performed, the interaction system supports the user's task executions by using the robot's multimodal communication channels. The interaction system is also utilized for inferring the human emotional state. In addition to the physiological sensors, the perceived data from visual or acoustic sensors are strong cues to infer emotions (e.g., contents of conversation, pitch, intensity, duration, tone of speech, facial expressions, gestures, or postures). This inferred human emotion is used as one of the contexts.

3.2.4. Task Selector

The task selector is the system that selects and performs a task. If a situation is highly associated with two different tasks, the system first selects the most highly associated task and subsequently selects the second most highly associated task. In the case where the system has high confidence in the association between a task and a situation, the system automatically performs the task. Otherwise, via the interaction system, the system asks the user whether to perform the task.
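As a rough sketch of this decision rule (the numeric thresholds and function names below are our assumptions, since the paper does not specify confidence values here), the selector can rank tasks by association strength and then either execute, ask, or stay idle:

```python
# Hypothetical thresholds; illustrative only.
AUTO_THRESHOLD = 0.9   # high confidence: perform the task automatically
ASK_THRESHOLD = 0.5    # lower confidence: ask the user via the interaction system

def select_task(association_scores):
    """association_scores: dict mapping task name -> association strength."""
    ranked = sorted(association_scores.items(), key=lambda kv: kv[1], reverse=True)
    for task, score in ranked:
        if score >= AUTO_THRESHOLD:
            return ("execute", task)      # perform automatically
        if score >= ASK_THRESHOLD:
            return ("ask_user", task)     # confirm through the interaction system
    return ("idle", None)                 # nothing is sufficiently associated

print(select_task({"e-mail checking": 0.95, "web browsing": 0.6}))
# -> ('execute', 'e-mail checking')
```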


3.3. Computational Model of Task–Situation Association

The task–situation association system consists of a consecutively connected three-layer neural network model based on RBFNs and associative learning rules.

3.3.1. First Layer

The main function of the first layer (Figure 3) is to learn the association between each task and context. Associative learning occurs when a learning input, such as a user's direct command, is received from the task selector. Contexts of the same attribute usually interrelate with each other (e.g., the contexts of 3 p.m., 4 p.m., and 5 p.m. share the same attribute "hour"). For contexts of the same attribute, it is presumed that the association between an output of the task selection and a context affects the association between the output and similar (near) contexts. For instance, when the task of cooking breakfast is generally requested at 8 a.m., the user may ask for the same task around 8 a.m., such as at 7 a.m. or 9 a.m., as well. Therefore we employ RBF networks to represent the association between each task and context in the same attribute. An RBFN is a simple type of network model with a single hidden layer of nodes and a linear output layer, with unity weighting between the input and the hidden layer.48 In this article, static RBFNs are used for encoding various contexts into input node vectors r_{c_i}. According to the characteristics of each context, the shapes of the RBFs and the number of RBF nodes can be varied. Each element of the input node vector is defined with the Gaussian function:

r_{c_i,n} = \exp\left( -\frac{(c_i - S_{c_i,n})^2}{l_i} \right)    (1)

where c_i, S_{c_i,n}, and l_i denote the ith context input value, the center of the nth RBF node, and the width parameter of the RBF, respectively.

Figure 3. The first and second layers of the task–situation association system.

Some of the contexts in the same attribute usually interrelate with each other. Therefore, for continuous contexts related with close values, such as time and position, the width of the RBF must be broad enough to reach the center of the nearest RBF. On the other hand, the width of the RBF must be narrow for discrete contexts independent of close values.
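A small sketch of the context encoding of Eq. (1) is given below. The specific centers and widths are illustrative choices, not parameters reported in the paper, but they follow the rule just stated: broad widths for continuous contexts, narrow widths for discrete ones.

```python
import math

def rbf_encode(value, centers, width):
    """Eq. (1): r_{c_i,n} = exp(-(c_i - S_{c_i,n})^2 / l_i) for every node center."""
    return [math.exp(-((value - s) ** 2) / width) for s in centers]

# Continuous attribute "hour": one RBF node per hour; the width (assumed here)
# is broad enough that neighbouring hours such as 7 a.m. and 9 a.m. are also activated.
hour_centers = list(range(24))
r_hour = rbf_encode(8.0, hour_centers, width=1.0)

# Discrete attribute "user state": a narrow width keeps activation local to one node.
state_centers = [1, 2, 3, 4, 5]   # outside, entrance, in the room, in front of PC, exit
r_state = rbf_encode(5, state_centers, width=0.01)

print([round(v, 2) for v in r_hour[6:11]])   # [0.02, 0.37, 1.0, 0.37, 0.02]
print([round(v, 2) for v in r_state])        # [0.0, 0.0, 0.0, 0.0, 1.0]
```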

Using the RBFN, an output of the first layer is obtained as follows:

y^1_{c_i t_j} = f_{satlins}\left( v^{1T}_{c_i t_j} \cdot r_{c_i} + s_{t_j} \right)    (2)

where v^1_{c_i t_j} is an association weight vector between the ith context c_i and the jth task t_j, r_{c_i} is the Gaussian RBF node vector of the ith context c_i, s_{t_j} is the learning input of the jth task t_j, and f_{satlins}(\cdot) is a symmetric saturating linear function given by

f_{satlins}(u) = \begin{cases} -1 & \text{if } u < -1 \\ u & \text{if } -1 \le u \le 1 \\ 1 & \text{if } u > 1 \end{cases}

Associative learning occurs when a learning input is received from the task selector. The learning input value s_{t_j} is given by the user's commands or interaction feedback:

s_{t_j} = \begin{cases} +2 & \text{if task } t_j \text{ is commanded or accepted} \\ -2 & \text{if task } t_j \text{ is denied} \\ 0 & \text{otherwise} \end{cases}

Note that, for both the positive and negative learning cases, the amplitude of the learning input value should be large enough to saturate the output of the first layer. Here, the associative learning rule for each element of an association weight vector is constructed as follows:

v^1_{c_i,n\,t_j}(x+1) = v^1_{c_i,n\,t_j}(x) + \alpha\, y^1_{c_i t_j} \left( r_{c_i,n} - v^1_{c_i,n\,t_j}(x) \right)    (3)

where x is the number of learning trials and \alpha is the learning gain.
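The following sketch puts Eqs. (2) and (3) together for a single context/task pair. The RBF vector is an example value, the learning gain α = 0.1 matches the value reported later in Section 4.3, and all function names are ours.

```python
def satlins(u):
    """Symmetric saturating linear function used in Eq. (2)."""
    return max(-1.0, min(1.0, u))

def first_layer_output(v, r, s):
    """Eq. (2): y = satlins(v . r + s) for one context c_i and task t_j."""
    return satlins(sum(vn * rn for vn, rn in zip(v, r)) + s)

def associative_update(v, r, y, alpha=0.1):
    """Eq. (3): v_n(x+1) = v_n(x) + alpha * y * (r_n - v_n(x))."""
    return [vn + alpha * y * (rn - vn) for vn, rn in zip(v, r)]

# Example: the user commands the task (s = +2) while the "hour" context is near 8 a.m.
r = [0.02, 0.37, 1.0, 0.37, 0.02]   # RBF node vector around 8 a.m. (illustrative)
v = [0.0] * len(r)                  # association weights start at zero
s = 2                               # positive learning input (commanded or accepted)

y = first_layer_output(v, r, s)     # the large input saturates the output at +1
v = associative_update(v, r, y)
print(y, [round(vn, 3) for vn in v])   # 1.0 [0.002, 0.037, 0.1, 0.037, 0.002]
```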

3.3.2. Second Layer

The main role of the second layer (Figure 3) is to generate the thresholded output. Weak associations between the context c_i and task t_j are excluded by a threshold value proportional to the maximum output of the first layer. The output of the second layer is written as

y^2_{c_i t_j} = f_{hardlim}\left( y^1_{c_i t_j} + b^2_{c_i t_j} \right), \qquad b^2_{c_i t_j} = -b_{th} \cdot \max_n \left( y^1_{c_i t_j} \right)    (4)

where b_{th} is a threshold coefficient and f_{hardlim}(\cdot) is a hard limit function given by

f_{hardlim}(u) = \begin{cases} 1 & \text{if } u \ge 0 \\ 0 & \text{if } u < 0 \end{cases}
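Read node-wise, the thresholding of Eq. (4) can be sketched as follows; b_th = 0.75 is the value chosen in Section 4.3, and treating the first-layer outputs as a per-node vector is our reading of the max_n term.

```python
def hardlim(u):
    """Hard limit function of Eq. (4): 1 if u >= 0, else 0."""
    return 1 if u >= 0 else 0

def second_layer(y1_nodes, b_th=0.75):
    """Eq. (4): keep only the first-layer outputs within b_th of the maximum."""
    bias = -b_th * max(y1_nodes)
    return [hardlim(y + bias) for y in y1_nodes]

# Illustrative first-layer outputs for the "hour" nodes: only the strongly
# associated node around 8 a.m. survives the threshold.
print(second_layer([0.05, 0.4, 0.9, 0.4, 0.05]))   # [0, 0, 1, 0, 0]
```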

3.3.3. Third Layer

The third layer (Figure 4) functions as a logical AND operation among the context sets of the different attributes. Therefore, the output of the third layer is defined as

y^3_{t_j} = f_{hardlim}\left( v^{3T}_{t_j} \cdot y^2_{t_j} + b^3_{t_j} \right), \qquad b^3_{t_j} = \begin{cases} -\sum_{i=1}^{L} v^3_{c_i t_j} + 0.5 & \text{if } \sum_{i=1}^{L} v^3_{c_i t_j} \ge 1 \\ 0 & \text{otherwise} \end{cases}    (5)

where the third layer's weights are obtained by

v^3_{c_i t_j} = \begin{cases} 1 & \text{if } \max_n \left( y^1_{c_i t_j} \right) - \min_n \left( y^1_{c_i t_j} \right) > \gamma_{th} \\ 0 & \text{otherwise} \end{cases}    (6)

where \gamma_{th} is a threshold constant that determines whether an association is constructed or not.

When y^3_{t_j} \neq 0 and the user accepts task j, the association learning between task j and the corresponding contexts occurs. The association learning is based on the Instar Rule, which allows weight decay only when the instar (in this algorithm, y^1_{c_{i_k} t_j}) is active.49

Figure 4. The third layer of the task–situation association system.
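A sketch of the third layer follows. Here γ_th = 0.2 is the value used in the experiment, y2[i] stands for the thresholded output of attribute i for the currently observed context, and the handling of the case where no attribute is associated yet is our assumption, since the bias definition of Eq. (5) leaves it implicit.

```python
def hardlim(u):
    return 1 if u >= 0 else 0

def third_layer_weight(y1_nodes, gamma_th=0.2):
    """Eq. (6): attribute i is wired into the AND (weight 1) only when its
    first-layer outputs spread by more than gamma_th, i.e. an association exists."""
    return 1 if (max(y1_nodes) - min(y1_nodes)) > gamma_th else 0

def third_layer_output(y2, v3):
    """Eq. (5): logical AND over the attributes selected by v3."""
    if sum(v3) < 1:
        return 0                      # assumption: no association yet, so no suggestion
    bias = -sum(v3) + 0.5
    return hardlim(sum(v * y for v, y in zip(v3, y2)) + bias)

# "Day of week", "hour", and "user state" are all associated (v3 = 1) and all match
# the current situation, so the task is suggested; one mismatch switches it off.
v3 = [1, 1, 1]
print(third_layer_output([1, 1, 1], v3))   # 1 -> suggest the task
print(third_layer_output([1, 0, 1], v3))   # 0 -> do not suggest
```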

4. IMPLEMENTATION AND EXPERIMENT WITH THE CRF3 SYSTEM

4.1. CRF3 System

Figure 5 describes the functions of the CRF3 system. The CRF3 system consists of several modules that implement an intelligent life supporting system. The CRF3 system can interact with a user naturally using simple voice recognition, speech synthesis, and emotional expressions. The vision system of CRF3 can detect several colors as well as the human face, and the RF-ID system can identify the users and items that are around the experimental area.

4.1.1. The CRF3 Robot Module

Figure 6 presents an overview of CRF3. The size of the mechanism is 115 mm (W) × 213 mm (H) × 74 mm (D). Figure 7 shows the motor configuration of CRF3 and the connection between nine RC servomotors and the control points on the eyes, eyebrows, eyelids, and mouth. There are two DOFs at the eyes, four DOFs at the eyebrows, two DOFs at the eyelids, and one DOF at the mouth. A total of three BLDC motors are added to control the neck's roll, pitch, and yaw motion independently. We also integrated several sensor systems on the robot to realize a human–robot interaction system. Sixteen-bit microcontrollers (H8-3048F from Hitachi) are used for the motor control and sensor signal processing blocks. Utilizing this mechanical structure, CRF3 can show several facial expressions (see Figure 8). And the emotional engine is connected with the facial expressions.50

4.1.2. Vision Sensor Module

CRF3 incorporates vision modules that, at this stage, perform face or color-ball detection based on color extraction. Using the number of pixels in the face region, it also detects whether the user is approaching or retreating from CRF3. This sensor module runs on separate computers and communicates with the host via the Ethernet because the module requires much computing power.


Figure 5. Character Robot Face 3 (CRF3) system.

Figure 6. Character Robot Face 3.


4.1.3. Tactile Sensor Module

A tactile sensor module is implemented with two touch sensors that are placed side by side. If both of them are touched sequentially, it is recognized as a pet signal, and if only one of them is touched, it is recognized as a hit signal by the microcontroller.

4.1.4. Speech Recognition and Synthesis Modules

As speech is one of the high-level modalities in human communication, a speech recognition module (Microsoft Speech SDK 5.1) and a speech synthesis module (NEC's speech SDK 4.6) are implemented in the CRF3 system. This module recognizes and synthesizes in both English and Japanese. Registered words or sentences can be recognized, and CRF3 replies in voice as well as representing its emotional state via facial motions. The lip and mouth motion is synchronized with CRF3's voice.

Figure 7. Motor configuration of CRF3.

Figure 8. Facial expressions of CRF3: (a) happiness, (b) sadness, (c) anger, and (d) surprise.

4.1.5. RFID Module

An RFID module is implemented in CRF3. The RFID (Radio Frequency Identification) module (Intellitag SDK3 from Sharp) consists of three antennas, a reader, and RFID tags attached to the user and items. The antennas emit radio signals to activate a tag and read data from it. The antennas are the conduits between the tags and the reader, which controls the system's data acquisition and communication at a 2448.875 MHz frequency. This RFID module achieves read ranges of approximately 2 m but is highly directional due to the adoption of the microwave radio frequency. When ID tags are attached to the user and items, the system detects where the user or items are.

4.2. Experimental Setup

The experiment was done in a room at the laboratory, as shown in Figure 9. CRF3 is deployed along with a computer display to support the user's PC tasks such as e-mail checking and web browsing. Two RFID antennas (Antenna 1 and 2) are set in front of the door to detect the user's entrance and exit. One antenna (Antenna 3) is set beside the computer display and CRF3 to detect the user sitting in front of the PC and CRF3.

Figure 9. Experimental environment.

4.3. Tasks and Set of Contexts

The CRF3 system monitors the contexts of the user's situation. Monitored contexts are categorized into five attributes: identity, time, state, activity, and user's emotion. The state attribute is the one including the user's location and motion to change his or her location. Considering the continuity of the context in each attribute, time-based contexts such as hour and day of the week are categorized as continuous contexts, and the contexts in the other attributes in this experiment are categorized as discrete contexts. Accordingly, each context in hour and day of the week affects neighboring contexts in the same attribute through the RBFN functions when the context is stimulated. Other contexts do not affect neighboring contexts in the same attribute. Here are the tasks and the set of contexts (a configuration sketch follows the list):

Task
• t1–t6: 1: web browsing, 2: e-mail checking, 3: item checking, 4: play classic, 5: play pop, 6: play rock

Identity attribute (discrete)
• c01–c02: user (1: user1, 2: user2)

Time attribute (continuous)
• c1: hour (0 o'clock–24 o'clock)
• c2: day of week (Sunday–Saturday)

State attribute (discrete)
• c31–c35: location (1: outside, 2: entrance, 3: in the room, 4: in front of PC, 5: exit)

Activity attribute (discrete)
• c41–c47: executing task (1: nothing, 2: web browsing, 3: e-mail checking, 4: item checking, 5: play classic, 6: play pop, 7: play rock)

User emotion attribute (discrete)
• c50–c52: user emotion (1: neutral, 2: happy, 3: sad)
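For illustration, the tasks and context attributes above could be captured in a configuration such as the following sketch; the data structure and the continuous/discrete flags are our own encoding, not code from the CRF3 system.

```python
# Illustrative encoding of the experiment's tasks and context attributes.
TASKS = ["web browsing", "e-mail checking", "item checking",
         "play classic", "play pop", "play rock"]

CONTEXT_ATTRIBUTES = {
    "identity":    {"type": "discrete",   "values": ["user1", "user2"]},
    "hour":        {"type": "continuous", "values": list(range(24))},
    "day of week": {"type": "continuous",
                    "values": ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]},
    "state":       {"type": "discrete",
                    "values": ["outside", "entrance", "in the room",
                               "in front of PC", "exit"]},
    "activity":    {"type": "discrete",   "values": ["nothing"] + TASKS},
    "emotion":     {"type": "discrete",   "values": ["neutral", "happy", "sad"]},
}
```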

There are six tasks that CRF3 can initiate and terminate in the experiments. The context of the identity attribute can be detected by the RFID tag attached to the user. The antenna reads the ID number, and the system understands who the user is according to the detected ID number. The context of the time attribute can be sensed by the computer's internal clock. Considering the tasks we use in this experiment, only hour- and day of the week-attributed contexts are perceived. The contexts of the user state attribute are sensed by the RFID system. Figure 10 and Table I show the diagram of the user state and transitions. For example, when the user is outside (S1) and then antenna 2 or 3 detects the user (T2 or T3), the context of the user state switches to the "entrance" context (S2). Sequentially, when no antenna detects the user (T0), the context switches to the "in the room" context (S3). Alternatively, when the user is detected by antenna 3 (T3), the context switches to "in front of PC" (S4).
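The state transitions spelled out in this example can be sketched as a small transition table; only the transitions mentioned in the text are encoded here, and the full set is in Figure 10 and Table I.

```python
# Partial sketch of the RFID-driven user-state machine; triggers T0-T3 follow Table I.
TRANSITIONS = {
    ("outside", "T2"): "entrance",
    ("outside", "T3"): "entrance",
    ("entrance", "T0"): "in the room",      # no antenna detects the user
    ("entrance", "T3"): "in front of PC",   # detected beside the PC display and CRF3
}

def next_state(state, trigger):
    """Return the new user-state context, or keep the current one for unlisted cases."""
    return TRANSITIONS.get((state, trigger), state)

print(next_state("outside", "T2"))    # entrance
print(next_state("entrance", "T3"))   # in front of PC
```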


The contexts of the activity attribute are obtained from the tasks the robot performs. If the currently performing task and the associating task are disjoint, the tasks are able to associate with each other. In this experiment the association can occur between web browsing, e-mail checking, and playing music (rock, pop, or classic).

The context of the user emotion attribute is inferred by the human emotion inference system. For simplification, in this experiment, we only consider that the user's emotion is inferred from emotional sentences. There are two user emotion nodes: happy and sad. It is supposed that positive comments, as shown in Table II, have excitatory relationships to happy, whereas negative comments have relationships to sad. In this article, the simplified emotional engine of CRF3 is used.50

The system parameters, α of the learning gain, b_th of the threshold coefficient of the second layer, and γ_th of the threshold constant of the third layer, are selected as 0.1, 0.75, and 0.2, respectively.
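Collected in one place, these reported parameter values look as follows (the variable names are ours):

```python
# Parameter values reported for the experiment.
PARAMS = {
    "alpha": 0.1,      # learning gain in Eq. (3)
    "b_th": 0.75,      # second-layer threshold coefficient in Eq. (4)
    "gamma_th": 0.2,   # third-layer threshold constant in Eq. (6)
}
```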

4.4. Experimental Results

The proposed task selection mechanism is implemented in the CRF3 system,51 and an experiment was run to investigate the learning process of the association between one task and one situation with the user's feedback. In the experiment, the robot suggests "item checking" in the situation that consists of the following contexts:

(1) Saturday or Sunday (day of the week)
(2) 8 a.m. (time)
(3) Exit (user state)

For example, if the user would like to carry specific items such as the Bible or a tennis racket on a weekend morning, he sometimes commands the robot to check his items when he is leaving. Then CRF3 asks the user whether it should check the items or not when the user is leaving, as shown in Figure 11. Based on the answer of the user, the task–situation association is updated automatically. In this experiment, the user activity and emotion attributes are not shown here because the "item checking" task has nothing to do with those attributes. (Actually, "nothing" and "neutral" are associated with this task.)

Figure 10. Diagram of the user state and transition.


Table I. Definitions of user states and transitions.

User state              Transition
S1  outside             T0  disappear
S2  entrance            T1  detected at antenna 1
S3  in the room         T2  detected at antenna 2
S4  in front of PC      T3  detected at antenna 3
S5  exit

Table II. Sensor parameters.

User emotion    Emotional sentence

Happy           Positive comments:
                • I feel very good!
                • I am very happy!
                • Great!
                • Why not do it! (response to a question)

Sad             Negative comments:
                • I feel awful!
                • I am very sad!
                • What the hell are you suggesting? (response to a question)

Figure 11. Checking item.



Figures 12, 13, and 14 show every context output of the first layer accordingto the learning trial. For the first outputs, the second and third layers filter theweak associated contexts. If the node value multiplied the output of the secondlayer by the associating weight of the third layer and the result is equal to 1, thesystem understands that this context is associated with the task ~checking item!. Ifthe system understands that the context is not associated with a task, vice versa~see Figures 15, 16, and 17!. In the first and second learning trials, the user com-manded the robot to perform checking item when he left. As a result, only thecontext of 8 o’clock was associated. Consequently, in the third trial, the robotsuggested checking item at 8 a.m., Monday, in the room. Hence, the user deniedthe robot’s suggestion followed by negative learning. The context of “exit” wasassociated, and the association between the context of “8 a.m.” and checking itemwas closed. In the fourth learning trial, the robot suggested the task at 9 p.m. onTuesday when the user was leaving. This suggestion was obviously denied, and,as a result, all context association has been closed. In the fifth and sixth learningtrials, the user command at 8 a.m. on Saturday and Sunday associates all contextswith the checking item. After that, the robot successfully suggested to the user atthe right time because of the high association between the situation and checking

Figure 12. Output of the first layer for the context attribute "day of the week."


Figure 13. Output of the first layer for the context attribute "hour."

Figure 14. Output of the first layer for the context attribute "user state."


Figure 15. Node value of "day of the week" after being filtered in the second and third layers (1: associated, 0: not associated).

Figure 16. Node value of "hour" after being filtered in the second and third layers (1: associated, 0: not associated).


The detailed learning conditions, including the positive or negative indication and the association results, are described in Table III.

4.5. Evaluation of the Situation–Task Association Mechanism

To confirm whether the system learns the association between a situation and a task, and consequently suggests the right task in the right situation, we evaluated the association system. The system can be evaluated by the probability that the robot suggests the task in the right situation.

Figure 17. Node value of "user state" after being filtered in the second and third layers (1: associated, 0: not associated).

Table III. Learning process.

Trial  Learning condition    P/N  Association result
1      Sat., 8 a.m., exit    P    Sat. (0), Sun. (0), 8 a.m. (0), exit (0)
2      Sun., 8 a.m., exit    P    Sat. (0), Sun. (0), 8 a.m. (1), exit (0)
3      Mon., 8 a.m., room    N    Sat. (0), Sun. (0), 8 a.m. (0), exit (1)
4      Tue., 9 p.m., exit    N    Sat. (0), Sun. (0), 8 a.m. (0), exit (0)
5      Sat., 8 a.m., exit    P    Sat. (1), Sun. (0), 8 a.m. (1), exit (1)
6      Sun., 8 a.m., exit    P    Sat. (1), Sun. (1), 8 a.m. (1), exit (1)
7      Sat., 8 a.m., exit    P    Sat. (1), Sun. (0), 8 a.m. (1), exit (1)
8      Sun., 8 a.m., exit    P    Sat. (1), Sun. (1), 8 a.m. (1), exit (1)
9      Sat., 8 a.m., exit    P    Sat. (1), Sun. (1), 8 a.m. (1), exit (1)
10     Sun., 8 a.m., exit    P    Sat. (1), Sun. (1), 8 a.m. (1), exit (1)


At the same time, we also have to consider the false-alarm case, in which the robot suggests a task in the wrong situation because of incorrect or incomplete associations. The evaluation function, composed of these probability values, is written as

f_eval = P(C) / (P(C) + P(E))                                            (7)

where P(E) = P(F) + P(M). Here, P(C) is the probability that the robot suggests the task when the user intends to execute it, P(M) is the probability that the robot does not suggest the task when the user intends to execute it, and P(F) is the probability that the robot wrongly suggests the task when the user does not intend to execute it.

In the experiment, this evaluation function is computed whenever associative learning occurs.

Suppose that the user always commands or accepts the robot's suggestion in the intended situation with a fixed set of contexts. P(C), P(M), and P(F) can then be given by

P(C) = 1/T   if the task is suggested in the current intended situation
     = 0     otherwise

P(M) = 1/T   if the task is not suggested in the current intended situation
     = 0     otherwise

P(F) = F/T

where T is the total number of context combinations since the last associative learning was activated and F is the total number of false alarms since the last associative learning was activated.
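As a worked sketch of Eq. (7) under the definitions above (the function and variable names are ours), the evaluation value can be computed from simple counts:

def evaluate(suggested_in_intended, false_alarms, total_combinations):
    """f_eval = P(C) / (P(C) + P(E)) with P(E) = P(F) + P(M), following Eq. (7)."""
    T = total_combinations
    p_c = 1.0 / T if suggested_in_intended else 0.0
    p_m = 0.0 if suggested_in_intended else 1.0 / T
    p_f = false_alarms / T
    denom = p_c + p_f + p_m
    return p_c / denom if denom > 0 else 0.0  # guard against 0/0 (our assumption)

# Example with illustrative numbers: suggested correctly once, one false alarm,
# 28 context combinations since the last learning event.
print(evaluate(True, false_alarms=1, total_combinations=28))  # 0.5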

Figure 18 shows the result of the evaluation function for the "item checking" case. After the seventh learning trial, the evaluation value decreased because, in the day-of-the-week context, the peak for Saturday increased while the value for Sunday decreased, so the threshold output (the output of the second layer) corresponding to Sunday went to zero. This missed suggestion disappeared after the eighth trial. Note that this result shows that the association varies with the learning input. Therefore, even if the user changed the task–situation relation, the system could adapt to the user's changing preferences.

The evaluation function converges after the eighth learning trial, and the robot can then suggest the item checking task at the right time.

This experimental result is not an optimal case. If the learning gains were tuned so that every context could be associated simultaneously, the number of learning trials could be smaller. Nevertheless, this result shows that the robot system could adapt to the user through commands and interaction.


5. CONCLUSION

In the future ubiquitous network, socially interactive robots in the home environment will be expected not only to show artificial emotional expressions but also to play a role as intelligent home agents that are able to infer the users' situations via the ubiquitous network system and suggest or provide tasks that satisfy the users' intentions. To develop such a robot system, there are two main challenges: (1) how to recognize the user's situation, and (2) when, where, and how the system provides or suggests the task the user intends according to its understanding of the situation. Our research focused on the second challenge. We designed a task selection mechanism based on the user's situation as well as robot emotions.

The proposed situation-based task selection mechanism has four main component systems: the task–situation association system, the interaction system, the task selector, and the context awareness system. Stimulated by the user's direct commands or by context awareness, this mechanism associates a user's situation with a task to perform. The strength of the association can be changed by the user's direct commands or by the user's acceptance or denial of a suggestion. The user's situation consists of a set of contexts that have five kinds of attributes: identity, location, time, activity, and emotion. The task–situation association system consists of a three-layer neural network model using radial basis function networks (RBFNs) and an associative learning rule. The proposed mechanism was implemented in the CRF3 system, and the experiment demonstrated that the robot system performed feasible operations.

This research was a first step toward designing socially interactive robots as intelligent life-supporting systems in the home environment. The current system mainly aims to understand when a task should be started, but not precisely how long the user requests the task to be performed.

Figure 18. Evaluation result.


This issue is critical for task management, which must recognize which tasks, and how many, are being executed. Therefore, the task execution duration should be considered in the next step. In addition, regarding affect infusion processes or behaviors such as heuristic or emotional ones, we must consider a task suggestion system based on the robot's emotion memory, which works as an external stimulus to select the user's task when no tasks are associated with the current situation.

Acknowledgments

This research was supported by a joint research project on Micromechatronics. We thank all the members of this project and the staff of the Chubu Science and Technology Center.

References

1. Breazeal C. Emotion and sociable humanoid robots. Int J Hum Comput Stud 2003;59:119–155.

2. Miwa H, Okuchi T, Takanobu H, Takanishi A. Development of a new human-like head robot WE-4. In: Proc 2002 IEEE/RSJ Int Conf on Intelligent Robots and Systems (IROS2002), Swiss Federal Institute of Technology Lausanne (EPFL), Switzerland; 2002. pp 2443–2448.

3. Shibata T, Tanie K. Physical and affective interaction between human and mental commit robot. In: Proc 2001 IEEE Int Conf on Robotics & Automation (ICRA 2001), Seoul, Korea; 2001. pp 2572–2577.

4. Kanda T, Hirano T, Eaton D, Ishiguro H. Interactive robots as social partners and peer tutors for children: A field trial. Hum Comput Interact 2004;19:61–84.

5. Burgard W, Cremers AB, Fox D, Hahnel D, Lakemeyer G, Schulz D, Steiner W, Thrun S. Experiences with an interactive museum tour-guide robot. Artif Intell 1999;114:3–55.

6. King S, Weiman C. Helpmate autonomous mobile robot navigation system. In: Proc SPIE Conf on Mobile Robots, Boston, MA; 1990. pp 190–198.

7. Pollack ME, Brown L, Colbry D, McCarthy CE, Orosz C, Peintner B, Ramakrishnan S, Tsamardinos I. Autominder: An intelligent cognitive orthotic system for people with memory impairment. Robot Auton Syst 2003;44:273–282.

8. Pollack ME, McCarthy CE, Tsamardinos I, Ramakrishnan S, Brown L, Carrion S, Colbry D, Orosz C, Peintner B. Autominder: A planning, monitoring, and reminding assistive agent. In: Proc Seventh Int Conf on Intelligent Autonomous Systems (IAS-7), Marina del Rey, CA; 2002.

9. Asoh H, Hayamizu S, Hara I, Motomura Y, Akaho S, Matsui T. Socially embedded learning of the office-conversant mobile robot Jijo-2. In: Proc 15th Int Joint Conf on Artificial Intelligence (IJCAI-97), Nagoya, Japan; 1997. pp 880–885.

10. Jung MJ, Arai F, Hasegawa Y, Fukuda T. Mood and task coordination of home robots. In: Proc 2003 IEEE Int Conf on Robotics & Automation (ICRA2003), Taipei, Taiwan; 2003. pp 250–255.

11. Saha D. Pervasive computing: A paradigm for the 21st century. IEEE Comput Soc 2003;36(3):25–31.

12. Kidd CD. The aware home: A living laboratory for ubiquitous computing research. In: Proc Second Int Workshop on Cooperative Buildings (CoBuild 1999), Pittsburgh, PA; 1999. pp 191–198.

13. Sakamura K. Bibliography of the TRON project (1984–1994). In: Proc 11th TRON Project International Symp, Tokyo, Japan; 1994. pp 146–173.

14. Schilit B, Adams N, Want R. Context-aware computing applications. In: IEEE Workshop on Mobile Computing Systems and Applications, Santa Cruz, CA; 1994. pp 85–90.


15. Dey AK, Abowd GD. Towards a better understanding of context and context-awareness. GVU Technical Report GIT-GVU-99-22. Atlanta, GA: College of Computing, Georgia Institute of Technology; 1999. Available at: ftp://ftp.cc.gatech.edu/pub/gvu/tr/1999/99-22.pdf.

16. Dey AK. Understanding and using context. Pers Ubiquitous Comput 2001;5:4–7.

17. Want R, Hopper A, Falcao V, Gibbons J. The active badge location system. ACM Trans Inform Syst 1992;10:91–102.

18. Want R, Schilit B, Norman A, Gold R, Goldberg D, Petersen K, Ellis J, Weiser M. An overview of the PARCTAB ubiquitous computing environment. IEEE Pers Commun 1995;2:28–43.

19. Schmidt A, Adoo KA, Takaluoma A, Tuomela U, Laerhoven KV, Van de Velde W. Advanced interaction in context. In: Proc First Int Symp on Handheld and Ubiquitous Computing, Karlsruhe, Germany; 1999. pp 89–101.

20. Edwards WK, Grinter RE. At home with ubiquitous computing: Seven challenges. In: Proc Third Int Conf on Ubiquitous Computing, Atlanta, GA; 2001. Lecture Notes in Computer Science 2201. pp 256–272.

21. Satyanarayanan M. Pervasive computing: Vision and challenges. IEEE Pers Commun 2001;8:10–17.

22. Kobayashi H, Ichikawa Y, Tsuji T, Kikuchi K. Development on face robot for real facial expressions. In: Proc 2001 IEEE/RSJ Int Conf on Intelligent Robots and Systems (IROS2001), Maui, Hawaii; 2001. pp 2215–2220.

23. Ekman P, Friesen W. Measuring facial movement with the facial action coding system. In: Ekman P, editor. Emotion in the human face. Cambridge, UK: Cambridge University Press; 1982. pp 178–211.

24. Miwa H, Okuchi T, Takanobu H, Takanishi A. Development of a new human-like head robot WE-4. In: Proc 2002 IEEE/RSJ Int Conf on Intelligent Robots and Systems (IROS2002), Swiss Federal Institute of Technology Lausanne (EPFL), Switzerland; 2002. pp 2443–2448.

25. ASIMO. Honda Motor Co. Inc.; 2005. Available at: http://asimo.honda.com/.

26. QRIO. Sony Co.; 2006. Available at: http://www.sony.net/SonyInfo/QRIO/technology/index_nf.html.

27. Humanoid Robotics Project. Manufacturing Science and Technology Center; 2003. Available at: http://www.mstc.or.jp/hrp/main.html.

28. ROBONAUT. NASA-Johnson Space Center; 2006. Available at: http://robonaut.jsc.nasa.gov/robonaut.html.

29. Terada K, Ohmura Y, Kuniyoshi Y. Analysis and control of whole body dynamic humanoid motion: Towards experiments on a roll-and-rise motion. In: Proc 2003 IEEE/RSJ Int Conf on Intelligent Robots and Systems (IROS2003), Las Vegas, NV; 2003. pp 1382–1387.

30. AIBO. Sony Co.; 2006. Available at: http://www.sony.net/Products/aibo/.

31. Robot Technology at AIST. Advanced Industrial Science and Technology (AIST); 2006. Available at: http://www.aist.go.jp/aist_e/aist_today/2003_09/robot_01.html.

32. Endres H, Feiten W, Lawitzky G. Field test of a navigation system: Autonomous cleaning in supermarkets. In: Proc 1998 IEEE Int Conf on Robotics & Automation (ICRA), Leuven, Belgium; 1998. pp 1779–1781.

33. RHINO. Intelligent Autonomous Systems Group at the Computer Science Department III of the University of Bonn; 2006. Available at: http://www.cs.uni-bonn.de/~rhino/.

34. Simmons R. XAVIER: An autonomous mobile robot on the web. In: Proc 1998 IEEE/RSJ Int Conf on Intelligent Robots and Systems (IROS'98) Workshop on Web Robots, Victoria, Canada; 1998. pp 43–48.

35. Kanda T, Hirano T, Eaton D, Ishiguro H. A practical experiment with interactive humanoid robots in a human society. In: Proc Third IEEE Int Conf on Humanoid Robots (Humanoids 2003), Karlsruhe, Germany; 2003.

36. Okuno HG, Nakadai K, Hidai K, Mizoguchi H, Kitano H. Human-robot interaction through real-time auditory and visual multiple-talker tracking. In: Proc 2001 IEEE/RSJ Int Conf on Intelligent Robots and Systems (IROS2001), Maui, Hawaii; 2001. pp 1402–1409.


37. Kobayashi T et al. Inter-module cooperation architecture for interactive robot. In: Proc IROS2002, vol 3, Swiss Federal Institute of Technology Lausanne (EPFL), Switzerland; 2002. pp 2286–2291.

38. Scassellati BM. Foundations for a theory of mind for a humanoid robot. Doctoral thesis. Cambridge, MA: Massachusetts Institute of Technology; 2001.

39. Kuniyoshi Y, Inaba M, Inoue H. Learning by watching: Extracting reusable task knowledge from visual observation of human performance. IEEE Trans Robot Autom 1994;10:799–822.

40. Atkeson CG, Schaal S. Robot learning from demonstration. In: Proc 14th Int Conf on Machine Learning (ICML '97), Nashville, TN; 1997. pp 12–20.

41. Andry P, Gaussier P, Moga S, Banquet JP, Nadel J. Learning and communication via imitation: An autonomous robot perspective. IEEE Trans Syst Man Cybern 2001;31:431–442.

42. Nicolescu M, Mataric M. Learning and interacting in human-robot domains. IEEE Trans Syst Man Cybern 2001;31:419–430.

43. Schaal S. Is imitation learning the route to humanoid robots? Trends Cogn Sci 1999;3:233–242.

44. Kozima H. Infanoid: A babybot that explores the social environment. In: Dautenhahn K, Bond AH, Canamero L, Edmonds B, editors. Socially intelligent agents: Creating relationships with computers and robots. Boston/Dordrecht/London: Kluwer Academic Publishers; 2002. pp 157–164.

45. Nagai Y, Hosoda K, Asada M. Joint attention emerges through bootstrap learning. In: Proc IEEE/RSJ Int Conf on Intelligent Robots and Systems, Las Vegas, NV; 2003. pp 168–173.

46. Pollack ME, Brown L, Colbry D, Orosz C, Peintner B, Ramakrishnan S, Engberg S, Matthews JT, Dunbar-Jacob J, McCarthy CE, Thrun S, Montemerlo M, Pineau J, Roy N. Pearl: A mobile robotic assistant for the elderly. In: Proc AAAI Workshop on Automation as Eldercare, Edmonton, Alberta; 2002. pp 85–92.

47. Suchman LA. Plans and situated actions: The problem of human-machine communication. Cambridge, UK: Cambridge University Press; 1987.

48. Moody J, Darken CJ. Fast learning in networks of locally tuned processing units. Neural Comput 1989;1:281–294.

49. Hagan MT, Demuth HB, Beale MH. Associative learning: Instar rule. In: Neural network design. Boston, MA: PWS Publishing; 1995. pp 13-11–13-14.

50. Fukuda T, Jung M-J, Nakashima M, Arai F, Hasegawa Y. Facial expressive robotic head system for human-robot communication and its application in home environment. Proc IEEE 2004;92:1851–1865.

51. Arai F, Tachibana D, Jung MJ, Fukuda T, Hasegawa Y. Development of character robots for human-robot mutual communication. In: Proc 2003 IEEE Int Workshop on Robot and Human Interactive Communication (ROMAN2003); 2003. CD-ROM.
