

Learning Interaction Rules through Compression of Sensori-Motor Causality Space

Takatsugu Kuriyama* **  Takashi Shibuya*  Tatsuya Harada*  Yasuo Kuniyoshi*

*The University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo, 113-8656 Japan.
{kuriyama, takashi, harada, kuniyosh}@isi.imi.i.u-tokyo.ac.jp
**JSPS Research Fellow

Abstract

A human partner returns a specific response after a robot performs a specific social cue. We define this as an interaction rule. Toward natural communication, we focus on social games played between an infant and a caregiver. For social games, two problems must be solved: (1) the robot has to find the rules in a high-dimensional space with a limited number of exemplars; (2) the interaction should not be divided into a learning phase and an interaction phase. Previous work on learning interaction rules did not attack both problems. In this paper, we solve them simultaneously. The robot computes a sensori-motor causality space using partial correlation analysis, and it compresses the causality space using canonical correlation analysis to find the rules in high dimensions. A human-robot interaction experiment showed that a real robot with a camera and arms (3 DoF each) learns gesture interaction rules with a human in a high-dimensional sensori-motor space without dividing the interaction into learning/interaction phases.

1. Introduction

1.1 Background

Human-robot interaction has the potential to entertain us as a peer, beyond practical communication for collaborative work (Goodrich and Schultz, 2007) (Fong et al., 2003). Many robot platforms for communication entertainment have been developed, including the pet dog robot AIBO (Fujita, 2004), the seal therapy robot PARO (Wada and Shibata, 2006), the childcare robot PaPeRo (Osada et al., 2006), and the communicative humanoid Robovie (Mitsunaga et al., 2006). A problem then arises from the viewpoint of behavior: how should this kind of robot behave for natural communication?

Figure 1: Interaction rule: the correspondence between a robot's action in a context and a partner's response.

We can communicate with others naturally, so a promising way to build such a robot is to design it like a real human. Because a human cannot communicate as an adult at birth, we explore a key to communication in the developmental process, following cognitive developmental robotics (Asada et al., 2001) (Asada et al., 2009).

1.2 Social Games

We focus on social games played between an infant and a caregiver. Social games are a major form of communication during the first two years, including peek-a-boo, ball games, give-and-take, gonna-get-you, and point-and-name (Gustafson et al., 1979). Social games consist of rules (Ratner and Bruner, 1978) (Bruner and Sherwood, 1975), and infant-elicited social behaviors play a significant role (Stern, 1977). Regarding the infant as a robot and the caregiver as a human partner, we focus on the interaction rule, which is a relationship between a robot's action in a context and a partner's response (Fig. 1). A partner's social cue to the robot is included as a context of the rule.

Development of social games progresses through the four stages described below (Bruner and Sherwood, 1975) (Rome-Flanders and Cossette, 1995). These stages are actually continuous, not discrete switches. An infant gradually moves away from a passive role during the game.


Johansson, B., Şahin, E. & Balkenius, C. (2010). Proceedings of the Tenth International Conference on Epigenetic Robotics: Modeling Cognitive Development in Robotic Systems. Lund University Cognitive Studies, 149.


1. observing passively: The infant merely observes the caregiver passively; the caregiver may physically assist the infant to play.

2. taking part in one of the game's elements: The infant takes part in one of the game's elements and eventually grows to initiate more of them.

3. sharing the game's activities: Each player takes a turn in a well-organized fashion based on the conventions of the game.

4. generating modifications: The infant generates variations within the rules of the game. The infant has a sufficient understanding of the game's rule structure to be able to add new rules.

We now draw on infant mechanisms to build a robot. An infant behaves so as to confirm contingency: whether a toy responds to its own action or not (Watson, 1979). An infant also behaves so as to search for regularity (Rochat, 2001), and this regularity is considered to include contingency. Putting this together, psychological studies suggest that an infant's understanding of interaction rules can be regarded as testing behavior on the action-response relationship with evaluation of its regularity, and that the process of understanding is continuous rather than divided into exclusive stages. We now point out two problems in approaching this rule understanding with a robot.

1. high-dimensionality of interaction rules: context, action, and response in interaction rules are high-dimensional in the real world, e.g. high-resolution visual, tactile, and auditory sensors, and a large number of muscles. If a robot designer defines a small number of specific dimensions, the robot is specialized to some games, which goes against the diversity of social games. Learning in a high-dimensional space generally needs many exemplars, but the number of exemplars is small because an interaction rule easily changes when one of the dyad modifies the interaction.

2. learning/interaction continuity: as shown in the development of social games, the process of understanding interaction rules is continuous. The process should not be divided into exclusive phases such as a learning phase and an interaction phase, and the switching timing of such phases cannot even be specified in real social games because of rule modification.

1.3 Related Work

Several studies have attacked interaction rule understanding. Some studies assumed an imitation faculty to teach the robot's actions in the rules. Imitation learning has a correspondence problem: which of the robot's body parts, postures, and actions correspond to the partner's. Ogino et al. solved the posture correspondence problem by associating visual images of the partner with the robot's joint angles (Ogino et al., 2006), but the robot has to finish correspondence learning before rule-based interaction. Taniguchi et al. solved the action correspondence problem by segmenting and associating the partner's posture sequence with the robot's (Taniguchi et al., 2008), but body part recognition is predefined, the DoF of the rule is limited to 4, and the interaction is divided into three phases: segmentation, association, and rule-based interaction. Kuriyama et al. made the interaction sequential rather than divided into phases (Kuriyama and Kuniyoshi, 2008); the robot automatically determines whether the current interaction is imitative or rule-based, but postures are predefined and the DoF of the rule is limited to 4.

Others utilized a prediction/causality faculty without assuming imitation. Sumioka et al. showed that a robot acquires joint-attention-related behaviors (a kind of ruled interaction) with causality detection (Sumioka et al., 2009), but the DoF of the rule is limited to 15 and learning takes much time (about 100,000 to 1,000,000 sec.). Oudeyer et al. showed that a robot learns to play with toys (a kind of ruled interaction) with response prediction and learning progress evaluation (Oudeyer et al., 2007), but the DoF of the rule is limited to 11. Kuriyama et al. showed that a robot and a partner co-create rules with response prediction and response habituation (Kuriyama and Kuniyoshi, 2009), but the DoF of the rule is limited to 12.

Learning interaction rules in a high-dimensional space in a single sequence, without dividing the interaction into phases, remains unsolved and is a realistic setup in the real world.

2. Proposed Robot Model

2.1 Overview

We summarize our idea for attacking the two problems. The challenge in the first problem (high-dimensionality of interaction rules) is that the number of exemplars is limited because one of the dyad modifies the rules. This is a curse-of-dimensionality problem: a robot cannot properly estimate the probability distribution of exemplars in a high-dimensional space if the number of given exemplars is not many times the number of dimensions.

To attack this, we focus on dimensional compression. If the sensori-motor space is compressed into a small number of dimensions, we can decrease the number of exemplars needed.

The challenge in the second problem (learning/interaction continuity) is that, in real human-human interaction, there is no mode switching from learning action primitives to rule-based interaction. In a conventional approach, a robot learns the rules after it learns the primitives, with mode switching. To attack this, our strategy is to learn our own



Figure 2: A prediction method that evaluates the correlation between {context, action} and response.

and the partner's action primitives and the interaction rules that use them concurrently. Therefore our solution to the two problems is to adaptively compress the context, action, and response spaces while preserving their relationship.

2.2 Prediction and Causality

An interaction rule is the relationship between the robot's own action and the partner's response. From the robot's viewpoint, an action can be represented by a motor command, and the other's response can be represented by sensor information (e.g. visual or tactile information), so interaction rules can be represented by sensori-motor relationships. Context, i.e. the state of the dyad or the interactional history, is also considered. The robot memorizes all experiences of {context, action, response} (exemplars).

One method to find this sensori-motor relationship is to evaluate the correlation between {context, action} and response (Fig. 2); this is a prediction, because there is a time delay: a response occurs after {context, action} (Oudeyer et al., 2007) (Kuriyama and Kuniyoshi, 2009). Another is to evaluate the causality from action in context to response (Sumioka et al., 2009). Causality is the correlation between action and response after subtracting the context's effect on both action and response (Fig. 3). A prediction method is simple and useful but not well suited to active learning, because a causality method precisely evaluates the action's effect on the response, while a prediction method does not care which of context or action contributes to response prediction. So in a case where we focus on what a robot can do, as in active learning, a causality method is considered the better way to capture the sensori-motor relationship, and we therefore use it.

2.3 Interaction Rule Space Compression

We formulate the causality space. The causal relationship from action in context to response is the correlation between action and response from which the context's effects on action and response are subtracted, and this is represented as a correlation between the following

Figure 3: A causality method that evaluates the correlation between action and response from which the context's effect on action and response is subtracted.

X and Y:

    X = A - \hat{A}(C) = A - M_{C \to A} C
    Y = R - \hat{R}(C) = R - M_{C \to R} C        (1)

where C is a matrix that contains the full-dimensional context of all exemplars, A is a matrix that contains the full-dimensional action of all exemplars, R is a matrix that contains the full-dimensional response of all exemplars, and \hat{A}(C) and \hat{R}(C) are the estimates of A and R when C is given. M_{C \to A} and M_{C \to R} are obtained by the following linear regressions:

    M_{C \to A} = \arg\min_M |A - MC|
    M_{C \to R} = \arg\min_M |R - MC|        (2)
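The residualization of Eqs. (1)-(2) can be sketched with ordinary least squares. The sketch below is a minimal NumPy version, not the authors' implementation; it uses a row-per-exemplar convention, so the regression map is applied as C M rather than M C:

```python
import numpy as np

def causality_residuals(C, A, R):
    """Residualize action A and response R against context C (Eqs. 1-2).

    Each matrix has one exemplar per row. M_CA and M_CR are the
    least-squares regression maps from context to action/response; the
    residuals X, Y carry the action-response relationship with the
    context's linear effect removed.
    """
    M_CA, *_ = np.linalg.lstsq(C, A, rcond=None)  # argmin_M |A - C M|
    M_CR, *_ = np.linalg.lstsq(C, R, rcond=None)  # argmin_M |R - C M|
    X = A - C @ M_CA
    Y = R - C @ M_CR
    return X, Y
```

Because least-squares residuals are orthogonal to the column space of C, X and Y carry no linear context effect by construction.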

We now consider projections from the X and Y spaces to new 1-dimensional spaces:

    X'_i = X u_i
    Y'_i = Y v_i        (3)

where i = 1, 2, ..., D and D = \min(D_X, D_Y); D_X and D_Y are the numbers of dimensions of X and Y. We regard the new spaces X'_i and Y'_i as the compressed sensori-motor causality space. To ensure correlation between X'_i and Y'_i, and to obtain u_i and v_i, we apply canonical correlation analysis (CCA), which compresses the two spaces while preserving correlation. u_i and v_i are obtained as eigenvectors of the following eigenvalue problems:

    \Sigma_{XY} \Sigma_{YY}^{-1} \Sigma_{YX} u_i = \lambda_i \Sigma_{XX} u_i
    \Sigma_{YX} \Sigma_{XX}^{-1} \Sigma_{XY} v_i = \lambda_i \Sigma_{YY} v_i        (4)

where \Sigma_{xy} is the variance-covariance matrix between x and y, and \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_D \ge 0. \sqrt{\lambda_i} is the correlation coefficient between X'_i and Y'_i, so the compressed 1-dimensional subspace with the strongest causal relationship is (X'_1, Y'_1); the second is (X'_2, Y'_2), which is orthogonal to (X'_1, Y'_1), and so on. Therefore, picking a small number of X'_i and Y'_i compresses the sensori-motor causality space (Fig. 4). If the number of exemplars is less than D, we copy experienced exemplars with added noise to increase the number of virtual exemplars so that CCA can be solved.



Figure 4: An example of exemplar distribution in the (X'_1, Y'_1) space. Exemplars are distributed around the perfect correlation (the dashed blue line).

To consider context, we compress the C space. The (X'_i, Y'_i) space, which is a projection from the (X, Y) space, is independent of the C space because the context's effect is subtracted in the (X, Y) space. So we simply compress the C space without considering (X'_i, Y'_i).

We now consider projections from the C space to new 1-dimensional spaces:

    C'_i = C w_i        (5)

where i = 1, 2, ..., D_C, and D_C is the number of dimensions of C. We regard the new space C'_i as the compressed context space. To ensure a faithful representation of the exemplars' distribution, and to obtain w_i, we apply principal component analysis (PCA), which compresses the space while preserving the representation of the distribution. w_i is obtained as an eigenvector of the following eigenvalue problem:

    (\Sigma_{CC} - \mu_i I) w_i = 0        (6)

where \mu_1 \ge \mu_2 \ge \cdots \ge \mu_{D_C} \ge 0. \sqrt{\mu_i} is the standard deviation of C'_i, so the compressed 1-dimensional subspace with the widest distribution is C'_1; the second is C'_2, which is orthogonal to C'_1, and so on. Therefore, picking a small number of C'_i compresses the context space.

2.4 Action Selection

The robot confirms the stability of the causality given a context. The numbers of dimensions of (X'_i, Y'_i) and C'_i considered in action selection are D_{X'Y'} and D_{C'}. To consider context, we extract from the robot's memory the exemplars that satisfy:

    |C w_i - c| \le 0.5   (i = 1, ..., D_{C'})        (7)

where c is the given context. This means that we extract exemplars that are near the given context in

Figure 5: Exemplars that are similar to the given context. N_{X',i}(p) is the number of exemplars within the vertical green belt. N_{X'Y',i}(p) is the number of exemplars within the intersection of the vertical and horizontal green belts.

the C' space. We then place the extracted exemplars in the (X', Y') space and evaluate the stability of the causality (Fig. 5). We count the number of exemplars that satisfy:

    |X'_i - p| \le 0.5   (i = 1, ..., D_{X'Y'})        (8)

The count N_{X',i}(p) is the number of exemplars whose action is near the action that is p in X'_i. This measures how frequently the robot has performed actions similar to the abstract action p. We also count the number of exemplars that satisfy:

    |X'_i - p| \le 0.5
    |Y'_i - p| \le 0.5        (9)

The count N_{X'Y',i}(p) is the number of exemplars whose action is near the action that is p in X'_i, and whose response is near (within one standard deviation of) the perfect response under the causality. This measures how frequently the robot has performed actions similar to the abstract action p with the response lying on the causal relationship. The robot selects the abstract action p* with the highest causality stability as follows:

    p^* = \arg\max_p N_{X'Y',i}(p) / N_{X',i}(p)   (i = 1, ..., D_{X'Y'})        (10)

We convert the abstract action p* to the actual action a as follows:

    a = M_{C \to A} c + p^* w_i^{-1}        (11)

Then, the robot performs a.
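The counting of Eqs. (8)-(10) for a single compressed dimension i can be sketched as follows. Trying each candidate magnitude with both signs is our assumption, since p is a signed coordinate in X'_i (the experiment reports |p| values):

```python
import numpy as np

def select_abstract_action(Xp, Yp, p_candidates=(0.5, 1.5, 2.5), band=0.5):
    """Action selection by causality stability (Eqs. 8-10), one dimension.

    Xp, Yp: 1-D arrays of compressed action/response coordinates X'_i, Y'_i
    for exemplars already filtered by context (Eq. 7). The ratio
    N_{X'Y',i}(p) / N_{X',i}(p) estimates how reliably actions near p
    produced responses on the causal relationship.
    """
    best_p, best_stability = 0.0, -1.0
    for p in [s * m for m in p_candidates for s in (+1.0, -1.0)]:
        near_action = np.abs(Xp - p) <= band                  # Eq. 8
        n_x = near_action.sum()
        if n_x == 0:
            continue  # no exemplars near this abstract action
        n_xy = (near_action & (np.abs(Yp - p) <= band)).sum()  # Eq. 9
        stability = n_xy / n_x                                 # Eq. 10
        if stability > best_stability:
            best_p, best_stability = p, stability
    return best_p, best_stability
```

In the full model this ratio is evaluated for each i = 1, ..., D_{X'Y'} and the best (i, p) pair is chosen.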



3. Experiment

We investigate whether a robot based on the proposed model can learn interaction rules with a human in a high-dimensional sensori-motor space without dividing the interaction into learning/interaction phases. We also investigate what happens inside the robot during the interaction.

3.1 Setup

The experiment is face-to-face gesture interaction, a typical form of interaction (Fig. 6). We built a real robot with a camera on its head (Fig. 7). For human-robot gesture interaction, the robot's arms (3 DoF each) are activated. The motor configuration is shown in Fig. 8. The raw camera input is 640x480 colored pixels.

We define the C (context), A (action), and R (response) spaces in this setup. C is the sensory state at the moment an action starts. The space has 288 dimensions (visual sensory state: red, green, and blue values for each of the down-sampled 12x8 pixels) plus 6 dimensions (proprioceptive sensory state: rotational angles of the 6 motors). A is the posture when an action is finished; the space has 6 dimensions (rotational angles of the 6 motors). R is the sensory state when an action is finished; the space has 288 dimensions (visual sensory state: red, green, and blue values for each of the down-sampled 12x8 pixels). So the {C, A, R} (exemplar) space has 588 dimensions, and each dimension of {C, A, R} is normalized to [0, 1]. The number of exemplars that the robot can memorize is set to 1,176 (twice 588). If the number of experienced exemplars is less than the memory size of 1,176, they are copied with Gaussian noise (SD = 0.05) to fill the memory.
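The memory-filling step (copying experienced exemplars with Gaussian noise, SD = 0.05, up to 1,176 slots) can be sketched as below; clipping the noisy copies back to [0, 1] is our assumption, since each dimension is normalized to that range:

```python
import numpy as np

def fill_memory(exemplars, memory_size=1176, sd=0.05, rng=None):
    """Pad experienced exemplars with noisy copies up to the memory size.

    exemplars: (n, 588) array of normalized {C, A, R} vectors. Copies get
    Gaussian noise (SD = 0.05, matching the setup) and are clipped back
    to [0, 1] so they remain valid normalized exemplars.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = exemplars.shape[0]
    if n >= memory_size:
        return exemplars[:memory_size]
    idx = rng.integers(0, n, size=memory_size - n)
    copies = exemplars[idx] + rng.normal(0.0, sd, size=(memory_size - n, exemplars.shape[1]))
    return np.vstack([exemplars, np.clip(copies, 0.0, 1.0)])
```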

The action cycle is 1 second. At 0 seconds in a cycle, the robot observes c and determines an a within 0.25 seconds. At 0.25 seconds, it starts to move toward the posture. At 1 second, it stops in posture a and observes r.

The number of compressed sensori-motor causality dimensions is D_{X'Y'} = 2, and the number of compressed context dimensions is D_{C'} = 2. p = {0.5, 1.5, 2.5}. The compression (CCA and PCA calculation) is performed every 30 seconds. During the first 30 seconds, the robot moves randomly.

We define two interaction rules. One is that the human responds by waving a hand from side to side after the robot waves its right hand from side to side. The other is that the human responds by waving a hand up and down after the robot waves its right hand up and down. An examiner (one of the authors) tries to teach the first rule during the first 90 seconds and the second rule afterwards.

Figure 6: The experimental situation.

Figure 7: A real robot with a camera.

Figure 8: The motor configuration.

Figure 9: An example of camera input.

Figure 10: A down-sampled image (12x8 pixels).



Figure 11: Canonical loading on each DoF (action) in the (X'_1, Y'_1) space.

Figure 12: Canonical loading on each DoF (action) in the (X'_2, Y'_2) space.

3.2 Result

The robot gradually comes to understand the two rules. Fig. 11 and Fig. 12 show the canonical loading on each DoF of the robot in the (X'_1, Y'_1) and (X'_2, Y'_2) spaces. The canonical loading is the correlation between X and X'_i, and it shows the composition of the compressed space.

With the examiner's rule-based responses to the first rule during the first 90 sec., the canonical loading of R-Swing in (X'_1, Y'_1) increased at 90 sec., and the loading of R-Swing in (X'_2, Y'_2) clearly decreased at 90 sec. (Fig. 11). This means that the robot regards R-Swing as the strongest causality component and the first rule is represented in (X'_1, Y'_1). In the same way, the loading of R-UpDown in (X'_1, Y'_1) decreased at 90 sec., and the loading of R-UpDown in (X'_2, Y'_2) increased at 90 sec. (Fig. 11). This means that the robot regards R-UpDown as the second causality component. Since there is only one rule during the first 90 sec., Fig. 13 shows that causality stability remains at a high level during that period. Fig. 14 shows that the robot selected actions within (X'_1, Y'_1), meaning that the interaction is governed by the first rule seen in Fig. 11. Fig. 15 shows that the robot successfully predicts the response in (X'_1, Y'_1) during the first 90 sec., while Fig. 16 shows that the response is not on (X'_2, Y'_2) because the current rule is on (X'_1, Y'_1). Fig. 17 shows that the

Figure 13: Causality stability of selected actions, calculated as N_{X'Y',i}(p) / N_{X',i}(p).

Figure 14: Ordinal number i of the (X', Y') space dimension that corresponds to the selected actions.

Figure 15: Prediction error on the response in (X'_1, Y'_1): the distance between the observed r and the estimated r when (c, a) is given.

Figure 16: Prediction error on the response in (X'_2, Y'_2): the distance between the observed r and the estimated r when (c, a) is given.

Figure 17: Absolute value of the selected abstract action p: the deviation of selected actions in the (X'_i, Y'_i) space. Larger |p| means stronger causality, while p = 0 means no causality.



robot selected actions from a wide range of p. This means that the robot is confident in the causality and knows that a wide range of actions carries the causality.

With the examiner's rule-based responses to the second rule after the first 90 sec., the canonical loadings of R-Swing and R-UpDown in (X'_1, Y'_1) do not change clearly, but the loadings of R-UpDown and R-Bend in (X'_2, Y'_2) decrease and then increase, while the loading of R-Swing in (X'_2, Y'_2) increases and then decreases (Fig. 11, Fig. 12). This means that the second rule is represented in (X'_2, Y'_2). Fig. 13 shows that causality stability decreases and then increases after the first 90 sec.: the first rule has now stopped and the robot becomes confused, but it finally finds the new rule. Fig. 14 shows that the robot moves to actions selected within (X'_2, Y'_2), meaning that the robot has found the rule represented in (X'_2, Y'_2) in Fig. 12. Fig. 16 shows that the robot successfully predicts the response in (X'_2, Y'_2) after the first 90 sec., while Fig. 15 shows that the response is not on (X'_1, Y'_1) because the first rule has expired. Fig. 17 shows that the robot selected only actions with |p| = 0.5 from 150 sec. to 180 sec. Considering that p = 0 means no causality, the robot is not confident in the causality from 150 sec. to 180 sec., but it finally becomes confident again after 180 sec. We can say that the robot had difficulty understanding the second rule because it had to understand the second rule only after realizing that the first had expired.

4. Discussion

The action and response primitives, the (X'_i, Y'_i) space, are determined through rule-based interaction, whereas a conventional approach requires the primitives as predefined symbols, or processes the action and response spaces independently to find the primitives. So the proposed method may lead to a new hypothesis on how an infant develops categorical recognition of action and response.

This learning is self-motivated to maximize causality stability, without a teaching signal. What the interaction rule is, is revealed not by demonstration but by interactional causality. Incremental learning methods for high-dimensional spaces exist, such as (Vijayakumar et al., 2005) based on function approximation, but these cannot be directly applied to social games because no teaching signal is given.

In this robot model, the first rule needs to expire before the second rule can be found, because the robot continues to confirm the first rule even after the second rule appears. To understand multiple rules, the model should be improved with learning progress evaluation such as habituation/dishabituation (Kuriyama and Kuniyoshi, 2009): the robot inhibits confirmation of well-known rules to find a new rule (habituation), and returns to the well-known

rules (dishabituation). This fits the intrinsic motivation of autonomous learning without external reward or teaching signals (Oudeyer and Kaplan, 2008). Our approach uses this intrinsic motivation for social skill learning.

The proposed method can be applied to action learning in general, but social games especially need it because the number of exemplars is limited due to interaction rule modification during the game.

5. Conclusion

A human partner returns a specific response after a robot performs a specific social cue. We define this as an interaction rule. Toward natural communication, we focused on social games played between an infant and a caregiver. For social games, two problems must be solved: (1) the robot has to find the rules in a high-dimensional space with a limited number of exemplars; (2) the interaction should not be divided into a learning phase and an interaction phase. Previous work on learning interaction rules did not attack both problems. In this paper, we solved them simultaneously: the robot computes a sensori-motor causality space using partial correlation analysis, and it compresses the causality space using canonical correlation analysis to find the rules in high dimensions.

As future work, we will move to triadic social games in which the robot and partner play with toys. We could define primitives relevant to 1-to-2-year-old infants, and we can use many action or response primitives because the proposed method supports high-dimensional spaces. This could enrich the interaction.

Acknowledgements

We thank Mr. Harold Martinez and Dr. Hidenobu Sumioka for interesting discussions. We also acknowledge the JSPS Research Fellowship for support.

References

Asada, M., Hosoda, K., Kuniyoshi, Y., Ishiguro, H., Inui, T., Yoshikawa, Y., Ogino, M., and Yoshida, C. (2009). Cognitive developmental robotics: A survey. IEEE Transactions on Autonomous Mental Development, 1(1):12–34.

Asada, M., MacDorman, K. F., Ishiguro, H., and Kuniyoshi, Y. (2001). Cognitive developmental robotics as a new paradigm for the design of humanoid robots. Robotics and Autonomous Systems, 37:185–193.

Bruner, J. S. and Sherwood, V. (1975). Peekaboo and the learning of rule structures. In Bruner, J. S., Jolly, A., and Sylva, K., (Eds.), Play: Its Role in Development and Evolution, pages 277–285. Basic Books.

Fong, T., Nourbakhsh, I., and Dautenhahn, K. (2003). A survey of socially interactive robots. Robotics and Autonomous Systems, 42:143–165.

Fujita, M. (2004). On activating human communications with pet-type robot AIBO. Proceedings of the IEEE, 92(11):1804–1813.

Goodrich, M. A. and Schultz, A. C. (2007). Human-robot interaction: A survey. Foundations and Trends in Human-Computer Interaction, 1(3):203–275.

Gustafson, G. E., Green, J. A., and West, M. J. (1979). The infants' changing roles in mother-infant games: The growth of social skills. Infant Behavior and Development, 2:301–308.

Kuriyama, T. and Kuniyoshi, Y. (2008). Acquisition of human-robot interaction rules via imitation and response observation. In Proceedings of the 10th International Conference on the Simulation of Adaptive Behavior (SAB2008), pages 467–476.

Kuriyama, T. and Kuniyoshi, Y. (2009). Co-creation of human-robot interaction rules through response prediction and habituation/dishabituation. In Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS2009), pages 4990–4995.

Mitsunaga, N., Miyashita, T., Ishiguro, H., Kogure, K., and Hagita, N. (2006). Robovie-IV: A communication robot interacting with people daily in an office. In Proceedings of the 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS2006), pages 5066–5072.

Ogino, M., Toichi, H., Yoshikawa, Y., and Asada, M. (2006). Interaction rule learning with a human partner based on an imitation faculty with a simple visuo-motor mapping. Robotics and Autonomous Systems, 54(5):414–418.

Osada, J., Ohnaka, S., and Sato, M. (2006). The scenario and design process of childcare robot, PaPeRo. In Proceedings of the 2006 ACM SIGCHI International Conference on Advances in Computer Entertainment Technology, number 80.

Oudeyer, P.-Y. and Kaplan, F. (2008). How can we define intrinsic motivation? In Proceedings of the 8th International Conference on Epigenetic Robotics: Modeling Cognitive Development in Robotic Systems (EpiRob08).

Oudeyer, P.-Y., Kaplan, F., and Hafner, V. V. (2007). Intrinsic motivation systems for autonomous mental development. IEEE Transactions on Evolutionary Computation, 11(2):265–286.

Ratner, N. and Bruner, J. (1978). Games, social exchange and the acquisition of language. Journal of Child Language, 5:391–401.

Rochat, P. (2001). The Infant's World, pages 194–234. Harvard University Press.

Rome-Flanders, T. and Cossette, L. (1995). Comprehension of rules and structures in mother-infant games: A longitudinal study of the early two years of life. International Journal of Behavioral Development, 18(1):83–103.

Stern, D. (1977). The First Relationship: Infant and Mother (The Developing Child). Harvard University Press.

Sumioka, H., Takeuchi, Y., Yoshikawa, Y., and Asada, M. (2009). Bottom-up social development through reproducing contingency with sensorimotor clustering. In Proceedings of the Ninth International Conference on Epigenetic Robotics (EpiRob09), pages 169–176.

Taniguchi, T., Iwahashi, N., Sugiura, K., and Sawaragi, T. (2008). Constructive approach to role-reversal imitation through unsegmented interactions. Journal of Robotics and Mechatronics, 20(4):567–577.

Vijayakumar, S., D'Souza, A., and Schaal, S. (2005). Incremental online learning in high dimensions. Neural Computation, 17(12):2602–2634.

Wada, K. and Shibata, T. (2006). Robot therapy in a care house – its sociopsychological and physiological effects on the residents. In Proceedings of the 2006 IEEE International Conference on Robotics and Automation (ICRA2006), pages 3966–3971.

Watson, J. S. (1979). The perception of contingency as a determinant of social responsiveness. In Thoman, E. B., (Ed.), Origins of the Infant's Social Responsiveness, pages 33–64. John Wiley & Sons Inc.
