온라인소셜 - kaistan.kaist.ac.kr/~sbmoon/paper/thesis/2007dec-hyunwoo.pdf · 2018-08-30 ·...

Master's Thesis: Growth in Online Social Networks: Sheer Volume vs Social Interaction (Chun, Hyunwoo). Department of Electrical Engineering and Computer Science, Division of Computer Science, Korea Advanced Institute of Science and Technology, 2008


Page 1: 온라인소셜 - KAISTan.kaist.ac.kr/~sbmoon/paper/thesis/2007Dec-hyunwoo.pdf · 2018-08-30 · public diary, a testimonial board, a guestbook, etc. A user can establish friend relation-ships

Master's Thesis

Growth in Online Social Networks: Sheer Volume vs Social Interaction

(Korean title: 온라인 소셜 네트워크의 성장)

Chun, Hyunwoo (전현우, 全玹佑)

Department of Electrical Engineering and Computer Science

Division of Computer Science

Korea Advanced Institute of Science and Technology

2008


Advisor : Professor Moon, Sue Bok

by

Chun, Hyunwoo

Department of Electrical Engineering and Computer Science

Division of Computer Science

Korea Advanced Institute of Science and Technology

A thesis submitted to the faculty of the Korea Advanced Institute of Science and Technology in partial fulfillment of the requirements for the degree of Master of Engineering in the Department of Electrical Engineering and Computer Science, Division of Computer Science.

Daejeon, Korea

2007. 12. 5.

Approved by

Professor Moon, Sue Bok

Advisor


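Growth in Online Social Networks (온라인 소셜 네트워크의 성장)

Chun, Hyunwoo (전현우)

This thesis has been examined and approved by the thesis committee as a Master's thesis of the Korea Advanced Institute of Science and Technology.

December 4, 2007

Committee Chair: Moon, Sue Bok (seal)

Committee Member: 윤현수 (seal)

Committee Member: Otfried Cheong (seal)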


MCS

20063506

Chun, Hyunwoo (전현우). Growth in Online Social Networks: Sheer Volume vs Social Interaction (온라인 소셜 네트워크의 성장). Department of Electrical Engineering and Computer Science, Division of Computer Science. 2008. 21p. Advisor: Prof. Moon, Sue Bok. Text in English.

Abstract


Contents

Abstract
Contents
List of Tables
List of Figures

1 Introduction

2 User Interaction Captured in Cyworld Guestbook
2.1 Growth of Guestbook Activity
2.2 Self-Posting in Guestbook

3 Activity Network

4 Factors Contributing to Activity
4.1 Peer pressure
4.2 Reciprocity of User Activity

5 Steady Core Analysis
5.1 Why Is Steady Core Important?
5.2 Basic Statistics of Steady Core
5.3 Topological Characteristics of Steady Core

6 Compare Friendship Network with Activity Network
6.1 The Impact of Power Users in Two Networks

7 Related Work

8 Conclusion

Summary (in Korean)

References


List of Tables

2.1 Description of Our Cyworld Data

5.1 Description of Giant Connected Component in Steady Core


List of Figures

2.1 Comparison of growth in various user statistics in Cyworld

3.1 Growth of the activity network in strength and degree

3.2 The evolution of each user group. (a) average degree per month, (b) average strength per month

4.1 (a) Node strength vs number of friends. (b) Median node strength vs number of friends. Node strength is linearly correlated with the number of friends up to 200 or so friends.

4.2 Reciprocity of each user. (a) The number of received and written messages, (b) the median of received and written messages of each user

4.3 Reciprocity of each link. (a) The number of received and written messages (A→B, A←B), (b) the median of received and written messages

5.1 CDF of edge duration

5.2 Monthly change of the top 1% hub nodes

5.3 Degree distribution of steady core


1. Introduction

The theme of today's Internet services is social networking. Not only online social network services such as Myspace and Facebook, but also other major Web 2.0 services such as Flickr, Del.icio.us, and YouTube offer social networking features. Users are getting used to interacting with each other through these features: making friend relationships, sharing photos, and writing comments. In particular, online social networks help users establish explicit friendships online, and users form friendships with other users easily. These trust-based friend relations are expected to become a key to solving recommendation [], security [], search [], and personalization [] issues. Understanding friend relations is the first step toward these goals, and recent studies of social networks focus on explicit friend relations [1, 6].

Trust-based friend relations are valuable for some research areas, but their characteristics are not enough to represent the status of an online social network at a given time. The main reason lies in how users manage their friends. Once users establish friend relations with others in online social network services, they tend not to break them [1]. Consequently, the friend network is an accumulation of all past friend networks. The other reason is that a friend relation is only the starting point of social interaction in online social network services. A friend relation typically cannot be created again once it exists. Looking at friends' photos, reading friends' articles, and leaving comments on friends' guestbooks are not observed in the friend relations, although these activities may stem from them. Macroscopically, the number of users, the number of daily visitors, and page views are three well-known metrics for measuring the status of online SNSs. These metrics describe the overall status of online social network services, but they make it hard to look into a part of a social network in detail.

In this paper, we propose shifting the focus of online social network analysis from the friend network to the activity network in order to understand online social networks more deeply. The activity network is constructed from the complete guestbook logs of Cyworld, the largest online social network service in Korea. More than two years of guestbook logs give us insight into user activities: Do user activities transparently reflect the growth of online social networks? What affects user activities? Can we find active user groups in online social networks? What characteristics do those groups have? And how different are the friend network and the activity network?


To the best of our knowledge, this is the first large-scale work to analyze user activities in online social network services. Although some prior work analyzes user activities in other media, such as call networks [] and MSN Messenger networks [], we present characteristics unique to activity networks in online social networks. Also, based on the difference between the friend network and the activity network, we propose that the relevant network must be chosen carefully for each analysis or simulation goal.

The remainder of this paper is structured as follows. In Chapter ??, we describe Cyworld and its guestbook data. In Chapter 3, we define and construct the activity network. In Chapter ??, we describe factors contributing to activity in the network. In Chapter 5, we describe and analyze the steady core. We compare the friendship network with the activity network in Chapter 6.


2. User Interaction Captured in Cyworld Guestbook

In this section, we describe the social network data we use for this study. Cyworld, launched in 2001, is the largest online social network service in Korea. When a user joins Cyworld, the user is given a homepage (called a minihompy) that contains an avatar, a photo gallery, a public diary, a testimonial board, a guestbook, etc. A user can establish friend relationships with other users and share information only with those friends. As of October 2007, the number of registered Cyworld users has surpassed 20 million, which is more than a third of the entire South Korean population¹. As this huge number of registered users suggests, people spend much time logged onto Cyworld and manage various aspects of their social life online. Users browse through friends' photos and leave comments. They read others' public diaries and write testimonials for their established friends. Some features, such as writing a testimonial and viewing photos, are often limited to those with established online friend relationships. The guestbook is accessible to any user and is the most used feature. It is also a recorded two-way interaction, whereas viewing photos and reading public diaries are not recorded or reported to the owner of the photos and diaries.

Ahn et al. have analyzed the topological characteristics of Cyworld's bi-directional friend relationships. Once established, a friend relationship is hardly ever severed and remains whether or not the users stay in touch. It is an assertion that some relationship has existed, whether currently active or not. In this work, we delve deeper into the web of social networking and study the user interaction captured in the guestbook. Unlike a friend relationship, which is bi-directional, a message on a guestbook represents a directional interaction between users. On a guestbook, people write greetings, recent news, and replies to messages. They do not need an established friend relationship to use the guestbook.

We have obtained the complete guestbook logs of Cyworld from June 2003 to October 2005. This period is very important in the development of Cyworld, as the number of users grew exponentially from 2 million to 16 million and the friend relationship network began to show signs of densification [1]. In this work we investigate whether the growth in actual user interaction, the key aspect of social networking services, has kept up with the external growth. Our guestbook log consists of three columns: the writer, the guestbook owner, and the time. All user identifiers have been anonymized. As of October 2005, the number of Cyworld subscribers reached 16,023,307; 75.2% of them (12,048,186 users) have formed friend relationships with others, and 65.4% (10,476,604 users) have written or received a message at least once during the period of our guestbook logs. Compared to 381,602,530 friend relationships, the number of distinct writer and guestbook owner pairs is 537,970,431. Table 2.1 summarizes our dataset.

¹ Upon joining, a new user must have his or her personal identification number (the equivalent of the U.S. social security number) verified. Foreigners have special provisions for membership. All user accounts on Cyworld map to real users, unless a user makes illicit use of other people's personal identification numbers.

Table 2.1: Description of Our Cyworld Data

Period                               2003.06 to 2005.10
Number of Users Appeared in Data     17,788,870
Number of Messages (Tuples)          8,423,218,770
Number of Unique Tuples              537,970,431
Mean # of Written Msg. per User      637 (397)
Mean # of Received Msg. per User     484 (297)

2.1 Growth of Guestbook Activity

In the previous section, we claimed that the guestbook is the most used feature and best represents the actual interaction among users. In this section, we quantitatively demonstrate the popularity of the guestbook feature.

Figure 2.1 shows the following numbers:

• Registered users

• Registered users with friend relationships

• Users who have written at least one guestbook message during our dataset period

• Users who have written at least one guestbook message in that particular month

We observe that the number of Cyworld users grew almost ten-fold in that short time span, and the cumulative number of guestbook users also experienced exponential growth. However, the number of guestbook users never caught up with the total number of Cyworld users. Moreover, the growth in the monthly number of guestbook users started to abate.


[Figure 2.1 plot omitted. x-axis: Time (2003-07 to 2005-07); y-axis: Population (0 to 1.8e+07); series: total users, users w/ friends, cum # of writers, writers per month]

Figure 2.1: Comparison of growth in various user statistics in Cyworld
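The two guestbook-derived series in Figure 2.1 (writers per month and the cumulative number of writers) can be illustrated with a minimal sketch. It assumes a tiny made-up log in the (writer, owner, time) format described above, with time reduced to a "YYYY-MM" string; the names `log` and `writer_counts` are hypothetical and are not taken from our analysis scripts.

```python
from collections import defaultdict

# Hypothetical toy log: (writer, owner, month) tuples, month as "YYYY-MM".
log = [
    ("a", "b", "2003-06"),
    ("b", "a", "2003-06"),
    ("a", "c", "2003-07"),
    ("c", "c", "2003-07"),
    ("d", "a", "2003-08"),
]

def writer_counts(entries):
    """Return {month: (writers active in that month, cumulative writers so far)}."""
    by_month = defaultdict(set)
    for writer, _owner, month in entries:
        by_month[month].add(writer)
    seen = set()
    result = {}
    # "YYYY-MM" strings sort chronologically, so plain sorting is enough.
    for month in sorted(by_month):
        seen |= by_month[month]
        result[month] = (len(by_month[month]), len(seen))
    return result

counts = writer_counts(log)
for month, (monthly, cumulative) in counts.items():
    print(month, monthly, cumulative)
```

The cumulative series can only grow, while the monthly series may flatten or shrink, which is the contrast the figure draws.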

2.2 Self-Posting in Guestbook

When a friend writes a message on a guestbook, the owner of the guestbook often replies in his or her own guestbook, instead of visiting the friend's homepage and writing there. This activity is captured in our guestbook log as a 3-tuple that has the same writer and owner. We call this tuple a self-post. Self-posts make up 38.9% of all posts, more than a third, and they persist over time. Also, 81.8% of users who have written at least once have written a self-post. For half of the users, 33.3% of their writings are self-posts. This is not a negligible phenomenon.

Before analyzing user activities in the guestbook logs, we must decide how to interpret self-posts. A self-post serves one of two purposes: a message written for viewing by all others (a notice) or a reply to a specific preceding message. We cannot distinguish a notice from a reply in the guestbook log, as both appear as 3-tuples with the same writer and owner. The problem is that the two types of self-post look identical in the guestbook logs, yet their influence is greatly different: a reply intuitively motivates other users to keep interacting, whereas a notice does not directly influence other users. However, in Cyworld a public diary serves a similar purpose to a notice, and we conjecture that most self-posts are actually replies. As self-posting is an important aspect of user activity, we include it in our activity analysis in the next section.
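As an illustration of how self-posts can be identified, the following minimal sketch counts 3-tuples whose writer and owner coincide. The toy log and the function name `self_post_stats` are hypothetical; the sketch only shows the definition and does not reproduce the figures reported above.

```python
# Hypothetical toy log of (writer, owner, time) 3-tuples.
log = [
    ("a", "b", 1),
    ("b", "b", 2),  # b writes in b's own guestbook: a self-post
    ("a", "a", 3),  # self-post
    ("c", "a", 4),
    ("a", "c", 5),
]

def self_post_stats(entries):
    """Return (share of self-posts among all posts,
               share of writers with at least one self-post)."""
    self_posts = [(w, o, t) for w, o, t in entries if w == o]
    writers = {w for w, _o, _t in entries}
    self_writers = {w for w, _o, _t in self_posts}
    return len(self_posts) / len(entries), len(self_writers) / len(writers)

post_share, writer_share = self_post_stats(log)
print(post_share, writer_share)
```

Note that the log alone cannot tell a notice from a reply, so this count covers both kinds of self-post.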


3. Activity Network

A graph representation of a social network is an apt abstraction of its connected nature and allows us to tap into the rich repository of graph and complex network theories. In this section we describe how we represent the user interaction on the guestbook as a graph and define metrics of interaction.

We map a user to a node and a message to a directed weighted edge. In the rest of the paper, we use the terms user and node interchangeably. An edge from node A to node B denotes that user A has written a message on user B's guestbook. The weight of the edge is the number of messages user A has written to user B. We include self-posting in our analysis as a self-loop, that is, an edge from a node to itself. We call this directed and weighted graph the activity network. Our activity network is different from the friend network studied in [1] in the following two aspects:

• Edges are directional.

• Even when represented as an undirected graph, the activity network is not a proper subset of the friend network, for users without established friend relationships can still write onto each other's guestbooks.

We use the following two metrics to capture users’ activity quantitatively.

• Strength: the sum of all weights of edges originating from a node (the total number of messages a user has written)

• Degree: the number of edges originating from a node (the total number of unique guestbooks a user has written onto)
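The construction of the activity network and the two metrics above can be sketched as follows. This is a minimal illustration on a made-up log of (writer, owner) pairs; the function names are hypothetical, and counting a self-loop toward the writer's degree is an assumption of this sketch.

```python
from collections import Counter, defaultdict

# Hypothetical toy log: one (writer, owner) pair per guestbook message.
log = [("a", "b"), ("a", "b"), ("a", "c"), ("b", "a"), ("c", "c")]

def build_activity_network(entries):
    """Map each directed edge (writer, owner) to its weight (message count)."""
    return Counter(entries)

def strength_and_degree(edges):
    """Out-strength and out-degree per writer, as defined in the list above."""
    strength = defaultdict(int)
    degree = defaultdict(int)
    for (writer, _owner), weight in edges.items():
        strength[writer] += weight  # total messages written
        degree[writer] += 1         # unique guestbooks written onto (self-loops included)
    return dict(strength), dict(degree)

edges = build_activity_network(log)
strength, degree = strength_and_degree(edges)
print(strength, degree)
```

Here user a has strength 3 and degree 2, because three messages went to two distinct guestbooks.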

In Figure 3.1 we plot four cumulative distribution functions (CDF):

• Cumulative node strength since June 2003

• Monthly average of node strength

• Cumulative node degree since June 2003

• Monthly average of node degree


Figure 3.1: Growth of the activity network in strength and degree

As our guestbook data come from a period of explosive growth, a large number of users joined during it, and the time of membership initiation should be taken into consideration. Hence, monthly averages of node strength and degree are calculated by taking the cumulative node strength and degree as of October 2005 and dividing them by the number of months since the user first wrote a message in our dataset period. Not all users write on a guestbook as soon as they join Cyworld, so there is a gap between membership initiation and the first guestbook activity. In this sense the average node strength and degree, as we calculate them, are upper bounds on the actual monthly average activity of users.
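The monthly average described above can be written down as a short sketch. Counting both the first and the last month (inclusive counting) is an assumption of this illustration, as are the function names and the "YYYY-MM" month format.

```python
def months_between(first, last):
    """Number of calendar months from first to last, inclusive; both "YYYY-MM"."""
    first_year, first_month = map(int, first.split("-"))
    last_year, last_month = map(int, last.split("-"))
    return (last_year - first_year) * 12 + (last_month - first_month) + 1

def monthly_average(cumulative, first_month, end_month="2005-10"):
    """Cumulative strength (or degree) divided by months since the first message."""
    return cumulative / months_between(first_month, end_month)

# A user who first wrote in June 2003 and wrote 290 messages by the end of the log.
print(monthly_average(290, "2003-06"))
```

Because the divisor starts at the first written message rather than at the join date, the result can only overestimate the average over the whole membership period.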

Half the Cyworld users with guestbook activity have written at least 170 messages in 25 different users' guestbooks. The top 10% of those users have written more than 1800 messages on 100 or more guestbooks. To find out whether user activity reflects the status of online social network services, we first divide all users into three groups by their level of activity: the top 10%, the top 10 to 50%, and the remaining users. We compare the growth of the three groups with that of Cyworld over time. Because the level of activity varies over time, we set the threshold values of activity level to the monthly averages of node strength and degree from Figure 3.1.

Half the Cyworld users have written at least 16 messages in 6 different users' guestbooks per month. The top 10% of those users have written more than 117 messages on 24 or more guestbooks per month. We compare the evolution of the three user groups. In Figure 3.2(a), we observe that the number of top 10% users did not increase from September 2004 onward, while the numbers of top 10 to 50% users and of users below the top 50% increased steadily. We observe a similar phenomenon in Figure 3.2(b). The only difference between the two figures is that the number of top 10% users in Figure 3.2(b) decreases.


[Figure 3.2 plots omitted. Two panels, each with x-axis: Time (2003-07 to 2005-07); y-axis: Number of users (0 to 4e+06); series: top 10%, top 10-50%, remains]

Figure 3.2: The evolution of each user group. (a) average degree per month, (b) average strength per month


4. Factors Contributing to Activity

In Section 3, we constructed the activity network and observed that user activity varies from month to month. In this section we investigate factors that contribute to activity in our online social network. Peer pressure and popularity are often the main causes behind human social activities. We investigate how these two factors affect activity in the online social network.

4.1 Peer pressure

We first ask the following question: "Are people socially more active if they have many friends?" We would like to know whether one's number of friends plays an encouraging role, since the more friends have joined the same online social networking service, the more peer pressure one might receive.

[Figure 4.1 plot omitted. x-axis: # of friends (0 to 1000); y-axis: median node strength (0 to 10000)]

Figure 4.1: (a) Node strength vs number of friends. (b) Median node strength vs number of friends. Node strength is linearly correlated with the number of friends up to 200 or so friends.

We plot the node strength against the number of friends in Figure 4.1(a). At first glance, the node strength does not seem to increase linearly with the number of friends. When we take a closer look at the zoomed-in inset, we recognize a somewhat correlated increase in strength for users with up to about 200 friends. Then the node strength starts to diminish, even as the number of friends increases. In order to verify this trend in correlation, we plot the median node strength per number of friends in Figure 4.1(b). The graph shows a clear linear correlation between node strength and the number of friends up to about 200, after which the points disperse. The Pearson correlation coefficient of the overall graph in Figure 4.1(a) is 0.2071. We split the users into two separate groups, those with 200 or fewer friends and those with more than 200, and compute the correlation coefficients. For the first group, the Pearson correlation coefficient is 0.6235, strongly positive; for the other group, it is only 0.00913. Intuitively, the more friends one has, the more active one should be socially. However, beyond 200 or so friends, one must reach a limit in one's socializing capacity.
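The split-correlation computation described above can be sketched as follows. The (number of friends, node strength) pairs are made up for illustration and do not come from the Cyworld data; `pearson` and `split_correlation` are hypothetical helper names, and the cutoff of 200 follows the text.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def split_correlation(points, cutoff=200):
    """Correlation for users with <= cutoff friends and for users with more."""
    low = [(f, s) for f, s in points if f <= cutoff]
    high = [(f, s) for f, s in points if f > cutoff]
    return pearson(*zip(*low)), pearson(*zip(*high))

# Made-up (number of friends, node strength) pairs.
points = [(10, 100), (50, 500), (100, 1000), (200, 2000),
          (300, 1100), (500, 900), (700, 900), (900, 1100)]
r_low, r_high = split_correlation(points)
print(r_low, r_high)
```

In this toy data the first group is perfectly linear and the second group is uncorrelated, an exaggerated version of the contrast between 0.6235 and 0.00913.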

The break-off point of 200 in our analysis is larger than predicted by Dunbar's Law [?]. From our data, we see that people in the 21st century can keep up with social activities involving 200 friends or so. Though the size of the human neocortex has not grown since Dunbar's time, technology has assisted in our evolution into a more social creature.

This outcome is in agreement with previous work that reports a departure from a single scaling behavior in the node degree distribution and conjectures the emergence of online-only relationships [1]. Users with more than 200 friends are not noticeably more active than those with fewer friends. We suppose that the graph is another manifestation of Dunbar's law, which states that the number of relationships an individual can manage is limited and that the limit differs by species. In our previous work [1], we used Dunbar's law to explain the two scaling regions in the degree distribution of the Cyworld friend network; the scaling changes at users with about 200 friends. Similarly, in Figure 4.1, the correlation coefficient changes from positive to almost 0 at users with about 200 friends. This change in correlation is more strongly tied to the limit on manageable relationships than the result in our previous work, because writing messages takes far more effort than sending or accepting friend requests. Figure 4.1(b) clearly shows the changing trend. This graph shows the median number of messages written by users who have the same number of friends: a positively sloped region below 200 friends and a scattered region above 200 friends.

From these two graphs, we infer that 1) there is a limit of about 200 on the number of manageable friends, 2) up to that limit, more friends mean more activity, and 3) beyond the limit, the number of friends is not related to activity.


4.2 Reciprocity of User Activity

We now look into the reciprocity of user activities. We show that peer pressure is exerted not only by the number of friends but also by the number of exchanged messages.

We compare the number of received and written messages of each user in Figure 4.2, and of each user pair in Figure 4.3. We can detect three regions: similar numbers of received and written messages, many received but few written messages, and few received but many written messages. The first and largest region lies along y = x. Users in this region have written about as many messages as they have received. We can assume that received messages motivate users to write messages, and vice versa. The other two regions show the opposite characteristics of two user groups. One group writes only a few messages but receives many. We conjecture that some of these users are already very popular people, such as celebrities. The other group receives only a few messages but writes many. We conjecture that some of these users are spammers or very passionate fans.

[Figure 4.2 plot omitted. Log-log axes from 1 to 1e+06; x-axis: # of written messages; y-axis: # of received messages]

Figure 4.2: Reciprocity of each user. (a) The number of received and written messages, (b) the median of received and written messages of each user

The total number of messages written to all friends does not tell us how the messages are distributed over friends. For example, among 100 friends, actively exchanging 100 messages with only one friend is greatly different from exchanging one message with each of the 100 friends. To compare peer pressure between these cases, we plot the number of exchanged messages between user pairs in Figure 4.3. The group of points along y = x becomes sharper; the numbers of written and received messages are almost the same in this group. Online message exchanges motivate each other reciprocally, like offline conversation.
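The per-pair counts plotted in Figure 4.3 can be derived from the log as in the following minimal sketch. The toy log and the name `pair_reciprocity` are hypothetical, and excluding self-posts from the pair counts is an assumption of this illustration.

```python
from collections import Counter

# Hypothetical toy log: one (writer, owner) pair per guestbook message.
log = [("a", "b")] * 3 + [("b", "a")] * 2 + [("a", "c")] * 4 + [("c", "c")]

def pair_reciprocity(entries):
    """For each unordered pair {A, B} with A < B, return (# A->B, # B->A)."""
    directed = Counter((w, o) for w, o in entries if w != o)
    pairs = {}
    for writer, owner in directed:
        a, b = sorted((writer, owner))
        # Counter returns 0 for a direction that never occurs in the log.
        pairs[(a, b)] = (directed[(a, b)], directed[(b, a)])
    return pairs

pairs = pair_reciprocity(log)
print(pairs)
```

A pair such as (a, b) with counts (3, 2) would fall near y = x, while (a, c) with counts (4, 0) is a one-sided exchange.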

[Figure 4.3 plot omitted. Log-log axes; x-axis: # of A->B (1 to 1e+06); y-axis: # of B->A (1 to 100000)]

Figure 4.3: Reciprocity of each link. (a) The number of received and written messages (A→B, A←B), (b) the median of received and written messages


5. Steady Core Analysis

The steady core is defined as the network of users who 1) write a large number of guestbook messages (active users) and 2) write guestbook messages steadily every month (steady users). In this section, we analyze the steady core of the network based on user activities.

5.1 Why Is Steady Core Important?

[Figure 5.1 plot omitted. x-axis: duration (0 to 30); y-axis: cdf (0.55 to 1)]

Figure 5.1: CDF of edge duration

[Figure 5.2 plot omitted. x-axis: month (2003-07 to 2005-07); y-axis: number (0 to 90000); series: # of hubs, # of consecutive hubs]

Figure 5.2: Monthly change of the top 1% hub nodes


Writing a message is a repetitive activity. The numbers of written and received messages show how much two users are interested in or related to each other. Of course, this is an approximate rather than an accurate measure, because Cyworld provides other features for interaction between users. Both numbers vary widely across users in Figure 4.3, and the activity network changes extensively. To find out how much the activity network changes over time, we take two points of view: one is the edge's perspective, and the other is the active node's perspective.

We show how long activities last in Figure 5.1 by constructing an activity network for every month and comparing the edges among them. Half of the edges do not last even two months; each user's activity changes drastically. In addition, over 60% of the top 1% most active users change from month to month in Figure 5.2.

In Section 4.2, we saw that active users can motivate other users to interact with each other. If active users are strongly connected to each other, they can be a powerful source of activity. Thus, clearly identifying the set of active users is directly related to understanding online social networks.
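One way to obtain an edge-duration CDF like that of Figure 5.1 is sketched below. The sketch assumes that the duration of an edge is the number of distinct months in which it appears, which is only one possible reading of the measurement; the toy log and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy log: (writer, owner, month) tuples, month as "YYYY-MM".
log = [
    ("a", "b", "2003-06"), ("a", "b", "2003-07"), ("a", "b", "2003-07"),
    ("b", "c", "2003-06"),
    ("c", "a", "2003-08"),
    ("a", "b", "2003-08"),
]

def edge_durations(entries):
    """Number of distinct months in which each directed edge appears."""
    months = defaultdict(set)
    for writer, owner, month in entries:
        months[(writer, owner)].add(month)
    return {edge: len(ms) for edge, ms in months.items()}

def duration_cdf(durations):
    """Sorted list of (d, fraction of edges with duration <= d)."""
    values = sorted(durations.values())
    n = len(values)
    cdf = []
    for i, d in enumerate(values, start=1):
        if cdf and cdf[-1][0] == d:
            cdf[-1] = (d, i / n)  # same duration again: raise the step
        else:
            cdf.append((d, i / n))
    return cdf

durations = edge_durations(log)
cdf = duration_cdf(durations)
print(cdf)
```

In the toy log two of the three edges last a single month, so the CDF already reaches 2/3 at a duration of one.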

5.2 Basic Statistics of Steady Core

In order to determine active users and steady users, we use two metrics: a user's overall strength and the standard deviation (SD) of the user's strength per month. Both threshold values are set to the median of each metric. In this paper, the threshold for overall strength is set to 171 according to Figure 3.1 and that for SD is set to 16. We extract as steady users those whose SD value is smaller than 16 and who have written at least 171 messages. There are 499,397 such users, and they are divided into 103,657 weakly connected components (WCCs); the biggest one has 371,674 users and every other component has fewer than 18 users. We choose the giant connected component (GCC) to analyze the characteristics of the steady core.

The steady core consists of 371,674 users and 1,269,240 edges. These users are about 2% of all users. They have written 67,216,193 messages in total, of which 36,288,688 are addressed to other users in the steady core. The latter messages are about 0.43% of all messages.
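The selection procedure above can be sketched in two steps: filter users by the two thresholds, then group the survivors into weakly connected components. The sketch below is a minimal illustration on made-up data; the use of the population standard deviation, the union-find helper, and all names are assumptions rather than a description of the tools actually used.

```python
import statistics

def steady_users(monthly_strength, min_total=171, max_sd=16):
    """monthly_strength: {user: [messages written in each month]}."""
    return {
        user for user, counts in monthly_strength.items()
        if sum(counts) >= min_total and statistics.pstdev(counts) < max_sd
    }

def weakly_connected_components(nodes, edges):
    """Union-find over edges whose endpoints are both in nodes; largest first."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in edges:
        if a in parent and b in parent:
            parent[find(a)] = find(b)
    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return sorted(groups.values(), key=len, reverse=True)

# Made-up monthly message counts and directed edges.
monthly = {"a": [60, 58, 62], "b": [90, 95, 85], "c": [200, 0, 0],
           "d": [10, 12, 11], "e": [70, 70, 70]}
edges = [("a", "b"), ("b", "c"), ("c", "e")]

steady = steady_users(monthly)
components = weakly_connected_components(steady, edges)
print(steady, components[0])
```

User c is active but bursty and user d is steady but inactive, so both are filtered out; the largest remaining component plays the role of the GCC.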

Considering the size of the steady core, their activities are quite high.


Table 5.1: Description of Giant Connected Component in Steady Core

Number of Users                    371,674
Number of Messages                 36,288,688
Number of Unique Tuples            1,269,240
Mean # of Written Msg. per User    97
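As a quick consistency check, the figures quoted in Section 5.2 and Table 5.1 can be related to each other with simple arithmetic. The variable names are illustrative, and reading the "unique tuples" as directed writer-recipient pairs is an assumption.

```python
# Figures quoted in Section 5.2 and Table 5.1.
users = 371_674
edges = 1_269_240               # unique tuples, assumed to be directed pairs
messages_total = 67_216_193     # all messages written by steady-core users
messages_internal = 36_288_688  # messages sent to other steady-core users

mean_internal_per_user = messages_internal / users   # about 97.6, listed as 97 in Table 5.1
internal_share = messages_internal / messages_total  # about 0.54
mean_messages_per_edge = messages_internal / edges   # about 28.6
mean_out_degree = edges / users                      # about 3.4
```

Under these assumptions, roughly 54% of the messages written by steady-core users stay inside the steady core, and each internal edge carries about 29 messages on average.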

5.3 Topological Characteristics of Steady Core

Next, we look into the topological characteristics of the steady core. Topological analysis can reveal what basic statistics do not. In our previous work [1], we showed through topological analysis that the Cyworld friendship network is a mixture of two different types of friend relationships.

[Plot: CCDF versus degree of node on log-log axes, with separate curves for out-degree and in-degree]

Figure 5.3: Degree distribution of steady core

Figure 5.3 shows the degree distribution of the steady core. We can see two different scaling regions, which are also observed in the degree distribution of the friendship network.
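A degree distribution of this kind can be computed from a list of directed edges as sketched below. The function names are hypothetical, and defining the CCDF as P(degree >= k) is an assumption; the thesis does not state which convention its plot uses.

```python
from collections import Counter


def in_out_degrees(edges):
    """Compute out-degree and in-degree per node from unique directed edges."""
    out_deg, in_deg = Counter(), Counter()
    for u, v in set(edges):
        out_deg[u] += 1
        in_deg[v] += 1
    return out_deg, in_deg


def degree_ccdf(degrees):
    """Return sorted (k, P(degree >= k)) pairs for a list of node degrees."""
    n = len(degrees)
    counts = Counter(degrees)
    ccdf = []
    remaining = n  # number of nodes with degree >= current k
    for k in sorted(counts):
        ccdf.append((k, remaining / n))
        remaining -= counts[k]
    return ccdf


# Toy directed edge list.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "a"), ("c", "a")]
out_deg, in_deg = in_out_degrees(edges)
out_ccdf = degree_ccdf(list(out_deg.values()))
in_ccdf = degree_ccdf(list(in_deg.values()))
```

Nodes with degree zero are left out of each list, which matches a log-log plot whose degree axis starts at 1. Plotting both lists on logarithmic axes gives curves comparable to the out-degree and in-degree series in Figure 5.3.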

Figure ?? shows the distribution of the clustering coefficient of each node. The average clustering coefficient is 0.760. Compared with XXX, the clustering coefficient of the entire activity network, the steady core is well clustered.
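For reference, the local clustering coefficient of a node with k neighbors is the number of links among those neighbors divided by k(k-1)/2. The sketch below computes it on a toy graph. It treats edges as undirected and assigns 0 to nodes with fewer than two neighbors; both choices are assumptions, since the text does not say how direction or low-degree nodes are handled.

```python
from collections import defaultdict
from itertools import combinations


def local_clustering(edges):
    """Local clustering coefficient per node, treating edges as undirected."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)
    adjacency = dict(neighbors)  # plain dict: no accidental key creation below

    coefficients = {}
    for node, nbrs in adjacency.items():
        k = len(nbrs)
        if k < 2:
            coefficients[node] = 0.0  # convention assumed for degree < 2
            continue
        # Count pairs of neighbors that are themselves connected.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adjacency[a])
        coefficients[node] = 2 * links / (k * (k - 1))
    return coefficients


# Toy graph: a triangle a-b-c with a pendant node d attached to c.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
cc = local_clustering(edges)
average_cc = sum(cc.values()) / len(cc)
```

In the toy graph, a and b have coefficient 1, c has 1/3, and d has 0, so the average is 7/12. An average of 0.760 on the steady core would therefore indicate that most neighbor pairs of a typical node also exchange messages.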


6. Compare Friendship Network with Activity Network

In this section, we focus on the differences between the friendship network and the activity network. Both networks come from the same Cyworld users but differ in how those users are connected. Not only the connectivity between users but also the dynamics differ greatly: the activity network changes much faster and more intensely. This raises some interesting questions about the activity network. How long does an edge last? How much do the degree and strength of a node change monthly? Are hub nodes steady? How many nodes are neighbors of hubs? Does this number change? By answering these questions, we gain a deeper understanding of the activity network.

6.1 The Impact of Power Users in Two Networks

Hub nodes in the friendship network and the activity network are representative of the power users in viral marketing []. Which network's hubs are closer to power users in information propagation is beyond the scope of this paper, but we investigate the characteristics of hubs in both networks as preliminary work toward answering that question.

Our first focus is the extent to which hub nodes overlap between the two networks. We extract the top 1% hub nodes from each of the two networks, the friendship network and the activity network. There are 121,886 hub nodes with more than 151 friends in the friendship network, and 134,490 hub nodes that write in the guestbooks of more than 207 users in the activity network. Of these two sets of hub nodes, 63,800 users overlap. We also extract the top 1% hub nodes from the monthly data in order to see whether hub nodes in the activity network are steady. The threshold degree for the top 1% increases until May 2004 and then remains stable. Hub nodes that appear in two consecutive months account for XX % of total nodes.
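The hub comparison above amounts to thresholding two degree maps and intersecting the resulting sets. The sketch below shows this on toy data and then derives overlap ratios from the counts reported in the text. The function names and toy degrees are hypothetical, and the Jaccard index is an additional summary that the thesis itself does not report.

```python
def hubs_above(degree, threshold):
    """Return the users whose degree is strictly greater than `threshold`."""
    return {user for user, d in degree.items() if d > threshold}


def overlap_stats(hubs_a, hubs_b):
    """Overlap size, share of each hub set that is shared, and Jaccard index."""
    shared = len(hubs_a & hubs_b)
    union = len(hubs_a | hubs_b)
    return {
        "shared": shared,
        "share_of_a": shared / len(hubs_a) if hubs_a else 0.0,
        "share_of_b": shared / len(hubs_b) if hubs_b else 0.0,
        "jaccard": shared / union if union else 0.0,
    }


# Toy degrees, using the thresholds stated in the text.
friend_degree = {"a": 200, "b": 160, "c": 90, "d": 152}
activity_degree = {"a": 300, "b": 100, "c": 250, "d": 208}
friend_hubs = hubs_above(friend_degree, 151)      # more than 151 friends
activity_hubs = hubs_above(activity_degree, 207)  # guestbook posts to more than 207 users
toy = overlap_stats(friend_hubs, activity_hubs)

# The same ratios computed from the counts reported in the text.
friend_n, activity_n, shared_n = 121_886, 134_490, 63_800
share_of_friend_hubs = shared_n / friend_n                # about 0.52
share_of_activity_hubs = shared_n / activity_n            # about 0.47
jaccard = shared_n / (friend_n + activity_n - shared_n)   # about 0.33
```

With the reported counts, about half of the hubs in each network are also hubs in the other. The same `overlap_stats` helper could be applied to hub sets from two consecutive months to measure how steady the activity hubs are.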


7. Related Work

Social network analysis has developed mainly from sociology and anthropology [12]. As electronically accumulated social network data enable the observation of large-scale network statistics, social networks have also become a hot topic for scientists in other fields, such as physics and computer science. The World Wide Web has given birth to an army of social network services (SNSs). SNSs already account for a substantial portion of our relationships and are treated as a major part of our social lives [13].

Before the emergence of large-scale online social network services, a variety of other online social networks had been analyzed. Many analyses used e-mail networks, the most basic communication medium [2, 5, 10, 11]. Valverde and Solé studied the social networks of open source communities [10, 11]. Massive mobile communication data were recently analyzed [7, 8]. Using the mobile phone records of millions of people, the authors examined people's communication patterns. They argued that the stability of the communication network largely depends on the weak ties in the network.

Holme et al. analyze an online dating community in detail [4]. In this work, the time evolution of activity showed the saturation of degree, power-law activity pattern, and .. Mislove et al. investigate various social network services [6]. Internet communities [9, 4, 3, 5].


8. Conclusion


Summary

Growth in Online Social Networks


References

[1] Y.-Y. Ahn, S. Han, H. Kwak, S. Moon, and H. Jeong. Analysis of topological characteristics of huge online social networking services. In WWW '07: Proceedings of the 16th international conference on World Wide Web, pages 835–844, New York, NY, USA, 2007. ACM Press.

[2] H. Ebel, L.-I. Mielsch, and S. Bornholdt. Scale-free topology of e-mail networks. Phys. Rev. E, 66:035103, 2002.

[3] K.-I. Goh, Y.-H. Eom, H. Jeong, B. Kahng, and D. Kim. Structure and evolution of online social relationships: Heterogeneity in unrestricted discussions. Phys. Rev. E, 73:066123, 2006.

[4] P. Holme, C. R. Edling, and F. Liljeros. Structure and time-evolution of an internet dating community. Social Networks, 26:155, 2004.

[5] G. Kossinets and D. Watts. Empirical Analysis of an Evolving Social Network. Science, 311(5757):88, 2006.

[6] A. Mislove, M. Marcon, K. P. Gummadi, P. Druschel, and B. Bhattacharjee. Measurement and Analysis of Online Social Networks. In ACM Internet Measurement Conference, October 2007.

[7] J.-P. Onnela, J. Saramäki, J. Hyvönen, G. Szabó, M. A. de Menezes, K. Kaski, A.-L. Barabási, and J. Kertész. Analysis of a large-scale weighted network of one-to-one human communication. New Journal of Physics, 9:179, 2007.

[8] J.-P. Onnela, J. Saramäki, J. Hyvönen, G. Szabó, D. Lazer, K. Kaski, J. Kertész, and A.-L. Barabási. Structure and tie strengths in mobile communication networks. Proc. Nat. Acad. Sci., 104(18):7332, 2007.

[9] F. T. Rothaermel and S. Sugiyama. Virtual internet communities and commercial success: individual and community-level theory grounded in the atypical case of timezone.com. Journal of Management, 27(3):297, 2001.

[10] S. Valverde and R. V. Solé. Evolving social weighted networks: Nonlocal dynamics of open source communities. arXiv:physics/0602005v1.


[11] S. Valverde and R. V. Solé. Self-organization versus hierarchy in open-source social networks. Phys. Rev. E, 76:046118, 2007.

[12] S. Wasserman and K. Faust. Social network analysis. Cambridge University Press, Cambridge, 1994.

[13] B. Wellman. Computer networks as social networks. Science, 293(5537):2031, 2001.


Acknowledgments

I received a great deal of help from everyone around me in completing this thesis.


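Curriculum Vitae

Name: Chun, Hyunwoo

Date of birth: April 15, 1984

Place of birth: 1247-2 Choryang-dong, Dong-gu, Busan

Permanent domicile: 56 Bupyeong-dong 2-ga, Jung-gu, Busan

Address: Hanyang Apartment, Beomil 2-dong, Dong-gu, Busan

E-mail: [email protected]

Education

2000. 3. – 2002. 2. Busan Science High School (completed two years)

2002. 3. – 2006. 2. Korea Advanced Institute of Science and Technology, Department of Computer Science (B.S.)

2006. 3. – 2008. 2. Korea Advanced Institute of Science and Technology, Department of Computer Science (M.S.)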