cs224w: social and information network analysis jure...
TRANSCRIPT
CS224W: Social and Information Network AnalysisJure Leskovec, Stanford University
http://cs224w.stanford.edu
Korea Rep.
Uruguay
Switz.Liecht
Sri Lanka
GibraltarArmenia
Ireland
Portugal
Nicaragua
Ghana
Morocco
Brazil
Paraguay
El Salvador
Slovenia
Cuba
Bulgaria
Dominican Rp
Barbados
Bermuda
Belarus
Mauritania
Philippines
Korea D P Rp
Burkina Faso
Uzbekistan
Myanmar
Costa Rica
TFYR Macedna Sudan
Senegal
Mongolia
Angola
NigeriaMexico Iran
Iraq
Kuwait
Oman
Saudi Arabia
Untd Arab Em
TurkeyUK
Lithuania
Russian Fed
Libya
Venezuela
Algeria
South Africa
Cote Divoire
USAColombia
Ecuador
Bahamas
Panama
Syria
Denmark
Netherlands
Finland
Norway
Sweden
Egypt
Cameroon
Gabon
Dem.Rp.Congo
Canada
Argentina
Bolivia
Chile
Peru
Guatemala
Trinidad Tbg
Yemen
Afghanistan
Indonesia
Malaysia
Singapore
China
Viet Nam
Estonia
Australia
Papua N.Guin
Kazakhstan
Italy
Spain
Qatar
New Zealand
Pakistan
Tunisia
Georgia
Thailand
Guinea
Liberia
Niger
JapanIndia
Taiwan
Ukraine
Germany
Greece
France,Monac
Austria
IsraelHungary
Benin
Azerbaijan
Belgium-Lux
Malta
Latvia
Jamaica
Poland
Czech Rep
Yugoslavia
Cyprus
Romania
Slovakia
Croatia
Trade in crudepetroleum and petroleum products, 1998, source: NBER-United Nations Trade Data
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 2
Y
X
Y
X
Y X
Y
X
indegree
In each of the following networks, X has higher centrality than Y according to a particular measure
outdegree betweenness closeness
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 3
¡ Intuition: Howmanypairsofindividualswouldhavetogothroughyouinordertoreachoneanotherintheminimumnumberofhops?
Y X
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 4
€
CB (i) = g jk (i) /g jkj<k∑
Where gjk = the number of shortest paths connecting j,kgjk(i) = the number that node i is on.
Usually normalized by:
€
CB' (i) = CB (i ) /[(n −1)(n − 2) /2]
number of pairs of node excluding the node itself
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 5
¡ Non-normalized version:
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 6
¡ Non-normalizedversion:
¡ Aliesbetweennotwoothervertices¡ BliesbetweenAand3othervertices:C,D,andE¡ Cliesbetween4pairsofvertices(A,D),(A,E),(B,D),(B,E)
¡ Notethattherearenoalternatepathsforthesepairstotake,soCgetsfullcredit
A B C ED
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 7
CS224W: Social and Information Network AnalysisJure Leskovec, Stanford University
http://cs224w.stanford.edu
¡ Todaywewilltalkabouthumanbehavioronline
¡ Wewilltrytounderstandhowpeopleexpressopinionsabouteachotheronline§ Wewillusedataandnetworksciencetheorytomodelfactorsaroundhumanevaluations
§ ThiswillbeanexampleofComputationalSocialScienceresearch§ Wearemakingsocialscienceconstructsquantitativeandthenusecomputationtomeasurethem
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 9
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 10
Observations
Smalldiameter,Edgeclustering
Patternsofsignededgecreation
ViralMarketing,Blogosphere,Memetracking
Scale-Free
Densificationpowerlaw,Shrinking diameters
Strengthofweakties,Core-periphery
Models
Erdös-Renyi model,Small-worldmodel
Structuralbalance,Theoryofstatus
Independentcascademodel,Gametheoreticmodel
Preferentialattachment,Copyingmodel
Microscopicmodel ofevolvingnetworks
Kronecker Graphs
Algorithms
Decentralizedsearch
Models forpredictingedgesigns
Influencemaximization,Outbreakdetection,LIM
PageRank,Hubsandauthorities
Linkprediction,Supervised randomwalks
Community detection:Girvan-Newman,Modularity
Inmanyonlineapplications usersexpresspositiveandnegativeattitudes/opinions:
¡ Throughactions:§ Ratingaproduct/person§ Pressinga“like”button
¡ Throughtext:§ Writingacomment,areview
¡ Successoftheseonlineapplicationsisbuiltonpeopleexpressingopinions§ Recommendersystems§ WisdomoftheCrowds§ Sharingeconomy
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 11
¡ Aboutitems:§ Movieandproductreviews
¡ Aboutotherusers:§ Onlinecommunities
¡ Aboutitemscreatedbyothers:§ Q&Awebsites
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 12
+
++
++
+
––
–
––
–
+
–
+–
+
–
+
–
+
¡ Manyonlinesettingswhereonepersonexpressesanopinionaboutanother(oraboutanother’scontent)§ Itrustyou[Kamvar-Schlosser-Garcia-Molina‘03]§ Iagreewithyou[Adamic-Glance’04]§ Ivoteinfavorofadmittingyouintothecommunity[Cosley etal.‘05,Burke-Kraut‘08]
§ Ifindyouranswer/opinionhelpful[Danescu-Niculescu-Mizil etal.‘09,Borgs-Chayes-Kalai-Malekian-Tennenholtz ‘10]
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 13
Someofthecentralissues:
¡ Factors:Whatfactorsdriveone’sevaluations?
¡ Synthesis:Howdowecreateacompositedescriptionthataccuratelyreflectsaggregateopinionofthecommunity?
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 14
§ Direct:Usertouser§ Indirect: Usertocontent(createdbyanothermemberofacommunity)
¡ Whereonlinedoesthisexplicitlyoccuronalargescale?
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 15
Direct Indirect
+
++
++
+
––
–
––
–+
–
+
¡ Wikipedia adminship elections§ Support/Oppose(120kvotesinEnglish)§ 4languages:EN,GER,FR,SP
¡ StackOverflow Q&Acommunity§ Upvote/Downvote (7.5Mvotes)
¡ Epinions productreviews§ Ratingsofothers’productreviews(13M)
§ 5=positive,1-4=negative
+
–
+
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 16
¡ Therearetwowaystolookatthis:Onepersonevaluatestheotherviaapositive/negativeevaluation
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 17
+
++
++
+
––
–
––
–
Then we will focus on evaluations in the
context of a network
BA
First we focus on a single evaluation
(without the context of a network)
¡ Whatdriveshumanevaluations?
¡ HowdopropertiesofevaluatorAandtargetB affectA’svote?§ Status andSimilarity aretwofundamentaldriversbehindhumanevaluations
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 18
BA
¡ Status:Levelofrecognition,merit,achievement,reputationinthecommunity
§ Wikipedia:#edits,#barnstars§ StackOverflow:#answers
¡ User-usersimilarity:§ OverlappingtopicalinterestsofA andB
§ Wikipedia: Similarityofthearticlesedited§ StackOverflow: Similarityofusersevaluated
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 19
[WSDM ‘12]
¡ HowdopropertiesofevaluatorAandtargetB affectA’svote?
¡ Twonatural(butcompeting)hypotheses:§ (1) Prob.thatBreceivesapositiveevaluationdependsprimarilyonthecharacteristicsofB§ ThereissomeobjectivecriteriaforuserBtoreceiveapositiveevaluation
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 20
BA
¡ HowdopropertiesofevaluatorAandtargetB affectA’svote?
¡ Twonatural(butcompeting)hypotheses:§ (2) Prob.thatBreceivesapositiveevaluationdependsonrelationshipbetweenthecharacteristicsofAandB§ UserAcomparesherselftouserBandthenmakestheevaluation
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 21
BA
¡ HowdoesstatusofBaffectA’sevaluation?§ Eachcurveisafixedstatusdifference:Δ =SA-SB
¡ Observations:§ Flatcurves: Prob.ofpositiveeval.P(+)doesn’tdependonB’sstatus
§ Differentlevels: DifferentvaluesofΔ resultindifferentbehavior
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 22
Target B status
BAWe keep increasing status of B, while keeping the status
difference (SA-SB) fixed
¡ Howdoespriorinteractionshapeevaluations?2hypotheses:§ (1)Evaluatorsaremoresupportiveoftargetsintheirarea§ “Themoresimilaryouare,themoreIlikeyou”
§ (2)Morefamiliarevaluatorsknowweaknessesandaremoreharsh§ “Themoresimilaryouare,thebetterIcanunderstandyourweaknesses”
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 23
2410/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu
Prior interaction/ similarity boosts positive evaluations
Similarity: For each user create a set of words of all articles she edited. The similarity is then the Jaccard similarity between the two sets of words.Then sort the user pairs by similarity and bucket them into percentile.
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 25
Status is a proxy for quality when evaluator does not know the target
¡ Whoshowsuptoevaluate?
¡ Selectioneffectinwhogivestheevaluation§ IfSA>SB thenAandBaremorelikelytobesimilar
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 26
Elite evaluators vote on targets in
their area of expertise
¡ WhatisP(+)asafunctionofΔ =SA-SB?§ Basedonfindingssofar:Monotonicallydecreasing
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 27
Δ, Status difference
P(+)
-10 (SA<SB)
0 (SA=SB)
10 (SA>SB)
¡ WhatisP(+)asafunctionofΔ =SA-SB?
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 28
Especially negativefor SA=SB
Reboundfor SA > SB
Status difference
[ICWSM ‘10]
Computed over 120k votes
¡ Whylowevals.ofusersofsamestatus?§ Notduetousersbeingtoughoneachother§ Butduetotheeffectsofsimilarity
§ Sowegetthe“mercy”bounceduetounevenmixingofvotes
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 29
Explanation: For negative status difference we have low similarity people which behave according to the red curve on the left plot. As status difference increases the similarity also increases and thus around zero people behave according to the green curve. For positive status difference, similarity is high, and evaluations follow the blue curve. By having a particularly weighted combination of red, green, and blue curve we observe the “mercy bounce”.
¡ Sofar:Propertiesofindividualevaluations¡ But:Evaluationsneedtobe“summarized”§ Determiningrankingsofusersoritems§ Multipleevaluationsleadtoagroupdecision
¡ Howtoaggregateuserevaluationstoobtaintheopinionofthecommunity?§ Canweguesscommunity’sopinionfromasmallfractionofthemakeupofthecommunity?
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 30
¡ PredictWikipediaadminship electionresultswithoutseeingthevotes§ Observeidentitiesofthefirstk(=5)peoplevoting(butnothowtheyvoted)
§ Wanttopredicttheelectionoutcome§ Promotion vs.nopromotion
¡ Whyisithard?§ Don’tseethevotes(justvoters)§ Onlyseefirst5voters(outof~50)
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 31
[WSDM ‘12]
¡ Wanttomodelprob.userAvotes+inelectionofuserB
¡ Ourmodel:𝑃 𝐴 = + 𝐵 = 𝑃& + 𝑑(𝑆& − 𝑆+, 𝑠𝑖𝑚 𝐴,𝐵 )§ PA …empiricalfractionof+votesofA§ d(status,similarity)…avg.deviationinfrac.of+votes
§ WhenA evaluatesB fromaparticular(status,similarity)quadrant,howdoesthischangetheirbehavioronaverage?§ Note: d(status,similarity)onlytakes4differentvalues(basedonthequadrantinthe(status,similarity)space)
¡ Predict‘elected’if:10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 32
B
Pki=1 P (Ai = +|B) > w
[WSDM ‘12]
¡ Basedononlywhoshowedtovotepredicttheoutcomeoftheelection
§ Othermethods:§ Guessinggives52%accuracy§ LogisticRegressiononstatusandsimilarityfeatures:67%§ Ifweseethefirstk=5 votes85%(goldstandard)
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 33
Number of voters seen Accuracy
Theme: Learning from implicit feedbackAudience composition tells us something
about their reaction
¡ Socialmediasitesaregovernedby(oftenimplicit)userevaluations
¡ Wikipediavotingprocesshasanexplicit,publicandrecorded processofevaluation
¡ Maincharacteristics:§ Importanceofrelativeassessment:Status§ Importanceofpriorinteraction:Similarity§ Diversityofindividuals’responsefunctions
¡ Application: Ballot-blindprediction
3410/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu
¡ Statusseemstobesalientfeature
¡ Similarity alsoplaysimportantrole
¡ Audiencecompositionhelpspredictaudience’sreaction
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 35
¡ Therearetwowaystolookatthis:Onepersonevaluatestheotherviaapositive/negativeevaluation
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 37
+
++
++
+
––
–
––
–
Now we will focus on evaluations in the
context of a network
BA
So far we focused on a single evaluation
(without the context of a network)
¡ Networkswithpositiveandnegativerelationships
¡ Ourbasicunitofinvestigationwill besignedtriangles
¡ Firstwetalkaboutundirectednetworksthendirected
¡ Plan:§ Model: Considertwosoc. theoriesofsignednets§ Data: Reasonabouttheminlargeonlinenetworks§ Application: PredictifAandBarelinkedwith+or-
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 38
--+
--+
¡ Networkswithpositive andnegativerelationships
¡ Consideranundirectedcompletegraph¡ Labeleachedgeaseither:§ Positive:friendship,trust,positivesentiment,…§ Negative:enemy,distrust,negativesentiment,…
¡ ExaminetriplesofconnectednodesA,B,C
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 39
¡ Startwiththeintuition[Heider ’46]:§ Friend ofmyfriend ismyfriend§ Enemy ofenemy ismyfriend§ Enemy offriendismyenemy
¡ Lookatconnectedtriplesofnodes:
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 40
+++
--+
++-
---
UnbalancedBalancedConsistent with “friend of a friend” or
“enemy of the enemy” intuitionInconsistent with the “friend of a friend”
or “enemy of the enemy” intuition
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 41
BalancedUnbalanced
¡ Graphisbalancedifeveryconnectedtripleofnodeshas:§ All3edgeslabeled+,or§ Exactly1edgelabeled+
¡ Balanceimpliesglobalcoalitions [Cartwright-Harary]
¡ Fact: Ifalltrianglesarebalanced,theneither:§ Thenetworkcontainsonlypositiveedges,or§ Nodescanbesplitinto2setswherenegativeedgesonlypointbetweenthesets
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 42
+ +L
+R
-
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 43
B+
C
D
E
+
–
–
Friends of A Enemies of A
Every node in L is enemy of R
+ +
–
AAny 2 nodesin L are friends
Any 2 nodesin R are friends
L
R
¡ Internationalrelations:§ Positive edge:alliance§ Negative edge:animosity
¡ SeparationofBangladeshfromPakistanin1971:USsupportsPakistan.Why?§ USSR wasenemyofChina§ ChinawasenemyofIndia§ IndiawasenemyofPakistan§ USwasfriendlywithChina§ ChinavetoedBangladeshfromU.N.
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 44
P
R
I
C
U
+
+ ––
–
+?
B–?–
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 45
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 46
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 47
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 48
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 49
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 50
¡ Sofarwetalkedaboutcompletegraphs
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 51
Balanced?
-+
Def 1: LocalviewFillinthemissingedgestoachievebalance
Def 2: GlobalviewDividethegraphintotwocoalitions
The2definitionsareequivalent!
--
-
¡ Graphisbalanced if andonlyifitcontainsnocyclewithanoddnumberofnegative edges
¡ Howtocomputethis?§ Findconnectedcomponentson+edges
§ Ifwefindacomponentofnodeson+edgesthatcontainsa–edge⇒ Unbalanced
§ Foreachcomponentcreateasuper-node§ ConnectcomponentsAandBifthereisanegativeedgebetweenthemembers
§ Assignsuper-nodestosidesusingBFS
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 52
Even length cycle
–
–––
–
–
––
–
Odd length cycle
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 53
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 54
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 55
¡ UsingBFSassigneachnodeaside¡ Graphisunbalanced ifanytwoconnectedsuper-nodesareassignedthesameside
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 56
L
R R
L LL
RUnbalanced!
û
¡ Projectisasubstantialpartoftheclass§ Studentsputsignificanteffortandgreatthingshavebeendone
¡ Typesofprojects:§ (1) Analysisofaninterestingdatasetwiththegoaltodevelopa(new)modeloranalgorithm
§ (2) Atestofamodeloralgorithm(thatyouhavereadaboutoryourown)onreal&simulateddata.§ Fastalgorithmsforbiggraphs.CanbeintegratedintoSNAP.
¡ Otherpoints:§ Theprojectshouldcontainsomemathematicalanalysis,andsomeexperimentationonrealorsyntheticdata
§ Theresultoftheprojectwilltypicallybean8pagepaper,describingtheapproach,theresults,andrelatedwork.
§ Cometousifyouneedhelpwithaprojectidea!10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 58
Projectproposal: 3-5pages,teamsofupto3students¡ Projectproposalhas3parts:
§ (0)Quick200wordabstract§ (1)Relatedwork/Reactionpaper(2-3pages):
§ Read3papersrelatedtotheproject/class§ Doreadingbeyondwhatwascoveredinclass§ Thinkbeyondwhatyouread.Don’ttakeother’sworkforgranted!§ 2-3pages: Summary (~1page),Critique(~1page)
§ (2)Proposal(1-2pages):§ Clearlydefinetheproblemyouaresolving.§ HowdoesitrelatetowhatyoureadfortheReactionpaper?§ Whatdatawillyouuse?(makesureyoualreadyhaveit!)§ Whichalgorithm/modelwillyouuse/develop?Bespecific!§ Howwillyouevaluate/testyourmethod?
See http://cs224w.stanford.edu/info.html fordetailedinstructionsandexamplesofpreviousproposals
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 59
¡ Logistics:§ 1)RegisteryourgroupontheGoogleDochttp://bit.ly/1BNiHae
§ 2)SubmitPDF onGradeScopeAND athttp://snap.stanford.edu/submit/
§ Duein9days:ThuOct20at23:59PST!§ Nolateperiods
¡ Ifyouneedhelp/ideas/advicecometoOfficehours/Emailus
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 60
¡ Pinterest:Staytuned¡ Foodwebs:
§ http://vlado.fmf.uni-lj.si/pub/networks/data/bio/foodweb/foodweb.htmwithmetadata:https://www.cbl.umces.edu/~atlss/
¡ Tradenetworksovertime:§ http://faostat3.fao.org/download/F/FT/E
¡ StackExchange(replynetworks,Q/Anetworks):§ https://archive.org/details/stackexchange
¡ Microfinancedata:§ http://web.stanford.edu/~jacksonm/Data.html
¡ Reddit: Over1000subreddits foroneyear(2014).§ Networkswhereuserswhocommentneareachother.Veryinteresting forcomparingdifferent
communitiesetc.Lotsofmetadata(e.g.,frompostsorcomments)butthisdataisverylarge(hundredsofGbs)
¡ Interpersonalexpertiseoverlapwithinacompany§ Withinacompany,employeeswereaskedtorespondtothisquestion: Foreachpersoninthelist
below,pleaseshowhowstronglyyouagreeordisagreewiththefollowingstatement:Ingeneral,thispersonhasexpertise inareasthatareimportantinthekindofworkIdo.”
§ Link: http://opsahl.co.uk/tnet/datasets/Cross_Parker-Consulting_info.txt§ TypeofData:Originnode,destinationnode,weightofconnection(1-5)
¡ Moviegalaxies: Socialnetworksof200moviesfrom moviegalaxies.com.Eachnetworkrepresentshowcharactersinteractinonemovie
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 61
¡ TheNeuralNetworkofa Caenorhabditis elegans worm¡ Link: http://opsahl.co.uk/tnet/datasets/celegans_n306.txt¡ FormatofData:Originnode(Neuron),destinationnode(Neuron),weightoflink
¡ Thenetworkofairports intheUnitedStates¡ Description:FlightsbetweenUSairportsin2002(undirected),weightedbyhowmanyavailableseats
whereonflightsbetweentwoairportsoverthecourseoftheyear.¡ Link: http://opsahl.co.uk/tnet/datasets/USairport500.txt¡ TypeofData:Airport1,Airport2,numberofseatsacrosstheentireyearthatwereavailable
¡ Citation/author relationships¡ Description:Asetofroughly630,000papers,andtheirrespectiveauthors¡ Link: https://aminer.org/citation¡ Link:https://www.microsoft.com/en-us/research/project/microsoft-academic-graph/¡ TypeofData:(wouldrequiresometextprocessingtoextract)Nameofpaper,indexofpaper,authors
¡ Pages/hostnetwork¡ Description:Asetofhostsfromthe.uk domainandthepagestheylinkto¡ Link: http://law.di.unimi.it/webdata/uk-2014/
¡ WolfePrimatesinteraction¡ Description:Thesedatarepresent3monthsofinteractionsamongatroopofmonkeys.Vertex
attributescontainadditional information:(1)IDnumberoftheanimal; (2)ageinyears;(3)sex;(4)rankinthetroop.
¡ Link: http://nexus.igraph.org/api/dataset_info?id=45&format=html
¡ Pythondependencygraphforpypi¡ Description:Thelibraries whichdependonotherlibraries inthepackagepypi¡ Link: https://ogirardot.wordpress.com/2013/01/05/state-of-the-pythonpypi-dependency-graph/¡ Format:nameofdependency,versionextracted,json stringofotherdependencies
10/11/16 JureLeskovec,StanfordCS224W:SocialandInformationNetworkAnalysis,http://cs224w.stanford.edu 62