tweetcred: real-time credibility assessment of content on twitter @ socinfo conference 2014

23
Real-Time Credibility Assessment of Content on Twitter Adi$ Gupta*, Ponnurangam Kumaraguru*, Carlos Cas$llo~ and Patrick Meier~ *Indraprastha Ins$tute of Informa$on Technology ~Qatar Compu$ng Research Ins$tute November 11, SocInfo’14

Upload: precog

Post on 18-Jun-2015

376 views

Category:

Technology


1 download

TRANSCRIPT

Page 1: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

Real-Time Credibility Assessment of Content on Twitter"

Adi$  Gupta*,  Ponnurangam  Kumaraguru*,  Carlos  Cas$llo~  and  Patrick  Meier~  *Indraprastha  Ins$tute  of  Informa$on  Technology  

~Qatar  Compu$ng  Research  Ins$tute  November  11,  SocInfo’14  

Page 2: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

cerc.iiitd.ac.in  

TweetCred"

Page 3: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

cerc.iiitd.ac.in  

Non-trustworthy Content"

FAKE  

RUMORS  

3  

$  

Page 4: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

cerc.iiitd.ac.in  4  

Page 5: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

cerc.iiitd.ac.in  

Research Contributions"

� Semi-­‐supervised  ranking  model  for  scoring  tweets  according  to  their  credibility  in  real-­‐$me    

� TweetCred    - Live  deployment:  1,400+  TwiRer  users    - Credibility  score  computed  for  7+  Million  tweets  - Evaluated  TweetCred  in  terms  of  response  $me,  effec$veness  and  usability  

5  

Page 6: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

cerc.iiitd.ac.in  

Methodology"

6  

Page 7: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

cerc.iiitd.ac.in  

Training Data"� 500  Tweets  per  event  � Used  CrowdFlower  

7  

Event   Tweets   Users  Boston  Marathon  Blasts  (2013)   7,888,374   3,677,531  

Typhoon  Haiyan  /  Yolanda  (2013)   671,918   368,269  

Cyclone  Phailin  (2013)   76,136   34,776  

Washington  Navy  yard  shoo$ngs  (2013)   484,609   257,682  

Polar  vortex  cold  wave  (2014)   143,959   116,141  

Oklahoma  Tornadoes  (2013)   809,154   542,049  

 Total       10,074,150   4,996,448  

Page 8: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

cerc.iiitd.ac.in  

Annotation"�  Step  1  

- R1.  Contains  informa$on  about  the  event  - R2.  Is  related  to  the  event,  but  contains  no  informa$on  - R3.  Not  related  to  the  event  - R4.  Skip  tweet  

*45%  (class  R1),  40%  (class  R2),  and  15%  (class  R3)      

�  Step  2  - C1.  Definitely  credible  - C2.  Seems  credible  - C3.  Definitely  incredible  - C4.  Skip  tweet.    *52%  (class  C1),  35%  (class  C2),  and  13%  (class  C3)    

8  

Page 9: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

cerc.iiitd.ac.in  

Credibility Modeling "

9  

Feature  set      Features  (45)    

Tweet  meta-­‐data    Number  of  seconds  since  the  tweet;  Source  of  tweet  (mobile  /  web/  etc);  Tweet  contains  geo-­‐coordinates  

Tweet  content  (simple)    

Number  of  characters;  Number  of  words;  Number  of  URLs;  Number  of  hashtags;  Number  of  unique  characters;  Presence  of  stock  symbol;  Presence  of  happy  smiley;  Presence  of  sad  smiley;  Tweet  contains  `via';  Presence  of  colon  symbol  

Tweet  content  (linguis$c)    

Presence  of  swear  words;  Presence  of  nega$ve  emo$on  words;  Presence  of  posi$ve  emo$on  words;  Presence  of  pronouns;  Men$on  of  self  words  in  tweet  (I;  my;  mine)  

Tweet  author     Number  of  followers;  friends;  $me  since  the  user  if  on  TwiRer;  etc.  

Tweet  network    Number  of  retweets;  Number  of  men$ons;  Tweet  is  a  reply;  Tweet  is  a  retweet  

Tweet  links     WOT  score  for  the  URL;  Ra$o  of  likes  /  dislikes  for  a  YouTube  video  

Page 10: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

cerc.iiitd.ac.in  

Ranking Results"

10  

AdaRank  Coord.  Ascent   RankBoost   SVM-­‐rank  

NDCG@25   0.6773   0.5358   0.6736   0.3951  

NDCG@50   0.6861   0.5194   0.6825   0.4919  

NDCG@75   0.6949   0.7521   0.689   0.6188  

NDCG@100     0.6669   0.7607   0.6826   0.7219  

Time  (training)   35-­‐40  secs   1  min   35-­‐40  secs   9-­‐10  secs  

Time  (tesNng)   <1  sec   <1  sec   <1  sec   <1  sec  

Page 11: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

cerc.iiitd.ac.in  

TweetCred Demo"

11  

Page 12: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

cerc.iiitd.ac.in  

Implementation"

Page 13: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

cerc.iiitd.ac.in  

Feedback by Users"

13  

Page 14: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

cerc.iiitd.ac.in  

Usage Statistics"

Date  of  launch  of  TweetCred    27  Apr,  2014  

Credibility  score  seen  by  users  total   5,438,115  

Credibility  score  seen  by  users  unique   4,540,618  

Unique  TwiRer  users   1,127  

Feedback  was  given  for  tweets   1,273  

Unique  users  who  gave  feedback   263  

14  *at  the  $me  of  paper  submission  

Page 15: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

cerc.iiitd.ac.in  

Users of TweetCred"� Live  deployment  � Sample  users:  - Emergency  departments  - Firefighters  - Journalists  /  news  media  - General  users  - Researchers  (Requested  API  tokens)  

15  

Page 16: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

cerc.iiitd.ac.in  

Evaluation"� Response  Time  � Usability  Evalua$on  � User  Feedback  

16  

Page 17: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

cerc.iiitd.ac.in  

Response Time"

17  

Page 18: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

cerc.iiitd.ac.in  

Usability Survey"

� Online  survey  - Par$cipants:  67    

� Ques$ons:  -   System  Usability  Scale  (SUS)  ques$ons  - Demographic  ques$ons    

� SUS  Score:  70  - 74%  of  par$cipants  -­‐>  TweetCred  easy  to  use  - 81%  of  par$cipants  -­‐>  Would  like  to  use  TweetCred    

18  

Page 19: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

cerc.iiitd.ac.in  

User Feedback"

19  

 Observed                95%  Conf.  interval  Agreed  with  score   40.14   (36.73,  43.77)  Disagreed  with  score   59.85   (55.68,  64.26)  Disagreed:  score  should  be  higher     48.62   (44.86,  52.61)  Disagreed:  score  should  be  lower     11.23   (9.82,  13.65)  Disagreed  by  1  point   8.71   (7.17,  10.50)  Disagreed  by  2  points   14.29   (12.29,  16.53)  Disagreed  by  3  points   12.8   (10.91,  14.92)  Disagreed  by  4  points   10.91   (9.17,  12.89)  Disagreed  by  5  points   6.52   (5.19,  8.08)  

Page 20: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

cerc.iiitd.ac.in  

Crisis Events Bias"

20  

Page 21: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

v

Page 22: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

cerc.iiitd.ac.in  

Future Work"� Personaliza$on  - Users  trust  contacts  in  their  network  more  (users  they  follow,  retweet  or  men$on)    

� Intersec$on  between  psychology  literature  on  informa$on  credibility  &  credibility  of  content  on  TwiRer    

22  

Page 23: TweetCred: Real-Time Credibility Assessment of  Content on Twitter @ Socinfo Conference 2014

Thank  you!      

hRp://twitdigest.iiitd.edu.in/TweetCred/  cerc.iiitd.ac.in