Transcript
Page 1: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy

Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy

Aditi Gupta, Hemank Lamba, Ponnurangam Kumaraguru, Anupam Joshi

 

Page 2: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy

 precog.iiitd.edu.in  

Agenda

• Background
• Motivation
• Main contributions
• Data description
• Methodology
• Results
• Future work

Page 3: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy


Background: Hurricane Sandy

• Dates: Oct 22-31, 2012
• Category 3 storm
• Damages worth $75 billion
• Coast of NE America

Page 4: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy


Motivation

FAKE IMAGES

Page 5: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy


Motivation


Page 6: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy


Our Contributions

• Performed an in-depth characterization of tweets sharing fake images on Twitter during Hurricane Sandy
- Tweets containing the fake image URLs were mostly retweets (86%)
- Only 11% overlap between the retweet and follower graphs of tweets containing fake images
• Applied classification algorithms to distinguish between tweets containing fake and real images
- The best accuracy, 97%, was achieved by a decision tree classifier using tweet-based features

Page 7: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy


Methodology

Data Collection and Filtering

Data Characterization

Feature Generation

Obtaining Ground Truth

Evaluating Results

Classification Module

Page 8: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy


Data Description

Total tweets          1,782,526
Total unique users    1,174,266
Tweets with URLs      622,860

Page 9: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy


Data Filtering

• Used a reputable online resource to separate fake and real images
- The Guardian collected and publicly distributed a list of fake and true images shared during Hurricane Sandy
• One of the biggest fake content propagation datasets studied by researchers

Tweets with fake images    10,350
Users with fake images     10,215
Tweets with real images    5,767
Users with real images     5,678
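
The filtering step can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes each tweet is a dict with a `urls` list and that the fake and real image URLs from the Guardian list are available as sets. The field names, sample URLs, and the rule for dropping ambiguous tweets are all hypothetical. In practice, shortened URLs would probably need to be expanded before matching; that step is omitted here.

```python
def label_tweets(tweets, fake_urls, real_urls):
    """Split tweets into those sharing known fake or known real image URLs.

    Tweets matching neither list are dropped. A tweet matching both lists is
    treated as ambiguous and dropped too (an assumption, not from the slides).
    """
    fake, real = [], []
    for tweet in tweets:
        urls = set(tweet.get("urls", []))
        has_fake = bool(urls & fake_urls)
        has_real = bool(urls & real_urls)
        if has_fake and not has_real:
            fake.append(tweet)
        elif has_real and not has_fake:
            real.append(tweet)
    return fake, real


# Hypothetical sample data, for illustration only.
FAKE_URLS = {"http://example.com/shark.jpg"}
REAL_URLS = {"http://example.com/flooded-station.jpg"}
TWEETS = [
    {"id": 1, "user": "a", "urls": ["http://example.com/shark.jpg"]},
    {"id": 2, "user": "b", "urls": ["http://example.com/flooded-station.jpg"]},
    {"id": 3, "user": "c", "urls": ["http://example.com/other.jpg"]},
    {"id": 4, "user": "d", "urls": []},
]

fake_tweets, real_tweets = label_tweets(TWEETS, FAKE_URLS, REAL_URLS)
```

Counting distinct `user` values in each returned list would give the per-class user totals of the kind shown in the table above.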

Page 10: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy


Characterization – Fake Image Propagation

• 86% of tweets spreading the fake images were retweets
• The top 30 users out of 10,215 (0.3%) accounted for 90% of the retweets of fake images
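
Both statistics are simple ratios. The sketch below shows one way they could be computed, assuming each tweet is a dict whose hypothetical `retweet_of` field holds the screen name of the retweeted author, or `None` for an original tweet. It also assumes "top users" means the authors whose tweets were retweeted most, which is one reading of the slide. The sample data is made up.

```python
from collections import Counter


def retweet_share(tweets):
    """Fraction of tweets that are retweets."""
    if not tweets:
        return 0.0
    return sum(1 for t in tweets if t["retweet_of"] is not None) / len(tweets)


def top_user_share(tweets, k):
    """Fraction of all retweets whose original author is among the k most-retweeted users."""
    counts = Counter(t["retweet_of"] for t in tweets if t["retweet_of"] is not None)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(n for _, n in counts.most_common(k)) / total


# Hypothetical sample: one original tweet and four retweets.
SAMPLE = [
    {"user": "u1", "retweet_of": None},
    {"user": "u2", "retweet_of": "u1"},
    {"user": "u3", "retweet_of": "u1"},
    {"user": "u4", "retweet_of": "u1"},
    {"user": "u5", "retweet_of": "u6"},
]

share = retweet_share(SAMPLE)          # 4 of 5 tweets are retweets
top1 = top_user_share(SAMPLE, k=1)     # 3 of 4 retweets trace back to "u1"

# The slide's 0.3% is 30 users out of 10,215.
top_fraction = 30 / 10215
```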


Page 11: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy


Network Analysis

Tweet-retweet graph for the spread of fake images at the nth and (n+1)th hour

Page 12: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy


Role of Explicit Twitter Network

• Analyzing the role of the follower network in fake image propagation
• We crawled the Twitter network for all users who tweeted the fake image URLs

Page 13: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy


Algorithm


Page 14: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy


Results

Total edges in the retweet network                                        10,508
Total edges in the follower-followee network                              10,799,122
Total edges in both the retweet network and the follower-followee network 1,215
Percentage overlap                                                        11%
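
The overlap figure is consistent with dividing the shared edges by the retweet edges: 1,215 / 10,508 is about 11.6%, which the slide reports as 11%. The algorithm slide's content is not in this transcript, so the following is only a sketch under stated assumptions: a retweet edge is a hypothetical `(retweeter, author)` pair, a follower edge is a `(follower, followee)` pair, and an edge counts as shared when the same ordered pair appears in both sets.

```python
def edge_overlap(retweet_edges, follower_edges):
    """Return (shared edge count, share of retweet edges also in the follower graph)."""
    retweet_edges = set(retweet_edges)
    shared = retweet_edges & set(follower_edges)
    if not retweet_edges:
        return 0, 0.0
    return len(shared), len(shared) / len(retweet_edges)


# Hypothetical toy graphs: only ("b", "a") appears in both.
RETWEET_EDGES = [("b", "a"), ("c", "a"), ("d", "b")]
FOLLOWER_EDGES = [("b", "a"), ("d", "a"), ("c", "b")]

shared_count, overlap = edge_overlap(RETWEET_EDGES, FOLLOWER_EDGES)

# The same ratio using the totals reported on the slide.
reported_overlap = 1215 / 10508
```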

Page 15: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy


Classification

• 5-fold cross-validation

Tweet Features [F2]
- Length of Tweet
- Number of Words
- Contains Question Mark?
- Contains Exclamation Mark?
- Number of Question Marks
- Number of Exclamation Marks
- Contains Happy Emoticon
- Contains Sad Emoticon
- Contains First Order Pronoun
- Contains Second Order Pronoun
- Contains Third Order Pronoun
- Number of uppercase characters
- Number of negative sentiment words
- Number of positive sentiment words
- Number of mentions
- Number of hashtags
- Number of URLs
- Retweet count

User Features [F1]
- Number of Friends
- Number of Followers
- Follower-Friend Ratio
- Number of times listed
- User has a URL
- User is a verified user
- Age of user account
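
Several of the tweet features [F2] can be derived from the tweet text alone. The sketch below covers only that subset and is an illustration, not the authors' feature extractor: the regular expressions are simplified assumptions, the dictionary keys are hypothetical names, and the emoticon, pronoun, sentiment, and retweet-count features are left out because they need lexicons or tweet metadata that the slides do not specify.

```python
import re

# Simplified patterns; real tweet entities are more permissive than these.
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
URL_RE = re.compile(r"https?://\S+")


def tweet_features(text):
    """Compute a text-only subset of the tweet features [F2]."""
    return {
        "length": len(text),
        "num_words": len(text.split()),
        "has_question_mark": "?" in text,
        "has_exclamation_mark": "!" in text,
        "num_question_marks": text.count("?"),
        "num_exclamation_marks": text.count("!"),
        "num_uppercase": sum(1 for ch in text if ch.isupper()),
        "num_mentions": len(MENTION_RE.findall(text)),
        "num_hashtags": len(HASHTAG_RE.findall(text)),
        "num_urls": len(URL_RE.findall(text)),
    }


# Hypothetical example tweet.
EXAMPLE = "OMG! Shark in NJ?! #Sandy @user http://t.co/abc"
features = tweet_features(EXAMPLE)
```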

Page 16: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy


Classification Results

                 F1 (user)   F2 (tweet)   F1+F2
Naïve Bayes      56.32%      91.97%       91.52%
Decision Tree    53.24%      97.65%       96.65%

• The best results came from the decision tree classifier, which achieved 97% accuracy in distinguishing fake images from real ones.

• Tweet-based features are very effective in distinguishing tweets with fake images from tweets with real images, while user-based features performed very poorly.

• Our results provide a proof of concept that automated techniques can be used to distinguish real images from fake images posted on Twitter.
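
The classification slide mentions 5-fold cross-validation, so the accuracies above are presumably averages over five held-out folds. The evaluation loop can be sketched as below. To stay self-contained, the sketch swaps in a one-level decision stump for the decision tree and runs on synthetic data with 0/1 labels. The function names, the seed, and the data are all hypothetical, and nothing here reproduces the reported numbers.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and deal it into k folds of near-equal size."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_stump(rows, labels):
    """Fit a one-level decision tree on 0/1 labels, chosen by training accuracy."""
    majority = int(sum(labels) * 2 >= len(labels))
    # Baseline: always predict the majority class.
    best = (sum(y == majority for y in labels) / len(labels), None, None, majority)
    for f in range(len(rows[0])):
        values = sorted({row[f] for row in rows})
        # Candidate thresholds are midpoints between consecutive observed values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            for above in (0, 1):
                preds = [above if row[f] > threshold else 1 - above for row in rows]
                acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
                if acc > best[0]:
                    best = (acc, f, threshold, above)
    _, f, threshold, above = best
    if f is None:
        return lambda row: majority
    return lambda row: above if row[f] > threshold else 1 - above


def cross_val_accuracy(rows, labels, k=5, seed=0):
    """Mean held-out accuracy over k folds."""
    scores = []
    for held_out in k_fold_indices(len(rows), k, seed):
        held = set(held_out)
        train = [i for i in range(len(rows)) if i not in held]
        predict = train_stump([rows[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(rows[i]) == labels[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


# Synthetic, cleanly separable data: feature 0 separates the classes, feature 1 is noise.
ROWS = [[i, i % 3] for i in range(10)] + [[100 + i, i % 3] for i in range(10)]
LABELS = [0] * 10 + [1] * 10

mean_accuracy = cross_val_accuracy(ROWS, LABELS, k=5)
```

Each fold is held out exactly once while the model is fitted on the other four, so every example contributes to the accuracy estimate without ever being scored by a model that saw it during training.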

Page 17: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy


Future Work

Fake Charity

Rumor

Page 18: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy


Some Attractions


Page 19: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy


Some Attractions


Page 20: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy


Q & A


Page 21: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy

Thank You!!!

[email protected]  [email protected]  

   

Page 22: Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy

For any further information, please write to [email protected]

precog.iiitd.edu.in
