deepwalk: online learning of social representations

1

DeepWalk: Online Learning of Social Representations

Bryan Perozzi, Rami Al-Rfou, Steven Skiena
ACM SIGKDD 2014

Outline

• Introduction: Graphs as Features
• Language Modeling
• DeepWalk
• Evaluation: Network Classification
• Conclusions & Future Work

Introduction

• Uses deep learning to learn a latent representation of a graph's adjacency matrix (social relations -> vector space)

Introduction

• Represents the interactions between users of a social network (community) in a vector space, so that they can be fed to learning models

Zachary’s Karate Network

Language modeling

• Estimates how likely a particular word sequence is to appear in a corpus

• Learns latent representations of words from documents (word co-occurrence):
– word2vec

• These representations can capture the semantic meaning of words

From language modeling to graphs

• Word frequency in a natural-language corpus follows a power law
• Vertex frequency in random walks on a scale-free graph also follows a power law

From language modeling to graphs

• Each random walk represents one sentence
• Short random walks = sentences

Framework

Random Walks

1. Generate $\gamma$ random walks rooted at each vertex
2. Each random walk has length $t$
3. The next vertex to visit is sampled uniformly from the neighbors of the current vertex
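
The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the `walks_per_vertex` and `walk_length` parameters, the adjacency-list dictionary, and the toy graph are all assumptions made for the example. Shuffling the roots on each pass is an optional choice here.

```python
import random

def random_walk(graph, root, walk_length, rng):
    """Return one random walk of up to `walk_length` vertices starting at `root`."""
    walk = [root]
    while len(walk) < walk_length:
        neighbors = graph[walk[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))  # uniform sample over the neighbors
    return walk

def build_corpus(graph, walks_per_vertex, walk_length, seed=0):
    """Generate `walks_per_vertex` walks rooted at every vertex of `graph`."""
    rng = random.Random(seed)
    vertices = list(graph)
    corpus = []
    for _ in range(walks_per_vertex):
        rng.shuffle(vertices)  # optional: visit the roots in a random order
        for root in vertices:
            corpus.append(random_walk(graph, root, walk_length, rng))
    return corpus

# Toy undirected graph as adjacency lists (hypothetical data).
toy_graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
corpus = build_corpus(toy_graph, walks_per_vertex=2, walk_length=5)
```

Each element of `corpus` is a list of vertex ids that plays the role of one sentence for the language model.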

Framework

Representation Mapping

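• Map each vertex $v_j$ to its representation vector: $v_j \rightarrow \Phi(v_j)$

• Map its neighboring vertices into the same vector space.

Maximize: $\Pr(u_k \mid \Phi(v_j))$ for each vertex $u_k$ within the window around $v_j$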

Skip-gram model

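A language model that maximizes the co-occurrence probability of the words that appear within a window of size $w$ in a sentence

Maximize: $\Pr(\{v_{i-w}, \dots, v_{i+w}\} \setminus v_i \mid \Phi(v_i))$, where $v_i$ is the center word (or vertex) and $\Phi(v_i)$ is its representation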
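
To make the window concrete, the sketch below lists the (center, context) pairs that a Skip-gram model would train on for one sequence. The function name `skipgram_pairs` and the toy walk are assumptions made for illustration. The code covers only the pair extraction, not the model itself.

```python
def skipgram_pairs(sequence, window):
    """Return (center, context) pairs for every item within `window` positions of the center."""
    pairs = []
    for i, center in enumerate(sequence):
        lo = max(0, i - window)
        hi = min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # the center is never its own context
                pairs.append((center, sequence[j]))
    return pairs

# A short random walk treated as a sentence (hypothetical vertex names).
walk = ["a", "b", "c", "d"]
pairs = skipgram_pairs(walk, window=1)
```

With `window=1`, each vertex is paired only with its immediate predecessor and successor in the walk. A larger window pairs it with vertices further along the walk.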

Framework

Hierarchical Softmax

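• Computing $\Pr(u_k \mid \Phi(v_j))$ directly takes as many operations as there are vertices -> $O(|V|)$

• Represent the vertices as the leaves of a binary tree
• Maximizing $\Pr(u_k \mid \Phi(v_j))$ = maximizing the probability of the path from the root to the node $u_k$
• Each internal node on the path is a logistic binary classifier
• Therefore, $O(|V|)$ -> $O(\log |V|)$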
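
The path-probability idea can be sketched as follows. The code assumes a complete binary tree with heap-style node numbering and random weights, all chosen for illustration; the slides do not specify how the tree is built. It shows that the probability of a leaf is a product of `depth` logistic decisions and that the leaf probabilities still sum to one.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def path_to_leaf(leaf, depth):
    """Return [(internal_node, go_right), ...] from the root to `leaf`.

    Internal nodes use heap numbering: the root is 1, the children of n are 2n and 2n + 1.
    """
    path, node = [], 1
    for level in reversed(range(depth)):
        go_right = (leaf >> level) & 1  # next bit of the leaf index, most significant first
        path.append((node, go_right))
        node = 2 * node + go_right
    return path

def leaf_probability(phi, psi, leaf, depth):
    """Pr(leaf | phi) as a product of `depth` binary logistic decisions."""
    prob = 1.0
    for node, go_right in path_to_leaf(leaf, depth):
        score = sum(p * q for p, q in zip(phi, psi[node]))
        prob *= sigmoid(score) if go_right else sigmoid(-score)
    return prob

rng = random.Random(0)
depth, dim = 3, 4                 # 2**3 = 8 leaves (vertices), 4-dimensional vectors
num_leaves = 2 ** depth
phi = [rng.uniform(-1, 1) for _ in range(dim)]  # representation of the input vertex
# One classifier weight vector per internal node (nodes 1 .. num_leaves - 1).
psi = {n: [rng.uniform(-1, 1) for _ in range(dim)] for n in range(1, num_leaves)}
probs = [leaf_probability(phi, psi, k, depth) for k in range(num_leaves)]
```

Evaluating one leaf touches only `depth` classifiers, which is where the logarithmic cost comes from.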

Learning

• Learned parameters:
– Vertex representations
– Weights of the tree's binary classifiers

• Vertex representations are first initialized randomly.
• The tree's binary classifiers are used to compute the loss function.
• Stochastic Gradient Descent (SGD) updates both sets of parameters simultaneously.
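
A single training step of this kind might look like the sketch below. It assumes the loss is the negative log-probability of one root-to-leaf path, with one logistic classifier per internal node. The path, learning rate, dimensions, and function names are hypothetical values chosen for the example, not taken from the slides.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def path_loss(phi, psi, path):
    """Negative log-probability of following `path` = [(node, go_right), ...]."""
    loss = 0.0
    for node, go_right in path:
        score = dot(phi, psi[node])
        loss -= math.log(sigmoid(score if go_right else -score))
    return loss

def sgd_step(phi, psi, path, lr):
    """Update `phi` and the classifiers on `path` in place with one gradient step."""
    grad_phi = [0.0] * len(phi)
    for node, go_right in path:
        err = sigmoid(dot(phi, psi[node])) - go_right  # d(loss) / d(score)
        for d in range(len(phi)):
            grad_phi[d] += err * psi[node][d]  # accumulate using the old classifier weight
            psi[node][d] -= lr * err * phi[d]  # update the classifier weight
    for d in range(len(phi)):
        phi[d] -= lr * grad_phi[d]  # update the vertex representation last

rng = random.Random(0)
dim = 4
phi = [rng.uniform(-0.5, 0.5) for _ in range(dim)]  # randomly initialized vertex vector
psi = {n: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for n in (1, 3, 6)}
path = [(1, 1), (3, 0), (6, 1)]  # hypothetical root-to-leaf path

loss_before = path_loss(phi, psi, path)
for _ in range(50):
    sgd_step(phi, psi, path, lr=0.1)
loss_after = path_loss(phi, psi, path)
```

Both the vertex vector and the classifier weights are updated from gradients computed at the same parameter values, which matches the simultaneous update described above.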

Framework

Experiments

• Node Classification
– Given a graph in which only some of the nodes are labeled, predict the labels of the unlabeled nodes

• Datasets
– BlogCatalog
– Flickr
– YouTube

• Baselines
– SpectralClustering, MaxModularity, EdgeCluster (k-means), weighted-vote Relational Neighbor (wvRN)

Results: BlogCatalog

• DeepWalk works well even on data with few labeled nodes

Results: YouTube

• Scales to very large graphs!

Parallelization

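• Parameter updates during training are sparse, so different workers rarely touch the same parameters; the parts can be processed in parallel without hurting performance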

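Conclusions

• Data represented as a network can be embedded in a continuous vector space and used for learning.
• Vertex sequences from random walks on a graph can be treated like word sequences, so language models can be applied to graphs.
• It works well even when labels are scarce.
• It scales to large graphs, so it can be used for online learning.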

Thank you!
Q & A
