Fast Shortest Path Distance Estimation in Large Networks

Michalis Potamias, Francesco Bonchi, Carlos Castillo, and Aristides Gionis

Presenter: Takuya Akiba (秋葉拓哉), Department of Computer Science, first-year master's student (M1)

2011/10/24, Web Engineering

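What kind of paper is this?

• Algorithms and experiments for shortest-path queries

• Team:
  – The first author is at Boston University
  – The other three authors are at Yahoo! Research (Barcelona, Spain)
  – The work came out of the first author's internship

• Student Best Paper at CIKM 2009
  – Incidentally, the Best Paper at the same conference also came from Yahoo! Research
  – CIKM 2011 is currently under way (Glasgow)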

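Outline

1. What is the shortest-path query problem?

2. Estimating shortest distances using landmarks

3. Examining landmark selection methods

4. Experimental results

What is the shortest-path query problem?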

Social Search

Context-Aware Search

Searching for "木" (tree)

Context-Aware Search

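Graphs and shortest distances in these applications

• Social Search
  – Social network: vertices are people, edges are friendships

• Context-Aware Search
  – Web graph: vertices are pages, edges are links

Shortest distances on these graphs are used as a signal for ranking results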

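Other demand for the shortest-path problem

• Route planning: road navigation, transit directions

• Data mining [WF94, Sco06]

• Information retrieval and databases [HWYY07, TWRC09]

• Bioinformatics [RAS+05, RS06]

• Computer networks [BLM+06, PS06]

• Others: XML, ontologies, ...

A fundamental problem → demand is correspondingly broad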

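We want shortest distances, but...

• Breadth-first search: $O(m)$ time ($m$: number of edges)

• Real-world graphs are very large

• Running a breadth-first search for every query is too slow

Number of Twitter users

Number of Facebook users

Number of pages held by Google (probably slightly outdated data)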
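As a point of reference for the per-query cost discussed above, here is a minimal sketch of answering one distance query with a fresh breadth-first search. It assumes an unweighted graph stored as an adjacency-list dict; the names `bfs_distances`, `bfs_query`, and `adj` are illustrative and do not come from the paper.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return {vertex: hop distance from source} for every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:          # first visit gives the shortest distance
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def bfs_query(adj, s, t):
    """Exact distance from s to t, or None if t is unreachable."""
    return bfs_distances(adj, s).get(t)

# Toy graph (hypothetical): path 0-1-2-3 with an extra leaf 4 attached to 1.
adj = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]}
print(bfs_query(adj, 0, 3))
```

Every call walks the graph again, which is the cost that the preprocessing-based approach tries to avoid.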

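Shortest-path query processing

1. Preprocessing

2. Query processing

"From Hongo to Komaba" → "30 minutes"

"From Nakano to Akihabara" → "20 minutes"

"From Sapporo to Naha" → "7 hours"

Precomputed data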

The precomputed data is used to answer queries

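Estimating shortest distances using landmarks

Triangle inequality

• $d(s, t)$: the shortest distance between $s$ and $t$ on the graph

• ★ $d(s, t) \le d(s, u) + d(u, t)$

($u$ is an arbitrary vertex)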

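Shortest-distance estimation by upper bound (single landmark)

• Use ★ as is

1. Preprocessing
  – Choose one vertex $u$ (the landmark)
  – Precompute $d(s, u)$ and $d(u, t)$ for all vertices $s$, $t$ (breadth-first search)

2. Query processing
  – Return $d(s, u) + d(u, t)$ as the estimate of $d(s, t)$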

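Shortest-distance estimation by upper bound (multiple landmarks)

Go from a single landmark ($u$) to multiple landmarks ($U = \{u_1, \dots, u_k\}$)

• Apply ★ to multiple vertices

1. Preprocessing
  – Fix a set $U$ of a given number of vertices (the landmarks)
  – For each $u \in U$, precompute $d(s, u)$ and $d(u, t)$ for all vertices $s$, $t$ (BFS)

2. Query processing
  – Return $\min_{u \in U} \{ d(s, u) + d(u, t) \}$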
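The two phases above can be sketched as follows. This is a minimal illustration, assuming an unweighted, undirected graph (so the distance from a landmark to a vertex equals the distance in the other direction) stored as an adjacency-list dict. The names `preprocess`, `estimate`, and `adj` are made up for this example and are not from the paper.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return {vertex: hop distance from source} for every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def preprocess(adj, landmarks):
    """Phase 1: one BFS per landmark; stores the distance from each landmark to all vertices."""
    return {u: bfs_distances(adj, u) for u in landmarks}

def estimate(index, s, t):
    """Phase 2: upper bound min over landmarks u of d(s, u) + d(u, t); None if no landmark reaches both."""
    best = None
    for dist_u in index.values():
        if s in dist_u and t in dist_u:
            candidate = dist_u[s] + dist_u[t]
            if best is None or candidate < best:
                best = candidate
    return best

# Toy graph (hypothetical): a cycle on 6 vertices, 0-1-2-3-4-5-0.
adj = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

one = preprocess(adj, [0])
two = preprocess(adj, [0, 2])
print(estimate(one, 1, 5))  # 2: landmark 0 lies on a shortest path, so the bound is exact
print(estimate(one, 1, 2))  # 3: true distance is 1, the bound is loose
print(estimate(two, 1, 2))  # 1: adding landmark 2 tightens the bound
```

A query now costs one lookup pair per landmark instead of a graph traversal, at the price of storing one distance per (landmark, vertex) pair.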

Landmark selection methods

Selecting landmarks

• Choose $k$ landmarks
  – $k$ is around 20, 100, or 300

• Baseline: random selection
  – [Tang+, SIGCOMM’03], [Kleinberg+, FOCS’04], [Vieira+, CIKM’07]

• Key insight of this paper:
  – Might there be heuristics that do better than random selection?

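Idea

• Vertices that many shortest paths pass through look promising

• Vertices near the center of the graph look promising

(Figure: two example graphs, a good case and a bad case)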

Basic Strategies

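• Degree Strategy
  – Pick vertices in decreasing order of degree

• Centrality Strategy
  – Pick vertices in increasing order of closeness centrality

What is the closeness centrality of a vertex $u$?

$C(u) = \frac{1}{n} \sum_{v \in V} d(u, v)$ ($n$: number of vertices)

That is, the average distance to all vertices. The smaller the value, the closer the vertex is considered to be to the "center". In practice it is computed approximately from a random sample.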
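The two basic strategies can be sketched as below. This is one possible reading, not necessarily the paper's exact procedure: it assumes an unweighted, undirected, connected graph as an adjacency-list dict, and it approximates closeness by averaging distances to a random sample of vertices. The function names and the `num_samples` and `seed` parameters are hypothetical.

```python
import random
from collections import deque

def bfs_distances(adj, source):
    """Return {vertex: hop distance from source} for every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def degree_landmarks(adj, k):
    """Degree strategy: the k vertices with the highest degree."""
    return sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:k]

def approx_closeness(adj, num_samples, seed=0):
    """Average distance from each vertex to a random sample of vertices (assumes a connected graph)."""
    rng = random.Random(seed)
    vertices = list(adj)
    samples = rng.sample(vertices, min(num_samples, len(vertices)))
    total = {v: 0 for v in vertices}
    for s in samples:
        dist = bfs_distances(adj, s)   # undirected: d(s, v) == d(v, s)
        for v in vertices:
            total[v] += dist[v]
    return {v: total[v] / len(samples) for v in vertices}

def centrality_landmarks(adj, k, num_samples, seed=0):
    """Centrality strategy: the k vertices with the smallest (approximate) average distance."""
    score = approx_closeness(adj, num_samples, seed)
    return sorted(adj, key=score.get)[:k]

# Toy graph (hypothetical): a star with center 0 and leaves 1..4.
adj = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
print(degree_landmarks(adj, 1))
print(centrality_landmarks(adj, 1, num_samples=5))
```

With a sample as large as the vertex set the estimate equals the exact average; smaller samples trade accuracy for fewer BFS runs.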

Constrained Strategies

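• Idea
  – Many landmarks packed close together are wasted

• Degree/$h$ Strategy
  – Vertices within distance $h$ of an already selected vertex may not be selected

• Centrality/$h$ Strategy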
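A minimal sketch of the constrained degree strategy follows, assuming that the exclusion radius is a hop count `h` and that "within distance" includes vertices at exactly distance `h`. The graph is an unweighted, undirected adjacency-list dict; the function names are illustrative only.

```python
from collections import deque

def within_distance(adj, source, h):
    """Set of vertices at hop distance <= h from source (depth-limited BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if dist[v] == h:               # do not expand beyond radius h
            continue
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return set(dist)

def constrained_degree_landmarks(adj, k, h):
    """Scan vertices by decreasing degree, skipping any vertex near an earlier pick."""
    banned = set()
    chosen = []
    for v in sorted(adj, key=lambda x: len(adj[x]), reverse=True):
        if len(chosen) == k:
            break
        if v in banned:
            continue
        chosen.append(v)
        banned |= within_distance(adj, v, h)
    return chosen                      # may hold fewer than k vertices

# Toy graph (hypothetical): two stars with centers 0 and 4, joined by the edge 0-4.
adj = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0, 5, 6], 5: [4], 6: [4]}
print(constrained_degree_landmarks(adj, 2, h=0))  # [0, 4]: same as the plain degree strategy
print(constrained_degree_landmarks(adj, 2, h=1))  # [0, 5]: vertex 4 is adjacent to 0, so it is skipped
```

The same loop works for the centrality variant by changing the sort key to a closeness score.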

Partitioning-Based Strategies

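• Idea
  – Landmarks scattered across different regions look promising

• Let's use graph partitioning

What is graph partitioning?

1. Split the graph into $k$ parts of nearly equal size
2. Minimize the number of edges between different parts

The problem is NP-hard, and heuristics for it are well studied (the 10th DIMACS Implementation Challenge is under way)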

Partitioning-Based Strategies

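• Degree/P
  – In each partition, the vertex with the highest degree

• Centrality/P
  – In each partition, the vertex with the best closeness centrality (smallest average distance)

• Border/P
  – In each partition, the vertex that maximizes a border score (roughly, a vertex near the partition's edge)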
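As an illustration of the Degree/P idea, the sketch below picks the highest-degree vertex in each partition. It assumes the partition is already available as a dict from vertex to part id (for example, from an external graph partitioner); the partitioning itself is not shown, and the names `degree_per_partition` and `part` are hypothetical.

```python
def degree_per_partition(adj, part):
    """Return {part id: highest-degree vertex in that part}; ties keep the first vertex seen."""
    best = {}
    for v in adj:
        p = part[v]
        if p not in best or len(adj[v]) > len(adj[best[p]]):
            best[p] = v
    return best

# Toy graph (hypothetical): two stars with centers 0 and 4, joined by the edge 0-4.
adj = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0, 5, 6], 5: [4], 6: [4]}
# Assumed partition: one part per star.
part = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1}
print(degree_per_partition(adj, part))  # {0: 0, 1: 4}
```

Centrality/P has the same shape with the degree replaced by a closeness score.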

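Experimental evaluation

Datasets

Table 1

Approximation accuracy (relative error)

Table 2

Approximation accuracy (relative error)

Figure 3

Query time compared with exact methods

Table 5

Accuracy in Social Search

Figure 5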
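For readers unfamiliar with the metric, the sketch below computes relative error under the usual definition, $|\hat{d} - d| / d$ for an estimate $\hat{d}$ and exact distance $d > 0$, averaged over query pairs. That this matches the paper's exact evaluation protocol is an assumption; the function names and sample values are made up.

```python
def relative_error(estimate, exact):
    """|estimate - exact| / exact; assumes exact > 0 (distinct, connected vertices)."""
    return abs(estimate - exact) / exact

def mean_relative_error(pairs):
    """Average relative error over (estimate, exact) pairs."""
    errors = [relative_error(est, ex) for est, ex in pairs]
    return sum(errors) / len(errors)

# Hypothetical (estimate, exact) pairs, not data from the paper.
pairs = [(3, 1), (2, 2)]
print(mean_relative_error(pairs))  # 1.0
```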

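Summary

• What I covered
  – What the shortest-path query problem is
  – Estimating shortest distances using landmarks
  – Strategies for selecting landmarks
  – Experimental results

• What I did not cover
  – NP-hardness of landmark selection
  – Estimation by lower bounds, and estimation using upper and lower bounds together (these do not work well)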

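Related Work (including literature published after this paper)

• Shortest-path queries on transportation networks
  – Many methods exist because the structure is easier to exploit
  – ALT (A* + landmarks), Reach, Hierarchical, …

• Exact shortest-path queries
  – ALT
  – 2-HOP [Cohen+, SODA’02] [Cheng+, EDBT’09]
  – Symmetry [Xiao+, EDBT’09]
  – Tree decomposition [Wei, SIGMOD’10]

• Approximate shortest-path queries
  – NSI [Rattigan+, SIGKDD’06]
  – Landmark [Potamias+, CIKM’09 (this paper)]
  – Distance-Sketch [Das Sarma+, WSDM’10]
  – Path-Sketch [Gubichev+, CIKM’10]

• Reachability queries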
