Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense...

Literature review, 2015/02/06. Nagaoka University of Technology, Natural Language Processing Laboratory. Shohei Okada

Upload: shohei-okada

Post on 20-Jul-2015

TRANSCRIPT

Page 1: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Literature review
2015/02/06

Nagaoka University of Technology, Natural Language Processing Laboratory

Shohei Okada

Page 2: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Paper information
An Iterative ‘Sudoku Style’ Approach to Subgraph-based Word Sense Disambiguation

Steve L. Manion and Raazesh Sainudiin

In Proc. of *SEM 2014, pp. 40-50. 2014.

2015/02/06 Literature review 2

Page 3: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Overview
• An improvement to subgraph-based WSD

• Subgraph construction and disambiguation are performed alternately (iterative approach)

• Applying the iterative approach to an average WSD system yields performance comparable to the state of the art

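Page 4: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Background
The knowledge acquisition (training data) bottleneck in WSD
Expectations for unsupervised methods that use lexical knowledge bases (LKBs)

• Lexical granularity, domain, cross-linguality

Focus on one such family of methods, subgraph-based WSD
1. Construct a semantic subgraph from the LKBs
2. Select senses using a graph-based importance measure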

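Page 5: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Existing subgraph-based WSD
1. Lemmatize the sequence of words in the input

• $w_1, \dots, w_m \mapsto \mathcal{L}$, $\mathcal{L} = l_1, \dots, l_m$
2. Regard the LKB as a single semantic graph $\mathcal{G} = (\mathcal{S}, \mathcal{E})$

• $\mathcal{S}$: set of nodes (senses)
• $\mathcal{E}$: set of edges (relations between senses)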

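Page 6: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Existing subgraph-based WSD
3. Construct the subgraph $\mathcal{G}_{\mathcal{L}}$

• subtree paths, shortest paths, local edges
4. Select a sense for each word

• Set of senses of lemma $l_i$: $R(l_i) = \{s_{i,1}, \dots, s_{i,k}\}$
• Importance score of sense $s_{i,j}$ in the subgraph $\mathcal{G}_{\mathcal{L}}$: $\phi(s_{i,j})$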
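Step 4 can be written as a single selection rule. This is a sketch under the assumption that the sense with the highest importance score is the one chosen for each lemma; the symbol $\hat{s}_i$ for the selected sense is introduced here for illustration and does not appear on the slides.

```latex
\[
  \hat{s}_i \;=\; \operatorname*{arg\,max}_{s_{i,j} \in R(l_i)} \phi(s_{i,j}),
  \qquad i = 1, \dots, m
\]
```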

Page 7: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Existing subgraph-based WSD
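Steps 2 to 4 of the existing pipeline can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: the toy LKB edges, the sense identifiers such as `bank#finance`, the use of shortest paths of at most two edges to build the subgraph, and plain node degree as the importance score. Lemmatization (step 1) is taken as already done.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy LKB: undirected relations between sense identifiers.
LKB_EDGES = [
    ("bank#finance", "money#1"),
    ("deposit#payment", "money#1"),
    ("bank#finance", "deposit#payment"),
    ("bank#river", "water#1"),
    ("deposit#sediment", "water#1"),
]

# Hypothetical sense inventory R(l_i) for each lemma l_i.
SENSES_BY_LEMMA = {
    "bank": ["bank#finance", "bank#river"],
    "deposit": ["deposit#payment", "deposit#sediment"],
    "money": ["money#1"],
}


def build_adjacency(edges):
    """Step 2: view the LKB as one graph G = (S, E)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shortest_path(adj, source, target, max_len):
    """Breadth-first search; returns a node list, or None if no path
    of at most max_len edges exists."""
    parents = {source: None}
    depth = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        if depth[node] == max_len:
            continue
        for nxt in sorted(adj.get(node, ())):
            if nxt not in parents:
                parents[nxt] = node
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return None


def build_subgraph(adj, senses_by_lemma, max_len=2):
    """Step 3: collect edges on shortest paths between candidate senses
    of different lemmas."""
    sub_edges = set()
    for (_, s1), (_, s2) in combinations(sorted(senses_by_lemma.items()), 2):
        for a in s1:
            for b in s2:
                path = shortest_path(adj, a, b, max_len)
                if path:
                    sub_edges.update(frozenset(e) for e in zip(path, path[1:]))
    return sub_edges


def degree(sub_edges, node):
    """Importance score phi: number of subgraph edges touching the node."""
    return sum(1 for e in sub_edges if node in e)


def disambiguate(edges, senses_by_lemma, max_len=2):
    """Step 4: pick the highest-scoring sense of each lemma."""
    sub_edges = build_subgraph(build_adjacency(edges), senses_by_lemma, max_len)
    return {
        lemma: max(senses, key=lambda s: degree(sub_edges, s))
        for lemma, senses in senses_by_lemma.items()
    }


RESULT = disambiguate(LKB_EDGES, SENSES_BY_LEMMA)
```

Note that the subgraph is built once from all candidate senses and never changes afterwards, which is the property the proposed method revisits.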

Page 8: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Proposed method
• In conventional methods, each step is independent
• The subgraph, once constructed, is never modified

Page 9: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Proposed method
• The proposed method repeatedly reconstructs the subgraph using the words that have already been disambiguated

Page 10: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

How Sudoku is solved

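Page 11: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

How Sudoku is solved
• Some digits are given as hints at the start
• Solve from the cells with the most information (= the least ambiguity)
• Each solved (filled-in) cell becomes a hint for the next step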
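The three points above can be made concrete with a very small solver. This is only an illustrative sketch on a 4x4 grid with 2x2 boxes (an assumption chosen to keep the example short; `PUZZLE` and the function names are hypothetical). It always fills the empty cell with the fewest remaining candidates, and it gives up rather than search if that cell is still ambiguous.

```python
def candidates(grid, r, c):
    """Digits still allowed at (r, c) in a 4x4 grid with 2x2 boxes."""
    used = set(grid[r]) | {grid[i][c] for i in range(4)}
    br, bc = 2 * (r // 2), 2 * (c // 2)
    used |= {grid[i][j] for i in range(br, br + 2) for j in range(bc, bc + 2)}
    return {1, 2, 3, 4} - used


def solve_least_ambiguous_first(grid):
    """Repeatedly fill the empty cell with the fewest candidates.

    Returns the (possibly partial) grid and the order in which cells
    were filled. Stops when the least ambiguous cell is not unique-valued.
    """
    grid = [row[:] for row in grid]
    order = []
    while True:
        empties = [(r, c) for r in range(4) for c in range(4) if grid[r][c] == 0]
        if not empties:
            break
        r, c = min(empties, key=lambda rc: len(candidates(grid, *rc)))
        options = candidates(grid, r, c)
        if len(options) != 1:
            break  # would require search, which this sketch leaves out
        grid[r][c] = options.pop()  # the filled cell now acts as a new hint
        order.append((r, c))
    return grid, order


# 0 marks an empty cell; the other digits are the initial hints.
PUZZLE = [
    [1, 0, 0, 0],
    [0, 0, 1, 2],
    [0, 1, 0, 3],
    [4, 0, 2, 0],
]

SOLVED, ORDER = solve_least_ambiguous_first(PUZZLE)
```

In this puzzle the cell at row 0, column 1 starts with three candidates and becomes determined only after other cells have been filled, which is the effect the analogy relies on.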

Page 12: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Application to subgraph-based WSD

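Page 13: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Application to subgraph-based WSD
• Construct the initial subgraph from monosemous words
• Disambiguate words in order, starting from those with the fewest senses (the least ambiguous)
• Add the information from the already-disambiguated words and reconstruct the subgraph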
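The loop described by these three bullets might look as follows. This is a minimal sketch, not the authors' implementation: it assumes the subgraph at each round is simply the part of the LKB induced by the already-chosen senses plus the candidate senses of the lemmas being disambiguated, and it scores senses by their degree in that subgraph. The edge list and the names `m`, `a`, `b`, `c` are a made-up example in the spirit of the figures.

```python
# Hypothetical toy LKB edges between senses.
EDGES = [
    ("m1", "a2"), ("m1", "b1"), ("a2", "b1"),
    ("a1", "b2"),
    ("c2", "a2"), ("c2", "b1"),
    ("c1", "a1"), ("c3", "b2"),
]

# Hypothetical sense inventory: m is monosemous, a and b have two senses,
# c has three.
SENSES = {
    "m": ["m1"],
    "a": ["a1", "a2"],
    "b": ["b1", "b2"],
    "c": ["c1", "c2", "c3"],
}


def induced_degree(edges, nodes, node):
    """Degree of `node` in the subgraph induced by `nodes`."""
    return sum(
        1 for u, v in edges
        if node in (u, v) and u in nodes and v in nodes
    )


def iterative_wsd(edges, senses_by_lemma):
    """Disambiguate lemmas in order of increasing number of senses (rho)."""
    rho_max = max(len(s) for s in senses_by_lemma.values())
    # Round 0: monosemous lemmas are the initial "hints".
    chosen = {l: s[0] for l, s in senses_by_lemma.items() if len(s) == 1}
    for rho in range(2, rho_max + 1):
        current = {l: s for l, s in senses_by_lemma.items() if len(s) == rho}
        if not current:
            continue
        # Rebuild the subgraph: resolved senses plus this round's candidates.
        nodes = set(chosen.values())
        for senses in current.values():
            nodes.update(senses)
        # Score every lemma of this round on the same rebuilt subgraph.
        for lemma, senses in current.items():
            chosen[lemma] = max(
                senses, key=lambda s: induced_degree(edges, nodes, s)
            )
    return chosen


CHOSEN = iterative_wsd(EDGES, SENSES)
```

Because rejected senses such as `a1` and `b2` are absent from the subgraph when `c` is scored, the edges `c1`-`a1` and `c3`-`b2` no longer contribute, and only `c2` keeps its connections.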

Page 14: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Application to subgraph-based WSD

Page 15: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Application to subgraph-based WSD

$m$: monosemous words
$a$, $b$: words with two senses

In the example in the figure, senses $a_2$ and $b_1$ are selected

Page 16: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Application to subgraph-based WSD

$c$: a word with three senses

In the example in the figure, sense $c_2$ is selected

$a_2$ and $b_1$ are incorporated into the subgraph

Page 17: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Application to subgraph-based WSD

Page 18: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Proposed method

$\rho$: number of senses
$\rho_{\max}$: maximum number of senses

Page 19: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Experiment 1 | Setup
Verify the effectiveness of the proposed method (iterative approach)
• LKB: BabelNet
• Data set: SemEval 2013 Task 12 Multilingual WSD (English) data set
• Subgraph construction methods
  • subtree paths
  • shortest paths

Page 20: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Graph importance measures
• In-Degree
• Out-Degree
• Betweenness Centrality
• Sum Inverse Path Length
• PageRank
• HITS (Kleinberg)
• Personalized PageRank
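To make a few of these measures concrete, the sketch below computes In-Degree, Out-Degree and PageRank on a small directed graph. The graph, the node names, the damping factor of 0.85 and the fixed 50 power iterations are all assumptions chosen for illustration, and the PageRank here is a plain textbook power iteration rather than whatever implementation the paper used.

```python
# Hypothetical directed subgraph over four senses.
NODES = ["s1", "s2", "s3", "s4"]
DIRECTED_EDGES = [("s1", "s2"), ("s3", "s2"), ("s2", "s4"), ("s4", "s2")]


def in_degree(edges, node):
    """Number of edges pointing at the node."""
    return sum(1 for _, v in edges if v == node)


def out_degree(edges, node):
    """Number of edges leaving the node."""
    return sum(1 for u, _ in edges if u == node)


def pagerank(edges, nodes, damping=0.85, iterations=50):
    """PageRank by power iteration with a uniform teleport distribution."""
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    successors = {node: [v for u, v in edges if u == node] for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            # A node without outgoing edges spreads its rank over all nodes.
            targets = successors[node] or nodes
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


RANK = pagerank(DIRECTED_EDGES, NODES)
BEST = max(RANK, key=RANK.get)
```

Personalized PageRank differs only in the teleport step: instead of the uniform `(1 - damping) / n` term, the teleport mass is concentrated on a chosen set of nodes.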

Page 21: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Document-level results

Page 22: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Sentence-level results

Page 23: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Example of improvement
Spanish [1]football players playing in the All-Star [4]League and in powerful [12]clubs of the [2]Premier League of [9]England are during the [5]year very active in [4]league and local [8]cup [7]competitions and there are high-level [25]shocks in the [10]European Cups and [2]European Champions League.

* The number in [ ] is the number of senses

Page 24: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Example of improvement
Spanish [1]football players playing in the All-Star [4]League and in powerful [12]clubs of the [2]Premier League of [9]England are during the [5]year very active in [4]league and local [8]cup [7]competitions and there are high-level [25]shocks in the [10]European Cups and [2]European Champions League.

* The number in [ ] is the number of senses

Page 25: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Experiment 2
After optimization, compare with other methods
• Comparison systems: taken from SemEval 2013 Task 12

Page 26: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Experiment 2 | Results

MFS: Most Frequent Sense

+ : back-off strategy

Page 27: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Experiment 2 | Results

PR: PageRank
PPR: Personalized PR

Page 28: Literature review: An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

Summary
• An improvement to subgraph-based WSD

• Subgraph construction and disambiguation are performed alternately (iterative approach)

• Applying the iterative approach to an average WSD system yields performance comparable to the state of the art
