NAACL-HLT 2013 1
Finding What Matters in Questions
Xiaoqiang Luo, Hema Raghavan, Vittorio Castelli, Sameer Maskey and Radu Florian
IBM T.J. Watson Research Center
Introduction
- E.g.: “How does one apply for a New York day care license?”
- Top-scoring result under a bag-of-words model:
  - “New licenses for day care centers in York county, PA”
- MMP model:
  - Search with the three phrases “New York,” “day care,” and “license.”
- We call these important phrases mandatory matching phrases (MMPs).
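The contrast between bag-of-words matching and phrase matching in the introductory example can be sketched in a few lines. This is only an illustration, not the authors' system: the naive tokenizer, the overlap count, and the helper names `tokens`, `bow_overlap` and `contains_phrase` are all assumptions made for this example.

```python
import re

def tokens(text):
    """Lower-case word tokens (a deliberately naive tokenizer, assumed here)."""
    return re.findall(r"[a-z]+", text.lower())

def bow_overlap(question, snippet):
    """Number of distinct question words that also occur in the snippet."""
    return len(set(tokens(question)) & set(tokens(snippet)))

def contains_phrase(phrase, snippet):
    """True if the phrase's tokens occur contiguously and in order in the snippet."""
    p, s = tokens(phrase), tokens(snippet)
    return any(s[i:i + len(p)] == p for i in range(len(s) - len(p) + 1))

question = "How does one apply for a New York day care license?"
snippet = "New licenses for day care centers in York county, PA"
mmps = ["New York", "day care", "license"]

overlap = bow_overlap(question, snippet)                    # shared words, order ignored
matched = [m for m in mmps if contains_phrase(m, snippet)]  # phrases found intact
```

The snippet shares several words with the question even though it is about York county, PA, while the phrase test rejects “New York” because the two words are not adjacent.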
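Question Corpus
- A subset of the DARPA BOLT corpus containing forum postings in English.
- Four people selected the questions.
- The following five kinds of question were excluded:
  - questions whose answers require reasoning or calculation
  - questions that are unclearly stated or ambiguous
  - questions that can be split into several questions
  - multiple-choice questions
  - factoid questions
- Two annotators marked the MMP type (MMP-Must, MMP-Maybe) and the span of each MMP in the selected questions.
- E.g.: [annotated example question; figure not preserved]
  - Annotated spans are contiguous and non-overlapping.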
Generate MMP Training Instances
[Figure: example question with word positions 0-9 and node depths 0-6]
- Output instances: <span, MMP type>
  - E.g.: hedge funds = <(5, 6), +1>
- MMP type:
  - MMP-Must: +1
  - MMP-Skip: -1
  - MMP-Maybe: -1
- Instances shown in the figure: <(4, 4), +1>, <(4, 6), +1>, <(5, 6), +1>, <(7, 9), +1>, <(9, 9), +1>
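The mapping from annotated spans to `<span, MMP type>` instances can be sketched as follows. The label values come from the slide; the function name `make_instances`, the dictionary-based annotation format, and the choice to treat unannotated candidate spans as MMP-Skip are assumptions made for illustration.

```python
# Label values as listed on the slide.
LABELS = {"MMP-Must": +1, "MMP-Skip": -1, "MMP-Maybe": -1}

def make_instances(candidate_spans, annotations):
    """Turn candidate (start, end) spans into (span, label) training instances.

    `annotations` maps a span to its annotated MMP type. Spans with no
    annotation are treated as MMP-Skip (an assumption of this sketch).
    """
    instances = []
    for span in candidate_spans:
        mmp_type = annotations.get(span, "MMP-Skip")
        instances.append((span, LABELS[mmp_type]))
    return instances

# "hedge funds" occupies positions 5-6 and is annotated MMP-Must.
instances = make_instances([(5, 6), (0, 0)], {(5, 6): "MMP-Must"})
```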
MMP Features
Lexical features:
- CaseFeatures:
  - Is the first word of an MMP upper-case?
  - Is it all capital letters?
  - Does it contain digits?
  - E.g.: for “(NP American)” in Figure 1, the upper-case feature fires.
- CommonQWord:
  - Does the MMP contain question words, including “What,” “When,” “Who,” etc.?
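The lexical features above are simple string tests, and a minimal sketch might look like this. The feature names in the returned dictionary, the reading of "upper-case" as "starts with a capital letter", and the contents of `QUESTION_WORDS` beyond the three words named on the slide are assumptions.

```python
# "What", "When", "Who" are named on the slide; the rest of the set is assumed.
QUESTION_WORDS = {"what", "when", "who", "where", "why", "which", "how"}

def case_features(phrase):
    """Case-related tests on a non-empty phrase."""
    words = phrase.split()
    return {
        "first_word_upper": words[0][0].isupper(),        # first word capitalized
        "all_caps": phrase.isupper(),                     # every cased letter is upper-case
        "has_digit": any(ch.isdigit() for ch in phrase),  # contains a digit
    }

def common_q_word(phrase):
    """True if any word of the phrase is a common question word."""
    return any(w.lower().strip("?,.") in QUESTION_WORDS for w in phrase.split())

american = case_features("American")
```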
MMP Features
Syntactic features:
- PhraseLabel:
  - This feature returns the phrasal label of the MMP.
  - E.g.: for “(NP American)” in Figure 1, the feature value is “NP.”
- NPUnique:
  - This Boolean feature fires if a phrase is the only NP in a question.
  - E.g.: for “(NP American),” the feature value would be false.
MMP Features
Syntactic features:
- PosOfPTN:
  - (1) the position of the left-most word of the node;
  - (2) whether the left-most word is the beginning of the question;
  - (3) the depth of the anchoring node, defined as the length of the path to the root node.
- E.g.: for “(NP American)” in Figure 1:
  - it is the 5th word in the sentence;
  - it is not the first word of the sentence;
  - the depth of the node is 6.
[Figure 1: example question with word positions 1-10 and node depths 0-6]
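The three PosOfPTN values can be computed with a single traversal of a parse tree. The sketch below assumes a toy tree representation (nested tuples of the form `(label, child, ...)` with plain strings as words) and a made-up example tree; it is not the tree from Figure 1, and the function name `pos_of_ptn` is hypothetical.

```python
def pos_of_ptn(tree):
    """Return (label, leftmost_position, starts_question, depth) for every phrase node.

    `tree` is a nested tuple (label, child, child, ...); a leaf is a plain string.
    Positions are 1-based; depth is the number of edges from the node to the root.
    """
    results = []

    def visit(node, next_pos, depth):
        if isinstance(node, str):      # a word consumes one position
            return next_pos + 1
        label, children = node[0], node[1:]
        start = next_pos               # position of the node's left-most word
        for child in children:
            next_pos = visit(child, next_pos, depth + 1)
        results.append((label, start, start == 1, depth))
        return next_pos

    visit(tree, 1, 0)
    return results

# Hypothetical toy tree, for illustration only.
tree = ("S", ("NP", "Who"), ("VP", "leads", ("NP", "American", "banks")))
feats = pos_of_ptn(tree)
```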
MMP Features
Syntactic features:
- PhrLenToQLenRatio:
  - This feature computes the number of words in an MMP and its ratio to the sentence length.
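As a small worked example of this feature, the sketch below assumes spans are inclusive `(start, end)` word positions, as in the `<(5, 6), +1>` instance shown earlier; the function name is hypothetical.

```python
def phr_len_to_q_len_ratio(span, question_length):
    """Return (phrase length in words, phrase length / question length)."""
    start, end = span              # inclusive word positions
    length = end - start + 1
    return length, length / question_length

# A two-word phrase at positions 5-6 in a ten-word question.
length, ratio = phr_len_to_q_len_ratio((5, 6), 10)
```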
MMP Features
Semantic features (NETypes):
- The feature tests whether a phrase is or contains a named entity; if so, the value is the entity type.
  - Uses an information extraction (IE) pipeline consisting of syntactic parsing, mention detection and coreference resolution (Florian et al., 2004; Luo et al., 2004; Luo and Zitouni, 2005).
- E.g.: for “(NP American)” in Figure 1, the feature value would be “GPE.”
MMP Features
Corpus-based features (AvgCorpusIDF):
- This group of features computes the average of the IDFs of the words in the phrase (stop words included).
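A minimal sketch of an average-IDF feature follows. The slides do not give the IDF formula, so the common form log(N / df) is assumed here, as are the function names, the tiny example corpus, and the default value for unseen words.

```python
import math

def idf_table(documents):
    """IDF per word over tokenized documents, using log(N / df) (an assumed formula)."""
    n = len(documents)
    df = {}
    for doc in documents:
        for w in set(doc):         # count each word once per document
            df[w] = df.get(w, 0) + 1
    return {w: math.log(n / c) for w, c in df.items()}

def avg_corpus_idf(phrase_tokens, idf, default=0.0):
    """Average IDF of the words in a non-empty phrase; unseen words get `default`."""
    return sum(idf.get(w, default) for w in phrase_tokens) / len(phrase_tokens)

docs = [["new", "york"], ["new", "jersey"], ["day", "care"], ["new", "day"]]
idf = idf_table(docs)
score = avg_corpus_idf(["new", "york"], idf)
```

Frequent words such as "new" pull the average down, so a phrase made of rare words scores higher.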
MMP Classification Results
- Classifier: logistic regression binary classifier using WEKA.
- Data set:

  Set            Questions
  training set   174
  test set       27
Performance of the MMP Classifier
Example Questions by MMP Model
Data for Relevance Model
- From the BOLT-IR task (IR, 2012).
- Top snippets returned by the search engine are judged for relevance by our annotators.

  Set            Questions   Snippets   Answers
  training set   390         28,915     6,528
  test set       59
Relevance Prediction
- The relevance model is a conditional distribution P(r | q, s; D), where
  - r is a binary random variable indicating whether the candidate snippet s is relevant to the question q;
  - D is the document where the snippet s is found.
Relevance Prediction
Baseline system:
- (1) Text match features:
  - cosine scores between the query and the snippet.
- (2) Answer type features:
  - The top 3 predictions of a statistical classifier trained to predict answer categories were used as features.
- (3) Mention match features:
  - whether a named entity in the query occurs in the snippet.
- (4) Event match features:
  - These use several hand-crafted dictionaries containing terms exclusive to various types of events, such as “violence,” “legal,” and “election.”
  - If both the query and the snippet contain the same event type, the feature takes the value 1.
- (5) Snippet statistics:
  - features such as the snippet length and the position of the snippet in the post.
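The cosine text-match feature from item (1) of the baseline can be sketched as below. The slides do not say how terms are weighted, so raw term counts are assumed; the function name `cosine` and the token lists are illustrative.

```python
import math
from collections import Counter

def cosine(query_tokens, snippet_tokens):
    """Cosine similarity between raw term-count vectors (weighting is an assumption)."""
    q, s = Counter(query_tokens), Counter(snippet_tokens)
    dot = sum(q[w] * s[w] for w in q)   # Counter returns 0 for missing words
    norm = math.sqrt(sum(c * c for c in q.values())) * math.sqrt(sum(c * c for c in s.values()))
    return dot / norm if norm else 0.0

score = cosine(["new", "york", "license"], ["new", "licenses", "york", "county"])
```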
Relevance Prediction
Features derived from MMPs:
- HardMatch:
  - Let I(m ∈ s) be a 0/1 function indicating whether the snippet s contains the MMP m.
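The indicator I(m ∈ s) can be sketched as a contiguous token match. The slides do not define what "contains" means, so exact, in-order, adjacent token matching on pre-tokenized, lower-cased input is an assumption, as is the function name `contains_mmp`. How the indicators are combined into the HardMatch score is not shown here.

```python
def contains_mmp(mmp_tokens, snippet_tokens):
    """Return 1 if the (non-empty) MMP occurs as a contiguous token run in the snippet, else 0."""
    n = len(mmp_tokens)
    found = any(
        snippet_tokens[i:i + n] == mmp_tokens
        for i in range(len(snippet_tokens) - n + 1)
    )
    return int(found)

snippet = "new licenses for day care centers in york county".split()
indicators = {
    "day care": contains_mmp(["day", "care"], snippet),
    "new york": contains_mmp(["new", "york"], snippet),
}
```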
Relevance Prediction
Features derived from MMPs:
- SoftLMMatch:
  - The SoftLMMatch score is a language-model (LM) based score, similar to that used in (Bendersky and Croft, 2008), except that MMPs play the role of concepts.
  - In the score's definition:
    - w_i is the ith word in snippet s;
    - I(w_i = v) is an indicator function, taking value 1 if w_i is v and 0 otherwise;
    - |V| is the vocabulary size.
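The symbols defined for SoftLMMatch (word counts via I(w_i = v) and the vocabulary size |V|) are the usual ingredients of a smoothed unigram language model of the snippet. The sketch below assumes additive smoothing with a hypothetical parameter `alpha`; the exact formula is not preserved in this transcript and may differ, so treat this only as an illustration of the general idea.

```python
from collections import Counter

def smoothed_unigram(snippet_tokens, vocab_size, alpha=1.0):
    """Return P(v | s) under additive smoothing (an assumed form, not the slide's formula)."""
    counts = Counter(snippet_tokens)   # counts[v] = sum over i of I(w_i = v)
    n = len(snippet_tokens)

    def prob(v):
        return (counts[v] + alpha) / (n + alpha * vocab_size)

    return prob

p = smoothed_unigram(["day", "care", "day"], vocab_size=5)
```

Words absent from the snippet still receive a small non-zero probability, which is what makes the match "soft".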
Relevance Prediction
Features derived from MMPs:
- MMPInclScore:
  - w ∈ m are the words in m;
  - I(·) is the indicator function, taking value 1 when the argument is true and 0 otherwise;
  - the threshold is a constant;
  - l(w, s) is the similarity of word w to the snippet s:
    - l(w, s) = max_{v ∈ s} JW(w, v)
    - JW(w, v) is the Jaro-Winkler similarity score between words w and v.
  - The MMP-weighted inclusion score is computed between the question q and the snippet s.
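The word-to-snippet similarity l(w, s) and a thresholded inclusion test can be sketched as follows. Two things here are assumptions: the standard library's `difflib.SequenceMatcher.ratio()` is used as a stand-in similarity (it is not Jaro-Winkler), and the inclusion score is normalized by the number of words in the MMP. The threshold value 0.8 and all function names are hypothetical.

```python
from difflib import SequenceMatcher

def word_sim(w, v):
    """Stand-in for JW(w, v): difflib's similarity ratio in [0, 1], NOT Jaro-Winkler."""
    return SequenceMatcher(None, w, v).ratio()

def word_to_snippet_sim(w, snippet_tokens, sim=word_sim):
    """l(w, s): the best similarity between w and any snippet word."""
    return max((sim(w, v) for v in snippet_tokens), default=0.0)

def inclusion(mmp_tokens, snippet_tokens, threshold=0.8, sim=word_sim):
    """Fraction of MMP words whose similarity to the snippet exceeds the threshold."""
    hits = sum(1 for w in mmp_tokens
               if word_to_snippet_sim(w, snippet_tokens, sim) > threshold)
    return hits / len(mmp_tokens)

near = inclusion(["license"], ["new", "licenses"])    # "license" vs "licenses"
far = inclusion(["new", "york"], ["day", "care"])     # no similar words
```

Passing a real Jaro-Winkler implementation as `sim` would bring the sketch closer to the definition on the slide.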
Relevance Prediction
Features derived from MMPs:
- MMPRankDep:
  - This feature, RD(q, s), first tests whether there exists a matched bilexical dependency between q and s.
  - Let m(i) be the ith-ranked MMP.
  - Let <wh, wd | q> and <uh, ud | s> be bilexical dependencies from q and s, respectively:
    - wh and uh are the heads;
    - wd and ud are the dependents.
  - EQ(w, u) is true if w and u are exactly the same, or their morphs are the same, or they head the same entity, or their synsets in WordNet overlap.
  - RD(q, s) is true if and only if
    - EQ(wh, uh) ∧ EQ(wd, ud) ∧ wh ∈ m(i) ∧ wd ∈ m(j) is true for some <wh, wd | q>, for some <uh, ud | s>, and for some i and j.
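The RD(q, s) condition translates directly into nested existence tests. In the sketch below, EQ is simplified to a case-insensitive exact match (the morphological, entity and WordNet tests are omitted), dependencies are plain `(head, dependent)` pairs, and the function names and example dependencies are hypothetical.

```python
def eq(w, u):
    """Simplified EQ(w, u): exact match ignoring case; other tests from the slide are omitted."""
    return w.lower() == u.lower()

def rank_dep(q_deps, s_deps, ranked_mmps):
    """RD(q, s): some question dependency with both words inside MMPs matches a snippet dependency.

    q_deps, s_deps: lists of (head, dependent) word pairs.
    ranked_mmps: list of MMPs, each a list of words.
    """
    def in_some_mmp(w):
        return any(eq(w, t) for m in ranked_mmps for t in m)

    for wh, wd in q_deps:
        if not (in_some_mmp(wh) and in_some_mmp(wd)):
            continue                   # head and dependent must both belong to MMPs
        if any(eq(wh, uh) and eq(wd, ud) for uh, ud in s_deps):
            return True
    return False

mmps = [["New", "York"], ["day", "care"], ["license"]]
q_deps = [("license", "care"), ("apply", "license")]
matched = rank_dep(q_deps, [("license", "care")], mmps)
unmatched = rank_dep(q_deps, [("licenses", "new")], mmps)
```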
Relevance Prediction
Three snippet classifier models:
- noMMP model:
  - a system without MMP features.
- IDF-as-MMP model:
  - a baseline with each word as an MMP and the word’s IDF as the MMP score.
- MMP model
Relevance Prediction
Performance of the three snippet classifier systems
End-to-End System Results
- The question-answering system was used in the 2012 BOLT IR evaluation (IR, 2012).
- There are 499K (Arabic), 449K (Chinese) and 262K (English) threads in the respective languages.
- The Arabic and Chinese posts were first translated into English before being processed.
- Performance
BOLT Evaluation Results
- The BOLT evaluation consists of 146 questions, mostly event- or topic-related.
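Conclusions
- The authors present a QA system that uses mandatory matching phrases (MMPs).
- MMPs are extracted from questions with an F-measure as high as 88.6%.
- Combining the MMP model with the snippet relevance model effectively improves the performance of the snippet relevance model.
- The QA system using MMPs was the best-performing system in the 2012 BOLT IR evaluation.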