Booming Up the Long Tails: Discovering Potentially Contributive Users in Community-Based Question Answering Services

Jae-Gil Lee, Department of Knowledge Service Engineering, KAIST


Page 1

Booming Up the Long Tails: Discovering Potentially Contributive Users in Community-Based Question Answering Services

Jae-Gil Lee

Department of Knowledge Service Engineering, KAIST

Page 2

Contents

Background and Motivation

Overview of the Methodology

Detailed Methodology

Experimental Evaluation

Conclusions

This paper received the Best Paper Award at AAAI ICWSM-13

Page 3

Community-Based Question Answering (CQA) Services

Current problems in CQA services: too many questions, and it is hard to find questions to answer. Solutions: expert finding and question routing [Zhou et al. 2009]

Search engines are weak at recently updated information, personalized information, and advice & opinions [Budalakoti et al. 2010]

[Diagram: the ask/answer flow through CQA services, citing figures of 160,000 questions per day and 50,000 questions per day]

Page 4

Question Routing

Three families of approaches (plus hybrid methods):
• Graph-based: HITS, PageRank; find influential answerers
• Content-based: language modeling; match questions & answerers
• Profile-based: user profiles; find experts based on their profiles

Two important factors in question routing:
• Expertise: answerers need proper knowledge of the question area
• Availability: answerers need time to answer
[Horowitz et al. 2010, Li et al. 2010, Zhang et al. 2007]

There is a trade-off between expertise and availability

Page 5

Short Tail vs. Long Tail

Most contributions (i.e., answers) in CQA services are made by a small number of heavy users. Many questions won't be answered if such heavy users become unavailable.

A system is not robust if it heavily relies on a small number of users

Page 6

On the other hand, recently-joined users are prone to leave CQA services. Example: the appearances of the 9,874 answerers who wrote answers in the Computers category of KiN.

Only 8.4% of answerers remained after a year.

Page 7

Comparison with Traditional Question Routing

Motivating such recently-joined users to become heavy users, by routing proper questions to them so that they can easily contribute, is of prime importance for the success of the services.

[Diagram: the user groups covered by existing methodologies vs. by our methodology]

Which users should we take care of? Recently-joined expert users!

Page 8

Problem Setting

Developing a methodology for measuring the likelihood of a light user becoming a contributive (i.e., heavy) user in the future in CQA services

Input: (i) the statistics of each heavy user, (ii) the answers written by heavy users, and (iii) the answers written by light users

Output: the likelihood of each light user becoming a heavy user in the future, which we call Answer Affordance
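To make this input/output contract concrete, here is a minimal Python interface sketch; the type names (HeavyUserStats, Answer) and the function name answer_affordance are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass

@dataclass
class HeavyUserStats:
    selection_count: int       # answers selected as the best answer
    selection_ratio: float     # selection count / total number of answers
    recommendation_count: int  # recommendations received from readers

@dataclass
class Answer:
    user_id: str
    category: str   # top-level category, e.g., Computers or Travel
    text: str

def answer_affordance(
    heavy_stats: dict[str, HeavyUserStats],  # (i) statistics of each heavy user
    heavy_answers: list[Answer],             # (ii) answers written by heavy users
    light_answers: list[Answer],             # (iii) answers written by light users
) -> dict[str, float]:
    """Return, for each light user, the likelihood (answer affordance)
    of becoming a heavy user in the future."""
    ...
```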

Page 9

Contents

Background and Motivation

Overview of the Methodology

Detailed Methodology

Experimental Evaluation

Conclusions

Page 10

Challenges

There is not enough information (i.e., answers) to judge the expertise of recently-joined users!

Kind of a cold-start problem

How can we cope with the lack of information?

Page 11

Intuition

A person’s active vocabulary reveals his/her knowledge

Vocabulary has sharable characteristics, so domain-specific words are repeatedly used by expert answerers

We use the active vocabulary of a user to infer his/her expertise, i.e., the vocabulary bridges the gap between heavy users and light users

Page 12

Vocabulary Level

Vocabulary knowledge: "Vocabulary knowledge should at least comprise two dimensions, which are vocabulary breadth (or size), and depth (or quality)" [Marjorie et al. 1996]

Three dimensions of lexical competence: "(a) partial to precise knowledge, (b) depth of knowledge, and (c) receptive to productive use ability" [Henriksen 1999]

Productive vocabulary ability: "It implies degrees of knowledge. A learner may be reluctant to use an infrequent word, using a simpler, more frequent word of a similar meaning. Such reluctance is often a result of uncertainty about the word's usage. Lack of confidence is a reflection of imperfect knowledge. We refer to the ability to use a word at one's free will as free productive ability" [Laufer et al. 1999]

Page 13

Domain Experts’ Vocabulary Usage

"Experts generated queries containing words from domain-specific lexicons fifty percent more often than non-experts. In addition to being able to generate more technically-sophisticated queries, experts also generated longer queries in terms of tokens and characters. It may be that because domain experts are more familiar with the domain vocabulary." [White et al. 2009]

"Behavior of software engineers is quite distinct from general web search behavior. They use longer and more detailed queries. They make heavy use of specialized terms and search syntax. … Controlled vocabulary look-up lists or query processing tools should be in place to deal with acronyms, product names, and other technical terms" [Freund et al. 2006]

"When searching, experts found slightly more relevant documents. Experts issued more queries per task and longer queries, and their vocabulary overlapped somewhat more with thesaurus entries" [Zhang et al. 2005]

Domain experts use specialized, but formatted/standardized words

Page 14

Domain Experts' Vocabulary Durability

"One important change in behavior was the use of a more specific vocabulary as students learned more about their research topic" [Vakkari et al. 2003]

"Experts' use of domain-specific vocabulary changes only slightly over the duration of the study. However, many non-expert users exhibit an increase in their usage of domain-specific vocabulary" [White et al. 2009]

A domain expert's unique word set remains for a long time without change

Page 15

Usage of the Vocabulary: Overview

[Diagram: heavy users are connected to light users through the words they share]

Page 16

Contents

Background and Motivation

Overview of the Methodology

Detailed Methodology

Experimental Evaluation

Conclusions

Page 17

Basics of CQA Services

Top-Level Categories (e.g., Computers, Travel)

Defining the expertise of a user on a top-level category in our methodology

[User profile screenshot] Selection Count = A; Selection Ratio = B = A/D; Recommendation Count = C

Page 18

Answer Affordance

Considering both expertise and availability:

Affordance(u_l) = EstimatedExpertise(u_l) × Availability(u_l)
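A minimal sketch of this combination in Python; the slide only states that both factors are considered, so the multiplicative form (matching the reconstruction above) and the assumption that both inputs are normalized scores are our reading.

```python
def affordance(estimated_expertise: float, availability: float) -> float:
    """Answer affordance of a light user u_l: combines the expertise
    estimated from vocabulary with the recency-weighted availability.
    The multiplicative combination is an assumption."""
    return estimated_expertise * availability
```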

Page 19

Estimated Expertise

[Diagram (Steps 1-4): heavy users U_H = {u_1, ..., u_n} each receive an expert score Expertise(u_i) in Step 1; in Step 2, every word w_i in their vocabulary receives WordLevel(w_i); in Step 3, these word levels are propagated to the words used by light users U_L = {u_(n+1), ..., u_(n+k)}; in Step 4, each light user receives EstimatedExpertise(u_(n+j))]

Page 20

Step 1: the expert score of a heavy user is calculated using the abundant historical data → Expertise(u_h)

The expertise of a user becomes higher (i) as the user's answers are more concentrated on the target category and (ii) as the user has a higher selection count, selection ratio, and recommendation count
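A minimal sketch of Step 1; the slide specifies only the monotonicity requirements (higher concentration and higher counts imply higher expertise), so the multiplicative form and the log damping below are assumptions.

```python
import math

def expertise(answers_in_category: int, total_answers: int,
              selection_count: int, selection_ratio: float,
              recommendation_count: int) -> float:
    """Expert score Expertise(u_h) of a heavy user."""
    # Factor (i): concentration of the user's answers on the target category
    concentration = answers_in_category / max(total_answers, 1)
    # Factor (ii): selection count, selection ratio, recommendation count
    # (log1p damps the raw counts; this exact form is an assumption)
    quality = (math.log1p(selection_count)
               * selection_ratio
               * math.log1p(recommendation_count))
    return concentration * quality
```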

Page 21

Step 2: the level of a word is determined by the expert scores of the heavy users who used the word before → WordLevel(w_i)

The level of a word becomes higher as the word is used by more expert users and more frequently

Decomposing an answer into words is reliable even for a small number of answers, because each answer typically has quite a few words
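A minimal sketch of Step 2, assuming the word level is an expertise-weighted usage count; the slide only states that the level rises as the word is used by more expert users and more frequently.

```python
from collections import defaultdict

def word_levels(heavy_answers, expertise_of):
    """Compute WordLevel(w_i) for every word used by heavy users.

    heavy_answers: iterable of (user_id, words) pairs, one per answer
    expertise_of:  mapping user_id -> Expertise(u_h) from Step 1
    """
    level = defaultdict(float)
    for user_id, words in heavy_answers:
        for w in words:
            # more expert users and more frequent use -> higher word level
            level[w] += expertise_of.get(user_id, 0.0)
    return dict(level)
```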

Page 22

Step 3: these word levels are propagated to the set of words used by a light user in his/her answers

This step is supported by the observation that the vocabulary of an expert stays mostly unchanged despite a temporal gap [White, Dumais, and Teevan 2009]

Page 23

Example: sample words in the Travel category with their values of WordLevel(w_i)

Page 24

Step 4: the expert score of a light user is calculated in reverse from his/her vocabulary → EstimatedExpertise(u_l)
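A minimal sketch of Step 4 (with Step 3's propagation implicit in the lookup), assuming the estimated expertise is the mean level of the known words a light user has used; the averaging is an assumption.

```python
def estimated_expertise(light_user_words, level):
    """EstimatedExpertise(u_l): infer a light user's expert score from the
    levels of the words in his/her answers (level = WordLevel from Step 2)."""
    known = [level[w] for w in set(light_user_words) if w in level]
    return sum(known) / len(known) if known else 0.0
```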

Page 25

Availability

Simply counting a user's answers, with each answer's importance proportional to its recency
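A minimal sketch of the availability score, assuming exponential recency weighting with a hypothetical 30-day half-life; the slide says only that each answer's importance is proportional to its recency.

```python
import math

def availability(answer_ages_in_days, half_life_days=30.0):
    """Availability(u): a recency-weighted count of the user's answers.
    The exponential decay and the half-life value are assumptions."""
    decay = math.log(2.0) / half_life_days
    return sum(math.exp(-decay * age) for age in answer_ages_in_days)
```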

Page 26

Contents

Background and Motivation

Overview of the Methodology

Detailed Methodology

Experimental Evaluation

Conclusions

Page 27

Data Set

Collected from Naver Knowledge-iN (KiN), http://kin.naver.com

Ranging from September 2002 to August 2012 (ten years)

Including two categories: Computers and Travel. Computers is dominated by factual information, Travel by subjective opinions.

Entropy is used for measuring the expertise of a user; it works especially well for categories where factual expertise is primarily sought after [Adamic et al. 2008]

Statistics

                Computers    Travel
# of answers    3,926,794    585,316
# of words      191,502      232,076
# of users      228,369      44,866

Page 28

Period Division

Dividing the 10-year period into three periods (the assignment is sketched below)

The resource period is sufficiently long to learn the expertise of users, and so is the test period; in contrast, the training period is not

Heavy users: those who joined during the resource period

Light users: those who joined during the training period (only one year)

Assuming that the end of the training period is the present
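A minimal sketch of the period assignment; the concrete boundary dates are not given on the slide, so they are left as parameters here.

```python
from datetime import date

def user_group(join_date: date, resource_end: date, training_end: date) -> str:
    """Assign a user to a group by join date: the (long) resource period
    yields heavy users, the one-year training period yields light users,
    and the rest of the ten-year span is the test period."""
    if join_date <= resource_end:
        return "heavy"
    if join_date <= training_end:
        return "light"
    return "test"
```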

Page 29

Accuracy of Expertise Prediction: Preliminary Tests

Extracting the main interest declared by each user in CQA services

Measuring the ratio of such self-declared experts on the target category among the top-k light users sorted by EstimatedExpertise()
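A minimal sketch of this preliminary measure (names illustrative): the fraction of self-declared experts among the top-k light users ranked by EstimatedExpertise().

```python
def self_declared_expert_ratio(ranked_light_users, self_declared_experts, k):
    """Ratio of self-declared experts on the target category among the
    top-k light users sorted by EstimatedExpertise()."""
    declared = set(self_declared_experts)
    return sum(1 for u in ranked_light_users[:k] if u in declared) / k
```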

[Figure: the ratio of users who expressed their interests; (a) Computers, (b) Travel]

Page 30

Accuracy of Expertise Prediction: Evaluation Method

Finding the top-k users by EstimatedExpertise() from the training period → our prediction

Finding the top-k users by KiN's ranking scheme from the test period → ground truth. KiN's ranking scheme is a weighted sum of the selection count and the selection ratio.

Measuring (i) P@k and (ii) R-precision (both metrics are sketched after this list)

Repeating the same procedure for comparison with the following approaches:
• Expertise(): ranking heavy users, rather than light users, with our methodology
• SelCount(): the selection count
• RecommCount(): the recommendation count
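A minimal sketch of the two metrics as they are standardly defined; the variable names are illustrative.

```python
def precision_at_k(predicted_ranking, ground_truth, k):
    """P@k: fraction of the top-k predicted users found in the ground truth."""
    relevant = set(ground_truth)
    return sum(1 for u in predicted_ranking[:k] if u in relevant) / k

def r_precision(predicted_ranking, ground_truth):
    """R-precision: P@R, where R is the size of the ground-truth set."""
    return precision_at_k(predicted_ranking, ground_truth, len(ground_truth))
```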

Page 31

Accuracy of Expertise Prediction: Results

[Figure: the precision performance for the Travel category]

[Figure: the precision performance for the Computers category]

Page 32

Accuracy of Answer Affordance: Evaluation Method

Finding the top-k users by Affordance() for light users → our methodology

Finding the top-k users managed by KiN → competitor

Measuring the user availability and the answer possession for the next one month (both ratios are sketched after these definitions):
• User availability: the ratio of the number of the top-k users who appeared on a day to the total number of users who appeared on that day
• Answer possession: the ratio of the number of answers posted by the top-k users on a day to the total number of answers posted on that day
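A minimal sketch of the two daily ratios defined above; the data shapes (sets of user ids, one author id per answer) are illustrative assumptions.

```python
def user_availability(top_k_users, users_active_on_day):
    """Ratio of the top-k users who appeared on a day to all users
    who appeared on that day."""
    active = set(users_active_on_day)
    return len(active & set(top_k_users)) / len(active) if active else 0.0

def answer_possession(answer_authors_on_day, top_k_users):
    """Ratio of the day's answers posted by the top-k users to all
    answers posted on that day."""
    top = set(top_k_users)
    total = len(answer_authors_on_day)
    return sum(1 for u in answer_authors_on_day if u in top) / total if total else 0.0
```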

Page 33

[Figure: the result of the answer possession; (a) Computers, (b) Travel]

[Figure: the result of the user availability; (a) Computers, (b) Travel]

Page 34

Contents

Background and Motivation

Overview of the Methodology

Detailed Methodology

Experimental Evaluation

Conclusions

Page 35

Conclusions

Developed a new methodology that can make CQA services more active and robust

Verified the effectiveness of our methodology using a real data set spanning ten years

Quote from the reviews: "I'm sold. If these results hold on another CQA site, this will be a very significant contribution to online communities. The study is well done, it's incredibly readable and clear, and the evaluation dataset is impeccable (10 years of data from one of the top 3 sites)."

Page 36

Thank You!