연결자 기반의 시맨틱 정보 검색 모델 (connectives based semantic information...


Upload: cliff

Post on 24-Jan-2016


Page 1: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

연결자 기반의 시맨틱 정보 검색 모델 (Connectives based semantic information retrieval model)

Page 2: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Contents

Motivation
– Limitations of keyword-based information retrieval

Related Work
– Overview of Semantic Search
– Overview of Recommendation

A Semantic Information Retrieval Model
– Unified Graph Model for Semantic Information Retrieval
– Modeling
  – Modeling of Connectives
  – Modeling of Relationships
– Probabilistic Approach to Ranking

Experiments

Conclusion

IDS Lab. - 2

Page 3: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Motivation

The number and variety of items available on the Web have grown explosively.

Two types of technologies are widely used to overcome information overload problems: search (a pull service driven by a user query) and recommendation (a push service driven by a user profile).

[Figure: an information retrieval system turns massive data into useful information through search (pull service, user query) and recommendation (push service, user profile)]

Page 4: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Motivation

Most information retrieval (IR) systems are based on keywords because of their simplicity and efficiency [Xu et al., 2008]

Documents and users’ needs (e.g., queries, profiles) are represented with keywords

– Keywords act as connectives that connect documents to users’ needs

Exact Matching of Keywords

Since keyword-based IR systems rely on exact matching of keywords between documents and users’ needs, they cannot return documents that are semantically relevant but do not contain the keywords

Example [Balmin et al., 2004]

pid  Paper text
p1   “Index selection for OLAP”, ICDE, H Gupta et al., 1997
p2   “Range Queries in OLAP Data Cubes”, SIGMOD, CT Ho et al., 1997
p3   “Implementing Data Cubes Efficiently”, SIGMOD, V Harinarayan et al., 1996
p4   “Data Cube: A Relational Aggregation Operator…”, MS technical report, J Gray et al., 1995

Query or Profile = “OLAP”: only p1 and p2 contain the keyword, so p3 and p4 are not returned
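The exact-matching behaviour in the example above can be sketched in a few lines of Python. This is only an illustration: the function name `keyword_search` and the whitespace tokenization are assumptions, not part of the slides.

```python
# Toy corpus: the paper titles from the slide's table.
papers = {
    "p1": "Index selection for OLAP",
    "p2": "Range Queries in OLAP Data Cubes",
    "p3": "Implementing Data Cubes Efficiently",
    "p4": "Data Cube: A Relational Aggregation Operator…",
}

def keyword_search(query, docs):
    """Return ids of documents that contain every query keyword verbatim."""
    keywords = query.lower().split()
    hits = []
    for pid, text in docs.items():
        tokens = text.lower().replace(":", " ").split()
        if all(k in tokens for k in keywords):
            hits.append(pid)
    return hits

result = keyword_search("OLAP", papers)
print(result)  # p3 and p4 are missed because they never mention "OLAP"
```

Under this literal matching, the two data-cube papers cannot be found with the query “OLAP”, which is the limitation the slide points at.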

Page 5: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Limitations of Keyword-based IR [problem]

Semantic Ambiguity of Keywords [Dragut et al., 2006]

Homonym (“apple” as fruit vs. “apple” as company)

Synonym (“movies” vs. “films”)

Example of Homonym

           Information Seeker       Web Documents (items)
term       query term: “apple”      index term: “apple”
concept    “fruit”                  “computer”

Page 6: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Limitations of Keyword-based IR [solution]

Semantic Ambiguity of Keywords

Query Expansion Approaches

– Co-occurrence [Wolfman et al., 1999]

– Ontology, Thesaurus [Vogel et al., 2005][Gong et al., 2005][Shen et al., 2006]

[Figure: two panels, “Query Expansion using Term Co-occurrence” and “Query Expansion using Concept Keywords”, distinguishing contents related to "Apple" computer (irrelevant to the user) from contents related to "Apple" fruit (relevant to the user)]

Page 7: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Limitations of Keyword-based IR [problem]

Sparse Annotation

In keyword-based IR systems, users’ needs and items are represented with a “bag of keywords”

– Vector-space model

Because items are sparsely annotated, it is hard to compute the degree of relevance exactly

– Some items may not be provided to users, although they are semantically relevant to the given needs

document-term matrix

      t1  t2  t3  t4  t5  t6  t7  t8
d1     0   1   1   0   1   0   0   0
d2     0   1   0   0   0   0   1   0
d3     1   0   0   0   0   0   0   0
d4     0   0   1   0   0   1   0   0
d5     0   0   0   1   0   0   0   0

q      1   0   0   0   0   0   0   1

cos(q, d1) = 0; d1 cannot be retrieved, although its semantic relevance is high
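As a quick check of the cos(q, d1) = 0 claim, the following sketch computes cosine similarity between the query vector and every row of the document-term matrix shown on the slide. The helper name `cosine` is an assumption; only the numbers come from the slide.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Document-term matrix and query vector from the slide (columns t1..t8).
doc_term = {
    "d1": [0, 1, 1, 0, 1, 0, 0, 0],
    "d2": [0, 1, 0, 0, 0, 0, 1, 0],
    "d3": [1, 0, 0, 0, 0, 0, 0, 0],
    "d4": [0, 0, 1, 0, 0, 1, 0, 0],
    "d5": [0, 0, 0, 1, 0, 0, 0, 0],
}
q = [1, 0, 0, 0, 0, 0, 0, 1]

scores = {d: cosine(q, v) for d, v in doc_term.items()}
retrieved = [d for d, s in scores.items() if s > 0]
print(scores, retrieved)
```

Only d3 shares a term with q, so every other document, including d1, scores zero regardless of how related its content is.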

Page 8: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Limitations of Keyword-based IR [solution]

Sparse Annotation

document-term matrix

      t1  t2  t3  t4  t5  t6  t7  t8
d1     0   1   1   0   1   0   0   0
d2     0   1   0   0   0   0   1   0
d3     1   0   0   0   0   0   0   0
d4     0   0   1   0   0   1   0   0
d5     0   0   0   1   0   0   0   0

q      1   0   0   0   0   0   0   1

cos(q, d1) = 0; d1 cannot be retrieved, although its semantic relevance is high

document-concept matrix

      c1  c2
d1     2   1
d2     1   1
d3     1   0
d4     1   1
d5     1   0

cos(q, d1) = 0.95
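The slide does not show the query vector in concept space. As a sketch under that gap, assuming the query maps to the concept vector (1, 1) (an assumption chosen for illustration, not stated in the source), the cosine with d1 = (2, 1) comes out at about 0.95, in line with the value on the slide.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Document-concept matrix from the slide (columns c1, c2).
doc_concept = {
    "d1": [2, 1],
    "d2": [1, 1],
    "d3": [1, 0],
    "d4": [1, 1],
    "d5": [1, 0],
}

# ASSUMPTION: the query's concept vector is not given on the slide.
q_concept = [1, 1]

score_d1 = cosine(q_concept, doc_concept["d1"])
print(round(score_d1, 2))
```

In the denser concept space every document has a non-zero score, so d1 becomes retrievable even though it shares no term with the query.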

Page 9: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Semantic Information Retrieval

Semantic Ambiguity of Keywords

– Although documents contain the keywords derived from users’ needs (queries, profiles), they may be irrelevant to those needs

Sparse Annotation

– Because documents are sparsely annotated, it is hard to compute the exact relevance between documents and users’ needs

Semantic Information Retrieval

– Uses conceptual matching (semantic relevance) instead of keyword matching (literal relevance) between documents and users’ needs

– Concepts (not keywords) are used as connectives

Page 10: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Overview of Semantic Search

Logic-based Approaches

Users’ needs (i.e., queries) are expressed in specific ontology languages (e.g., RDQL, OWL-QL)

Logically inferred search results are provided to users

– OWL-QL [Fikes et al., 2004]

– ONTOWEB [Kim, 2005]

[Figure: example of OWL-QL [Fikes et al., 2004]]

Page 11: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Overview of Semantic Search

Link-based Approaches (Graph Traversal Approaches)

Searching for semantically relevant documents through the hyperlinks between Web documents

– TAP [Guha et al., 2003]

– Hybrid Spread Activation [Roch et al., 2004]

– ObjectRank [Balmin et al., 2004]

[Figure: ObjectRank example showing the initial results of “OLAP” within a graph of linked papers: P1 “Index Selection for OLAP”, P2 “Range Queries in OLAP Data Cubes”, P3 “Implementing Data Cubes Efficiently”, P4 “Data Cube: A Relational…”, P5 “Modeling Multidimensional Databases”]

Page 12: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Overview of Semantic Search

Concept-based Approaches

Documents and users’ needs are represented with concepts derived from domain knowledge

– Some studies regard controlled vocabularies as concepts

[Figure: the keyword “music” linked to concepts (connectives): Sports Team: Seattle Sonics; Company: Starbucks; City: Seattle; Person: Howard Schultz; Music: When I Fall in Love]

Page 13: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Overview of Semantic Search

Concept-based Approaches

Result Processing

– Re-ranking documents according to each user’s conceptual profile

– Keyword matching based approach

  OBIWAN [Gauch et al., 2004]

  DySe [Rinaldi et al., 2009]

  OntoSearch [Jiang et al., 2009]

Query Expansion

– Converting queries and documents in a keyword space to those in a concept space

– Conceptual matching based approach

  Adaptive Vector Approach [Vallet et al., 2005][Castells et al., 2007]

  Folksonomy Approach [Wu et al., 2006][Xu et al., 2008]

Page 14: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Overview of Semantic Search

Adaptive Vector Approach [Castells et al., 2007]

[Figure: concept vectors of query and document, SPARQL query, results, and concepts of the knowledge base]

Page 15: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Overview of Semantic Search

Folksonomy Approach [Xu et al., 2008]

Regarding tags as concepts

– A user annotates a document with tags which represent his/her interests

– A document has tags which represent the semantics of the document

[Figure: tags annotated by a user and tags attached to a Web page]

Page 16: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Overview of Semantic Search

Comparison of Three Approaches

                     Logic-based                      Link-based                Concept-based
Goal                 Data Retrieval                   Web Documents Retrieval   Not limited (web pages, images, music etc.)
Query                Ontology Language                Keywords                  Keywords
                     (a barrier for ordinary users)
Ranking              X                                O                         O
Semantic Ambiguity   Low                              High                      Medium
Sparse Annotation    -                                High                      Medium
Connectives          -                                Web Documents             Concepts (tags, categories etc.)

Page 17: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Overview of Semantic Search

Issues of Previous Concept-based Semantic Searches

Sparse Annotation

– It is difficult to completely annotate the semantics of documents (or users’ queries) with a few connectives

  Lexical analysis relies on the exact matching of connectives (i.e., concepts and keywords)

  Semantically relevant documents may therefore not be provided

Example

– Concept vector of the user: <1, 1, 0>

– Concept vector of the document: <0, 0, 1>

– sim(u, d) = 0

[Figure: user u and the movies Romeo & Juliet (1968) and Romeo & Juliet (1996), linked through the concepts (connectives) Hollywood, Movie, and Romance]

Page 18: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Overview of Semantic Search

Issues of Previous Concept-based Semantic Searches

Authority

– It is hard to determine the ranks if some documents have the same degree of relevance

– In some search engines, such as Google, the authority of documents is used

  Documents that are frequently referenced by others have high authority

Example

[Figure: user u and the movies Romeo & Juliet (1968) and Romeo & Juliet (1996), linked through the concepts (connectives) Hollywood, Movie, and Romance; both documents have semantic relevance 0.5, while their authority values are 0.2 and 0.8]

Page 19: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Overview of Recommendation

Content-based Filtering

Recommending documents similar to those a given user has preferred in the past

– Similar to keyword search

Collaborative Filtering

Identifying like-minded users whose preferences are similar to those of the given user

Recommending documents that the like-minded users have preferred

Page 20: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Overview of Recommendation

Example of Collaborative Filtering

Identifying like-minded users whose preferences are similar to those of the given user

– The preference of user1 is similar to that of userm

Recommending documents that the like-minded users have preferred

users \ documents   d1   d2   d3   d4   d5   d6
user1                3    -    4    -    6    6
user2                -    2    -    5    -
userm                4    -    3    -    5    ?   (recommend)
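The rating table above can be turned into a small user-based collaborative filtering sketch. The similarity measure (cosine over co-rated documents) and the weighted-average prediction are assumptions chosen for illustration; the slide does not say which formulas are used. Cells shown as "-" or missing are simply left out.

```python
import math

# Ratings from the slide's table; unrated cells are omitted.
ratings = {
    "user1": {"d1": 3, "d3": 4, "d5": 6, "d6": 6},
    "user2": {"d2": 2, "d4": 5},
    "userm": {"d1": 4, "d3": 3, "d5": 5},
}

def user_similarity(a, b):
    """Cosine similarity over the documents both users have rated."""
    common = sorted(set(a) & set(b))
    if not common:
        return 0.0
    dot = sum(a[d] * b[d] for d in common)
    norm_a = math.sqrt(sum(a[d] ** 2 for d in common))
    norm_b = math.sqrt(sum(b[d] ** 2 for d in common))
    return dot / (norm_a * norm_b)

def predict(user, item, all_ratings):
    """Similarity-weighted average of other users' ratings for item."""
    num = den = 0.0
    for other, rated in all_ratings.items():
        if other == user or item not in rated:
            continue
        sim = user_similarity(all_ratings[user], rated)
        num += sim * rated[item]
        den += abs(sim)
    return num / den if den else None

sim_1m = user_similarity(ratings["user1"], ratings["userm"])
pred = predict("userm", "d6", ratings)
print(sim_1m, pred)
```

Because user1 is the only neighbour who rated d6, the prediction for userm equals user1's rating; user2 shares no rated document with userm and contributes nothing, which already hints at the sparsity problem discussed later.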

Page 21: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Overview of Recommendation

Limitations of Previous Recommendation Systems

Content-based Filtering

– Ambiguity of keywords

– Sparse Annotation

Collaborative Filtering

– Sparse Annotation

  Dimension Reduction [Billsus et al., 1998][Sarwar et al., 2000]: removing insignificant users or documents causes loss of information

  Hybrid Approaches of Content-based and Collaborative Filtering [Balabanovic et al., 1997][Pazzani et al., 1999]: keyword-based connectives

  Clustering of Users [Chee et al., 2002]: bad quality of recommendations

  Tag [Zanardi et al., 2008][Kim et al., 2010]: explicit feedback (users’ annoyance or hesitation)

Page 22: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Unified Graph Model for Semantic Information Retrieval

A Unified Graph for Semantic Information Retrieval

Objects are interrelated in the real world

We assume that four types of objects are interrelated

– Users, documents, terms, concepts (complete 4-partite graph)

– The graph can be expanded to an n-partite graph depending on the application (or domain)

[Figure: unified graph with four node types (users, documents, terms, concepts) and six relationships: users-documents (accessing), users-terms (submitting), users-concepts (preferring), documents-terms (containing), documents-concepts (relating), terms-concepts (containing). Inset: Document-Concept Relationship between d1, d2, d3 and c1, c2]
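One way to picture the unified graph is as a typed adjacency structure. The sketch below is an assumption-laden illustration: the class `UnifiedGraph`, the `(type, name)` node encoding, and the sample links are all hypothetical. It shows how keyword search can be read as a two-hop path from a user through terms to documents.

```python
from collections import defaultdict

class UnifiedGraph:
    """Undirected multi-partite graph; nodes are (type, name) pairs."""

    def __init__(self):
        self.adj = defaultdict(set)

    def link(self, a, b):
        """Add an undirected link between two typed nodes."""
        self.adj[a].add(b)
        self.adj[b].add(a)

    def neighbours(self, node, node_type):
        """Neighbours of node that have the requested type."""
        return {n for n in self.adj[node] if n[0] == node_type}

def keyword_search(graph, user):
    """Documents reachable from user through term connectives."""
    docs = set()
    for term in graph.neighbours(user, "term"):
        docs |= graph.neighbours(term, "document")
    return docs

# Hypothetical links, for illustration only.
g = UnifiedGraph()
g.link(("user", "u1"), ("term", "t1"))      # submitting
g.link(("document", "d3"), ("term", "t1"))  # containing
g.link(("document", "d1"), ("term", "t2"))  # containing
g.link(("concept", "c1"), ("term", "t1"))   # containing

found = keyword_search(g, ("user", "u1"))
print(found)
```

Swapping the intermediate node type (terms, concepts, documents, or other users) changes which kind of retrieval the same traversal expresses.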

Page 23: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Unified Graph Model for Semantic Information Retrieval

Derivatives in a Unified Graph

Keyword Search

– Documents containing keywords submitted by a user are regarded as search results

[Figure: unified graph with the keyword-search path highlighted: u1 (submitting) t1 (containing) d3, with terms acting as the connectives]

Page 24: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Unified Graph Model for Semantic Information Retrieval

Derivatives in a Unified Graph

Conventional Collaborative Filtering

– Identifying like-minded users whose preferences are similar to those of an active user

  The preferences can be derived from the click-through log (or rating log)

[Figure: unified graph with the conventional collaborative filtering path highlighted: users u1 and u2 and documents d1 and d3, joined by accessing links]

Page 25: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Unified Graph Model for Semantic Information Retrieval

Derivatives in a Unified Graph

Concept-based Semantic Search

– Representing a user’s query and documents with their corresponding concepts

– Documents containing concepts derived from a user’s query are regarded as search results

[Figure: unified graph with the concept-based semantic search path highlighted: nodes u1, t1, c1, d3 joined by submitting, containing, and relating links]

Page 26: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Unified Graph Model for Semantic Information Retrieval

Derivatives in a Unified Graph

Semantic Collaborative Filtering

– Identifying like-minded users by utilizing the concepts derived from users’ preferences

  Although users have accessed different documents, it is possible to compute the semantic relevance between them

[Figure: unified graph with the semantic collaborative filtering path highlighted: nodes u1, u2, c1, d3 joined by two preferring links and one accessing link]

Page 27: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Unified Graph Model for Semantic Information Retrieval

Semantic Information Retrieval in a Unified Graph

Ambiguity of Keywords

– Exploiting concept connectives

Sparse Annotation

– Exploiting lexical analysis and non-lexical analysis through heterogeneous connectives

Authority

– Exploiting collaborative filtering to derive implicit authority

  Documents that like-minded users preferred have high authority

[Figure: unified graph with four paths marked: keyword search, collaborative filtering, concept-based semantic search, semantic collaborative filtering]

Page 28: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Unified Graph Model for Semantic Information Retrieval

Analysis of Unified Graph

Sparse Relationship

Links between Users and Documents (accessing)

– Many users access a few documents

– A few users access many documents

Links between Documents and Terms (containing)

– Many documents contain a few terms

– A few documents contain many terms

Sparsity values shown: 0.999, 0.998

Page 29: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Unified Graph Model for Semantic Information Retrieval

Analysis of Unified Graph

Dense Relationship

Links between Concepts and Documents (relating)

– Panels: Concepts (ODP) and Documents; Concepts (Wikipedia) and Documents

– Many concepts are related to many documents

Sparsity values shown: 0.614, 0.575

Page 30: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Unified Graph Model for Semantic Information Retrieval

Analysis of Unified Graph

Sparse Relationship

Links between Terms and Concepts (containing)

– Panels: Terms and Concepts (ODP); Terms and Concepts (Wikipedia)

– Many terms are contained in a few concepts

– A few terms are contained in many concepts

Sparsity values shown: 0.999, 0.998

Page 31: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Unified Graph Model for Semantic Information Retrieval

Analysis of Unified Graph

Dense Relationship

Links between Users and Concepts (preferring)

– Panels: Users and Concepts (ODP); Users and Concepts (Wikipedia)

– Many users prefer many concepts

Sparsity values shown: 0.418, 0.371

Page 32: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Unified Graph Model for Semantic Information Retrieval

Types of Relationships

[Figure: unified graph (users, documents, terms, concepts; relationships accessing, containing, submitting, relating, preferring) with each relationship marked as a Dense Relationship or a Sparse Relationship]

Page 33: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Unified Graph Model for Semantic Information Retrieval

Research Questions

What kind of relationship exists between the performance of semantic IR and the density between objects (i.e., nodes in a unified graph)?

What combination of relationships (or connectives) can contribute to improving the performance of semantic IR?

– Whether both dense relationships and sparse relationships contribute to the improvement of performance

[Figure: plot of performance (e.g., precision) against density]

Page 34: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Unified Graph Model for Semantic Information Retrieval

Combination of Relationships

[Figure: grid of unified-graph variants (users, documents, terms, concepts). Rows: Recommendation (Conventional Collaborative Filtering (CCF), Semantic Collaborative Filtering (SCF), CCF + SCF) and Search (Keyword Search (KS), Semantic Search (SS), KS + SS). Columns: No Dense Relationship; 1 Dense Relationship (user-concept); 1 Dense Relationship (document-concept); 2 Dense Relationships (user-concept and document-concept)]

Page 35: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Unified Graph Model for Semantic Information Retrieval

Comparison of Research Coverage

[Figure: the same grid of unified-graph variants for Recommendation (CCF, SCF, CCF + SCF) and Search (KS, SS, KS + SS), with the Coverage of Previous Approaches and the Coverage of Our Approach marked]

Page 36: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Modeling for Connectives - Document

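Document

Each document is represented by a |V|-dimensional term vector

$$d_i = \langle w_{i,1}, \ldots, w_{i,k}, \ldots, w_{i,|V|} \rangle$$

where $w_{i,k}$ is the weight (tf-idf) of the kth term in $d_i$ and V is the set of index terms

To remove the effect of document length, the term vector is normalized

$$d_i = \left\langle \frac{w_{i,1}}{\sqrt{\sum_j (w_{i,j})^2}}, \ldots, \frac{w_{i,k}}{\sqrt{\sum_j (w_{i,j})^2}}, \ldots, \frac{w_{i,|V|}}{\sqrt{\sum_j (w_{i,j})^2}} \right\rangle$$

[Figure: document vector d1 in the term space t1, t2, t3]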
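The length normalization of a document's term vector can be sketched as follows. The weights are hypothetical placeholders for tf-idf values, and the helper name `l2_normalize` is an assumption.

```python
import math

def l2_normalize(weights):
    """Divide each weight by the vector's Euclidean length."""
    norm = math.sqrt(sum(w * w for w in weights))
    if norm == 0:
        return list(weights)
    return [w / norm for w in weights]

# Hypothetical tf-idf weights of one document over a 5-term vocabulary.
d1 = [0.0, 1.2, 0.5, 0.0, 2.0]
d1_unit = l2_normalize(d1)
length = math.sqrt(sum(w * w for w in d1_unit))
print(d1_unit, length)
```

After normalization a long and a short document with the same term proportions map to the same vector, which is the stated purpose of removing the length effect.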

Page 37: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Modeling for Connectives – User

User

Explicit Approach

– A user is represented by keywords that the user explicitly provides to IR systems

Implicit Approach

– By analyzing a user’s access log, it is possible to represent the user with keywords derived from his/her access log

  A user is defined as the average of the term vectors of the accessed documents

$$u_p = \frac{1}{|D^{u_p}|} \sum_{d_i^{u_p} \in D^{u_p}} d_i^{u_p}$$

where $D^{u_p} = \{ d_1^{u_p}, \ldots, d_i^{u_p}, \ldots, d_n^{u_p} \}$ and $d_i^{u_p} = \langle w_{i,1}, \ldots, w_{i,k}, \ldots, w_{i,|V|} \rangle$

– The derived term vector is normalized to remove the length effect

[Figure: a user accesses documents d1, d2, d3, which contain the terms t1 to t6]
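The implicit user model (average of the accessed documents' term vectors, then normalized) can be sketched like this. The document vectors are hypothetical and the function names are assumptions.

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (unchanged if all zeros)."""
    norm = math.sqrt(sum(w * w for w in vector))
    return [w / norm for w in vector] if norm else list(vector)

def user_profile(doc_vectors):
    """Average the term vectors of accessed documents, then normalize."""
    n = len(doc_vectors)
    mean = [sum(column) / n for column in zip(*doc_vectors)]
    return l2_normalize(mean)

# Hypothetical term vectors of two documents the user accessed.
accessed = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]
u = user_profile(accessed)
print(u)
```

The same averaging applies when modeling a concept from the objects that belong to it, since both are defined as a normalized mean of term vectors.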

Page 38: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Modeling for Connectives – Concept

Concept

Definition from the American Heritage Dictionary

– A general idea derived or inferred from specific instances or occurrences

A concept is defined as the average of the term vectors derived from the objects (or attributes) that belong to the concept

– If the objects are documents, the concept modeling is similar to the user modeling

– The derived term vector is also normalized to remove the length effect

$$c_x = \frac{1}{|O^{c_x}|} \sum_{o_i^{c_x} \in O^{c_x}} o_i^{c_x}$$

where $O^{c_x} = \{ o_1^{c_x}, \ldots, o_i^{c_x}, \ldots, o_m^{c_x} \}$ and $o_i^{c_x} = \langle w_{i,1}, \ldots, w_{i,k}, \ldots, w_{i,|V|} \rangle$

[Figure: objects or attributes that belong to a concept, each containing terms among t1 to t6]

Page 39: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Modeling for Relationships

Relationships

Explicit Relationship

– Relationships that explicitly exist between two types of objects

  Document-Term, User-Term (Explicit Approach), Concept-Term, User-Document (User Access Log)

Example in Document-Term Relationships

$$\Pr(t_k \mid d_i) = \begin{cases} \dfrac{w(d_i, t_k)}{\sum_{t_j \in d_i} w(d_i, t_j)} & \text{if } t_k \in d_i \\ 0 & \text{otherwise} \end{cases}$$

where $w(d_i, t_k)$ denotes the weight of the kth term in $d_i$
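The explicit document-term relationship is a weight divided by the sum of the document's term weights. A minimal sketch, with hypothetical weights and an assumed function name `term_given_doc`:

```python
def term_given_doc(term, doc_weights):
    """Pr(t_k | d_i): the term's share of the document's total weight, 0 if absent."""
    if term not in doc_weights:
        return 0.0
    return doc_weights[term] / sum(doc_weights.values())

# Hypothetical weights w(d1, t) for the terms that occur in d1.
d1_weights = {"t2": 2.0, "t3": 1.0, "t5": 1.0}

p_t2 = term_given_doc("t2", d1_weights)
p_t1 = term_given_doc("t1", d1_weights)
total = sum(term_given_doc(t, d1_weights) for t in d1_weights)
print(p_t2, p_t1, total)
```

The probabilities over the terms in a document sum to one, and any term outside the document gets probability zero, matching the two cases of the definition.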

Page 40: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Modeling for Relationships

Relationships

Implicit Relationship

– Relationships that are inferred (or derived) from explicit relationships

[Figure: unified graph highlighting the implicit relationships: User Modeling (Implicit Approach), Document-Concept Relationship, User-Concept Relationship]

Page 41: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Modeling for Relationships

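Relationships

Implicit Relationship

– Relevance between two objects $(o_i, o_j)$ is estimated with the conditional probability $\Pr(o_i \mid o_j)$

– The prior probabilities $\Pr(o_i)$, $\Pr(o_j)$, $\Pr(e_r)$ are assumed to be constant for their random variables

$$
\begin{aligned}
\Pr(o_i \mid o_j) &= \frac{\Pr(o_i, o_j)}{\Pr(o_j)} && \text{(definition of conditional probability)} \\
&= \frac{1}{\Pr(o_j)} \sum_{e_r} \Pr(o_i, o_j \mid e_r)\,\Pr(e_r) && \text{(law of total probability)} \\
&= \frac{1}{\Pr(o_j)} \sum_{e_r} \Pr(o_i \mid e_r)\,\Pr(o_j \mid e_r)\,\Pr(e_r) && \text{(} o_i \text{ and } o_j \text{ conditionally independent given } e_r \text{)} \\
&= \frac{1}{\Pr(o_j)} \sum_{e_r} \frac{\Pr(e_r \mid o_i)\,\Pr(o_i)}{\Pr(e_r)} \cdot \frac{\Pr(e_r \mid o_j)\,\Pr(o_j)}{\Pr(e_r)} \cdot \Pr(e_r) && \text{(Bayes' theorem)} \\
&= \Pr(o_i) \sum_{e_r} \frac{\Pr(e_r \mid o_i)\,\Pr(e_r \mid o_j)}{\Pr(e_r)}
\end{aligned}
$$

With constant priors, and using Bayes’ theorem for the first step:

$$\Pr(o_i \mid o_j) \propto \Pr(o_j \mid o_i) \propto \sum_{e_r} \Pr(e_r \mid o_i)\,\Pr(e_r \mid o_j)$$

The left-hand side is the relevance between $o_i$ and $o_j$; the sum runs over the connectives $e_r$ connecting $o_i$ with $o_j$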
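The resulting score is a sum, over shared connectives, of the product of the two conditional probabilities. A minimal sketch with hypothetical distributions (the name `relevance` is an assumption):

```python
def relevance(p_given_oi, p_given_oj):
    """Sum over shared connectives e_r of Pr(e_r | o_i) * Pr(e_r | o_j)."""
    shared = set(p_given_oi) & set(p_given_oj)
    return sum(p_given_oi[e] * p_given_oj[e] for e in shared)

# Hypothetical connective distributions for a document and a user.
doc = {"t1": 0.5, "t2": 0.5}
user = {"t2": 0.25, "t3": 0.75}

score = relevance(doc, user)
print(score)
```

Only connectives that both objects are linked to contribute, so two objects with no connective in common get a score of zero. Because the function does not care what type the connectives are, the same code covers terms, concepts, documents, or users as intermediaries.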

Page 42: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Modeling for Relationships

Relationships

Implicit Relationship

Between Users and Terms (User Modeling, Implicit Approach)

$$\Pr(u_p \mid t_k) \propto \Pr(t_k \mid u_p) \propto \sum_{d_i} \Pr(d_i \mid u_p)\,\Pr(d_i \mid t_k)$$

Between Documents and Concepts (Document-Concept Relationship)

$$\Pr(d_i \mid c_x) \propto \Pr(c_x \mid d_i) \propto \sum_{t_k} \Pr(t_k \mid d_i)\,\Pr(t_k \mid c_x)$$

Between Users and Concepts (User-Concept Relationship)

$$\Pr(u_p \mid c_x) \propto \Pr(c_x \mid u_p) \propto \sum_{d_i} \Pr(d_i \mid u_p)\,\Pr(d_i \mid c_x) \propto \sum_{d_i} \Pr(d_i \mid u_p) \sum_{t_k} \Pr(t_k \mid d_i)\,\Pr(t_k \mid c_x)$$

Page 43: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Probabilistic Approach to Ranking

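Search

Keyword Search

$$\mathrm{Pr}_k(d_i \mid u_q) \propto \sum_{t_k} \Pr(t_k \mid d_i)\,\Pr(t_k \mid u_q)$$

Semantic Search

$$\mathrm{Pr}_s(d_i \mid u_q) \propto \sum_{c_x} \Pr(c_x \mid d_i)\,\Pr(c_x \mid u_q) \propto \sum_{c_x} \Big[ \sum_{t_k} \Pr(t_k \mid c_x)\,\Pr(t_k \mid d_i) \Big] \Big[ \sum_{t_k} \Pr(t_k \mid c_x)\,\Pr(t_k \mid u_q) \Big]$$

The first bracket (the document-concept relationship) can be computed offline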
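To contrast the two rankings, the sketch below scores documents for a query both through terms and through concepts, precomputing the document-concept scores as the slide's "offline computation" note suggests. All distributions, identifiers, and function names are hypothetical.

```python
def connective_score(p_a, p_b):
    """Sum over shared connectives of the product of the two probabilities."""
    return sum(p_a[e] * p_b[e] for e in set(p_a) & set(p_b))

# Hypothetical Pr(t | c), Pr(t | d) and Pr(t | u_q).
term_given_concept = {"c1": {"olap": 0.5, "cube": 0.5}, "c2": {"music": 1.0}}
term_given_doc = {"d1": {"olap": 1.0}, "d3": {"cube": 1.0}, "d9": {"music": 1.0}}
term_given_query = {"olap": 1.0}

# Offline: document-concept scores do not depend on the query.
concept_given_doc = {
    d: {c: connective_score(tc, td) for c, tc in term_given_concept.items()}
    for d, td in term_given_doc.items()
}

def keyword_scores(query):
    """Rank through term connectives only."""
    return {d: connective_score(td, query) for d, td in term_given_doc.items()}

def semantic_scores(query):
    """Rank through concept connectives, using the offline table."""
    concept_given_query = {
        c: connective_score(tc, query) for c, tc in term_given_concept.items()
    }
    return {
        d: sum(cd[c] * concept_given_query[c] for c in concept_given_query)
        for d, cd in concept_given_doc.items()
    }

ks = keyword_scores(term_given_query)
ss = semantic_scores(term_given_query)
print(ks, ss)
```

In this toy setup d3 never mentions the query term, so keyword search gives it zero, while semantic search reaches it through the shared concept c1. Only the query-concept scores need to be computed at query time.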

Page 44: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Probabilistic Approach to Ranking

Recommendation (Collaborative Filtering-based Approach)

$$\Pr(d_i \mid u_p) \propto \sum_{u_{p'}} \Pr(u_{p'} \mid d_i)\,\Pr(u_{p'} \mid u_p)$$

Conventional Collaborative Filtering

$$\Pr(d_i \mid u_p) \propto \sum_{u_{p'}} \Pr(u_{p'} \mid d_i) \sum_{d_{x'}} \Pr(d_{x'} \mid u_{p'})\,\Pr(d_{x'} \mid u_p)$$

Semantic Collaborative Filtering

$$\Pr(d_i \mid u_p) \propto \sum_{u_{p'}} \Pr(u_{p'} \mid d_i) \sum_{c_r} \Pr(c_r \mid u_{p'})\,\Pr(c_r \mid u_p)$$

The user-concept relationship can be computed offline
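A small sketch of the conventional collaborative filtering score: like-minded users act as connectives between the active user and a candidate document, and user-user similarity is itself computed through shared documents. The distributions and names are hypothetical.

```python
def connective_score(p_a, p_b):
    """Sum over shared connectives of the product of the two probabilities."""
    return sum(p_a[e] * p_b[e] for e in set(p_a) & set(p_b))

# Hypothetical Pr(d | u) derived from access logs.
doc_given_user = {
    "u1": {"d1": 0.5, "d3": 0.5},
    "u2": {"d1": 0.5, "d2": 0.5},
    "u3": {"d4": 1.0},
}
# Hypothetical Pr(u' | d) for two candidate documents.
user_given_doc = {"d2": {"u2": 1.0}, "d4": {"u3": 1.0}}

def ccf_score(doc, user):
    """Sum over users u' of Pr(u' | doc) times the u'-user similarity via documents."""
    total = 0.0
    for other, p_other in user_given_doc[doc].items():
        similarity = connective_score(doc_given_user[other], doc_given_user[user])
        total += p_other * similarity
    return total

score_d2 = ccf_score("d2", "u1")
score_d4 = ccf_score("d4", "u1")
print(score_d2, score_d4)
```

Here u2 shares d1 with u1, so d2 receives a positive score, while u3 shares nothing with u1 and d4 scores zero. Replacing the inner document distributions with user-concept distributions would give the semantic variant.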

Page 45: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

[Figure: unified graph (users, documents, terms, concepts) annotated with the connective probabilities Pr(t_k | u_q) and Pr(t_k | c_x), and with Pr(c_x | d_i) marked as computed off-line]

Page 46: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Experiments


Page 47: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Contributions

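Proposing a Unified Model for Semantic Information Retrieval

A semantic information retrieval model that uses multi-typed connectives

A model that covers both semantic search and recommendation

– Related studies are special cases of the proposed model that use only specific link information

Providing a Guide to Ranking in Semantic Information Retrieval

Examining and proposing ranking models that consider the relationships among connectives within the proposed model

Examining the characteristics of the semantic information retrieval model using various types of concept connectives

Resolving Limitations of Previous Approaches

Overcoming the limitations of previous studies with the unified model

– Ambiguity of Keywords

– Sparse Annotation

– Exact Matching of Concept-based Approaches

– Novelty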

Page 48: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)


Page 49: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)


Page 50: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Unified Equation


},,{

)|()|()|(UCTK Kk

K kupkdpudp

1 UCT

1)|( utp

Page 51: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Graph Density

Applying associative retrieval techniques to alleviate the sparsity problem in collaborative filtering (TOIS’04)

$$\text{graph density} = \frac{\text{number of actual links present in the graph}}{\text{number of possible links in the graph}}$$
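For a bipartite relationship stored as a matrix, this density is the fraction of non-zero cells. The sketch below applies it to the document-term matrix from the sparse-annotation slide. Treating sparsity as 1 minus density is an assumption, since the slides do not spell out how their sparsity values were obtained.

```python
def graph_density(matrix):
    """Actual links divided by possible links in a bipartite adjacency matrix."""
    possible = sum(len(row) for row in matrix)
    actual = sum(1 for row in matrix for value in row if value != 0)
    return actual / possible

# Document-term matrix from the sparse-annotation slide (rows d1..d5, columns t1..t8).
doc_term = [
    [0, 1, 1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
]

density = graph_density(doc_term)
sparsity = 1 - density  # ASSUMPTION: sparsity is taken as the complement of density
print(density, sparsity)
```

The toy matrix has 9 links out of 40 possible ones, far denser than the roughly 0.999 sparsity reported for the real user-document and document-term relationships.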

Page 52: 연결자  기반의  시맨틱  정보 검색 모델    (Connectives based semantic information retrieval model)

Modeling of Relationships

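Lemma 1. The probability-based similarity of any two objects is proportional to the cosine similarity of the two objects in the vector space model

$$\Pr(o_i \mid o_j) \propto sim(o_i, o_j)$$

Proof.

In $\Pr(o_i \mid o_j) \propto \sum_{e_r} \Pr(e_r \mid o_i)\,\Pr(e_r \mid o_j)$, define $\Pr(e_r \mid o_i)$ and $\Pr(e_r \mid o_j)$ as

$$\Pr(e_r \mid o_i) = \frac{w_{i,r}}{\sqrt{\sum_x (w_{i,x})^2}}, \qquad \Pr(e_r \mid o_j) = \frac{w_{j,r}}{\sqrt{\sum_x (w_{j,x})^2}}$$

Then

$$\Pr(o_i \mid o_j) \propto \sum_{e_r} \Pr(e_r \mid o_i)\,\Pr(e_r \mid o_j) = \sum_{r} \frac{w_{i,r}\, w_{j,r}}{\sqrt{\sum_x (w_{i,x})^2}\,\sqrt{\sum_x (w_{j,x})^2}} = sim(o_i, o_j)$$

* Note: in the vector space model, the two objects are defined as

$$o_i = \left\langle \frac{w_{i,1}}{\sqrt{\sum_x (w_{i,x})^2}}, \ldots, \frac{w_{i,r}}{\sqrt{\sum_x (w_{i,x})^2}}, \ldots, \frac{w_{i,|R|}}{\sqrt{\sum_x (w_{i,x})^2}} \right\rangle$$

$$o_j = \left\langle \frac{w_{j,1}}{\sqrt{\sum_x (w_{j,x})^2}}, \ldots, \frac{w_{j,r}}{\sqrt{\sum_x (w_{j,x})^2}}, \ldots, \frac{w_{j,|R|}}{\sqrt{\sum_x (w_{j,x})^2}} \right\rangle$$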
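The lemma can be checked numerically: if each object's connective weights are divided by the vector's Euclidean length, the sum of products over connectives equals the cosine similarity of the raw weight vectors. The vectors below are hypothetical and the helper names are assumptions.

```python
import math

def unit(vector):
    """Divide each weight by the vector's Euclidean length."""
    norm = math.sqrt(sum(w * w for w in vector))
    return [w / norm for w in vector]

def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical weight vectors of two objects over three connectives.
o_i = [1.0, 2.0, 0.0]
o_j = [2.0, 1.0, 1.0]

p_i, p_j = unit(o_i), unit(o_j)
prob_sim = sum(a * b for a, b in zip(p_i, p_j))
cos_sim = cosine(o_i, o_j)
print(prob_sim, cos_sim)
```

Both quantities agree up to floating-point error, which is the sense in which the probabilistic ranking coincides with cosine ranking under this definition of the connective probabilities.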