An IPC-based vector space model for patent retrieval

Intelligent Database Systems Lab

N.Y.U.S.T.

I. M.

An IPC-based vector space model for patent retrieval

Presenter: Jun-Yi Wu
Authors: Yen-Liang Chen, Yu-Ting Chiu

Information Processing & Management (IPM), 2011

國立雲林科技大學 National Yunlin University of Science and Technology

Outline

Motivation
Objective
Methodology
Experiments
Conclusion
Comments

Motivation

A weakness of the traditional vector space model (VSM) is that its indexing vocabulary is unstable: the vocabulary changes whenever the document set changes, whenever the vocabulary selection algorithm or its parameters change, or whenever wording evolves over time.

Objective

The major objective of this research is to design a method that solves the aforementioned problem for patent retrieval.

The proposed method exploits a special characteristic of patent documents, their International Patent Classification (IPC) codes, to generate the indexing vocabulary used to represent all patent documents.

Methodology

Phase 1: Collect patent documents

PatentDB

Phase 2: Text preprocessing
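
The slide does not list the individual preprocessing steps. As a minimal sketch, assuming the usual steps of lowercasing, tokenization, and stop-word removal, Phase 2 might look like the following. The function name `preprocess` and the tiny `STOP_WORDS` set are hypothetical and are not taken from the paper.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for illustration);
# a real system would use a much fuller list.
STOP_WORDS = {"a", "an", "and", "the", "of", "for", "is", "in", "to", "with"}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


if __name__ == "__main__":
    print(preprocess("A method for the Retrieval of patents"))
```

Stemming or phrase extraction could be added at the same point; whether the authors used them is not stated on the slide.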

Phase 3: Generate category * term vectors

Phase 4: Generate term * category vectors

Phase 5: Generate document * category vectors
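
The slides name Phases 3 to 5 but do not give the weighting formulas, so the following is only a sketch of one plausible reading. It assumes that category * term vectors are summed term counts per IPC code, that term * category vectors are their normalized transpose, and that a document * category vector is the sum of the vectors of the document's terms. All function names, the sample documents, and the weighting scheme are assumptions for illustration, not the authors' definitions.

```python
import math
from collections import Counter, defaultdict


def category_term_vectors(docs):
    """Phase 3 (assumed): sum term counts over the documents filed under each IPC code."""
    cat_terms = defaultdict(Counter)
    for tokens, ipc_codes in docs:
        for code in ipc_codes:
            cat_terms[code].update(tokens)
    return dict(cat_terms)


def term_category_vectors(cat_terms):
    """Phase 4 (assumed): transpose, then scale each term's weights to sum to 1."""
    raw = defaultdict(dict)
    for code, counts in cat_terms.items():
        for term, n in counts.items():
            raw[term][code] = n
    return {
        term: {code: n / sum(weights.values()) for code, n in weights.items()}
        for term, weights in raw.items()
    }


def document_category_vector(tokens, term_cats, categories):
    """Phase 5 (assumed): add up the category vectors of the document's terms."""
    vec = dict.fromkeys(categories, 0.0)
    for t in tokens:
        # Terms never seen during training contribute nothing.
        for code, w in term_cats.get(t, {}).items():
            vec[code] += w
    return [vec[c] for c in categories]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Illustrative toy collection: (preprocessed tokens, IPC codes of the patent).
docs = [
    (["engine", "fuel", "valve"], ["F02"]),
    (["engine", "piston"], ["F02"]),
    (["antenna", "signal"], ["H04"]),
]
cat_terms = category_term_vectors(docs)
term_cats = term_category_vectors(cat_terms)
categories = sorted(cat_terms)  # the fixed indexing vocabulary: one dimension per IPC code
vectors = [document_category_vector(tokens, term_cats, categories) for tokens, _ in docs]

if __name__ == "__main__":
    print(categories)
    print(cosine(vectors[0], vectors[1]), cosine(vectors[0], vectors[2]))
```

Under this reading every document vector has one dimension per IPC code, so adding documents or new wording changes the weights but not the set of dimensions, which is the stability property the Motivation slide asks for.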

Experiments

Conclusion

A novel method, IPC-based VSM, was proposed for generating vectors to represent patent documents.

The indexing vocabulary generated in IPC-based VSM was better at finding similar documents than either of the traditional methods.

Comments

Advantage: The IPC-based VSM performs better than previous methods.

Application: Information retrieval.
