An IPC-based vector space model for patent retrieval

Intelligent Database Systems Lab, N.Y.U.S.T. I.M. An IPC-based vector space model for patent retrieval. Presenter: Jun-Yi Wu. Authors: Yen-Liang Chen, Yu-Ting Chiu. 2011 IPM. 國立雲林科技大學 National Yunlin University of Science and Technology

TRANSCRIPT

Page 1: An IPC-based vector space model for patent retrieval

Intelligent Database Systems Lab

N.Y.U.S.T.

I. M.

An IPC-based vector space model for patent retrieval

Presenter: Jun-Yi Wu Authors: Yen-Liang Chen, Yu-Ting Chiu

Information Processing & Management (IPM), 2011

國立雲林科技大學 National Yunlin University of Science and Technology

Page 2: An IPC-based vector space model for patent retrieval

Outline

Motivation, Objective, Methodology, Experiments, Conclusion, Comments

Page 3: An IPC-based vector space model for patent retrieval

Motivation

The weakness of the traditional vector space model (VSM) is that its indexing vocabulary is unstable: it changes whenever the document set, the vocabulary selection algorithm, or that algorithm's parameters change, or when wording evolves over time.
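To make the instability concrete, here is a minimal sketch. It assumes the simplest possible indexing vocabulary (every distinct term in the collection) and raw term counts as weights; the function names and toy documents are hypothetical and not taken from the paper.

```python
def build_vocabulary(docs):
    """Return the sorted list of distinct terms across all documents."""
    return sorted({term for doc in docs for term in doc.split()})


def to_vector(doc, vocabulary):
    """Term-frequency vector of one document over the given vocabulary."""
    terms = doc.split()
    return [terms.count(term) for term in vocabulary]


# Toy collection (assumed data, for illustration only).
docs = ["engine fuel pump", "fuel injection valve"]
vocab_before = build_vocabulary(docs)
vec_before = to_vector(docs[0], vocab_before)

# Adding one document changes the vocabulary, so the vector of an
# unchanged document has to be recomputed in a different space.
docs.append("battery charging circuit")
vocab_after = build_vocabulary(docs)
vec_after = to_vector(docs[0], vocab_after)

print(len(vec_before), len(vec_after))  # prints: 5 8
```

The first document never changed, yet its vector went from 5 to 8 dimensions, which is the kind of vocabulary drift the slide describes.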

Page 4: An IPC-based vector space model for patent retrieval

Objective

The main objective of this research is to design a method that solves the aforementioned problems in patent retrieval.

The proposed method uses a special characteristic of patent documents, their International Patent Classification (IPC) codes, to generate the indexing vocabulary for representing all patent documents.

Page 5: An IPC-based vector space model for patent retrieval

Methodology

Page 6: An IPC-based vector space model for patent retrieval

Methodology

Phase 1: Collect patent documents

PatentDB

Page 7: An IPC-based vector space model for patent retrieval

Methodology

Phase 2: Text preprocessing
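The slide does not list the preprocessing steps. As a rough sketch, assuming the common choices of lowercasing, tokenizing on letters, and removing stop words, the phase might look like the following; the `preprocess` function and the tiny stop word list are hypothetical.

```python
import re

# Hypothetical, deliberately tiny stop word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "for", "in", "is", "of", "the", "to", "with"}


def preprocess(text):
    """Lowercase, keep runs of letters, drop stop words and one-letter tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 1]


print(preprocess("A fuel pump for an internal combustion engine."))
# prints: ['fuel', 'pump', 'internal', 'combustion', 'engine']
```

Stemming or other normalization could be added at the same point; whether the authors do so is not stated in the slides.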

Page 8: An IPC-based vector space model for patent retrieval

Methodology

Phase 3: Generate category * term vectors

Page 9: An IPC-based vector space model for patent retrieval

Methodology

Page 10: An IPC-based vector space model for patent retrieval

Methodology

Phase 4: Generate term * category vectors

Phase 5: Generate document * category vectors
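Phases 3 to 5 can be sketched as follows. The slides name the three vector types but do not give the weighting formulas, so everything numeric here is an assumption: raw term counts per IPC category in Phase 3, a transpose with each term's row scaled to sum to 1 in Phase 4, and a plain sum of term vectors in Phase 5. The toy patents and function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy collection: (IPC codes, preprocessed terms) per patent.
patents = [
    (["F02M"], ["fuel", "pump", "engine"]),
    (["F02M"], ["fuel", "injection", "valve"]),
    (["H02J"], ["battery", "charging", "circuit"]),
    (["H02J", "F02M"], ["battery", "engine", "circuit"]),
]


def category_term_vectors(patents):
    """Phase 3: for each IPC category, count the terms of its patents."""
    cat_terms = defaultdict(Counter)
    for ipc_codes, terms in patents:
        for code in ipc_codes:
            cat_terms[code].update(terms)
    return cat_terms


def term_category_vectors(cat_terms):
    """Phase 4: transpose, then scale each term's row to sum to 1 (assumed weighting)."""
    rows = defaultdict(dict)
    for code, counts in cat_terms.items():
        for term, n in counts.items():
            rows[term][code] = n
    return {
        term: {code: n / sum(row.values()) for code, n in row.items()}
        for term, row in rows.items()
    }


def document_category_vector(terms, term_cats, categories):
    """Phase 5: sum the category vectors of the document's known terms."""
    vec = dict.fromkeys(categories, 0.0)
    for term in terms:
        for code, weight in term_cats.get(term, {}).items():
            vec[code] += weight
    return vec


cat_terms = category_term_vectors(patents)
term_cats = term_category_vectors(cat_terms)
categories = sorted(cat_terms)
doc_vec = document_category_vector(["fuel", "engine", "battery"], term_cats, categories)
print(categories, doc_vec)
```

Under this reading, every document vector has one dimension per IPC category, so its length stays fixed when new documents or new terms arrive; unknown terms simply contribute nothing.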

Page 11: An IPC-based vector space model for patent retrieval

Experiments

Page 12: An IPC-based vector space model for patent retrieval

Experiments

Page 13: An IPC-based vector space model for patent retrieval

Experiments

Page 14: An IPC-based vector space model for patent retrieval

Experiments

Page 15: An IPC-based vector space model for patent retrieval

Experiments

Page 16: An IPC-based vector space model for patent retrieval

Conclusion

A novel method, IPC-based VSM, was proposed for generating vectors to represent patent documents.

The indexing vocabulary generated in IPC-based VSM was better at finding similar documents than either of the traditional methods.
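Finding similar documents in a vector space model usually means ranking by a vector similarity. The slides do not say which measure was used, so the sketch below assumes cosine similarity, the standard VSM choice; the three document * category vectors and the query vector are made-up values.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical document * category vectors over three IPC categories.
collection = {
    "patent_A": [2.0, 1.0, 0.0],
    "patent_B": [0.0, 1.0, 3.0],
    "patent_C": [1.0, 1.0, 0.0],
}
query = [4.0, 2.0, 0.0]

# Rank the collection from most to least similar to the query patent.
ranked = sorted(collection, key=lambda name: cosine(query, collection[name]), reverse=True)
print(ranked)  # prints: ['patent_A', 'patent_C', 'patent_B']
```

Here patent_A points in the same direction as the query, so it ranks first even though its vector is shorter.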

Page 17: An IPC-based vector space model for patent retrieval

Comments

Advantage: The IPC-based VSM performs better than previous methods.

Application: Information retrieval