Intelligent Database Systems Lab, N.Y.U.S.T. I. M.

An IPC-based vector space model for patent retrieval

Presenter: Jun-Yi Wu
Authors: Yen-Liang Chen, Yu-Ting Chiu

2011 IPM

國立雲林科技大學 National Yunlin University of Science and Technology



Outline

Motivation
Objective
Methodology
Experiments
Conclusion
Comments


Motivation

A weakness of the traditional vector space model (VSM) is that the indexing vocabulary changes whenever the document set changes, the vocabulary-selection algorithm or its parameters change, or the wording used in the documents evolves.
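To illustrate the problem, the following minimal sketch shows how a term-based indexing vocabulary shifts when a single document is added. The selection rule (keep terms that occur in at least `min_df` documents), the function name `build_vocabulary`, and the toy documents are assumptions made for illustration; none of them come from the paper.

```python
def build_vocabulary(docs, min_df=2):
    """Keep terms that occur in at least min_df documents (one simple, assumed rule)."""
    doc_freq = {}
    for doc in docs:
        for term in set(doc.lower().split()):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return sorted(term for term, n in doc_freq.items() if n >= min_df)


# Toy collection (hypothetical text, not real patents).
docs = [
    "optical lens for camera",
    "camera lens with optical coating",
]
vocab_before = build_vocabulary(docs)  # ['camera', 'lens', 'optical']

# Adding one document changes which terms pass the threshold.
docs.append("wireless camera with antenna coating")
vocab_after = build_vocabulary(docs)  # ['camera', 'coating', 'lens', 'optical', 'with']
```

Because the vector dimensions are the vocabulary terms, every existing document vector would have to be rebuilt after such a change.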

Objective

The main objective of this research is to design a method that solves the aforementioned problems in patent retrieval.

The proposed method uses a feature specific to patent documents, their International Patent Classification (IPC) codes, to generate the indexing vocabulary for representing all patent documents.
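For readers unfamiliar with IPC symbols, the sketch below splits a symbol such as `G06F 17/30` into its hierarchy levels (section, class, subclass, main group, subgroup). The slides do not say which level the authors use as the indexing vocabulary, so the helper `parse_ipc` and its regular expression are illustrative assumptions that cover only this common layout.

```python
import re

# Assumed layout: section letter, two-digit class, subclass letter,
# main group number, "/", subgroup number (e.g. "G06F 17/30").
IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def parse_ipc(symbol):
    """Return the hierarchy levels of one IPC symbol as a dict (hypothetical helper)."""
    match = IPC_PATTERN.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a recognized IPC symbol: {symbol!r}")
    section, cls, subclass, group, subgroup = match.groups()
    prefix = section + cls + subclass
    return {
        "section": section,
        "class": section + cls,
        "subclass": prefix,
        "main_group": f"{prefix} {group}/00",
        "subgroup": f"{prefix} {group}/{subgroup}",
    }


levels = parse_ipc("G06F 17/30")
```

Since the IPC scheme is maintained independently of any particular document collection, a vocabulary built from these codes does not shift each time documents are added.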


Methodology


Phase 1: Collect patent documents

PatentDB


Phase 2: Text preprocessing
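The transcript does not preserve which preprocessing steps the authors applied. As a rough sketch only, the code below shows steps that are common in text retrieval (lowercasing, tokenizing on letters, removing stop words). The function name `preprocess` and the tiny stop-word list are assumptions.

```python
import re

# A deliberately tiny, assumed stop-word list; real systems use much longer ones.
STOP_WORDS = frozenset(
    {"a", "an", "the", "of", "for", "and", "or", "to", "in", "is", "with"}
)


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


tokens = preprocess("A method for the retrieval of patents.")
# tokens == ['method', 'retrieval', 'patents']
```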


Phase 3: Generate category * term vectors
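The slide's formula for this phase was not preserved. One plausible reading, sketched below under that assumption, pools the tokens of all documents assigned to an IPC category and weights each term TF-IDF style, with categories playing the role of documents. The function name `category_term_vectors`, the input layout, and the exact weighting are illustrative and may differ from the paper.

```python
import math
from collections import Counter, defaultdict


def category_term_vectors(docs):
    """docs: list of (ipc_codes, tokens) pairs. Returns {category: {term: weight}}."""
    term_freq = defaultdict(Counter)
    for ipc_codes, tokens in docs:
        for code in ipc_codes:
            term_freq[code].update(tokens)  # pool tokens per category

    n_categories = len(term_freq)
    category_freq = Counter()  # number of categories in which each term occurs
    for counts in term_freq.values():
        category_freq.update(counts.keys())

    # Assumed weighting: tf * log(N / cf). Terms found in every category get 0.
    return {
        code: {
            term: count * math.log(n_categories / category_freq[term])
            for term, count in counts.items()
        }
        for code, counts in term_freq.items()
    }


# Hypothetical toy data: (IPC codes of the patent, preprocessed tokens).
docs = [
    (["G06F"], ["search", "index", "query"]),
    (["G06F", "H04L"], ["network", "query"]),
    (["H04L"], ["network", "packet"]),
]
vectors = category_term_vectors(docs)
```

In this toy data, "query" and "network" occur in both categories and so receive weight 0, while "search" and "packet" each characterize a single category.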


Phase 4: Generate term * category vectors

Phase 5: Generate document * category vectors
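Phases 4 and 5 can be sketched as follows, again under assumptions because the slide formulas are missing. Here the term * category vectors are taken to be the transpose of a category * term table, and a document * category vector is taken to be the frequency-weighted sum of the vectors of the document's terms. The names `transpose` and `document_category_vector` and the toy weights are hypothetical.

```python
from collections import Counter, defaultdict


def transpose(category_term):
    """Turn {category: {term: w}} into {term: {category: w}}."""
    term_category = defaultdict(dict)
    for category, weights in category_term.items():
        for term, weight in weights.items():
            term_category[term][category] = weight
    return dict(term_category)


def document_category_vector(tokens, term_category):
    """Sum the category vectors of a document's terms, weighted by term frequency."""
    vector = defaultdict(float)
    for term, freq in Counter(tokens).items():
        # Terms without a category vector contribute nothing.
        for category, weight in term_category.get(term, {}).items():
            vector[category] += freq * weight
    return dict(vector)


# Toy category * term weights (made-up numbers for illustration).
category_term = {
    "G06F": {"search": 2.0, "index": 1.0},
    "H04L": {"packet": 2.0, "index": 0.5},
}
term_category = transpose(category_term)
doc_vector = document_category_vector(
    ["search", "index", "index", "router"], term_category
)
# doc_vector == {'G06F': 4.0, 'H04L': 1.0}
```

The resulting vector has one dimension per IPC category, so its dimensionality stays the same when new documents or new words appear.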

Experiments


Conclusion

A novel method, IPC-based VSM, was proposed for generating vectors to represent patent documents.

The indexing vocabulary generated in IPC-based VSM was better at finding similar documents than either of the traditional methods.
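The slides do not state which similarity measure was used to find similar documents. Cosine similarity is the usual choice in vector space models, so the sketch below assumes it and ranks a small, made-up set of document * category vectors against a query vector. The names `cosine` and `rank` and all of the numbers are illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, corpus):
    """Return document ids ordered from most to least similar to the query."""
    return sorted(corpus, key=lambda doc_id: cosine(query, corpus[doc_id]), reverse=True)


# Hypothetical document * category vectors keyed by patent id.
corpus = {
    "p1": {"G06F": 4.0, "H04L": 1.0},
    "p2": {"H04L": 3.0},
    "p3": {"A61K": 2.0},
}
query = {"G06F": 2.0, "H04L": 1.0}
ranking = rank(query, corpus)  # ['p1', 'p2', 'p3']
```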

Comments

Advantage: The IPC-based VSM performs better than previous methods.

Application: Information retrieval.