Intelligent Database Systems Lab
N.Y.U.S.T.
I. M.
An IPC-based vector space model for patent retrieval
Presenter: Jun-Yi Wu
Authors: Yen-Liang Chen, Yu-Ting Chiu
2011 IPM
National Yunlin University of Science and Technology (國立雲林科技大學)
Outline
Motivation, Objective, Methodology, Experiments, Conclusion, Comments
Motivation
The weakness of the traditional vector space model (VSM) is that the indexing vocabulary changes whenever the document set changes, the vocabulary selection algorithm or its parameters change, or the wording used in documents evolves.
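A small Python sketch of this weakness, assuming a toy term-frequency VSM whose vocabulary is simply the sorted set of all tokens in the corpus. The helper names `build_vocabulary` and `vectorize` and the sample texts are illustrative assumptions, not taken from the paper. Adding one document changes the vector dimensions, so vectors built earlier no longer line up with new ones.

```python
def build_vocabulary(docs):
    """Return the sorted list of distinct tokens across all documents."""
    return sorted({tok for doc in docs for tok in doc.lower().split()})


def vectorize(doc, vocabulary):
    """Term-frequency vector of `doc` over `vocabulary`."""
    tokens = doc.lower().split()
    return [tokens.count(term) for term in vocabulary]


# Illustrative corpus (assumption): two short "patent" texts.
docs = ["battery charging circuit", "solar battery panel"]
vocab_before = build_vocabulary(docs)
vec_before = vectorize(docs[0], vocab_before)

# A new document arrives and the vocabulary has to be rebuilt.
docs.append("wireless charging coil")
vocab_after = build_vocabulary(docs)
vec_after = vectorize(docs[0], vocab_after)

# The same document now has a vector of a different length.
print(len(vec_before), len(vec_after))
```

The first document is unchanged, yet its vector grows from five to seven dimensions once the third document is indexed.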
Objective
The major objective of this research is to design a patent retrieval method that solves the aforementioned problems.
The proposed method uses a special characteristic of patent documents, their International Patent Classification (IPC) codes, to generate the indexing vocabulary for representing all patent documents.
Methodology
Phase 1: Collect patent documents
PatentDB
Phase 2: Text preprocessing
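The slide does not list the preprocessing steps. As a minimal sketch, assuming the usual lowercasing, tokenization, and stop-word removal, the phase could look like the following. The function `preprocess` and the tiny `STOP_WORDS` set are hypothetical stand-ins, not the authors' pipeline.

```python
import re

# Hypothetical, deliberately tiny stop-word list (assumption for illustration).
STOP_WORDS = {"a", "an", "the", "of", "for", "and", "to", "in", "is"}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


tokens = preprocess("A method for charging the battery of an electric vehicle.")
print(tokens)  # ['method', 'charging', 'battery', 'electric', 'vehicle']
```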
Phase 3: Generate category * term vectors
Phase 4: Generate term * category vectors
Phase 5: Generate document * category vectors
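Phases 3 to 5 can be sketched in Python as follows. The slides do not give the weighting formulas, so this sketch assumes the simplest choices: raw term counts per IPC code for the category * term vectors, a transposed and row-normalized table for the term * category vectors, and a plain sum over a document's terms for the document * category vectors. The IPC codes, tokens, and the helper `document_vector` are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter, defaultdict

# Toy corpus of (IPC code, preprocessed tokens); all values are illustrative.
patents = [
    ("H01M", ["battery", "electrode", "battery"]),
    ("H02J", ["battery", "charging", "circuit"]),
    ("H02J", ["charging", "coil"]),
]

# Phase 3: category * term vectors (term counts per IPC code).
category_term = defaultdict(Counter)
for ipc, tokens in patents:
    category_term[ipc].update(tokens)

# Phase 4: term * category vectors (transpose, then normalize each
# term's weights so they sum to 1; the normalization is an assumption).
term_category = defaultdict(dict)
for ipc, counts in category_term.items():
    for term, n in counts.items():
        term_category[term][ipc] = n
for term, row in term_category.items():
    total = sum(row.values())
    for ipc in row:
        row[ipc] /= total

# The indexing vocabulary is the fixed, sorted set of IPC codes.
categories = sorted(category_term)


def document_vector(tokens):
    """Phase 5: sum the term * category vectors of the document's tokens."""
    vec = dict.fromkeys(categories, 0.0)
    for tok in tokens:
        for ipc, weight in term_category.get(tok, {}).items():
            vec[ipc] += weight
    return vec


doc_vec = document_vector(["battery", "charging"])
print(doc_vec)
```

Under these assumptions every document vector has one dimension per IPC code, so its length does not depend on which terms happen to appear in the document set.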
Experiments
Conclusion
A novel method, IPC-based VSM, was proposed for generating vectors to represent patent documents.
The indexing vocabulary generated in IPC-based VSM was better at finding similar documents than either of the traditional methods.
Comments
Advantage: IPC-based VSM performs better than previous methods.
Application: Information retrieval