Extending the Growing Hierarchal SOM for Clustering Documents in Graphs domain

Intelligent Database Systems Lab, National Yunlin University of Science and Technology
Extending the Growing Hierarchal SOM for Clustering Documents in Graphs domain
Presenter: Cheng-Hui Chen
Authors: Mahmoud F. Hussin, Mahmoud R. Farra and Yasser El-Sonbaty
IJCNN, 2008

Page 1: Extending the Growing Hierarchal SOM for Clustering Documents in Graphs domain

Intelligent Database Systems Lab

國立雲林科技大學 (National Yunlin University of Science and Technology)

Extending the Growing Hierarchal SOM for Clustering Documents in Graphs domain

Presenter: Cheng-Hui Chen
Authors: Mahmoud F. Hussin, Mahmoud R. Farra and Yasser El-Sonbaty

IJCNN, 2008

Page 2: Extending the Growing Hierarchal SOM for Clustering Documents in Graphs domain


Outline
· Motivation
· Objectives
· Methodology
· Experiments
· Conclusions
· Comments

Page 3: Extending the Growing Hierarchal SOM for Clustering Documents in Graphs domain


Motivation
· Existing SOM variants are limited because they represent documents only with the vector space model (VSM).
─ The VSM does not represent any relation between the words.
─ The VSM has a high space complexity.
· Sentences are broken down into their individual words, without any representation of the sentence structure.

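[Figure: vector space model with axes "term A" and "term B"; documents D1 and D2 are separated by angle θ and distance d. Example words: river, rafting, mild.]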

Page 4: Extending the Growing Hierarchal SOM for Clustering Documents in Graphs domain


Objectives
· Represent documents as graphs to capture the salient features of the data: vertices represent words and edges represent the relations between them.
· Decrease the space complexity compared with the VSM.
· Extend the GHSOM to work in the graph domain to enhance the quality of the clusters.
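The first objective can be illustrated with a minimal sketch. It assumes that a "relation" is simply word adjacency within a sentence; the function name `document_to_graph` is hypothetical and is not taken from the paper.

```python
def document_to_graph(text):
    """Turn a document into (vertices, edges).

    Assumption: vertices are the distinct words, and a directed edge
    links each word to the word that follows it in the same sentence.
    """
    vertices, edges = set(), set()
    for sentence in text.lower().split("."):
        words = sentence.split()
        vertices.update(words)
        # Pair every word with its successor inside this sentence only.
        edges.update(zip(words, words[1:]))
    return vertices, edges


vertices, edges = document_to_graph("Mild river rafting.")
```

Unlike a term vector, the edge set keeps the word order, so "river rafting" and "rafting river" produce different graphs.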

[Figure: graph representation with the vertices "mild", "river" and "rafting".]

Page 5: Extending the Growing Hierarchal SOM for Clustering Documents in Graphs domain


Methodology

Page 6: Extending the Growing Hierarchal SOM for Clustering Documents in Graphs domain


Enhance the DIG to work with G-GHSOM
· The Document Index Graph (DIG) model
─ The DIG represents the documents and is exploited in the document clustering.
· For example (the document table of the word "river" is shown):
─ River rafting. (doc1)
─ Mild river rafting. (doc2)
─ River fishing. (doc3)
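The three example documents can be indexed with a simplified structure in the spirit of the DIG. This is a sketch under assumptions, not the authors' data structure: the layout of each document-table entry (a term frequency plus a list of outgoing edges) and the name `build_dig` are hypothetical.

```python
from collections import defaultdict


def build_dig(docs):
    """Build a simplified Document Index Graph.

    docs maps a document id to a list of sentences.
    Returns word -> document table, where the document table maps
    doc_id -> {"tf": count, "edges": [(next_word, sentence_no, position)]}.
    The entry layout is an assumption made for illustration.
    """
    dig = defaultdict(dict)
    for doc_id, sentences in docs.items():
        for sentence_no, sentence in enumerate(sentences):
            words = sentence.lower().strip(".").split()
            for position, word in enumerate(words):
                entry = dig[word].setdefault(doc_id, {"tf": 0, "edges": []})
                entry["tf"] += 1
                # Record the edge to the following word, if there is one.
                if position + 1 < len(words):
                    entry["edges"].append((words[position + 1], sentence_no, position))
    return dig


docs = {
    "doc1": ["River rafting."],
    "doc2": ["Mild river rafting."],
    "doc3": ["River fishing."],
}
dig = build_dig(docs)
river_table = dig["river"]  # document table of the word "river"
```

Here `river_table` has one entry per document that contains "river", and each entry lists the edges that leave "river" in that document, which is the information needed to follow shared phrases across documents.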

[Figure: document table of the word "river", with edge entries e0 S0(0) and e1 S0(1).]

Page 7: Extending the Growing Hierarchal SOM for Clustering Documents in Graphs domain


Enhance the DIG to work with G-GHSOM


Page 8: Extending the Growing Hierarchal SOM for Clustering Documents in Graphs domain


Enhance the DIG to work with G-GHSOM
· Single-word similarity measure
· Similarity measure between two document vectors
· The total similarity integrates the individual similarity measures.
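The formulas on this slide did not survive in the transcript, so the following is only a sketch of how such measures could be combined. It assumes cosine similarity for the single-word measure and a weighted sum with a hypothetical weight `alpha` for the integration; neither is confirmed as the authors' formula.

```python
import math
from collections import Counter


def cosine_similarity(words_a, words_b):
    """Cosine of the angle between two term-frequency vectors (assumed measure)."""
    a, b = Counter(words_a), Counter(words_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def total_similarity(phrase_sim, word_sim, alpha=0.5):
    """Blend a phrase-based and a single-word similarity.

    alpha is a hypothetical weight in [0, 1]; the slide does not state it.
    """
    return alpha * phrase_sim + (1 - alpha) * word_sim


word_sim = cosine_similarity(["river", "rafting"], ["river", "fishing"])
combined = total_similarity(phrase_sim=0.0, word_sim=word_sim, alpha=0.5)
```

With `alpha` close to 1 the shared phrases dominate the score; with `alpha` close to 0 the measure falls back to a plain term-vector comparison.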


Page 9: Extending the Growing Hierarchal SOM for Clustering Documents in Graphs domain


[...]ment of the G-GHSOM to work with graphs
· Neuron initialization
─ Detect the matching list to calculate the phrase-based similarity.
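One plausible reading of the matching list is the set of phrases that two word sequences share, for example a neuron's graph and an input document. The sketch below assumes that reading; the function name `matching_phrases` and the `min_len` threshold are hypothetical.

```python
def matching_phrases(words_a, words_b, min_len=2):
    """Return the maximal word sequences (as tuples) shared by both lists."""
    rows, cols = len(words_a), len(words_b)
    # run[i][j] = length of the common run ending at words_a[i-1] and words_b[j-1]
    run = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            if words_a[i - 1] == words_b[j - 1]:
                run[i][j] = run[i - 1][j - 1] + 1

    phrases = set()
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            length = run[i][j]
            # Keep only runs that cannot be extended by one more matching word.
            extends = i < rows and j < cols and words_a[i] == words_b[j]
            if length >= min_len and not extends:
                phrases.add(tuple(words_a[i - length:i]))
    return phrases


shared = matching_phrases(["mild", "river", "rafting"], ["river", "rafting"])
```

The resulting phrases could then feed a phrase-based similarity score; how the paper weights phrase length or frequency is not stated on the slide.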


Page 10: Extending the Growing Hierarchal SOM for Clustering Documents in Graphs domain


[...]ment of the G-GHSOM to work with graphs

Page 11: Extending the Growing Hierarchal SOM for Clustering Documents in Graphs domain


[...]ment of the G-GHSOM to work with graphs
[Figure: only the label "Gin" is recoverable.]

Page 12: Extending the Growing Hierarchal SOM for Clustering Documents in Graphs domain


Experiments
· Data sets
─ Reuters news articles (RNA)
─ University of Waterloo and Canadian Web sites (UW-CAN)

Page 13: Extending the Growing Hierarchal SOM for Clustering Documents in Graphs domain


Experiments

Page 14: Extending the Growing Hierarchal SOM for Clustering Documents in Graphs domain


Conclusions
· The GHSOM is extended to a new graph-based GHSOM (G-GHSOM) to enhance the quality of the document clustering.
─ G-GHSOM works successfully in the graph domain and achieves better clustering quality than TGHSOM in document clustering.
· The DIG model is enhanced to work with the GHSOM algorithm.

Page 15: Extending the Growing Hierarchal SOM for Clustering Documents in Graphs domain


Comments
· Advantages
─ Enhances the quality of the document clustering
· Application
─ SOM
─ Clustering