Weighted Cluster Ensembles: Methods and Analysis

Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology. Weighted Cluster Ensembles: Methods and Analysis. Presenter: Chien-Hsing Chen. Authors: Carlotta Domeniconi, Muna Al-Razgan. 2009.TKDD.40.

Page 1: Weighted Cluster Ensembles: Methods and analysis

Intelligent Database Systems Lab

國立雲林科技大學 National Yunlin University of Science and Technology

Weighted Cluster Ensembles: Methods and Analysis

Presenter: Chien-Hsing Chen

Authors: Carlotta Domeniconi, Muna Al-Razgan

2009.TKDD.40

Page 2: Weighted Cluster Ensembles: Methods and analysis

Outline
  Motivation
  Objective
  Overview of the clustering ensemble
  Method
  Experiments
  Conclusion
  Comment

Page 3: Weighted Cluster Ensembles: Methods and analysis

Motivation

High-dimensional data
  A dimension (feature) may be highly relevant to one cluster but irrelevant to another.
  Common global dimensionality reduction techniques are unable to capture such local structure of the data.
  Instead of using an equal weight for all of w1, w2, …, wD, or the same weight wi in every cluster (i = 1, …, D), each cluster has its own weights: w1,i ≠ w2,i.

  Example with the attributes (baseball, homerun, shopping):
    c1 = {sport}:   w = (0.9, 0.8, 0.1)^t
    c2 = {auction}: w = (0.1, 0.2, 0.9)^t

Clustering ensemble
  An ensemble bag may include different algorithms: K-means, SOM, etc.
  An alternative bag uses one algorithm with different settings: 3-means, 5-means, 7-means.

How can a technique combine these two aspects?
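As a rough illustration of why per-cluster weights matter, the sketch below assigns a point to the nearest centroid using a weighted Euclidean distance, with a separate weight vector for each cluster. The weight vectors are the ones on the slide; the centroids, the sample point, and the function name `weighted_distance` are hypothetical and chosen only for this example.

```python
import math

def weighted_distance(x, centroid, weights):
    """Weighted Euclidean distance using a cluster-specific weight vector."""
    return math.sqrt(sum(w * (xi - ci) ** 2
                         for w, xi, ci in zip(weights, x, centroid)))

# Attribute order assumed to be (baseball, homerun, shopping).
w_sport = (0.9, 0.8, 0.1)    # slide weights for c1 = {sport}
w_auction = (0.1, 0.2, 0.9)  # slide weights for c2 = {auction}

# Hypothetical centroids and sample point, for illustration only.
c_sport = (0.8, 0.7, 0.1)
c_auction = (0.1, 0.1, 0.9)
doc = (0.7, 0.6, 0.5)

d_sport = weighted_distance(doc, c_sport, w_sport)
d_auction = weighted_distance(doc, c_auction, w_auction)
nearest = "sport" if d_sport < d_auction else "auction"
```

Because the sport cluster puts little weight on the shopping attribute, the point's middling shopping value barely affects its distance to that cluster.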

Page 4: Weighted Cluster Ensembles: Methods and analysis

Objective

High-dimensional data
  Provide a first attempt to capture the local structure of the data: the LAC-h approach, where w1,i ≠ w2,i.

Clustering ensemble
  LAC-1, LAC-3, LAC-29, … (LAC run with different values of h)

Combining the two aspects
  WSPA approach
  WBPA approach
  WSBPA approach

Page 5: Weighted Cluster Ensembles: Methods and analysis

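Overall work (clustering ensemble)

1. A new clustering approach is discussed: it handles high-dimensional data.
2. Three ensemble techniques are introduced: each acts as a consensus function.
3. Graph cut.

[Figure: graph over data points, labelled "clustering" and "partition", with edge similarities 0.95, 0.01, 0.25, 0.20, 0.20, 0.15, 0.13, 0.91; the caption s( ) > s( ) compares the similarity of two pairs of points.]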

Page 6: Weighted Cluster Ensembles: Methods and analysis

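LAC (locally adaptive clustering)

Clustering ensemble
  LAC measures the distance along attribute i within cluster j.

[Figure: value pairs (0.9, 0.20), (0.1, 0.22), (0.2, 0.21), (0.7, 0.23); cluster c1 with w = (0.9, 0.5, 0.1)^t and |nc1| = 4; cluster c2 with w = (0.1, 0.5, 0.9)^t and |nc2| = 3; an unassigned point marked "?".]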
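The sketch below shows one way the within-cluster distance along each attribute could be turned into weights. It assumes the exponential scheme usually given for LAC, w_i proportional to exp(-X_i / h), where X_i is the mean squared distance from the centroid along attribute i; the slide itself does not show the formula. Treating the four value pairs from the figure as 2-D points of one cluster is also an assumption, as is the value of h.

```python
import math

def lac_weights(points, h):
    """Per-attribute weights for one cluster.

    X_i is the mean squared distance from the centroid along attribute i;
    the weights are exp(-X_i / h), normalised to sum to 1.
    """
    n = len(points)
    dims = len(points[0])
    centroid = [sum(p[i] for p in points) / n for i in range(dims)]
    spread = [sum((p[i] - centroid[i]) ** 2 for p in points) / n
              for i in range(dims)]
    raw = [math.exp(-x / h) for x in spread]
    total = sum(raw)
    return [r / total for r in raw]

# Value pairs read off the slide, treated as four 2-D points of one cluster
# (an assumption about what the figure showed).
cluster = [(0.9, 0.20), (0.1, 0.22), (0.2, 0.21), (0.7, 0.23)]

# h = 0.1 is an arbitrary illustrative value.
weights = lac_weights(cluster, h=0.1)
```

The second attribute varies far less within the cluster than the first, so it receives the larger weight. A smaller h sharpens this contrast and a larger h flattens it, which is why runs with different h can yield different clusterings for an ensemble.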

Page 7: Weighted Cluster Ensembles: Methods and analysis

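Overall work (clustering ensemble)

1. A new clustering approach is discussed: it handles high-dimensional data.
2. Three ensemble techniques are introduced: each acts as a consensus function.
3. Graph cut.

[Figure: graph over data points, with cluster c1 marked and edge similarities 0.95, 0.01, 0.25, 0.20, 0.20, 0.15, 0.13, 0.91; the caption s( ) > s( ) compares the similarity of two pairs of points.]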

Page 8: Weighted Cluster Ensembles: Methods and analysis

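WSPA 1/2

The similarity s( ) between two points is computed from their vectors of cluster-membership probabilities.

  P = (0.94, 0.04, 0.02)^t
  P = (0.90, 0.06, 0.04)^t

[Figure: two points with membership probabilities 0.94, 0.04, 0.02 and 0.90, 0.06, 0.04 toward three clusters.]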
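A minimal sketch of a similarity score between two membership vectors follows. It assumes cosine similarity as the measure, which the slide does not state. The first two vectors use the values that appear in the slide's figure (0.94, 0.04, 0.02 and 0.90, 0.06, 0.04); the third vector is a hypothetical point that mostly belongs to a different cluster.

```python
import math

def cosine_similarity(p, q):
    """Cosine of the angle between two membership-probability vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)

p_a = (0.94, 0.04, 0.02)  # figure values for the first point
p_b = (0.90, 0.06, 0.04)  # figure values for the second point
p_c = (0.03, 0.91, 0.06)  # hypothetical point in another cluster

s_ab = cosine_similarity(p_a, p_b)  # both points favour the first cluster
s_ac = cosine_similarity(p_a, p_c)  # the points favour different clusters
```

Points that lean toward the same cluster score close to 1, and points that lean toward different clusters score close to 0.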

Page 9: Weighted Cluster Ensembles: Methods and analysis

WSPA 2/2

Clustering ensemble
  Two points have a high similarity score if they are often grouped together across the partitions.
  Instance-based graph cut.

[Figure: graph over data points with edge similarities 0.95, 0.01, 0.25, 0.20, 0.20, 0.15, 0.13, 0.91.]
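The instance-based idea can be sketched as follows: average the similarity matrices produced by the individual clusterings, then cut the resulting graph. The two input matrices are hypothetical. The brute-force balanced 2-way cut is a stand-in chosen for brevity and only works on tiny graphs; the slides say only "graph cut" and do not name the partitioner.

```python
from itertools import combinations

def average_similarity(matrices):
    """Element-wise mean of the per-clustering similarity matrices."""
    m = len(matrices)
    n = len(matrices[0])
    return [[sum(s[i][j] for s in matrices) / m for j in range(n)]
            for i in range(n)]

def best_balanced_cut(sim):
    """Brute-force search for the balanced 2-way split with the smallest
    total weight of cut edges. Feasible only for very small graphs."""
    n = len(sim)
    best = None
    for left in combinations(range(n), n // 2):
        if 0 not in left:  # keep node 0 on the left to skip mirrored splits
            continue
        right = [v for v in range(n) if v not in left]
        cut = sum(sim[i][j] for i in left for j in right)
        if best is None or cut < best[0]:
            best = (cut, set(left), set(right))
    return best

# Two hypothetical 4x4 similarity matrices, one per ensemble member.
s1 = [[1.0, 0.9, 0.2, 0.1],
      [0.9, 1.0, 0.1, 0.2],
      [0.2, 0.1, 1.0, 0.8],
      [0.1, 0.2, 0.8, 1.0]]
s2 = [[1.0, 1.0, 0.0, 0.2],
      [1.0, 1.0, 0.3, 0.0],
      [0.0, 0.3, 1.0, 1.0],
      [0.2, 0.0, 1.0, 1.0]]

psi = average_similarity([s1, s2])
cut, left, right = best_balanced_cut(psi)
```

Points 0 and 1 are strongly similar in both matrices, as are points 2 and 3, so the cheapest cut separates those two pairs.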

Page 10: Weighted Cluster Ensembles: Methods and analysis

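WBPA 1/3

Problem definition
  Two points are never clustered together (similarity ≡ 0), even though the groups to which they belong share the same instances.

  P = (0.94, 0.04, 0.02)^t
  P = (0.03, 0.91, 0.06)^t

[Figure: graph of two points with membership probabilities 0.94, 0.04, 0.02 and 0.03, 0.91, 0.06.]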

Page 11: Weighted Cluster Ensembles: Methods and analysis

WBPA 2/3

The graph connects clusters to instances, instead of connecting data points to one another.

[Figure: graph]
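A cluster-to-instance graph of this kind can be sketched as a bipartite adjacency matrix. The sketch assumes that the edge weight between an instance and a cluster is the instance's membership probability for that cluster, which the slide does not spell out. The function name `bipartite_adjacency` is hypothetical, and the two membership vectors are the P vectors shown on the WBPA 1/3 slide.

```python
def bipartite_adjacency(memberships):
    """Build a symmetric adjacency matrix over n instances followed by
    k clusters. memberships[i][l] is the edge weight between instance i
    and cluster l; instance-instance and cluster-cluster edges stay 0."""
    n = len(memberships)
    k = len(memberships[0])
    size = n + k
    adj = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for l in range(k):
            adj[i][n + l] = memberships[i][l]
            adj[n + l][i] = memberships[i][l]
    return adj

# Membership vectors taken from the WBPA 1/3 slide.
memberships = [
    (0.94, 0.04, 0.02),
    (0.03, 0.91, 0.06),
]

# Vertices 0-1 are the instances, vertices 2-4 are the clusters.
adj = bipartite_adjacency(memberships)
```

Partitioning this graph groups instances and clusters at the same time, so two instances can land in the same part through a shared cluster even when no direct edge links them.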

Page 12: Weighted Cluster Ensembles: Methods and analysis

WBPA 3/3

[Figure: graph with example edge weights 0.94 and 0.64.]

Page 13: Weighted Cluster Ensembles: Methods and analysis

WSBPA

[Figure: graph with edge weights 0.94, 0.64, 0.93, 0.94, 0.91, 0.86, 0.85, 0.89, 0.93, 0.94, 0.86.]

Page 14: Weighted Cluster Ensembles: Methods and analysis

WSBPA

[Figure: graph with edge weights 0.94, 0.64, 0.01, 0.04, 0.01, 0.89, 0.86, 0.85, 0.89, 0.03, 0.04, 0.86.]

Page 15: Weighted Cluster Ensembles: Methods and analysis

Experiment

Page 16: Weighted Cluster Ensembles: Methods and analysis

Experiment

Page 17: Weighted Cluster Ensembles: Methods and analysis

Experiment

Page 18: Weighted Cluster Ensembles: Methods and analysis

Experiment

Page 19: Weighted Cluster Ensembles: Methods and analysis

Experiment

Page 20: Weighted Cluster Ensembles: Methods and analysis

Experiment w1,i ≠ w2,i

Page 21: Weighted Cluster Ensembles: Methods and analysis

Experiment

Page 22: Weighted Cluster Ensembles: Methods and analysis

Conclusion

High-dimensional data
  LAC-h approach

Clustering ensemble
  LAC-1, LAC-3, LAC-29, …

Combining the two aspects
  WSPA approach
  WBPA approach
  WSBPA approach

Page 23: Weighted Cluster Ensembles: Methods and analysis

Comment

Advantage
  Consensus function

Drawback

Application
  Ensemble clustering on SOM