Weighted Cluster Ensembles: Methods and Analysis

Post on 21-Jan-2016

TRANSCRIPT

Intelligent Database Systems Lab

國立雲林科技大學 (National Yunlin University of Science and Technology)

Weighted Cluster Ensembles: Methods and Analysis

Presenter: Chien-Hsing Chen

Authors: Carlotta Domeniconi, Muna Al-Razgan

2009.TKDD.40.

Outline
Motivation
Objective
Overview of clustering ensembles
Method
Experiments
Conclusion
Comment

Motivation

High-dimensional data
A dimension (feature) can be highly relevant to one cluster but irrelevant to another.
Common global dimensionality reduction techniques are unable to capture such local structure of the data.
Weight the dimensions instead of using an equal weight for all of w1, w2, …, wD.
Let each cluster have its own weights instead of using an equal weight wi among all clusters, where i = 1, …, D.

Clustering ensemble
An ensemble bag can mix algorithms: K-means, SOM, etc.
An alternative bag varies one parameter: 3-means, 5-means, 7-means.

How can a technique combine the two aspects?

Example (attribute names: baseball, homerun, shopping)
c1 = {sport}: w = (0.9, 0.8, 0.1)^t
c2 = {auction}: w = (0.1, 0.2, 0.9)^t
w1,i ≠ w2,i
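The sport/auction example can be made concrete with a small sketch. It assumes a weighted squared Euclidean distance, which the slide does not state; the function name, the two centroids, and the sample document vector are hypothetical and chosen only for illustration.

```python
def weighted_sq_distance(x, centroid, weights):
    """Squared Euclidean distance with one weight per attribute (assumed form)."""
    return sum(w * (xi - ci) ** 2 for w, xi, ci in zip(weights, x, centroid))

# Attribute order: baseball, homerun, shopping. Weights are taken from the slide.
w_sport = (0.9, 0.8, 0.1)
w_auction = (0.1, 0.2, 0.9)

# Hypothetical centroids and document, not from the slides.
c_sport = (1.0, 1.0, 0.0)
c_auction = (0.0, 0.0, 1.0)
doc = (0.8, 0.9, 0.7)

# Each cluster measures distance with its own weight vector (w1,i != w2,i).
d_sport = weighted_sq_distance(doc, c_sport, w_sport)
d_auction = weighted_sq_distance(doc, c_auction, w_auction)
nearest = "sport" if d_sport < d_auction else "auction"
```

With cluster-specific weights, the document's high "shopping" value barely penalizes its distance to the sport cluster, because that attribute carries little weight there.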

Objective

High-dimensional data
Provide a first attempt to capture the local structure of the data.
LAC-h approach

Clustering ensemble
LAC-1, LAC-3, LAC-29, …

Combining the two aspects
WSPA approach
WBPA approach
WSBPA approach

w1,i ≠ w2,i

Clustering ensemble: overall work

1. A new clustering approach is discussed
   • handles high-dimensional data
2. Three ensemble techniques are introduced
   • consensus function
3. Graph cut
   • clustering partition

[Figure: graph of instances with similarity-weighted edges, s( ) > s( ); the instance symbols and edge layout were lost in extraction]

Clustering ensemble: LAC (locally adaptive clustering)

Distance along attribute i within cluster j

Example values shown on the slide:
0.9 0.20
0.1 0.22
0.2 0.21
0.7 0.23

c1: w = (0.9, 0.5, 0.1)^t, |nc1| = 4
c2: w = (0.1, 0.5, 0.9)^t, |nc2| = 3
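A rough sketch of how per-attribute spread within a cluster could be turned into weights. It assumes an exponential weighting scheme, w_ji proportional to exp(-X_ji / h), where X_ji is the average squared distance from the centroid along attribute i in cluster j; that formula is not shown on the slide and is stated here as an assumption. Reading the four rows as two-attribute points of one cluster is also an assumption, as are the function name and the value of h.

```python
import math

def lac_weights(points, h):
    """Per-attribute weights for one cluster (assumed exponential scheme)."""
    n = len(points)
    dims = len(points[0])
    centroid = [sum(p[i] for p in points) / n for i in range(dims)]
    # X_i: average squared distance from the centroid along attribute i.
    spread = [sum((p[i] - centroid[i]) ** 2 for p in points) / n
              for i in range(dims)]
    scores = [math.exp(-x / h) for x in spread]
    total = sum(scores)
    return [s / total for s in scores]  # normalized so the weights sum to 1

# The four rows from the slide, read as (attribute 1, attribute 2) pairs.
c1 = [(0.9, 0.20), (0.1, 0.22), (0.2, 0.21), (0.7, 0.23)]
weights = lac_weights(c1, h=0.05)  # h is a hypothetical setting
```

The second attribute varies far less across the four points than the first, so it receives the larger weight; a smaller h makes that contrast sharper.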

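WSPA 1/2

s( ) [instance symbols lost in extraction]

P = (0.94, 0.04, 0.02)^t
P = (0.90, 0.06, 0.02)^t

[Figure: two instances with their cluster-membership values; layout lost in extraction]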
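One way to score the similarity s( ) of two instances from their P vectors is cosine similarity. The slide does not show the formula, so treating s as a cosine is an assumption here; the function and variable names are illustrative.

```python
import math

def cosine_similarity(p, q):
    """Cosine of the angle between two membership vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)

# The two P vectors from the slide.
p_a = (0.94, 0.04, 0.02)
p_b = (0.90, 0.06, 0.02)
s = cosine_similarity(p_a, p_b)
```

Both instances place almost all of their mass on the first cluster, so the score comes out close to 1.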

WSPA 2/2

Clustering ensemble
Two points have a high similarity score if they often appear in the same cluster across the partitions.
Instance-based graph cut

[Figure: graph of instances with similarity-weighted edges; layout lost in extraction]
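The idea of combining several partitions into one instance graph and then cutting it can be sketched as follows. Averaging cosine similarities over the ensemble is an assumption, and the "cut" here is a deliberately simple stand-in (connected components after thresholding), not a real graph partitioner. The ensemble data, function names, and threshold are hypothetical.

```python
import math

def cosine(p, q):
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (math.sqrt(sum(a * a for a in p)) *
                  math.sqrt(sum(b * b for b in q)))

def average_similarity(ensemble):
    """ensemble[v][i] is the membership vector of instance i in clustering v."""
    m, n = len(ensemble), len(ensemble[0])
    return [[sum(cosine(run[i], run[j]) for run in ensemble) / m
             for j in range(n)] for i in range(n)]

def threshold_components(sim, threshold):
    """Stand-in for a graph cut: keep edges >= threshold, label components."""
    n = len(sim)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and sim[i][j] >= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Hypothetical ensemble: a 2-cluster run and a 3-cluster run over 4 instances.
ensemble = [
    [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8)],
    [(0.7, 0.2, 0.1), (0.6, 0.3, 0.1), (0.1, 0.1, 0.8), (0.1, 0.2, 0.7)],
]
sim = average_similarity(ensemble)
labels = threshold_components(sim, threshold=0.8)
```

Instances 0 and 1 agree in both runs, as do instances 2 and 3, so the thresholded graph splits into those two groups. A balanced graph partitioner would replace `threshold_components` in practice.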

WBPA 1/3

Problem definition
Two instances may never be clustered together (similarity ≡ 0), even though the groups to which they belong share the same instances.

P = (0.94, 0.04, 0.02)^t
P = (0.03, 0.91, 0.06)^t

[Figure: graph; instance symbols and layout lost in extraction]

WBPA 2/3

The graph connects each instance to the clusters (a bipartite graph), instead of connecting instances to each other.

[Figure: graph]
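A minimal sketch of such a cluster-instance graph as an adjacency matrix. Using the membership values as edge weights and placing cluster vertices before instance vertices are both assumptions; the function name is illustrative. The two rows reuse the P vectors from the WBPA 1/3 slide.

```python
def bipartite_adjacency(memberships):
    """memberships[i][l] is the weight of the edge between instance i and cluster l.

    Vertices 0..k-1 are clusters, vertices k..k+n-1 are instances
    (an ordering chosen here for illustration).
    """
    n, k = len(memberships), len(memberships[0])
    size = k + n
    adj = [[0.0] * size for _ in range(size)]
    for i, row in enumerate(memberships):
        for l, p in enumerate(row):
            adj[l][k + i] = p      # cluster -> instance
            adj[k + i][l] = p      # instance -> cluster (undirected graph)
    return adj

memberships = [
    (0.94, 0.04, 0.02),
    (0.03, 0.91, 0.06),
]
adj = bipartite_adjacency(memberships)
```

No edge joins two instances or two clusters directly, so partitioning this graph groups instances and clusters at the same time.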

WBPA 3/3

[Figure: graph with edge weights 0.94 and 0.64; layout lost in extraction]

WSBPA

[Figure: graph with edge weights between 0.64 and 0.94; layout lost in extraction]

WSBPA

[Figure: graph with edge weights between 0.01 and 0.94; layout lost in extraction]

Experiment

Experiment w1,i ≠ w2,i

Conclusion

High-dimensional data
LAC-h approach

Clustering ensemble
LAC-1, LAC-3, LAC-29, …

Combining the two aspects
WSPA approach
WBPA approach
WSBPA approach

Comment

Advantage
Consensus function

Drawback

Application
Ensemble clustering on SOM
