8 Unsupervised method. University of Seoul, Department of Electrical and Computer Engineering, G201449015, Lee Ga-hee (이가희)...


Upload: marilynn-garrett

Post on 18-Jan-2016



8 Unsupervised method G201449015

0. Unsupervised method

Cluster analysis : finds groups in your data with similar characteristics (hierarchical clustering, k-means)

Association rules : finds elements or properties in the data that tend to occur together

1. Cluster analysis

The two clustering methods covered here are hierarchical clustering and k-means.


1.1 hierarchical clustering

dist() computes the distances between observations, and hclust() builds the cluster tree from those distances.

dist(x, method, diag, upper)
  x : data (numeric matrix)
  method : euclidean, maximum, manhattan, binary, minkowski
  diag : T/F
  upper : T/F

hclust(d, method)
  d : distance data
  method : ward.D, ward.D2, single, complete, average, median, centroid
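A minimal sketch of how dist() and hclust() fit together. The USArrests data set (shipped with R), the scaling step, and the choice of euclidean distance with ward.D2 linkage are assumptions made for illustration; they are not taken from the slides.

```r
# Assumption: USArrests (a built-in data set) stands in for the slide's data.
# Standardize the columns so that no single variable dominates the distance.
x <- scale(USArrests)

# Pairwise distances between the rows of x.
d <- dist(x, method = "euclidean")

# Agglomerative hierarchical clustering on the distance data.
hc <- hclust(d, method = "ward.D2")

# hc$merge records the order in which clusters were joined,
# hc$height the distance at which each merge happened.
head(hc$height)
```

Calling plot(hc) in an interactive session draws the dendrogram.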


cutree() : cuts the cluster tree into groups.

cutree(tree, k, h)
  tree : hclust tree data
  k : desired number of groups
  h : height at which to cut the tree
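As a sketch of the two ways to cut a tree, the example below cuts the same hclust result once by number of groups (k) and once by height (h). The data set, k = 4, and h = 5 are arbitrary values chosen for illustration.

```r
# Assumption: USArrests and ward.D2 linkage are illustrative choices.
x <- scale(USArrests)
hc <- hclust(dist(x), method = "ward.D2")

# Cut by the desired number of groups.
groups_k <- cutree(hc, k = 4)

# Cut at a given height of the tree instead.
groups_h <- cutree(hc, h = 5)

# Both results are integer vectors of cluster labels, one per row of x.
table(groups_k)
table(groups_h)
```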


visualizing clusters in 2 dimensions

Project the data onto the first 2 principal components (PCA) and plot the result.

prcomp() : performs principal components analysis.

prcomp(x)
  x : data
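The projection step could look like the sketch below: run prcomp() and keep the first two principal components as plotting coordinates. The data set and the 4-group cut are assumptions for illustration.

```r
# Assumption: USArrests and a 4-group ward.D2 clustering are illustrative.
x <- scale(USArrests)
groups <- cutree(hclust(dist(x), method = "ward.D2"), k = 4)

# Principal components analysis; pca$x holds the rotated data.
pca <- prcomp(x)

# Coordinates of every observation on the first two components.
proj <- pca$x[, 1:2]
head(proj)
```

In an interactive session, plot(proj, col = groups, pch = 19) then shows the clusters in two dimensions.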


bootstrap evaluation of clusters : clusterboot()

library(fpc)
clusterboot(data, clustermethod)
  data : data matrix
  clustermethod : clustering method
    kmeansCBI(data, k) : kmeans clustering
    hclustCBI(data, k, method) : agglomerative hierarchical clustering


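Each original cluster is compared with the most similar cluster found in each resampled data set. A cluster that keeps reappearing with high similarity is stable; one that does not is unstable.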


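Calinski-Harabasz index : ratio of the between-cluster variance to the within-cluster variance.

  W (within-cluster variance) = WSS(k) / (n - k)
  B (between-cluster variance) = BSS(k) / (k - 1)
  ch = B / W ; choose the k that maximizes ch.

good cluster : small WSS(k), large BSS(k)

WSS (total within sum of squares) : sum of squared distances between each point and its cluster centroid

BSS (between sum of squares) : TSS - WSS(k), where TSS is the total sum of squares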
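The formulas above can be sketched in base R as follows. The helper names sum_sq() and ch_index() are hypothetical (they are not part of any package), and the data set and the range k = 2..8 are assumptions for illustration.

```r
# Hypothetical helper: sum of squared distances of the rows of m
# to the column means of m (the centroid).
sum_sq <- function(m) sum(scale(m, center = TRUE, scale = FALSE)^2)

# Hypothetical helper: Calinski-Harabasz index for a given labelling.
ch_index <- function(x, labels) {
  x <- as.matrix(x)
  n <- nrow(x)
  k <- length(unique(labels))
  tss <- sum_sq(x)                                   # TSS
  parts <- split(as.data.frame(x), labels)           # one data frame per cluster
  wss <- sum(sapply(parts, function(g) sum_sq(as.matrix(g))))  # WSS(k)
  bss <- tss - wss                                   # BSS(k) = TSS - WSS(k)
  (bss / (k - 1)) / (wss / (n - k))                  # ch = B / W
}

# Assumption: USArrests with ward.D2 linkage is an illustrative choice.
x <- scale(USArrests)
hc <- hclust(dist(x), method = "ward.D2")

# Evaluate ch for k = 2..8 and keep the k with the largest value.
ch <- sapply(2:8, function(k) ch_index(x, cutree(hc, k = k)))
names(ch) <- 2:8
best_k <- as.integer(names(which.max(ch)))
```

Computing TSS once and deriving BSS(k) as TSS - WSS(k) mirrors the definition on the slide and avoids a separate between-cluster calculation.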


1.2 k-means algorithm

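Partitions the data into k clusters; k must be chosen in advance.

kmeans()

kmeans(x, centers, iter.max, nstart, algorithm, trace)
  x : data
  centers : number of clusters (k)
  iter.max : maximum number of iterations
  nstart : number of random sets of initial centers to try
  algorithm : Hartigan-Wong, Lloyd, Forgy, MacQueen
  trace : T/F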
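A minimal sketch of a kmeans() call using the arguments listed above. The data set, k = 4, and the values for iter.max and nstart are arbitrary assumptions chosen for illustration.

```r
# Fix the random seed so the random starting centers are reproducible.
set.seed(1)

# Assumption: USArrests stands in for the slide's data.
x <- scale(USArrests)

# 4 clusters, up to 100 iterations, best result out of 25 random starts.
km <- kmeans(x, centers = 4, iter.max = 100, nstart = 25)

km$size      # number of observations in each cluster
km$centers   # one centroid per cluster
head(km$cluster)  # cluster label of each observation
```

The result also carries km$tot.withinss and km$betweenss, which correspond to WSS(k) and BSS(k).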


bootstrap evaluation of clusters : clusterboot()

library(fpc)
clusterboot(data, clustermethod)
  data : data matrix
  clustermethod : clustering method
    kmeansCBI(data, k) : kmeans clustering
    hclustCBI(data, k, method) : agglomerative hierarchical clustering


choosing k : kmeansruns()

kmeansruns(data, krange, criterion)
  data : data
  krange : range of k values to try
  criterion :
    ch : Calinski-Harabasz index
    asw : average silhouette width

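a = average distance from a point to the other points in its own cluster
b = average distance from the point to the points in the nearest other cluster
silhouette width s = 1 - a/b if a < b, s = b/a - 1 if a > b, and s = 0 if a = b; asw is the average of s over all points.

2. Association rules

confidence(X => Y) = support(union(X, Y)) / support(X)

support(X) = (number of transactions containing X) / T, where T is the total number of transactions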

10% 60% .
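The support and confidence definitions can be checked by hand on a small example. In this sketch the five transactions and the helper names support() and confidence() are made up for illustration; they are not the arules API.

```r
# Made-up transactions: each element is the set of items in one purchase.
transactions <- list(
  c("bread", "milk"),
  c("bread", "butter"),
  c("bread", "milk", "butter"),
  c("milk"),
  c("bread", "milk")
)

# Hypothetical helper: fraction of transactions containing every item in `items`.
support <- function(items, trans) {
  mean(sapply(trans, function(tr) all(items %in% tr)))
}

# Hypothetical helper: confidence(X => Y) = support(union(X, Y)) / support(X).
confidence <- function(x, y, trans) {
  support(union(x, y), trans) / support(x, trans)
}

support("bread", transactions)             # 4 of 5 transactions: 0.8
support(c("bread", "milk"), transactions)  # 3 of 5 transactions: 0.6
confidence("bread", "milk", transactions)  # 0.6 / 0.8 = 0.75
```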


apriori(data, parameter)
  data : data
  parameter : support, confidence


inspect(x)


sort() : sort the rules, for example by lift
lhs, rhs : left-hand side and right-hand side of a rule


apriori(data, parameter, appearance)
  data : data
  parameter : support, confidence
  appearance : restricts which items may appear in the lhs or rhs of the rules


inspect(x)
