05 Unsupervised Learning (1)
TRANSCRIPT
CLUSTERING
Clustering is the partitioning of a data set into subsets (clusters), so that the data in each subset (ideally) share some common trait - often according to some defined distance measure.
Clustering is unsupervised classification
CLUSTERING
• There is no explicit teacher; the system forms clusters, or "natural groupings", from the structure in the input patterns
CLUSTERING
• Data WITHOUT classes or labels
• Deals with finding a structure in a collection of unlabeled data.
• The process of organizing objects into groups whose members are similar in some way
• A cluster is therefore a collection of objects which are "similar" to each other and "dissimilar" to the objects belonging to other clusters.
The data are unlabeled samples $x_1, x_2, x_3, \ldots, x_n$, each $x_i \in \mathbb{R}^d$.
CLUSTERING
In this case we easily identify the 4 clusters into which the data can be divided
The similarity criterion is distance: two or more objects belong to the same cluster if they are “close” according to a given distance
Types of Clustering
• Hierarchical algorithms – These find successive clusters using previously established clusters.
1. Agglomerative ("bottom-up"): Agglomerative algorithms begin with each element as a separate cluster and merge them into successively larger clusters.
2. Divisive ("top-down"): Divisive algorithms begin with the whole set and proceed to divide it into successively smaller clusters.
Types of Clustering
• Partitional clustering – Construct a partition of a data set to produce several clusters
– All clusters are produced at once
– The process is repeated iteratively
– Until a termination condition is met
– Examples:
• K-means clustering
• Fuzzy c-means clustering
K MEANS – Example 2
• Suppose we have 4 medicines, each with two attributes (weight index and pH). Our goal is to group these objects into K = 2 groups of medicines.
Medicine | Weight index | pH
A | 1 | 1
B | 2 | 1
C | 4 | 3
D | 5 | 4
[Figure: the four medicines plotted as points A, B, C, D in the (weight index, pH) plane]
K MEANS – Example 2
Step 1: Compute the similarity between all samples and K centroids
Initial centroids (seeds): $c_1 = A = (1, 1)$, $c_2 = B = (2, 1)$.

Euclidean distance, e.g. for medicine $D = (5, 4)$:

$d(D, c_1) = \sqrt{(5-1)^2 + (4-1)^2} = 5$

$d(D, c_2) = \sqrt{(5-2)^2 + (4-1)^2} = 4.24$
Assign each object to the cluster with the nearest seed point
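As a concrete check of Step 1, the distance computation can be reproduced in a few lines (a minimal NumPy sketch; the variable names are ours):

```python
import numpy as np

# The four medicines as (weight index, pH) points.
X = np.array([[1, 1],   # A
              [2, 1],   # B
              [4, 3],   # C
              [5, 4]])  # D

# Seed centroids for K = 2: c1 = A, c2 = B.
C = np.array([[1, 1],
              [2, 1]])

# Euclidean distance from every sample to every centroid: a 4 x 2 matrix.
dist = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
print(dist[3])              # distances for D: [5.0, 4.2426...] as above
print(dist.argmin(axis=1))  # nearest centroid per object: [0 1 1 1]
```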
K MEANS – Example 2
Step 2 - Assign the sample to its closest cluster
Each element of the group matrix below is 1 if and only if the corresponding object is assigned to that group. With the distances above, A stays with $c_1$ while B, C and D fall to $c_2$:

$G^0 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \end{pmatrix}$
K MEANS – Example 2
Step 3: Re-calculate the K-centroids
Knowing the members of each cluster, now we compute the new centroid of each group based on these new memberships.
$c_1 = (1,\ 1)$

$c_2 = \left(\frac{2+4+5}{3},\ \frac{1+3+4}{3}\right) = \left(\frac{11}{3},\ \frac{8}{3}\right) = (3.67,\ 2.67)$
K MEANS – Example 2
• Step 4 – Repeat the above steps
Compute the distance of all objects to the new centroids
K MEANS – Example 2
• Step 4 – Repeat the above steps. Knowing the members of each cluster, we again compute the new centroid of each group based on the new memberships.
After reassignment, group 1 = {A, B} and group 2 = {C, D}, giving

$c_1 = \left(\frac{1+2}{2},\ \frac{1+1}{2}\right) = \left(1\tfrac{1}{2},\ 1\right)$

$c_2 = \left(\frac{4+5}{2},\ \frac{3+4}{2}\right) = \left(4\tfrac{1}{2},\ 3\tfrac{1}{2}\right)$
K MEANS – Example 2
• We obtain the same grouping as in the previous iteration: comparing the two reveals that no object moves to another group.
• Thus the k-means computation has reached stability and no more iterations are needed.
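The whole loop is small enough to sketch end to end; the fragment below reproduces the example (NumPy assumed, with the slide's choice of A and B as seeds in place of random initialization):

```python
import numpy as np

def kmeans(X, centroids, max_iter=100):
    """Plain k-means: alternate assignment and centroid update
    until the group assignments stop changing."""
    labels = None
    for _ in range(max_iter):
        # Steps 1-2: assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None] - centroids[None, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # grouping unchanged -> converged
        labels = new_labels
        # Step 3: recompute each centroid as the mean of its members.
        centroids = np.array([X[labels == k].mean(axis=0)
                              for k in range(len(centroids))])
    return labels, centroids

X = np.array([[1, 1], [2, 1], [4, 3], [5, 4]], dtype=float)  # A, B, C, D
labels, centroids = kmeans(X, X[:2].copy())  # seeds: A and B
print(labels)     # [0 0 1 1] -> {A, B} and {C, D}
print(centroids)  # [[1.5, 1.0], [4.5, 3.5]], as derived above
```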
K MEANS – Examples
D. Comaniciu and P. Meer, "Robust Analysis of Feature Spaces: Color Image Segmentation," 1997.
• Data points – RGB values of pixels
• Can be used for image segmentation
Hierarchical clustering
• Agglomerative and divisive clustering on the data set {a, b, c, d, e}

[Figure: agglomerative clustering (steps 0 → 4) first merges a with b and d with e, then c with {d, e}, and finally {a, b} with {c, d, e}; divisive clustering performs the same splits in reverse order (steps 4 → 0).]
Agglomerative clustering
1. Convert object attributes to a distance matrix.
2. Set each object as a cluster (thus if we have N objects, we will have N clusters at the beginning).
3. Repeat until the number of clusters is one (or a known number of clusters):
a. Merge the two closest clusters.
b. Update the distance matrix.
(A code sketch of these steps follows the figure below.)
[Figure: dendrogram for objects d1 … d5 – d1 merges with d2 and d4 with d5, then d3 joins {d4, d5}, and finally all objects form one cluster]
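A naive sketch of steps 1–3 with single-link merging, operating directly on a precomputed distance matrix (illustrative only; library implementations such as SciPy's are far more efficient):

```python
import numpy as np

def agglomerative_single_link(D, n_clusters=1):
    """Naive agglomerative clustering on a precomputed distance
    matrix D, merging the two closest clusters until n_clusters remain."""
    n = len(D)
    clusters = [[i] for i in range(n)]    # step 2: one cluster per object
    while len(clusters) > n_clusters:     # step 3
        # Step 3a: find the two closest clusters (single link).
        best = (np.inf, None, None)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(D[a][b] for a in clusters[i] for b in clusters[j])
                if d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Step 3b: merge them; the distance-matrix update is implicit,
        # since single-link distances are recomputed from D each pass.
        clusters[i] += clusters[j]
        del clusters[j]
    return clusters

D = [[0, 1, 4, 5],
     [1, 0, 3, 4],
     [4, 3, 0, 2],
     [5, 4, 2, 0]]
print(agglomerative_single_link(D, n_clusters=2))  # [[0, 1], [2, 3]]
```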
Starting Situation
• Start with clusters of individual points and a distance/proximity matrix
[Figure: five points p1 … p5 and the initial distance/proximity matrix, with one row and one column per point]
Intermediate situation
• After some merging steps, we have some clusters
[Figure: clusters C1 … C5 after several merging steps; the distance matrix is now indexed by the clusters C1 … C5]
Inter-cluster distance measures
• Single Link
• Average Link
• Complete Link
• Distance between centroids
Intermediate situation
• We want to merge the two closest clusters (C2 and C5) and update the distance matrix.
[Figure: the same clusters as above, with the closest pair C2 and C5 highlighted for merging]
Single link
• Smallest distance between an element in one cluster and an element in the other
$D(c_i, c_j) = \min_{x \in c_i,\ y \in c_j} D(x, y)$
Complete link
• Largest distance between an element in one cluster and an element in the other
$D(c_i, c_j) = \max_{x \in c_i,\ y \in c_j} D(x, y)$
Average Link
• Average distance between an element in one cluster and an element in the other

$D(c_i, c_j) = \operatorname{avg}_{x \in c_i,\ y \in c_j} D(x, y)$
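All three measures aggregate the same set of pairwise distances and differ only in the aggregation; a direct transcription of the three formulas (a sketch, NumPy assumed):

```python
import numpy as np

def _pairwise(ci, cj):
    """All Euclidean distances D(x, y) with x in ci and y in cj."""
    return np.array([np.linalg.norm(np.asarray(x) - np.asarray(y))
                     for x in ci for y in cj])

def single_link(ci, cj):    # min over all cross-cluster pairs
    return _pairwise(ci, cj).min()

def complete_link(ci, cj):  # max over all cross-cluster pairs
    return _pairwise(ci, cj).max()

def average_link(ci, cj):   # average over all cross-cluster pairs
    return _pairwise(ci, cj).mean()
```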
After Merging
• Update the distance matrix
[Figure: C2 and C5 merged into C2 ∪ C5; the row and column of the distance matrix for the merged cluster must be recomputed (entries marked "?")]
Example – Single link clustering
Dendrogram
Clustering is obtained by cutting the dendrogram at a desired level: each connected component forms a cluster.
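In practice the dendrogram and the cut are usually produced by a library; a minimal sketch with SciPy (the cut height 0.15 is an arbitrary illustrative level):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.random.rand(10, 2)        # any small 2-D data set
Z = linkage(X, method='single')  # single-link merge history
# Cut the dendrogram at height 0.15: every connected component
# below the cut becomes one cluster.
labels = fcluster(Z, t=0.15, criterion='distance')
```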
Single Link Clustering
[Figure: nested clusters and the corresponding dendrogram produced by single link clustering on six points; merge heights range from about 0.05 to 0.2]
Complete link Clustering
[Figure: nested clusters and the corresponding dendrogram produced by complete link clustering on the same six points; merge heights range up to about 0.4]
Average link clustering
[Figure: nested clusters and the corresponding dendrogram produced by average link clustering on the same six points; merge heights range up to about 0.25]
Comparison
[Figure: side-by-side comparison of the nested clusters produced by single link, complete link, and average link on the same six points]
C-Means Clustering
1. Choose the number of clusters and randomly select the centroids of each cluster.
2. For each data point:
• Calculate the distance from the data point to each cluster.
• Instead of assigning the data point completely to one cluster, use weights depending on the distance of that point from each cluster. The closer the cluster, the higher the weight, and vice versa.
• Re-compute the centers of the clusters using these weighted distances.
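A sketch of one such weighted iteration, using the standard fuzzy c-means membership rule $u_{ik} \propto d_{ik}^{-2/(m-1)}$ with fuzzifier $m = 2$ (an assumption; the slide does not spell out the weighting formula):

```python
import numpy as np

def fuzzy_cmeans_step(X, centroids, m=2.0, eps=1e-9):
    """One fuzzy c-means iteration: soft memberships from distances,
    then a weighted centroid update (fuzzifier m = 2 assumed)."""
    # Distance of every point to every centroid.
    d = np.linalg.norm(X[:, None] - centroids[None, :], axis=2) + eps
    # Membership weights: the closer the cluster, the higher the weight.
    u = 1.0 / (d ** (2 / (m - 1)))
    u /= u.sum(axis=1, keepdims=True)   # each point's weights sum to 1
    # Re-compute the centers from the weighted points.
    w = u ** m
    new_centroids = (w.T @ X) / w.sum(axis=0)[:, None]
    return new_centroids, u
```

Iterating this step until the centroids stop moving gives the full algorithm; unlike k-means, every point keeps a graded membership in every cluster.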
Mean Shift Algorithm
1. Choose a search window size.
2. Choose the initial location of the search window.
3. Compute the mean location (centroid of the data) in the search window.
4. Center the search window at the mean location computed in Step 3.
5. Repeat Steps 3 and 4 until convergence.
The mean shift algorithm seeks the “mode” or point of highest density of a data distribution:
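A minimal sketch of steps 1–5 with a flat circular window (the window radius is the scale parameter; the names and the flat kernel are our assumptions, since the slides do not fix a kernel):

```python
import numpy as np

def mean_shift_mode(X, start, radius, tol=1e-4, max_iter=500):
    """Steps 1-5: repeatedly re-center a window of the given radius
    on the mean of the data points it currently contains."""
    center = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        in_window = np.linalg.norm(X - center, axis=1) <= radius
        new_center = X[in_window].mean(axis=0)   # Step 3: mean location
        if np.linalg.norm(new_center - center) < tol:
            break                                # Step 5: converged
        center = new_center                      # Step 4: re-center
    return center  # approximate mode (densest point) near the start
```

Starting the window at every data point and grouping the points whose windows converge to the same location yields one cluster per attraction basin, as described below.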
Intuitive Description

[Animation: a distribution of identical billiard balls. A region of interest is placed on the data and its center of mass is computed; the mean shift vector moves the region of interest toward the center of mass. The steps repeat, frame by frame, until the region of interest settles on the densest region.]

Objective: Find the densest region
Clustering
Attraction basin: the region for which all trajectories lead to the same mode
Cluster: all data points in the attraction basin of a mode
Mean Shift Segmentation
Place a tiny mean shift window over each data point.
1. Grow the window and mean shift it.
2. Track windows that merge, along with the data they traversed.
3. Continue until everything is merged into one cluster.
Mean shift segmentation is scale (search window size) sensitive. Solution: use all scales.
Extension