classification and clustering
Regular presentation on classification and clustering.
Clustering and Classification
Presented by:
Yogendra, Govinda, Lov, Sunena
Outline
• Background
• Classification
• Clustering
• Examples
• References
Background
• Clustering is “the process of organizing objects into groups whose members are similar in some way”.
• A cluster is therefore a collection of objects that are “similar” to one another and “dissimilar” to the objects belonging to other clusters.
Clustering
• Organizing objects into groups.
Cluster Analysis Components
• Similarity (distance) measure
• Clustering Algorithm
Similarity Measure
• Distance between two data points
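The slide does not say which distance is meant, so the following is only a sketch of two common choices, Euclidean and Manhattan distance. The function names are illustrative and do not come from the presentation.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two points of equal dimension."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan_distance(p, q):
    """Sum of absolute coordinate differences (city-block distance)."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return sum(abs(a - b) for a, b in zip(p, q))

print(euclidean_distance((0, 0), (3, 4)))  # 5.0
print(manhattan_distance((0, 0), (3, 4)))  # 7
```

A smaller distance means the two points are more similar, which is what the clustering algorithm relies on when it groups points.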
Clustering Algorithms
• Exclusive Clustering
  • Each data point belongs to exactly one cluster
  • E.g. K-Means algorithm
• Overlapping Clustering
  • Uses fuzzy sets for clustering
  • A single data point may belong to one or more clusters
• Hierarchical Clustering
  • Agglomerative (bottom-up)
  • Divisive (top-down)
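To make the difference between exclusive and overlapping clustering concrete, here is a small sketch. It assumes the distances from one point to each cluster centre are already known. The membership formula is the one commonly used in fuzzy c-means, with fuzzifier m = 2 as an assumed default; the slides do not name a specific fuzzy method.

```python
def hard_assignment(distances):
    """Exclusive clustering: index of the single nearest cluster."""
    return min(range(len(distances)), key=lambda j: distances[j])

def fuzzy_memberships(distances, m=2.0):
    """Overlapping clustering: membership degrees in [0, 1] that sum to 1."""
    # A point lying exactly on a centre belongs fully to that centre.
    if any(d == 0 for d in distances):
        hits = [1.0 if d == 0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_j / d_k) ** exponent for d_k in distances)
        for d_j in distances
    ]

distances = [1.0, 3.0]               # distance to cluster 0 and to cluster 1
print(hard_assignment(distances))    # the point goes to cluster 0 only
print(fuzzy_memberships(distances))  # mostly cluster 0, partly cluster 1
```

With exclusive clustering the point is given to cluster 0 outright, while the fuzzy version keeps a small degree of membership in cluster 1.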
Clustering Algorithms
• Agglomerative
• Divisive
K-Means Algorithm
Minimization of Squared Error Function
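The formula itself did not survive in this transcript. The squared error function is usually written as below; the symbols are assumptions chosen here: k is the number of clusters, C_j is the set of points in cluster j, and c_j is its centroid.

```latex
J = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - c_j \rVert^{2}
```

K-Means looks for the assignment of points to clusters, and the centroid positions, that make J as small as possible.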
Demo
Here we have a dataset!
We randomly choose 2 group centroids!
We assign each point to the group that has the closest centroid.
We recalculate the positions of the centroids.
We assign each point to the group that has the closest centroid.
We recalculate the positions of the centroids.
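The demo steps above can be sketched in plain Python as follows. This is a minimal illustration, not the code used in the presentation. It assumes the initial centroids are picked at random from the data points and that the loop stops once the assignments no longer change; the function and parameter names are illustrative.

```python
import random

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def k_means(points, k=2, max_iters=100, seed=0):
    rng = random.Random(seed)
    # Randomly choose k group centroids (here: k distinct data points).
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iters):
        # Assign each point to the group that has the closest centroid.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # nothing moved, so the centroids would not change either
        assignments = new_assignments
        # Recalculate the positions of the centroids.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old centroid if a group ends up empty
                centroids[j] = mean_point(members)
    return centroids, assignments

data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = k_means(data, k=2)
print(centroids)
print(labels)
```

On this small dataset the two tight groups of three points end up in separate clusters. In general the result of K-Means depends on the initial centroids, so it is common to run it several times with different seeds.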
Other Approaches
Hierarchical clustering
• Agglomerative (bottom-up)
1. Start with every point as its own cluster (singleton)
2. Recursively merge the two or more most appropriate (closest) clusters
3. Stop when k clusters remain.
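A rough sketch of the bottom-up procedure is given below. It assumes single linkage (the distance between two clusters is the distance between their closest pair of points) and merges exactly two clusters per step; the slides do not specify either choice. `math.dist` requires Python 3.8 or later.

```python
import math
from itertools import combinations

def cluster_distance(a, b):
    """Single linkage: distance between the closest pair, one point from each cluster."""
    return min(math.dist(p, q) for p in a for q in b)

def agglomerative(points, k):
    if k < 1:
        raise ValueError("k must be at least 1")
    clusters = [[p] for p in points]   # 1. every point starts as a singleton
    while len(clusters) > k:           # 3. stop when k clusters remain
        # 2. find the two closest clusters and merge them
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: cluster_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i].extend(clusters[j])
        del clusters[j]                # j > i, so index i is still valid
    return clusters

data = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
print(agglomerative(data, k=3))
```

This version recomputes all pairwise cluster distances at every step, which is fine for a demo but slow on large datasets.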
Hierarchical clustering
• Divisive (top-down)
1. Start with one big cluster that contains all points
2. Recursively divide it into smaller clusters
3. Stop when k clusters are reached.
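The top-down procedure can be sketched in a similar way. The splitting rule below is an assumption made for illustration only: the cluster with the largest diameter is split around its two farthest-apart points, and every other point joins whichever of those two it is nearer to. The slides do not say how a cluster should be divided. `math.dist` requires Python 3.8 or later.

```python
import math
from itertools import combinations

def farthest_pair(cluster):
    """The two points of a cluster that are farthest apart."""
    return max(combinations(cluster, 2), key=lambda pair: math.dist(*pair))

def divisive(points, k):
    if k < 1:
        raise ValueError("k must be at least 1")
    clusters = [list(points)]          # 1. start with one big cluster
    while len(clusters) < k:           # 3. stop when k clusters are reached
        # Only clusters with at least two distinct points can be split.
        splittable = [c for c in clusters if len(set(c)) > 1]
        if not splittable:
            break
        # 2. split the cluster with the largest diameter
        target = max(splittable, key=lambda c: math.dist(*farthest_pair(c)))
        seed_a, seed_b = farthest_pair(target)
        left = [p for p in target if math.dist(p, seed_a) <= math.dist(p, seed_b)]
        right = [p for p in target if math.dist(p, seed_a) > math.dist(p, seed_b)]
        clusters = [c for c in clusters if c is not target] + [left, right]
    return clusters

data = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
print(divisive(data, k=3))
```

On this sample the outlier (20, 20) is separated first, and the remaining points are then split into the two pairs.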
References
• www-users.cs.umn.edu/~kumar/.../chap8_basic_cluster_analysis.pdf
• http://en.wikipedia.org/wiki/Cluster_analysis
• https://www.cs.duke.edu/courses/fall03/cps260/notes/lecture18.pdf
• http://www.matlab-cookbook.com/recipes/0100_Statistics/150_kmeans_clustering.html
• http://www.cs.utah.edu/~germain/PPS/Topics/Matlab/plot.html
• http://www.mathworks.com/help/stats/k-means-clustering.html