Prof. Neeraj Bhargava
Vishal Dutt
Department of Computer Science, School of
Engineering & System Sciences
MDS University, Ajmer
Hierarchical clustering
Agglomerative (bottom up)
1. Start with each point as its own cluster (singleton).
2. Recursively merge the two or more most appropriate clusters.
3. Stop when k clusters remain.
Divisive (top down)
1. Start with one big cluster containing all points.
2. Recursively divide it into smaller clusters.
3. Stop when k clusters are reached.
General steps of hierarchical clustering
Given a set of N items to be clustered, and an N×N
distance (or similarity) matrix, the basic process of
hierarchical clustering (defined by S.C. Johnson in
1967) is this:
1. Start by assigning each item to its own cluster, so that if
you have N items, you now have N clusters, each
containing just one item. Let the distances
(similarities) between the clusters be the same as the
distances (similarities) between the items they
contain.
2. Find the closest (most similar) pair of clusters and
merge them into a single cluster, so that you now
have one cluster fewer.
3. Compute the distances (similarities) between the new
cluster and each of the old clusters.
4. Repeat steps 2 and 3 until all items are clustered into
K clusters.
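The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `agglomerate`, the list-of-lists matrix, and the default `linkage=min` are assumptions made for the example, since the steps do not say how the distance between two clusters is derived from the item distances.

```python
from itertools import combinations

def agglomerate(dist, k, linkage=min):
    """Merge clusters until k remain.

    dist    : N x N symmetric distance matrix (list of lists)
    k       : number of clusters to stop at
    linkage : reduces item distances to one cluster distance
              (min is an assumption; the steps leave this open)
    """
    # Step 1: every item starts in its own cluster.
    clusters = [[i] for i in range(len(dist))]

    def cluster_distance(a, b):
        # Step 3, recomputed on demand from the item distances.
        return linkage(dist[i][j] for i in a for j in b)

    while len(clusters) > k:
        # Step 2: find the closest pair of clusters and merge it.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_distance(clusters[p[0]], clusters[p[1]]),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)
    # Step 4 is the loop itself; sort only to make the result easy to read.
    return sorted(sorted(c) for c in clusters)

# Five items on a line at positions 0, 1, 5, 6 and 20.
positions = [0, 1, 5, 6, 20]
dist = [[abs(p - q) for q in positions] for p in positions]
print(agglomerate(dist, k=3))
```

Recomputing cluster distances on every pass keeps the sketch short but is slow for large N; practical implementations update a stored matrix instead.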
Exclusive vs. non-exclusive
clustering
• In exclusive clustering, data are grouped in an
exclusive way, so that if a certain datum
belongs to a definite cluster then it cannot
be included in another cluster. A
simple example of that is shown in the
figure below, where the separation of
points is achieved by a straight line on a
two-dimensional plane.
• In contrast, the second type,
overlapping clustering, uses fuzzy sets to
cluster data, so that each point may
belong to two or more clusters with
different degrees of membership.
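The difference between the two cases can be made concrete with a small sketch. The function names are hypothetical, the data are one-dimensional for brevity, and the membership formula is the one used by fuzzy c-means with fuzzifier `m`, which is an assumption here: the slide only says that fuzzy sets are used, not which scheme.

```python
def hard_assignment(point, centers):
    """Exclusive clustering: the point belongs to exactly one cluster."""
    distances = [abs(point - c) for c in centers]
    return distances.index(min(distances))

def fuzzy_memberships(point, centers, m=2.0):
    """Overlapping clustering: one membership degree per cluster.

    Uses the fuzzy c-means membership formula (an assumption);
    the degrees are non-negative and sum to 1.
    """
    distances = [abs(point - c) for c in centers]
    zeros = distances.count(0.0)
    if zeros:
        # The point coincides with a center: full membership there.
        return [1.0 / zeros if d == 0.0 else 0.0 for d in distances]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_j / d_k) ** exponent for d_k in distances)
        for d_j in distances
    ]

centers = [0.0, 10.0]
print(hard_assignment(2.0, centers))    # index of the single owning cluster
print(fuzzy_memberships(2.0, centers))  # one degree per cluster
```

A point close to one center gets a membership near 1 for that cluster and near 0 for the other, while a point halfway between the two centers gets equal degrees.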
Partitioning clustering
1. Divide the data into proper subsets.
2. Iteratively go through each subset and relocate
points between clusters (the opposite of the visit-once
approach of hierarchical clustering).
This iterative relocation can yield higher-quality clusters.
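One familiar instance of relocation is the k-means loop, sketched below for one-dimensional points. The slide does not name a specific algorithm, so treating relocation as "move each point to its nearest center, then recompute the centers" is an assumption, as are the function name `relocate` and the `max_rounds` cap.

```python
def relocate(points, centers, max_rounds=100):
    """Iteratively reassign 1-D points to the nearest center (k-means style)."""
    centers = list(centers)
    assignment = None
    for _ in range(max_rounds):
        # Relocation step: each point moves to its nearest center.
        new_assignment = [
            min(range(len(centers)), key=lambda j: abs(p - centers[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the partition is stable
        assignment = new_assignment
        # Update step: each center becomes the mean of its points.
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its old center
                centers[j] = sum(members) / len(members)
    return assignment, centers

points = [1, 2, 3, 10, 11, 12]
labels, centers = relocate(points, centers=[1, 2])
print(labels, centers)
```

With the deliberately poor starting centers 1 and 2, the points 2 and 3 start in the second cluster and are relocated to the first on the next round, which is the revisiting that a hierarchical method does not do.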
Probabilistic clustering
1. Data are assumed to be drawn from a mixture of probability
distributions.
2. Use the mean and variance of each distribution as
the parameters of a cluster.
3. Each point has a single cluster membership.
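A minimal sketch of this idea, assuming one-dimensional Gaussian components: each cluster is described by a weight, a mean, and a variance, and each point is given the single component under which it is most probable. The function names, the tuple layout, and the variance floor are assumptions made for the example; a full mixture-model fit (for instance with EM) would alternate the two functions until the parameters settle.

```python
import math

def gaussian_pdf(x, mean, variance):
    """Density of a 1-D normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * variance)) / math.sqrt(
        2.0 * math.pi * variance
    )

def assign(points, components):
    """Give each point the index of its most probable component.

    components: list of (weight, mean, variance) tuples.
    """
    labels = []
    for x in points:
        scores = [w * gaussian_pdf(x, mu, var) for w, mu, var in components]
        labels.append(scores.index(max(scores)))
    return labels

def estimate(points, labels, components, min_variance=1e-6):
    """Re-estimate (weight, mean, variance) of each component from its points."""
    updated = []
    for j, old in enumerate(components):
        members = [x for x, a in zip(points, labels) if a == j]
        if not members:
            updated.append(old)  # keep the old parameters for an empty cluster
            continue
        mean = sum(members) / len(members)
        variance = sum((x - mean) ** 2 for x in members) / len(members)
        # Floor the variance so a one-point cluster does not divide by zero.
        updated.append((len(members) / len(points), mean, max(variance, min_variance)))
    return updated

points = [1.0, 2.0, 3.0, 10.0, 12.0]
components = [(0.5, 0.0, 1.0), (0.5, 10.0, 1.0)]  # initial guess
labels = assign(points, components)
print(labels)
print(estimate(points, labels, components))
```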
Single-Linkage Clustering (hierarchical)
• The N×N proximity matrix is D = [d(i,j)]
• The clusterings are assigned sequence
numbers 0, 1, ..., (N-1)
• L(k) is the level of the kth clustering
• A cluster with sequence number m is
denoted (m)
• The proximity between clusters (r) and (s)
is denoted d[(r),(s)]
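The notation above only sets the stage; the slide stops before the procedure itself. The sketch below fills that in with the usual single-linkage rule, which is an assumption here rather than something the slide states: merge the least dissimilar pair (r), (s), record its proximity as the level L(m), and set d[(k),(r,s)] = min(d[(k),(r)], d[(k),(s)]) for every other cluster (k). The function name and the tuple representation of clusters are hypothetical.

```python
def single_linkage(D):
    """Return a list of (m, L(m), clusters) for m = 0 .. N-1.

    D is the N x N proximity matrix [d(i,j)] as a list of lists.
    """
    clusters = [(i,) for i in range(len(D))]
    # d[(r),(s)] stored under the unordered pair {(r), (s)}.
    d = {
        frozenset((a, b)): D[a[0]][b[0]]
        for a in clusters for b in clusters if a != b
    }
    history = [(0, 0.0, list(clusters))]  # clustering 0 has level L(0) = 0
    for m in range(1, len(D)):
        # Find the least dissimilar pair of clusters (r), (s).
        closest = min(d, key=d.get)
        r, s = sorted(closest)
        level = d[closest]  # L(m) = d[(r),(s)]
        merged = tuple(sorted(r + s))
        others = [c for c in clusters if c not in (r, s)]
        # Single-linkage update: proximity to the new cluster is the minimum.
        for k in others:
            d[frozenset((k, merged))] = min(
                d[frozenset((k, r))], d[frozenset((k, s))]
            )
        # Drop every entry that still mentions (r) or (s).
        d = {pair: v for pair, v in d.items() if r not in pair and s not in pair}
        clusters = others + [merged]
        history.append((m, level, sorted(clusters)))
    return history

# Four items on a line at positions 0, 1, 3 and 7.
D = [[0, 1, 3, 7],
     [1, 0, 2, 6],
     [3, 2, 0, 4],
     [7, 6, 4, 0]]
for m, level, clusters in single_linkage(D):
    print(m, level, clusters)
```

Ties between equally close pairs are broken by dictionary order in this sketch, so inputs with ties may merge in a different order than another implementation would.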

12 types of clustering
