CLUSTERING
Presented By:
SHARBANI DEY
LIPIKA SAHA
INTRODUCTION
• Clustering is an unsupervised learning method of data abstraction: it works on unlabeled data.
• Clustering is the task of identifying groups of similar data points in a dataset.
• Objects are grouped on the basis of similarity and dissimilarity: objects in the
same cluster are similar to one another and dissimilar to objects in other clusters.
TYPES OF CLUSTERING
• Hard Clustering
In hard clustering, each data point belongs to exactly one cluster: it is either
completely in a cluster or not in it at all.
• Soft Clustering
In soft clustering, a data point can belong to multiple clusters at once,
typically with a degree of membership for each.
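The difference between the two types can be illustrated with a small sketch. The function names, the sample centroids, and the use of inverse-distance weights as the soft membership are illustrative assumptions, not something the slides prescribe.

```python
import math

def hard_assign(point, centroids):
    """Hard clustering: return the index of the single nearest centroid."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))

def soft_assign(point, centroids):
    """Soft clustering: return one membership weight per centroid, summing to 1.

    Inverse-distance weighting is an assumption made for illustration.
    """
    distances = [math.dist(point, c) for c in centroids]
    zeros = [d == 0.0 for d in distances]
    if any(zeros):
        # The point sits exactly on a centroid: share full membership there.
        return [1.0 / sum(zeros) if z else 0.0 for z in zeros]
    inverse = [1.0 / d for d in distances]
    total = sum(inverse)
    return [w / total for w in inverse]

centroids = [(0.0, 0.0), (4.0, 0.0)]
point = (1.0, 0.0)
print(hard_assign(point, centroids))  # index of the one cluster the point belongs to
print(soft_assign(point, centroids))  # a weight for every cluster
```

The hard assignment returns a single cluster index, while the soft assignment returns a weight for every cluster, so the same point contributes to more than one cluster.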
CLUSTERING METHODS
Density-Based Methods:
These methods search the data space for regions where data points are densely
packed, separated by regions of lower density. Each dense region forms a cluster.
Hierarchical-Based Methods:
In these methods, the clusters form a tree-type structure based on the hierarchy.
New clusters are formed from previously formed ones.
They are divided into two categories:
• Agglomerative
• Divisive
Partitioning-Based Methods:
These methods partition the objects into k groups, and each partition forms
one cluster.
Example: K-means
Grid-Based Methods:
In these methods, the data space is divided into a finite number of cells
that form a grid-like structure.
K Means Clustering
• It is an algorithm that groups similar elements or data points into clusters.
• The number of clusters is denoted by k.
• It assumes that each object is described by a vector of features that are
already provided, so that distances between objects can be computed.
K Means Clustering Algorithm
Step 1: Initialize k points, called means (or centroids), randomly.
Step 2: Assign each item to its closest mean, then update that mean's
coordinates to the average of the items assigned to it so far.
Step 3: Repeat the process for a given number of iterations. At the end,
the items assigned to each mean form the clusters.
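The three steps above can be sketched in Python as follows. This is a minimal sketch of the common batch variant, which recomputes each mean only after all items have been assigned rather than after every single item. The function name k_means, the seed parameter, and the sample points are illustrative assumptions.

```python
import math
import random

def k_means(points, k, iterations=10, seed=0):
    """Batch k-means on a list of numeric tuples; returns (means, labels)."""
    rng = random.Random(seed)
    # Step 1: pick k data points at random as the initial means.
    means = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Step 2a: assign each item to its closest mean.
        labels = [
            min(range(k), key=lambda j: math.dist(p, means[j]))
            for p in points
        ]
        # Step 2b: move each mean to the average of its assigned items.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old mean if a cluster ends up empty
                means[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    # Step 3: after the fixed number of iterations, the labels define the clusters.
    return means, labels

points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
means, labels = k_means(points, k=2)
print(means)
print(labels)
```

Because the initial means are random, the cluster numbering can change with the seed, but for well-separated data like this the grouping itself should not.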
Example of K-means Clustering
Consider the following table:
Individual   Height   Weight
1            185      72
2            170      56
3            168      60
4            179      68
5            182      72
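Step 1: Choose two initial centroids for the two clusters (picked at random; here they are individuals 1 and 2)
k1=(185,72)
k2=(170,56)
Step 2: Using these centroids, compute the Euclidean distance from the 3rd point (168,60) to each centroid
ED=sqrt[(x0-xc)^2+(y0-yc)^2]
where (x0,y0) is the data point and (xc,yc) is the centroid.
Distance to k1=sqrt[(168-185)^2+(60-72)^2]
Distance to k1=20.81
Distance to k2=sqrt[(168-170)^2+(60-56)^2]
Distance to k2=4.47
Since 4.47 < 20.81, individual 3 belongs to k2
Step 3: Calculate the new centroid of k2 as the mean of individuals 2 and 3
k2=[(170+168)/2 , (56+60)/2]
k2=(169,58)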
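Repeating Step 2 and Step 3 for individuals 4 and 5 gives the distances below.
Each distance is measured against the centroids at the time the individual is assigned:
k1=(185,72), k2=(169,58) for individual 4, and k1=(182,70), k2=(169,58) for individual 5.
Individual   Distance to k1   Distance to k2
3            20.81            4.47
4            7.21             14.14
5            2.00             19.10
Final clusters:
K1={1,4,5}
K2={2,3}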
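The arithmetic in this example can be checked with a short script. It is a sketch that assumes individuals 1 and 2 are the starting centroids and that a centroid is updated as soon as an individual joins its cluster. The variable names are illustrative.

```python
import math

data = {1: (185, 72), 2: (170, 56), 3: (168, 60), 4: (179, 68), 5: (182, 72)}

# Individuals 1 and 2 seed the two clusters, as in Step 1 of the example.
clusters = {"k1": [1], "k2": [2]}
centroids = {"k1": data[1], "k2": data[2]}

for individual in (3, 4, 5):
    point = data[individual]
    # Step 2: Euclidean distance from the point to each current centroid.
    distances = {name: math.dist(point, c) for name, c in centroids.items()}
    nearest = min(distances, key=distances.get)
    clusters[nearest].append(individual)
    # Step 3: recompute the centroid as the mean of the members assigned so far.
    members = [data[i] for i in clusters[nearest]]
    centroids[nearest] = tuple(sum(axis) / len(members) for axis in zip(*members))
    print(individual, {n: round(d, 2) for n, d in distances.items()}, "->", nearest)

print(clusters)
```

The printed distances are rounded to two decimals, and the final line lists which individuals ended up in each cluster.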
Hierarchical Clustering
• Hierarchical clustering builds successive clusters from previously
established clusters.
• It makes no assumption about the number of clusters.
Agglomerative Hierarchical Clustering
• Initially, every data point is treated as an individual cluster.
• At every step, the nearest pair of clusters is merged.
• It is a bottom-up method.
• Merging continues until a single cluster containing all data points
is formed.
Example of Agglomerative Hierarchical Clustering
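As an illustration of the bottom-up merging, here is a minimal sketch. It assumes single linkage (the distance between two clusters is the distance between their closest points), which is only one of several possible choices, and the function names and sample points are made up for the example.

```python
import math

def single_linkage(a, b):
    """Distance between two clusters: the closest pair of points across them."""
    return min(math.dist(p, q) for p in a for q in b)

def agglomerative(points):
    """Merge the nearest pair of clusters until one remains; return each merge."""
    clusters = [[p] for p in points]  # every point starts as its own cluster
    history = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: single_linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[i] + clusters[j]
        history.append(merged)
        # Delete the higher index first so the lower index stays valid.
        del clusters[j]
        del clusters[i]
        clusters.append(merged)
    return history

points = [(1.0, 1.0), (1.2, 1.0), (5.0, 5.0), (5.0, 5.5), (9.0, 1.0)]
for step, merged in enumerate(agglomerative(points), start=1):
    print(step, merged)
```

Each printed line is one merge. Reading the merges from first to last traces the tree from the individual points up to the single final cluster.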
Divisive Hierarchical Clustering
Divisive hierarchical clustering is the opposite of agglomerative
hierarchical clustering: it is a top-down method.
We start with all of the data points in a single cluster.
In every iteration, we split off the data points that are least similar
to the rest of their cluster.
In the end, we are left with N clusters, one for each of the N data points.
Example of Divisive Hierarchical Clustering
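As an illustration of the top-down splitting, here is a minimal sketch. The splitting rule is an assumption chosen for simplicity: always split the cluster with the largest diameter, using its two farthest-apart points as seeds. The function names and sample points are made up for the example, and the points are assumed to be distinct.

```python
import math
from itertools import combinations

def diameter(cluster):
    """Largest pairwise distance inside a cluster (0 for a single point)."""
    return max((math.dist(p, q) for p, q in combinations(cluster, 2)), default=0.0)

def split(cluster):
    """Split a cluster in two around its two farthest-apart points."""
    a, b = max(combinations(cluster, 2), key=lambda pair: math.dist(*pair))
    left = [p for p in cluster if math.dist(p, a) <= math.dist(p, b)]
    right = [p for p in cluster if math.dist(p, a) > math.dist(p, b)]
    return left, right

def divisive(points):
    """Top-down clustering: return the list of clusters after each split."""
    clusters = [list(points)]  # all points start in one cluster
    levels = [list(clusters)]
    while any(len(c) > 1 for c in clusters):
        widest = max(clusters, key=diameter)  # assumed rule: split the widest cluster
        clusters.remove(widest)
        clusters.extend(split(widest))
        levels.append(list(clusters))
    return levels

points = [(1.0, 1.0), (1.5, 1.0), (5.0, 5.0), (5.0, 6.0), (9.0, 2.0)]
for level in divisive(points):
    print(level)
```

The first printed line is the single starting cluster and the last line has one cluster per data point, matching the N clusters described above.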
THANK YOU
