AMELIORATION OF
K-MEANS ALGORITHM
K-MEANS ALGORITHM
The k-means algorithm is used to create and analyze
clusters.
It divides n data points into k clusters according to
a similarity measure.
However, the results depend strongly on the choice of
initial cluster centroids.
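The dependence on the initial centroids can be seen on a tiny example. The sketch below is a plain k-means loop, not the improved method from these slides; the function name `lloyd`, the use of Euclidean distance, and the four sample points are assumptions chosen for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+

def lloyd(points, centroids, max_iter=100):
    """Plain k-means iterations starting from the given initial centroids."""
    centroids = list(centroids)  # work on a copy
    labels = None
    for _ in range(max_iter):
        # Assign each point to its nearest centroid (ties go to the lowest index).
        new_labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # no point changed cluster: converged
            break
        labels = new_labels
        # Recompute each centroid as the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

# Four corners of a wide rectangle (illustrative data).
points = [(0, 0), (0, 1), (4, 0), (4, 1)]
left_right = lloyd(points, [(0, 0.5), (4, 0.5)])  # groups left pair / right pair
bottom_top = lloyd(points, [(2, 0), (2, 1)])      # stuck with bottom pair / top pair
```

Both runs converge, but the second start leaves each cluster spanning the full width of the rectangle, which is a poorer clustering of the same data.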
K means clustering algorithm
ADVANTAGES &
DISADVANTAGES
Advantages of the k-means algorithm:
1. Easy to implement and fast in practice
2. Scalable and efficient on large data collections
Disadvantages of the k-means algorithm:
1. Selecting the optimal number of clusters is difficult
2. The initial centroids are selected at random
PROBLEM DEFINITION
• In the original k-means algorithm, the resulting
set of clusters depends strongly on the initial
centroids, which are selected at random.
• In this project, we therefore propose a method
for calculating the initial centroids, with the aim
of making the k-means algorithm more efficient
and obtaining quality clustering with reduced
complexity.
PROPOSED SOLUTION
Phase-I: The input array of elements is scanned
and split into sub-arrays, which represent the
initial clusters.
Phase-II: The centroid of each initial cluster is
computed as the mean of its elements. A data element
whose distance to its own centroid is less than or
equal to its distance to every other centroid remains
in its cluster; otherwise it is moved to the cluster
with the nearest centroid. The process repeats until
no changes in the clusters are detected.
IMPROVED K-MEANS
ALGORITHM
The algorithm is divided into two phases. In Phase-I, we find the initial
clusters; in Phase-II, data elements are moved to the appropriate
clusters.
Phase-I: To find the initial clusters
INPUT: Array {a1, a2, a3,..., an}
OUTPUT: A set of Initial Clusters.
Steps:
1) Compute the cluster size Si = n/k,
where n = number of data points Dp (a1, a2, a3, ..., an)
and k = number of clusters.
2) Create k arrays Ak.
3) Move data points (Dp) from the input array to Ak until Ak holds Si points.
4) Repeat Step 3 for the next array until all Dp are removed from the input array.
5) Exit with k initial clusters.
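The Phase-I steps can be sketched as below. This is one possible reading, not the authors' implementation: the function name `initial_clusters` is hypothetical, and the slides do not say what happens when k does not divide n, so the sketch assumes Si is rounded down and the last array absorbs the leftover points.

```python
def initial_clusters(points, k):
    """Phase-I sketch: split the input array into k consecutive sub-arrays."""
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of data points")
    size = n // k  # S_i = n/k, rounded down (assumption)
    clusters = []
    for j in range(k):
        start = j * size
        # Assumption: the last array takes the remainder when k does not divide n.
        end = (j + 1) * size if j < k - 1 else n
        clusters.append(list(points[start:end]))
    return clusters

# Nine illustrative values split into three initial clusters of size 3.
print(initial_clusters([2, 4, 10, 12, 3, 20, 30, 11, 25], 3))
# [[2, 4, 10], [12, 3, 20], [30, 11, 25]]
```

Because the input is only scanned and split, the initial clusters depend on the order of the input array but not on any random choice.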
Phase-II: To find the final clusters
INPUT: A set of Initial Clusters.
OUTPUT: A set of k Clusters.
Steps:
1) Compute the arithmetic mean Mj of each initial cluster Cj.
2) For each j, 1 ≤ j ≤ k:
3) Compute the distance D from every Dp in Cj to the mean Mi of each
cluster Ci (1 ≤ i ≤ k).
4) If the distance from Dp to the mean of its own cluster is less than or
equal to its distance to every other mean, Dp stays in the same cluster;
otherwise Dp is assigned to the cluster Ci whose mean is nearest.
5) For each cluster Cj (1 ≤ j ≤ k), recompute Mj and repeat Steps 3 and 4
until no change in the clusters occurs.
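A minimal sketch of Phase-II for one-dimensional data follows. The function name `refine_clusters`, the use of absolute difference as the distance D, the iteration cap `max_iter`, and the handling of empty clusters are all assumptions; the slides do not specify them.

```python
def refine_clusters(clusters, max_iter=100):
    """Phase-II sketch for 1-D data: move points to the nearest mean until stable."""
    clusters = [list(c) for c in clusters]  # work on a copy
    for _ in range(max_iter):
        # Step 1: arithmetic mean of each cluster (None marks an empty cluster).
        means = [sum(c) / len(c) if c else None for c in clusters]
        new_clusters = [[] for _ in clusters]
        for j, cluster in enumerate(clusters):
            for x in cluster:
                # Steps 3-4: the point stays in cluster j on ties; it moves only
                # to a cluster whose mean is strictly nearer.
                best = j
                for i, m in enumerate(means):
                    if m is not None and abs(x - m) < abs(x - means[best]):
                        best = i
                new_clusters[best].append(x)
        if new_clusters == clusters:  # Step 5: stop when nothing moved
            break
        clusters = new_clusters
    return clusters

# Illustrative initial clusters, as a consecutive split of nine values.
final = refine_clusters([[2, 4, 10], [12, 3, 20], [30, 11, 25]])
print(final)  # [[2, 4, 3], [10, 12, 11], [20, 30, 25]]
```

In this example the initial means are about 5.33, 11.67 and 22; after one round of moves the means become 3, 11 and 25, and the second pass moves nothing, so the loop stops.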
APPLICATION
Rating-based clustering system.
On e-commerce sites, products can be clustered by
rating to optimize the enterprise's purchase-profit
ratio.
Useful for enhanced marketing and for devising sales
strategy.
THANK YOU!
