K Means Clustering
What is K-Means Algorithm?
K-Means Clustering is an unsupervised learning algorithm that groups an unlabelled
dataset into different clusters.
Here K is the number of clusters to be created, chosen in advance: if K=2 there will
be two clusters, for K=3 there will be three clusters, and so on.
It is an iterative algorithm that divides the unlabelled dataset into K different clusters
in such a way that each data point belongs to only one group, whose members have similar properties.
It is a centroid-based algorithm, where each cluster is associated with a centroid. The
main aim of this algorithm is to minimize the sum of squared distances between the data points
and the centroids of their corresponding clusters.
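In symbols, this objective is usually written as the within-cluster sum of squared Euclidean distances. The notation below is introduced here for illustration and does not appear in the slides: C_k is the set of points assigned to cluster k, and mu_k is that cluster's centroid.

```latex
J = \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert^2,
\qquad
\mu_k = \frac{1}{|C_k|} \sum_{x_i \in C_k} x_i
```

Each iteration of the algorithm can only lower J or leave it unchanged, which is why the procedure eventually stops.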
The k-means clustering algorithm mainly performs two tasks:
1. Determines the best positions for the K center points (centroids) by an iterative process.
2. Assigns each data point to its closest centroid. The data points nearest to a
particular centroid form a cluster.
How does the K-Means Algorithm Work?
The working of the K-Means algorithm is explained in the steps below:
Step-1: Select the number K to decide the number of clusters.
Step-2: Select K random points as the initial centroids. (They need not be points from the input dataset.)
Step-3: Assign each data point to its closest centroid, which forms the predefined K
clusters.
Step-4: Compute the mean of the points in each cluster and place that cluster's new centroid there.
Step-5: Repeat the third step, which means reassigning each data point to the new closest
centroid.
Step-6: If any reassignment occurred, go to Step-4; otherwise go to FINISH.
Step-7: The model is ready.
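The steps above can be sketched in plain NumPy. This is a minimal illustration, not the scikit-learn implementation used later: the function name kmeans_sketch, its parameters, and the synthetic data are all assumptions made for the example, and the initial centroids are drawn from the data points for simplicity.

```python
import numpy as np

def kmeans_sketch(points, k, max_iter=100, seed=0):
    """Plain NumPy K-Means following the listed steps (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Step 2: pick k distinct data points as the starting centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.full(len(points), -1)
    for _ in range(max_iter):
        # Steps 3 and 5: assign each point to its closest centroid.
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = distances.argmin(axis=1)
        # Step 6: stop once no point changes cluster.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 4: move each centroid to the mean of its assigned points
        # (an empty cluster keeps its previous centroid).
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return labels, centroids

# Two well-separated synthetic blobs (made-up data for the demo).
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(20, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(20, 2))
data = np.vstack([blob_a, blob_b])

labels, centroids = kmeans_sketch(data, k=2)
print(labels)
print(centroids)
```

On data this cleanly separated, each blob should end up with a single label; on real data the result can depend on the random starting centroids.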
Let's take the number of clusters K=2 and try to group the data points into two different
clusters.
Now we will assign each data point of the scatter plot to its
closest K-point or centroid.
From the above image, it is clear that the points to the left of
the line are nearer to K1, the blue centroid, and the points to
the right of the line are closer to the yellow centroid.
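A small sketch of why a single line separates the two groups when K=2, assuming the line in the figure is the perpendicular bisector of the segment joining the two centroids. The centroid positions and points below are hypothetical values chosen only for illustration.

```python
import numpy as np

# Hypothetical centroid positions and points, chosen only for illustration.
k1 = np.array([0.0, 0.0])   # "blue" centroid
k2 = np.array([4.0, 0.0])   # "yellow" centroid
points = np.array([[1.0, 2.0], [1.5, -1.0], [3.0, 0.5], [5.0, 3.0]])

# Assignment by comparing distances to the two centroids.
d1 = np.linalg.norm(points - k1, axis=1)
d2 = np.linalg.norm(points - k2, axis=1)
by_distance = np.where(d1 <= d2, 0, 1)

# Assignment by checking which side of the perpendicular bisector a point is on.
midpoint = (k1 + k2) / 2
direction = k2 - k1
side = (points - midpoint) @ direction
by_line = np.where(side <= 0, 0, 1)

print(by_distance, by_line)
```

Both rules give the same assignment, which is why the figure can show the split as one straight line.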
As we need to find the closest cluster, we repeat
the process by choosing new centroids.
As our model is now ready, we can remove the
assumed centroids, and the two final clusters will be as
shown in the image below:
In the given dataset, we have Customer_Id, Gender, Age, Annual Income ($), and Spending
Score (a calculated value reflecting how much a customer has spent in the mall: the higher
the value, the more the customer has spent).
From this dataset we need to discover some patterns. As this is an unsupervised method, we
don't know in advance exactly what to look for.
 The steps to be followed for the implementation are given below:
 Data Pre-processing
 Finding the optimal number of clusters using the elbow method
 Training the K-means algorithm on the training dataset
 Visualizing the clusters
We import numpy for performing mathematical calculations, matplotlib for
plotting the graph, and pandas for managing the dataset.
Importing the Dataset:
Next, we import the dataset that we need to use. Here we are using the
Mall_Customers_data.csv dataset. The libraries and the dataset can be imported using the code below:
import numpy as nm  #numerical calculations
import matplotlib.pyplot as mtp  #plotting
import pandas as pd  #loading and managing the dataset
dataset = pd.read_csv('Mall_Customers_data.csv')
print(dataset)
Extracting Independent Variables
Here we don't need any dependent variable for the data pre-processing step, as this is a clustering
problem and we have no target to predict. So we just add one line of code
for the matrix of features, selecting the Annual Income and Spending Score columns (column indices 3 and 4):
x = dataset.iloc[:, [3, 4]].values
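The feature-matrix step can be tried on a tiny stand-in for the mall dataset. The rows below are made up, and the column names are assumed from the description of the dataset; the real CSV header may differ.

```python
import pandas as pd

# Tiny made-up stand-in for the mall dataset; the column names are assumed
# from the dataset description and may differ from the real CSV header.
dataset = pd.DataFrame({
    "Customer_Id": [1, 2, 3, 4],
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [19, 21, 35, 52],
    "Annual Income": [15, 16, 60, 88],
    "Spending Score": [39, 81, 50, 13],
})

# Position-based selection: columns 3 and 4 are income and spending score.
x = dataset.iloc[:, [3, 4]].values

# Name-based selection gives the same matrix and survives column reordering.
x_by_name = dataset[["Annual Income", "Spending Score"]].to_numpy()

print(x)
```

Selecting by name is a design choice worth considering when the column order of the file is not guaranteed.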
Finding the optimal number of clusters using the elbow method
In the second step, we will try to find the optimal number of clusters for our clustering
problem. As mentioned above, we use the elbow method for this
purpose.
The elbow method uses the WCSS (Within-Cluster Sum of Squares) concept: it plots WCSS values on the Y-axis
against the number of clusters on the X-axis. The value of k at which the curve bends (the "elbow") is taken as the number of clusters.
So we are going to calculate the value of WCSS for different k values ranging from 1 to 10.
Below is the code for it:
#finding optimal number of clusters using the elbow method
from sklearn.cluster import KMeans
wcss_list = []  #Initializing the list for the values of WCSS

#Using for loop for iterations from 1 to 10.
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i, init='k-means++', random_state=42)
    kmeans.fit(x)
    wcss_list.append(kmeans.inertia_)  #inertia_ is the WCSS of the fitted model
mtp.plot(range(1, 11), wcss_list)
mtp.title('The Elbow Method Graph')
mtp.xlabel('Number of clusters(k)')
mtp.ylabel('WCSS')
mtp.show()
We have created the wcss_list variable as an empty list, which is used to hold
the value of WCSS computed for each value of k from 1 to 10.
After that, we have written a for loop that iterates over the values of k
from 1 to 10; since range() in Python excludes its upper bound, the bound is given as 11 to
include the 10th value.
The rest of the code is similar to what we did in earlier topics: we fit the model
on the matrix of features and then plot the graph of WCSS against the number of
clusters.
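To make the link between WCSS and the inertia_ attribute concrete, here is a hedged check on synthetic data: it recomputes the sum of squared distances by hand and compares it with what scikit-learn reports. The blob data and the explicit n_init=10 are assumptions made for this example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic 2-D data: three blobs (made up for this check).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [6.0, 0.0], [3.0, 5.0]])
x = np.vstack([rng.normal(loc=c, scale=0.5, size=(30, 2)) for c in centers])

kmeans = KMeans(n_clusters=3, init="k-means++", n_init=10, random_state=42)
labels = kmeans.fit_predict(x)

# WCSS: squared distance from every point to the centroid of its own cluster.
assigned_centroids = kmeans.cluster_centers_[labels]
wcss_manual = float(((x - assigned_centroids) ** 2).sum())

print(wcss_manual, kmeans.inertia_)
```

The two printed numbers should agree up to floating-point rounding.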
Training the K-means algorithm on the training dataset
As we have now got the number of clusters (5 in this case), we can train the model on the
dataset.
#training the K-means model on a dataset
kmeans = KMeans(n_clusters=5, init='k-means++', random_state=42)
y_predict = kmeans.fit_predict(x)  #cluster index (0 to 4) for each row of x
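A short sketch of what fit_predict returns and how a fitted model can be reused, run on six made-up (income, spending score) pairs rather than the mall data. The variable names new_customer and new_label are introduced only for this example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic (income, spending score) pairs in two obvious groups; made-up data.
x = np.array([[15, 80], [17, 75], [20, 85],
              [85, 15], [90, 20], [88, 10]], dtype=float)

kmeans = KMeans(n_clusters=2, init="k-means++", n_init=10, random_state=42)
y_predict = kmeans.fit_predict(x)        # one cluster index per row of x

# fit_predict(x) gives the same labels as fit(x) followed by reading labels_.
same_as_labels = np.array_equal(y_predict, kmeans.labels_)

# A fitted model can also place a new, unseen customer in the nearest cluster.
new_customer = np.array([[18.0, 78.0]])
new_label = kmeans.predict(new_customer)[0]

print(y_predict, new_label)
```

The cluster indices themselves are arbitrary: which group is called 0 and which is called 1 can change between runs or library versions.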
Visualizing the Clusters
The last step is to visualize the clusters. As we have 5 clusters for our model,
we will plot each cluster one by one.
#visualizing the clusters
mtp.scatter(x[y_predict == 0, 0], x[y_predict == 0, 1], s = 100, c = 'blue', label = 'Cluster 1') #for first cluster
mtp.scatter(x[y_predict == 1, 0], x[y_predict == 1, 1], s = 100, c = 'green', label = 'Cluster 2') #for second cluster
mtp.scatter(x[y_predict == 2, 0], x[y_predict == 2, 1], s = 100, c = 'red', label = 'Cluster 3') #for third cluster
mtp.scatter(x[y_predict == 3, 0], x[y_predict == 3, 1], s = 100, c = 'cyan', label = 'Cluster 4') #for fourth cluster
mtp.scatter(x[y_predict == 4, 0], x[y_predict == 4, 1], s = 100, c = 'magenta', label = 'Cluster 5') #for fifth cluster
mtp.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s = 300, c = 'yellow', label = 'Centroid') #cluster centres
mtp.title('Clusters of customers')
mtp.xlabel('Annual Income (k$)')
mtp.ylabel('Spending Score (1-100)')
mtp.legend()
mtp.show()
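The five near-identical scatter calls can also be written as a loop. The sketch below is one possible arrangement, run on made-up blob data instead of the mall dataset; the Agg backend is selected only so that it runs without a display, and the names colors, mask, and cluster_id are introduced here.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans

# Made-up 2-D data standing in for (annual income, spending score).
rng = np.random.default_rng(0)
centers = np.array([[20, 20], [20, 80], [55, 50], [90, 20], [90, 80]], dtype=float)
x = np.vstack([rng.normal(loc=c, scale=4.0, size=(25, 2)) for c in centers])

n_clusters = 5
kmeans = KMeans(n_clusters=n_clusters, init="k-means++", n_init=10, random_state=42)
y_predict = kmeans.fit_predict(x)

colors = ["blue", "green", "red", "cyan", "magenta"]
fig, ax = plt.subplots()
for cluster_id in range(n_clusters):
    # Boolean mask selecting the rows that belong to this cluster.
    mask = y_predict == cluster_id
    ax.scatter(x[mask, 0], x[mask, 1], s=100, c=colors[cluster_id],
               label=f"Cluster {cluster_id + 1}")
ax.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1],
           s=300, c="yellow", label="Centroid")
ax.set_title("Clusters of customers")
ax.set_xlabel("Annual Income (k$)")
ax.set_ylabel("Spending Score (1-100)")
ax.legend()
# Call fig.savefig("clusters.png") or switch backend and use plt.show() to view it.
```

Written this way, changing the number of clusters only requires changing n_clusters and the colour list.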
Cluster 1 shows the customers with average income and average spending; we simply refer to
these customers as Cluster 1.
Cluster 2 shows customers with high income but low spending, so we can
categorize them as careful.
Cluster 3 shows customers with low income and also low spending, so they can be categorized as
sensible.
Cluster 4 shows customers with low income but very high spending, so they can be
categorized as careless.
Cluster 5 shows customers with high income and high spending, so they can be
categorized as target; these customers can be the most profitable customers for
the mall owner.
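Labels such as careful or target come from reading the plot, but the same judgement can be backed by numbers. A minimal sketch, assuming cluster labels have already been attached to each customer; the rows, the column names income, spending, and cluster, and the label values are all made up for illustration.

```python
import pandas as pd

# Made-up customers with cluster labels already assigned (illustration only).
customers = pd.DataFrame({
    "income": [15, 18, 85, 90, 88, 20],
    "spending": [80, 75, 15, 20, 90, 10],
    "cluster": [3, 3, 1, 1, 4, 2],
})

# Average income and spending per cluster, plus how many customers each holds.
profile = customers.groupby("cluster").agg(
    mean_income=("income", "mean"),
    mean_spending=("spending", "mean"),
    size=("income", "size"),
)

print(profile)
```

A cluster with high mean income and low mean spending would correspond to the careful group described above, and so on for the others.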
import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd
dataset = pd.read_csv('Mall_Customers_data.csv')
print(dataset)
#matrix of features: Annual Income and Spending Score columns
x = dataset.iloc[:, [3, 4]].values
print(x)
#finding optimal number of clusters using the elbow method
from sklearn.cluster import KMeans
wcss_list = []  #Initializing the list for the values of WCSS
#Using for loop for iterations from 1 to 10.
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i, init='k-means++', random_state=42)
    kmeans.fit(x)
    wcss_list.append(kmeans.inertia_)
mtp.plot(range(1, 11), wcss_list)
mtp.title('The Elbow Method Graph')
mtp.xlabel('Number of clusters(k)')
mtp.ylabel('WCSS')
mtp.show()
#training the K-means model on a dataset
kmeans = KMeans(n_clusters=5, init='k-means++', random_state=42)
y_predict = kmeans.fit_predict(x)
#visualizing the clusters
mtp.scatter(x[y_predict == 0, 0], x[y_predict == 0, 1], s = 100, c = 'blue', label = 'Cluster 1') #for first cluster
mtp.scatter(x[y_predict == 1, 0], x[y_predict == 1, 1], s = 100, c = 'green', label = 'Cluster 2') #for second cluster
mtp.scatter(x[y_predict == 2, 0], x[y_predict == 2, 1], s = 100, c = 'red', label = 'Cluster 3') #for third cluster
mtp.scatter(x[y_predict == 3, 0], x[y_predict == 3, 1], s = 100, c = 'cyan', label = 'Cluster 4') #for fourth cluster
mtp.scatter(x[y_predict == 4, 0], x[y_predict == 4, 1], s = 100, c = 'magenta', label = 'Cluster 5') #for fifth cluster
mtp.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s = 300, c = 'yellow', label = 'Centroid')
mtp.title('Clusters of customers')
mtp.xlabel('Annual Income (k$)')
mtp.ylabel('Spending Score (1-100)')
mtp.legend()
mtp.show()
K Means Clustering in ML.pptx

