K-Nearest Neighbors
(K-NN)
By Mohamed Gamal
Agenda
▪ What is K-NN?
▪ How does K-NN work?
▪ K-NN algorithm and structure
▪ Advantages of K-NN
▪ Disadvantages of K-NN
▪ Example
What is KNN?
▪ K-Nearest Neighbors (KNN) is a simple yet powerful supervised, non-linear machine learning algorithm used for classification and regression tasks.
▪ It is a non-parametric algorithm: it makes no assumptions about the underlying data distribution and does not learn a model during a training phase.
▪ Instead, it memorizes the entire training dataset and makes predictions based on the similarity of new data points to the known data points.
How does KNN work? (Algorithm)
▪ Step-1: Select the number K of neighbors.
▪ Step-2: Calculate the distance from the new data point to every point in the training set.
▪ Step-3: Take the K nearest neighbors according to the calculated distances.
▪ Step-4: Among these K neighbors, count the number of data points in each category.
▪ Step-5: Assign the new data point to the category with the largest number of neighbors.
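A minimal from-scratch sketch of these five steps in Python (illustrative only; the names knn_classify, train_points, and train_labels are my own, not from the slides):

import math
from collections import Counter

def knn_classify(train_points, train_labels, new_point, k=5):
    """Classify new_point by majority vote among its k nearest training points."""
    # Step 2: compute the Euclidean distance to every training point.
    distances = [(math.dist(p, new_point), label)
                 for p, label in zip(train_points, train_labels)]
    # Step 3: keep the k neighbors with the smallest distances.
    nearest = sorted(distances, key=lambda d: d[0])[:k]
    # Steps 4-5: count the labels among the neighbors and return the majority class.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]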
Illustration of the steps:
1) Select K = 5.
2) Calculate the distances.
3) Choose the K = 5 neighbors with the minimum distances.
4) Among the selected K nearest neighbors, count the number of points in each class/category (here: Blue: 3, Orange: 2).
5) Assign the new data point to the class/category with the majority of votes.
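For instance, a hypothetical call to the sketch above, with made-up 2-D coordinates chosen only to reproduce the 3-Blue-vs-2-Orange vote shown in the figure:

points = [(1, 2), (2, 3), (3, 2), (4, 2), (2, 5), (7, 7), (8, 1)]
labels = ["Blue", "Blue", "Blue", "Orange", "Orange", "Orange", "Orange"]
print(knn_classify(points, labels, new_point=(2, 2), k=5))  # -> "Blue" (3 Blue vs 2 Orange among the 5 nearest)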
Ways to calculate the distance in KNN
▪ Minkowski distance (named after the German mathematician Hermann Minkowski): the general form, with a parameter p.
▪ Manhattan distance (also called taxicab or city-block distance): the Minkowski distance with p = 1.
▪ Euclidean distance (the shortest distance between any two points): the Minkowski distance with p = 2.
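The formulas themselves appeared only as images on the slide; for reference, the standard definitions are:

\[
d_{\text{Minkowski}}(\mathbf{x},\mathbf{y}) = \left(\sum_{i=1}^{n} |x_i - y_i|^{p}\right)^{1/p},
\qquad
d_{\text{Manhattan}} = \sum_{i=1}^{n} |x_i - y_i| \;\;(p = 1),
\qquad
d_{\text{Euclidean}} = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \;\;(p = 2).
\]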
How to select the value of K?
• There is no single rule for determining the best value of K, so you need to try several values and keep the one that performs best (e.g., via cross-validation, as sketched below).
• The most commonly preferred value for K is 5.
• A very low value of K, such as K = 1 or K = 2, can be noisy and makes the model sensitive to outliers.
• Larger values of K smooth out noise, but they can blur the class boundaries and make prediction more expensive.
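In practice this trial-and-error is usually done with cross-validation. A minimal sketch using scikit-learn (an assumed tool choice, not something the slides prescribe; X and y stand for an already-loaded feature matrix and label vector):

from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def best_k(X, y, candidates=(1, 3, 5, 7, 9, 11)):
    """Return the K with the highest mean 5-fold cross-validation accuracy."""
    scores = {k: cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
              for k in candidates}
    return max(scores, key=scores.get)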
Advantages:
▪ Simple and effective algorithm.
▪ Quick calculation time.
▪ High accuracy (with small datasets).
▪ No need to make additional assumptions about the data.

Disadvantages:
▪ Accuracy depends on the quality of the data (e.g., noise can affect accuracy).
▪ With large data, the prediction stage might be slow.
▪ Sensitive to the scale of the data and to irrelevant features.
▪ Requires a large amount of memory, since all of the training data must be stored.
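Because KNN is distance-based, the scale sensitivity listed above is usually addressed by standardizing the features before classification. A minimal scikit-learn sketch (the tooling and variable names are my assumption, not from the slides):

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

# Rescale each feature to zero mean and unit variance so that no single
# feature dominates the distance computation.
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
# model.fit(X_train, y_train)    # assumes training data is already loaded
# model.predict(X_new)           # classify new entries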
Example
▪ Assume that:
  • The value of K is 5.
  • Euclidean distance is used.
▪ Note: you can calculate the distance using any other measure (e.g., Manhattan, Minkowski, etc.).
(Figure: scatter plot of the dataset and the new data entry; one of the feature axes is Saturation.)
Calculating the distances from the new data entry at $(20, 35)$ to each point in the dataset:
▪ $d_1 = \sqrt{(40-20)^2 + (20-35)^2} = 25$
▪ $d_2 = \sqrt{(50-20)^2 + (50-35)^2} = 33.54$
▪ $d_3 = \sqrt{(60-20)^2 + (90-35)^2} = 68.01$
▪ $d_4 = \sqrt{(10-20)^2 + (25-35)^2} = 14.14$
▪ $d_5 = \sqrt{(70-20)^2 + (70-35)^2} = 61.03$
▪ $d_6 = \sqrt{(60-20)^2 + (10-35)^2} = 47.17$
▪ $d_7 = \sqrt{(25-20)^2 + (80-35)^2} = 45.28$
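As a quick check (not part of the slides), the same distances can be reproduced with Python's math.dist, using the coordinates read off the slide and the new entry at (20, 35):

import math

new_point = (20, 35)
# Coordinates of the seven dataset points as they appear in the distance calculations.
dataset = [(40, 20), (50, 50), (60, 90), (10, 25), (70, 70), (60, 10), (25, 80)]

for i, point in enumerate(dataset, start=1):
    print(f"d{i} = {math.dist(point, new_point):.2f}")
# Prints 25.00, 33.54, 68.01, 14.14, 61.03, 47.17, 45.28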
With K = 5:
▪ Based on the 5 selected neighbors with the lowest distances, the majority of votes are for the Red class; therefore, the new entry is classified as Red.
KNN failure cases
▪ Jumbled data: if the data points of all the classes are thoroughly mixed together, KNN fails, because the K nearest neighbors span many classes and the majority vote is not meaningful.
▪ Outliers: outlier points can dominate the vote and pull the prediction toward the wrong class, especially for small values of K.
Thank You!