4. Machine Learning Models
• Classification
• Predicts category of input objects – predefined classes
• Object recognition in images, email spam detection
• Regression
• Predicts real valued output for a given input
• Predicting value of a stock, predicting number of clicks in an advertisement
• Clustering
• Groups objects into homogeneous clusters – clusters not predefined
• Market segmentation, anomaly detection in industrial plants
5. Examples of Machine Learning Models
1️⃣ Linear Models
Logistic Regression – Simple and effective for binary classification (e.g., spam detection).
Linear Discriminant Analysis (LDA) – Used when classes are well separated; assumes the
features are normally distributed.
2️⃣ Tree-Based Models
Decision Tree – Splits data into branches for decision-making (e.g., diagnosing a disease).
Random Forest – An ensemble of multiple decision trees to improve accuracy and reduce
overfitting.
XGBoost (Extreme Gradient Boosting) – A high-performance tree-based model used in
many ML competitions.
LightGBM (Light Gradient Boosting Machine) – Faster than XGBoost for large datasets.
CatBoost (Categorical Boosting) – Handles categorical features well.
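The splitting idea behind all of these tree models can be sketched with a one-split "decision stump", the simplest possible decision tree (the fever threshold below is a hypothetical example, not a clinical rule):

```python
# A decision "stump": a decision tree with a single split.
# Full trees and ensembles (Random Forest, XGBoost, ...) stack many such splits.
def stump_predict(temperature):
    """Diagnose 'fever' with one branch on a hypothetical 38.0 °C threshold."""
    if temperature >= 38.0:
        return "fever"
    return "no fever"

print(stump_predict(39.2))  # fever
print(stump_predict(36.6))  # no fever
```

Ensemble methods such as Random Forest simply combine many of these splits, each trained on different subsets of the data, and average or vote over their answers.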
6. 3️⃣ Instance-Based Models
K-Nearest Neighbors (KNN) – Classifies based on the majority vote of k-nearest points in the dataset.
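The majority-vote idea can be sketched in plain Python with Euclidean distance (the toy points and the choice k=3 are made up for illustration):

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among the k nearest training points.
    `train` is a list of ((x, y), label) pairs."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Two toy clusters: "A" near the origin, "B" near (5, 5).
points = [((0, 0), "A"), ((1, 0), "A"), ((0, 1), "A"),
          ((5, 5), "B"), ((6, 5), "B"), ((5, 6), "B")]
print(knn_predict(points, (0.5, 0.5)))  # query lands in cluster A
```

Note that KNN stores the whole training set and defers all work to prediction time, which is why it is called an instance-based (or "lazy") model.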
4️⃣ Support Vector Machines (SVM)
Support Vector Machine (SVM) – Uses hyperplanes to separate classes with maximum margin.
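The geometry behind the "maximum margin" phrase can be sketched with a fixed (not learned) hyperplane w·x + b = 0: the sign of the score gives the class, and |w·x + b| / ||w|| is the distance an SVM would try to maximize for the closest points:

```python
import math

# An illustrative separating hyperplane w·x + b = 0 (hand-picked, not fitted).
w, b = (1.0, 1.0), -3.0

def classify(x):
    """Which side of the hyperplane is x on? Sign of w·x + b decides."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return +1 if score >= 0 else -1

def distance_to_boundary(x):
    """Geometric distance |w·x + b| / ||w||; the margin is the smallest
    such distance over the training points, which SVM training maximizes."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return abs(score) / math.hypot(*w)

print(classify((2, 2)), classify((0, 0)))  # → 1 -1
```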
5️⃣ Neural Networks & Deep Learning
Artificial Neural Networks (ANN) – Multi-layered perceptron (MLP) for classification tasks.
Convolutional Neural Networks (CNN) – Best for image classification (e.g., face recognition).
Recurrent Neural Networks (RNN) & Long Short-Term Memory (LSTM) – Used for sequential
classification tasks (e.g., sentiment analysis).
8. Classification: Applications
Also known as pattern recognition
Face recognition: Pose, lighting, occlusion (glasses, beard), make-up, hair style
Character recognition: Different handwriting styles.
Speech recognition: Temporal dependency.
◦ Use of a dictionary or the syntax of the language.
◦ Sensor fusion: combine multiple modalities, e.g., visual (lip images) and acoustic signals for speech.
Medical diagnosis: From symptoms to illnesses
Web advertising: Predict whether a user will click on an ad on the Internet.
9. Face Recognition
Training examples of a person
Test images
AT&T Laboratories, Cambridge UK
http://www.uk.research.att.com/facedatabase.html
10. Linear Models for Classification
A linear model is a simple mathematical method used in machine learning to
separate data into different categories using a straight line (or a plane in higher
dimensions). It assumes that the relationship between input features and the
output class can be represented using a linear function.
Think of it like drawing a straight line to separate two groups in a scatter plot! If
you can draw a single straight line (or boundary) to classify points, you are using a
linear model.
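The "straight line in a scatter plot" picture can be made concrete with a hand-picked (not fitted) line used as a decision boundary (the line and the points below are invented for illustration):

```python
# A hand-picked line y = 0.5*x + 1 used as a linear decision boundary.
def above_line(x, y, slope=0.5, intercept=1.0):
    """A linear classifier: which side of the straight line is the point on?"""
    return y > slope * x + intercept

group_1 = [(0, 3), (1, 4), (2, 5)]   # points above the line → class 1
group_2 = [(0, 0), (1, 0), (2, 1)]   # points below the line → class 2
print(all(above_line(x, y) for x, y in group_1))       # True
print(not any(above_line(x, y) for x, y in group_2))   # True
```

Training a linear model amounts to finding the slope and intercept (the weights) automatically from labeled data instead of picking them by hand.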
Why Learn Linear Models First?
Linear models are:
✔ Simple and easy to interpret 🧠
✔ Fast to compute ⚡
✔ Useful for many real-world problems 📈
✔ Foundation for advanced ML models 🔍
12. Logistic Regression
Logistic regression is a supervised machine learning algorithm used for classification
tasks, where the goal is to predict the probability that an instance belongs to a given
class. It is a statistical method that analyzes the relationship between input variables
and a categorical outcome.
Logistic regression is used for binary classification: it applies the sigmoid function
to the independent variables and produces a probability value between 0 and 1.
🔹 Used when we have two classes (e.g., spam vs. not spam, sick vs. healthy).
🔹 It learns a decision boundary that separates data using a curve called the sigmoid
function
13. Example:
Imagine you are building a system to classify emails as spam or not spam.
•If the probability of being spam is greater than 50%, it is classified as spam.
•If the probability is less than 50%, it is classified as not spam.
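The 50% rule above can be sketched directly, assuming the model has already produced a raw score z that the sigmoid turns into a probability (the scores below are invented):

```python
import math

def sigmoid(z):
    """Squash any real number into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-z))

def classify_email(score, threshold=0.5):
    """Map a raw model score to a probability, then apply the 50% rule."""
    p = sigmoid(score)
    return "spam" if p > threshold else "not spam"

print(classify_email(2.0))   # sigmoid(2.0) ≈ 0.88 → spam
print(classify_email(-1.0))  # sigmoid(-1.0) ≈ 0.27 → not spam
```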
18. Key Points
•Logistic regression predicts the output of a categorical dependent variable.
Therefore, the outcome must be a categorical or discrete value.
•It can be either Yes or No, 0 or 1, True or False, etc., but instead of giving the exact
values 0 and 1, it gives probabilistic values which lie between 0 and 1.
•In Logistic regression, instead of fitting a regression line, we fit an “S” shaped
logistic function, which predicts two maximum values (0 or 1).
19. Types of Logistic Regression
On the basis of the categories, Logistic Regression can be classified into three
types:
1.Binomial: In binomial Logistic regression, there can be only two possible types of
the dependent variables, such as 0 or 1, Pass or Fail, etc.
2.Multinomial: In multinomial Logistic regression, there can be 3 or more possible
unordered types of the dependent variable, such as "cat", "dog", or "sheep".
3.Ordinal: In ordinal Logistic regression, there can be 3 or more possible ordered
types of the dependent variable, such as "low", "medium", or "high".
20. Assumptions of Logistic Regression
Understanding the assumptions of logistic regression is important to ensure the model is
applied appropriately. The assumptions include:
1. Independent observations: Each observation is independent of the others, meaning there
is no correlation between observations.
2. Binary dependent variable: The dependent variable must be binary or dichotomous,
meaning it can take only two values. For more than two categories, the softmax function is used.
3. Linear relationship between independent variables and log odds: The relationship between
the independent variables and the log odds (the natural logarithm of the odds of an event
occurring) of the dependent variable should be linear.
4. No outliers: The dataset should contain no outliers (data points that significantly
deviate from the general pattern, potentially indicating errors, unusual occurrences, or novelties).
5. Large sample size: The sample size should be sufficiently large.
21. Instead of predicting 0 or 1 directly, Logistic Regression predicts a probability:
P(y = 1 | x) = σ(w·x + b) = 1 / (1 + e^−(w·x + b))
Here, w and b are learned from the data.
The sigmoid function ensures the output is between 0 and 1 (like a probability).
How the Model Decides?
•If the probability is > 0.5, the email is spam 📧🚫.
•If the probability is ≤ 0.5, the email is not spam 📩✅.
22. Understanding Sigmoid Function
•The sigmoid function is a mathematical function used to map the predicted values
to probabilities.
•It maps any real value to a value within the range 0 to 1. Because the output of logistic
regression must stay between 0 and 1 and cannot go beyond these limits, its graph forms a
curve like the "S" form.
•The S-form curve is called the Sigmoid function or the logistic function.
•In logistic regression, we use the concept of a threshold value, which decides between
the classes 0 and 1: values above the threshold are mapped to 1, and values below the
threshold are mapped to 0.
24. •σ(z) tends towards 1 as z→∞
•σ(z) tends towards 0 as z→−∞
•σ(z) is always bounded between 0 and 1
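These three properties are easy to check numerically:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0))    # 0.5: exactly halfway between the two classes
print(sigmoid(10))   # ≈ 0.99995: tends towards 1 as z → ∞
print(sigmoid(-10))  # ≈ 0.000045: tends towards 0 as z → −∞
```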
26. Terminologies involved in Logistic Regression
• Independent variables: The input characteristics or predictor factors applied to the dependent variable’s predictions.
• Dependent variable: The target variable in a logistic regression model, which we are trying to predict.
• Logistic function: The formula used to represent how the independent and dependent variables relate to one another. The logistic function
transforms the input variables into a probability value between 0 and 1, which represents the likelihood of the dependent variable being 1
or 0.
• Odds: The ratio of the probability of something occurring to it not occurring. It is different from probability, which is the ratio of
something occurring to everything that could possibly occur.
• Log-odds: The log-odds, also known as the logit function, is the natural logarithm of the odds. In logistic regression, the log odds of the
dependent variable are modeled as a linear combination of the independent variables and the intercept.
• Coefficient: The logistic regression model’s estimated parameters, which show how the independent and dependent variables relate to one
another.
• Intercept: A constant term in the logistic regression model, which represents the log odds when all independent variables are equal to zero.
• Maximum likelihood estimation: The method used to estimate the coefficients of the logistic regression model, which maximizes the
likelihood of observing the data given the model
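The odds and log-odds definitions above translate directly into code:

```python
import math

def odds(p):
    """Odds = P(event occurring) / P(event not occurring)."""
    return p / (1 - p)

def log_odds(p):
    """The logit: natural log of the odds.
    Logistic regression models this quantity as a linear function of the inputs."""
    return math.log(odds(p))

print(odds(0.8))      # ≈ 4.0: the event is 4x as likely to occur as not
print(log_odds(0.5))  # 0.0: even odds, log(1) = 0
```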
27. Implementation of Logistic Regression in Python
https://www.kdnuggets.com/2022/04/logistic-regression-classification.html
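The linked tutorial uses scikit-learn; as a dependency-free alternative, here is a minimal pure-Python sketch that fits a one-feature logistic regression by gradient descent (the toy data, learning rate, and epoch count are arbitrary illustrative choices):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p = sigmoid(w*x + b) by stochastic gradient descent on the log-loss,
    which is equivalent to maximum likelihood estimation for this model."""
    w = b = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient factor of the log-loss
            w -= lr * err * x
            b -= lr * err
    return w, b

# Toy data: small x values belong to class 0, large x values to class 1.
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
print(sigmoid(w * 1.0 + b) < 0.5)  # True: low x → class 0
print(sigmoid(w * 4.0 + b) > 0.5)  # True: high x → class 1
```

A production implementation would instead use `sklearn.linear_model.LogisticRegression`, which adds regularization and faster solvers, but the update rule above is the core idea.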