Support Vector Machines (SVM) and Support Vector Regression (SVR)
Dr. Marwa M. Emam
Faculty of Computers and Information
Minia University
Agenda
• Introduction
• Key Concepts
• Linear SVM
• Non-linear SVM
• Kernel Trick
• SVR
Introduction
• SVM is a supervised learning algorithm used for classification and regression tasks.
• It is particularly effective in high-dimensional spaces and is well-suited for both linear and non-linear data.
• The SVM aims to find the linear boundary that is located as far as possible from the points in the dataset.
Introduction …
• We learned about linear classifiers. With two-dimensional data, these are defined by a line that separates a dataset consisting of points with two labels.
• However, we may have noticed that many different lines can separate a dataset, and this raises the following question: how do we know which is the best line?
Introduction …
• In this figure, we can see three different linear classifiers that separate this dataset. Which one do you prefer, classifier 1, 2, or 3?
Introduction …
• If you said classifier 2, we agree. All three lines separate the dataset well, but the second line is better placed.
• The first and third lines are very close to some of the points, whereas the second line is far from all the points. If we were to wiggle the three lines around a little bit, the first and the third might go over some of the points, misclassifying some of them in the process, whereas the second one would still classify them all correctly.
• Thus, classifier 2 is more robust than classifiers 1 and 3.
Introduction …
• The main goal of the SVM is to design a hyperplane that classifies all training vectors into two classes.
• An SVM classifier uses two parallel lines instead of one line. It tries to classify the data correctly and also tries to space the lines as far apart as possible.
SVM
Introduction …
• Support Vectors: Support vectors are the data points that are closest to the hyperplane. These are crucial in defining the optimal hyperplane, as they contribute to maximizing the margin.
• Margin: The margin is the distance between the hyperplane and the nearest data point from either class. SVM aims to maximize this margin, leading to better generalization on unseen data.
The Objective Function of SVM
• Support Vector Machines (SVMs) aim to find a hyperplane that separates data points of different classes while maximizing the margin between them.
• The objective is typically formulated as a margin-maximization problem, and the error function is associated with minimizing the classification error or maximizing the margin.
The Objective Function
• The objective function can be formulated using the concepts of margin and regularization. For linearly separable data, a common formulation is:
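A common way to write this, assuming the standard hard-margin setting with labels y_i in {+1, -1}, is:

maximize    2 / ||w||
subject to  y_i (w · x_i + b) ≥ 1   for every training point (x_i, y_i)

That is, the hyperplane w · x + b = 0 must classify every point correctly while keeping all points at least a distance of 1/||w|| away from it on each side.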
The Objective Function
• The SVM uses two parallel lines. Parallel lines have similar equations: they have the same weights but a different bias. Thus, in our SVM, we use the central line as a frame of reference L with equation w1x1 + w2x2 + b = 0, and construct two lines, one above it and one below it, with the respective equations:
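Assuming the conventional scaling in which the two boundary lines pass through the closest points of each class, the two equations are:

w1x1 + w2x2 + b = 1    (the line above L)
w1x1 + w2x2 + b = -1   (the line below L)

The distance between these two parallel lines is 2 / ||w||, with ||w|| = sqrt(w1² + w2²), so spacing the lines as far apart as possible amounts to making ||w|| as small as possible.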
The Objective Function: From Maximization to Minimization
• The objective function can be formulated using the concepts of margin and regularization. Maximizing the margin 2/||w|| is equivalent to minimizing ||w||, so for linearly separable data a common formulation is:
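Under the same hard-margin assumptions as before, this gives the standard quadratic form:

minimize    (1/2) ||w||²
subject to  y_i (w · x_i + b) ≥ 1   for all i

The squared norm (with the factor 1/2) is preferred because it is differentiable everywhere and turns the problem into a quadratic program.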
SVM as a Minimization Problem
• Optimization problem: The goal is to find the optimal values of w and b that minimize the objective function while satisfying the constraints. This is a quadratic optimization problem subject to linear constraints.
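In practice this quadratic program is solved by an off-the-shelf optimizer. A minimal sketch, assuming scikit-learn is available (the dataset below is an illustrative toy example, not from the slides):

# Fit a linear SVM and inspect the learned hyperplane and support vectors.
import numpy as np
from sklearn.svm import SVC

# Toy 2-D dataset with labels in {+1, -1}
X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0],
              [6.0, 5.0], [7.0, 7.0], [8.0, 6.0]])
y = np.array([-1, -1, -1, +1, +1, +1])

# A large C approximates the hard-margin formulation above
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

print("w =", clf.coef_[0])           # weights w1, w2 of the hyperplane
print("b =", clf.intercept_[0])      # bias b
print("support vectors:", clf.support_vectors_)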
We wish to find the w and b which minimize, and the α which maximizes, L_P (whilst keeping α_i ≥ 0 ∀ i). We can do this by differentiating L_P with respect to w and b and setting the derivatives to zero:
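Here L_P denotes the primal Lagrangian from the preceding slides; assuming its standard form, the conditions obtained by setting the derivatives to zero are:

L_P = (1/2)||w||² - Σ_i α_i [ y_i (w · x_i + b) - 1 ]

∂L_P/∂w = 0  ⇒  w = Σ_i α_i y_i x_i
∂L_P/∂b = 0  ⇒  Σ_i α_i y_i = 0

Substituting these back into L_P yields the dual problem, which depends on the training points only through dot products x_i · x_j; this is the property the kernel trick exploits below.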
Non-Linearity
• In some cases, the relationship between features and the target variable may not be linear. The kernel trick enables SVMs to capture non-linear patterns by projecting the data into a higher-dimensional space.
Non-linear SVMs: Feature Spaces
• General idea: the original feature space can always be mapped to some higher-dimensional feature space where the training set is separable:
  Φ: x → φ(x)
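As a concrete toy illustration of such a mapping (a hypothetical example, not from the slides), one-dimensional data that is not linearly separable becomes separable after mapping each point x to φ(x) = (x, x²):

# Make 1-D data linearly separable with the explicit feature map phi(x) = (x, x^2).
import numpy as np
from sklearn.svm import SVC

X = np.array([-3.0, -2.0, -0.5, 0.0, 0.5, 2.0, 3.0]).reshape(-1, 1)
y = np.array([+1, +1, -1, -1, -1, +1, +1])   # outer points vs. inner points

phi_X = np.hstack([X, X**2])                 # map x -> (x, x^2)

clf = SVC(kernel="linear")
clf.fit(phi_X, y)                            # a straight line now separates the classes
print(clf.predict(phi_X))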
Kernel Trick
• The kernel trick is a technique used in Support Vector Machines (SVMs) to handle non-linear decision boundaries by implicitly mapping the input features into a higher-dimensional space without explicitly calculating the transformation. This allows SVMs to efficiently classify data that is not linearly separable in the original feature space.
• The kernel trick provides flexibility in choosing different kernel functions to capture various types of non-linear relationships. Common kernels include polynomial, radial basis function (RBF), and sigmoid kernels.
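A minimal sketch of the same idea, assuming scikit-learn's SVC with the RBF kernel (the circular toy dataset is an assumption for illustration):

# Non-linear classification with the RBF kernel, no explicit feature mapping.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(np.linalg.norm(X, axis=1) < 1.0, -1, +1)   # inside vs. outside a circle

clf = SVC(kernel="rbf", gamma=1.0, C=1.0)
clf.fit(X, y)
print("training accuracy:", clf.score(X, y))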
Kernel Function
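For reference, the standard forms of the kernels named on the previous slide are (γ, r, and d are kernel hyperparameters):

Linear:       K(x, z) = x · z
Polynomial:   K(x, z) = (γ x · z + r)^d
RBF:          K(x, z) = exp(-γ ||x - z||²)
Sigmoid:      K(x, z) = tanh(γ x · z + r)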
Thanks