Support Vector Machine
By: Amr Koura
Agenda
● Definition.
● Kernel Functions.
● Optimization Problem.
● Soft Margin Hyperplanes.
● V-SVC.
● SMO algorithm.
● Demo.
Definition
● A support vector machine (SVM) is a supervised learning model with associated
learning algorithms that analyze data and recognize patterns.
● Applications:
- Machine learning.
- Pattern recognition.
- Classification and regression analysis.
Binary Classifier
● Given a set of points $P = \{(X_i, Y_i)\}$ such that $X_i \in \mathbb{R}^d$
and $Y_i \in \{-1, 1\}$,
build a model that assigns a new example to one of the classes $\{-1, 1\}$.
Question
● What if the examples are not linearly
separable?
http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=MachineLearning&doc=exercises/ex8/ex8.html
Kernel Function
● An SVM can efficiently perform non-linear
classification using the kernel trick.
● The kernel trick implicitly maps the inputs into a high-dimensional
feature space where the examples can become linearly
separable.
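As a small illustration of the kernel trick, the sketch below compares a kernel evaluated in the input space with a dot product in an explicitly mapped feature space. The homogeneous degree-2 polynomial kernel and the helper names `poly2_kernel`, `phi`, and `dot` are chosen here for illustration and are not taken from the slides.

```python
def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel: k(x, y) = <x, y>^2."""
    return dot(x, y) ** 2


def phi(x):
    """Explicit feature map for 2-D inputs with <phi(x), phi(y)> = <x, y>^2."""
    x1, x2 = x
    return (x1 * x1, 2 ** 0.5 * x1 * x2, x2 * x2)


x, y = (1.0, 2.0), (3.0, -1.0)
implicit = poly2_kernel(x, y)    # computed in the 2-D input space
explicit = dot(phi(x), phi(y))   # computed in the 3-D feature space
```

Both values agree up to floating-point rounding, but the kernel never constructs the higher-dimensional vectors.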
Figure source: https://en.wikipedia.org/wiki/Support_vector_machine
Kernel Function
● Linear Kernel.
● Polynomial Kernel.
● Gaussian RBF Kernel.
● Sigmoid Kernel.
Linear Kernel Function
● $K(X, Y) = \langle X, Y \rangle$,
the dot product of X and Y.
Polynomial Kernel Function
● $k(X, Y) = (\gamma \langle X, Y \rangle + c)^d$
where d is the degree of the polynomial and c is a free
parameter that trades off the influence of
higher-order against lower-order terms in the polynomial.
Gaussian RBF Kernel
● $k(X, Y) = \exp\left(-\frac{\|X - Y\|^2}{2\sigma^2}\right)$
where $\|X - Y\|^2$ denotes the squared Euclidean
distance.
● Other form:
$k(X, Y) = \exp(-\gamma \|X - Y\|^2)$, with $\gamma = 1/(2\sigma^2)$.
Sigmoid Kernel Function
● $k(X, Y) = \tanh(\gamma \langle X, Y \rangle + r)$
where $\gamma$ is a scaling factor and r is a shifting
parameter.
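The four kernels listed above can be written down directly. The following is a minimal sketch in plain Python; the function names and the default parameter values (`gamma`, `c`, `d`, `r`) are illustrative assumptions, not values given in the slides.

```python
import math


def dot(x, y):
    """Dot product <x, y>."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, gamma=1.0, c=1.0, d=3):
    # (gamma * <x, y> + c)^d
    return (gamma * dot(x, y) + c) ** d


def rbf_kernel(x, y, gamma=0.5):
    # exp(-gamma * ||x - y||^2), i.e. the second form of the Gaussian kernel
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, y, gamma=0.1, r=0.0):
    # tanh(gamma * <x, y> + r)
    return math.tanh(gamma * dot(x, y) + r)


x, y = (1.0, 2.0), (2.0, 0.0)
values = {
    "linear": linear_kernel(x, y),
    "polynomial": polynomial_kernel(x, y),
    "rbf": rbf_kernel(x, y),
    "sigmoid": sigmoid_kernel(x, y),
}
```

Note that the RBF kernel of a point with itself is always 1, since the squared distance is zero.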
Optimization Problem
● We need to find the separating hyperplane with the maximum margin.
Figure source: https://en.wikipedia.org/wiki/Support_vector_machine
Optimization Problem
● Distance between the two margin hyperplanes = $\frac{2}{\|W\|}$.
● Goal:
1- minimize $\|W\|$.
2- prevent points from falling into the margin.
● Constraint:
$W \cdot X_i - b \ge 1$ for $Y_i = 1$ and $W \cdot X_i - b \le -1$ for $Y_i = -1$,
together:
$\min_{(W, b)} \|W\|$, s.t. $y_i (W \cdot X_i - b) \ge 1$ for $1 \le i \le n$.
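To make the margin width and the constraint concrete, here is a small sketch that evaluates both for a candidate hyperplane. The toy data set, the candidate (W, b), and the function name `margin_and_feasibility` are made up for illustration.

```python
import math


def margin_and_feasibility(W, b, X, Y):
    """Return (margin width 2/||W||, True if y_i (W.X_i - b) >= 1 for all i)."""
    norm = math.sqrt(sum(w * w for w in W))
    feasible = all(
        y * (sum(w * x for w, x in zip(W, xi)) - b) >= 1
        for xi, y in zip(X, Y)
    )
    return 2.0 / norm, feasible


# Toy, linearly separable data (illustrative values only).
X = [(2.0, 2.0), (3.0, 3.0), (-2.0, -2.0), (-3.0, -3.0)]
Y = [1, 1, -1, -1]

width, ok = margin_and_feasibility((0.25, 0.25), 0.0, X, Y)
# A shorter W would give a wider margin, but then points fall inside it:
_, ok_small_w = margin_and_feasibility((0.2, 0.2), 0.0, X, Y)
```

Here the feasible candidate has a margin equal to the distance between the two closest opposite-class points, while the shorter weight vector violates the constraint.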
Optimization Problem
● Mathematically convenient:
$\arg\min_{(W, b)} \frac{1}{2}\|W\|^2$, s.t. $y_i (W \cdot X_i - b) \ge 1$.
● With Lagrange multipliers $\alpha_i$, the problem becomes a
quadratic optimization problem:
$\arg\min_{(W, b)} \max_{\alpha \ge 0} \left\{ \frac{1}{2}\|W\|^2 - \sum_{i=1}^{n} \alpha_i [y_i (W \cdot X_i - b) - 1] \right\}$
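The step from this Lagrangian to an expression for W in terms of the multipliers is not spelled out on the slide. A short derivation sketch, in the same $W \cdot X_i - b$ convention, sets the partial derivatives to zero; the symbol $L$ for the Lagrangian is introduced here only for illustration.

```latex
\begin{aligned}
L(W, b, \alpha) &= \tfrac{1}{2}\|W\|^2 - \sum_{i=1}^{n} \alpha_i \left[ y_i (W \cdot X_i - b) - 1 \right] \\
\frac{\partial L}{\partial W} = 0 &\;\Rightarrow\; W = \sum_{i=1}^{n} \alpha_i y_i X_i \\
\frac{\partial L}{\partial b} = 0 &\;\Rightarrow\; \sum_{i=1}^{n} \alpha_i y_i = 0 \\
\text{substituting back:}\quad L &= \sum_{i=1}^{n} \alpha_i - \tfrac{1}{2} \sum_{i,j=1}^{n} \alpha_i \alpha_j y_i y_j \langle X_i, X_j \rangle
\end{aligned}
```

The last line depends on the data only through dot products, which is where a kernel can be substituted.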
Optimization Problem
● The solution can be expressed as a linear
combination of the $X_i$:
$W = \sum_{i=1}^{n} \alpha_i Y_i X_i$,
where $\alpha_i \ne 0$ only for the points that are support vectors.
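The expansion of W can be sketched in a few lines. The multipliers below are illustrative values for a toy data set, not the output of a solver shown in the slides, and `weight_vector` is a hypothetical helper name.

```python
def weight_vector(alphas, Y, X):
    """Compute W = sum_i alpha_i * y_i * X_i, skipping zero multipliers."""
    dim = len(X[0])
    W = [0.0] * dim
    for a, y, x in zip(alphas, Y, X):
        if a == 0.0:
            continue  # non-support vectors do not contribute
        for k in range(dim):
            W[k] += a * y * x[k]
    return W


X = [(2.0, 2.0), (3.0, 3.0), (-2.0, -2.0), (-3.0, -3.0)]
Y = [1, 1, -1, -1]
alphas = [0.0625, 0.0, 0.0625, 0.0]   # illustrative multipliers

W = weight_vector(alphas, Y, X)
support_vectors = [i for i, a in enumerate(alphas) if a != 0.0]
```

Only the two points with non-zero multipliers determine W; removing the other two points would leave the hyperplane unchanged.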
Optimization problem
● The QP is solved iff:
1) the KKT conditions are fulfilled for every
example.
2) $Q_{i,j} = y_i y_j k(\vec{X}_i, \vec{X}_j)$ is positive semi-definite.
● The KKT conditions are:
$\alpha_i = 0 \Rightarrow y_i f(\vec{x}_i) \ge 1$
$0 < \alpha_i < C \Rightarrow y_i f(\vec{x}_i) = 1$
$\alpha_i = C \Rightarrow y_i f(\vec{x}_i) \le 1$
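A check of the three KKT cases can be sketched as follows. The tolerance `tol`, the function name `kkt_violations`, and the example values are assumptions made for illustration; practical solvers always test these conditions up to some tolerance.

```python
def kkt_violations(alphas, Y, f_values, C, tol=1e-3):
    """Return the indices whose (alpha_i, y_i * f(x_i)) pair breaks the KKT conditions."""
    bad = []
    for i, (a, y, f) in enumerate(zip(alphas, Y, f_values)):
        m = y * f                      # functional margin of example i
        if a <= 0.0:
            ok = m >= 1 - tol          # alpha_i = 0      =>  y_i f(x_i) >= 1
        elif a >= C:
            ok = m <= 1 + tol          # alpha_i = C      =>  y_i f(x_i) <= 1
        else:
            ok = abs(m - 1) <= tol     # 0 < alpha_i < C  =>  y_i f(x_i) = 1
        if not ok:
            bad.append(i)
    return bad


Y = [1, 1, -1, -1]
alphas = [0.0625, 0.0, 0.0625, 0.0]
optimal = kkt_violations(alphas, Y, [1.0, 1.5, -1.0, -1.5], C=1.0)
# Example 1 has alpha = 0 but sits inside the margin, so it is reported:
not_optimal = kkt_violations(alphas, Y, [1.0, 0.5, -1.0, -1.5], C=1.0)
```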
Soft Margin Hyperplanes
● The soft margin method chooses a
hyperplane that splits the examples as cleanly
as possible while still maximizing the margin.
● Non-negative slack variables $\xi_i$ measure the degree of
misclassification.
Figure source: Learning with Kernels, by Schölkopf and Smola
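Soft Margin Hyperplanes
● The optimization problem:
$\arg\min_{(W, \xi, b)} \frac{1}{2}\|W\|^2 + \frac{C}{n} \sum_{i=1}^{n} \xi_i$, s.t. $y_i (W \cdot X_i + b) \ge 1 - \xi_i$, $\xi_i \ge 0$.
● Using Lagrange multipliers, the dual problem is:
maximize $W(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{n} \alpha_i \alpha_j y_i y_j k(x_i, x_j)$
s.t. $0 \le \alpha_i \le \frac{C}{n}$, $\sum_{i=1}^{n} \alpha_i y_i = 0$.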
● C is essentially a regularisation parameter,
which controls the trade-off between achieving
a low error on the training data and minimising
the norm of the weights.
● After the optimizer computes the $\alpha_i$, W can be
computed as
$W = \sum_{i=1}^{n} X_i Y_i \alpha_i$
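To see how C weighs training error against the norm of the weights, the sketch below evaluates the soft-margin objective for a fixed hyperplane. It uses the fact that, for given (W, b), the smallest feasible slack is $\xi_i = \max(0, 1 - y_i (W \cdot X_i + b))$. The data, the hyperplane, and the name `soft_margin_objective` are illustrative assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def soft_margin_objective(W, b, X, Y, C):
    """Return (0.5*||W||^2 + (C/n) * sum(xi), list of slacks xi)."""
    n = len(X)
    # Smallest slack that satisfies y_i (W.X_i + b) >= 1 - xi_i with xi_i >= 0.
    slacks = [max(0.0, 1.0 - y * (dot(W, x) + b)) for x, y in zip(X, Y)]
    objective = 0.5 * dot(W, W) + (C / n) * sum(slacks)
    return objective, slacks


# Four separable points plus one negative example on the wrong side.
X = [(2.0, 2.0), (3.0, 3.0), (-2.0, -2.0), (-3.0, -3.0), (1.0, 1.0)]
Y = [1, 1, -1, -1, -1]

objective, slacks = soft_margin_objective((0.25, 0.25), 0.0, X, Y, C=10.0)
```

Only the last point needs slack; increasing C makes that single violation more expensive relative to the margin term.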
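V-SVC
● In the previous formulation, the parameter C controlled the trade-off
between (1) minimizing training errors and
(2) maximizing the margin.
● V-SVC replaces C by a parameter V (usually written ν), which controls the number of
margin errors and support vectors.
● V is an upper bound on the fraction of margin errors (and therefore on the
training error rate) and a lower bound on the fraction of support vectors.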
V-SVC
● The optimization problem becomes:
$\min_{(W, \xi, \rho, b)} \frac{1}{2}\|W\|^2 - V\rho + \frac{1}{n} \sum_{i=1}^{n} \xi_i$, s.t.
$y_i (W \cdot X_i + b) \ge \rho - \xi_i$, $\xi_i \ge 0$, and $\rho \ge 0$.
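V-SVC
● Using Lagrange multipliers, the dual problem is:
$\max_{\alpha \in \mathbb{R}^n} W(\alpha) = -\frac{1}{2} \sum_{i,j=1}^{n} \alpha_i \alpha_j Y_i Y_j k(X_i, X_j)$
s.t. $0 \le \alpha_i \le \frac{1}{n}$, $\sum_{i=1}^{n} \alpha_i Y_i = 0$, and $\sum_{i=1}^{n} \alpha_i \ge V$,
and the decision function is $f(X) = \mathrm{sgn}\left(\sum_{i=1}^{n} \alpha_i y_i k(X, X_i) + b\right)$.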
SMO Algorithm
● The Sequential Minimal Optimization (SMO) algorithm is used
to solve the quadratic programming problem.
● Algorithm:
1- select a pair of examples (details follow).
2- optimize the target function with respect to the
selected pair analytically.
3- repeat until the pair selected in step 1 is
already optimal or the number of iterations exceeds a user-defined
limit.
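The three steps can be sketched with a simplified SMO variant. This is not the pair-selection rule described in these slides: for brevity the second index j is drawn at random, a common teaching simplification, and the names `simplified_smo`, `predict`, `tol`, `max_passes`, and `seed` are assumptions made here. The box constraint is written as 0 ≤ α ≤ C.

```python
import random


def linear(x, z):
    return sum(a * b for a, b in zip(x, z))


def simplified_smo(X, Y, C=1.0, tol=1e-3, max_passes=20, kernel=linear, seed=0):
    """Return (alphas, b) for f(x) = sum_i alpha_i y_i k(X_i, x) + b."""
    rng = random.Random(seed)
    n = len(X)
    K = [[kernel(X[i], X[j]) for j in range(n)] for i in range(n)]
    alpha, b, passes = [0.0] * n, 0.0, 0

    def error(i):
        return sum(alpha[t] * Y[t] * K[t][i] for t in range(n)) + b - Y[i]

    while passes < max_passes:             # step 3: stop after quiet passes
        changed = 0
        for i in range(n):
            Ei = error(i)
            # Step 1: pick i only if it violates the KKT conditions.
            if not ((Y[i] * Ei < -tol and alpha[i] < C) or
                    (Y[i] * Ei > tol and alpha[i] > 0)):
                continue
            j = rng.choice([t for t in range(n) if t != i])
            Ej = error(j)
            ai, aj = alpha[i], alpha[j]
            # Bounds keeping both multipliers in [0, C] with sum alpha*y fixed.
            if Y[i] != Y[j]:
                lo, hi = max(0.0, aj - ai), min(C, C + aj - ai)
            else:
                lo, hi = max(0.0, ai + aj - C), min(C, ai + aj)
            eta = 2 * K[i][j] - K[i][i] - K[j][j]
            if lo == hi or eta >= 0:
                continue
            # Step 2: analytic optimum for alpha_j, clipped to [lo, hi].
            new_aj = min(hi, max(lo, aj - Y[j] * (Ei - Ej) / eta))
            if abs(new_aj - aj) < 1e-7:
                continue
            alpha[j] = new_aj
            alpha[i] = ai + Y[i] * Y[j] * (aj - new_aj)
            b1 = b - Ei - Y[i] * (alpha[i] - ai) * K[i][i] - Y[j] * (new_aj - aj) * K[i][j]
            b2 = b - Ej - Y[i] * (alpha[i] - ai) * K[i][j] - Y[j] * (new_aj - aj) * K[j][j]
            if 0 < alpha[i] < C:
                b = b1
            elif 0 < alpha[j] < C:
                b = b2
            else:
                b = (b1 + b2) / 2
            changed += 1
        passes = passes + 1 if changed == 0 else 0
    return alpha, b


def predict(x, X, Y, alpha, b, kernel=linear):
    s = sum(a * y * kernel(xi, x) for a, y, xi in zip(alpha, Y, X)) + b
    return 1 if s >= 0 else -1


X = [(2.0, 2.0), (3.0, 3.0), (-2.0, -2.0), (-3.0, -3.0)]
Y = [1, 1, -1, -1]
alpha, b = simplified_smo(X, Y)
labels = [predict(x, X, Y, alpha, b) for x in X]
```

Each accepted update changes exactly two multipliers and keeps the equality constraint on the sum of α_i y_i intact, which is why the pair can be optimized in closed form.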
SMO Algorithm
2- optimize the target function with respect to the
selected pair analytically.
- the update on the values of $\alpha_i$ and $\alpha_j$ depends on
the difference between the approximation errors
for $\alpha_i$ and $\alpha_j$, divided by the term
$X = K_{ii} + K_{jj} - 2 Y_i Y_j K_{ij}$
Solve for two Lagrange multipliers
Figure sources: http://research.microsoft.com/pubs/68391/smo-book.pdf and http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf
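// Case y_i != y_j: both multipliers move by the same delta,
// so alpha[i] - alpha[j] stays constant.
double X = Kii + Kjj + 2*Kij;        // curvature along the update direction
double delta = (-G[i] - G[j]) / X;   // unconstrained step; G is the gradient
double diff = alpha[i] - alpha[j];   // invariant to restore when clipping
alpha[i] += delta; alpha[j] += delta;
// If the new point left the box [0, C_i] x [0, C_j], clip it back
// while keeping alpha[i] - alpha[j] = diff.
if (region I)   { alpha[i] = C_i; alpha[j] = C_i - diff; }
if (region II)  { alpha[j] = C_j; alpha[i] = C_j + diff; }
if (region III) { alpha[j] = 0;   alpha[i] = diff; }
if (region IV)  { alpha[i] = 0;   alpha[j] = -diff; }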
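A runnable sketch of this two-variable update for the case y_i ≠ y_j is given below. The explicit conditions standing in for regions I to IV, and their order, are my reading of how such clipping is commonly implemented and should be treated as an assumption. Here `Q_ij` denotes the label-scaled entries $y_i y_j k(X_i, X_j)$ and `G` the gradient $Q\alpha - e$ of the dual objective, both assumed conventions.

```python
def update_pair_opposite_labels(alpha_i, alpha_j, G_i, G_j,
                                Q_ii, Q_jj, Q_ij, C_i, C_j):
    """Two-variable update for y_i != y_j, clipped to [0, C_i] x [0, C_j]."""
    quad = Q_ii + Q_jj + 2 * Q_ij
    if quad <= 0:
        quad = 1e-12                     # guard against a non-positive curvature
    delta = (-G_i - G_j) / quad          # unconstrained step
    diff = alpha_i - alpha_j             # stays constant when y_i != y_j
    alpha_i += delta
    alpha_j += delta

    # Lower bounds (regions III and IV in the pseudocode above).
    if diff > 0:
        if alpha_j < 0:
            alpha_j, alpha_i = 0.0, diff
    else:
        if alpha_i < 0:
            alpha_i, alpha_j = 0.0, -diff

    # Upper bounds (regions I and II in the pseudocode above).
    if diff > C_i - C_j:
        if alpha_i > C_i:
            alpha_i, alpha_j = C_i, C_i - diff
    else:
        if alpha_j > C_j:
            alpha_j, alpha_i = C_j, C_j + diff
    return alpha_i, alpha_j


# Pair X_i = (2, 2), y_i = +1 and X_j = (-2, -2), y_j = -1 with a linear kernel:
# Q_ii = Q_jj = 8 and Q_ij = y_i * y_j * <X_i, X_j> = 8; at alpha = 0, G = -1.
free_step = update_pair_opposite_labels(0.0, 0.0, -1.0, -1.0, 8.0, 8.0, 8.0, 1.0, 1.0)
clipped_step = update_pair_opposite_labels(0.0, 0.0, -1.0, -1.0, 8.0, 8.0, 8.0, 0.05, 0.05)
```

With a large enough C the step lands inside the box; with a small C both multipliers are clipped to the upper bound while their difference is preserved.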
SMO Algorithm
● 1- select a pair of examples:
we need to find the pair (i, j) for which the difference
between the classification errors is maximal.
The pair is optimal if the squared difference between the
classification errors,
$((f(x_i) - y_i) - (f(x_j) - y_j))^2$,
is less than a tolerance $\xi$.
SMO Algorithm
1- select a pair of examples (continued):
Define the following index sets:
$I_0 = \{i : \alpha_i \in (0, C_i)\}$
$I_{+,0} = \{i : \alpha_i = 0, y_i = 1\}$, $I_{+,C} = \{i : \alpha_i = C_i, y_i = 1\}$
$I_{-,0} = \{i : \alpha_i = 0, y_i = -1\}$, $I_{-,C} = \{i : \alpha_i = C_i, y_i = -1\}$
Max difference: $\max_{i \in I_0 \cup I_{+,0} \cup I_{-,C}} f(x_i) - y_i$
Min difference: $\min_{j \in I_0 \cup I_{-,0} \cup I_{+,C}} f(x_j) - y_j$
SMO algorithm complexity
● Memory complexity: no additional matrix is
required to solve the problem; only a 2×2 matrix
is needed in each iteration.
● Memory complexity is linear in the training set
size.
● SMO's running time scales between linearly and
quadratically in the training set size.
