VC Dimension in Machine Learning
Dr. Varun Kumar
Dr. Varun Kumar Lecture 18 1 / 10
Outline
1 General Classification Problem
2 Usage of VC dimension in ML
3 Introduction to Vapnik-Chervonenkis (VC) Dimension
4 How to Determine VC Dimension for a Given Classifier or Hypothesis?
5 References
General classification problem
1 Always examine the test error alongside the training error.
2 Reducing the training error does not necessarily reduce the test error.
3 Increasing the capacity of the learning machine may degrade test performance.
Is there an equation that relates the training error to the test error?
Usage of VC dimension in ML
Model complexity determines the performance/cost on both the training and test sets.
$$P\left(\text{Test error} \le \text{Training error} + \sqrt{\frac{h\left(\log(2N/h) + 1\right) - \log(\eta/4)}{N}}\right) \ge 1 - \eta$$
Note: The expression above gives an upper bound on the test error that holds with probability at least 1 − η.
h → VC dimension
  h measures the power (capacity) of the classifier
  h does not depend on the choice of training set
N → total number of training samples
To reduce the residual (the square-root term), make h low or N high.
Test error ≤ Training error + Penalty(Complexity)
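Continued
⇒ Assume the training data are drawn iid from some distribution f_X(x).
⇒ Types of risk
(i) Risk R(θ) → long-term observation → test observation
$$R(\theta) = \text{Test error} = E\left[\delta\left(c \ne \hat{c}(x; \theta)\right)\right]$$
(ii) Empirical risk R_emp(θ) → finite-sample observation → training observation
$$R_{\text{emp}}(\theta) = \text{Training error} = \frac{1}{m} \sum_{i=1}^{m} \delta\left(c^{(i)} \ne \hat{c}(x^{(i)}; \theta)\right)$$
Here δ(·) is 1 when its argument is true and 0 otherwise, and m is the number of training samples.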
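The empirical risk is simply the fraction of training samples that the classifier labels incorrectly. A minimal sketch follows, with a hypothetical fixed threshold classifier and made-up data used purely for illustration.

```python
from typing import Callable, Sequence


def empirical_risk(
    classifier: Callable[[float], int],
    xs: Sequence[float],
    labels: Sequence[int],
) -> float:
    """Average 0-1 loss: (1/m) * sum_i [c_i != classifier(x_i)]."""
    if len(xs) != len(labels) or len(xs) == 0:
        raise ValueError("xs and labels must be non-empty and of equal length")
    mistakes = sum(1 for x, c in zip(xs, labels) if classifier(x) != c)
    return mistakes / len(xs)


def threshold_classifier(x: float) -> int:
    """Hypothetical classifier: predicts class 1 when x >= 0.5."""
    return 1 if x >= 0.5 else 0


# Illustrative data, not from the slides.
xs = [0.1, 0.4, 0.6, 0.9]
labels = [0, 1, 1, 1]
risk = empirical_risk(threshold_classifier, xs, labels)
print(risk)  # one of four samples is misclassified
```

The true risk R(θ) replaces this finite average with an expectation over the data distribution, so it can only be estimated, for example on a held-out test set.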
Introduction to Vapnik-Chervonenkis (VC) Dimension
Key features:
⇒ VC dimension is a measure of the capacity (complexity, expressive power, richness, or flexibility) of a set of functions.
⇒ The set of functions is the one that can be learned by a statistical binary classification algorithm.
⇒ It is defined as the cardinality of the largest set of points that the algorithm can shatter.
Cardinality refers to the size of a set. Example: A = {1, 4, 6}, cardinality |A| = 3
⇒ The capacity of a classification model is related to how complicated it can be. → Overfitting
VC dimension of a set-family
Let H be a set family (a set of sets) and C a set. Their intersection is the set family
H ∩ C := {h ∩ C | h ∈ H}.
C is shattered by H if H ∩ C contains every subset of C, that is, |H ∩ C| = 2^|C|. The VC dimension of H is the largest cardinality of a set shattered by H.
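For a small finite set family, the intersection H ∩ C and the shattering test can be checked by brute force. The sketch below is an illustration under stated assumptions: the helper names `restrict`, `is_shattered` and `vc_dimension` are hypothetical, H is non-empty, and the example family (all contiguous integer ranges inside {1, 2, 3, 4}, plus the empty set) is chosen only for demonstration.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Set


def restrict(family: Iterable[Iterable[int]], c: Iterable[int]) -> Set[FrozenSet[int]]:
    """Compute H ∩ C = {h ∩ C | h in H} as a set of frozensets."""
    c_set = frozenset(c)
    return {frozenset(h) & c_set for h in family}


def is_shattered(family: Iterable[Iterable[int]], c: Iterable[int]) -> bool:
    """C is shattered when H ∩ C contains all 2^|C| subsets of C."""
    c_set = frozenset(c)
    return len(restrict(family, c_set)) == 2 ** len(c_set)


def vc_dimension(family: list, ground_set: Iterable[int]) -> int:
    """Largest size of a shattered subset of the ground set (brute force)."""
    ground = sorted(set(ground_set))
    best = 0
    for size in range(1, len(ground) + 1):
        if any(is_shattered(family, subset) for subset in combinations(ground, size)):
            best = size
        else:
            # Subsets of a shattered set are shattered, so larger sizes fail too.
            break
    return best


# Example family: the empty set and every contiguous range {a, ..., b} in {1,2,3,4}.
ground = [1, 2, 3, 4]
intervals = [set()] + [set(range(a, b + 1)) for a in ground for b in ground if a <= b]

print(is_shattered(intervals, {1, 3}))     # both endpoints can be picked independently
print(is_shattered(intervals, {1, 2, 3}))  # {1, 3} without 2 is not a contiguous range
print(vc_dimension(intervals, ground))
```

The exhaustive search is exponential in the size of the ground set, so it is only practical for toy examples like this one.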
Relationship between risk and model complexity
How to determine VC dimension for a given classifier or hypothesis?
1 General position:
Statement: In an n-dimensional feature space, a set of m points (m > n) is in general position if and only if no subset of (n + 1) points lies on an (n − 1)-dimensional hyperplane.
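In the plane (n = 2) the condition reduces to "no three points lie on a common line", which is easy to test. The sketch below covers only this two-dimensional special case; the function names are hypothetical, and the exact zero test assumes integer (or otherwise exact) coordinates, since floating-point inputs would need a tolerance.

```python
from itertools import combinations
from typing import Sequence, Tuple

Point = Tuple[int, int]


def collinear(p: Point, q: Point, r: Point) -> bool:
    """Three points are collinear when the cross product of (q - p) and (r - p) is zero."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return cross == 0


def in_general_position_2d(points: Sequence[Point]) -> bool:
    """True when no subset of n + 1 = 3 points lies on a line (a 1-dimensional hyperplane)."""
    return not any(collinear(p, q, r) for p, q, r in combinations(points, 3))


square = [(0, 0), (1, 0), (0, 1), (1, 1)]
diagonal = [(0, 0), (1, 1), (2, 2), (0, 1)]
print(in_general_position_2d(square))    # no three corners of a square are collinear
print(in_general_position_2d(diagonal))  # (0,0), (1,1), (2,2) share a line
```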
2 Shattering:
Statement: A hypothesis class H shatters a set of m points in n-dimensional space if, for every one of the 2^m possible labelings of those points, some hypothesis in H classifies all m points correctly.
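Shattering can be checked directly for simple one-dimensional hypothesis classes by enumerating the labelings they realise. The sketch below is an assumption-laden illustration: `threshold` and `interval` are hypothetical helper classes, and each infinite class is approximated by a finite grid of parameters that is fine enough for the chosen points.

```python
from typing import Callable, List, Sequence

Hypothesis = Callable[[float], int]


def shatters(hypotheses: Sequence[Hypothesis], points: Sequence[float]) -> bool:
    """True when the hypotheses realise all 2^m labelings of m distinct points."""
    realised = {tuple(h(x) for x in points) for h in hypotheses}
    return len(realised) == 2 ** len(points)


def threshold(t: float) -> Hypothesis:
    """Labels x as 1 when x >= t."""
    return lambda x: int(x >= t)


def interval(a: float, b: float) -> Hypothesis:
    """Labels x as 1 when a <= x <= b (empty when a > b)."""
    return lambda x: int(a <= x <= b)


# Finite parameter grid standing in for the real line: -1.0, -0.5, ..., 4.0.
grid: List[float] = [i / 2 for i in range(-2, 9)]
thresholds = [threshold(t) for t in grid]
intervals = [interval(a, b) for a in grid for b in grid]

print(shatters(thresholds, [1.0]))            # one point: both labels are reachable
print(shatters(thresholds, [1.0, 2.0]))       # labeling (1, 0) is impossible
print(shatters(intervals, [1.0, 2.0]))        # all four labelings are reachable
print(shatters(intervals, [1.0, 2.0, 3.0]))   # labeling (1, 0, 1) is impossible
```

These checks are consistent with the standard results that thresholds on the line have VC dimension 1 and intervals have VC dimension 2.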
References
E. Alpaydin, Introduction to machine learning. MIT Press, 2020.
T. M. Mitchell, The discipline of machine learning. Carnegie Mellon University, School of Computer Science, Machine Learning, 2006, vol. 9.
J. Grus, Data science from scratch: first principles with Python. O'Reilly Media, 2019.
