Artificial intelligence
PRESENTED BY
A. KEERTHIKA
M.Sc. (IT)
Suppose that we observe a response Y and p different
predictors X = (X₁, X₂, …, Xp). In general, we can say:
 Y = f(X) + ε
Here f is an unknown function, and ε is the random error term.
 In essence, statistical learning refers to a set of
approaches for estimating f.
 In cases where a set of X values is readily available but
the output Y is not, and since the error averages to zero,
we can say:
 Ŷ = f̂(X)
 where f̂ represents our estimate of f and Ŷ represents the
resulting prediction.
 Hence, for a given set of predictors X, we can show:
 E(Y − Ŷ)² = E[f(X) + ε − f̂(X)]² = [f(X) − f̂(X)]² + Var(ε)
where:
 E(Y − Ŷ)² represents the expected value of the squared
difference between the actual and the predicted result.
 [f(X) − f̂(X)]² represents the reducible error. It is
reducible because we can potentially improve the accuracy
of f̂ through better modeling.
 Var(ε) represents the irreducible error. It is irreducible
because no matter how well we estimate f, we cannot
reduce the error introduced by the variance of ε.
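The decomposition above can be checked with a small simulation; the true function, the deliberately imperfect estimate f̂, and the noise level below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):                      # the (normally unknown) true function
    return np.sin(x)

def f_hat(x):                  # a deliberately imperfect estimate of f
    return 0.9 * np.sin(x) + 0.1

sigma = 0.5                    # std. dev. of the irreducible noise ε
x = rng.uniform(0, 2 * np.pi, 100_000)
y = f(x) + rng.normal(0, sigma, x.size)

mse = np.mean((y - f_hat(x)) ** 2)           # E(Y - Ŷ)²
reducible = np.mean((f(x) - f_hat(x)) ** 2)  # average of [f(X) - f̂(X)]²
irreducible = sigma ** 2                     # Var(ε)

# The two sides of the decomposition agree up to sampling noise
print(mse, reducible + irreducible)
```

However good f̂ becomes, the simulated error never drops below Var(ε), which is the point of the decomposition.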
 Response variables, Y, can be broadly characterised
as quantitative or qualitative (also known
as categorical). Quantitative variables take on
numerical values, e.g., age, height, income, and
price. Estimating quantitative responses is often
termed a regression problem. Qualitative
variables take on categorical values, e.g., gender,
brand, and part of speech. Estimating
qualitative responses is often termed
a classification problem.
 Variance refers to the amount by which f̂ would change if
we estimated it using a different training data set. In general,
when we over-fit a model to a given training data
set (the reducible error on the training set is very low but on
the test set is very high), we get a model with high variance, since
any change in the data points would result in a
significantly different model.
 Bias refers to the error introduced by approximating a real-
life problem, which may be extremely complicated, by a
much simpler model: for example, modeling a non-linear
problem with a linear model. In general, over-fitting
a model to a given data set results in very low bias.
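This trade-off can be illustrated by fitting polynomials of different flexibility to many resampled noisy training sets; the true function, noise level, and degrees below are arbitrary choices for the sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
x_train = np.linspace(0, 1, 20)
x0 = 0.25                                  # point at which predictions are compared

def true_f(x):
    return np.sin(2 * np.pi * x)

def predictions_at_x0(degree, n_datasets=500):
    """Fit a polynomial of the given degree to many freshly resampled
    noisy training sets; collect each fit's prediction at x0."""
    preds = []
    for _ in range(n_datasets):
        y = true_f(x_train) + rng.normal(0, 0.3, x_train.size)
        coeffs = np.polyfit(x_train, y, degree)
        preds.append(np.polyval(coeffs, x0))
    return np.array(preds)

simple = predictions_at_x0(degree=1)       # under-fit: high bias, low variance
flexible = predictions_at_x0(degree=9)     # over-fit: low bias, high variance

print("degree 1:  bias", abs(simple.mean() - true_f(x0)), "var", simple.var())
print("degree 9:  bias", abs(flexible.mean() - true_f(x0)), "var", flexible.var())
```

The flexible fit tracks each noisy training set closely, so its prediction at x0 swings from data set to data set (high variance), while the straight line barely moves but is systematically wrong (high bias).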
Linear regression is a supervised statistical learning
method used for predicting quantitative responses.
 The Simple Linear Regression approach predicts a
quantitative response Ŷ based on a single variable X,
assuming a linear relationship. We can say:
 Ŷ ≈ β₀ + β₁X
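The least-squares estimates of β₀ and β₁ have a closed form; here is a sketch on synthetic data with arbitrarily chosen true coefficients:

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.uniform(0, 10, 200)
y = 2.0 + 3.0 * x + rng.normal(0, 1.0, x.size)   # true β₀ = 2, β₁ = 3, plus noise

# Closed-form least-squares estimates for simple linear regression
beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0 = y.mean() - beta1 * x.mean()

y_hat = beta0 + beta1 * x                        # Ŷ ≈ β₀ + β₁X
print(beta0, beta1)
```

With enough data the estimates land close to the true coefficients used to generate the sample.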
Potential problems when fitting a linear regression model include:
 Non-constant variance of the error terms (heteroscedasticity).
 Outliers: points where the actual response is very far from the
predicted one; these can arise from inaccurate recording of
data.
 High-leverage points: unusual values of the predictors that
strongly impact the regression line.
 Collinearity: when two or more predictor variables are
closely related to each other, it may be challenging to separate
out the individual effect of a single predictor variable.
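Two of these problems can be checked numerically. A sketch on synthetic data, using the diagonal of the hat matrix to flag high-leverage points and pairwise correlation to flag collinearity:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50
x1 = rng.normal(0, 1, n)
x1[0] = 8.0                           # plant one high-leverage point
x2 = x1 + rng.normal(0, 0.05, n)      # nearly collinear with x1

X = np.column_stack([np.ones(n), x1, x2])   # design matrix with intercept

# Leverage: diagonal of the hat matrix H = X (XᵀX)⁻¹ Xᵀ
H = X @ np.linalg.pinv(X.T @ X) @ X.T
leverage = np.diag(H)
print("max leverage:", leverage.max(), "at index", leverage.argmax())

# Collinearity: predictor correlation close to ±1 is a warning sign
corr = np.corrcoef(x1, x2)[0, 1]
print("corr(x1, x2):", corr)
```

The planted point dominates the leverage values, and the near-duplicate predictors show a correlation close to 1, which is exactly what makes their individual effects hard to separate.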
 KNN Regression is a non-parametric approach to
estimating or predicting values that does not assume
a form for f(X). It estimates f̂(x₀), where x₀ is
a prediction point, by averaging the responses in N₀,
the neighborhood of the K training points
closest to x₀. We can say:
 f̂(x₀) = (1/K) Σ over xᵢ ∈ N₀ of yᵢ
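A minimal KNN regression sketch following this definition (the toy data and the choice K = 3 are arbitrary):

```python
import numpy as np

def knn_regress(x_train, y_train, x0, k=3):
    """Predict f̂(x0) as the average response of the k training
    points closest to x0 (Euclidean distance in 1-D)."""
    dist = np.abs(x_train - x0)
    nearest = np.argsort(dist)[:k]        # indices of the k nearest neighbours
    return y_train[nearest].mean()

x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_train = np.array([0.0, 1.0, 4.0, 9.0, 16.0])   # y = x², no noise

pred = knn_regress(x_train, y_train, x0=2.1, k=3)
# the 3 nearest points to 2.1 are x = 2, 3, 1 → mean of (4, 9, 1) = 14/3
print(pred)
```

No functional form for f is assumed anywhere; the prediction is driven entirely by the local neighborhood.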
 The responses, as discussed so far, may not always
be quantitative; they can also be qualitative. Predicting these
qualitative responses is called classification.
 We will discuss various statistical approaches to classification
including:
 SVM
 Logistic Regression
 KNN Classifier
 GAM
 Trees
 Random Forest
 Boosting
 SVM, or support vector machine, is the classifier that
maximizes the margin. The goal of the classifier is to
find a line or, more generally, an (n−1)-dimensional
hyperplane that separates the two classes present in
the n-dimensional space. I have written a
detailed article explaining the derivation and
formulation of SVM. In my opinion, it is one of the
most powerful techniques in our toolbox of statistical
methods in AI.
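The margin-maximizing idea can be sketched with sub-gradient descent on the hinge loss, i.e. a soft-margin linear SVM in primal form; the data and hyper-parameters below are arbitrary choices, not the full derivation:

```python
import numpy as np

rng = np.random.default_rng(3)

# Two linearly separable 2-D classes, labels in {-1, +1}
X = np.vstack([rng.normal([2, 2], 0.5, (50, 2)),
               rng.normal([-2, -2], 0.5, (50, 2))])
y = np.hstack([np.ones(50), -np.ones(50)])

w = np.zeros(2)
b = 0.0
lam = 0.01          # regularization strength (controls margin width)
lr = 0.1            # learning rate

for epoch in range(200):
    for i in rng.permutation(len(y)):
        margin = y[i] * (X[i] @ w + b)
        if margin < 1:                       # inside margin: hinge-loss sub-gradient
            w += lr * (y[i] * X[i] - lam * w)
            b += lr * y[i]
        else:                                # outside margin: only shrink w
            w -= lr * lam * w

accuracy = np.mean(np.sign(X @ w + b) == y)
print("training accuracy:", accuracy)
```

Shrinking w while keeping every margin above 1 is what pushes the separating hyperplane toward the maximum-margin solution.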
 KNN (K nearest neighbors) Classifier is a lazy learning
technique, where the training data set is represented in a
Euclidean space and each test point is assigned a label
based on its K nearest neighbors by Euclidean distance.
 Practical Aspects
 K should be chosen empirically, and preferably odd, to
avoid ties.
 KNN can handle both discrete and continuous target
functions.
 A weighted contribution (e.g., distance-based) from the
different neighbors can be used when computing the final label.
 Advantages of KNN
 We can learn a complex target function.
 No information is lost, since the training data is retained in full.
 Disadvantages of KNN
 The cost of classifying new instances is very high.
 Significant computation takes place at classification
time rather than at training time.
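A minimal KNN classifier sketch using Euclidean distance and a simple majority vote (the toy points and K are arbitrary):

```python
import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, x0, k=3):
    """Assign x0 the majority label among its k nearest
    training points by Euclidean distance."""
    dist = np.linalg.norm(X_train - x0, axis=1)
    nearest = np.argsort(dist)[:k]
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y_train = ["red", "red", "red", "blue", "blue", "blue"]

print(knn_classify(X_train, y_train, np.array([0.5, 0.5])))   # → red
print(knn_classify(X_train, y_train, np.array([5.5, 5.5])))   # → blue
```

Note that all the work, including the distance computation, happens at prediction time, which is exactly the "lazy learning" cost mentioned above.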
All the above methods relied on some form of annotated data
set. When we want to learn patterns in our data
without any annotations, unsupervised learning comes
into the picture.
The most widely used statistical method for
unsupervised learning is K-Means Clustering. We take
K random points in our data set and map every other
point to one of K regions based on its closeness
to the chosen points. Then we move each of the K
points to the centroid of the cluster thus
formed. We repeat this until we observe a negligible
change in the clusters after each iteration.
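The iteration described above can be sketched as follows (the data and initialization are arbitrary; a production implementation would also handle empty clusters and multiple restarts):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]   # k random data points
    for _ in range(n_iter):
        # Assign each point to its closest centroid
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Move each centroid to the mean of its assigned points
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):          # negligible change: stop
            break
        centroids = new_centroids
    return centroids, labels

# Two well-separated blobs → two recovered centers
rng = np.random.default_rng(5)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(5, 0.3, (50, 2))])
centroids, labels = kmeans(X, k=2)
print(np.sort(centroids[:, 0]))
```

On well-separated blobs like these, the recovered centroids land near the blob centers regardless of which random points start the iteration.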