Hichem Felouat - hichemfel@gmail.com - Algeria 1
The Fundamentals of Machine Learning
Hichem Felouat
hichemfel@gmail.com
What Is Artificial Intelligence?
Artificial intelligence (AI) is an area of computer science
that emphasizes the creation of intelligent machines that work
and react like humans.
• AI is an interdisciplinary science with multiple approaches.
• AI has become an essential part of the technology industry.
Subdomains of Artificial Intelligence
What Is Machine Learning?
• Machine Learning is the science (and
art) of programming computers so
they can learn from data.
• Machine Learning is the field of
study that gives computers the ability
to learn without being explicitly
programmed. —Arthur Samuel, 1959
What Does Learning Mean?
A computer program is said to
learn from experience E with
respect to some task T and some
performance measure P, if its
performance on T, as measured by
P, improves with experience E. —
Tom Mitchell, 1997
Timeline of Machine Learning
Why Use Machine Learning?
The traditional approach. If the problem is not trivial, your program will
likely become a long list of complex rules that is hard to maintain.
Why Use Machine Learning?
Machine Learning approach. The program is much shorter, easier to
maintain, and most likely more accurate.
Why Use Machine Learning?
Machine Learning can help humans learn.
Why Use Machine Learning?
AI Index 2019 Annual Report.
Applications of Machine Learning
Machine learning is currently the preferred approach in the following
domains:
1) Speech analysis: e.g., speech recognition, synthesis.
2) Computer vision: e.g., object recognition/detection.
3) Robotics: e.g., position/map estimation.
4) Bio-informatics: e.g., sequence alignment, genetic analysis.
5) E-commerce: e.g., automatic trading, fraud detection.
6) Financial analysis: e.g., portfolio allocation, credits.
7) Medicine: e.g., diagnosis, therapy conception.
8) Web: e.g., Content management, social networks, etc.
Applications of Machine Learning
To summarize, Machine Learning is great for:
• Problems for which existing solutions require a lot of hand-tuning or
long lists of rules: one Machine Learning algorithm can often simplify
code and perform better.
• Complex problems for which there is no good solution at all using a
traditional approach: the best Machine Learning techniques can find a
solution.
How to get started with ML
1) Mathematics: statistics, probability, and
linear algebra (NumPy, SciPy).
2) Programming: data structures, OOP, and
parallel programming (Python).
3) Databases: SQL and NoSQL.
4) ML algorithms: regression, classification,
and clustering.
5) ML tools: scikit-learn, TensorFlow, and
Keras.
Machine Learning Vocabulary 1
1) Examples: Items or instances of data used for learning or evaluation. In our
spam problem, these examples correspond to the collection of email
messages we will use for learning and testing.
2) Training sample: Examples used to train a learning algorithm. In our spam
problem, the training sample consists of a set of email examples along with
their associated labels.
3) Labels: Values or categories assigned to examples. In classification
problems, examples are assigned specific categories, for instance, the spam
and non-spam categories in our binary classification problem. In regression,
items are assigned real-valued labels.
Machine Learning Vocabulary 2
4) Features: The set of attributes, often represented as a vector, associated with an example. In
the case of email messages, some relevant features may include the length of the message,
the name of the sender, various characteristics of the header, the presence of certain
keywords in the body of the message, and so on.
5) Test sample: Examples used to evaluate the performance of a learning algorithm. The test
sample is separate from the training and validation data and is not made available in the
learning stage. In the spam problem, the test sample consists of a collection of email
examples for which the learning algorithm must predict labels based on features. These
predictions are then compared with the labels of the test sample to measure the performance
of the algorithm.
6) Loss function: A function that measures the difference, or loss, between a
predicted label and a true label.
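The vocabulary above can be made concrete with a minimal sketch of the spam problem. The two emails, the keyword list, and the choice of a zero-one loss are all illustrative assumptions, not part of the original slides.

```python
# Hypothetical examples: (email text, label), where 1 = spam and 0 = non-spam.
emails = [
    ("Win a free prize now", 1),
    ("Meeting moved to 3pm", 0),
]

SPAM_KEYWORDS = ("free", "prize", "win")  # assumed keyword list


def extract_features(text):
    """Map one example to its feature vector: [message length, keyword count]."""
    words = text.lower().split()
    keyword_hits = sum(word in SPAM_KEYWORDS for word in words)
    return [len(text), keyword_hits]


def zero_one_loss(predicted_label, true_label):
    """Loss is 0 for a correct prediction and 1 for a wrong one."""
    return 0 if predicted_label == true_label else 1


features = [extract_features(text) for text, _ in emails]
labels = [label for _, label in emails]
```

Each row of `features` is the feature vector of one example, and `labels` holds the matching labels; a learning algorithm would be trained on such pairs and scored with the loss.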
Types of Machine Learning Systems
There are so many different types of Machine Learning systems that it is
useful to classify them in broad categories based on:
• Whether or not they are trained with human supervision (supervised,
unsupervised, semi-supervised, and Reinforcement Learning).
• Whether or not they can learn incrementally on the fly (online versus
batch learning).
• Whether they work by simply comparing new data points to known data
points, or instead detect patterns in the training data and build a
predictive model, much like scientists do (instance-based versus model-
based learning).
Types of Machine Learning Systems
Supervised learning:
In supervised learning, the training data you feed to the
algorithm includes the desired solutions, called labels.
• When the label y is real-valued, we talk about regression.
• When the label y is discrete, we talk about classification.
Types of Machine Learning Systems
A labeled training set for supervised learning.
Types of Machine Learning Systems
Here are some of the most important supervised
learning algorithms:
• k-Nearest Neighbors
• Linear Regression
• Logistic Regression
• Support Vector Machines (SVMs)
• Decision Trees and Random Forests
• Neural networks*
Types of Machine Learning Systems
Unsupervised Learning:
In unsupervised learning, as you might guess, the training data is
unlabeled. The system tries to learn without a teacher.
No labels are given to the learning algorithm, leaving it on its own to
explore or find structure in the data.
Types of Machine Learning Systems
An unlabeled training set for unsupervised learning.
Types of Machine Learning Systems
Here are some of the most important unsupervised
learning tasks:
• Clustering
• Visualization and dimensionality reduction
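As one illustration of clustering, here is a minimal sketch of k-means (Lloyd's algorithm) on one-dimensional data. The slides do not name a specific clustering algorithm, so k-means, the toy points, and the initial centers are assumptions made for the example.

```python
def k_means_1d(points, centers, iterations=10):
    """Lloyd's algorithm on 1-D data with caller-supplied initial centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.5]  # unlabeled toy data
centers, clusters = k_means_1d(points, centers=[0.0, 5.0])
```

No labels are used anywhere: the two groups emerge only from the distances between points.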
Types of Machine Learning Systems
Semi-Supervised Learning:
Some algorithms can deal with partially labeled training data,
usually a lot of unlabeled data and a little bit of labeled data. This
is called semi-supervised learning.
Most semi-supervised learning algorithms are combinations of
unsupervised and supervised algorithms.
Types of Machine Learning Systems
Reinforcement Learning:
• The learning system, called an agent in this context, can observe the
environment, select and perform actions, and get rewards in return (or
penalties in the form of negative rewards).
• It must then learn by itself what is the best strategy, called a policy, to get
the most reward over time.
• A policy defines what action the agent should choose when it is in a given
situation.
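The agent, reward, and policy ideas can be sketched with tabular Q-learning on a tiny corridor. Q-learning itself, the five-cell environment, and the values of `GAMMA` and `ALPHA` are assumptions chosen for illustration; the slides do not prescribe a particular algorithm.

```python
import random

N_STATES = 5          # corridor cells 0..4; reaching cell 4 ends the episode
ACTIONS = (-1, +1)    # move left or move right
GAMMA, ALPHA = 0.9, 0.5  # discount factor and learning rate (assumed values)


def step(state, action):
    """Environment: deterministic move, reward 1 only on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def q_learning(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            a = rng.randrange(len(ACTIONS))  # the agent explores at random
            next_state, reward = step(state, ACTIONS[a])
            target = reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


q = q_learning()
# Policy: in each non-terminal state, choose the action with the highest value.
policy = [
    ACTIONS[max(range(len(ACTIONS)), key=lambda a: q[s][a])]
    for s in range(N_STATES - 1)
]
```

The learned policy maps every state to "move right", which is the strategy that collects the reward soonest.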
Types of Machine Learning Systems
Reinforcement Learning
Types of Machine Learning Systems
Batch learning:
In batch learning, the system is incapable of learning
incrementally: it must be trained using all the available
data. This will generally take a lot of time and computing
resources, so it is typically done offline. First, the system is
trained, and then it is launched into production and runs
without learning anymore; it just applies what it has learned.
This is called offline learning.
Types of Machine Learning Systems
Online learning:
In online learning, you train the system incrementally by
feeding it data instances sequentially, either individually
or in small groups called mini-batches. Each learning step is
fast and cheap, so the system can learn about new data
on the fly, as it arrives.
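A minimal sketch of one such incremental learning step, assuming a linear model trained by stochastic gradient descent on a squared error. The data stream, the learning rate, and the model form are hypothetical choices for the example.

```python
def online_update(weights, x, y, learning_rate=0.1):
    """One cheap learning step on a single instance (x, y) for y = w0 + w1 * x."""
    w0, w1 = weights
    error = (w0 + w1 * x) - y
    return (w0 - learning_rate * error, w1 - learning_rate * error * x)


def stream():
    """Hypothetical data stream following y = 2x + 1."""
    for i in range(2000):
        x = (i % 5) / 4.0  # x cycles through 0, 0.25, 0.5, 0.75, 1
        yield x, 2 * x + 1


weights = (0.0, 0.0)
for x, y in stream():
    # The model is updated as each instance arrives; no instance is stored.
    weights = online_update(weights, x, y)
```

After the stream has been consumed, `weights` is close to `(1, 2)`, the intercept and slope that generated the data.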
Types of Machine Learning Systems
Online learning
Instance-Based VS Model-Based Learning
One more way to categorize Machine Learning systems is by how
they generalize. Most Machine Learning tasks are about making
predictions. This means that given a number of training examples,
the system needs to be able to generalize to examples it has never
seen before.
Having a good performance measure on the training data is good,
but insufficient; the true goal is to perform well on new instances.
There are two main approaches to generalization: instance-based
learning and model-based learning.
Instance-Based VS Model-Based Learning
Instance-based learning:
The system learns the examples by heart, then generalizes to new
cases using a similarity measure.
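A minimal sketch of instance-based learning, using k-nearest neighbors with Euclidean distance as the similarity measure. The toy training set and the choice of k = 3 are assumptions for illustration.

```python
import math
from collections import Counter


def knn_predict(training_set, query, k=3):
    """Classify `query` by majority vote among its k nearest stored examples."""
    # "Learning by heart": the model is simply the stored training set.
    by_distance = sorted(training_set, key=lambda ex: math.dist(ex[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


training_set = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"), ((7.0, 5.5), "B"), ((6.5, 7.0), "B"),
]

near_a = knn_predict(training_set, (1.2, 1.4))
near_b = knn_predict(training_set, (6.4, 6.1))
```

There is no training phase at all: all the work happens at prediction time, when the new case is compared with the stored examples.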
Instance-Based VS Model-Based Learning
Model-based learning:
The system builds a model of the examples, then uses that model to make
predictions.
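A minimal sketch of model-based learning, assuming the model is a straight line fitted by ordinary least squares. The toy data points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    return mean_y - slope * mean_x, slope


xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]
intercept, slope = fit_line(xs, ys)  # training: estimate the model parameters


def predict(x):
    """Prediction uses only the two learned parameters, not the examples."""
    return intercept + slope * x
```

Once `intercept` and `slope` are learned, the training examples can be discarded; this is the main contrast with instance-based learning.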
Loss Function
The loss function computes the error for a single training
example, while the cost function is the average of the losses
over the entire training set.
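The distinction can be sketched in a few lines, assuming a squared-error loss (the slide does not fix a particular loss, so this choice is illustrative).

```python
def squared_loss(predicted, actual):
    """Loss: error on a single example."""
    return (predicted - actual) ** 2


def cost(predictions, targets):
    """Cost: average of the per-example losses over the whole set."""
    losses = [squared_loss(p, t) for p, t in zip(predictions, targets)]
    return sum(losses) / len(losses)


training_cost = cost([2.0, 4.0, 6.0], [1.0, 4.0, 8.0])  # losses are 1, 0, 4
```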
Machine Learning Vocabulary 3
• Hyperparameters: configuration variables that are external to the model
and whose values are not learned directly from the data during standard
model training. They are almost always specified by the machine learning
engineer prior to training.
• Regression: this is the problem of predicting a real value for each item.
Examples of regression include prediction of stock values or that of variations
of economic variables.
• Classification: this is the problem of assigning a category to each item.
• Clustering: this is the problem of partitioning a set of items into
homogeneous subsets.
In Summary
1) You studied the data.
2) You selected a model.
3) You trained it on the training data.
4) Finally, you applied the model to make predictions
on new cases.
Main Challenges of Machine Learning
In short, since your main task is to select
a learning algorithm and train it on some
data, the two things that can go wrong are
“bad data” and “bad algorithm”.
Main Challenges of Machine Learning
1- Database
1) Insufficient Quantity of Training Data:
It takes a lot of data for most Machine
Learning algorithms to work properly. Even for very simple
problems you typically need thousands of examples, and
for complex problems such as image or speech
recognition you may need millions of examples (unless
you can reuse parts of an existing model).
Main Challenges of Machine Learning
1- Database
2) Non-representative Training Data:
In order to generalize well, it is crucial that your training data be representative of
the new cases you want to generalize to. This is true whether you use instance-
based learning or model-based learning.
Main Challenges of Machine Learning
1- Database
3) Poor-Quality Data:
If your training data is full of errors, outliers, and noise (e.g., due to poor quality
measurements), it will make it harder for the system to detect the underlying patterns, so your
system is less likely to perform well. It is often well worth the effort to spend time cleaning up
your training data. The truth is, most data scientists spend a significant part of their time
doing just that. For example:
1) If some instances are clearly outliers, it may help to simply discard them or try to fix
the errors manually.
2) If some instances are missing a few features (e.g., 5% of your customers did not
specify their age), you must decide whether you want to ignore this attribute altogether,
ignore these instances, fill in the missing values (e.g., with the median age), or train
one model with the feature and one model without it, and so on.
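One of the options above, filling in missing values with the median, can be sketched as follows. The customer records and the helper name `impute_median` are hypothetical.

```python
from statistics import median

# Hypothetical customer records; None marks a missing age.
customers = [
    {"name": "a", "age": 25},
    {"name": "b", "age": None},
    {"name": "c", "age": 40},
    {"name": "d", "age": 31},
]


def impute_median(rows, key):
    """Return copies of `rows` with missing `key` values set to the median."""
    known = [row[key] for row in rows if row[key] is not None]
    fill = median(known)
    return [
        dict(row, **{key: fill if row[key] is None else row[key]})
        for row in rows
    ]


cleaned = impute_median(customers, "age")
```

The median should be computed on the training data only and then reused for new data, so that no information leaks from the test set.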
Main Challenges of Machine Learning
1- Database
4) Irrelevant Features:
Your system will only be capable of learning if the training data contains enough
relevant features and not too many irrelevant ones. A critical part of the success
of a Machine Learning project is coming up with a good set of features to train on.
This process, called feature engineering, involves:
1) Feature selection: selecting the most useful features to train on among
existing features.
2) Feature extraction: combining existing features to produce a more useful
one (dimensionality reduction algorithms can help).
3) Creating new features by gathering new data.
Main Challenges of Machine Learning
2- Algorithm
1) Overfitting the Training Data:
Overfitting happens when a model learns the detail and noise in the training
data to the extent that it negatively impacts the performance of the model on
new data. This means that the noise or random fluctuations in the training data
are picked up and learned as concepts by the model. The problem is that these
concepts do not apply to new data and negatively impact the model's ability to
generalize.
The model performs well on the training data, but it does not
generalize well.
Main Challenges of Machine Learning
2- Algorithm
2) Underfitting the Training Data:
Underfitting is the opposite of overfitting: it occurs
when your model is too simple to learn the
underlying structure of the data.
How to Avoid Underfitting and Overfitting
Underfitting:
• Use a more complex model
• Add more features
• Train longer
Overfitting:
• Use a validation set
• Perform regularization
• Get more data
• Remove/add some features
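Regularization can be illustrated with a minimal sketch, assuming an L2 (ridge) penalty on a one-parameter linear model through the origin. The penalty type, the model, and the data are assumptions; the slide only says "perform regularization".

```python
def fit_slope(xs, ys, l2_penalty=0.0):
    """Minimize sum((w * x - y)^2) + l2_penalty * w^2 for a line through the origin.

    Setting the derivative to zero gives w = sum(x * y) / (sum(x^2) + l2_penalty).
    """
    return sum(x * y for x, y in zip(xs, ys)) / (
        sum(x * x for x in xs) + l2_penalty
    )


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
unregularized = fit_slope(xs, ys)                 # plain least squares
regularized = fit_slope(xs, ys, l2_penalty=14.0)  # penalty shrinks the weight
```

A larger penalty pulls the weight toward zero, which constrains the model and makes it less able to fit noise; too large a penalty causes underfitting instead.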
Common Classification Model Evaluation
Metrics : Confusion Matrix
The confusion matrix is used to describe the performance of a
classification model on a set of test data for which true values are known.
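For binary classification, the confusion matrix reduces to four counts. The sketch below computes them from hypothetical true and predicted labels, with 1 assumed to be the positive class.

```python
def confusion_matrix(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["TP" if truth == positive else "FP"] += 1
        else:
            counts["FN" if truth == positive else "TN"] += 1
    return counts


y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # known labels of the test data
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # hypothetical model predictions
matrix = confusion_matrix(y_true, y_pred)
```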
Common Classification Model Evaluation
metrics : Main Metrics
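The figure for this slide is not available in the text. The sketch below assumes it showed the metrics most commonly derived from confusion-matrix counts: accuracy, precision, recall, and F1 score.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics computed from the four confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0  # correct among predicted positives
    recall = tp / (tp + fn) if tp + fn else 0.0     # found among actual positives
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


metrics = classification_metrics(tp=3, fp=1, tn=3, fn=1)
```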
Common Regression Model Evaluation
metrics : Mean Absolute Error
Common Regression Model Evaluation
metrics : Mean Square Error
Common Regression Model Evaluation
metrics : Mean Absolute Percentage Error
Common Regression Model Evaluation
metrics : Mean Percentage Error
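The formulas for the four regression metrics named above did not survive in this text. The sketch below uses their common textbook definitions, which is an assumption about what the slides showed; in particular, the sign convention for the mean percentage error (actual minus predicted) varies between sources.

```python
def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual, predicted):
    # Undefined when any actual value is zero.
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def mean_percentage_error(actual, predicted):
    # Signed errors can cancel out, so this measures bias rather than accuracy.
    return 100 * sum((a - p) / a for a, p in zip(actual, predicted)) / len(actual)


actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
mape = mean_absolute_percentage_error(actual, predicted)
mpe = mean_percentage_error(actual, predicted)
```

In this toy data one prediction is 10% too high and another 10% too low, so the mean percentage error is zero while the mean absolute percentage error is not.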
Testing and Validating
It is common to use 80% of the data for training and hold out 20% for
testing.
If the training error is low (i.e., your model makes few mistakes on the training
set) but the generalization error is high, it means that your model is overfitting the
training data.
A common solution to this problem is to have a second holdout set called the
validation set. You train multiple models with various hyperparameters using the
training set, you select the model and hyperparameters that perform best on the
validation set, and when you’re happy with your model you run a single final test
against the test set to get an estimate of the generalization error.
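A minimal sketch of the three-way split described above. The 20% validation ratio and the fixed seed are assumptions; the slide only states the 80/20 train/test convention.

```python
import random


def train_val_test_split(examples, test_ratio=0.2, val_ratio=0.2, seed=42):
    """Shuffle once, then cut the data into train, validation, and test sets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    n_val = int(len(shuffled) * val_ratio)
    test = shuffled[:n_test]                  # touched only once, at the very end
    val = shuffled[n_test:n_test + n_val]     # used to compare models
    train = shuffled[n_test + n_val:]         # used to fit each model
    return train, val, test


train, val, test = train_val_test_split(range(100))
```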
Testing and Validating : Cross-Validation
Cross-Validation (CV) : the training set is split into
complementary subsets, and each model is trained against
a different combination of these subsets and validated
against the remaining parts. Once the model type and
hyperparameters have been selected, a final model is
trained using these hyperparameters on the full training set,
and the generalization error is measured on the test set.
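The splitting into complementary subsets can be sketched as a k-fold index generator. The function name and the rule for distributing leftover examples are illustrative choices.

```python
def k_fold_indices(n_examples, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    # The first n_examples % k folds get one extra example.
    fold_sizes = [
        n_examples // k + (1 if i < n_examples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = [i for i in range(n_examples) if i < start or i >= start + size]
        yield train, validation
        start += size


folds = list(k_fold_indices(10, 3))
```

Each example is used for validation exactly once and for training in the other folds; in practice the data is usually shuffled before the folds are cut.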
Boosting
Boosting refers to any Ensemble method that can combine
several weak learners into a strong learner. The general
idea of most boosting methods is to train predictors
sequentially, each trying to correct its predecessor. There
are many boosting methods available, but by far the most
popular are AdaBoost (Adaptive Boosting) and Gradient
Boosting.
Boosting
AdaBoost sequential training with instance weight updates
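A minimal sketch of AdaBoost's sequential training with instance weight updates, assuming one-dimensional data, labels in {-1, +1}, and threshold "stumps" as the weak learners. All helper names and the toy data are hypothetical.

```python
import math


def stump_predict(stump, x):
    threshold, polarity = stump
    return polarity if x < threshold else -polarity


def best_stump(xs, ys, weights):
    """Return the (threshold, polarity) stump with the lowest weighted error."""
    points = sorted(set(xs))
    thresholds = (
        [points[0] - 0.5]
        + [(a + b) / 2 for a, b in zip(points, points[1:])]
        + [points[-1] + 0.5]
    )
    best, best_error = None, float("inf")
    for threshold in thresholds:
        for polarity in (+1, -1):
            stump = (threshold, polarity)
            error = sum(
                w for x, y, w in zip(xs, ys, weights) if stump_predict(stump, x) != y
            )
            if error < best_error:
                best, best_error = stump, error
    return best, best_error


def adaboost(xs, ys, rounds=3):
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []  # list of (alpha, stump) pairs
    for _ in range(rounds):
        stump, error = best_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)  # this predictor's say
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified instances, lower the others.
        weights = [
            w * math.exp(-alpha * y * stump_predict(stump, x))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def ensemble_predict(ensemble, x):
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]  # no single threshold separates these labels
ensemble = adaboost(xs, ys, rounds=3)
predictions = [ensemble_predict(ensemble, x) for x in xs]
```

No single stump can classify this data correctly, but the weighted combination of three stumps does, because each new stump concentrates on the instances its predecessors got wrong.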
Voting Classifiers
A voting classifier is a meta-classifier for combining similar or
conceptually different machine learning classifiers for classification via majority
or plurality voting. (For simplicity, we will refer to both majority and plurality voting
as majority voting.)
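A minimal sketch of majority voting, assuming each classifier is simply a function that returns a label. The three rule-based spam classifiers are hypothetical stand-ins for trained models.

```python
from collections import Counter


def majority_vote(classifiers, x):
    """Return the label predicted by the largest number of classifiers."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Three conceptually different (and deliberately crude) classifiers.
def by_keyword(text):
    return "spam" if "free" in text.lower() else "non-spam"


def by_length(text):
    return "spam" if len(text) > 40 else "non-spam"


def by_exclamation(text):
    return "spam" if text.count("!") >= 2 else "non-spam"


classifiers = [by_keyword, by_length, by_exclamation]
first = majority_vote(classifiers, "FREE tickets!! Claim now")
second = majority_vote(classifiers, "See you at lunch")
```

In the first message the length rule votes "non-spam", but it is outvoted by the other two classifiers.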
Dimensionality Reduction
Many Machine Learning problems involve thousands or even millions of features
for each training instance. Not only does this make training extremely slow, but it
can also make it much harder to find a good solution. This problem is often
referred to as the curse of dimensionality.
Principal Component Analysis
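A minimal sketch of the idea behind Principal Component Analysis: find the direction of greatest variance and project the data onto it. Power iteration on the covariance matrix is an assumed implementation choice for this illustration (libraries typically use a singular value decomposition), and the data is a toy example.

```python
import math


def first_principal_component(rows, iterations=50):
    """Estimate the top principal direction of `rows` by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    # Sample covariance matrix of the centered data.
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0] + [0.0] * (d - 1)  # arbitrary starting direction
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # the starting direction carries no variance; give up
            break
        v = [x / norm for x in w]
    # Reduced 1-D representation: projection of each centered row onto v.
    scores = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, scores


rows = [(2.0, 1.0), (4.0, 2.0), (6.0, 3.0)]  # points on the line y = x / 2
component, scores = first_principal_component(rows)
```

Here the two features are perfectly correlated, so a single number per point (`scores`) preserves all the variance of the original two-dimensional data.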
Hyperparameter Tuning
Hyperparameter tuning works by running multiple trials in a
single training job. Each trial is a complete execution of your training
application with values for your chosen hyperparameters, set within
limits you specify. The AI Platform training service keeps track of the
results of each trial and makes adjustments for subsequent trials.
When the job is finished, you can get a summary of all the trials along
with the most effective configuration of values according to the
criteria you specify.
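The trial-and-summary loop can be sketched locally as a small grid search. This is not the AI Platform service or its API; the model (a one-parameter ridge regression), the candidate values, and the data are all hypothetical, and the criterion is assumed to be the validation error.

```python
def run_trial(l2_penalty, train, validation):
    """One trial: a complete training run with one hyperparameter value."""
    xs, ys = train
    slope = sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + l2_penalty)
    val_xs, val_ys = validation
    # The trial's result: mean squared error on the validation set.
    return sum((slope * x - y) ** 2 for x, y in zip(val_xs, val_ys)) / len(val_xs)


def tune(candidates, train, validation):
    """Run one trial per candidate and report the most effective value."""
    results = {value: run_trial(value, train, validation) for value in candidates}
    best = min(results, key=results.get)
    return best, results


train = ([1.0, 2.0, 3.0], [2.2, 4.1, 6.3])
validation = ([4.0, 5.0], [8.0, 10.0])
best, results = tune([0.0, 0.5, 5.0], train, validation)
```

`results` plays the role of the summary of all trials, and `best` is the configuration with the lowest validation error. A grid search tries every candidate independently, whereas a managed service may adjust later trials based on earlier results.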
Steps to Build a Machine Learning System
1. Data collection.
2. Improving data quality (data preprocessing).
3. Feature engineering (feature extraction and
selection, dimensionality reduction).
4. Splitting data into training and evaluation sets.
5. Algorithm selection.
6. Training.
7. Evaluation + Hyperparameter tuning.
8. Testing.
9. Deployment.
Deep Learning
Deep Learning is a subfield of machine learning
concerned with algorithms inspired by the structure and
function of the brain, called artificial neural networks.
Deep Learning VS Machine Learning
Feature extraction
Feature engineering is, however, a tedious process for several
reasons: it takes a lot of time and requires expert knowledge.
For learning-based applications, a lot of time is spent adjusting the
features.
Extracted features often lack a structural representation reflecting
abstraction levels in the problem at hand.
Representation learning
Deep Learning aims at automatically learning
representations from large sets of labeled data:
• The machine is fed raw data.
• Automatic discovery of representations.
Deep learning models
Several DL models have been proposed:
• Autoencoders (AEs)
• Deep belief networks (DBNs)
• Convolutional neural networks (CNNs)
• Recurrent neural networks (RNNs)
• Generative adversarial networks (GANs), etc.
Convolutional neural networks (CNNs)
Thank you for your
attention

The fundamentals of Machine Learning

  • 1. Hichem Felouat - [email protected] - Algeria 1 The Fundamentals of Machine Learning Hichem Felouat [email protected]
  • 2. What Is Artificial Intelligence? Artificial intelligence (AI) is an area of computer science that emphasizes the creation of intelligent machines that work and react like humans. • AI is an interdisciplinary science with multiple approaches. • AI has become an essential part of the technology industry.
  • 3. Subdomains of Artificial Intelligence
  • 4. What Is Machine Learning? • Machine Learning is the science (and art) of programming computers so they can learn from data. • Machine Learning is the field of study that gives computers the ability to learn without being explicitly programmed. (Arthur Samuel, 1959)
  • 5. What Does Learning Mean? A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E. (Tom Mitchell, 1997)
  • 6. Timeline of Machine Learning
  • 7. Why Use Machine Learning? The traditional approach: if the problem is not trivial, your program will likely become a long list of complex rules that is hard to maintain.
  • 8. Why Use Machine Learning? The Machine Learning approach: the program is much shorter, easier to maintain, and most likely more accurate.
  • 9. Why Use Machine Learning? Machine Learning can help humans learn.
  • 10. Why Use Machine Learning? AI Index 2019 Annual Report.
  • 11. Applications of Machine Learning. Machine learning is currently the preferred approach in the following domains: 1) Speech analysis: e.g., speech recognition, synthesis. 2) Computer vision: e.g., object recognition/detection. 3) Robotics: e.g., position/map estimation. 4) Bioinformatics: e.g., sequence alignment, genetic analysis. 5) E-commerce: e.g., automatic trading, fraud detection. 6) Financial analysis: e.g., portfolio allocation, credits. 7) Medicine: e.g., diagnosis, therapy conception. 8) Web: e.g., content management, social networks, etc.
  • 12. Applications of Machine Learning. To summarize, Machine Learning is great for: • Problems for which existing solutions require a lot of hand-tuning or long lists of rules: one Machine Learning algorithm can often simplify code and perform better. • Complex problems for which there is no good solution at all using a traditional approach: the best Machine Learning techniques can find a solution.
  • 13. How to get started with ML. 1) Mathematics: statistics, probability, and linear algebra (NumPy, SciPy). 2) Programming: data structures, OOP, and parallel programming (Python). 3) Databases: SQL and NoSQL. 4) ML algorithms: regression, classification, and clustering. 5) ML tools: scikit-learn, TensorFlow, and Keras.
  • 14. How to get started with ML
  • 15. Machine Learning Vocabulary 1. 1) Examples: Items or instances of data used for learning or evaluation. In our spam problem, these examples correspond to the collection of email messages we will use for learning and testing. 2) Training sample: Examples used to train a learning algorithm. In our spam problem, the training sample consists of a set of email examples along with their associated labels. 3) Labels: Values or categories assigned to examples. In classification problems, examples are assigned specific categories, for instance, the spam and non-spam categories in our binary classification problem. In regression, items are assigned real-valued labels.
  • 16. Machine Learning Vocabulary 2. 4) Features: The set of attributes, often represented as a vector, associated with an example. In the case of email messages, some relevant features may include the length of the message, the name of the sender, various characteristics of the header, the presence of certain keywords in the body of the message, and so on. 5) Test sample: Examples used to evaluate the performance of a learning algorithm. The test sample is separate from the training and validation data and is not made available in the learning stage. In the spam problem, the test sample consists of a collection of email examples for which the learning algorithm must predict labels based on features. These predictions are then compared with the labels of the test sample to measure the performance of the algorithm. 6) Loss function: A function that measures the difference, or loss, between a predicted label and a true label.
  • 17. Types of Machine Learning Systems. There are so many different types of Machine Learning systems that it is useful to classify them in broad categories based on: • Whether or not they are trained with human supervision (supervised, unsupervised, semi-supervised, and Reinforcement Learning). • Whether or not they can learn incrementally on the fly (online versus batch learning). • Whether they work by simply comparing new data points to known data points, or instead detect patterns in the training data and build a predictive model, much like scientists do (instance-based versus model-based learning).
  • 18. Types of Machine Learning Systems
  • 19. Types of Machine Learning Systems. Supervised learning: In supervised learning, the training data you feed to the algorithm includes the desired solutions, called labels. • When the label y is real-valued, we talk about regression. • When the label y is discrete, we talk about classification.
  • 20. Types of Machine Learning Systems. A labeled training set for supervised learning.
  • 21. Types of Machine Learning Systems. Here are some of the most important supervised learning algorithms: • k-Nearest Neighbors • Linear Regression • Logistic Regression • Support Vector Machines (SVMs) • Decision Trees and Random Forests • Neural networks*
  • 22. Types of Machine Learning Systems. Unsupervised learning: In unsupervised learning, as you might guess, the training data is unlabeled. The system tries to learn without a teacher. No labels are given to the learning algorithm, leaving it on its own to explore or find structure in the data.
  • 23. Types of Machine Learning Systems. An unlabeled training set for unsupervised learning.
  • 24. Types of Machine Learning Systems. Here are some of the most important unsupervised learning algorithms: • Clustering • Visualization and dimensionality reduction
  • 25. Types of Machine Learning Systems. Semi-supervised learning: Some algorithms can deal with partially labeled training data, usually a lot of unlabeled data and a little bit of labeled data. This is called semi-supervised learning. Most semi-supervised learning algorithms are combinations of unsupervised and supervised algorithms.
  • 26. Types of Machine Learning Systems. Reinforcement Learning: • The learning system, called an agent in this context, can observe the environment, select and perform actions, and get rewards in return (or penalties in the form of negative rewards). • It must then learn by itself what the best strategy is, called a policy, to get the most reward over time. • A policy defines what action the agent should choose when it is in a given situation.
  • 27. Types of Machine Learning Systems. Reinforcement Learning
  • 28. Types of Machine Learning Systems. Batch learning: In batch learning, the system is incapable of learning incrementally: it must be trained using all the available data. This will generally take a lot of time and computing resources, so it is typically done offline. First, the system is trained, and then it is launched into production and runs without learning anymore; it just applies what it has learned. This is called offline learning.
  • 29. Types of Machine Learning Systems. Online learning: In online learning, you train the system incrementally by feeding it data instances sequentially, either individually or in small groups called mini-batches. Each learning step is fast and cheap, so the system can learn about new data on the fly, as it arrives.
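The incremental training described for online learning can be sketched in a few lines. This is a minimal illustration, not the slides' own code: the 1-D linear model, the `stream` generator, the learning rate, and the synthetic line y = 2x + 1 are all assumptions chosen to keep the example small.

```python
import random

def sgd_step(weight, bias, batch, learning_rate=0.1):
    """One cheap update of a 1-D linear model from a single mini-batch."""
    grad_w = grad_b = 0.0
    for x, y in batch:
        error = (weight * x + bias) - y
        grad_w += error * x
        grad_b += error
    n = len(batch)
    return weight - learning_rate * grad_w / n, bias - learning_rate * grad_b / n

def stream(n_batches, batch_size, rng):
    """Hypothetical data stream: noise-free points on the line y = 2x + 1."""
    for _ in range(n_batches):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(batch_size)]
        yield [(x, 2.0 * x + 1.0) for x in xs]

rng = random.Random(0)
weight, bias = 0.0, 0.0
for mini_batch in stream(n_batches=2000, batch_size=4, rng=rng):
    # The model is updated as each mini-batch arrives; old batches are not stored.
    weight, bias = sgd_step(weight, bias, mini_batch)

print(round(weight, 2), round(bias, 2))
```

Each call to `sgd_step` touches only the current mini-batch, which is what makes every learning step fast and cheap.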
  • 30. Types of Machine Learning Systems. Online learning
  • 31. Instance-Based vs. Model-Based Learning. One more way to categorize Machine Learning systems is by how they generalize. Most Machine Learning tasks are about making predictions. This means that given a number of training examples, the system needs to be able to generalize to examples it has never seen before. Having a good performance measure on the training data is good, but insufficient; the true goal is to perform well on new instances. There are two main approaches to generalization: instance-based learning and model-based learning.
  • 32. Instance-Based vs. Model-Based Learning. Instance-based learning: The system learns the examples by heart, then generalizes to new cases using a similarity measure.
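As an illustration of instance-based learning, here is a minimal k-nearest-neighbors sketch. It assumes Euclidean distance as the similarity measure and uses made-up two-feature data; the function name `knn_predict` is hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest stored examples."""
    # Similarity measure: Euclidean distance (smaller means more similar).
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data: two features per example, two classes.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

print(knn_predict(points, labels, (1.1, 1.0)))  # query near the first cluster
print(knn_predict(points, labels, (5.1, 5.0)))  # query near the second cluster
```

There is no training step: the stored examples are the model, and all the work happens at prediction time.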
  • 33. Instance-Based vs. Model-Based Learning. Model-based learning: Build a model of these examples, then use that model to make predictions.
  • 34. Loss Function. The loss function computes the error for a single training example, while the cost function is the average of the losses over the entire training set.
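In symbols, the distinction can be written as follows. The notation is an assumption, not taken from the slides: m training examples, prediction \hat{y}^{(i)}, true label y^{(i)}, and squared error as one illustrative choice of loss.

```latex
L\bigl(\hat{y}^{(i)}, y^{(i)}\bigr) = \bigl(\hat{y}^{(i)} - y^{(i)}\bigr)^2,
\qquad
J = \frac{1}{m} \sum_{i=1}^{m} L\bigl(\hat{y}^{(i)}, y^{(i)}\bigr)
```

Here L is the loss on a single example and J is the cost over the whole training set.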
  • 35. Machine Learning Vocabulary 3. • Hyperparameters: configuration variables that are external to the model and whose values cannot be estimated from data. That is to say, they cannot be learned directly from the data in standard model training. They are almost always specified by the machine learning engineer prior to training. • Regression: the problem of predicting a real value for each item. Examples of regression include prediction of stock values or of variations in economic variables. • Classification: the problem of assigning a category to each item. • Clustering: the problem of partitioning a set of items into homogeneous subsets.
  • 36. In Summary. 1) You studied the data. 2) You selected a model. 3) You trained it on the training data. 4) Finally, you applied the model to make predictions on new cases.
  • 37. Main Challenges of Machine Learning. In short, since your main task is to select a learning algorithm and train it on some data, the two things that can go wrong are “bad data” and “bad algorithm”.
  • 38. Main Challenges of Machine Learning. 1- Database
  • 39. Main Challenges of Machine Learning. 1- Database. 1) Insufficient Quantity of Training Data: It takes a lot of data for most Machine Learning algorithms to work properly. Even for very simple problems you typically need thousands of examples, and for complex problems such as image or speech recognition you may need millions of examples (unless you can reuse parts of an existing model).
  • 40. Main Challenges of Machine Learning. 1- Database. 2) Non-representative Training Data: In order to generalize well, it is crucial that your training data be representative of the new cases you want to generalize to. This is true whether you use instance-based learning or model-based learning.
  • 41. Main Challenges of Machine Learning. 1- Database. 3) Poor-Quality Data: If your training data is full of errors, outliers, and noise (e.g., due to poor-quality measurements), it will make it harder for the system to detect the underlying patterns, so your system is less likely to perform well. It is often well worth the effort to spend time cleaning up your training data. The truth is, most data scientists spend a significant part of their time doing just that. For example: 1) If some instances are clearly outliers, it may help to simply discard them or try to fix the errors manually. 2) If some instances are missing a few features (e.g., 5% of your customers did not specify their age), you must decide whether you want to ignore this attribute altogether, ignore these instances, fill in the missing values (e.g., with the median age), or train one model with the feature and one model without it, and so on.
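One of the options above, filling in missing values with the median, can be sketched as follows. The age list is made up for illustration, and `None` is an assumed marker for a missing value.

```python
from statistics import median

def fill_missing_with_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill_value = median(observed)
    return [fill_value if v is None else v for v in values]

# Hypothetical customer ages; None marks customers who did not specify an age.
ages = [23, None, 35, 41, None, 29, 52]
filled_ages = fill_missing_with_median(ages)
print(filled_ages)
```

In practice the median should be computed on the training set only and then reused for new data, so that no information leaks from the test set.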
  • 42. Main Challenges of Machine Learning. 1- Database. 4) Irrelevant Features: Your system will only be capable of learning if the training data contains enough relevant features and not too many irrelevant ones. A critical part of the success of a Machine Learning project is coming up with a good set of features to train on. This process, called feature engineering, involves: 1) Feature selection: selecting the most useful features to train on among existing features. 2) Feature extraction: combining existing features to produce a more useful one (dimensionality reduction algorithms can help). 3) Creating new features by gathering new data.
  • 43. Main Challenges of Machine Learning. 2- Algorithm. 1) Overfitting the Training Data: Overfitting happens when a model learns the detail and noise in the training data to the extent that it negatively impacts the performance of the model on new data. This means that the noise or random fluctuations in the training data are picked up and learned as concepts by the model. The problem is that these concepts do not apply to new data and negatively impact the model's ability to generalize. The model performs well on the training data, but it does not generalize well.
  • 44. Main Challenges of Machine Learning. 2- Algorithm. 2) Underfitting the Training Data: Underfitting is the opposite of overfitting: it occurs when your model is too simple to learn the underlying structure of the data.
  • 45. Main Challenges of Machine Learning. 2- Algorithm
  • 46. Main Challenges of Machine Learning. 2- Algorithm
  • 47. How to Avoid Underfitting and Overfitting. Underfitting: • Increase model complexity • Add more features • Train longer. Overfitting: • Use validation • Perform regularization • Get more data • Remove or add some features
  • 48. Common Classification Model Evaluation Metrics: Confusion Matrix. The confusion matrix is used to describe the performance of a classification model on a set of test data for which true values are known.
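For a binary problem, the four cells of the confusion matrix can be counted directly. The sketch below uses made-up labels and also computes accuracy, precision, recall, and F1, which are commonly derived from these counts; whether these are exactly the metrics pictured on the slides is an assumption.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # predicted positive, actually positive
        elif pred == positive:
            fp += 1  # predicted positive, actually negative
        elif truth == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn

# Hypothetical true labels and model predictions on a test set.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(tp, fp, fn, tn, accuracy, precision, recall, f1)
```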
  • 49. Common Classification Model Evaluation Metrics: Main Metrics
  • 50. Common Classification Model Evaluation Metrics: Main Metrics
  • 51. Common Regression Model Evaluation Metrics: Mean Absolute Error
  • 52. Common Regression Model Evaluation Metrics: Mean Square Error
  • 53. Common Regression Model Evaluation Metrics: Mean Absolute Percentage Error
  • 54. Common Regression Model Evaluation Metrics: Mean Percentage Error
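The formulas for the four regression metrics did not survive in this transcript, so the sketch below uses their usual textbook definitions. Two conventions are assumed: the error is taken as true value minus prediction, and percentage metrics are reported on a 0 to 100 scale.

```python
def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, MAPE (%) and MPE (%) for paired true/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    # Percentage errors divide by the true value, so they require y_true != 0.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mpe = 100.0 * sum(e / t for e, t in zip(errors, y_true)) / n
    return mae, mse, mape, mpe

# Hypothetical true values and predictions.
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]
mae, mse, mape, mpe = regression_metrics(y_true, y_pred)
print(mae, mse, mape, mpe)
```

Because MPE keeps the sign of each error, over- and under-predictions can cancel out, as they do in this example; MAPE does not have that property.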
  • 55. Testing and Validating. It is common to use 80% of the data for training and hold out 20% for testing. If the training error is low (i.e., your model makes few mistakes on the training set) but the generalization error is high, it means that your model is overfitting the training data. A common solution to this problem is to have a second holdout set called the validation set. You train multiple models with various hyperparameters using the training set, you select the model and hyperparameters that perform best on the validation set, and when you’re happy with your model you run a single final test against the test set to get an estimate of the generalization error.
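A three-way split like the one described can be sketched with the standard library alone. The 20% test fraction follows the slide; the 20% validation fraction, the fixed seed, and the function name are assumptions for illustration.

```python
import random

def train_val_test_split(examples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle once, then cut the data into training, validation and test parts."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

data = list(range(100))  # stand-in for 100 labeled examples
train, val, test = train_val_test_split(data)
print(len(train), len(val), len(test))
```

The test part is set aside first and used only once, for the final estimate of the generalization error.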
  • 56. Testing and Validating: Cross-Validation. Cross-Validation (CV): the training set is split into complementary subsets, and each model is trained against a different combination of these subsets and validated against the remaining parts. Once the model type and hyperparameters have been selected, a final model is trained using these hyperparameters on the full training set, and the generalization error is measured on the test set.
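The splitting into complementary subsets can be sketched as a k-fold index generator. The function name and the choice of contiguous, unshuffled folds are assumptions made to keep the example short.

```python
def k_fold_indices(n_examples, k=5):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_examples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_examples // k + (1 if i < n_examples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        training = indices[:start] + indices[start + size:]
        yield training, validation
        start += size

folds = list(k_fold_indices(10, k=3))
for training, validation in folds:
    print("train:", training, "validate:", validation)
```

Each example is used for validation exactly once and for training in the other k - 1 folds; the k validation scores are then averaged per candidate model.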
  • 57. Testing and Validating: Cross-Validation
  • 58. Boosting. Boosting refers to any Ensemble method that can combine several weak learners into a strong learner. The general idea of most boosting methods is to train predictors sequentially, each trying to correct its predecessor. There are many boosting methods available, but by far the most popular are AdaBoost (Adaptive Boosting) and Gradient Boosting.
  • 59. Boosting. AdaBoost: sequential training with instance weight updates.
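The sequential training with instance weight updates can be sketched with one-feature decision stumps as the weak learners. This is a simplified illustration of discrete AdaBoost on made-up data with labels in {-1, +1}; the helper names are hypothetical and it is not a substitute for a library implementation.

```python
import math

def stump_predict(x, threshold, polarity):
    """A decision stump: one threshold test on a single feature."""
    return polarity if x >= threshold else -polarity

def best_stump(xs, ys, weights):
    """Find the threshold/polarity pair with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(w for w, x, y in zip(weights, xs, ys)
                        if stump_predict(x, threshold, polarity) != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best

def adaboost_fit(xs, ys, n_rounds=3):
    """Train stumps sequentially, re-weighting the examples after each round."""
    weights = [1.0 / len(xs)] * len(xs)
    ensemble = []
    for _ in range(n_rounds):
        error, threshold, polarity = best_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid division by zero
        alpha = 0.5 * math.log((1 - error) / error)  # this stump's vote weight
        ensemble.append((alpha, threshold, polarity))
        # Misclassified examples gain weight, correctly classified ones lose it.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def adaboost_predict(ensemble, x):
    """Weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, thr, pol) for alpha, thr, pol in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical 1-D data that no single stump can separate.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, 1]
ensemble = adaboost_fit(xs, ys, n_rounds=3)
print([adaboost_predict(ensemble, x) for x in xs])
```

The best single stump misclassifies three of the ten points here, while the weighted vote of three stumps fits all of them.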
  • 60. Voting Classifiers. The Voting Classifier is a meta-classifier for combining similar or conceptually different machine learning classifiers for classification via majority or plurality voting. (For simplicity, we will refer to both majority and plurality voting as majority voting.)
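Majority voting itself is only a few lines. In the sketch below the three "classifiers" are hypothetical hand-written rules standing in for trained models, and the feature (a message length) is made up.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by the largest number of classifiers."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical stand-ins for three trained classifiers:
# each maps a message length to a label.
classifiers = [
    lambda length: "spam" if length > 100 else "ham",
    lambda length: "spam" if length > 150 else "ham",
    lambda length: "spam" if length > 50 else "ham",
]

def voting_predict(classifiers, example):
    """Collect one prediction per classifier and return the majority label."""
    return majority_vote([clf(example) for clf in classifiers])

print(voting_predict(classifiers, 120))  # two of three vote "spam"
print(voting_predict(classifiers, 40))   # all three vote "ham"
```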
  • 61. Dimensionality Reduction. Many Machine Learning problems involve thousands or even millions of features for each training instance. Not only does this make training extremely slow, but it can also make it much harder to find a good solution. This problem is often referred to as the curse of dimensionality. Example technique: Principal Component Analysis (PCA).
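To make Principal Component Analysis concrete, the sketch below reduces 2-D points to one coordinate along the direction of largest variance. It is an illustration under assumptions: it finds only the first component, and it uses power iteration on the covariance matrix rather than the eigendecomposition or SVD that libraries typically use. The data is made up.

```python
import math

def first_principal_component(rows, n_iterations=200):
    """Estimate the direction of largest variance by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Starting vector; it must not be orthogonal to the leading direction.
    vector = [1.0] * d
    for _ in range(n_iterations):
        vector = [sum(cov[a][b] * vector[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(v * v for v in vector))
        vector = [v / norm for v in vector]
    return means, vector

def project(rows, means, component):
    """Reduce each row to a single coordinate along the component."""
    return [sum((row[j] - means[j]) * component[j] for j in range(len(means)))
            for row in rows]

# Hypothetical 2-D data lying exactly on the line y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
means, component = first_principal_component(data)
reduced = project(data, means, component)
print(component, reduced)
```

Because these points lie on a single line, one coordinate per point preserves all of their variance.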
  • 62. Hyperparameter Tuning. Hyperparameter tuning works by running multiple trials in a single training job. Each trial is a complete execution of your training application with values for your chosen hyperparameters, set within limits you specify. The AI Platform training service keeps track of the results of each trial and makes adjustments for subsequent trials. When the job is finished, you can get a summary of all the trials along with the most effective configuration of values according to the criteria you specify.
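The idea of running one trial per hyperparameter configuration and keeping the best can be sketched as a plain grid search. Note that this is a simpler technique than the adaptive search of the training service described above: every combination is tried, with no adjustment between trials. The 1-D ridge model, the data, and all names are assumptions.

```python
from itertools import product

def fit_ridge_1d(xs, ys, penalty):
    """Closed-form slope of y = w * x with an L2 penalty on w."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + penalty)

def mse(xs, ys, weight):
    """Mean squared error of the model y = weight * x."""
    return sum((weight * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grid_search(grid, train, validation):
    """Run one trial per hyperparameter combination and keep the best."""
    names = list(grid)
    trials = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        weight = fit_ridge_1d(*train, **params)  # train on the training set
        trials.append((mse(*validation, weight), params))  # score on validation
    return min(trials, key=lambda trial: trial[0]), trials

# Hypothetical noise-free data on the line y = 2x.
train = ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
validation = ([4.0, 5.0], [8.0, 10.0])
(best_score, best_params), trials = grid_search(
    {"penalty": [0.0, 0.1, 1.0, 10.0]}, train, validation
)
print(best_params, best_score)
```

The `trials` list plays the role of the summary of all trials; the test set is still kept aside for the final evaluation.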
  • 63. Steps to Build a Machine Learning System. 1. Data collection. 2. Improving data quality (data preprocessing). 3. Feature engineering (feature extraction and selection, dimensionality reduction). 4. Splitting data into training and evaluation sets. 5. Algorithm selection. 6. Training. 7. Evaluation and hyperparameter tuning. 8. Testing. 9. Deployment.
  • 64. Deep Learning. Deep Learning is a subfield of machine learning concerned with algorithms, called artificial neural networks, that are inspired by the structure and function of the brain.
  • 65. Deep Learning vs. Machine Learning
  • 66. Feature extraction. Engineering of features is, however, a tedious process for several reasons: it takes a lot of time and requires expert knowledge. For learning-based applications, a lot of time is spent adjusting the features. Extracted features often lack a structural representation reflecting abstraction levels in the problem at hand.
  • 67. Representation learning. Deep Learning aims at automatically learning representations from large sets of labeled data: • The machine is fed raw data. • Representations are discovered automatically.
  • 68. Deep learning models. Several DL models have been proposed: • Autoencoders (AEs) • Deep belief networks (DBNs) • Convolutional neural networks (CNNs) • Recurrent neural networks (RNNs) • Generative adversarial networks (GANs), etc.
  • 69. Convolutional neural networks (CNNs)
  • 70. Convolutional neural networks (CNNs)
  • 71. Convolutional neural networks (CNNs)
  • 72. Convolutional neural networks (CNNs)
  • 73. Convolutional neural networks (CNNs)
  • 74. Thank you for your attention