Machine learning intro
The only limit to AI is human imagination. - Chris Duffey
By : Anas Jamil
Mar - 2019
Agenda
1- AI & ML & DL.
2- Machine learning (ML) introduction.
3- Types of Machine learning.
4- ML data types.
5- Working with missing data
6- Model performance (fitting)
AI & ML & DL introduction.
Artificial Intelligence (AI): the study of how to train computers so that they can do things
which, at present, humans can do. -www.geeksforgeeks.org-
Machine learning (ML): the scientific study of algorithms and statistical models that computer systems
use to perform a specific task without using explicit instructions. -wikipedia-
Deep learning (DL): an artificial intelligence function that imitates the workings of the human brain in
processing data and creating patterns for use in decision making.
Machine learning introduction
What is Machine Learning?
“Learning is any process by which a system improves performance from
experience.” - Herbert Simon
Some use cases: we use ML when:
• Human expertise does not exist (navigating on Mars)
• Humans can’t explain their expertise (speech recognition)
• Models are based on huge amounts of data (genomics).
1- Supervised (inductive/class driven) learning
Supervised learning: the machine learning task of learning a function that maps an input to an output
based on example input-output pairs.
You have input variables (X) and an output variable (Y), and you use an algorithm to learn the
mapping function from the input to the output:
Y = f(X)
The goal is to approximate the mapping function so well that, when you have new input data (X), you
can predict the output variable (Y) for that data.
It is called supervised learning because the process of an algorithm learning from the training dataset can
be thought of as a teacher supervising the learning process.
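Learning Y = f(X) from example pairs can be sketched in a few lines. This is only an illustration: it assumes f is a straight line fitted by ordinary least squares, and the house sizes and prices are made-up numbers, not data from the slides.

```python
# Minimal sketch: learn f in Y = f(X) from example input-output pairs.
# Assumption: f is a straight line, y = slope * x + intercept.

def fit_line(xs, ys):
    """Ordinary least squares for a single input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply the learned mapping to a new input."""
    slope, intercept = model
    return slope * x + intercept

# Hypothetical labeled data: house size (m^2) -> price (thousands).
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]

model = fit_line(sizes, prices)      # the "training" step
new_price = predict(model, 90)       # prediction for an unseen input
```

The training pairs play the role of the teacher: the fit is corrected against known outputs, and only then is the model applied to an input it has not seen.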
Supervised learning terms (synonyms share a line):
Class, Target, Label
Attribute, Feature
Labeled data, Dataset, Sample data
Sample, Example, Record, Row, Instance, Observation
Supervised learning algorithm types:
1- Regression: a supervised learning task where the output is a continuous (numeric) value.
Ex: How much is a home worth?
2- Classification: a supervised learning task where the output is one of a defined set of labels (a discrete value).
A- Binary classification: Yes/No
Ex: Is this email spam?
B- Multi-class classification: one out of several outputs.
Ex: What is the weather?
Sample algorithms: Support Vector Machine (SVM), Random Forest, Linear Regression, Decision Trees
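A binary classifier can be sketched with a decision stump, which is a decision tree with a single split. The feature (number of links in an email) and the labels below are hypothetical, and a real decision tree would consider many features and splits.

```python
# Minimal sketch of binary classification with a one-split decision tree.

def fit_stump(values, labels):
    """Pick the threshold that misclassifies the fewest training examples,
    predicting True when value > threshold."""
    best_threshold, best_errors = None, len(values) + 1
    for threshold in sorted(set(values)):
        errors = sum((v > threshold) != label for v, label in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def classify(threshold, value):
    """Yes/No answer for a new example."""
    return value > threshold

# Hypothetical training data: links per email and whether it was spam.
links = [0, 1, 2, 6, 8, 9]
is_spam = [False, False, False, True, True, True]

threshold = fit_stump(links, is_spam)
```

With the learned threshold, `classify(threshold, 7)` answers the "Spam email?" question for a new message.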
2-Unsupervised (data driven) learning
Training data does not include desired outputs.
Unsupervised learning: very much the opposite of supervised learning. It features no labels. Instead,
the algorithm is fed a lot of data and given the tools to understand the properties of the data. From
there, it can learn to group, cluster, and/or organize the data in such a way that a human (or another
intelligent algorithm) can come in and make sense of the newly organized data.
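Grouping unlabeled data can be sketched with k-means, one common clustering algorithm (the slides do not name it, so treat the choice as an assumption). The spending figures and the starting centers are made up.

```python
# Minimal sketch of k-means clustering on one-dimensional data.

def kmeans_1d(points, centers, iterations=10):
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical monthly spend per customer; no labels are given.
spend = [10, 12, 11, 95, 100, 105]
centers, clusters = kmeans_1d(spend, centers=[0, 50])
```

No labels are supplied: the two groups come only from the distances between the data points.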
Unsupervised learning
Unsupervised learning problems are classified into two categories:
Clustering: A clustering problem is where you want to discover the inherent groupings in the data,
such as grouping customers by purchasing behavior.
Association: An association rule learning problem is where you want to discover rules that describe
large portions of your data, such as people that buy X also tend to buy Y.
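A rule such as "people who buy X also tend to buy Y" is usually scored by its support and confidence. The sketch below computes both for a made-up set of shopping baskets; the function names are illustrative only.

```python
# Minimal sketch of scoring an association rule X -> Y.

def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, x, y):
    """Of the transactions containing x, the fraction that also contain y."""
    return support(transactions, {x, y}) / support(transactions, {x})

# Hypothetical shopping baskets.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "butter"})
rule_confidence = confidence(baskets, "bread", "butter")
```

Here two of the three baskets with bread also contain butter, so the rule "bread -> butter" holds in about two thirds of the cases where it applies.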
Unsupervised learning: Dimensionality Reduction (DR)
There are two components of dimensionality reduction:
Feature extraction: reduces data in a high-dimensional space to a lower-dimensional space,
i.e. a space with fewer dimensions. For example, features a and b are combined into one feature ab:
a + b + c + d = e
ab + c + d = e
Feature selection: finds a subset of the original set of features that can still be used to model
the problem. For example, a feature that contributes nothing (c = 0) can be dropped.
Sample algorithms: FastText, BlazingText, Principal Component Analysis (PCA)
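The two ideas can be sketched on a tiny table. The selection rule (drop columns that never vary) and the extraction rule (replace two columns by their sum) are simple stand-ins chosen for illustration; PCA and similar algorithms derive the new features from the data instead.

```python
# Minimal sketch of feature selection and feature extraction.

def select_features(rows):
    """Feature selection: keep only columns whose value varies across rows."""
    names = list(rows[0].keys())
    keep = [n for n in names if len({row[n] for row in rows}) > 1]
    return [{n: row[n] for n in keep} for row in rows]

def extract_feature(rows, first, second, new_name):
    """Feature extraction: replace two columns with one derived column.
    Summing is an illustrative assumption, not what PCA does."""
    result = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k not in (first, second)}
        new_row[new_name] = row[first] + row[second]
        result.append(new_row)
    return result

# Hypothetical data set with four features; c is always 0.
rows = [
    {"a": 1, "b": 2, "c": 0, "d": 5},
    {"a": 3, "b": 1, "c": 0, "d": 7},
    {"a": 2, "b": 2, "c": 0, "d": 6},
]

selected = select_features(rows)                        # drops column c
extracted = extract_feature(selected, "a", "b", "ab")   # merges a and b
```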
3- Reinforcement learning (RL)
It is about taking suitable action to maximize reward in a particular situation.
Types of reinforcement: there are two types of reinforcement:
1- Positive: positive reinforcement occurs when an event that follows a particular behavior
increases the strength and frequency of that behavior. In other words, it has a positive
effect.
2- Negative: negative reinforcement is the strengthening of a behavior because a negative
condition is stopped or avoided.
Use cases of RL: real-time decisions, game AI, robot navigation, self-driving cars
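Learning which action maximizes reward can be sketched with tabular Q-learning, one well-known RL algorithm (not named in the slides). The environment is a made-up corridor of five cells where only the last cell gives a reward; the learning rate, discount, and episode counts are arbitrary illustrative values.

```python
import random

# Minimal sketch of tabular Q-learning in a hypothetical 5-cell corridor.
N_STATES = 5            # cells 0..4; reaching cell 4 ends the episode
ACTIONS = (-1, +1)      # step left, step right
ALPHA, GAMMA = 0.5, 0.9 # learning rate and discount factor (arbitrary)

def step(state, action):
    """Environment: move, stay inside the corridor, reward 1 at the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def train(episodes=500, max_steps=100, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action index]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            a = rng.randrange(2)                # explore with random actions
            next_state, reward = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward
            # reward + discounted best value of the next state.
            target = reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
            if state == N_STATES - 1:
                break
    return q

q_table = train()
# Greedy policy for the non-terminal cells.
policy = ["right" if right > left else "left" for left, right in q_table[:-1]]
```

The agent is never told that moving right is correct; the preference emerges from the reward received at the goal.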
Data types from ML perspectives
1- Numerical Data
2- Categorical Data
3- Time Series Data
4- Text
1- Numerical Data:
Numerical data is any data where the data points are exact numbers. Statisticians might also call
numerical data quantitative data. This data has meaning as a measurement, such as house prices.
2- Categorical Data
Categorical data represents characteristics, such as a hockey player's position.
Categorical data can take numerical values. For example, we might use 1 for the colour red and 2 for
blue, but these numbers have no mathematical meaning.
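Because codes such as 1 and 2 carry no mathematical meaning, a common alternative is one-hot encoding, where each category gets its own 0/1 column. This is a sketch of that idea on made-up colour values; the slides do not prescribe an encoding.

```python
# Minimal sketch of one-hot encoding for a categorical column.

def one_hot(values):
    """Return the sorted category names and one 0/1 row per input value."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded

# Hypothetical categorical column.
colours = ["red", "blue", "red", "green"]
categories, encoded = one_hot(colours)
```

No category ends up "larger" than another, which avoids implying that blue (2) is twice red (1).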
3- Time Series Data
Time series data is a sequence of numbers collected at regular intervals over some period of time. It is
very important in fields such as finance. Each value has a temporal value attached to it, such as a
date or a timestamp, so you can look for trends over time.
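One simple way to look for a trend is a moving average, which smooths out short-term noise. The technique and the daily values below are illustrative assumptions, not something the slides specify.

```python
# Minimal sketch of a moving average over a time series.

def moving_average(series, window):
    """Mean of each run of `window` consecutive values."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]

# Hypothetical daily closing values, one per day at regular intervals.
daily_close = [10, 12, 11, 13, 15, 14, 16]
trend = moving_average(daily_close, window=3)
```

The raw values go up and down, while the smoothed series rises steadily, which makes the upward trend easier to see.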
4- Text
Text data is basically just words. Often the first thing you do with text is turn it into
numbers, for example with the bag-of-words formulation.
Preprocessing steps such as stemming and lowercasing can be applied first.
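Two sentences with different meanings:
This is working not disappointed
This is not working. disappointed
Tokenization (lowercased, word order ignored) gives the same result for both:
[ 'disappointed', 'is', 'not', 'working', 'this' ]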
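The example can be reproduced in a short sketch. The tokenizer here is a simple regular expression chosen for illustration; the bigram function is an added assumption that shows one way to keep some word order.

```python
import re
from collections import Counter

# Minimal sketch: bag of words loses word order, bigrams keep some of it.

def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())

def bag_of_words(text):
    """Word counts, ignoring order."""
    return Counter(tokenize(text))

def bigrams(text):
    """Pairs of adjacent words, preserving local order."""
    tokens = tokenize(text)
    return list(zip(tokens, tokens[1:]))

first = "This is working not disappointed"
second = "This is not working. disappointed"

same_bag = bag_of_words(first) == bag_of_words(second)
same_bigrams = bigrams(first) == bigrams(second)
vocabulary = sorted(bag_of_words(first))
```

Both sentences produce the same bag of words, so a model that sees only the bag cannot tell them apart; the bigrams differ.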
Orthogonal sparse bigram (OSB) :
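The slide's illustration is not preserved in this transcript. As a hedged sketch, assuming OSB is defined as in the Amazon Machine Learning documentation, a window slides over the text and each first word is paired with every later word in the window, with one underscore per position of distance. The function name and window size are illustrative.

```python
# Minimal sketch of orthogonal sparse bigrams (OSB), under the assumption
# stated above about how the transformation is defined.

def osb(text, window=4):
    """Pair each word with the words up to `window - 1` positions after it;
    the number of underscores encodes the distance between the two words."""
    words = text.split()
    features = []
    for i, first in enumerate(words):
        for gap in range(1, window):
            if i + gap < len(words):
                features.append(first + "_" * gap + words[i + gap])
    return features

osb_features = osb("This is not working", window=3)
```

Unlike a plain bag of words, these features record which words appear near each other and how far apart.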
5- Working with missing data
For a row that has missing values, you can:
1- Delete the row (if the data is not related).
2- Impute the missing data:
A- If the data is related to each other, you can use the mean of that column.
B- If the data is independent, you can pick a value from another row.
C- If the data is related to a timestamp:
1- Interpolation
2- Fill backward
3- Fill forward
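The options above can be sketched in plain Python, with `None` standing for a missing value. The helper names and sample readings are hypothetical; pandas offers equivalents such as `dropna`, `fillna`, `ffill`, `bfill` and `interpolate`.

```python
# Minimal sketch of deleting rows and imputing missing values (None).

def drop_rows(rows):
    """Delete every row that contains a missing value."""
    return [row for row in rows if None not in row]

def mean_impute(values):
    """Replace missing values with the mean of the known ones."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]

def fill_forward(values):
    """Carry the last known value forward."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def fill_backward(values):
    """Carry the next known value backward."""
    return fill_forward(values[::-1])[::-1]

def interpolate(values):
    """Linear interpolation between the nearest known neighbours;
    gaps at the very start or end are left as None."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        slope = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + slope * (i - left)
    return filled

# Hypothetical time-ordered readings with gaps.
readings = [10, None, None, 16, None, 22]
```

For time-stamped data, interpolation gives 12 and 14 for the first gap, while fill forward repeats 10 and fill backward repeats 16; which is appropriate depends on how the quantity behaves between measurements.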
Some useful Python libraries:
1- NumPy: mathematical functions optimized for large arrays of data
2- pandas: reading, analyzing and modeling data
3- Matplotlib: plotting library for visualizing data
6- Model performance (fitting)
The relationship between input and output could be:
1- Linear
2- Non-linear
Knowing this relationship helps in choosing the algorithm and the attributes needed by the
predict function.
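One quick, partial check for a linear relationship is the Pearson correlation coefficient. This sketch uses made-up data; a value near 1 or -1 suggests a linear relationship, while a value near 0 only rules out a linear one, not a relationship altogether.

```python
from math import sqrt

# Minimal sketch: Pearson correlation for a linear and a non-linear relation.

def pearson(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

xs = [-2, -1, 0, 1, 2]
linear = [2 * x + 1 for x in xs]   # straight-line relationship
curved = [x * x for x in xs]       # clearly related, but not linearly

r_linear = pearson(xs, linear)
r_curved = pearson(xs, curved)
```

The curved data is fully determined by the input yet has zero correlation, so a linear algorithm would miss the relationship.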
1- Underfitting:
When: poor performance on the testing set, poor on the training set
Why: the features are not enough to capture the relationship between input and output
How: add more rows or more features, and optimize the hyperparameters
2- Overfitting:
When: poor performance on the testing set, good on the training set
Why: the model memorizes the data it has seen and is unable to generalize to unseen data
How: remove complex features and optimize the hyperparameters
3- Balanced:
Good performance on the testing set, good on the training set
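The three cases can be written as a rule of thumb that compares the training and testing scores. The `good` cut-off of 0.8 is an arbitrary illustrative value, and the scores are assumed to be accuracy-like numbers between 0 and 1.

```python
# Minimal sketch: label a model from its training and testing scores.

def diagnose(train_score, test_score, good=0.8):
    """`good` is a hypothetical cut-off for an acceptable score."""
    if train_score < good:
        return "underfitting"   # the model cannot even fit what it has seen
    if test_score < good:
        return "overfitting"    # fits the training set, fails on unseen data
    return "balanced"           # good on both sets
```

For example, `diagnose(0.99, 0.60)` returns "overfitting", while `diagnose(0.90, 0.88)` returns "balanced".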
Regression model performance
Common Techniques for evaluating performance:
Visually observe using Plots
Residual histograms (residual = actual - predicted; fewer negative than positive residuals means the model tends to underestimate)
Evaluate with Metrics like Root Mean Square Error (RMSE)
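Residuals and RMSE can be computed directly from the actual and predicted values. The numbers below are made up for illustration.

```python
from math import sqrt

# Minimal sketch of residuals and root mean square error (RMSE).

def residuals(actual, predicted):
    """Positive residual: the model predicted too low; negative: too high."""
    return [a - p for a, p in zip(actual, predicted)]

def rmse(actual, predicted):
    """Square root of the mean squared residual."""
    errors = residuals(actual, predicted)
    return sqrt(sum(e * e for e in errors) / len(errors))

# Hypothetical regression results.
actual = [3, 5, 7, 9]
predicted = [2, 5, 8, 11]

errors = residuals(actual, predicted)
score = rmse(actual, predicted)
```

A histogram of `errors` centered on zero suggests the mistakes are random; RMSE summarizes their typical size in the same units as the target.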
Binary & multi-class model performance
Common Techniques for evaluating performance:
Visually observe using Plots
Confusion Matrix
Binary model performance (SKLearn)
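The slide's scikit-learn example is not preserved in this transcript. As a stand-in, the sketch below builds a binary confusion matrix in plain Python from made-up labels; scikit-learn provides `sklearn.metrics.confusion_matrix` for the same purpose.

```python
# Minimal sketch of a binary confusion matrix and metrics derived from it.

def confusion_matrix(actual, predicted):
    """Count true/false positives and negatives for 0/1 labels."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if a and p:
            counts["tp"] += 1      # predicted yes, actually yes
        elif p:
            counts["fp"] += 1      # predicted yes, actually no
        elif a:
            counts["fn"] += 1      # predicted no, actually yes
        else:
            counts["tn"] += 1      # predicted no, actually no
    return counts

# Hypothetical labels: 1 = spam, 0 = not spam.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]

cm = confusion_matrix(actual, predicted)
accuracy = (cm["tp"] + cm["tn"]) / len(actual)
precision = cm["tp"] / (cm["tp"] + cm["fp"])
recall = cm["tp"] / (cm["tp"] + cm["fn"])
```

Accuracy alone can hide problems on imbalanced data, which is why precision and recall are usually read from the matrix as well.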
References:
https://docs.aws.amazon.com/machine-learning/index.html
