Feature Engineering in
Machine Learning
Tanishka Garg
Mayura Zadane
8th April 2022
Agenda
1. What is a Feature? What is Feature Engineering?
2. What is its importance and why is it used?
3. Main processes of Feature Engineering
4. Feature Engineering Techniques
● Imputation
● Handling Outliers
● Transformations
● Encoding
● Scaling (Normalization & Standardization)
● Binning
What is a Feature? What is Feature Engineering?
● Generally, all machine learning algorithms take input data to generate output. The input data is usually in tabular form, consisting of rows (instances or observations) and columns (variables or attributes), and these attributes are often known as features. For example, an image is an instance in computer vision, and a line in the image could be a feature. Similarly, in NLP, a document can be an observation, and the word count could be a feature. So, we can say a feature is an attribute that impacts a problem or is useful for the problem.
● The features you use influence the result more than everything else. No algorithm alone, to my knowledge, can supplement the information gain provided by correct feature engineering.
What is a Feature?
● Feature engineering is the pre-processing step of machine learning that extracts features from raw data.
● It helps to represent an underlying problem to predictive models in a better way, which as a result improves the accuracy of the model on unseen data.
● The predictive model contains predictor variables and an outcome variable, and the feature engineering process selects the most useful predictor variables for the model.
What is Feature Engineering?
What is its importance and why is it used?
What is its importance and why is it used?
Feature engineering in machine learning improves the model's performance. Below are some points that explain the need for feature engineering:
● Better features mean flexibility.
In machine learning, we always try to choose the optimal model to get good results. However, even after choosing a suboptimal model we can sometimes still get good predictions, and this is because of better features. Flexibility in the features enables you to select less complex models, and less complex models are faster to run and easier to understand and maintain, which is always desirable.
● Better features mean simpler models.
If we feed well-engineered features to our model, then even with less-than-optimal parameters we can get good outcomes. After feature engineering, it is not necessary to work as hard at picking the right model with the most optimized parameters. If we have good features, we can better represent the complete data and use it to best characterize the given problem.
● Better features mean better results.
As already discussed, in machine learning the output depends heavily on the data we provide. So, to obtain better results, we must use better features.
Main processes of Feature Engineering
Main processes of Feature Engineering
The steps of feature engineering may vary between data scientists and ML engineers. However, some common steps are involved in most machine learning workflows, and these steps are as follows:
● Data Preparation: The first step is data preparation. In this step, raw data acquired from different sources is converted into a suitable format so that it can be used in the ML model. Data preparation may involve cleaning, delivery, augmentation, fusion, ingestion, or loading of the data.
● Exploratory Analysis: Exploratory analysis, or exploratory data analysis (EDA), is an important step of feature engineering, mainly carried out by data scientists. This step involves analyzing and investigating the data set and summarizing its main characteristics. Different data visualization techniques are used to better understand the data sources, to find the most appropriate statistical technique for the analysis, and to select the best features for the data.
● Benchmark: Benchmarking is the process of setting a standard baseline for accuracy so that every model and feature variant can be compared against that baseline. The benchmarking process is used to improve the predictability of the model and reduce the error rate (a minimal baseline sketch follows this list).
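A minimal sketch of the benchmarking idea, assuming scikit-learn is available: a trivial DummyClassifier baseline is trained next to a real model, and both are scored on the same held-out split. The built-in breast-cancer dataset is used purely for illustration; any tabular dataset would work the same way.

# Benchmark sketch: compare a real model against a trivial "most frequent class" baseline.
from sklearn.datasets import load_breast_cancer
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Baseline: always predict the most frequent class.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
baseline_acc = accuracy_score(y_test, baseline.predict(X_test))

# Candidate model: any real estimator works; logistic regression is used here.
model = LogisticRegression(max_iter=10000).fit(X_train, y_train)
model_acc = accuracy_score(y_test, model.predict(X_test))

print(f"baseline accuracy: {baseline_acc:.3f}, model accuracy: {model_acc:.3f}")

Any improvement in features or models can then be judged against the baseline accuracy rather than in isolation.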
Feature Engineering Techniques
Feature Engineering Techniques
1. Imputation
Feature engineering has to deal with inappropriate data, missing values, human error, general errors, insufficient data sources, etc. Missing values within the dataset strongly affect the performance of the algorithm, and the "Imputation" technique is used to deal with them. Imputation is responsible for handling such irregularities within the dataset.
For example, rows or columns with a large percentage of missing values can simply be removed. But at the same time, to maintain the data size and prevent loss of information, it is often preferable to impute the missing data, which can be done as follows (see the sketch after this list):
● For numerical data, a default value can be imputed in a column, or missing values can be filled with the mean or median of the column.
● For categorical data, missing values can be replaced with the most frequent value in the column.
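A minimal imputation sketch with pandas; the DataFrame and its "age", "salary", and "city" columns are made up for illustration.

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age": [25, np.nan, 40, 31, np.nan],
    "salary": [50000, 62000, np.nan, 58000, 45000],
    "city": ["Pune", "Delhi", np.nan, "Pune", "Pune"],
})

# Option 1: drop columns (or rows) with too many missing values,
# e.g. keep only columns that are at least 70% non-null.
df_dropped = df.dropna(axis=1, thresh=int(0.7 * len(df)))

# Option 2: impute instead, to keep the data size and avoid losing information.
df_imputed = df.copy()
df_imputed["age"] = df_imputed["age"].fillna(df_imputed["age"].median())        # numerical: median
df_imputed["salary"] = df_imputed["salary"].fillna(df_imputed["salary"].mean()) # numerical: mean
df_imputed["city"] = df_imputed["city"].fillna(df_imputed["city"].mode()[0])    # categorical: most frequent value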
2. Handling Outliers
Outliers are deviated values or data points that lie so far away from the other data points that they badly affect the performance of the model. This feature engineering technique first identifies the outliers and then removes (or caps) them.
Standard deviation can be used to identify outliers: each value lies at some distance from the mean, and if a value lies farther away than a chosen threshold (for example, three standard deviations), it can be considered an outlier, as sketched below.
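A minimal sketch of the standard-deviation rule, assuming pandas and NumPy; the synthetic "income" column with two injected extreme values is purely illustrative.

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
income = np.append(rng.normal(loc=50, scale=5, size=200), [250, 300])  # two injected outliers
df = pd.DataFrame({"income": income})

mean, std = df["income"].mean(), df["income"].std()
threshold = 3  # flag values more than 3 standard deviations from the mean

is_outlier = (df["income"] - mean).abs() > threshold * std
print(f"flagged {is_outlier.sum()} outliers")

df_dropped = df[~is_outlier]  # option 1: drop the flagged rows
# option 2: cap (winsorize) the values at the boundary instead of dropping them
capped_income = df["income"].clip(lower=mean - threshold * std, upper=mean + threshold * std)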
Feature Engineering Techniques
3. Log transform
Logarithm transformation, or log transform, is one of the most commonly used mathematical techniques in machine learning. Log transform helps in handling skewed data, making the distribution closer to normal after transformation. It also reduces the effect of outliers on the data, because the normalization of magnitude differences makes the model more robust.
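A minimal log-transform sketch; np.log1p (log(1 + x)) is used so that zero values are handled safely, and the right-skewed "price" column is synthetic.

import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
df = pd.DataFrame({"price": rng.lognormal(mean=3, sigma=1, size=1000)})  # right-skewed data

df["price_log"] = np.log1p(df["price"])

# The skewness should drop sharply after the transform.
print(f"skewness before: {df['price'].skew():.2f}, after: {df['price_log'].skew():.2f}")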
4. Encoding
One-hot encoding is a popular encoding technique in machine learning. It converts categorical data into a form that machine learning algorithms can easily understand and use to make good predictions. It enables grouping of categorical data without losing any information.
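A minimal one-hot-encoding sketch using pandas.get_dummies; the "city" column and its values are illustrative. scikit-learn's OneHotEncoder achieves the same inside a pipeline.

import pandas as pd

df = pd.DataFrame({"city": ["Pune", "Delhi", "Mumbai", "Pune"], "units": [3, 5, 2, 4]})

# One indicator column per category; pass drop_first=True to drop one category
# and avoid redundant columns if the model requires it.
encoded = pd.get_dummies(df, columns=["city"], prefix="city")
print(encoded)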
5. Scaling
In most cases, the numerical features of the dataset do not share a common range and differ widely from each other. In real life, it is unreasonable to expect the age and income columns to have the same range. But from the machine learning point of view, how can these two columns be compared?
Scaling solves this problem: after a scaling process, the continuous features become comparable in terms of range.
Any Questions??
Reference:
1. https://www.repath.in/gallery/feature_engineering_for_machine_learning.pdf
2. https://www.analyticssteps.com/blogs/feature-engineering-method-machine-learning
Thank You!
