AutoML 101
2018 Copyright QuantUniversity LLC.
Presented By:
Sri Krishnamurthy, CFA, CAP
sri@quantuniversity.com
www.quantuniversity.com
10/25/2018
QuantUniversity Meetup
Boston
2
About us:
• Data Science, Quant Finance and
Model Governance Advisory
• Technologies using MATLAB, Python
and R
• Programs
▫ Analytics Certificate Program
▫ Fintech programs
• Platform
3
www.analyticscertificate.com/MachineLearning
Use code “Affiliate” for 20% off by Oct 30th
Upcoming workshop
November 7-8, 2018
4
• Your challenge is to design an artificial intelligence and machine
learning (AI/ML) framework capable of flying a drone through
several professional drone racing courses without human
intervention or navigational pre-programming.
AlphaPilot Drone AI Challenge
5
6
• Machine Learning
• Automatic Machine Learning
• Demos
Agenda
7
• “AI is the theory and development of computer systems able to
perform tasks that traditionally have required human intelligence.
• AI is a broad field, of which ‘machine learning’ is a sub-category”
What is Machine Learning and AI?
Source: http://www.fsb.org/wp-content/uploads/P011117.pdf
8
The Machine Learning Process
• Data cleansing
• Feature engineering
• Training and testing
• Model building
• Model selection
• Hyperparameter optimization
• Model deployment
9
• Supervised Algorithms
▫ Given a set of variables $x_i$, predict the value of another variable $y$ in a
given data set such that $y = F(X)$
▫ If y is numeric => Prediction
▫ If y is categorical => Classification
Machine Learning
x1, x2, x3, … → Model F(X) → y
10
• Unsupervised Algorithms
▫ Given a dataset with variables $x_i$, build a model that captures the
similarities in different observations and assigns them to different
buckets => Clustering
Machine Learning
Obs1, Obs2, Obs3, etc. → Model → Obs1: Class 1, Obs2: Class 2, Obs3: Class 1
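The clustering idea on this slide can be sketched in a few lines. This is a minimal illustration, assuming one-dimensional observations and a plain k-means loop; the function name `kmeans_1d`, the initialisation rule, and the sample data are hypothetical and not taken from the talk.

```python
def kmeans_1d(values, k, iterations=20):
    """Group 1-D observations into k clusters with a basic k-means loop."""
    # Illustrative initialisation: spread the starting centroids over the sorted data.
    ordered = sorted(values)
    centroids = [ordered[(len(ordered) - 1) * i // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each observation joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels, centroids


# Hypothetical observations forming two obvious groups.
observations = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels, centroids = kmeans_1d(observations, k=2)
```

No target variable is supplied: the buckets come only from similarity between observations, which is what separates this from the supervised setting.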
11
Supervised learning algorithms
• Parametric models
• Non-parametric models
Supervised learning Algorithms - Prediction
12
• Parametric models
▫ Assume some functional form
▫ Fit coefficients
• Examples: Linear Regression, Neural Networks
Supervised Learning models - Prediction
$y = \beta_0 + \beta_1 x_1$
Linear Regression Model Neural network Model
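To make "assume some functional form, fit coefficients" concrete, here is a minimal sketch of ordinary least squares for a single predictor, using the closed-form solution. The function name and the toy data are illustrative assumptions.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (beta0, beta1) minimising squared error for y = beta0 + beta1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1


xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0, 5.0, 8.0, 11.0]  # toy data generated from y = 2 + 3x
beta0, beta1 = fit_simple_linear_regression(xs, ys)
```

Once the two coefficients are known the training data can be discarded; the assumed linear form plus the coefficients is the whole model.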
13
• Non-Parametric models
▫ No functional form assumed
• Examples: K-nearest neighbors, Decision Trees
Supervised Learning models
K-nearest neighbor Model Decision tree Model
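By contrast, a non-parametric model such as k-nearest neighbors fits no coefficients. The sketch below assumes a single numeric feature and majority voting; `knn_predict` and the labelled points are hypothetical.

```python
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # No coefficients are fitted: the stored training data acts as the model.
    neighbours = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


train_x = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
train_y = ["low", "low", "low", "high", "high", "high"]
prediction_near_low = knn_predict(train_x, train_y, query=1.8)
prediction_near_high = knn_predict(train_x, train_y, query=7.0)
```

Note that `k` is chosen by the user rather than estimated from the data.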
14
15
• Automated machine learning (AutoML) is the process of
automating the end-to-end process of applying machine learning to
real-world problems.
AutoML
16
• Automated Feature Engineering
▫ Feature selection
▫ Feature extraction
▫ Meta learning and transfer learning
▫ Detection and handling of skewed data and/or missing values
• Hyper-parameter optimization
• Model Selection
• Reference:
https://en.wikipedia.org/wiki/Automated_machine_learning
Types of frameworks
17
• Parameters: Values that can be estimated from data
▫ Examples:
– Regression Coefficients
– Weights in a Neural Network
• Hyperparameters: Values that are external to the model and cannot be
learned from the data
▫ Examples:
– Learning rate in Neural Network
– Regularization parameters
Parameters vs Hyper Parameters
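A small sketch of the distinction, assuming a one-weight model trained by gradient descent on mean squared error. The weight `w` is a parameter estimated from the data, while `learning_rate` and `steps` are hyperparameters fixed before training. All names and values are illustrative.

```python
def fit_weight(xs, ys, learning_rate, steps=200):
    """Fit y = w * x by gradient descent on mean squared error."""
    w = 0.0  # parameter: estimated from the data
    n = len(xs)
    for _ in range(steps):
        # Gradient of mean((w * x - y) ** 2) with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        # learning_rate is a hyperparameter: chosen before training starts.
        w -= learning_rate * grad
    return w


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # toy data generated from y = 2x
w_good = fit_weight(xs, ys, learning_rate=0.05)
w_slow = fit_weight(xs, ys, learning_rate=0.0001)
```

With the same data and the same number of steps, the larger learning rate reaches a weight close to 2 while the smaller one is still far from it, which is why hyperparameters have to be tuned from outside the fitting procedure.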
18
• Hyperparameter optimization finds a tuple of hyperparameters that yields an
optimal model which minimizes a predefined loss function on given
independent data.[1]
• [1] Claesen, Marc; Bart De Moor (2015). "Hyperparameter Search in Machine
Learning".
• Image from:
https://support.sas.com/resources/papers/proceedings17/SAS0514-2017.pdf
Hyperparameter optimization
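The definition above can be sketched as a grid search: try each candidate hyperparameter value, measure a predefined loss on held-out data, and keep the best. This is a minimal illustration that assumes a k-nearest-neighbor regressor with a single hyperparameter `k`; the helper names and the toy data are hypothetical.

```python
def knn_regress(train, query, k):
    """Predict by averaging the targets of the k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - query))[:k]
    return sum(y for _, y in nearest) / k


def validation_loss(train, valid, k):
    """Mean squared error on held-out data: the predefined loss."""
    return sum((knn_regress(train, x, k) - y) ** 2 for x, y in valid) / len(valid)


def grid_search(train, valid, candidates):
    """Try every candidate k and keep the one with the lowest validation loss."""
    losses = {k: validation_loss(train, valid, k) for k in candidates}
    best_k = min(losses, key=losses.get)
    return best_k, losses


# Toy data from y = x with alternating +/-0.5 noise in the training targets.
train = [(float(x), x + (0.5 if x % 2 else -0.5)) for x in range(10)]
# Independent validation points that lie exactly on y = x.
valid = [(x + 0.5, x + 0.5) for x in range(1, 8)]
best_k, losses = grid_search(train, valid, candidates=[1, 2, 5])
```

On this toy data `k=2` wins because averaging two neighbours cancels the alternating noise. Exhaustive grids grow quickly with the number of hyperparameters, so larger searches usually need smarter strategies than the one shown here.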
19
• Interpretability: Ability of users to understand the model, the
parameters of the model and their effect on the outcome
• Example:
▫ In regression, coefficients enable us to interpret the influence of an
independent variable on the dependent variable.
▫ The standard errors of the coefficient estimates enable us to
determine how confident we are in these estimates
Model selection considerations
20
• Parsimonious models: A parsimonious model is a model that
accomplishes a desired level of explanation or prediction with as
few predictor variables as possible.
• Example:
▫ In regression, using Exhaustive search, Forward search, Backward
search or Stepwise regression in model selection
▫ Using PCA on the feature space prior to model building
Model selection considerations
21
• Ensemble models: Ensemble methods use multiple learning
algorithms to obtain better predictive performance than could be
obtained from any of the constituent learning algorithms alone.
Image from:
https://blogs.sas.com/content/subconsciousmusings/2017/05/18/stacked-ensemble-models-win-data-science-competitions/
Model selection considerations
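As a minimal illustration of the ensemble idea, the sketch below averages the predictions of two models. Simple averaging is used here as a stand-in; it is not the stacked ensemble from the referenced image. The prediction values are made up for the example.

```python
def mean_squared_error(predictions, targets):
    """Average squared difference between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def average_ensemble(*model_predictions):
    """Combine models by averaging their predictions point by point."""
    return [sum(preds) / len(preds) for preds in zip(*model_predictions)]


targets = [10.0, 20.0, 30.0]
model_a = [12.0, 19.0, 33.0]  # hypothetical predictions from one learner
model_b = [9.0, 22.0, 28.0]   # hypothetical predictions from another learner
ensemble = average_ensemble(model_a, model_b)

error_a = mean_squared_error(model_a, targets)
error_b = mean_squared_error(model_b, targets)
error_ensemble = mean_squared_error(ensemble, targets)
```

The ensemble error is lower here because the two models err in opposite directions on every point. Averaging helps most when constituent errors partly cancel, and it is not guaranteed to beat the best single model in general.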
22
Full pipeline Automation
• AutoWEKA is an approach for the simultaneous selection of a machine
learning algorithm and its hyperparameters; combined with
the WEKA package it automatically yields good models for a wide variety
of data sets.
• Auto-sklearn is an extension of AutoWEKA built on the Python library
scikit-learn; it is a drop-in replacement for regular scikit-learn classifiers and
regressors. It improves over AutoWEKA by using meta-learning to
increase search efficiency and post-hoc ensemble building to combine the
models generated during the hyperparameter optimization process.
• TPOT is a data-science assistant which optimizes machine learning
pipelines using genetic programming.
Ref: https://www.ml4aad.org/automl/
Frameworks
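The "simultaneous selection of a machine learning algorithm and its hyperparameters" can be sketched as a search over (algorithm, hyperparameters) pairs scored on validation data. The toy below enumerates a tiny hand-written search space exhaustively, which is a deliberate simplification and not how the frameworks above search. Every name in it is hypothetical and no framework API is used.

```python
def fit_mean(train):
    """Baseline: always predict the mean training target."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y


def fit_knn(train, k):
    """k-nearest-neighbor regressor with hyperparameter k."""
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / k
    return predict


def fit_linear(train):
    """Least-squares line through the training points."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    slope = sum((x - mx) * (y - my) for x, y in train) / sum((x - mx) ** 2 for x, _ in train)
    return lambda x: my + slope * (x - mx)


# Joint search space: each entry pairs an algorithm with one hyperparameter setting.
SEARCH_SPACE = [
    ("mean", fit_mean, {}),
    ("knn", fit_knn, {"k": 1}),
    ("knn", fit_knn, {"k": 3}),
    ("linear", fit_linear, {}),
]


def select_model(train, valid):
    """Return the (name, hyperparameters, loss) with the lowest validation MSE."""
    results = []
    for name, fit, params in SEARCH_SPACE:
        model = fit(train, **params)
        loss = sum((model(x) - y) ** 2 for x, y in valid) / len(valid)
        results.append((name, params, loss))
    return min(results, key=lambda r: r[2])


# Toy data generated from y = 2x + 1, split into training and validation points.
train = [(float(x), 2.0 * x + 1.0) for x in range(0, 10, 2)]
valid = [(float(x), 2.0 * x + 1.0) for x in range(1, 10, 2)]
best_name, best_params, best_loss = select_model(train, valid)
```

On this linear toy data the linear model is selected. The point of the sketch is only that the algorithm choice and its hyperparameters are treated as one search problem.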
23
Hyper-parameter optimization and Model Selection
• H2O AutoML provides automated model selection and ensembling
for the H2O machine learning and data analytics platform.
• mlr is an R package that contains several hyperparameter
optimization techniques for machine learning problems.
Ref: https://www.ml4aad.org/automl/
Frameworks
24
Deep Neural Network Architecture search
• Google Cloud AutoML is a cloud-based machine learning service
which so far provides the automated generation of computer vision
pipelines.
• Auto Keras is an open-source Python package for neural architecture
search.
• Ref:
▫ https://www.ml4aad.org/automl/
▫ https://en.wikipedia.org/wiki/Automated_machine_learning
Frameworks
25
Hardware Considerations
26
Hardware Considerations
Reference: https://azure.microsoft.com/en-us/blog/release-models-at-pace-using-microsoft-s-automl/
27
So, which one to choose?
Let’s try some of them
28
www.QuSandbox.com
Model
Analytics
Studio
QuResearchHub
QuSandbox
Prototype, Iterate and tune Standardize workflows
Productionize and share
29
www.analyticscertificate.com/MachineLearning
Use code “Affiliate” for 20% off by Oct 30th
Continue your learning here!
November 7-8, 2018
Sri Krishnamurthy, CFA, CAP
Founder and Chief Data Scientist
sri@quantuniversity.com
srikrishnamurthy
www.QuantUniversity.com
www.analyticscertificate.com
www.qusandbox.com
Information, data and drawings embodied in this presentation are strictly a property of QuantUniversity LLC. and shall not be
distributed or used in any other publication without the prior written consent of QuantUniversity LLC.
30
• Founder of QuantUniversity LLC. and
www.analyticscertificate.com
• Advisory and Consultancy for Financial Analytics
• Prior Experience at MathWorks, Citigroup and
Endeca and 25+ financial services and energy
customers.
• Regular Columnist for the Wilmott Magazine
• Author of forthcoming book
“Financial Modeling: A case study approach”
published by Wiley
• Chartered Financial Analyst and Certified Analytics
Professional
• Teaches Analytics in the Babson College MBA
program and at Northeastern University, Boston
Sri Krishnamurthy
Founder and CEO
31
