Winning Kaggle 101:
Introduction to Stacking
Erin LeDell Ph.D.
March 2016
Introduction
• Statistician & Machine Learning Scientist at H2O.ai in Mountain View, California, USA
• Ph.D. in Biostatistics with Designated Emphasis in Computational Science and Engineering from UC Berkeley (focus on Machine Learning)
• Worked as a data scientist at several startups
Ensemble Learning
In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained by any of the constituent algorithms.
— Wikipedia (2015)
Common Types of Ensemble Methods
Bagging
• Reduces variance and increases accuracy
• Robust against outliers or noisy data
• Often used with Decision Trees (i.e. Random Forest)
Boosting
• Also reduces variance and increases accuracy
• Not robust against outliers or noisy data
• Flexible — can be used with any loss function
Stacking
• Used to ensemble a diverse group of strong learners
• Involves training a second-level machine learning algorithm called a “metalearner” to learn the optimal combination of the base learners
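For concreteness, a minimal R sketch of the first two ensemble types on a toy regression problem. The randomForest and gbm packages are familiar stand-ins, not tools from this deck; stacking is sketched after the Super Learner slides below.

```r
# Illustration only: bagging and boosting on a toy regression problem.
# randomForest and gbm are stand-in packages, not tools from this deck.
library(randomForest)  # bagging of decision trees
library(gbm)           # gradient boosting machine

set.seed(1)
df <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
df$y <- df$x1 - 2 * df$x2 + rnorm(200)

# Bagging: many trees fit to bootstrap resamples, predictions averaged
bag_fit <- randomForest(y ~ ., data = df, ntree = 200)

# Boosting: trees fit sequentially, each one correcting the current residuals
boost_fit <- gbm(y ~ ., data = df, distribution = "gaussian",
                 n.trees = 200, interaction.depth = 2, shrinkage = 0.05)
```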
History of Stacking
Stacked Generalization
• David H. Wolpert, “Stacked Generalization” (1992)
• First formulation of stacking via a metalearner
• Blended Neural Networks
Stacked Regressions
• Leo Breiman, “Stacked Regressions” (1996)
• Modified algorithm to use CV to generate level-one data
• Blended Neural Networks and GLMs (separately)
Super Learning
• Mark van der Laan et al., “Super Learner” (2007)
• Provided the theory to prove that the Super Learner is the asymptotically optimal combination
• First R implementation in 2010
The Super Learner Algorithm
• Start with design matrix, X, and response, y (the “level-zero” data)
• Specify L base learners (with model params)
• Specify a metalearner (just another algorithm)
• Perform k-fold CV on each of the L learners
The Super Learner Algorithm
• Collect the predicted values from k-fold CV that was performed on each of the L base learners
• Column-bind these prediction vectors together to form a new design matrix, Z (the “level-one” data)
• Train the metalearner using Z, y
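To make the two phases concrete, here is a minimal base-R sketch of the algorithm on a toy regression problem. The two lm-based base learners and the plain lm metalearner are arbitrary illustration choices, not the configuration used in the deck (the real implementations are the SuperLearner and h2oEnsemble packages).

```r
# Minimal Super Learner sketch in base R (illustration only).
set.seed(1)
n <- 200
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y <- sin(X$x1) + X$x2^2 + rnorm(n, sd = 0.3)

# L = 2 toy base learners, each a function that fits a model to (X, y)
learners <- list(
  lm_main = function(X, y) lm(y ~ x1 + x2, data = cbind(X, y = y)),
  lm_poly = function(X, y) lm(y ~ poly(x1, 3) + poly(x2, 3), data = cbind(X, y = y))
)

# Phase 1: k-fold CV on each base learner; out-of-fold predictions form Z
k <- 5
folds <- sample(rep(1:k, length.out = n))
Z <- matrix(NA_real_, nrow = n, ncol = length(learners),
            dimnames = list(NULL, names(learners)))
for (j in seq_along(learners)) {
  for (v in 1:k) {
    fit <- learners[[j]](X[folds != v, ], y[folds != v])
    Z[folds == v, j] <- predict(fit, newdata = X[folds == v, ])
  }
}

# Phase 2: train the metalearner (here just another lm) on the level-one data
meta <- lm(y ~ ., data = data.frame(Z, y = y))

# Prediction: refit each base learner on all the data, build level-one
# features for new observations, then apply the metalearner
full_fits <- lapply(learners, function(f) f(X, y))
newX <- data.frame(x1 = rnorm(5), x2 = rnorm(5))
Z_new <- sapply(full_fits, predict, newdata = newX)
ensemble_pred <- predict(meta, newdata = data.frame(Z_new))
```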
Super Learning vs. Parameter Tuning/Search
• A common task in machine learning is to perform model selection by
specifying a number of models with different parameters.
• An example of this is Grid Search or Random Search.
• The first phase of the Super Learner algorithm is computationally
equivalent to performing model selection via cross-validation.
• The latter phase of the Super Learner algorithm (the metalearning step)
is just training another single model (no CV).
• With Super Learner, your computation does not go to waste!
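A toy snippet (synthetic numbers, purely illustrative) makes the reuse argument concrete: once the cross-validated predictions Z exist, grid-search-style model selection is just a column-wise comparison, while the metalearning step is one additional ordinary model fit on the same Z.

```r
# Z stands in for CV predictions from three hypothetical base learners;
# y is the response. Both uses below share the same (expensive) Z.
set.seed(2)
y <- rnorm(100)
Z <- cbind(glm = y + rnorm(100, sd = 0.9),
           rf  = y + rnorm(100, sd = 0.5),
           gbm = y + rnorm(100, sd = 0.6))

# Model selection (grid/random search style): keep the single best CV model
cv_mse <- colMeans((Z - y)^2)
best_single <- names(which.min(cv_mse))

# Super Learner's extra step: one more (non-cross-validated) fit on Z
meta_fit <- lm(y ~ ., data = data.frame(Z, y = y))
```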
H2O Ensemble
Base learner library (diagram): Lasso GLM, Ridge GLM, Random Forest, GBM, Rectifier DNN, Maxout DNN
H2O Ensemble Overview
Super Learner
• H2O Ensemble implements the Super Learner algorithm.
• Super Learner finds the optimal combination of a collection of base learning algorithms.
ML Tasks
• Regression
• Binary Classification
• Coming soon: Support for multi-class classification
Why Ensembles?
• When a single algorithm does not approximate the true prediction function well.
• Win Kaggle competitions!
How to Win Kaggle
https://www.kaggle.com/c/GiveMeSomeCredit/leaderboard/private
How to Win Kaggle
https://www.kaggle.com/c/GiveMeSomeCredit/forums/t/1166/congratulations-to-the-winners/7229#post7229
How to Win Kaggle
https://www.kaggle.com/c/GiveMeSomeCredit/forums/t/1166/congratulations-to-the-winners/7230#post7230
H2O Ensemble R Package
H2O Ensemble R Interface
H2O Ensemble R Interface
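The R package and interface slides are code screenshots in the original deck and did not survive extraction. As a stand-in, here is a sketch of the h2oEnsemble interface as documented around 2016; the file paths and column names are hypothetical, and argument names may have changed in later releases, so treat the linked demo as the authoritative version.

```r
# Sketch of the h2oEnsemble R interface circa 2016 (a stand-in for the
# screenshot slides; see the linked demo for the authoritative code).
library(h2oEnsemble)            # extends the h2o package
h2o.init(nthreads = -1)

train <- h2o.importFile("train.csv")   # hypothetical paths and column names
test  <- h2o.importFile("test.csv")
y <- "response"
x <- setdiff(names(train), y)
train[, y] <- as.factor(train[, y])    # binary classification

learner <- c("h2o.glm.wrapper", "h2o.randomForest.wrapper",
             "h2o.gbm.wrapper", "h2o.deeplearning.wrapper")
metalearner <- "h2o.glm.wrapper"

fit <- h2o.ensemble(x = x, y = y, training_frame = train,
                    family = "binomial",
                    learner = learner,
                    metalearner = metalearner,
                    cvControl = list(V = 5))
pred <- predict(fit, test)
```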
Live Demo!
The H2O Ensemble demo, including R code:
http://tinyurl.com/github-h2o-ensemble
The H2O Ensemble homepage on Github:
http://tinyurl.com/learn-h2o-ensemble
New H2O Ensemble features!
h2o.stack
Early access to a new H2O Ensemble function:
h2o.stack
http://tinyurl.com/h2o-stacking
ML@Berkeley Exclusive!!
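A rough sketch of the h2o.stack workflow described in the tutorial above: cross-validate the base H2O models with shared folds, keep their holdout predictions, then stack them under a metalearner. The argument names here are recalled from the early-access tutorial rather than taken from the slides, so treat them as assumptions and defer to the linked page.

```r
# Sketch of the early-access h2o.stack workflow (argument names are my best
# recollection of the linked tutorial and should be treated as assumptions).
library(h2oEnsemble)
h2o.init(nthreads = -1)

train <- h2o.importFile("train.csv")   # hypothetical paths and column names
y <- "response"
x <- setdiff(names(train), y)
train[, y] <- as.factor(train[, y])

# Base models need identical, reusable folds and saved holdout predictions,
# since those holdout predictions become the level-one data
gbm1 <- h2o.gbm(x = x, y = y, training_frame = train,
                nfolds = 5, fold_assignment = "Modulo",
                keep_cross_validation_predictions = TRUE)
rf1  <- h2o.randomForest(x = x, y = y, training_frame = train,
                         nfolds = 5, fold_assignment = "Modulo",
                         keep_cross_validation_predictions = TRUE)

# Stack the already-trained models under a GLM metalearner
stack <- h2o.stack(models = list(gbm1, rf1),
                   response_frame = train[, y],
                   metalearner = "h2o.glm.wrapper")
```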
Where to learn more?
• H2O Online Training (free): http://learn.h2o.ai
• H2O Slidedecks: http://www.slideshare.net/0xdata
• H2O Video Presentations: https://www.youtube.com/user/0xdata
• H2O Community Events & Meetups: http://h2o.ai/events
• Machine Learning & Data Science courses: http://coursebuffet.com
Thank you!
@ledell on Github, Twitter
erin@h2o.ai
http://www.stat.berkeley.edu/~ledell
