Introduction Probabilistic Programming Case Study
Introduction to Model-Based Machine Learning
A Webinar to TRB ADB40 Big Data Initiative
by
Daniel Emaasit
Ph.D. Student
Department of Civil and Environmental Engineering
University of Nevada, Las Vegas, USA
emaasit@unlv.nevada.edu
September 27 2016
Acknowledgments [1]
Prof. Francisco C. Pereira and Dr. Filipe Rodrigues
[1] Machine Learning for Mobility group, DTU: Tutorial from the Summer School on Big Data, Mobility Patterns, Transport Analytics, July 1-3, 2016, Filipe Rodrigues and Francisco Pereira
Introduction
Current Challenges in Adopting Machine Learning
Generally, current challenges in adopting ML:
* An overwhelming number of traditional ML methods to learn
* Deciding which algorithm to use, and why
* Some custom problems may not fit any existing algorithm
What is Model-Based Machine Learning?
A different viewpoint for machine learning, proposed by Bishop (2013) [2] and Winn et al. (2015) [3]
* Goal:
    + Provide a single development framework which supports the creation of a wide range of bespoke models
* The core idea:
    + All assumptions about the problem domain are made explicit in the form of a model
[2] Bishop, C. M. (2013). Model-Based Machine Learning. Philosophical Transactions of the Royal Society A, 371, pp. 1–17.
[3] Winn, J., Bishop, C. M., Diethe, T. (2015). Model-Based Machine Learning. Microsoft Research Cambridge. http://www.mbmlbook.com.
What is a Model in MBML?
A model:
* is a set of assumptions, expressed in mathematical or graphical form
* expresses all parameters and variables as random variables
* shows the dependencies between variables
Figure 2: Description of a model
Key Ideas of MBML
MBML is built upon 3 key ideas:
* the use of Probabilistic Graphical Models (PGM)
* the adoption of Bayesian ML
* the application of fast, deterministic approximate inference algorithms
Key Idea 1: Probabilistic Graphical Models
Combine probability theory with graphs (e.g., factor graphs)
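As a hedged illustration of what a factor graph encodes (the notation below is an assumption and does not come from the slides): the graph represents a joint distribution over variables $x_1, \dots, x_K$ as a product of local factors $f_s$, each of which depends only on a small subset $\mathbf{x}_s$ of the variables, with $Z$ a normalizing constant.

```latex
p(x_1, \dots, x_K) = \frac{1}{Z} \prod_{s} f_s(\mathbf{x}_s),
\qquad
Z = \sum_{x_1, \dots, x_K} \prod_{s} f_s(\mathbf{x}_s)
```

When every factor is itself a prior or a conditional distribution, as in the case study later in the deck, $Z = 1$ and the product is simply the joint distribution.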
Key Idea 2: Bayesian Machine Learning
Everything follows from two simple rules of probability
theory
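The slide's figure did not survive extraction. The two rules usually meant in this context are the sum rule and the product rule; they are written out here as a reminder, with Bayes' theorem following from them. The symbols $x$ and $y$ are generic placeholders, not notation from the slides.

```latex
\begin{align}
\text{Sum rule:} \quad & p(x) = \sum_{y} p(x, y) \\
\text{Product rule:} \quad & p(x, y) = p(y \mid x)\, p(x) \\
\text{Bayes' theorem:} \quad & p(y \mid x) = \frac{p(x \mid y)\, p(y)}{\sum_{y'} p(x \mid y')\, p(y')}
\end{align}
```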
Key Idea 3: Inference Algorithms
The application of fast, approximate inference algorithms based on local message passing:
* Variational Bayes
* Belief Propagation, Loopy Belief Propagation
* Expectation Propagation
Figure 3: Inference algorithms, MCMC vs approximate methods
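To make "local message passing" concrete, here is a minimal sketch of sum-product message passing on a three-variable chain, checked against brute-force enumeration. The chain A - B - C and all probability tables are hypothetical, and on a chain (a tree) belief propagation happens to be exact; on graphs with loops the same local updates give only an approximation.

```python
from itertools import product

# Hypothetical 3-variable chain A - B - C; every variable is binary (0 or 1).
prior_a = [0.6, 0.4]                 # p(A)
f_ab = [[0.9, 0.1], [0.2, 0.8]]      # p(B | A), indexed [a][b]
f_bc = [[0.7, 0.3], [0.1, 0.9]]      # p(C | B), indexed [b][c]


def marginal_c_by_messages():
    """Marginal of C computed by passing local messages along the chain."""
    # Message A -> B: sum out A using only the factors attached to A.
    msg_ab = [sum(prior_a[a] * f_ab[a][b] for a in range(2)) for b in range(2)]
    # Message B -> C: sum out B, reusing the incoming message from A.
    msg_bc = [sum(msg_ab[b] * f_bc[b][c] for b in range(2)) for c in range(2)]
    total = sum(msg_bc)
    return [m / total for m in msg_bc]


def marginal_c_by_enumeration():
    """Same marginal obtained by summing the full joint over A and B."""
    result = [0.0, 0.0]
    for a, b, c in product(range(2), repeat=3):
        result[c] += prior_a[a] * f_ab[a][b] * f_bc[b][c]
    return result
```

Both functions return the same marginal for C. The message-passing version only ever touches one factor at a time, which is what lets this family of algorithms scale to larger graphs.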
Stages of MBML
3 stages of MBML:
* Build the model: the joint probability distribution of all the relevant variables (e.g., as a graph)
* Incorporate the observed data
* Perform inference to learn the parameters of the latent variables
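The three stages can be sketched in a few lines of plain Python. This is an illustration only: the travel times, the prior, and the assumption of a known day-to-day spread are all made up, and inference is done by brute-force evaluation of Bayes' rule on a grid rather than by any of the message-passing algorithms named earlier.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a Normal(mean, sd**2) distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


# Stage 1: build the model (all numbers below are hypothetical).
# Latent variable: average travel time `at`, on a grid of candidate values.
grid = [10.0 + 0.1 * i for i in range(101)]          # 10.0 .. 20.0 minutes
prior = [normal_pdf(at, 15.0, 5.0) for at in grid]   # assumed prior belief
NOISE_SD = 2.0                                       # assumed known spread

# Stage 2: incorporate the observed data.
observed = [13.0, 17.0, 16.0, 12.0, 13.0]            # made-up travel times


# Stage 3: perform inference, here by applying Bayes' rule at each grid point.
def posterior_on_grid(data):
    weights = []
    for at, p in zip(grid, prior):
        likelihood = 1.0
        for tt in data:
            likelihood *= normal_pdf(tt, at, NOISE_SD)
        weights.append(p * likelihood)
    total = sum(weights)
    return [w / total for w in weights]


posterior = posterior_on_grid(observed)
posterior_mean = sum(at * w for at, w in zip(grid, posterior))
```

With these numbers the posterior mean lands close to the sample mean of 14.2 minutes, pulled slightly towards the prior mean of 15.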
Special cases of MBML
Special cases
For sequential data
Benefits of MBML
Potential benefits of this approach:
* Provides a systematic process for creating ML solutions
* Allows for incorporation of prior knowledge
* Allows for handling uncertainty in a principled manner
* Is less prone to overfitting, because parameters are averaged over rather than fitted as point estimates
* Custom solutions are built for specific problems
* Allows for quick building of several alternative models
* Makes it easy to compare those alternatives
* It’s general purpose: no need to learn the 1000s of existing ML algorithms
* Separates the model from the inference/training code
Probabilistic Programming
What is Probabilistic Programming?
A software package that takes the model and then automatically generates inference routines (even source code!) to solve a wide variety of models
Takes programming languages and adds support for:
* random variables
* constraints on variables
* inference
Examples of PP software packages:
* Infer.NET (C#, C++)
* Stan (R, Python, C++)
* BUGS
* Church
* PyMC (Python)
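The division of labour described above can be mimicked in plain Python: the user writes only a model (here, a log joint density), and a generic routine that knows nothing about the model performs inference. This is a toy sketch under stated assumptions, not how Infer.NET, Stan, BUGS, Church, or PyMC are implemented; the sampler is a basic random-walk Metropolis algorithm, and the data, prior, and noise level are hypothetical.

```python
import math
import random


def metropolis(log_density, start, step, n_samples, seed=0):
    """Generic random-walk Metropolis sampler for a 1-D log density."""
    rng = random.Random(seed)
    x, logp = start, log_density(start)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        logp_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            x, logp = proposal, logp_new
        samples.append(x)
    return samples


# The "model" part: unknown average travel time with a Normal(15, 5^2) prior
# and Normal observation noise with an assumed known spread of 2 minutes.
DATA = [13.0, 17.0, 16.0, 12.0, 13.0]   # made-up travel times


def log_joint(at):
    """Unnormalized log of p(at) * prod_n p(tt_n | at)."""
    log_prior = -0.5 * ((at - 15.0) / 5.0) ** 2
    log_lik = sum(-0.5 * ((tt - at) / 2.0) ** 2 for tt in DATA)
    return log_prior + log_lik


# The "inference" part is reused unchanged for any other log density.
samples = metropolis(log_joint, start=15.0, step=1.0, n_samples=20000)
kept = samples[2000:]                    # discard warm-up draws
posterior_mean = sum(kept) / len(kept)
```

Swapping in a different model means rewriting only `log_joint`; `metropolis` stays as it is, which is the separation of model and inference code that the tools listed above provide at scale.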
How Probabilistic Programming works
Figure 5: How Infer.NET works
Case Study
A Bicyclist’s Daily Travel
Analysing the distribution of an individual cyclist’s daily
travel time to work
Identify the variables of interest
* $tt_n$ - travel time on the $n$th day
* $at$ - average travel time
* $tu$ - uncertainty
(Figure: graphical model over $tt_n$, $at$ and $tu$, with $N$ observed days)
Specify relationships between variables
Joint distribution is given by
$$p(tt, at, tu) = \underbrace{p(at)\, p(tu)}_{\text{priors}} \times \underbrace{\prod_{n=1}^{N} p(tt_n \mid at, tu)}_{\text{likelihood}}$$
How should we define the likelihood $p(tt_n \mid at, tu)$?
* the distribution’s mean is the cyclist’s average travel time
* the distribution’s variance determines how much the travel time varies from day to day (e.g. due to variations in traffic conditions)
What distributions should p(at) and p(tu) have?
conjugate priors!
Likelihood given by
$$p(tt_n \mid at, tu) = \mathcal{N}(tt_n \mid at, tu)$$
We now know what distribution forms to assign to the priors...
$$p(at) = \mathcal{N}(at \mid \mu, \sigma^2)$$
$$p(tu) = \mathrm{Cauchy}(tu \mid \mu, \sigma^2)$$
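With the likelihood and both priors chosen, the log of the joint distribution can be evaluated directly. The sketch below makes several assumptions that the slides leave open: `tu` is treated as a standard deviation, its prior is a half-Cauchy (a Cauchy centred at zero and restricted to positive values) with a single scale parameter, and the default hyperparameter values are hypothetical. A Cauchy prior on the spread is not conjugate to the Normal likelihood, so the joint is evaluated numerically here rather than updated in closed form.

```python
import math


def log_normal_pdf(x, mean, sd):
    """Log density of Normal(mean, sd**2) at x."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def log_half_cauchy_pdf(x, scale):
    """Log density of a Cauchy(0, scale) restricted to x > 0."""
    if x <= 0.0:
        return -math.inf
    return math.log(2.0 / (math.pi * scale)) - math.log1p((x / scale) ** 2)


def log_joint(tt, at, tu, prior_mean=15.0, prior_sd=5.0, prior_scale=5.0):
    """log p(tt, at, tu) = log p(at) + log p(tu) + sum_n log p(tt_n | at, tu)."""
    log_p_tu = log_half_cauchy_pdf(tu, prior_scale)
    if log_p_tu == -math.inf:
        return -math.inf              # non-positive spread is impossible
    log_p_at = log_normal_pdf(at, prior_mean, prior_sd)
    log_lik = sum(log_normal_pdf(t, at, tu) for t in tt)
    return log_p_at + log_p_tu + log_lik
```

For example, `log_joint([13.0, 17.0, 16.0], at=15.0, tu=2.0)` scores one candidate pair of latent values against three made-up observations; an inference algorithm explores or approximates this function over all values of `at` and `tu`.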
* The choice of the initial parameters of the prior is significant only when you have a small number of observations
* As the number of observations increases, the influence of the initial prior on the posterior declines
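A small numerical check of this point, under simplifying assumptions that differ from the full model: only the average travel time is unknown, its prior is Normal, and the day-to-day spread is a known constant, so the posterior mean has a closed form. All numbers are hypothetical.

```python
def posterior_mean_of_at(data, prior_mean, prior_sd, noise_sd):
    """Posterior mean of the average travel time under a Normal prior and a
    Normal likelihood with known noise standard deviation (conjugate case)."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = len(data) / noise_sd ** 2
    sample_mean = sum(data) / len(data)
    return (prior_precision * prior_mean + data_precision * sample_mean) / (
        prior_precision + data_precision
    )


def prior_gap(n_days):
    """Gap between the posterior means produced by two very different priors
    after n_days of observing a 14-minute trip every day."""
    data = [14.0] * n_days
    low = posterior_mean_of_at(data, prior_mean=10.0, prior_sd=2.0, noise_sd=2.0)
    high = posterior_mean_of_at(data, prior_mean=30.0, prior_sd=2.0, noise_sd=2.0)
    return high - low
```

With these settings the two priors disagree by 20 minutes, and the gap between the resulting posterior means is 20 / (n_days + 1): 10 minutes after one day, 2 minutes after nine days, and 0.2 minutes after ninety-nine days.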