From Software Engineering
To Machine Learning
Alexey Grigorev
Lead Data Scientist at OLX Group
Founder at DataTalks.Club
Timeline: 2010 → 2012 → 2015 → 2018
mlbookcamp.com
https://tech.olx.com/detecting-image-duplicates-at-olx-scale-7f59e4b6aef4
Mostly engineering work!
Hidden Technical Debt in Machine Learning Systems
https://papers.nips.cc/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf
You already have 90% of required skills
Learning Plan
● Start with fundamentals
● Learn simple algorithms
● Evaluate your model
● Deploy your model
● Learn complex algorithms
By doing projects!
The fundamentals
● Python
● NumPy
● Pandas
WARNING:
SOMETHING SCARY
It’s not scary!
Matrix multiplication is just a bunch of for loops!
Tip + Home task: implement these operations yourself:
● Vector-vector multiplication
● Matrix-vector multiplication
● Matrix-matrix multiplication
Bonus points:
● Express each operation using one for loop + previous operation
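One possible solution to the home task, as a minimal sketch in plain Python lists (no NumPy). The function names vector_vector, matrix_vector and matrix_matrix are chosen here for illustration and do not come from the slides. Each function uses one explicit for loop plus the previous operation, as the bonus task suggests.

```python
def vector_vector(u, v):
    """Dot product of two equal-length vectors: a single for loop."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    result = 0.0
    for i in range(len(u)):
        result += u[i] * v[i]
    return result


def matrix_vector(U, v):
    """Matrix (list of rows) times vector: one loop over rows + vector_vector."""
    result = []
    for row in U:
        result.append(vector_vector(row, v))
    return result


def matrix_matrix(U, V):
    """Matrix times matrix: one loop over the columns of V + matrix_vector."""
    n_cols = len(V[0])
    columns = []
    for j in range(n_cols):
        column_j = [row[j] for row in V]  # pick out the j-th column of V
        columns.append(matrix_vector(U, column_j))
    # columns[j][i] is the entry at row i, column j; transpose back to rows
    return [[columns[j][i] for j in range(n_cols)] for i in range(len(U))]


if __name__ == "__main__":
    U = [[1, 2], [3, 4]]
    V = [[5, 6], [7, 8]]
    print(vector_vector([1, 2, 3], [4, 5, 6]))
    print(matrix_vector(U, [1, 1]))
    print(matrix_matrix(U, V))
```

Comparing the results with NumPy's dot on the same inputs is a quick way to check your own version.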
The best way to learn:
Learn by doing projects
● Chapter 2: Regression
● Chapter 3: Classification
● Chapter 4: Evaluation
● Chapter 5: Deployment
● Chapter 6: Tree-Based Models
● Chapter 7: Deep Learning (Image Classification)
● Chapter 8: Serverless
● Chapter 9: Kubernetes and Kubeflow
https://github.com/alexeygrigorev/mlbookcamp-code/
Pictures from olx.ua
Project #1: Car Price Prediction
https://github.com/alexeygrigorev/mlbookcamp-code/blob/master/chapter-02-car-price/02-carprice.ipynb
Project #2: Churn
https://github.com/alexeygrigorev/mlbookcamp-code/blob/master/chapter-03-churn-prediction/03-churn.ipynb
Churn: 10% Churn: 20% Churn: 30% Churn: 40% Churn: 45%
Churn: 85%
Project #2 Cont’d: Evaluation
https://github.com/alexeygrigorev/mlbookcamp-code/blob/master/chapter-03-churn-prediction/04-metrics.ipynb
IMPORTANT!
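To make the evaluation step concrete, here is a minimal sketch of accuracy, precision and recall for a churn-style binary classifier, written as plain for loops. The labels, probabilities and the 0.5 threshold are made up for illustration and are not taken from the linked notebook.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count true/false positives and negatives at a given threshold."""
    tp = fp = tn = fn = 0
    for actual, prob in zip(y_true, y_prob):
        predicted = prob >= threshold
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif not predicted and actual:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def evaluate(y_true, y_prob, threshold=0.5):
    """Return accuracy, precision and recall as a dict."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        # Guard against division by zero when nothing is predicted positive
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    # Made-up data: 8 customers stay (0), 2 churn (1)
    y_true = [0] * 8 + [1] * 2
    # A "model" that gives everyone a low churn probability
    y_prob = [0.1] * 10
    print(evaluate(y_true, y_prob))
```

On this toy data the model that never predicts churn still reaches 80% accuracy while its recall is zero, which is one reason accuracy alone is not enough on imbalanced data.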
Project #2 Cont’d: Deployment
https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/chapter-05-deployment
Churn service: Model behind POST /predict
Request
{
"id": "8879-zkjof",
"gender": "female",
"partner": "no",
...
}
Response
{
"probability": 0.06,
"churn": true
}
IMPORTANT!*
* But easy for devs
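The request/response exchange of the churn service can be sketched without any web framework, as a function from a JSON request body to a JSON response body. This is a minimal illustration only: score_customer is a hypothetical stand-in for a trained model, its rule is made up, and the 0.5 decision threshold is an assumption. In a real service a web framework would route POST /predict to a function like handle_predict.

```python
import json


def score_customer(customer):
    """Hypothetical stand-in for a trained model: returns a churn probability.

    The rule below is invented for illustration; a real service would
    load a trained model and call it here.
    """
    return 0.7 if customer.get("partner") == "no" else 0.2


def handle_predict(request_body, threshold=0.5):
    """Turn a JSON request body into a JSON response body."""
    customer = json.loads(request_body)
    probability = score_customer(customer)
    response = {
        "probability": round(probability, 2),
        "churn": probability >= threshold,  # assumed decision threshold
    }
    return json.dumps(response)


if __name__ == "__main__":
    body = json.dumps({"id": "8879-zkjof", "gender": "female", "partner": "no"})
    print(handle_predict(body))
```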
Project #3: Credit Risk
https://github.com/alexeygrigorev/mlbookcamp-code/blob/master/chapter-06-trees/06-trees.ipynb
🚗 → Risk Scoring Model → Approve / Decline
Project #4: Image Classification
https://github.com/alexeygrigorev/mlbookcamp-code/blob/master/chapter-07-neural-nets/07-neural-nets-train.ipynb
inputs → base → vector → outputs
Input (150x150x3) → base_model → GlobalAveragePooling2D → Dense(10) → T-Shirt
keras.Model(inputs, outputs)
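In the spirit of "it's just for loops", the two layers that follow base_model can be written out by hand. This is not Keras code, only a plain-Python sketch of what global average pooling and a dense layer without activation compute. The tiny shapes and all numbers are toy values chosen for illustration, not the real 150x150x3 input or the actual output of the base model.

```python
def global_average_pooling_2d(feature_map):
    """Average an H x W x C feature map over H and W, giving a C-element vector."""
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    totals = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c in range(channels):
                totals[c] += pixel[c]
    return [total / (height * width) for total in totals]


def dense(vector, weights, bias):
    """Dense layer without activation.

    weights[c][k] connects input c to output k, so
    outputs[k] = sum over c of vector[c] * weights[c][k], plus bias[k].
    """
    outputs = []
    for k in range(len(bias)):
        total = bias[k]
        for c in range(len(vector)):
            total += vector[c] * weights[c][k]
        outputs.append(total)
    return outputs


if __name__ == "__main__":
    # Toy 2 x 2 feature map with 2 channels
    feature_map = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
    vector = global_average_pooling_2d(feature_map)
    # Toy dense layer with 3 outputs (the slide uses 10)
    weights = [[1, 0, 1], [0, 1, 1]]
    bias = [0, 0, 0.5]
    print(vector, dense(vector, weights, bias))
```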
Project #4 Cont’d: Deploy with Lambda
{
"tshirt": 0.9993,
"pants": 0.0005,
"shoes": 0.00004
}
https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/chapter-08-serverless
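A response like the one above could come from a handler shaped roughly as follows. This is a hedged sketch, not the code from the linked chapter: lambda_handler(event, context) is the standard entry point signature for Python functions on AWS Lambda, but the "url" event key, the predict_scores stub, its made-up scores and the softmax step are all assumptions for illustration. Only the three labels visible on the slide are used.

```python
import math

# Only the labels visible on the slide; the real model has more classes
LABELS = ["tshirt", "pants", "shoes"]


def predict_scores(url):
    """Hypothetical stand-in for downloading the image and running the model."""
    return [8.0, 0.4, -2.1]  # made-up raw scores, one per label


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def lambda_handler(event, context):
    """Entry point: read the image URL from the event, return label probabilities."""
    url = event["url"]  # assumed event format
    probabilities = softmax(predict_scores(url))
    return dict(zip(LABELS, probabilities))


if __name__ == "__main__":
    print(lambda_handler({"url": "https://example.com/tshirt.jpg"}, None))
```

Because the handler is an ordinary function, it can be called locally with a hand-made event before it is deployed.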
Project #4 Cont’d: Deploy with Kubernetes
Gateway (Flask): resize and process image; HTTP (JSON) in, label out (Pants)
Model (TF-Serving): make predictions; gRPC (protobuf)
Gateway → Model: pre-processed image
Model → Gateway: raw predictions
https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/chapter-09-kubernetes
Next steps
● Data science competitions (Kaggle)
● End-to-end projects — your own!
● You don’t have to do it alone. Join a community
Summary
● You already have 90% of required skills
● Learn fundamentals: Python, NumPy, Pandas
● Don’t be afraid of math (it’s just for loops)
● The best way to learn is by doing projects
● Learn evaluation metrics and cross-validation
● Deployment is easy for you and difficult for data scientists
● Don’t do it alone!
@Al_Grigor
agrigorev
DataTalks.Club
