From Notebook to Production: What
Most ML Tutorials Don’t Teach
Vivek Bharti - Sr. MLE @ Roku
Guest Lecture at NYU Computer Science – Spring 2025
Agenda
● Intro to ML
● ML Case Study
● ML Development Lifecycle
● MLOps
● Q&A
What is Machine Learning
● Machine Learning (ML) is a method of teaching computers to learn patterns from data and make decisions or
predictions without being explicitly programmed.
● Types of ML:
○ Supervised – Learn from labeled examples (e.g., spam detection)
○ Unsupervised – Discover patterns without labels (e.g., customer clustering)
○ Reinforcement – Learn by interacting and getting rewarded (e.g., game-playing AIs)
Why is ML Important?
Real-World Applications:
● Product Recommendations (Amazon, Netflix or other streaming platforms)
● Fraud Detection (banking, credit cards)
● Self-driving Cars (Tesla, Waymo)
● Chatbots & Language Models
● Disease Diagnosis (medical imaging)
Why It’s Growing:
● Explosion of data
● Affordable compute power
● Open-source ML tools (TensorFlow, PyTorch, scikit-learn)
Key Takeaway:
ML is transforming every industry — learning how it works is essential for modern software engineers.
Real-World Problem – Classifying Emails as
Spam or Not Spam
Goal: Build a model that can classify incoming emails as "Spam" or "Not Spam".
● Input (Features):
○ Email subject
○ Body content
○ Sender address
○ Keywords (e.g., “free”, “win”, “offer”)
● Output (Label):
○ 1 = Spam
○ 0 = Not Spam
Classifying Emails as Spam or Not Spam -
Training the Model
● Dataset: Thousands of emails labeled by humans (spam vs not spam)
● Steps:
○ Text Preprocessing – clean & tokenize text
○ Feature Extraction – e.g., TF-IDF or embeddings
○ Model Selection – e.g., Logistic Regression, Decision Tree, or a simple Neural Net
○ Training – Feed data into model and adjust weights to minimize error
● Learning Objective: Find a function f(email) → {0,1}
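The training steps above can be sketched with scikit-learn. This is a toy illustration, not a production pipeline: the labeled emails below are made up and far too few for a real classifier.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny hand-labeled dataset: 1 = spam, 0 = not spam (made-up examples)
emails = [
    "win a free offer now",
    "meeting at 3pm tomorrow",
    "free money click here to win",
    "project status update attached",
]
labels = [1, 0, 1, 0]

# TF-IDF turns raw text into numeric features;
# logistic regression then learns f(email) -> {0, 1}
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(emails, labels)

prediction = model.predict(["claim your free offer"])[0]
```

The pipeline bundles preprocessing and the model together, so the exact same transformation is applied at training and prediction time.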
Classifying Emails as Spam or Not Spam -
Evaluation and Deployment
● Evaluation Metrics:
○ Accuracy
○ Precision / Recall (important for spam)
○ Confusion Matrix
● Once the model is good enough:
○ Deploy into an email server
○ Continuously monitor performance (concept drift)
○ Retrain as new types of spam emerge
Note: There are many other important concepts, such as overfitting, underfitting, and the bias-variance tradeoff.
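To make these metrics concrete, here is a from-scratch sketch. The prediction vectors are illustrative, made-up numbers.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels, 1 = spam."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # human labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # model outputs

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)   # fraction of all predictions correct
precision = tp / (tp + fp)           # of predicted spam, how much was spam
recall = tp / (tp + fn)              # of actual spam, how much was caught
```

With one false positive and one false negative out of eight emails, all three metrics come out to 0.75 here; on imbalanced real data they diverge, which is why precision and recall matter more than raw accuracy for spam.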
ML Development Lifecycle
Now that we’ve seen how a model is trained, let’s explore how to scale and maintain it across the ML lifecycle.
● Data Collection & Preparation
○ Continuously collect new email data (including labeled spam vs not spam)
○ Feature engineering for better classification
● Model Development & Training
○ Experimentation with different models and hyperparameters
● Model Evaluation & Validation
○ Cross-validation, Hyperparameter tuning
○ Continuous evaluation against a validation set to ensure quality
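Cross-validation splits the data into k folds and rotates which fold is held out. A minimal index-splitting sketch (`kfold_indices` is a hypothetical helper, not a library API; libraries like scikit-learn also shuffle and stratify):

```python
def kfold_indices(n, k):
    """Split indices 0..n-1 into k contiguous, non-overlapping folds."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

folds = kfold_indices(10, 3)

# Each round: one fold is the validation set, the rest are training data
for val_fold in folds:
    train_idx = [j for f in folds if f is not val_fold for j in f]
```

Every example is used for validation exactly once, so the averaged score is a more reliable estimate than a single train/validation split.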
ML Development Lifecycle
● Deployment
○ Integrating the model with the production email system: serialize it (e.g., as a pickle file) and build an API around it
○ Use containerized environments (Docker) for portability
● Monitoring & Maintenance
○ Monitoring: Performance tracking, model drift detection
○ Feedback loops: Real-time feedback for improving the model
Note: A/B tests are run before full rollout to compare the performance of new models against the current one.
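The "model pickle file" step can be sketched as follows. `KeywordSpamModel` is a made-up stand-in for the real trained classifier; the flow (serialize at train time, deserialize in the serving process) is the same.

```python
import pickle

class KeywordSpamModel:
    """Toy stand-in for a trained classifier (hypothetical)."""
    def __init__(self, spam_words):
        self.spam_words = set(spam_words)

    def predict(self, email):
        return 1 if any(w in self.spam_words for w in email.lower().split()) else 0

model = KeywordSpamModel(["free", "win", "offer"])

# Serialize: in practice this bytes blob is written to a model.pkl artifact
blob = pickle.dumps(model)

# Deserialize: the serving process loads the artifact and predicts
restored = pickle.loads(blob)
```

One caveat worth teaching: unpickling runs arbitrary code, so only load model files from trusted sources, and pin library versions so the serialized object loads identically in production.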
ML Development Lifecycle - Challenges
● How do we scale these processes for continuous updates? Models may need frequent retraining, and every experiment must be tracked; doing this manually is very inefficient.
● How do we ensure the model stays accurate over time and adapts to new patterns? Through monitoring, drift detection, feedback loops, and retraining.
● How do we deploy and monitor the model consistently across environments? We need to keep environments consistent.
To implement the ML development lifecycle and overcome these challenges, we need MLOps.
What is MLOps
MLOps (Machine Learning Operations) is the DevOps-inspired discipline that streamlines the development, deployment,
and lifecycle management of machine learning models.
Why MLOps is Needed:
● ML projects aren’t just code — they involve data, models, experiments, metrics, and retraining loops
● Models decay over time — retraining and monitoring are critical (drift; e.g., new spam patterns keep emerging)
● Collaboration across PMs, UI/UX engineers, data scientists, ML engineers, and DevOps teams needs clear processes
Core Goals:
● Automate the ML lifecycle
● Ensure reproducibility and traceability
● Enable continuous delivery (CI/CD) for ML
● Monitor and maintain models in production in real time
MLOps in Action - MLOps Tools
CI/CD Pipelines
● Automate model training, testing, validation, and deployment
● Tools: GitLab CI, Jenkins, GitHub Actions, Kubeflow Pipelines
Model & Data Versioning
● Track model iterations, datasets, code, and performance metrics
● Tools: MLflow, DVC, Weights & Biases
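The core idea behind these tools can be illustrated with a minimal, hypothetical run tracker; real tools such as MLflow also handle artifact storage, UIs, and lineage.

```python
def log_run(registry, params, metrics):
    """Record one experiment run (toy sketch of what MLflow-style tracking stores)."""
    run = {"run_id": len(registry) + 1, "params": params, "metrics": metrics}
    registry.append(run)
    return run["run_id"]

runs = []
log_run(runs, {"model": "logreg", "C": 1.0}, {"precision": 0.92, "recall": 0.88})
log_run(runs, {"model": "logreg", "C": 0.1}, {"precision": 0.94, "recall": 0.85})

# With every run recorded, picking the best model is a query, not guesswork
best = max(runs, key=lambda r: r["metrics"]["precision"])
```

The metric values are invented for illustration; the point is that once params and metrics are logged per run, model selection and reproducibility become queries over a registry rather than memory.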
MLOps in Action - MLOps Tools
Model Serving
● Package models into APIs for real-time or batch use
● Tools: Flask, FastAPI, BentoML, TorchServe
Monitoring & Logging
● Track performance, data drift, latency, and errors in production in real time
● Tools: Prometheus, Grafana, ELK stack, Arize, WhyLabs
Scheduled Retraining
● Keep models up-to-date with new data
● Tools: Apache Airflow, Prefect
● Trigger retraining on schedule or on drift detection
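A toy sketch of the drift trigger. The threshold and label streams are made up; production systems use richer statistics (e.g., PSI or KS tests) and feature-level checks, but the trigger pattern is the same.

```python
def drift_detected(reference, live, threshold=0.15):
    """Flag drift when the live spam rate moves too far from the training-time rate."""
    ref_rate = sum(reference) / len(reference)
    live_rate = sum(live) / len(live)
    return abs(live_rate - ref_rate) > threshold

reference = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]  # 30% spam at training time
live      = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0]  # 70% spam this week

# In Airflow/Prefect, a check like this would kick off the retraining pipeline
retrain_needed = drift_detected(reference, live)
```

A scheduler runs this check periodically; when it fires, the retraining DAG pulls fresh labeled data, retrains, re-evaluates, and promotes the new model only if it beats the current one.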
MLOps in Action - Team Practices Supporting
MLOps
Agile Development:
● Daily standups, weekly sprints, retrospectives
Kanban Boards:
● Track tasks, model experiments, bugs using Jira or Trello
Collaboration:
● Clear handoffs between PMs, UI/UX engineers, data scientists, ML engineers, and DevOps — coordinated via tools like
Slack, email, and Jira.
Key Takeaways and Q&A
● ML touches everyone’s lives in today’s world
● The ML Lifecycle is Continuous:
○ From data collection to model deployment, it’s an ongoing process that requires constant updates and
iterations.
● MLOps Enforces Automation & Collaboration:
○ The MLOps pipeline automates critical aspects like model retraining, monitoring, and scaling while fostering collaboration between teams, making processes more efficient and less manual.
● Agile & Tools Drive Efficiency:
○ Practices like Agile sprints, Kanban boards, and tools like Jira and Trello keep tasks organized and ensure
timely updates.
Remember, building a machine learning model is only the beginning. Keeping that model effective and scalable over
time is where the actual value lies.
Thank You!
Connect on LinkedIn: www.linkedin.com/in/vivek-bharti

Editor's Notes

  • #1: Let’s discuss the entire journey of ML models from notebook to production
  • #3: LLMs are typically trained using a combination of unsupervised, self-supervised, and sometimes supervised learning. Pretraining is self-supervised (a form of unsupervised learning); fine-tuning uses supervised learning or reinforcement learning.
  • #4: Deep learning concepts date back to the 1950s.
  • #7: In spam detection, missing spam is worse than occasionally flagging a legitimate email as spam. Precision (a.k.a. positive predictive value): of all predicted positives, how many were actually positive? Recall (a.k.a. sensitivity or true positive rate): of all actual positives, how many were correctly predicted? Accuracy: the proportion of total correct predictions.
  • #8: Feature engineering ideas: time of day sent, number of links, has attachment, HTML or plain text, email length, excessive punctuation, previous spam reports on similar emails.