This guest lecture was delivered at NYU’s Computer Science department. It dives into what most ML tutorials leave out — the ML lifecycle, production pitfalls, and how MLOps brings machine learning into the real world.
From Notebook to Production: What Most ML Tutorials Don’t Teach
1. From Notebook to Production: What Most ML Tutorials Don’t Teach
Vivek Bharti - Sr. MLE @ Roku
Guest Lecture at NYU Computer Science – Spring 2025
2. Agenda
● Intro to ML
● ML Case Study
● ML Development Lifecycle
● MLOps
● Q&A
3. What is Machine Learning?
● Machine Learning (ML) is a method of teaching computers to learn patterns from data and make decisions or
predictions without being explicitly programmed.
● Types of ML:
○ Supervised – Learn from labeled examples (e.g. spam detection)
○ Unsupervised – Discover patterns without labels (e.g. customer clustering)
○ Reinforcement – Learn by interacting and getting rewarded (e.g. game-playing AIs)
4. Why is ML Important?
Real-World Applications:
● Product Recommendations (Amazon, Netflix, and other streaming platforms)
● Fraud Detection (banking, credit cards)
● Self-driving Cars (Tesla, Waymo)
● Chatbots & Language Models
● Disease Diagnosis (medical imaging)
Why It’s Growing:
● Explosion of data
● Affordable compute power
● Open-source ML tools (TensorFlow, PyTorch, scikit-learn)
Key Takeaway:
ML is transforming every industry — learning how it works is essential for modern software engineers.
5. Real-World Problem – Classifying Emails as Spam or Not Spam
Goal: Build a model that can classify incoming emails as "Spam" or "Not Spam".
● Input (Features):
○ Email subject
○ Body content
○ Sender address
○ Keywords (e.g., “free”, “win”, “offer”)
● Output (Label):
○ 1 = Spam
○ 0 = Not Spam
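To make this concrete, a single training example might look like the sketch below. The field names and values are purely illustrative, not from a real dataset:

```python
# One hypothetical training example for the spam classifier.
# Field names and values are illustrative placeholders.
example_email = {
    "subject": "You WIN a FREE offer!!!",
    "body": "Click here to claim your free prize...",
    "sender": "promo@unknown-domain.example",
    "keywords_present": ["free", "win", "offer"],
}
label = 1  # 1 = spam, 0 = not spam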
6. Classifying Emails as Spam or Not Spam - Training the Model
● Dataset: Thousands of emails labeled by humans (spam vs not spam)
● Steps:
○ Text Preprocessing – clean & tokenize text
○ Feature Extraction – e.g., TF-IDF or embeddings
○ Model Selection – e.g., Logistic Regression, Decision Tree, or a simple Neural Net
○ Training – Feed data into model and adjust weights to minimize error
● Learning Objective: Find a function f(email) → {0,1}
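A minimal sketch of these steps using scikit-learn. The toy emails and labels below are placeholders; a real dataset would contain thousands of human-labeled examples:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labeled data; a real dataset would have thousands of labeled emails.
emails = [
    "WIN a FREE offer now, click here",
    "Meeting notes from Tuesday attached",
    "Claim your free prize, limited offer",
    "Lunch tomorrow at noon?",
]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam

# Text preprocessing + TF-IDF feature extraction + model, chained in one pipeline.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(emails, labels)  # training: fit weights to minimize error

# The learned function f(email) -> {0, 1}
print(model.predict(["free prize offer, click now"]))
```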
7. Classifying Emails as Spam or Not Spam - Evaluation and Deployment
● Evaluation Metrics:
○ Accuracy
○ Precision / Recall (important for spam)
○ Confusion Matrix
● Once the model is good enough:
○ Deploy into an email server
○ Continuously monitor performance (concept drift)
○ Retrain as new types of spam emerge
Note: There are many other important concepts, such as overfitting, underfitting, and the bias-variance tradeoff.
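As a sketch, these metrics can be computed with scikit-learn on a held-out test set. The label arrays below are made up for illustration:

```python
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)

# Hypothetical labels from a held-out test set (1 = spam, 0 = not spam).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print("Accuracy: ", accuracy_score(y_true, y_pred))   # overall correctness
print("Precision:", precision_score(y_true, y_pred))  # of flagged spam, how much was real
print("Recall:   ", recall_score(y_true, y_pred))     # of real spam, how much we caught
print("Confusion matrix:")
print(confusion_matrix(y_true, y_pred))               # rows = true class, cols = predicted
```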
8. ML Development Lifecycle
Now that we’ve seen how the model is trained, let’s explore how to scale and maintain it across the ML lifecycle.
● Data Collection & Preparation
○ Continuously collect new email data (including labeled spam vs not spam)
○ Feature engineering for better classification
● Model Development & Training
○ Experimentation with different models and hyperparameters
● Model Evaluation & Validation
○ Cross-validation, Hyperparameter tuning
○ Continuous evaluation against a validation set to ensure quality
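A small sketch of hyperparameter tuning with cross-validation, reusing the toy data from the training sketch above. The parameter grid and fold count are illustrative only:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Same toy data as in the training sketch.
emails = [
    "WIN a FREE offer now, click here",
    "Meeting notes from Tuesday attached",
    "Claim your free prize, limited offer",
    "Lunch tomorrow at noon?",
]
labels = [1, 0, 1, 0]

pipeline = Pipeline([("tfidf", TfidfVectorizer()), ("clf", LogisticRegression())])

# Try a few regularization strengths; cv=2 only because the toy set is tiny
# (a real run would use cv=5 or more and a larger grid).
search = GridSearchCV(pipeline, {"clf__C": [0.1, 1.0, 10.0]}, cv=2, scoring="recall")
search.fit(emails, labels)
print(search.best_params_, search.best_score_)
```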
9. ML Development Lifecycle
● Deployment
○ Integrate the model with the production email system, e.g., serialize it to a pickle file and expose it through an API
○ Use containerized environments (Docker) for portability
● Monitoring & Maintenance
○ Monitoring: Performance tracking, model drift detection
○ Feedback loops: Real-time feedback for improving the model
Note: A/B tests are run before full deployment to compare the performance of new models against the current one.
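A minimal serving sketch with FastAPI, assuming the trained pipeline was pickled to a placeholder path spam_model.pkl:

```python
import pickle

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Load the trained pipeline from disk; "spam_model.pkl" is a placeholder path.
with open("spam_model.pkl", "rb") as f:
    model = pickle.load(f)

class Email(BaseModel):
    text: str

@app.post("/predict")
def predict(email: Email):
    # The pipeline expects a list of raw email texts, as in the training sketch.
    label = int(model.predict([email.text])[0])
    return {"spam": bool(label)}
```

Run it with, e.g., uvicorn main:app (the module name depends on the file); packaging this service in a Docker image is what keeps it portable across development, staging, and production.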
10. ML Development Lifecycle - Challenges
● How do we scale these processes for continuous updates? Models may need frequent retraining and every experiment must be tracked; doing this manually is very inefficient.
● How do we ensure the model stays accurate over time and adapts to new patterns? This requires monitoring, drift detection, feedback loops, and retraining.
● How do we deploy and monitor the model consistently across environments? We need to keep the environment consistent.
To implement the ML development lifecycle and overcome these challenges, we need MLOps.
11. What is MLOps
MLOps (Machine Learning Operations) is the DevOps-inspired discipline that streamlines the development, deployment,
and lifecycle management of machine learning models.
Why MLOps is Needed:
● ML projects aren’t just code — they involve data, models, experiments, metrics, and retraining loops
● Models decay over time — retraining and monitoring are critical (drift: new kinds of spam keep appearing)
● Collaboration across PMs, UI/UX engineers, data scientists, engineers, and DevOps teams needs clear processes
Core Goals:
● Automate the ML lifecycle
● Ensure reproducibility and traceability
● Enable continuous delivery (CI/CD) for ML
● Monitor and maintain models in production in real time
12. MLOps in Action - MLOps Tools
CI/CD Pipelines
● Automate model training, testing, validation, and deployment
● Tools: GitLab CI, Jenkins, GitHub Actions, Kubeflow Pipelines
Model & Data Versioning
● Track model iterations, datasets, code, and performance metrics
● Tools: MLflow, DVC, Weights & Biases
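A sketch of experiment tracking with MLflow; the stand-in model, parameter, and metric values below are placeholders for a hypothetical run:

```python
import mlflow
import mlflow.sklearn
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# A stand-in model; in practice this is the pipeline from the training step.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(C=1.0))
model.fit(["free prize offer", "meeting notes"], [1, 0])

# Log one experiment run: hyperparameters, metrics, and the model artifact.
with mlflow.start_run(run_name="tfidf-logreg"):
    mlflow.log_param("C", 1.0)
    mlflow.log_metric("recall", 0.92)  # placeholder value from a hypothetical evaluation
    mlflow.sklearn.log_model(model, "spam-classifier")
```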
13. MLOps in Action - MLOps Tools
Model Serving
● Package models into APIs for real-time or batch use
● Tools: Flask, FastAPI, BentoML, TorchServe
Monitoring & Logging
● Track performance, data drift, latency, and errors in production in real time
● Tools: Prometheus, Grafana, ELK stack, Arize, WhyLabs
Scheduled Retraining
● Keep models up-to-date with new data
● Tools: Apache Airflow, Prefect
● Trigger retraining on schedule or on drift detection
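A minimal Airflow 2-style sketch of a weekly retraining job; the dag_id, schedule, and the retraining body are placeholders:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def retrain_spam_model():
    # Placeholder: pull the latest labeled emails, retrain the pipeline,
    # evaluate it, and publish the new model only if it beats the current one.
    ...

# Retrain every week; a drift-detection check could also trigger this DAG.
with DAG(
    dag_id="spam_model_retraining",
    start_date=datetime(2025, 1, 1),
    schedule_interval="@weekly",
    catchup=False,
) as dag:
    PythonOperator(task_id="retrain", python_callable=retrain_spam_model)
```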
14. MLOps in Action - Team Practices Supporting MLOps
Agile Development:
● Daily standups, weekly sprints, retrospectives
Kanban Boards:
● Track tasks, model experiments, bugs using Jira or Trello
Collaboration:
● Clear handoffs between PMs, UI/UX engineers, data scientists, ML engineers, and DevOps — coordinated via tools like
Slack, email, and Jira.
15. Key Takeaways and Q&A
● ML touches everyone’s lives in today’s world
● The ML Lifecycle is Continuous:
○ From data collection to model deployment, it’s an ongoing process that requires constant updates and
iterations.
● MLOps Enforces Automation & Collaboration:
○ The MLOps pipeline automates critical aspects like model retraining, monitoring, and scaling while
fostering collaboration between teams, making processes more efficient and less manual.
● Agile & Tools Drive Efficiency:
○ Practices like Agile sprints, Kanban boards, and tools like Jira and Trello keep tasks organized and ensure
timely updates.
Remember, building a machine learning model is only the beginning. Keeping that model effective and scalable over
time is where the actual value lies.
#1: Let’s discuss the entire journey of ML models from notebook to production
#3:
LLMs are typically trained using a combination of unsupervised, self-supervised, and sometimes supervised learning.
Pretraining = self-supervised (a form of unsupervised learning)
Fine-tuning = supervised or reinforcement learning
#7: In spam detection, missing spam is worse than occasionally flagging a legit email as spam.
Precision (a.k.a. Positive Predictive Value)
Definition: Of all the predicted positives, how many were actually positive? Precision = TP / (TP + FP)
Recall (a.k.a. Sensitivity or True Positive Rate)
Definition: Of all actual positives, how many were correctly predicted? Recall = TP / (TP + FN)
Accuracy
Definition: The proportion of total correct predictions. Accuracy = (TP + TN) / (TP + TN + FP + FN)
#8: Feature engineering ideas – Time of Day Sent, Number of Links, Has Attachment, HTML or Plain Text, Email Length, Excessive Punctuation, Previous Spam Reports on Similar Emails