ML Model Serving @Twitter
Joe Xie, Yue Lu and Jack Guo
Twitter: @Joe_Xie, @Yue, @JackGuo8
Outline
• ML Infra Overview
• Model Serving Challenges
• Deep Dive in Solutions
– Performance Optimization
– Robust & Resilient
– Real-time Online Learning
– Scaling with Parameter Server
• Model Serving Scenarios
ML Infra - Overview
• ML is increasingly at the core of
everything we build at Twitter
• ML infra supports many product teams
– ads ranking, ads targeting, timeline ranking,
product safety, recommendation, moments
ranking, trends
ML Infra – Product Examples
Ad
Recap
ML Infra - High-level Architecture
ML Infra – Core Prediction Engine
• Large scale online SGD
learning
• Architecture
– Transform: MDL, Decision tree
– Feature crossing
– Logistic Regression: In-house
JVM learner or Vowpal Wabbit
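The pipeline on this slide (feature crossing feeding a logistic regression trained by online SGD) can be sketched roughly as below. This is a minimal illustration, not the in-house JVM learner or Vowpal Wabbit: the class name `OnlineLogisticRegression`, the helper `cross_features`, the feature strings and the learning rate are all hypothetical, and the MDL and decision tree transforms are left out.

```python
import math
from itertools import combinations


def cross_features(features):
    """Return the input features plus every pairwise cross (hypothetical helper)."""
    base = sorted(features)
    return base + [f"{a}&{b}" for a, b in combinations(base, 2)]


class OnlineLogisticRegression:
    """Logistic regression over binary features, trained one example at a time."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.weights = {}  # feature name -> weight, grown lazily

    def predict(self, features):
        score = sum(self.weights.get(f, 0.0) for f in features)
        score = max(-30.0, min(30.0, score))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-score))

    def update(self, features, label):
        # For binary features the log-loss gradient per active feature is (p - y).
        error = self.predict(features) - label
        for f in features:
            self.weights[f] = self.weights.get(f, 0.0) - self.learning_rate * error


# Toy stream: the same user clicks one ad and skips another.
model = OnlineLogisticRegression()
clicked = cross_features(["user:sports", "ad:sneakers"])
skipped = cross_features(["user:sports", "ad:insurance"])
for _ in range(200):
    model.update(clicked, 1)
    model.update(skipped, 0)
```

Because the model is updated per example, it can keep learning while it serves, which is the property the later slides on real-time learning rely on.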
Model Serving Challenges
• Performant: Trillions of predictions served daily
• Robust & Resilient: Traffic spikes during events, e.g.
Super Bowl, Oscars, World Cup
• Real-time: news, events, trends, hashtags, ads.
Dynamically adapt to changes over spans as short as
a few hours or even minutes
• Scalability: Horizontal scaling to handle organic
growth, new features and advanced modeling
Performant – Prediction Engine Optimization
• Reduce serialization cost
– Model collocation
– Batch request API
• Reduce compute cost
– Feature id instead of string name
– Transform sharing across models
– Feature cross done on the fly
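Three of the ideas above (colocated models, a batch request API, and integer feature ids with a transform shared across models) could look roughly like this. It is a sketch under assumptions: `FeatureIndex`, `LinearModel`, `predict_batch` and the model and feature names are hypothetical and do not describe Twitter's actual API.

```python
import math


class FeatureIndex:
    """Maps feature names to integer ids so scoring never touches strings."""

    def __init__(self):
        self._ids = {}

    def id_of(self, name):
        # len() is evaluated before insertion, so new names get ids 0, 1, 2, ...
        return self._ids.setdefault(name, len(self._ids))


class LinearModel:
    def __init__(self, weights_by_id):
        self.weights_by_id = weights_by_id  # feature id -> weight

    def score(self, feature_ids):
        z = sum(self.weights_by_id.get(i, 0.0) for i in feature_ids)
        return 1.0 / (1.0 + math.exp(-z))


def predict_batch(models, requests, transform):
    """Score every request against every colocated model in one call."""
    results = []
    for raw in requests:
        feature_ids = transform(raw)  # computed once, shared by all models
        results.append({name: m.score(feature_ids) for name, m in models.items()})
    return results


index = FeatureIndex()
calls = {"transform": 0}


def to_ids(raw):
    calls["transform"] += 1
    return [index.id_of(name) for name in raw]


click = index.id_of("clicked_before")
new = index.id_of("new_user")
models = {
    "ctr": LinearModel({click: 2.0, new: -1.0}),
    "follow": LinearModel({click: 0.5}),
}
out = predict_batch(models, [["clicked_before"], ["new_user"]], to_ids)
```

Here two requests are scored by two models with only two transform calls and a single API call, which is where the serialization and compute savings would come from.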
Robust & Resilient
• Resilient
– Load factor to control the traffic based on the
success rate of the requests
• Robust
– Snapshot models at fixed interval
– Abnormal traffic detection
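One way to read "load factor based on success rate" is as a value in (0, 1] that sets the fraction of traffic admitted. The sketch below assumes a multiplicative-decrease, additive-increase policy; that policy, the class name `LoadFactorController` and all thresholds are illustrative choices, not details given in the slides.

```python
import random


class LoadFactorController:
    """Shrinks the admitted share of traffic when the success rate drops."""

    def __init__(self, target=0.99, decrease=0.5, increase=0.1, min_factor=0.05):
        self.target = target          # success rate considered healthy
        self.decrease = decrease      # multiplicative back-off on a bad window
        self.increase = increase      # additive recovery on a good window
        self.min_factor = min_factor  # never shed all traffic
        self.load_factor = 1.0

    def record_window(self, successes, total):
        """Update the load factor from one observation window."""
        if total == 0:
            return self.load_factor
        if successes / total < self.target:
            self.load_factor = max(self.min_factor, self.load_factor * self.decrease)
        else:
            self.load_factor = min(1.0, self.load_factor + self.increase)
        return self.load_factor

    def admit(self, rng=random.random):
        """Admit a request with probability equal to the load factor."""
        return rng() < self.load_factor


controller = LoadFactorController()
after_spike = controller.record_window(successes=50, total=100)
for _ in range(10):
    recovered = controller.record_window(successes=100, total=100)
```

Backing off quickly and recovering slowly keeps a struggling instance from being overwhelmed again as soon as it starts succeeding.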
Real-time – Online Learning
[Diagram: Training Traffic and Client Read Requests both arrive at each Prediction Service Instance, which holds the Model]
Scaling – Challenges
• Network fan-out: Each prediction service has to
receive all training traffic
• Limit to Training Traffic Size: Training throughput
limited by the capacity of a single instance
• Inefficient serving: A large portion of resources is
allocated to training
Scaling – Parameter Server
• Incremental model updates instead of
integrated training
[Diagram: Training Traffic enters the Server Group ('Server Node'), which sends Model Updates to the Serving Group ('Worker Node'); each node holds a Model, and the Serving Group answers Client Read Requests]
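The split between training and serving can be sketched as follows, assuming the server node does all training and publishes only the weights that changed, while worker nodes apply those updates and answer reads. `ServerNode`, `WorkerNode`, `drain_updates` and `apply_updates` are hypothetical names, and the in-process hand-off stands in for whatever transport the real system uses.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class ServerNode:
    """Consumes training traffic and tracks which weights changed."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.weights = {}
        self._dirty = set()

    def train(self, features, label):
        error = sigmoid(sum(self.weights.get(f, 0.0) for f in features)) - label
        for f in features:
            self.weights[f] = self.weights.get(f, 0.0) - self.learning_rate * error
            self._dirty.add(f)

    def drain_updates(self):
        """Return only the weights touched since the last drain."""
        delta = {f: self.weights[f] for f in self._dirty}
        self._dirty.clear()
        return delta


class WorkerNode:
    """Serves read requests from a local copy of the model; never trains."""

    def __init__(self):
        self.weights = {}

    def apply_updates(self, delta):
        self.weights.update(delta)

    def predict(self, features):
        return sigmoid(sum(self.weights.get(f, 0.0) for f in features))


server = ServerNode()
workers = [WorkerNode(), WorkerNode()]

server.train(["a", "b"], 1)
first_delta = server.drain_updates()
for w in workers:
    w.apply_updates(first_delta)

server.train(["a"], 0)
second_delta = server.drain_updates()  # carries only the weight for "a"
for w in workers:
    w.apply_updates(second_delta)
```

Under this reading, training traffic reaches only the server node, so adding worker nodes scales reads without multiplying the training fan-out described on the previous slide.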
Model Serving Scenarios
• Static model in-memory integration
• Static model standalone service
• Online learning service with integrated
training
• Parameter server with incremental model
updates
Thank you
