Productionizing Predictive Analytics using the Rendezvous Architecture - for Architects

Delivery Excellence:
Love is in the Air –
Rendezvous Architecture
for Analytics in Production
Munich, October 25, 2019, Daniel Schulz

Love is in the Air
What is Rendezvous Architecture
and why is it Important?

Why is Rendezvous Architecture Important?
• An Event-driven Architecture (EDA)
• Applies Event stream processing style for Supervised Learning tasks
• Solves common obstacles in productionizing Machine Learning Models for both
• Initial Go-Live &
• Updates in production
• Analogous to the Ivory Tower Effect in Enterprise Architecture
• Just as architects need feedback from production systems to improve current & future architectures
• Data Scientists need production feedback on their ML Models in order to improve upcoming versions of them
© 2019 Daniel Schulz. All rights reserved.
Taking Machine Learning Models to Production is Challenging

Traditional Green/Blue Deployments – a Load Balancer Forwards
any Request to Exactly One Machine Learning Model
The Reverse Proxy Enables Green/Blue Deployments Only
Source & image courtesy: [MLL]

Load Balanced,
Parallelized Models
Enable Multiple ML Models at Once

Multiple Machine Learning Models in Parallel –
Adding a Message Queue Enables Concurrency of Models
But which Prediction to Choose in the End?

• Shortcomings:
• Hard to run challenging models against one another for reference
• No way to compare the accuracies of many models against one another
• Continuous improvement in the Data Science or DataOps team is unlikely due to lack of feedback
• Each queued message needs a return address (as a URI) and a boolean flag stating whether to return anything at all,
so that the consumer can respond to the original (HTTP) request, which might already be completed by then
• Persistent Message Queues tend to be slower than pure in-memory ones;
Spark Streaming, working in micro-batches, might add latency as well
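
A minimal sketch of the return address and reply flag mentioned above, assuming JSON messages on the queue. The names `ScoringRequest`, `reply_to` and `expects_reply` are hypothetical and not taken from [MLL]:

```python
import json
import uuid
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ScoringRequest:
    """Hypothetical envelope for one scoring request placed on the queue."""
    request_id: str
    payload: dict
    reply_to: Optional[str]  # return address as a URI; None if nobody waits
    expects_reply: bool      # flag: should a consumer answer at all?


def make_request(payload, reply_to=None):
    """Build an envelope; a reply is expected only if a return address is given."""
    return ScoringRequest(
        request_id=str(uuid.uuid4()),
        payload=payload,
        reply_to=reply_to,
        expects_reply=reply_to is not None,
    )


def to_wire(request):
    """Serialize the envelope for the message queue."""
    return json.dumps(asdict(request))


def from_wire(raw):
    """Rebuild the envelope on the consumer side."""
    return ScoringRequest(**json.loads(raw))
```

The consumer reads `expects_reply` first and only then uses `reply_to`, so fire-and-forget requests cost no response traffic.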

Rendezvous
Architecture
Backstop Request Guarantees,
Parallelized, Comparable,
Rapid Model Updates,
QoS Guarantees,
etc.

The ML Model’s Rendezvous – a Scoring Stream Collects Various
Predictions & Returns them to the Original Request’s Client
Many Models – One Final Prediction
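
The rendezvous step can be sketched as a small in-memory collector, assuming every prediction carries a request ID and the model's name. The class `Rendezvous` and its fixed preference list are illustrative assumptions, not the design from [MLL]:

```python
from collections import defaultdict


class Rendezvous:
    """Collects predictions per request and picks one final answer."""

    def __init__(self, preference):
        # Model names, most preferred first (hypothetical ranking).
        self.preference = list(preference)
        # request_id -> {model_name: prediction}
        self.scores = defaultdict(dict)

    def collect(self, request_id, model_name, prediction):
        """Called for every message arriving on the scoring stream."""
        self.scores[request_id][model_name] = prediction

    def decide(self, request_id):
        """Return (model_name, prediction) of the most preferred model that answered."""
        answered = self.scores.pop(request_id, {})
        for name in self.preference:
            if name in answered:
                return name, answered[name]
        return None  # no model has answered for this request
```

Usage: `collect()` is called once per model answer, `decide()` once per request when the client must be served.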

Stateless Models are Easier to Replicate –
but Data Augmentation Might be Helpful or Necessary
Any Additional Data or State Information Helpful to our Models Shall be Added in One Central Place

The Decoy Model –
Collect Production Data for Debugging, Optimizations and Reproducibility
The “Unit Tests” of AI Models in Production are Real-World Data
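
A rough sketch of a decoy, assuming requests are archived as JSON lines. `DecoyModel` and `replay` are hypothetical names; a real decoy would write to durable storage rather than to an arbitrary file-like object:

```python
import io
import json


class DecoyModel:
    """Looks like a model to the input stream but only archives what it sees."""

    def __init__(self, sink):
        self.sink = sink  # any writable text file-like object

    def predict(self, request_id, features):
        record = {"request_id": request_id, "features": features}
        self.sink.write(json.dumps(record, sort_keys=True) + "\n")
        return None  # a decoy never contributes a prediction


def replay(source):
    """Read archived requests back, e.g. to re-run them against a new model."""
    return [json.loads(line) for line in source if line.strip()]
```

Replaying the archive against a candidate model reproduces exactly what production saw, which is what makes it usable for debugging.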

Discussion when to Add External Information
No Augmentation
▪ Models only receive request information
▪ More resilient to technical failures
▪ More overhead when many models fetch the same data –
use caches here

All Augmentation in One Central Place
▪ Models receive all complete information
▪ Ideal case for reproducibility –
as the “Decoy Model” stores all information for
debugging and to explain predictions later-on
▪ Usually faster due to smaller overhead
→ Best Practice by Ellen Friedman & Ted Dunning,
Chapter 3, sub-section “Stateful Models” in [MLL]

Add Metrics & the Canary Model for Monitoring & Optimizations
Metrics are Crucial to Compare Many Models Against One Another & Hence Support Accuracy Improvements

Metrics
▪ Metrics monitor all models’ predictions and compare
their performance
▪ Hence, we are able to judge the challenging models’
performance against the incumbent model’s
▪ Metrics are helpful to detect outliers in predictions:
• e.g. Adversarial Images, where models might predict
obscure classifications
• e.g. detect drifting models, shifts in the input data’s
distribution, etc.
▪ Cover technical SLA, timing & AI metrics alike
• SLA metrics:
latency, throughput, etc.
• Timing metrics:
computation time in threads, time for requests by
source/endpoint, etc.
• AI metrics:
accuracy, error metrics, AUC, F-statistics, etc.
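
A sketch of a per-model summary, assuming each scored request is logged as a tuple of model name, latency in milliseconds, prediction and the later-known actual outcome. The record layout and the function name `summarize` are assumptions for illustration:

```python
from statistics import mean


def summarize(records):
    """records: iterable of (model_name, latency_ms, predicted, actual)."""
    by_model = {}
    for name, latency_ms, predicted, actual in records:
        by_model.setdefault(name, []).append((latency_ms, predicted == actual))
    return {
        name: {
            "mean_latency_ms": mean(lat for lat, _ in rows),     # SLA metric
            "accuracy": sum(ok for _, ok in rows) / len(rows),   # AI metric
        }
        for name, rows in by_model.items()
    }
```

Comparing the challenger's entry with the incumbent's gives one SLA figure and one AI figure per model side by side.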

The Canary Model
▪ Also a Best Practice by [MLL] for finding anomalies
▪ Is a rather dated model that keeps predicting so that
newer models’ predictions can be compared with it, to help detect
how production-ready they really are
▪ The difference between the Canary Model and later ones is proof of
progress, or lack thereof, for the Data Scientists
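
One simple way to quantify the difference to the canary, assuming both models scored the same requests in the same order. The disagreement rate used here is an illustrative choice, not a metric prescribed by [MLL]:

```python
def disagreement_rate(canary_predictions, candidate_predictions):
    """Share of requests on which the candidate differs from the canary."""
    if len(canary_predictions) != len(candidate_predictions):
        raise ValueError("both models must have scored the same requests")
    if not canary_predictions:
        return 0.0
    differing = sum(
        c != n for c, n in zip(canary_predictions, candidate_predictions)
    )
    return differing / len(canary_predictions)
```

A sudden jump in this rate for an unchanged candidate hints at a shift in the input data rather than at the model itself.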

Rendezvous Architecture – a Mixture of Models in Harmony
• Advantages:
• Model “warm-up” in production-like environments
• Switch models in an instant – un-deploy & deploy AI models swiftly
• Introduce time guarantees: all models work in parallel like Cassandra queries
• Mix simple, technically robust (not failing) models along w/ more sophisticated ones, which might break suddenly
• Incumbent vs challenging – collect raw data and performance metrics for various models (e.g. XGBoost vs Random
Forests; e.g. Linear Model vs SVM; e.g. PCA vs t-SNE) and differing versions in model streams (version 0.1, 0.2, …)
• Backstop:
when a request takes too long, a simpler, more robust model may answer as a backstop for more complex, more sophisticated
models; the same applies when longer-term performance metrics indicate another model would perform better
My Suggestion for Reliable AI Systems Due to…
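
The backstop idea can be sketched with threads standing in for the message streams, assuming models are plain callables. The function `score_with_backstop` and its parameters are hypothetical:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def score_with_backstop(features, primary, backstop, deadline_s, pool):
    """Ask both models in parallel; prefer the primary if it meets the deadline."""
    primary_future = pool.submit(primary, features)
    backstop_future = pool.submit(backstop, features)
    try:
        # Raises on timeout, and re-raises any error from the primary model.
        return "primary", primary_future.result(timeout=deadline_s)
    except Exception:
        # Primary was too slow or failed: the simple model answers instead.
        return "backstop", backstop_future.result()
```

Because both models start at the same time, falling back costs no extra computation time after the deadline has passed.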

Résumé
Resilient, Extendable &
Production-ready Architecture for
Predictive Analytics

Résumé on Rendezvous Architecture
Limitations
▪ Is no silver bullet – does not solve all logistical obstacles in AI projects
▪ Focuses on production-side architecture, though development, test, QA and LTE environments might benefit from it as well
▪ Focuses on technical software architecture: does not cover ML Metrics, Hyperparameter Tuning, etc.
▪ Latencies increase slightly due to the Message-Queue dependency
▪ I am not aware of any implementation as of now – neither OSS, nor to purchase, nor Cloud-based

Major Advantages
▪ Manage multiple models in production and production-like
environments
▪ Test-drive incumbent & challenging models against one
another
▪ Reproducibility & transparency of models’ predictions
▪ Solves common obstacles in productionizing
Machine Learning Models for both initial go-live & updates in production

Minor Benefits
▪ Collect real-world data for future development
▪ Establish baseline performance values for Predictive
Analytics
▪ Rapid development due to default, fallback models
▪ Latency guarantees for model predictions
→ Resilient, robust predictive models
in modern Agile & DevOps projects

Rendezvous Architecture
is the Modern Bedrock
of Robust Predictive models for
Today’s Agile & DevOps Projects

Thank You for Your Attention
Please Feel Free to Ask any Open Questions, Suggestions or Voice Your Opinion…


Source & Image Courtesy: Book “Machine Learning Logistics” [MLL]
▪ Authors: Ellen Friedman & Ted Dunning
▪ Publisher: O'Reilly Media, Inc.
▪ Release Date: October 2017
▪ ISBN: 9781491997611
▪ Picture source ID: MLL

Source: Book “Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition”
▪ Author: Aurélien Géron
▪ Publisher: O'Reilly Media, Inc.
▪ Release Date: October 2019
▪ ISBN: 9781492032649
