TensorFlow meetup
07 Aug 2018, Ghent
Today
KubeFlow
Robbe Sneyders @RobbeSneyders
TensorFlow Transform
Matthias Feys @FsMatt
TensorFlow Hub & TensorFlow Serving
Stijn Decubber @sdcubber
Next time
- Next meetup: 08/10/2018
- PyTorch vs Keras
- ...
- ...
- ...
TensorFlow Transform
What is tf.Transform?
Library for preprocessing data with
TensorFlow
● Structured way of analyzing and
transforming big datasets
● Remove "training-serving skew"
Why tf.Transform?
{weight:100, x:99, y:12}
Batch process
Stream process
Why tf.Transform?
{weight:100, x:99, y:12}
tf.Transform
How does it work?
1. “Analyze” step similar to scikit-learn “fit” step
○ Iterates over the complete dataset and creates a TF Graph
2. “Transform” step similar to scikit-learn “transform” step
○ Uses the TF Graph from the “Analyze” step
○ Transforms the complete dataset
3. Same TF Graph can be used during serving
“Analyze” and “Transform” step both use the same preprocessing function
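The analyze/transform split above can be sketched in plain Python. This is only an illustration of the idea, not the tf.Transform API: the function names analyze and make_transform, the sample rows and the min-max scaling are all assumptions chosen for the example. The point it shows is that statistics computed once over the full dataset are frozen into a single function that is reused unchanged at serving time.

```python
# Plain-Python illustration of the analyze/transform split (not the tf.Transform API).

def analyze(rows, column):
    """Full pass over the dataset: compute the statistics the transform needs."""
    values = [row[column] for row in rows]
    return {"min": min(values), "max": max(values)}


def make_transform(stats, column):
    """Freeze the statistics into a function, much as a graph would hold them as constants."""
    span = stats["max"] - stats["min"]

    def transform(row):
        scaled = dict(row)
        # Scale to [0, 1]; a constant column maps to 0.0 to avoid dividing by zero.
        scaled[column] = (row[column] - stats["min"]) / span if span else 0.0
        return scaled

    return transform


training_rows = [
    {"weight": 100, "x": 99, "y": 12},
    {"weight": 50, "x": 10, "y": 3},
    {"weight": 150, "x": 42, "y": 7},
]

stats = analyze(training_rows, "weight")                # "Analyze": needs the complete dataset
scale_weight = make_transform(stats, "weight")
transformed = [scale_weight(r) for r in training_rows]  # "Transform": applied to the dataset

# At serving time the same function, with the same frozen statistics, handles one request.
served = scale_weight({"weight": 125, "x": 1, "y": 2})
```

Because the serving path calls the very same function object, there is no second implementation of the scaling logic that could drift from the training one.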
Preprocessing function in tf.Transform
“Analyze” vs “Transform”
Goal of “Analyze” step
Running on Apache Beam
● Open source, unified model for defining both
batch and streaming data-parallel
processing pipelines.
● Using one of the open source Beam SDKs,
you build a program that defines the
pipeline.
● The pipeline is then executed by one of
Beam’s supported distributed processing
back-ends, which include Apache Apex,
Apache Flink, Apache Spark, and Google
Cloud Dataflow.
[Figure: the Beam model. Pipeline construction with the Beam Java, Beam Python and other-language SDKs; Fn runners; execution on Apache Flink, Apache Spark or Cloud Dataflow.]
Source: https://beam.apache.org
Apache Beam Key Concepts
● Pipelines: a data processing job made of a
series of computations including input,
processing, and output
● PCollections: bounded (or unbounded)
datasets which represent the input,
intermediate and output data in pipelines
● PTransforms: a data processing step in a
pipeline that takes one or more PCollections
as input and produces one or more as output
● I/O Sources and Sinks: APIs for reading
and writing data which are the roots and
endpoints of the pipeline.
Source: https://beam.apache.org
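To make the vocabulary concrete, here is a toy model of these concepts in plain Python. It is not the Apache Beam SDK: lists stand in for PCollections, ordinary functions stand in for PTransforms, and run_pipeline, read_source, keep_heavy and extract_x are hypothetical names invented for this sketch.

```python
# Toy model of the Beam vocabulary in plain Python; this is NOT the Apache Beam SDK.

def read_source():
    """I/O source: the root of the pipeline."""
    return [{"weight": 100, "x": 99, "y": 12}, {"weight": 50, "x": 10, "y": 3}]


def keep_heavy(pcollection):
    """A transform: one collection in, one collection out."""
    return [row for row in pcollection if row["weight"] >= 100]


def extract_x(pcollection):
    """A second transform, chained after the first."""
    return [row["x"] for row in pcollection]


def run_pipeline(source, transforms, sink):
    """Pipeline: source -> series of transforms -> sink."""
    collection = source()
    for transform in transforms:
        collection = transform(collection)  # each step yields an intermediate collection
    sink(collection)


output = []
run_pipeline(read_source, [keep_heavy, extract_x], output.extend)
```

In real Beam the pipeline definition is separated from its execution, which is what lets the same program run on different back-ends; this sketch runs eagerly and only mirrors the structure.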
Demo time
(repo: http://bit.ly/tftransformdemo)
tf.Transform
Library for preprocessing data with
TensorFlow
● Structured way of analyzing and
transforming big datasets
● Remove "training-serving skew"
TensorFlow Hub
Why TF Hub?
Many state-of-the-art ML models are trained on huge datasets (ImageNet) and require massive
amounts of compute to train (VGG, Inception…)
However, reusing these models for other applications (transfer learning) can:
● Improve training speed
● Improve generalization and accuracy
● Allow training with smaller datasets
What is TF Hub?
https://www.tensorflow.org/hub/
● TF Hub is a library for the publication and consumption of ML models
● Similar to Caffe model zoo, Keras applications…
● But easier for everyone to publish and host models
● A module is a self-contained piece of a graph together with weights and assets
● Weights of the module can be retrained or fixed
How to use it?
The model graph and weights are downloaded when a Module is instantiated:
    import tensorflow as tf
    import tensorflow_hub as hub

    m = hub.Module("https://tfhub.dev/google/progan-128/1")
After that, the module is added to the graph each time it is called:
    with tf.Graph().as_default():
        module_url = "https://tfhub.dev/google/nnlm-en-dim128-with-normalization/1"
        embed = hub.Module(module_url)
        # Calling the module adds its ops to the current graph.
        embeddings = embed(["A long sentence.", "single-word",
                            "http://example.com"])
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            sess.run(tf.tables_initializer())
            print(sess.run(embeddings))
Returns embeddings without training a model
(NASNet: you get 62000+ GPU-hours)
Exporting and Hosting Modules
"https://tfhub.dev/google/progan-128/1"
repo: https://tfhub.dev / publisher: google / model: progan-128 / version: 1
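The anatomy of a module URL can be checked with a few lines of standard-library Python. This is a sketch under the assumption that the path always has exactly three segments (publisher, model, version); parse_module_url is a hypothetical helper, not part of the TF Hub library.

```python
from urllib.parse import urlparse


def parse_module_url(url):
    """Split a tfhub.dev module URL into repo, publisher, model and version."""
    parsed = urlparse(url)
    # Assumes a path of the form /<publisher>/<model>/<version>.
    publisher, model, version = parsed.path.strip("/").split("/")
    return {
        "repo": f"{parsed.scheme}://{parsed.netloc}",
        "publisher": publisher,
        "model": model,
        "version": int(version),
    }


parts = parse_module_url("https://tfhub.dev/google/progan-128/1")
```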
Exporting: define a graph, add a signature, call create_module_spec
    def module_fn():
        inputs = tf.placeholder(dtype=tf.float32, shape=[None, 50])
        # Two fully connected layers.
        layer1 = tf.layers.dense(inputs, 200)
        layer2 = tf.layers.dense(layer1, 100)
        outputs = dict(default=layer2, hidden_activations=layer1)
        # Add default signature.
        hub.add_signature(inputs=inputs, outputs=outputs)

    spec = hub.create_module_spec(module_fn)
Hosting: export the trained model, create a tarball and upload it
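The "create a tarball" part of the hosting step might look like the sketch below, which uses only the standard library. The directory contents are a stand-in (placeholder.pb is not the real exported layout), make_module_tarball is a hypothetical helper, and placing the files at the archive root is an assumption about what the hosting side expects.

```python
import os
import tarfile
import tempfile


def make_module_tarball(module_dir, tarball_path):
    """Pack the contents of an exported module directory into a .tar.gz archive."""
    with tarfile.open(tarball_path, "w:gz") as tar:
        # arcname="." keeps paths relative, so the files sit at the archive root.
        tar.add(module_dir, arcname=".")
    return tarball_path


# Stand-in for an exported module: the file name below is a placeholder, not the real layout.
workdir = tempfile.mkdtemp()
module_dir = os.path.join(workdir, "exported_module")
os.makedirs(module_dir)
with open(os.path.join(module_dir, "placeholder.pb"), "wb") as f:
    f.write(b"\x00")

tarball = make_module_tarball(module_dir, os.path.join(workdir, "module.tar.gz"))
with tarfile.open(tarball) as tar:
    names = tar.getnames()
```

Uploading the archive is left out, since it depends entirely on where the module is hosted.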
TF Hub Applications
Images
Natural language
More coming… (video, audio…)
TensorFlow Serving
What is Serving?
Serving is how you apply a model, after
you’ve trained it
What is Serving?
[Figure: the client side sends a message as a request to the server side, which returns the prediction message in the response.]
Why TF Serving?
● Online, low-latency
● Multiple models, multiple versions
● Should scale with demand: K8S
Goals
Data → Model → Application?
What is TF Serving?
● Flexible, high-performance serving system for machine learning models, designed for
production environments
● Can be hosted on, for example, Kubernetes
○ Roughly: ML Engine in your own Kubernetes cluster
https://github.com/tensorflow/serving
Data → Model → Serving → Application
Main Architecture
[Figure: TensorFlow Serving architecture. Server side: the file system holds Model v1 and Model v2; the TF Serving libraries scan and load models through a Loader, the Version Manager publishes new versions, and the servable handler serves the model. Client side: gRPC/REST requests.]
How?
General pipeline:
1. Train the model: custom TF, Keras, Estimators
2. Export model
3. Host server: TF Serving
4. Make requests: HTTP, gRPC
Exporting a model
Three APIs
1. Regress: 1 input tensor - 1 output tensor
2. Classify: 1 input tensor - outputs: classes & scores
3. Predict: arbitrarily many input and output tensors
SavedModel is the universal serialization format for TF models
● Supports multiple graphs that share variables
● SignatureDef fully specifies inference computation by inputs - outputs
Universal format for many models: Estimators - Keras - custom TF ...
Model graph
Model weights
Custom TF models
Idea: specify inference graph and store it together with the model weights
SignatureDef: specify the inference
computation
Serving key: identifies the metagraph
Builder combines the model weights and
{key: signaturedef} mapping
Custom models - simplified
TensorFlow provides a convenience method that is sufficient for most cases
SignatureDef: implicitly defined with
default signature key
Keras models
Work just fine with the simple_save() method
Save model in context of the Keras session
Use the Keras Model instance as a convenient
wrapper to define the SignatureDef
Using the Estimator API
● Trained estimator has export_savedmodel() method
● Expects a serving_input_fn:
○ Serving time equivalent of input_fn
○ Returns a ServingInputReceiver object
○ Role: receive a request, parse it, send it to model for inference
● Requires a feature specification to provide placeholders parsed from serialized Examples (parsing input
receiver) or from raw tensors (raw input receiver)
Feature spec:
Receiver fn:
Export:
Result: metagraph + variables
Model graph
Model weights
Model version: the name of the root folder of the model files should be an integer that denotes the model version.
TF Serving infers the model version from the folder name.
Inspect this folder with the SavedModel CLI tool!
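The versioned folder layout can be illustrated with a small standard-library sketch. It builds a stand-in directory tree (empty saved_model.pb files and variables folders, not real exports) and lists the integer-named version folders; available_versions is a hypothetical helper, and picking the highest number assumes the server is left on its default behaviour of serving the newest version.

```python
import tempfile
from pathlib import Path


def available_versions(model_base_path):
    """Return the integer-named subdirectories of the base path, sorted ascending."""
    base = Path(model_base_path)
    return sorted(int(p.name) for p in base.iterdir() if p.is_dir() and p.name.isdigit())


# Build a stand-in layout: <base>/1 and <base>/2 hold exports, "tmp" is ignored.
base = Path(tempfile.mkdtemp()) / "MyModel"
for name in ("1", "2", "tmp"):
    (base / name / "variables").mkdir(parents=True)
    if name != "tmp":
        (base / name / "saved_model.pb").touch()

versions = available_versions(base)
latest = versions[-1]  # assumption: the newest version is the one that gets served
```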
Setting up a TF Server
tensorflow_model_server --model_base_path=$(pwd) --rest_api_port=9000 --model_name=MyModel
tf_serving/core/basic_manager] Successfully reserved resources to load servable {name: MyModel version: 1}
tf_serving/core/loader_harness.cc] Loading servable version {name: MyModel version: 1}
external/org_tensorflow/tensorflow/cc/saved_model/loader.cc] Loading MyModel with tags: { serve };
external/org_tensorflow/tensorflow/cc/saved_model/loader.cc] SavedModel load for tags { serve }; Status:
success. Took 1048518 microseconds.
tf_serving/core/loader_harness.cc] Successfully loaded servable version {name: MyModel version: 1}
tf_serving/model_servers/main.cc] Exporting HTTP/REST API at:localhost:9000 ...
Submitting a request
Via HTTP: using the python requests module
Via gRPC: by populating a request protobuf via Python bindings and passing it through a PredictionService stub
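A request over HTTP could be built as in the sketch below. It uses the standard library's urllib instead of the requests module named on the slide, purely to stay dependency-free, and it only constructs the request without sending it. The host, port and model name are taken from the server command earlier in the deck; the two input rows are hypothetical and must match whatever the exported signature expects. build_predict_request is a helper invented for this example.

```python
import json
import urllib.request


def build_predict_request(host, port, model_name, instances):
    """Build (but do not send) a POST request for the REST predict endpoint."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


# Two hypothetical input rows; the shape must match what the exported signature expects.
request = build_predict_request("localhost", 9000, "MyModel", [[1.0, 2.0], [3.0, 4.0]])

# With a server running, the call below would return JSON with a "predictions" list:
# with urllib.request.urlopen(request) as response:
#     predictions = json.loads(response.read())["predictions"]
```

The gRPC route carries the same information in a request protobuf and is usually preferred when latency matters, at the cost of needing the generated Python bindings.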
Demo time
Getting started
● Docs: https://www.tensorflow.org/serving/
● Source code: https://github.com/tensorflow/serving
● Installation: Dockerfiles are available, also for GPU
● End-to-end example blogpost with tf.Keras:
https://blog.ml6.eu/training-and-serving-ml-models-with-tf-keras-3d29b41e066c
KubeFlow
Perception: ML products are mostly
about ML
Reality: ML requires DevOps, lots of it
You know what is really good
at DevOps?
Containers
and
Kubernetes
Kubernetes is
● An open-source system
● For managing containerized applications
● Across multiple hosts in a cluster
Open-sourced by Google
Can run anywhere
Containerized applications
Advantages of containerized
applications
● Runs anywhere
○ OS is packaged with container
● Consistent environment
○ Runs the same on laptop as on cloud
● Isolation
○ Every container has its own OS and filesystem
● Dev and Ops separation of concerns
○ Software development can be separated from deployment
● Microservices
○ Applications are broken into smaller, independent pieces and can be deployed and managed dynamically
○ Separate pieces can be developed independently
Orchestration across nodes in a cluster
Nodes
Oh, you want to use ML on K8s?
● Containers
● Packaging
● Kubernetes service endpoints
● Persistent volumes
● Scaling
● Immutable deployments
● GPUs, Drivers & the GPL
● Cloud APIs
● DevOps
● ...
Kubeflow
Build portable ML products using
Kubernetes
What is Kubeflow?
“The Kubeflow project is dedicated to making deployments of machine
learning (ML) workflows on Kubernetes simple, portable and scalable. Our
goal is not to recreate other services, but to provide a straightforward way to
deploy best-of-breed open-source systems for ML to diverse infrastructures.
Anywhere you are running Kubernetes, you should be able to run Kubeflow.”
Why use Kubeflow
Composability
Integration of popular third party tools
● JupyterHub
○ Experiment in Jupyter Notebooks
● TensorFlow operator
○ Run TensorFlow code
● PyTorch operator
○ Run PyTorch code
● Caffe2 operator
○ Run Caffe2 code
● Katib
○ Hyperparameter tuning
Extendable to more tools
Why use Kubeflow
Portability
Why use Kubeflow
Scalability
● Built-in accelerator support (GPU, TPU)
● Kubernetes native
○ All scaling advantages of Kubernetes
○ Integration with third party tools like Istio
How to use Kubeflow
Three large parts:
● JupyterHub
● TF Jobs
● TF Serving
Data scientist perspective
Kubeflow for Machine Learning
Workflow
Demo
https://youtu.be/I6iMznIYwM8?t=8m30s
Still in alpha
V0.2.2
● Still some parts and pieces missing
● Development at rapid pace
● V1.0 planned for end of year
running Tensorflow in Production
