Highly scalable machine learning and deep learning
in real time with open source frameworks
Kai Waehner
Technology Evangelist
kontakt@kai-waehner.de
LinkedIn
@KaiWaehner
www.confluent.io
www.kai-waehner.de
Apache Kafka and Machine Learning – Kai Waehner
Agenda
1) Machine Learning and Real World Applications
2) Machine Learning and the Apache Kafka Ecosystem
3) Building Neural Networks with TensorFlow and H2O
4) Deployment of Neural Networks with Kafka Streams
1) Machine Learning and Real World Applications
Machine Learning
... allows computers to find hidden insights without being explicitly programmed where to look.
Machine Learning
• Decision Trees
• Naïve Bayes
• Clustering
• Neural Networks
• etc.
Deep Learning
• CNN
• RNN
• Autoencoder
• etc.
Neural Network compared to other algorithms
Real World Examples of Machine Learning
Spam Detection
Search Results + Product Recommendation
Picture Detection (Friends, Locations, Products)
Your Company
The Next Disruption:
Google Beats Go Champion
Leverage Machine Learning to Analyze and Act on Critical Business Moments
Windows of Opportunity (Seconds, Minutes, Hours)
• Price Optimization
• Predictive Maintenance
• Fraud Detection
• Cross Selling
• Transportation Rerouting
• Customer Service
• Inventory Management
Live Demo – Building an Analytic Model
Neural Networks in Action
http://playground.tensorflow.org/
2) Machine Learning and the Apache Kafka Ecosystem
Hidden Technical Debt in Machine Learning Systems
https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
Apache Kafka – A Streaming Platform
Components: The Log, Connectors, Producer, Consumer, Streaming Engine
Apache Kafka’s Open Source Ecosystem and Machine Learning
• Kafka Streams
• Kafka Connect
• REST Proxy
• Schema Registry
• Go / .NET / Python Kafka Producer
• KSQL
Uber’s internal ML-as-a-Service Platform
https://eng.uber.com/michelangelo/
• Covers the end-to-end ML workflow: manage data, train, evaluate, and deploy models, make predictions, and monitor predictions
• Supports various AI technologies: traditional ML models, time series forecasting, and deep learning
Netflix’s Meson: Automation Engine for ML Pipelines
https://www.infoq.com/presentations/netflix-ml-meson
3) Building Neural Networks with TensorFlow and H2O
Languages, Frameworks and Tools for Machine Learning
There is no all-rounder!
Portable Format for Analytics (PFA)
Machine Learning with H2O.ai
H2O Engine
R / Python / Scala / Flow UI
Java Code
Live Demo
Use Case: Airline Flight Delay Prediction
Machine Learning Algorithm: Deep Learning using Neural Networks
Technology: H2O.ai, TensorFlow
H2O Deep Water (TensorFlow, MXNet, …)
https://h2o-release.s3.amazonaws.com/h2o/rel-vapnik/1/docs-website/h2o-docs/booklets/DeepWaterBooklet.pdf
Deep Water (H2O + TensorFlow)
Pre-Defined Networks + User-Defined Networks
4) Deployment of Neural Networks with Kafka Streams
Stream Processing
Data at Rest Data in Motion
Stream Processing Pipeline
• Stream Ingest: APIs, Adapters / Channels, Integration, Messaging
• Stream Preprocessing: Transformation, Aggregation, Enrichment, Filtering, Normalization
• Stream Analytics: Contextual Rules, Windowing, Patterns, Analytics, Machine Learning, …
• Stream Outcomes: Process Management, Analytics (Real Time), Applications & APIs, Analytics / DW Reporting, Index / Search
Applying an Analytic Model is just a piece of the puzzle!
Kafka Streams (shipped with Apache Kafka)
Input Stream (Kafka Topic, Kafka Cluster) → Stream Processing Microservice (Kafka Streams) → Output Stream (Kafka Topic, Kafka Cluster)
Processing: map, filter, aggregate, apply analytic model, “any business logic”
Deployed Anywhere: Java App, Docker, Kubernetes, Mesos, “you-name-it”
When to use Kafka Streams for Stream Processing?
A complete streaming microservice, ready for production at large scale (WordCount)
1) App configuration
2) Define processing (here: WordCount)
3) Start processing
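The three steps can be pictured without a running Kafka cluster. The following is a minimal plain-Python stand-in, not the Kafka Streams Java API: the `config` dictionary, its keys, and the sample input lines are assumptions made for illustration. It only mirrors the shape of the app: configure, define the processing, start it.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple

# 1) App configuration.
# Hypothetical settings: a real Kafka Streams app would configure an
# application id and bootstrap servers; here the "topic" is just a list.
config = {
    "application_id": "wordcount-sketch",
    "input_lines": ["all streams lead to kafka", "hello kafka streams"],
}


# 2) Define processing (here: WordCount).
def word_count(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit (word, running count) for every word seen, one update per word."""
    counts: Counter = Counter()
    for line in lines:
        for word in line.lower().split():
            counts[word] += 1
            yield word, counts[word]


# 3) Start processing.
def run(cfg: dict) -> Dict[str, int]:
    """Consume all updates and keep the latest count per word."""
    latest: Dict[str, int] = {}
    for word, count in word_count(cfg["input_lines"]):
        latest[word] = count
    return latest


result = run(config)
print(result)
```

Emitting a running count per word, rather than one final total, loosely mimics how a streaming aggregation produces a sequence of updates instead of a single batch result.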
Live Demo
Use Case: Airline Flight Delay Prediction
Machine Learning Algorithm: Neural Network built with H2O and TensorFlow
Streaming Platform: Apache Kafka and Kafka Streams
H2O.ai Model + Kafka Streams
Filter
Map
1) Create H2O DL model
2) Configure Kafka Streams Application
3) Apply H2O DL model to Streaming Data
4) Start Kafka Streams App
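Steps 1) to 4) can be sketched as a filter followed by a map that calls the model. This is a plain-Python illustration under stated assumptions, not the H2O or Kafka Streams API: `predict_delay` is a hypothetical stand-in for the generated H2O model, its weights are invented for the example, and the field names (`flight`, `dep_delay_min`, `distance_km`) are made up.

```python
import math
from typing import Dict, Iterable, Iterator, List


# 1) Stand-in for the trained model.
# Hypothetical: the weights below are invented for illustration and are not
# taken from any real H2O deep learning model.
def predict_delay(flight: Dict[str, float]) -> str:
    score = 0.03 * flight["dep_delay_min"] + 0.001 * flight["distance_km"] - 1.5
    probability = 1.0 / (1.0 + math.exp(-score))  # logistic function
    return "YES" if probability >= 0.5 else "NO"


# 2) + 3) Define the processing: filter incomplete records, then map each
# remaining record to a prediction string.
def apply_model(records: Iterable[dict]) -> Iterator[str]:
    valid = (r for r in records if "dep_delay_min" in r and "distance_km" in r)
    for r in valid:
        yield f'{r["flight"]}: delayed={predict_delay(r)}'


# 4) Start processing: lists stand in for the input and output topics.
input_topic: List[dict] = [
    {"flight": "AB123", "dep_delay_min": 90.0, "distance_km": 500.0},
    {"flight": "CD456", "dep_delay_min": 0.0, "distance_km": 300.0},
    {"flight": "EF789"},  # incomplete record, dropped by the filter
]
output_topic = list(apply_model(input_topic))
print(output_topic)
```

The point of the sketch is that the model is called in-process, record by record, inside the stream processing logic, so no separate model server is involved.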
GitHub Examples: Kafka + Machine Learning
https://github.com/kaiwaehner/kafka-streams-machine-learning-examples
1) git clone …
2) mvn clean package …
3) look at implementations and unit tests
Online Model Training with Apache Kafka and Kafka Streams
How to improve models?
1. Manual Update
2. Automated Batch
3. Real Time
Your choice… All possible with Kafka!
Caveats for Online Model Training
• Processes and infrastructure not ready
• Validation needed before production
• Slows down the system
• Only a few ML implementations → Build your own!
• Only possible for unsupervised ML (e.g. clustering)
• Many use cases do not need it
→ Do it only when feasible!
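Since the caveats name clustering as the case where online training applies, one way to picture "real time" model updates is sequential k-means, where each incoming event nudges its nearest centroid. This technique is swapped in for illustration and is not taken from the talk; the class name, the initial centroids, and the sample events are all assumptions.

```python
from typing import List


class OnlineKMeans:
    """Sequential k-means: the model is updated one event at a time."""

    def __init__(self, centroids: List[List[float]]):
        # Hypothetical starting centroids; they only seed the first assignments.
        self.centroids = [list(c) for c in centroids]
        self.counts = [0] * len(centroids)

    def nearest(self, point: List[float]) -> int:
        """Index of the centroid with the smallest squared distance."""
        dists = [
            sum((p - c) ** 2 for p, c in zip(point, centroid))
            for centroid in self.centroids
        ]
        return dists.index(min(dists))

    def update(self, point: List[float]) -> int:
        """Assign the point, then move that centroid to the running mean
        of all points assigned to it so far."""
        k = self.nearest(point)
        self.counts[k] += 1
        rate = 1.0 / self.counts[k]
        self.centroids[k] = [
            c + rate * (p - c) for p, c in zip(point, self.centroids[k])
        ]
        return k


model = OnlineKMeans([[0.0, 0.0], [10.0, 10.0]])
events = [[1.0, 1.0], [9.0, 9.0], [3.0, 1.0], [11.0, 9.0]]  # made-up stream
assignments = [model.update(e) for e in events]
print(assignments, model.centroids)
```

Each update touches only one centroid and needs no stored history, which is what makes this kind of model cheap to train inside a stream processor. The validation caveat above still applies before such a model is trusted in production.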
Key Take-Aways
→ Data Scientists and Developers have to work together continuously (org + tech!)
→ Mission-critical, scalable production deployment is key for the success of Machine Learning projects
→ Apache Kafka Ecosystem for Batch and Real Time Machine Learning (Training, Inference, Monitoring)
Questions? Feedback?
Please contact me!

Machine Learning Trends of 2018 combined with the Apache Kafka Ecosystem
