Stream Me Up, Scotty: Transitioning to the Cloud Using a Streaming Data Platform
About us
Chrix Finne, Director of Product Management
Bob Lehmann, Architect, Data Platform
We’ll Talk About
● Everyone is moving to the cloud
● Common challenges
● Transition design patterns
● How streaming platform fits in
● Monsanto’s data challenges
● Monsanto’s streaming architecture / use cases
● Security
● Lessons learned
● Future plans
How Amazon thinks you will move to cloud
You have a lot to worry about
❏ Do I migrate to cloud-native databases?
❏ Will my apps survive random cloud failures?
❏ Do I need a failure injection test framework?
❏ Dynamic configuration of hostnames
❏ Do I break my monolith into microservices?
❏ Do I want to migrate to more than one cloud?
❏ Can I migrate back?
At first, this is no big deal...
[Diagram: DC1 and AWS, with a few apps (App), databases (DB), a key-value store (KV) and a data warehouse (DWH)]
6 months later...
[Diagram: DC1 and AWS, now with many more apps (APP), databases (DB), key-value stores (KV) and data warehouses (DWH)]
Are you kidding?
● This is expensive
● This is a maintenance nightmare
● We may need more than one region!
● We may need more than one cloud!
We’ve done this before...
This... To this...
There is a better way
Bridge to Cloud Pattern
Key Components
1. Distributed Log
2. Big Reliable Buffer
3. Great Replication
4. Connect Anywhere
5. Management & Monitoring
Deeper look at bi-directional replication
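The replication diagram for this slide did not survive extraction. As a rough illustration only, the sketch below models two clusters as in-memory dictionaries and mirrors topics in both directions. The per-datacenter topic prefixes (`dc1.`, `aws.`), the `Cluster` class and the `mirror` helper are assumptions made for this example, not something the slides specify. Prefixing topics by origin is one common way to stop a record from bouncing back to the cluster it came from.

```python
from collections import defaultdict


class Cluster:
    """Toy stand-in for a Kafka cluster: topic name -> append-only list of records."""

    def __init__(self, name):
        self.name = name
        self.topics = defaultdict(list)

    def produce(self, topic, record):
        # Local producers only write to topics prefixed with their own cluster name.
        self.topics[f"{self.name}.{topic}"].append(record)


def mirror(source, target, offsets):
    """Copy records that originated on `source` and are not yet mirrored into `target`.

    Returns the number of records copied in this pass.
    """
    copied = 0
    for topic, log in source.topics.items():
        if not topic.startswith(source.name + "."):
            continue  # this topic was mirrored in from elsewhere; skipping it avoids loops
        start = offsets.get((source.name, topic), 0)
        target.topics[topic].extend(log[start:])
        copied += len(log) - start
        offsets[(source.name, topic)] = len(log)
    return copied


dc1, aws = Cluster("dc1"), Cluster("aws")
offsets = {}
dc1.produce("orders", "o-1")
aws.produce("orders", "o-2")
mirror(dc1, aws, offsets)  # dc1.orders now also exists in AWS
mirror(aws, dc1, offsets)  # aws.orders now also exists in DC1
```

After both passes each cluster holds both `dc1.orders` and `aws.orders`, and applications only ever read from and write to their local cluster.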
Why is this awesome?
● Proven architecture
● Non-stop low-latency
● One throat to choke
● Cost savings
But wait - there is more!
● Future-proof architecture
○ With Connect - get the data everywhere.
○ Try cloud services without fear of lock-in
○ “Kafka is our escape valve”
● Multi-zone availability
● Multi-Region architecture
● Multi-Cloud architectures
● Microservices ready
● Jump to stream processing!
Let’s look at how this worked for...
Monsanto
A sustainable agriculture company
• Bringing a broad range of solutions to help nourish our growing world
• Headquartered in Saint Louis, Missouri
• >20,000 employees in 66 countries
• A global company with >50% of employees based outside of the United States
• One of the 25 World’s Best Multinational Workplaces by Great Place to Work Institute
Produce with more judicious use of limited natural resources.
Improve the lives of the world’s farmers.
Increase production to meet needs of a growing population.
“We succeed when farmers succeed.” -Hugh Grant, Monsanto CEO
[Diagram: Inbred (Parent 1) crossed with Inbred (Parent 2) gives Hybrid]
The Monsanto Corn “Galaxy”
To Boldly Go Where No Man (or Woman) Has Gone Before...
CAPTAIN’S LOG STARTDATE 41153.7 (early 2015)
Mission: Develop an enterprise information architecture covering BI to machine learning and everything in between. Maximize use of cloud-based assets.
In What Parallel Universe Are You Living?
Existing Challenges
● Data sprawl
● Data consistency
● Data discovery
● Data latency
The Cloud - New Challenges
● Increased data sprawl (microservices’ dirty little secret)
● Can’t forklift applications overnight
● Cloud apps need on-prem data
● On-prem apps need cloud data
The “Aha” Moment
The Log: What every software engineer should know about real-time data's unifying abstraction
LinkedIn Engineering Blog post by Jay Kreps
Dec. 16th, 2013
Let’s Clean Up This Mess!
Courtesy of Jay Kreps
Replication Is Your Friend
● Uncontrolled, unsynchronized replication IS BAD
● Controlled, synchronized replication IS NECESSARY AND BENEFICIAL...especially in distributed environments
The Enterprise DataHub
EDH Plan
- Create Kafka clusters on prem and in AWS
- Establish cross datacenter connection
- Provide replication between the clusters
- Use Avro schemas
- Only allow apps to interact with the local cluster
IMPORTANT!
Use Schema Registry
VPN?
DIRECT CONNECT?
MIRRORMAKER
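To make the "Use Avro schemas" and "Use Schema Registry" points concrete, here is a minimal sketch of an Avro record schema and one evolution of it, written as plain Python dictionaries. The `Customer` record, its namespace and its fields are hypothetical. The `added_fields_have_defaults` helper is a deliberately simplified stand-in for the compatibility checks a schema registry performs: it only verifies that fields added in the new version carry a default, which is what lets a reader on the new schema fill them in for old records.

```python
import json

# Hypothetical Avro record schema, version 1.
customer_v1 = {
    "type": "record",
    "name": "Customer",
    "namespace": "com.example.edh",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "name", "type": "string"},
    ],
}

# Version 2 adds a nullable field with a default, so records written
# with version 1 can still be read with version 2.
customer_v2 = {
    **customer_v1,
    "fields": customer_v1["fields"] + [
        {"name": "region", "type": ["null", "string"], "default": None},
    ],
}


def added_fields_have_defaults(old, new):
    """Simplified check: every field that is new in `new` must have a default.

    Ignores type changes, aliases and removed fields.
    """
    old_names = {field["name"] for field in old["fields"]}
    return all(
        "default" in field
        for field in new["fields"]
        if field["name"] not in old_names
    )


# Avro schemas are exchanged as JSON documents.
print(json.dumps(customer_v2, indent=2))
print(added_fields_have_defaults(customer_v1, customer_v2))
```

Agreeing on schemas up front matters more once data is replicated between clusters, because consumers on the other side of the link cannot ask the producing team what a field means at read time.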
Use Cases And Architecture Evolution
Initial POC - Operations Metrics Collection
Use Case - Customer 360
Use Case - Replication Between Dissimilar Databases
Use Case - Data Warehouse Replication
Use Case - Data Warehouse Migration
Turn this feed off once cloud solution is complete
Cross Datacenter Replication - The Killer App!
Drives adoption of the platform despite these common “concerns”:
● Our data will never be used by anybody else
● We don’t need our data in near real time
● Our volumes are too low
● Our volumes are too high
● Do we really have to use schemas?
Enterprise DataHub: Simplified AWS Architecture
Security
Native Clients
❏ SSL certs used for authentication AND authorization
❏ Principal is defined by the cert DN
❏ ACLs are defined based on DN
CN=YourProject, OU=YourOrg, O=YourCompany, L=YourLocation, ST=YourState, C=YourCountry
CN and OU are the only attributes defined by clients
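The DN-based scheme above can be sketched in a few lines. This is an illustration under stated assumptions, not the presenters' implementation: the `ACLS` table, the topic name `customer-360` and the helper names are hypothetical, and the DN parsing is naive (no escaped commas or multi-valued RDNs). The `User:<DN>` principal form follows the convention Kafka uses by default for SSL-authenticated clients.

```python
def normalize_dn(dn):
    """Normalize 'CN=a, OU=b' style DNs: trim spaces, upper-case attribute names."""
    parts = (part.strip().split("=", 1) for part in dn.split(","))
    return ",".join(f"{key.strip().upper()}={value.strip()}" for key, value in parts)


def principal_for(dn):
    """Build the principal string that ACLs are keyed on."""
    return "User:" + normalize_dn(dn)


# Hypothetical ACL table: principal -> {topic: set of allowed operations}.
ACLS = {
    "User:CN=YourProject,OU=YourOrg,O=YourCompany,L=YourLocation,ST=YourState,C=YourCountry": {
        "customer-360": {"READ", "WRITE"},
    },
}


def is_authorized(dn, topic, operation):
    """Return True if the certificate DN is granted `operation` on `topic`."""
    grants = ACLS.get(principal_for(dn), {})
    return operation in grants.get(topic, set())


client_dn = "CN=YourProject, OU=YourOrg, O=YourCompany, L=YourLocation, ST=YourState, C=YourCountry"
print(principal_for(client_dn))
print(is_authorized(client_dn, "customer-360", "READ"))
```

Because the certificate both proves identity and names the principal, no separate credential store is needed for native clients; the cost is that every new client needs a certificate the brokers trust.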
Security
REST Clients
❏ API endpoint defined in Network Director for each topic
❏ REST client authenticates with SSO platform and then calls API endpoint
❏ Network Director calls REST Proxy as privileged user
❏ Access to topics is controlled by restricting access to API endpoint
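The REST flow above can be modeled as two small functions. Everything named here is hypothetical (the endpoint path, the `gateway-svc` service account, the entitlement table), and the functions are in-memory stand-ins rather than calls to a real gateway or to the REST Proxy API. The point is only to show where the access decision is made: at the per-topic endpoint, before the privileged call to the proxy.

```python
# Hypothetical gateway configuration: one API endpoint per topic.
ENDPOINT_TO_TOPIC = {"/edh/customer-360": "customer-360"}

# Hypothetical entitlements: which SSO users may call which endpoint.
ENTITLEMENTS = {"/edh/customer-360": {"alice"}}

PRIVILEGED_USER = "gateway-svc"


def rest_proxy_produce(caller, topic, records):
    """Stand-in for the REST Proxy, which only trusts the privileged gateway user."""
    if caller != PRIVILEGED_USER:
        raise PermissionError("REST Proxy accepts calls from the gateway only")
    return {"topic": topic, "accepted": len(records)}


def gateway_handle(sso_user, endpoint, records):
    """Check the SSO-authenticated user against the endpoint, then forward."""
    if sso_user not in ENTITLEMENTS.get(endpoint, set()):
        raise PermissionError(f"{sso_user} may not call {endpoint}")
    return rest_proxy_produce(PRIVILEGED_USER, ENDPOINT_TO_TOPIC[endpoint], records)


print(gateway_handle("alice", "/edh/customer-360", [{"id": "c-1"}]))
```

A user without an entitlement for the endpoint is rejected at the gateway and never reaches the proxy.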
Security Architecture
Security Notes
● Some Kafka tools don’t work with SSL and ACLs
● SSL client certs have to be added to truststore on each broker - requires brokers to be bounced. Patched broker code to load certs dynamically.
● Long term roadmap
○ LDAP authentication and role based authorization for native clients.
○ REST Proxy authorization based on OAuth entitlements
Lessons Learned
❏ Partition By Default
❏ Use “new” consumer - required for SSL anyway
❏ Lock down Zookeeper
❏ Be prepared for AWS instance termination!
❏ Use Associated VPCs in AWS
❏ Use LVM For disks
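One reading of "Partition By Default" is that topics should be created with multiple partitions from the start, because changing the partition count later changes which partition a given key lands on. The sketch below illustrates that effect. It is an assumption-laden example: `partition_for` is a hypothetical helper, and CRC32 stands in for the hash function, so the partition numbers will not match what a real Kafka client computes.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a record key to a partition by hashing it.

    CRC32 is a stand-in here; real Kafka clients use a different hash,
    so actual partition numbers will differ.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


keys = [f"customer-{i}" for i in range(100)]

# The same key always maps to the same partition for a fixed partition count.
stable = all(partition_for(k, 4) == partition_for(k, 4) for k in keys)

# Growing the topic from 4 to 6 partitions moves some keys elsewhere,
# which breaks per-key ordering across the change.
moved = [k for k in keys if partition_for(k, 4) != partition_for(k, 6)]

print(stable, len(moved) > 0)
```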
Future - Make Ingress/Egress Part Of The Platform
Future - Multi-Cloud Integration
THE END

