Google Meets Apache Kafka
Kir Titievsky
Product Manager
Google Cloud Platform, Messaging
Putting Kafka Together with the Best of Google Cloud Platform
Messages
History
Patterns
Building Apache Kafka on GCP
Patterns (Kafka on GCP)
History
Google File System* (2003)
“Hundreds of producers [...] concurrently append to a file.”
“...files are only read, and often only sequentially”
“...files are often used as producer-consumer queues”
*“The Google File System”, Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung (Google), SOSP’03, Copyright ACM 2003. tinyurl.com/y7r86zyc
Not quite Kafka
Redacted: Fig 3 from the Google Filesystems Paper (Copyright ACM 2003)
pubsub
2005 pubsub1
... is “hard to maintain”
2009 pubsub2
2015 Cloud Pub/Sub
2018 TB/s on pubsub2
Scaling model: Kafka -- partitions
Order!
Stateful apps/CDC!
Hot keys + head of line blocking.
[Diagram: rows of messages (M) in partitions feeding consumers; one row ends in an X.]
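The partition model above can be sketched in a few lines. This is a minimal illustration, not Kafka's actual partitioner: it assumes a hypothetical `partitionFor` helper that hashes the key with `String.hashCode()` (Kafka's default partitioner uses a murmur2 hash), and the key names are made up. It shows why one key always lands in one partition, which preserves per-key order, and why a hot key piles its load onto a single partition.

```java
import java.util.Arrays;
import java.util.List;

public class PartitionSketch {
    // Hypothetical partitioner: non-negative hash of the key modulo the partition count.
    // Assumption for illustration only; Kafka's default partitioner uses murmur2.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Counts how many messages land in each partition.
    static int[] load(List<String> keys, int numPartitions) {
        int[] counts = new int[numPartitions];
        for (String key : keys) {
            counts[partitionFor(key, numPartitions)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // "user-1" is the hot key: four of the six messages carry it.
        List<String> keys = Arrays.asList(
            "user-1", "user-2", "user-1", "user-1", "user-3", "user-1");
        System.out.println(Arrays.toString(load(keys, 3)));
    }
}
```

Because every message for `user-1` goes to the same partition, a consumer of that partition sees them in order, but it also carries at least four of the six messages no matter how many partitions are added.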
Scaling model: Cloud Pub/Sub -- messages
Max throughput
Client/messages
Stateless 😃
No order 😲
[Diagram: individual messages (M) flow through Cloud Pub/Sub to three consumers.]
Global messaging
Cloud Pub/Sub
- Store close to publisher
- Collate for subscribers
[Diagram: LB (one hostname, namespace) in front of us-west FE, us-east FE and eu-west, each with Topic A; US BE and EU BE holding messages (M); Subscription X and Subscription Y.]
Patterns
EAI in IT: monolith refactored
Applicant Candidate Employee
User Apps
pubsub
* Architecture approximate
Lambda lives
[Diagram: Ads FE instances, pubsub, Batch updater, DB, Config API.]
Extreme availability
[Diagram: Ads FE instances connected to pubsub blue, pubsub green and pubsub black, each with its own state.]
But wait... what if everything else scaled?
● Kubernetes / Google App Engine
● Bigtable, Apache HBase, Apache Cassandra....
● Spanner (transactions)
Freedom to optimize features vs. cost

                  Apache Kafka   Cloud Pub/Sub
Authentication              82               4
Logging                     54               0
Semantics                   44               4
Performance                 39              24
Kafka-Specific              78               0
Total                      297              32
Super-efficiency
Single node Kafka cluster
● 4 vCPUs, 3.6 GB RAM: $72/month
● 1TB disk (100MB/sec): $40/month
● 20MB/sec
● 0.5 days of storage
● 52TB / month published
● $0.002/GB
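The figures on this slide can be roughly reproduced with back-of-the-envelope arithmetic. The sketch below assumes a 30-day month, decimal units (1 TB = 1,000,000 MB) and that the only costs are the $72 VM and the $40 disk; the class and method names are illustrative, not from the talk.

```java
public class SingleNodeCost {
    static final double SECONDS_PER_DAY = 86_400.0;

    // TB published in a 30-day month at a sustained rate given in MB/s.
    static double monthlyTb(double mbPerSec) {
        return mbPerSec * SECONDS_PER_DAY * 30 / 1_000_000.0;
    }

    // Days until a disk of diskTb terabytes fills at mbPerSec.
    static double retentionDays(double diskTb, double mbPerSec) {
        return diskTb * 1_000_000.0 / mbPerSec / SECONDS_PER_DAY;
    }

    // Dollars per GB published, given the monthly bill and monthly volume in TB.
    static double dollarsPerGb(double monthlyDollars, double monthlyTb) {
        return monthlyDollars / (monthlyTb * 1_000.0);
    }

    public static void main(String[] args) {
        double tb = monthlyTb(20);                      // about 51.84 TB/month
        double days = retentionDays(1, 20);             // about 0.58 days
        double perGb = dollarsPerGb(72 + 40, tb);       // about $0.0022/GB
        System.out.printf("%.2f TB/month, %.2f days, $%.4f/GB%n", tb, days, perGb);
    }
}
```

Under these assumptions 20MB/sec works out to about 51.8TB per month, a 1TB disk holds a little over half a day of data, and $112 per month comes to roughly $0.002 per GB, in line with the slide.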
Availability is 10X the price (2-10¢/GB)
1¢/GB zone egress
[Diagram: three zones (Zone 1, Zone 2, Zone 3) holding one Leader and two Followers; Producers and Consumers; flows labeled 1/3 and 2/3.]
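One way to read the 1/3 and 2/3 labels is as the share of client traffic that stays in the leader's zone versus the share that crosses a zone boundary. The sketch below is an interpretation under stated assumptions, not the speaker's own calculation: one leader and one follower per other zone, clients spread evenly across zones, every consumer reading from the leader, and egress billed at the slide's 1¢/GB. The `centsPerGb` helper is hypothetical.

```java
public class CrossZoneCost {
    // Estimated cross-zone egress in cents per GB published.
    // zones: number of zones, one replica per zone (assumption).
    // consumerReads: how many times each GB is read by consumers.
    static double centsPerGb(int zones, int consumerReads, double egressCentsPerGb) {
        // The leader copies every GB to a follower in each other zone.
        double replication = (zones - 1) * egressCentsPerGb;
        // With clients spread evenly, this fraction sits outside the leader's zone.
        double crossZoneFraction = (zones - 1) / (double) zones;
        double producers = crossZoneFraction * egressCentsPerGb;
        double consumers = consumerReads * crossZoneFraction * egressCentsPerGb;
        return replication + producers + consumers;
    }

    public static void main(String[] args) {
        // Three zones, each GB read once, 1 cent/GB egress: about 3.33 cents/GB.
        System.out.printf("%.2f cents/GB%n", centsPerGb(3, 1, 1.0));
    }
}
```

With three zones and a single read per message this comes to roughly 3.3¢/GB, most of it replication, and each extra consumer adds about another 0.67¢/GB under the same assumptions.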
Patterns from GCP users
Flows
[Diagram: App, Cloud Pub/Sub, Dataflow, BigQuery; Kafka with KSQL, Streams and Kafka Connect.]
[Diagram spanning GCP and on prem: App, BigQuery, Pub/Sub, Dataflow, Kafka, Kafka Connect, App.]
Reading from Kafka in Apache Beam

    PCollection<String> input =
        p.apply("KafkaIngest1", KafkaIO.<Long, String>read()
                .withBootstrapServers(KAFKA_BOOTSTRAP_SERVERS)
                .withTopic(TOPIC_NAME)
                .withKeyDeserializer(LongDeserializer.class)
                .withValueDeserializer(StringDeserializer.class)
                .withoutMetadata())
         //
         // ------- GENERIC TRANSFORMS -------
         .apply("Parse", Values.create())
         .apply("AddTimeStamp", ParDo.of(
             new DoFn<String, String>() {
                 ...
Stream=batch in processing layer vs. storage
***@google.com
Kir Titievsky, Product Manager
Google Cloud Platform
Thanks & Happy Streaming!
