Building an Event Bus at Scale
Jim Riecken, Senior Software Developer (@jimriecken)
• Senior developer on the Platform Team at Hootsuite
• Building backend services + infrastructure
• I <3 Scala
About me
A bit of history
• PHP monolith, horizontally scaled
• Single database
• Any part of the system can easily interact with any other part of the system
• Local method calls
• Shared cache
• Shared database
The early days
(Diagram: load balancers in front of the monolith, backed by Memcache + a single DB)
• Smaller PHP monolith
• Lots of Scala microservices
• Multiple databases
• Distributed Systems
• Not local anymore
• Latency
• Failures, partial failures
Now
Dealing with Complexity
• As the number of services increases, the coupling between them tends to increase as well
• More network calls end up in the critical path of the request
• Slows the user experience
• More prone to failure
• Do all of those calls need to be in the critical path?
Coupling
(Diagram: a single sendMessage() call triggering five synchronous calls to downstream services)
Event Bus
• Decouple asynchronous consumption of data/events from the producer of that data
• New consumers easily added
• No longer in the critical path of the request, and fewer potential points for failure
• Faster requests + happier users!
Event Bus
(Diagram: sendMessage() publishes one event to the Event Bus; downstream services consume it asynchronously)
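The decoupling on this slide can be sketched as a toy in-memory bus (Python for illustration; the class and topic names are hypothetical, not Hootsuite's implementation):

```python
from collections import defaultdict

class EventBus:
    """Toy pub/sub bus: producers publish once, any number of
    consumers can be attached without changing producer code."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> [handler, ...]

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        # The producer does a single publish; fan-out to each consumer
        # is the bus's concern, not the producer's.
        for handler in self._subscribers[topic]:
            handler(event)

bus = EventBus()
received = []
bus.subscribe("message.sent", lambda e: received.append(("email", e)))
bus.subscribe("message.sent", lambda e: received.append(("analytics", e)))
bus.publish("message.sent", {"id": 42})
```

Adding a new consumer is one more `subscribe` call; the producing code path never changes.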
• High throughput
• High availability
• Durability
• Handle fast producers + slow consumers
• Multi-region/data center support
• Must have Scala and PHP clients
Requirements
Which technology to choose?
• RabbitMQ (or some other flavour of AMQP)
• ØMQ
• Apache Kafka
Candidates
• ØMQ
• Too low level, would have to build a lot on top of it
• RabbitMQ
• Based on previous experience
• Doesn’t recover well from crashes
• Doesn’t perform well when messages are persisted to disk
• Slow consumers can affect performance of the system
Why not ØMQ or RabbitMQ?
• Simple - conceptually it’s just a log
• High performance - in use at large organizations (e.g. LinkedIn, Etsy, Netflix)
• Can scale up to millions of messages per second / terabytes of data per day
• Highly available - designed to be fault tolerant
• High durability - messages are replicated across cluster
• Handles slow consumers
• Pull model, not push
• Configurable message retention
• Can work with multiple regions/data centers
• Written in Scala!
Why Kafka?
• Distributed, partitioned, replicated commit log service
• Producers publish messages to Topics
• Consumers pull + process the feed of published messages
• Runs as a cluster of Brokers
• Requires ZooKeeper for coordination/leader election
What is Kafka?
(Diagram: producers publishing to a cluster of brokers holding partitioned topic logs; consumers pulling; ZooKeeper coordinating)
• Split into Partitions (which are stored in log files)
• Each partition is an ordered, immutable sequence of messages that is only appended to
• Partitions are distributed and replicated across the cluster of Brokers
• Data is kept for a configurable retention period, after which it is either discarded or compacted
• Consumers keep track of their offset in the logs
Topics
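The append-only log model can be sketched in a few lines (Python for illustration; a toy model, not Kafka's actual storage):

```python
class Partition:
    """Toy topic partition: an append-only, ordered list of messages.
    Consumers track their own read offset; the broker tracks nothing
    per-consumer beyond the stored log itself."""

    def __init__(self):
        self._log = []  # ordered, immutable once appended

    def append(self, message):
        self._log.append(message)
        return len(self._log) - 1  # offset of the newly written message

    def read(self, offset, max_messages=10):
        return self._log[offset:offset + max_messages]

p = Partition()
for m in ["a", "b", "c"]:
    p.append(m)

consumer_offset = 0
batch = p.read(consumer_offset)
consumer_offset += len(batch)  # the consumer advances its own offset
```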
• Push messages to partitions of topics
• Can send to
• A random/round-robined partition
• A specific partition
• A partition based on a hash constructed from a key
• Maintain per-key order
• Messages and Keys are just Array[Byte]
• Responsible for your own serialization
Producers
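The three send strategies above can be sketched as follows (Python for illustration; Kafka's default partitioner actually uses murmur2 on the key bytes, so `zlib.crc32` is a stand-in):

```python
import itertools
import zlib

NUM_PARTITIONS = 4
_round_robin = itertools.cycle(range(NUM_PARTITIONS))

def choose_partition(key=None, partition=None):
    if partition is not None:
        return partition                          # a specific partition
    if key is not None:
        # Hash of the key -> same key always lands on the same
        # partition, which preserves per-key ordering.
        return zlib.crc32(key) % NUM_PARTITIONS
    return next(_round_robin)                     # no key: spread evenly
```

Because keys and messages are just `Array[Byte]`, the key passed here is raw bytes and serialization is the caller's responsibility.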
• Pull messages from partitions of topics
• Can either
• Manually manage offsets (“simple consumer”)
• Have offsets/partition assignment automatically managed (“high-level consumer”)
• Consumer Groups
• Offsets stored in ZooKeeper (or Kafka itself)
• Partitions are distributed among consumers
• # Consumers > # Partitions => Some consume nothing
• # Partitions > # Consumers => Some consume several partitions
Consumers
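The last two bullets fall out of a simple assignment scheme; here is a round-robin sketch (Python for illustration, not Kafka's exact rebalancing algorithm):

```python
def assign(partitions, consumers):
    """Distribute partitions round-robin over a consumer group."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

# More partitions than consumers -> some consume several partitions.
a = assign([0, 1, 2, 3], ["c1", "c2"])
# More consumers than partitions -> some consume nothing
# (which can still be useful as a hot spare for fault tolerance).
b = assign([0, 1], ["c1", "c2", "c3"])
```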
How we set up Kafka
• Each cluster consists of a set of Kafka brokers and a ZooKeeper quorum
• At least 3 brokers
• At least 3 ZK nodes (preferably more)
• Brokers have large disks
• Standard topic retention, overridden per topic as necessary
• Topics are managed via Jenkins jobs
Clusters
(Diagram: a 3-node ZooKeeper quorum alongside 3 brokers)
• MirrorMaker
• Tool for consuming topics from one cluster + producing to another
• Aggregate + Local clusters
• Producers produce to the local cluster
• Consumers consume from local + aggregate
• MirrorMaker consumes from local + produces to aggregate
Multi-Region
(Diagram: Region 1 and Region 2 each run a Local cluster and an Aggregate cluster; MirrorMaker feeds the Aggregate clusters from the Local clusters, producers write to Local, consumers read from Local + Aggregate)
Producing + Consuming
• Wrote a thin Scala wrapper around the Kafka “New” Producer Java API
• Effectively send(topic, message, [key])
• Use the minimum “in-sync replicas” setting for Topics
• We set it to ceil(N/2 + 1) where N is the size of the cluster
• Wait for acks from partition replicas before a write is considered committed
Producing
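The in-sync-replica rule above, written out as a quick arithmetic check (this is the formula as stated on the slide, with N the cluster size):

```python
import math

def min_insync_replicas(n):
    """The slide's rule for the minimum in-sync replicas setting."""
    return math.ceil(n / 2 + 1)

# A 3-broker cluster requires 3 replicas in sync; a 5-broker cluster, 4.
```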
• To produce from our PHP components, we use a Scala proxy service with a REST API
• We also produce directly from MySQL by using Tungsten Replicator and a filter that converts binlog changes to event bus messages and produces them
Producing
(Diagram: PHP components → REST proxy → Kafka; MySQL → Tungsten Replicator (TR) → Kafka)
• Wrote a thin Scala wrapper on top of the High-Level Kafka Consumer Java API
• Abstracts consuming from Local + Aggregate clusters
• Register consumer function for a topic
• Offsets auto-committed to ZooKeeper
• Consumer group for each logical consumer
• Sometimes have more consumers than partitions (fault tolerance)
• Also have consumption mechanism for PHP/Python
Consuming
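The “register a consumer function for a topic” idea can be sketched as a tiny dispatcher (Python for illustration; the real wrapper is Scala around the high-level consumer, and the names here are hypothetical):

```python
class ConsumerRegistry:
    """Maps topics to handler functions, one logical consumer per topic."""

    def __init__(self):
        self._handlers = {}

    def register(self, topic, handler):
        self._handlers[topic] = handler

    def dispatch(self, topic, message):
        # Called once per message pulled from the topic; in the real
        # system the offset would be auto-committed after the handler
        # returns successfully.
        self._handlers[topic](message)

registry = ConsumerRegistry()
seen = []
registry.register("member.updated", seen.append)
registry.dispatch("member.updated", b"payload")
```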
Message Format
• Need to be able to serialize/deserialize messages in an efficient, language-agnostic way that tolerates evolution in message data
• Options
• JSON
• Plain text, everything understands it, easy to add/change fields
• Expensive to parse, large size, and you still have to convert parsed JSON into domain objects
• Protocol Buffers (protobuf)
• Binary, language-specific impls generated from an IDL
• Fast to parse, small size, generated code, easy to make backwards/forwards-compatible changes
Data -> Array[Byte] -> Data
• All of the messages we publish/consume from Kafka are serialized protobufs
• We use ScalaPB (https://github.com/trueaccord/ScalaPB)
• Built on top of Google’s Java protobuf library
• Generates Scala case class definitions from .proto files
• Use only “optional” fields
• Helps forwards/backwards compatibility of messages
• Can add/remove fields without breaking
Protobuf
• You have to know the type of the serialized protobuf data before you can deserialize it
• Potential solutions
• Only publish one type of message per topic
• Prepend a non-protobuf type tag to the payload
• The previous option, but with protobufs inside protobufs
Small problem
• A protobuf that contains a list of entries, each with
• A UUID string
• Payload bytes (a serialized protobuf)
• Benefits
• Multiple objects per logical event
• Evolution of data in a topic
• Automatic serialization and deserialization (maintain a mapping of UUID-to-Type in each language)
Message wrapper
(Wrapper layout: each entry is a UUID followed by serialized protobuf payload bytes)
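A sketch of the wrapper scheme (Python for illustration: JSON stands in for the nested protobufs, and the UUID is made up):

```python
import json

# Stand-in serializers for one message type; real ones would be
# generated protobuf code.
def encode_message_sent(payload):
    return json.dumps(payload).encode()

def decode_message_sent(raw):
    return json.loads(raw)

# The per-language UUID-to-Type mapping from the slide (UUID is fabricated).
DECODERS = {"3f1c0000-0000-0000-0000-000000000001": decode_message_sent}

def wrap(type_uuid, payload_bytes):
    # The real wrapper is itself a protobuf containing a list of
    # (UUID, payload bytes) entries; JSON plays that role here.
    return json.dumps(
        [{"uuid": type_uuid, "payload": payload_bytes.decode()}]
    ).encode()

def unwrap(raw):
    # Look up each entry's decoder by UUID -> automatic deserialization.
    items = json.loads(raw)
    return [DECODERS[i["uuid"]](i["payload"].encode()) for i in items]

msg = wrap("3f1c0000-0000-0000-0000-000000000001",
           encode_message_sent({"text": "hi"}))
```

Because the wrapper holds a list, one Kafka message can carry multiple objects for a single logical event, and new entry types can be added by registering a new UUID.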
• We use Kafka as a high-performance, highly-available asynchronous event bus to decouple our services and reduce complexity
• Kafka is awesome - it just works!
• We use Protocol Buffers for an efficient message format that is easy to use and evolve
• Scala support for Kafka + Protobuf is great!
Wrapping up
Thank you! Questions?
Jim Riecken, Senior Software Developer (@jimriecken)