HIGH PERFORMANCE MESSAGING
WITH APACHE PULSAR
http://pulsar.apache.org
WHAT IS PULSAR?
“Pub-Sub messaging backed by durable log storage”
WHAT IS APACHE PULSAR?
• Multi-tenancy: a single cluster can support many tenants and use cases
• Ordering: guaranteed ordering
• Durability: data replicated and synced to disk
• Delivery guarantees: at least once, at most once and effectively once
• Highly scalable: can support millions of topics
• Unified messaging model: supports both topic and queue semantics in a single model
• Geo-replication: out-of-the-box support for geographically distributed applications
• High throughput: can reach 1.8M messages/s in a single partition
• Low latency: low publish latency of 5 ms at the 99th percentile
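One common way to get effectively-once publishing is for the broker to drop retried messages that it has already stored. The following is a minimal sketch of that idea, assuming each producer tags its messages with an increasing sequence id. The class and field names are hypothetical, and this is not Pulsar's actual implementation.

```python
class DedupLog:
    """Toy topic log that ignores messages it has already stored."""

    def __init__(self):
        self.entries = []
        self._last_seq = {}  # producer name -> highest sequence id stored

    def publish(self, producer, seq_id, payload):
        # A retry carries a sequence id that is not newer than the last
        # stored one, so it is acknowledged without being appended again.
        if seq_id <= self._last_seq.get(producer, -1):
            return False
        self.entries.append(payload)
        self._last_seq[producer] = seq_id
        return True


log = DedupLog()
log.publish("producer-1", 0, "a")
log.publish("producer-1", 1, "b")
retry_stored = log.publish("producer-1", 1, "b")  # retry after a lost ack
```

At-least-once delivery plus this kind of deduplication is what makes the result look like "once" to the application.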
MESSAGING MODEL
DEFINING PERFORMANCE
DISTRIBUTED VS VERTICAL
• Both target overall system performance, but the focus is very different
• Distributed systems generally focus on distributed performance
DISTRIBUTED SYSTEMS PERFORMANCE
• Key factors:
• How different components interact
• How data is replicated
• Can each component make progress while waiting for the others
STATEFUL SYSTEMS
• Macro optimizations typically yield orders of magnitude differences
• Ensure throughput is not bottlenecked by waiting
• In failure path (eg: how to replace a failed node)
• In ops tasks (eg: how to expand a cluster)
STATEFUL SYSTEMS
• Stateful systems can become unbalanced when traffic changes
• The system needs to be designed to allow for quick reaction, distributing the load across all nodes
VERTICAL OPTIMIZATIONS
• Ensure a single machine can give the max throughput
• Optimize thread access
• Concurrent data structures
• Micro-Profiling
ARCHITECTURAL VIEW
SEGMENT CENTRIC STORAGE
• In addition to partitioning, messages are stored in segments (based on time and size)
• Segments are independent of each other and spread across all storage nodes
SEGMENT CENTRIC
• Unbounded log storage
• Instant scaling without data rebalancing
• High write and read availability via maximized data placement options
• Fast replica repair — many-to-many read
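The segment-centric idea can be sketched in a few lines. This is a toy model, assuming size-based rollover only and a simple round-robin placement; the class names, the node names and the placement policy are all hypothetical and do not reflect BookKeeper's real placement logic.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    segment_id: int
    nodes: tuple  # storage nodes holding replicas of this segment
    entries: list = field(default_factory=list)


class SegmentedLog:
    """Toy log that rolls over to a new segment after max_entries."""

    def __init__(self, nodes, replicas=2, max_entries=3):
        self.nodes = list(nodes)
        self.replicas = replicas
        self.max_entries = max_entries
        self.segments = []
        self._roll()

    def _roll(self):
        # Choose replicas round-robin over the nodes known right now.
        # Existing segments stay where they are.
        start = len(self.segments) % len(self.nodes)
        chosen = tuple(self.nodes[(start + i) % len(self.nodes)]
                       for i in range(self.replicas))
        self.segments.append(Segment(len(self.segments), chosen))

    def append(self, entry):
        if len(self.segments[-1].entries) >= self.max_entries:
            self._roll()
        self.segments[-1].entries.append(entry)

    def add_node(self, node):
        # Scaling out: the new node is eligible for the next segments.
        self.nodes.append(node)


log = SegmentedLog(["bookie-1", "bookie-2", "bookie-3"])
for i in range(7):
    log.append(f"msg-{i}")
before = [s.nodes for s in log.segments]

log.add_node("bookie-4")
for i in range(7, 16):
    log.append(f"msg-{i}")
```

After `add_node`, the segments written earlier keep their original placement and only new segments land on the new node, which is the "scaling without data rebalancing" property in miniature.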
SEGMENTS VS PARTITIONS
COMPARISON WITH APACHE KAFKA
• In Kafka, partitions are sticky to brokers
• A single partition is stored entirely in a single node
• Retention is limited by a single node storage capacity
• Failure recovery and capacity expansion require “rebalancing”
• Rebalancing has a big impact over the system, affecting regular traffic
DATA PATH
1. Publisher sends message to broker
2. Broker writes in parallel to N replicas
3. Broker waits for a quorum of acks from bookies
4. Broker sends ack to producer and dispatches to consumer
BOOKKEEPER REPLICATION MODEL
• Single writer (Pulsar broker)
• Write in parallel to multiple storage nodes
• Wait for a configurable number of acks
• Supports quorum writes (eg: write 3 nodes — wait 2 acks)
• Perform recovery only after writer crashes
• Establish what was the last committed entry and “seal” the segment
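The effect of "write 3, wait 2" on latency can be shown with a small calculation. This is a sketch under the assumption that the writer sends to all replicas in parallel and completes as soon as enough acks arrive; the latency figures are made up for illustration.

```python
def quorum_ack_latency(replica_latencies_ms, ack_quorum):
    """Latency seen by the writer when it sends to all replicas in
    parallel and completes the write after `ack_quorum` responses."""
    if not 1 <= ack_quorum <= len(replica_latencies_ms):
        raise ValueError("ack_quorum must be between 1 and the number of replicas")
    # The write completes when the ack_quorum-th fastest replica answers.
    return sorted(replica_latencies_ms)[ack_quorum - 1]


# Hypothetical per-bookie latencies for one entry; the third bookie is slow.
latencies_ms = [1.2, 1.5, 40.0]

wait_2_of_3 = quorum_ack_latency(latencies_ms, ack_quorum=2)
wait_3_of_3 = quorum_ack_latency(latencies_ms, ack_quorum=3)
print(wait_2_of_3, wait_3_of_3)  # prints: 1.5 40.0
```

Waiting for 2 of 3 acks hides the slowest storage node, while waiting for all 3 makes every write as slow as the slowest node.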
KAFKA REPLICATION MODEL
ISR replication
LIMITATIONS OF KAFKA REPLICATION
• When followers are “in-sync”, the leader has to wait for them and cannot prune the slowest follower
• Leader election happens per partition
• To ensure ordering, only 1 message (or batch) can be outstanding, which limits throughput
• Reads can only happen from leader broker
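The throughput limit from a single outstanding batch follows from simple arithmetic: each round trip can carry at most one batch. A rough sketch, using the 128 KB batch size and 1 KB payload from the benchmark settings later in the deck, and assuming a 2 ms round trip (a made-up figure) with no per-record overhead:

```python
def max_msgs_per_sec(batch_bytes, message_bytes, round_trip_ms, in_flight=1):
    """Upper bound on publish rate when at most `in_flight` full batches
    can be awaiting acknowledgement at any time."""
    msgs_per_batch = batch_bytes // message_bytes
    return in_flight * msgs_per_batch * 1000 / round_trip_ms


one_outstanding = max_msgs_per_sec(131072, 1024, round_trip_ms=2.0)
five_outstanding = max_msgs_per_sec(131072, 1024, round_trip_ms=2.0, in_flight=5)
print(one_outstanding, five_outstanding)  # prints: 64000.0 320000.0
```

The bound scales linearly with the number of batches allowed in flight, so restricting that number to 1 caps the publish rate of a single partition regardless of how fast the brokers are.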
STORAGE
• Disk access patterns can lead to order of magnitude differences
• Systems that rely on page-cache have unpredictable performance
• Page cache is RAM speed until the system is under stress
• After that, memory accesses can take hundreds of milliseconds
BOOKKEEPER INTERNAL
BOOKKEEPER STORAGE
• IO isolation between write and read operations
• Slow consumers won’t impact latency
• Very effective IO patterns:
• Journal — append only and no reads
• Storage device — bulk write and sequential reads
• Number of files is independent from number of topics
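Why the number of files can stay independent of the number of topics is easier to see with a toy model: entries from every topic are appended to one shared file, and an index maps each entry back to its offset. This is a sketch of the idea only, with hypothetical names and an in-memory buffer standing in for the file; it is not BookKeeper's on-disk format.

```python
import io
import struct


class InterleavedLog:
    """Toy shared log: one file for all topics, plus an index."""

    def __init__(self):
        self._file = io.BytesIO()  # stands in for a single on-disk file
        self.index = {}            # (topic, entry_id) -> offset in the file
        self._next_id = {}         # topic -> next entry id

    def append(self, topic, payload):
        entry_id = self._next_id.get(topic, 0)
        self._next_id[topic] = entry_id + 1
        offset = self._file.seek(0, io.SEEK_END)  # always write at the tail
        # Length-prefixed record: 4-byte big-endian size, then the payload.
        self._file.write(struct.pack(">I", len(payload)) + payload)
        self.index[(topic, entry_id)] = offset
        return entry_id

    def read(self, topic, entry_id):
        self._file.seek(self.index[(topic, entry_id)])
        (size,) = struct.unpack(">I", self._file.read(4))
        return self._file.read(size)


log = InterleavedLog()
for i in range(3):
    for topic in ("orders", "payments", "audit"):
        log.append(topic, f"{topic}-{i}".encode())
```

Writes are purely sequential no matter how many topics are active, and adding a topic adds index entries rather than files.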
KAFKA STORAGE
• Multiple files per partition: each segment has data
• Many file descriptors needed
• Page cache only works well until the active data set exceeds RAM size
• IO is scattered throughout the disk
OPTIMIZATIONS
• Payload Buffer pooling — Direct memory — No heap pollution
• Object pooling in data path — minimize GC work
• Serialize operations to thread to avoid mutex contention
• Pulsar brokers act as a “proxy”: payloads are forwarded with zero copies from producers to storage and consumers
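"Serialize operations to thread" can be illustrated with a keyed executor: every operation for a given topic is routed to the same single-threaded worker, so per-topic state needs no lock. A minimal sketch with hypothetical names, not the broker's actual code:

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


class OrderedExecutor:
    """Routes every task for a given key to the same single-threaded worker."""

    def __init__(self, num_threads=4):
        self._workers = [ThreadPoolExecutor(max_workers=1)
                         for _ in range(num_threads)]

    def submit(self, key, fn, *args):
        # A stable hash keeps a key pinned to one worker for its lifetime.
        worker = self._workers[zlib.crc32(key.encode()) % len(self._workers)]
        return worker.submit(fn, *args)

    def shutdown(self):
        for worker in self._workers:
            worker.shutdown(wait=True)


# Each topic's list is only ever mutated by that topic's worker thread,
# so no mutex is needed around it.
received = {"topic-a": [], "topic-b": []}

executor = OrderedExecutor()
for i in range(100):
    for topic in received:
        executor.submit(topic, received[topic].append, i)
executor.shutdown()
```

Because each worker runs its tasks in submission order, operations on one topic are also applied in order, while different topics still run in parallel on different workers.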
BENCHMARK
OPENMESSAGING BENCHMARK
openmessaging.cloud
openmessaging.cloud/docs/benchmarks
BENCHMARK FRAMEWORK
• Designed to measure performance of distributed messaging systems
• Supports various “drivers” (Kafka, Pulsar, RocketMQ, RabbitMQ)
• Automated deployment in EC2
• Configure workloads through a YAML file
DISTRIBUTED EXECUTION
The coordinator takes the workload definition and propagates it to multiple workers, then collects and reports stats
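The coordinator's two jobs, fanning the workload out and merging what comes back, can be sketched as follows. Everything here is hypothetical (function names, the stats fields, the sample reports) and is not taken from the benchmark's code.

```python
from dataclasses import dataclass


@dataclass
class WorkerStats:
    messages_sent: int
    max_latency_ms: float


def split_rate(total_rate, num_workers):
    """Divide a target publish rate across workers, spreading any remainder."""
    base, extra = divmod(total_rate, num_workers)
    return [base + (1 if i < extra else 0) for i in range(num_workers)]


def merge_stats(per_worker):
    """Combine worker reports into one cluster-wide summary."""
    return WorkerStats(
        messages_sent=sum(s.messages_sent for s in per_worker),
        max_latency_ms=max(s.max_latency_ms for s in per_worker),
    )


rates = split_rate(50_000, 3)  # [16667, 16667, 16666]

# Made-up reports, as if each worker had run for 60 seconds at its rate.
reports = [WorkerStats(messages_sent=r * 60, max_latency_ms=5.0 + i)
           for i, r in enumerate(rates)]
summary = merge_stats(reports)
```

Counts add up across workers, but latency figures do not: a maximum can be merged with `max`, whereas percentiles need the underlying histograms rather than per-worker summaries.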
BENCHMARK RESULTS
• Testing goals
• Throughput & latency under different conditions
• Min 2 guaranteed copies
• Running on 3 EC2 VMs with local SSDs
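Latency results in this deck are quoted at percentiles such as "99pct". As a reminder of what that means, here is a nearest-rank percentile over raw samples. It is a sketch with made-up samples; real benchmark tools typically use histograms instead of sorting every sample.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# 100 made-up publish latencies in ms: mostly fast, with two outliers.
latencies_ms = [1.0] * 98 + [5.0, 250.0]

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
worst = max(latencies_ms)
```

Here the median is 1.0 ms and the 99th percentile is 5.0 ms, while the single worst sample is 250 ms. This is why tail percentiles are reported alongside averages.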
KAFKA SETTINGS
• Topic settings
replicationFactor=3
min.insync.replicas=2
log.flush.interval.ms= # Using default: means no fsyncs
• Kafka producer config
acks=all
linger.ms=1
batch.size=131072
PULSAR / BOOKKEEPER SETTINGS
• Use ensemble=3 write=3 ack=2
• Write to 3 bookies and wait for 2 acks
• Data synced on disk before ack
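Both configurations aim at the "min 2 guaranteed copies" goal. The sketch below checks the settings from these slides; the function names are hypothetical, and the functions only encode the rule of thumb that the acknowledged copy count equals the ack quorum (BookKeeper) or `min.insync.replicas` with `acks=all` (Kafka).

```python
def bookkeeper_guaranteed_copies(ensemble, write_quorum, ack_quorum):
    """Copies known to be stored when the producer receives its ack."""
    if not ensemble >= write_quorum >= ack_quorum >= 1:
        raise ValueError("expected ensemble >= write quorum >= ack quorum >= 1")
    return ack_quorum


def kafka_guaranteed_copies(replication_factor, min_insync_replicas):
    """Minimum copies acknowledged when the producer uses acks=all."""
    if not replication_factor >= min_insync_replicas >= 1:
        raise ValueError("expected replication factor >= min.insync.replicas >= 1")
    return min_insync_replicas


pulsar_copies = bookkeeper_guaranteed_copies(ensemble=3, write_quorum=3, ack_quorum=2)
kafka_copies = kafka_guaranteed_copies(replication_factor=3, min_insync_replicas=2)
```

The copy counts match, but per the settings above the Pulsar copies are synced to disk before the ack, while the Kafka copies are acknowledged without an fsync.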
[Chart: Max throughput. 1 topic, 1 partition, 1 KB payload]
[Chart: Max throughput with exactly-once producer. 1 topic, 1 partition, 1 KB payload]
Kafka settings:
enable.idempotence=true
max.in.flight.requests.per.connection=1
retries=2147483647
[Chart: Latency at fixed throughput. 50K msg/s, 1 topic, 1 partition, 1 KB payload]
[Chart: Latency at fixed throughput, 99pct. 50K msg/s, 1 topic, 1 partition, 1 KB payload]
[Chart: Latency at fixed throughput (including Kafka-sync). 50K msg/s, 1 topic, 1 partition, 1 KB payload]
[Chart: Latency at fixed throughput, 99pct. 50K msg/s, 1 topic, 1 partition, 1 KB payload]
OPTIMIZING FOR LOW LATENCY
• Testing at a smaller throughput for sub-millisecond latency
• Tested on bare metal server
• Single machine to isolate impact of slow networks
HARDWARE SETUP
• 1 Machine — Bare metal
• 12 CPU cores — Intel(R) Xeon(R) CPU E5-2687W v4 @ 3.00GHz
• 128 GB RAM
• 2 x 1.2TB NVMe disks
[Charts (3 slides): Latency at low throughput. 1K msg/s, 1 topic, 1 partition, 1 KB payload]
QUESTIONS?
