Preview of Apache Pulsar 2.5.0
Transactional streaming
Sticky consumer
Batch receiving
Namespace change events
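Messaging semantics - 1
1. At least once
Message msg = null;
try {
    msg = consumer.receive();
    // processing
    consumer.acknowledge(msg);
} catch (Exception e) {
    // processing failed: ask the broker to redeliver the message
    if (msg != null) {
        consumer.negativeAcknowledge(msg);
    }
}
2. At most once
Message msg = null;
try {
    msg = consumer.receive();
    // processing
} catch (Exception e) {
    log.error("processing error", e);
} finally {
    // acknowledge whether or not processing succeeded, so the message is never redelivered
    if (msg != null) {
        consumer.acknowledge(msg);
    }
}
3. Exactly once?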
Messaging semantics - 2
Idempotent produce and idempotent consume are used more often in practice
Messaging semantics - 3
Effectively once
ledgerId + messageId -> sequenceId, plus broker deduplication
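A minimal sketch of the deduplication idea, assuming the broker remembers the highest sequence ID it has persisted for each producer and drops anything at or below it. The class and method names are hypothetical and this is not Pulsar's implementation; it only shows why a deterministic sequenceId makes retries harmless.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of broker-side deduplication (hypothetical names, not Pulsar code).
final class DedupSketch {
    // Highest sequence ID accepted so far, per producer name.
    private final Map<String, Long> lastSequenceId = new HashMap<>();

    /** Returns true if the message is new and would be persisted, false if it is a duplicate. */
    boolean accept(String producerName, long sequenceId) {
        long last = lastSequenceId.getOrDefault(producerName, -1L);
        if (sequenceId <= last) {
            return false; // already seen, e.g. a retry after a lost response
        }
        lastSequenceId.put(producerName, sequenceId);
        return true;
    }

    public static void main(String[] args) {
        DedupSketch broker = new DedupSketch();
        System.out.println(broker.accept("producer-a", 0)); // first send
        System.out.println(broker.accept("producer-a", 0)); // retry of the same message
        System.out.println(broker.accept("producer-a", 1)); // next message
    }
}
```

If the sequence ID of an output message is derived from the position of the input message, reprocessing the same input yields the same sequence ID and the retry is dropped.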
Messaging semantics - 4
Limitations in effectively once
1. Only works when producing to one partition
2. Only works when producing one message
3. Only works when consuming from one partition
4. Consumers are required to store the message ID and state for recovery
Streaming processing - 1
[Diagram: Topic-1 -> A -> f(A) -> B -> Topic-2]
1. Receive message A from Topic-1 and do some processing
Streaming processing - 2
2. Write the result message B to Topic-2
Streaming processing - 3
3. Get the send response from Topic-2
How to handle a response timeout or a consumer/function crash?
Ack message A = At most once
Nack message A = At least once
Streaming processing - 4
4. Ack message A
How to handle a failed ack or a consumer/function crash?
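The failure window between steps 3 and 4 can be illustrated with a toy simulation. This is a sketch with hypothetical names and no real broker: it assumes that an unacknowledged message A is redelivered once, and shows that the output topic then contains B twice.

```java
import java.util.ArrayList;
import java.util.List;

// Toy simulation of the consume -> process -> produce -> ack loop; no real broker involved.
final class CrashWindowSketch {
    /**
     * Processes input message A and returns what ends up in Topic-2.
     * If crashBeforeAck is true, the first attempt dies after B is written but before A is acked,
     * so A is redelivered and processed a second time.
     */
    static List<String> run(boolean crashBeforeAck) {
        List<String> topic2 = new ArrayList<>();
        boolean acked = false;
        for (int attempt = 1; !acked && attempt <= 2; attempt++) {
            topic2.add("B");                                   // steps 2-3: write B, send succeeds
            boolean crashed = crashBeforeAck && attempt == 1;  // failure window before step 4
            if (!crashed) {
                acked = true;                                  // step 4: ack A
            }
        }
        return topic2;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // [B]
        System.out.println(run(true));  // [B, B]
    }
}
```

Without an atomic "publish B and ack A" operation, the duplicate in the second run can only be avoided by giving up redelivery, which turns the guarantee into at most once.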
Transactional streaming semantics
1. Atomic multi-topic publish and acknowledge
2. A message is dispatched to only one consumer until the transaction aborts
3. Consumers can read only committed messages (READ_COMMITTED)
https://github.com/apache/pulsar/wiki/PIP-31%3A-Transaction-Support
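The READ_COMMITTED rule can be sketched with an in-memory buffer. This is an illustration only, with hypothetical names, and it says nothing about how PIP-31 stores transactional data: messages published inside a transaction stay invisible until commit and disappear on abort.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy transaction buffer (hypothetical, not the PIP-31 design).
final class TxnBufferSketch {
    private final Map<Long, List<String>> pending = new HashMap<>(); // txnId -> buffered messages
    private final List<String> committed = new ArrayList<>();       // what consumers may read

    void publish(long txnId, String message) {
        pending.computeIfAbsent(txnId, id -> new ArrayList<>()).add(message);
    }

    /** Makes all messages of the transaction visible at once. */
    void commit(long txnId) {
        List<String> messages = pending.remove(txnId);
        if (messages != null) {
            committed.addAll(messages);
        }
    }

    /** Discards all messages of the transaction. */
    void abort(long txnId) {
        pending.remove(txnId);
    }

    /** READ_COMMITTED view: only messages from committed transactions. */
    List<String> readCommitted() {
        return new ArrayList<>(committed);
    }

    public static void main(String[] args) {
        TxnBufferSketch topic = new TxnBufferSketch();
        topic.publish(1L, "output-message-1");
        System.out.println(topic.readCommitted()); // nothing visible yet
        topic.commit(1L);
        System.out.println(topic.readCommitted()); // now visible
    }
}
```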
Transactional streaming demo
Message<String> message = inputConsumer.receive();
// open a transaction
Transaction txn =
    client.newTransaction().withTransactionTimeout(…).build().get();
// publish to two topics inside the transaction
CompletableFuture<MessageId> sendFuture1 =
    producer1.newMessage(txn).value("output-message-1").sendAsync();
CompletableFuture<MessageId> sendFuture2 =
    producer2.newMessage(txn).value("output-message-2").sendAsync();
// acknowledge the input message inside the same transaction
inputConsumer.acknowledgeAsync(message.getMessageId(), txn);
// publishes and acknowledgement become visible atomically on commit
txn.commit().get();
MessageId msgId1 = sendFuture1.get();
MessageId msgId2 = sendFuture2.get();
Sticky consumer
https://github.com/apache/pulsar/wiki/PIP-34%3A-Add-new-subscribe-type-Key_shared
// Preview API as shown in the talk; released clients name this factory KeySharedPolicy.stickyHashRange()
// consumer1 receives keys whose hash falls in [0, 32767]
Consumer consumer1 = client.newConsumer()
    .topic("my-topic")
    .subscriptionName("my-subscription")
    .subscriptionType(SubscriptionType.Key_Shared)
    .keySharedPolicy(KeySharedPolicy.sticky()
        .ranges(Range.of(0, 32767)))
    .subscribe();
// consumer2 receives keys whose hash falls in [32768, 65535]
Consumer consumer2 = client.newConsumer()
    .topic("my-topic")
    .subscriptionName("my-subscription")
    .subscriptionType(SubscriptionType.Key_Shared)
    .keySharedPolicy(KeySharedPolicy.sticky()
        .ranges(Range.of(32768, 65535)))
    .subscribe();
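A rough sketch of how fixed hash ranges pin a key to a consumer, assuming the two ranges from the snippet (0 to 32767 and 32768 to 65535). `String.hashCode` stands in for the broker's hash function here, which is an assumption made only to keep the example self-contained; the class and method names are hypothetical.

```java
// Toy key router for two sticky consumers (hypothetical, not the broker's dispatcher).
final class StickyRangeSketch {
    static final int RANGE_SIZE = 65536;

    /** Maps a key to a slot in [0, 65535]. Assumption: String.hashCode replaces the real hash. */
    static int slot(String key) {
        return Math.floorMod(key.hashCode(), RANGE_SIZE);
    }

    /** Picks the consumer whose fixed range contains the key's slot. */
    static String route(String key) {
        return slot(key) <= 32767 ? "consumer1" : "consumer2";
    }

    public static void main(String[] args) {
        for (String key : new String[] {"a", "daa", "order-42"}) {
            System.out.println(key + " -> slot " + slot(key) + " -> " + route(key));
        }
    }
}
```

Because the ranges are fixed by the consumers themselves, the same key always reaches the same consumer, regardless of the order in which consumers connect.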
Batch receiving messages
Consumer consumer = client.newConsumer()
    .topic("my-topic")
    .subscriptionName("my-subscription")
    .batchReceivePolicy(BatchReceivePolicy.builder()
        .maxNumMessages(100)
        .maxNumBytes(2 * 1024 * 1024)
        .timeout(1, TimeUnit.SECONDS)
        .build())
    .subscribe();
// returns once any of the three limits above is reached
Messages msgs = consumer.batchReceive();
// process the batch
https://github.com/apache/pulsar/wiki/PIP-38%3A-Batch-Receiving-Messages
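How the count and size limits interact can be sketched without a client. The helper below is hypothetical and leaves the timeout out entirely: it cuts a queue of payloads into batches so that no batch exceeds either the message limit or the byte limit, except that a single oversized payload still forms its own batch.

```java
import java.util.ArrayList;
import java.util.List;

// Toy batching helper (hypothetical); the timeout dimension is not modelled.
final class BatchPolicySketch {
    /** Splits queued payloads into batches that respect maxNumMessages and maxNumBytes. */
    static List<List<byte[]>> batches(List<byte[]> queue, int maxNumMessages, long maxNumBytes) {
        List<List<byte[]>> out = new ArrayList<>();
        List<byte[]> current = new ArrayList<>();
        long bytes = 0;
        for (byte[] payload : queue) {
            boolean full = current.size() >= maxNumMessages
                    || (!current.isEmpty() && bytes + payload.length > maxNumBytes);
            if (full) {
                out.add(current);          // close the batch before adding the payload
                current = new ArrayList<>();
                bytes = 0;
            }
            current.add(payload);
            bytes += payload.length;
        }
        if (!current.isEmpty()) {
            out.add(current);              // flush the last, possibly partial, batch
        }
        return out;
    }

    public static void main(String[] args) {
        List<byte[]> queue = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            queue.add(new byte[10]);
        }
        System.out.println(batches(queue, 2, 1000).size() + " batches");
    }
}
```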
Namespace change events
https://github.com/apache/pulsar/wiki/PIP-39%3A-Namespace-Change-Events
persistent://tenant/ns/__change_events
class PulsarEvent {
    EventType eventType;
    ActionType actionType;
    TopicEvent topicEvent;
}
Thanks
Penghui Li
Bo Cong / 丛搏
Pulsar Schema
Messaging system R&D engineer at Zhaopin (智联招聘)
Core contributor to Pulsar Schema and HDFS Offload
Schema Evolution
Data management can't escape the evolution of schema
Single version schema
message 1, message 2, message 3: all version 1
Multiple version schemas
message 1: version 1
message 2: version 2
message 3: version 3
Schema compatibility
Can read = can deserialize
Compatibility strategy evolution
Backward:
version 2 can read version 1
version 1 can read version 0
version 2 may not be able to read version 0
Backward Transitive:
version 2 can read version 1
version 1 can read version 0
version 2 can read version 0
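The difference between the two strategies comes down to which existing versions a new schema is checked against. A minimal sketch, assuming a caller-supplied `canRead(reader, writer)` predicate; all names are hypothetical and this is not Pulsar's checker.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

// Toy compatibility checks (hypothetical); canRead(reader, writer) is supplied by the caller.
final class CompatCheckSketch {
    /** BACKWARD: the new schema only has to read the latest existing version. */
    static <S> boolean backward(List<S> existing, S candidate, BiPredicate<S, S> canRead) {
        return existing.isEmpty() || canRead.test(candidate, existing.get(existing.size() - 1));
    }

    /** BACKWARD_TRANSITIVE: the new schema has to read every existing version. */
    static <S> boolean backwardTransitive(List<S> existing, S candidate, BiPredicate<S, S> canRead) {
        return existing.stream().allMatch(old -> canRead.test(candidate, old));
    }

    public static void main(String[] args) {
        // Versions are plain integers here; a reader is assumed to understand only the previous version.
        BiPredicate<Integer, Integer> canRead = (reader, writer) -> reader - writer <= 1;
        List<Integer> existing = Arrays.asList(0, 1);
        System.out.println(backward(existing, 2, canRead));           // checks version 1 only
        System.out.println(backwardTransitive(existing, 2, canRead)); // also checks version 0
    }
}
```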
Evolution of the situation
Version 1
class Person {
    @Nullable
    String name;
}
Version 2
class Person {
    String name;
}
Version 3
class Person {
    @Nullable
    @AvroDefault("\"Zhang San\"")
    String name;
}
[Diagram arrows between the versions: can read, can read, can't read]
Compatibility check
Separate schema compatibility checkers for producer and consumer
Producer: check whether the schema exists when isAllowAutoUpdateSchema = false
Consumer: checked separately from the producer
Upgrade way
Different strategies call for different upgrade orders
BACKWARD, BACKWARD_TRANSITIVE: upgrade consumers first
FORWARD, FORWARD_TRANSITIVE: upgrade producers first
FULL, FULL_TRANSITIVE: any order
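The pairing of strategy and upgrade order can be written down as a small lookup. This is a sketch with a hypothetical enum and helper, not a Pulsar API: backward strategies let new readers handle old data, so consumers move first, and forward strategies are the mirror image.

```java
// Toy lookup from compatibility strategy to upgrade order (hypothetical names).
final class UpgradeOrderSketch {
    enum Strategy { BACKWARD, BACKWARD_TRANSITIVE, FORWARD, FORWARD_TRANSITIVE, FULL, FULL_TRANSITIVE }

    /** Returns which side should be upgraded first under the given strategy. */
    static String upgradeFirst(Strategy strategy) {
        switch (strategy) {
            case BACKWARD:
            case BACKWARD_TRANSITIVE:
                return "consumers"; // new consumers can still read data written with the old schema
            case FORWARD:
            case FORWARD_TRANSITIVE:
                return "producers"; // old consumers can still read data written with the new schema
            default:
                return "any order"; // FULL variants are compatible in both directions
        }
    }

    public static void main(String[] args) {
        for (Strategy s : Strategy.values()) {
            System.out.println(s + ": " + upgradeFirst(s));
        }
    }
}
```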
Produce Different Message
Producer<V1Data> p = pulsarClient.newProducer(Schema.AVRO(V1Data.class))
    .topic(topic).create();
Consumer<V2Data> c = pulsarClient.newConsumer(Schema.AVRO(V2Data.class))
    .topic(topic)
    .subscriptionName("sub1").subscribe();
// one producer sends messages with different schemas
p.newMessage().value(data1).send();
p.newMessage(Schema.AVRO(V2Data.class)).value(data2).send();
p.newMessage(Schema.AVRO(V1Data.class)).value(data3).send();
// the consumer reads every message as V2Data
Message<V2Data> msg1 = c.receive();
V2Data msg1Value = msg1.getValue();
Message<V2Data> msg2 = c.receive();
Message<V2Data> msg3 = c.receive();
V2Data msg3Value = msg3.getValue();
Thanks
Bo Cong
翟佳
Kafka on Pulsar (KoP)
What is Apache Pulsar?
Flexible pub/sub messaging backed by durable log storage
Barrier for user?
Unified Messaging Protocol
Apps built on old systems
How does Pulsar handle it?
Pulsar Kafka wrapper on the Kafka Java API
https://pulsar.apache.org/docs/en/adaptors-kafka/
Pulsar IO Connect
https://pulsar.apache.org/docs/en/io-overview/
Kafka on Pulsar (KoP)
KoP Feasibility: Log
[Diagram: a topic is a log that producers append to and consumers read from, in both Kafka and Pulsar]
KoP Feasibility: Others
Producer
Consumer
Topic Lookup
Produce
Consume
Offset
Consumption State
KoP Overview
[Diagram: Pulsar producers and consumers (Pulsar lib) and Kafka producers and consumers (Kafka lib) connect to the same broker, which runs a Pulsar protocol handler and a Kafka protocol handler side by side. Inside the broker: load balancer, geo-replicator, Pulsar topic, managed ledger, BK client. Below the broker: ZooKeeper and bookies.]
KoP Implementation
Topic flat map: Broker sets `kafkaNamespace`
Message ID and Offset: LedgerId + EntryId
Message: Convert Key/value/timestamp/headers(properties)
Topic Lookup: Pulsar admin topic lookup -> owner broker
Produce: Convert, then call PulsarTopic.publishMessage
Consume: Convert, then call non-durable-cursor.readEntries
Group Coordinator: Keep in topic `public/__kafka/__offsets`
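The "LedgerId + EntryId" mapping can be sketched as bit packing into a 64-bit Kafka offset. The 32-bit split below is an assumption chosen for illustration, not necessarily the layout KoP uses, and the names are hypothetical; the point is that offsets keep the log order because ledger IDs grow over time and entry IDs grow within a ledger.

```java
// Toy packing of a Pulsar message position into a Kafka-style offset (assumed bit layout).
final class KopOffsetSketch {
    // Assumption: the low 32 bits hold the entry ID, the remaining high bits hold the ledger ID.
    static final int ENTRY_BITS = 32;
    static final long ENTRY_MASK = (1L << ENTRY_BITS) - 1;

    static long toOffset(long ledgerId, long entryId) {
        return (ledgerId << ENTRY_BITS) | (entryId & ENTRY_MASK);
    }

    static long ledgerId(long offset) {
        return offset >>> ENTRY_BITS;
    }

    static long entryId(long offset) {
        return offset & ENTRY_MASK;
    }

    public static void main(String[] args) {
        long offset = toOffset(3, 7);
        System.out.println(ledgerId(offset) + ":" + entryId(offset)); // round trip back to 3:7
    }
}
```

A fetch request carrying such an offset can be turned back into a ledger and entry position before reading from the cursor.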
KoP Now
Layered Architecture
Independent Scale
Instant Recovery
Balance-free expand
Ordering: guaranteed ordering
Multi-tenancy: a single cluster can support many tenants and use cases
High throughput: can reach 1.8 M messages/s in a single partition
Durability: data replicated and synced to disk
Geo-replication: out-of-the-box support for geographically distributed applications
Unified messaging model: supports both streaming and queuing
Delivery guarantees: at least once, at most once and effectively once
Low latency: low publish latency of 5 ms
Highly scalable & available: can support millions of topics
HA
Demo
https://kafka.apache.org/quickstart
Demo1: Kafka Producer / Consumer
Demo2: Kafka Connect
https://archive.apache.org/dist/kafka/2.0.0/kafka_2.12-2.0.0.tgz
Demo video: https://www.bilibili.com/video/av75540685
Demo1: K-Producer -> K-Consumer
[Diagram: a Kafka producer and a Kafka consumer (Kafka lib) talk to the broker's Kafka protocol handler, which writes to and reads from a Pulsar topic]
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
Demo1: P-Producer -> K-Consumer
[Diagram: a Pulsar producer (Pulsar lib) writes through the Pulsar protocol handler and a Kafka consumer (Kafka lib) reads through the Kafka protocol handler; both use the same Pulsar topic]
bin/pulsar-client produce test -n 1 -m "Hello from Pulsar Producer, Message 1"
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
Demo1: K-Producer -> P-Consumer
[Diagram: a Kafka producer (Kafka lib) writes through the Kafka protocol handler and a Pulsar consumer (Pulsar lib) reads through the Pulsar protocol handler; both use the same Pulsar topic]
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
bin/pulsar-client consume -s sub-name test -n 0
Demo2: Kafka Connect
[Diagram: input file -> Kafka file source -> topic (a Pulsar topic behind the Kafka protocol handler) -> Kafka file sink -> output file]
bin/connect-standalone.sh \
    config/connect-standalone.properties \
    config/connect-file-source.properties \
    config/connect-file-sink.properties
Demo2: Pulsar Functions
https://pulsar.apache.org/docs/en/functions-overview/
[Diagram: input file -> Kafka file source -> topic (a Pulsar topic behind the Kafka protocol handler) -> Kafka file sink -> output file; a Pulsar Function reads the same topic and writes to an output topic]
bin/pulsar-admin functions localrun --name pulsarExclamation \
    --jar pulsar-functions-api-examples.jar \
    --classname org…ExclamationFunction \
    --inputs connect-test-partition-0 --output out-hello
Apache Pulsar & Apache Kafka
Thanks!
StreamNative
We are hiring
