SlideShare a Scribd company logo
Let's keep it simple
and streaming
Tim Spann
Developer Advocate
Tim Spann
Developer Advocate
● FLiP(N) Stack = Flink, Pulsar and NiFi Stack
● Streaming Systems/ Data Architect
● Experience:
○ 15+ years of experience with batch and streaming
technologies including Pulsar, Flink, Spark, NiFi, Spring,
Java, Big Data, Cloud, MXNet, Hadoop, Datalakes, IoT
and more.
streamnative.io
Let's keep it simple and streaming
Proprietary & Confidential |
Agenda
5
● Introduction
● What is Apache Pulsar?
● Spring Apps
● Pulsar
● AMQP
● MQTT
● Kafka
● Demo
Proprietary & Confidential | 6
Apache Pulsar has a vibrant community
560+
Contributors
10,000+
Commits
7,000+
Slack Members
1,000+
Organizations
Using Pulsar
101
Unified
Messaging
Platform
Guaranteed
Message
Delivery
Resiliency Infinite
Scalability
Proprietary & Confidential |
Streaming
Consumer
Consumer
Consumer
Subscription
Shared
Failover
Consumer
Consumer
Subscription
In case of failure in
Consumer B-0
Consumer
Consumer
Subscription
Exclusive
X
Consumer
Consumer
Key-Shared
Subscription
Pulsar
Topic/Partition
Messaging
8
Tenants / Namespaces / Topics
Tenants
(Compliance)
Tenants
(Data Services)
Namespace
(Microservices)
Topic-1
(Cust Auth)
Topic-1
(Location Resolution)
Topic-2
(Demographics)
Topic-1
(Budgeted Spend)
Topic-1
(Acct History)
Topic-1
(Risk Detection)
Namespace
(ETL)
Namespace
(Campaigns)
Namespace
(ETL)
Tenants
(Marketing)
Namespace
(Risk Assessment)
Pulsar Cluster
Proprietary & Confidential |
Messages - the basic unit of Pulsar
10
Component Description
Value / data payload The data carried by the message. All Pulsar messages contain raw bytes, although message data
can also conform to data schemas.
Key Messages are optionally tagged with keys, used in partitioning and also is useful for things like
topic compaction.
Properties An optional key/value map of user-defined properties.
Producer name The name of the producer who produces the message. If you do not specify a producer name, the
default name is used.
Sequence ID Each Pulsar message belongs to an ordered sequence on its topic. The sequence ID of the
message is its order in that sequence.
Pulsar Cluster
● “Bookies”
● Stores messages and cursors
● Messages are grouped in
segments/ledgers
● A group of bookies form an
“ensemble” to store a ledger
● “Brokers”
● Handles message routing and
connections
● Stateless, but with caches
● Automatic load-balancing
● Topics are composed of
multiple segments
●
● Stores metadata for
both Pulsar and
BookKeeper
● Service discovery
Store
Messages
Metadata &
Service Discovery
Metadata &
Service Discovery
Metadata
Storage
Producer-Consumer
Producer Consumer
Publisher sends data and
doesn't know about the
subscribers or their status.
All interactions go through
Pulsar and it handles all
communication.
Subscriber receives data
from publisher and never
directly interacts with it
Topic
Topic
Apache Pulsar: Messaging vs Streaming
Message Queueing - Queueing
systems are ideal for work queues
that do not require tasks to be
performed in a particular order.
Streaming - Streaming works
best in situations where the
order of messages is important.
Pulsar Subscription Modes
Different subscription modes
have different semantics:
Exclusive/Failover -
guaranteed order, single active
consumer
Shared - multiple active
consumers, no order
Key_Shared - multiple active
consumers, order for given key
Producer 1
Producer 2
Pulsar Topic
Subscription D
Consumer D-1
Consumer D-2
Key-Shared
<
K
1,
V
10
>
<
K
1,
V
11
>
<
K
1,
V
12
>
<
K
2
,V
2
0
>
<
K
2
,V
2
1>
<
K
2
,V
2
2
>
Subscription C
Consumer C-1
Consumer C-2
Shared
<
K
1,
V
10
>
<
K
2,
V
21
>
<
K
1,
V
12
>
<
K
2
,V
2
0
>
<
K
1,
V
11
>
<
K
2
,V
2
2
>
Subscription A Consumer A
Exclusive
Subscription B
Consumer B-1
Consumer B-2
In case of failure in
Consumer B-1
Failover
Flexible Pub/Sub API for Pulsar - Shared
Consumer consumer = client.newConsumer()
.topic("my-topic")
.subscriptionName("work-q-1")
.subscriptionType(SubType.Shared)
.subscribe();
Flexible Pub/Sub API for Pulsar - Failover
Consumer consumer = client.newConsumer()
.topic("my-topic")
.subscriptionName("stream-1")
.subscriptionType(SubType.Failover)
.subscribe();
Proprietary & Confidential |
Data Offloaders
(Tiered Storage)
Client Libraries
StreamNative Pulsar ecosystem
hub.streamnative.io
Connectors
(Sources & Sinks)
Protocol Handlers
Pulsar Functions
(Lightweight Stream
Processing)
Processing Engines
… and more!
… and more!
Kafka
On Pulsar
(KoP)
MQTT
On Pulsar
(MoP)
AMQP
On Pulsar
(AoP)
Schema Registry
Schema Registry
schema-1 (value=Avro/Protobuf/JSON) schema-2 (value=Avro/Protobuf/JSON) schema-3
(value=Avro/Protobuf/JSON)
Schema
Data
ID
Local Cache
for Schemas
+
Schema
Data
ID +
Local Cache
for Schemas
Send schema-1
(value=Avro/Protobuf/JSON) data
serialized per schema ID
Send (register)
schema (if not in
local cache)
Read schema-1
(value=Avro/Protobuf/JSON) data
deserialized per schema ID
Get schema by ID (if
not in local cache)
Producers Consumers
Building Real-Time Requires a Team
Pulsar - Spring
https://docs.spring.io/spring-pulsar/docs/current/reference/html/
Pulsar - Spring - Code
@Autowired
private PulsarTemplate<Observation> pulsarTemplate;
this.pulsarTemplate.setSchema(Schema.
JSON(Observation.class));
MessageId msgid = pulsarTemplate.newMessage(observation)
.withMessageCustomizer((mb) -> mb.key(
uuidKey.toString()))
.send();
@PulsarListener(subscriptionName = "aq-spring-reader", subscriptionType = Shared,
schemaType = SchemaType.
JSON, topics = "persistent://public/default/aq-pm25")
void echoObservation(Observation message) {
this.log.info("PM2.5 Message received: {}", message);
}
Pulsar - Spring - Configuration
spring:
pulsar:
client:
service-url: pulsar+ssl://sn-academy.sndevadvocate.snio.cloud:6651
auth-plugin-class-name: org.apache.pulsar.client.impl.auth.oauth2.AuthenticationOAuth2
authentication:
issuer-url: https://auth.streamnative.cloud/
private-key: file:///scr/sndevadvocate-tspann.json
audience: urn:sn:pulsar:sndevadvocate:my-instance
producer:
batching-enabled: false
send-timeout-ms: 90000
producer-name: airqualityjava
topic-name: persistent://public/default/airquality
Spring - Pulsar as Kafka
https://www.baeldung.com/spring-kafka
@Bean
public KafkaTemplate<String, Observation> kafkaTemplate() {
KafkaTemplate<String, Observation> kafkaTemplate =
new KafkaTemplate<String, Observation>(producerFactory());
return kafkaTemplate;
}
ProducerRecord<String, Observation> producerRecord = new ProducerRecord<>(topicName,
uuidKey.toString(),
message);
kafkaTemplate.send(producerRecord);
Spring - MQTT - Pulsar
https://roytuts.com/publish-subscribe-message-onto-mqtt-using-spring/
@Bean
public IMqttClient mqttClient(
@Value("${mqtt.clientId}") String clientId,
@Value("${mqtt.hostname}") String hostname,
@Value("${mqtt.port}") int port)
throws MqttException {
IMqttClient mqttClient = new MqttClient(
"tcp://" + hostname + ":" + port, clientId);
mqttClient.connect(mqttConnectOptions());
return mqttClient;
}
MqttMessage mqttMessage = new MqttMessage();
mqttMessage.setPayload(DataUtility.serialize(payload));
mqttMessage.setQos(0);
mqttMessage.setRetained(true);
mqttClient.publish(topicName, mqttMessage);
Spring - AMQP - Pulsar
https://www.baeldung.com/spring-amqp
rabbitTemplate.convertAndSend(topicName,
DataUtility.serializeToJSON(observation));
@Bean
public CachingConnectionFactory
connectionFactory() {
CachingConnectionFactory ccf =
new CachingConnectionFactory();
ccf.setAddresses(serverName);
return ccf;
}
Reactive Spring - Pulsar
Reactive Spring - Pulsar
REST + Spring Boot + Pulsar + Friends
FLiP Stack Weekly
This week in Apache Flink, Apache Pulsar, Apache
NiFi, Apache Spark, Java and Open Source friends.
https://bit.ly/32dAJft
Demo
● https://streamnative.io/blog/engineering/2022-11-29-spring-into-pulsar-part-2-spri
ng-based-microservices-for-multiple-protocols-with-apache-pulsar/
● https://streamnative.io/blog/release/2022-09-21-announcing-spring-for-apache-pu
lsar/
● https://docs.spring.io/spring-pulsar/docs/current-SNAPSHOT/reference/html/
● https://spring.io/blog/2022/08/16/introducing-experimental-spring-support-for-apa
che-pulsar
● https://medium.com/@tspann/using-the-new-spring-boot-apache-pulsar-integratio
n-8a38447dce7b
Spring + Pulsar References
● https://spring.io/guides/gs/spring-boot/
● https://spring.io/projects/spring-amqp/
● https://spring.io/projects/spring-kafka/
● https://github.com/spring-projects/spring-integration-kafka
● https://github.com/spring-projects/spring-integration
● https://github.com/spring-projects/spring-data-relational
● https://github.com/spring-projects/spring-kafka
● https://github.com/spring-projects/spring-amqp
Spring Things
36
Apache
Pulsar
in Action
Please enjoy David’s complete
book which is the ultimate
guide to Pulsar.
@PaaSDev
https://www.linkedin.com/in/timothyspann
https://github.com/tspannhw
37
Tim Spann
Developer Advocate
at StreamNative
Notices
Apache Pulsar™
Apache®, Apache Pulsar™, Pulsar™, Apache Flink®, Flink®, Apache Spark®, Spark®, Apache
NiFi®, NiFi® and the logo are either registered trademarks or trademarks of the Apache
Software Foundation in the United States and/or other countries. No endorsement by The
Apache Software Foundation is implied by the use of these marks.
Copyright © 2021-2022 The Apache Software Foundation. All Rights Reserved. Apache,
Apache Pulsar and the Apache feather logo are trademarks of The Apache Software
Foundation.

More Related Content

Similar to Let's keep it simple and streaming (20)

PDF
Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)
Timothy Spann
 
PDF
[March sn meetup] apache pulsar + apache nifi for cloud data lake
Timothy Spann
 
PDF
Python Web Conference 2022 - Apache Pulsar Development 101 with Python (FLiP-Py)
Timothy Spann
 
PDF
Python web conference 2022 apache pulsar development 101 with python (f li-...
Timothy Spann
 
PDF
Machine Intelligence Guild_ Build ML Enhanced Event Streaming Applications wi...
Timothy Spann
 
PDF
Devfest uk & ireland using apache nifi with apache pulsar for fast data on-r...
Timothy Spann
 
PDF
Deep Dive into Building Streaming Applications with Apache Pulsar
Timothy Spann
 
PDF
Princeton Dec 2022 Meetup_ NiFi + Flink + Pulsar
Timothy Spann
 
PDF
OSS EU: Deep Dive into Building Streaming Applications with Apache Pulsar
Timothy Spann
 
PDF
PhillyJug Getting Started With Real-time Cloud Native Streaming With Java
Timothy Spann
 
PDF
Apache Pulsar Development 101 with Python
Timothy Spann
 
PDF
Big mountain data and dev conference apache pulsar with mqtt for edge compu...
Timothy Spann
 
PDF
Unified Messaging and Data Streaming 101
Timothy Spann
 
PDF
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
Timothy Spann
 
PDF
(Current22) Let's Monitor The Conditions at the Conference
Timothy Spann
 
PDF
Let’s Monitor Conditions at the Conference With Timothy Spann & David Kjerrum...
HostedbyConfluent
 
PDF
CODEONTHEBEACH_Streaming Applications with Apache Pulsar
Timothy Spann
 
PDF
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
Altinity Ltd
 
PDF
OSA Con 2022: Streaming Data Made Easy
Timothy Spann
 
PDF
Using the FLiPN Stack for Edge AI (Flink, NiFi, Pulsar)
Timothy Spann
 
Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)
Timothy Spann
 
[March sn meetup] apache pulsar + apache nifi for cloud data lake
Timothy Spann
 
Python Web Conference 2022 - Apache Pulsar Development 101 with Python (FLiP-Py)
Timothy Spann
 
Python web conference 2022 apache pulsar development 101 with python (f li-...
Timothy Spann
 
Machine Intelligence Guild_ Build ML Enhanced Event Streaming Applications wi...
Timothy Spann
 
Devfest uk & ireland using apache nifi with apache pulsar for fast data on-r...
Timothy Spann
 
Deep Dive into Building Streaming Applications with Apache Pulsar
Timothy Spann
 
Princeton Dec 2022 Meetup_ NiFi + Flink + Pulsar
Timothy Spann
 
OSS EU: Deep Dive into Building Streaming Applications with Apache Pulsar
Timothy Spann
 
PhillyJug Getting Started With Real-time Cloud Native Streaming With Java
Timothy Spann
 
Apache Pulsar Development 101 with Python
Timothy Spann
 
Big mountain data and dev conference apache pulsar with mqtt for edge compu...
Timothy Spann
 
Unified Messaging and Data Streaming 101
Timothy Spann
 
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
Timothy Spann
 
(Current22) Let's Monitor The Conditions at the Conference
Timothy Spann
 
Let’s Monitor Conditions at the Conference With Timothy Spann & David Kjerrum...
HostedbyConfluent
 
CODEONTHEBEACH_Streaming Applications with Apache Pulsar
Timothy Spann
 
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
Altinity Ltd
 
OSA Con 2022: Streaming Data Made Easy
Timothy Spann
 
Using the FLiPN Stack for Edge AI (Flink, NiFi, Pulsar)
Timothy Spann
 

More from Timothy Spann (20)

PDF
14May2025_TSPANN_FromAirQualityUnstructuredData.pdf
Timothy Spann
 
PDF
Streaming AI Pipelines with Apache NiFi and Snowflake NYC 2025
Timothy Spann
 
PDF
2025-03-03-Philly-AAAI-GoodData-Build Secure RAG Apps With Open LLM
Timothy Spann
 
PDF
Conf42_IoT_Dec2024_Building IoT Applications With Open Source
Timothy Spann
 
PDF
2024 Dec 05 - PyData Global - Tutorial Its In The Air Tonight
Timothy Spann
 
PDF
2024Nov20-BigDataEU-RealTimeAIWithOpenSource
Timothy Spann
 
PDF
TSPANN-2024-Nov-CloudX-Adding Generative AI to Real-Time Streaming Pipelines
Timothy Spann
 
PDF
2024-Nov-BuildStuff-Adding Generative AI to Real-Time Streaming Pipelines
Timothy Spann
 
PDF
14 November 2024 - Conf 42 - Prompt Engineering - Codeless Generative AI Pipe...
Timothy Spann
 
PDF
2024 Nov 05 - Linux Foundation TAC TALK With Milvus
Timothy Spann
 
PPTX
tspann06-NOV-2024_AI-Alliance_NYC_ intro to Data Prep Kit and Open Source RAG
Timothy Spann
 
PDF
tspann08-Nov-2024_PyDataNYC_Unstructured Data Processing with a Raspberry Pi ...
Timothy Spann
 
PDF
2024-10-28 All Things Open - Advanced Retrieval Augmented Generation (RAG) Te...
Timothy Spann
 
PDF
10-25-2024_BITS_NYC_Unstructured Data and LLM_ What, Why and How
Timothy Spann
 
PDF
2024-OCT-23 NYC Meetup - Unstructured Data Meetup - Unstructured Halloween
Timothy Spann
 
PDF
DBTA Round Table with Zilliz and Airbyte - Unstructured Data Engineering
Timothy Spann
 
PDF
17-October-2024 NYC AI Camp - Step-by-Step RAG 101
Timothy Spann
 
PDF
11-OCT-2024_AI_101_CryptoOracle_UnstructuredData
Timothy Spann
 
PDF
2024-10-04 - Grace Hopper Celebration Open Source Day - Stefan
Timothy Spann
 
PDF
01-Oct-2024_PES-VectorDatabasesAndAI.pdf
Timothy Spann
 
14May2025_TSPANN_FromAirQualityUnstructuredData.pdf
Timothy Spann
 
Streaming AI Pipelines with Apache NiFi and Snowflake NYC 2025
Timothy Spann
 
2025-03-03-Philly-AAAI-GoodData-Build Secure RAG Apps With Open LLM
Timothy Spann
 
Conf42_IoT_Dec2024_Building IoT Applications With Open Source
Timothy Spann
 
2024 Dec 05 - PyData Global - Tutorial Its In The Air Tonight
Timothy Spann
 
2024Nov20-BigDataEU-RealTimeAIWithOpenSource
Timothy Spann
 
TSPANN-2024-Nov-CloudX-Adding Generative AI to Real-Time Streaming Pipelines
Timothy Spann
 
2024-Nov-BuildStuff-Adding Generative AI to Real-Time Streaming Pipelines
Timothy Spann
 
14 November 2024 - Conf 42 - Prompt Engineering - Codeless Generative AI Pipe...
Timothy Spann
 
2024 Nov 05 - Linux Foundation TAC TALK With Milvus
Timothy Spann
 
tspann06-NOV-2024_AI-Alliance_NYC_ intro to Data Prep Kit and Open Source RAG
Timothy Spann
 
tspann08-Nov-2024_PyDataNYC_Unstructured Data Processing with a Raspberry Pi ...
Timothy Spann
 
2024-10-28 All Things Open - Advanced Retrieval Augmented Generation (RAG) Te...
Timothy Spann
 
10-25-2024_BITS_NYC_Unstructured Data and LLM_ What, Why and How
Timothy Spann
 
2024-OCT-23 NYC Meetup - Unstructured Data Meetup - Unstructured Halloween
Timothy Spann
 
DBTA Round Table with Zilliz and Airbyte - Unstructured Data Engineering
Timothy Spann
 
17-October-2024 NYC AI Camp - Step-by-Step RAG 101
Timothy Spann
 
11-OCT-2024_AI_101_CryptoOracle_UnstructuredData
Timothy Spann
 
2024-10-04 - Grace Hopper Celebration Open Source Day - Stefan
Timothy Spann
 
01-Oct-2024_PES-VectorDatabasesAndAI.pdf
Timothy Spann
 
Ad

Recently uploaded (20)

PPTX
In From the Cold: Open Source as Part of Mainstream Software Asset Management
Shane Coughlan
 
PPTX
Foundations of Marketo Engage - Powering Campaigns with Marketo Personalization
bbedford2
 
PDF
Dipole Tech Innovations – Global IT Solutions for Business Growth
dipoletechi3
 
PPTX
OpenChain @ OSS NA - In From the Cold: Open Source as Part of Mainstream Soft...
Shane Coughlan
 
PDF
Wondershare PDFelement Pro Crack for MacOS New Version Latest 2025
bashirkhan333g
 
PDF
AI Prompts Cheat Code prompt engineering
Avijit Kumar Roy
 
PDF
MiniTool Power Data Recovery 8.8 With Crack New Latest 2025
bashirkhan333g
 
PDF
Open Chain Q2 Steering Committee Meeting - 2025-06-25
Shane Coughlan
 
PDF
UITP Summit Meep Pitch may 2025 MaaS Rebooted
campoamor1
 
PDF
4K Video Downloader Plus Pro Crack for MacOS New Download 2025
bashirkhan333g
 
PPTX
Finding Your License Details in IBM SPSS Statistics Version 31.pptx
Version 1 Analytics
 
PDF
[Solution] Why Choose the VeryPDF DRM Protector Custom-Built Solution for You...
Lingwen1998
 
PPTX
Customise Your Correlation Table in IBM SPSS Statistics.pptx
Version 1 Analytics
 
PPTX
Get Started with Maestro: Agent, Robot, and Human in Action – Session 5 of 5
klpathrudu
 
PPTX
Home Care Tools: Benefits, features and more
Third Rock Techkno
 
PPTX
Function & Procedure: Function Vs Procedure in PL/SQL
Shani Tiwari
 
PDF
AI + DevOps = Smart Automation with devseccops.ai.pdf
Devseccops.ai
 
PPTX
AEM User Group: India Chapter Kickoff Meeting
jennaf3
 
PPTX
Smart Doctor Appointment Booking option in odoo.pptx
AxisTechnolabs
 
PPTX
Agentic Automation: Build & Deploy Your First UiPath Agent
klpathrudu
 
In From the Cold: Open Source as Part of Mainstream Software Asset Management
Shane Coughlan
 
Foundations of Marketo Engage - Powering Campaigns with Marketo Personalization
bbedford2
 
Dipole Tech Innovations – Global IT Solutions for Business Growth
dipoletechi3
 
OpenChain @ OSS NA - In From the Cold: Open Source as Part of Mainstream Soft...
Shane Coughlan
 
Wondershare PDFelement Pro Crack for MacOS New Version Latest 2025
bashirkhan333g
 
AI Prompts Cheat Code prompt engineering
Avijit Kumar Roy
 
MiniTool Power Data Recovery 8.8 With Crack New Latest 2025
bashirkhan333g
 
Open Chain Q2 Steering Committee Meeting - 2025-06-25
Shane Coughlan
 
UITP Summit Meep Pitch may 2025 MaaS Rebooted
campoamor1
 
4K Video Downloader Plus Pro Crack for MacOS New Download 2025
bashirkhan333g
 
Finding Your License Details in IBM SPSS Statistics Version 31.pptx
Version 1 Analytics
 
[Solution] Why Choose the VeryPDF DRM Protector Custom-Built Solution for You...
Lingwen1998
 
Customise Your Correlation Table in IBM SPSS Statistics.pptx
Version 1 Analytics
 
Get Started with Maestro: Agent, Robot, and Human in Action – Session 5 of 5
klpathrudu
 
Home Care Tools: Benefits, features and more
Third Rock Techkno
 
Function & Procedure: Function Vs Procedure in PL/SQL
Shani Tiwari
 
AI + DevOps = Smart Automation with devseccops.ai.pdf
Devseccops.ai
 
AEM User Group: India Chapter Kickoff Meeting
jennaf3
 
Smart Doctor Appointment Booking option in odoo.pptx
AxisTechnolabs
 
Agentic Automation: Build & Deploy Your First UiPath Agent
klpathrudu
 
Ad

Let's keep it simple and streaming

  • 1. Let's keep it simple and streaming Tim Spann Developer Advocate
  • 2. Tim Spann Developer Advocate ● FLiP(N) Stack = Flink, Pulsar and NiFi Stack ● Streaming Systems/ Data Architect ● Experience: ○ 15+ years of experience with batch and streaming technologies including Pulsar, Flink, Spark, NiFi, Spring, Java, Big Data, Cloud, MXNet, Hadoop, Datalakes, IoT and more.
  • 5. Proprietary & Confidential | Agenda 5 ● Introduction ● What is Apache Pulsar? ● Spring Apps ● Pulsar ● AMQP ● MQTT ● Kafka ● Demo
  • 6. Proprietary & Confidential | 6 Apache Pulsar has a vibrant community 560+ Contributors 10,000+ Commits 7,000+ Slack Members 1,000+ Organizations Using Pulsar
  • 8. Proprietary & Confidential | Streaming Consumer Consumer Consumer Subscription Shared Failover Consumer Consumer Subscription In case of failure in Consumer B-0 Consumer Consumer Subscription Exclusive X Consumer Consumer Key-Shared Subscription Pulsar Topic/Partition Messaging 8
  • 9. Tenants / Namespaces / Topics Tenants (Compliance) Tenants (Data Services) Namespace (Microservices) Topic-1 (Cust Auth) Topic-1 (Location Resolution) Topic-2 (Demographics) Topic-1 (Budgeted Spend) Topic-1 (Acct History) Topic-1 (Risk Detection) Namespace (ETL) Namespace (Campaigns) Namespace (ETL) Tenants (Marketing) Namespace (Risk Assessment) Pulsar Cluster
  • 10. Proprietary & Confidential | Messages - the basic unit of Pulsar 10 Component Description Value / data payload The data carried by the message. All Pulsar messages contain raw bytes, although message data can also conform to data schemas. Key Messages are optionally tagged with keys, used in partitioning and also is useful for things like topic compaction. Properties An optional key/value map of user-defined properties. Producer name The name of the producer who produces the message. If you do not specify a producer name, the default name is used. Sequence ID Each Pulsar message belongs to an ordered sequence on its topic. The sequence ID of the message is its order in that sequence.
  • 11. Pulsar Cluster ● “Bookies” ● Stores messages and cursors ● Messages are grouped in segments/ledgers ● A group of bookies form an “ensemble” to store a ledger ● “Brokers” ● Handles message routing and connections ● Stateless, but with caches ● Automatic load-balancing ● Topics are composed of multiple segments ● ● Stores metadata for both Pulsar and BookKeeper ● Service discovery Store Messages Metadata & Service Discovery Metadata & Service Discovery Metadata Storage
  • 12. Producer-Consumer Producer Consumer Publisher sends data and doesn't know about the subscribers or their status. All interactions go through Pulsar and it handles all communication. Subscriber receives data from publisher and never directly interacts with it Topic Topic
  • 13. Apache Pulsar: Messaging vs Streaming Message Queueing - Queueing systems are ideal for work queues that do not require tasks to be performed in a particular order. Streaming - Streaming works best in situations where the order of messages is important.
  • 14. Pulsar Subscription Modes Different subscription modes have different semantics: Exclusive/Failover - guaranteed order, single active consumer Shared - multiple active consumers, no order Key_Shared - multiple active consumers, order for given key Producer 1 Producer 2 Pulsar Topic Subscription D Consumer D-1 Consumer D-2 Key-Shared < K 1, V 10 > < K 1, V 11 > < K 1, V 12 > < K 2 ,V 2 0 > < K 2 ,V 2 1> < K 2 ,V 2 2 > Subscription C Consumer C-1 Consumer C-2 Shared < K 1, V 10 > < K 2, V 21 > < K 1, V 12 > < K 2 ,V 2 0 > < K 1, V 11 > < K 2 ,V 2 2 > Subscription A Consumer A Exclusive Subscription B Consumer B-1 Consumer B-2 In case of failure in Consumer B-1 Failover
  • 15. Flexible Pub/Sub API for Pulsar - Shared Consumer consumer = client.newConsumer() .topic("my-topic") .subscriptionName("work-q-1") .subscriptionType(SubType.Shared) .subscribe();
  • 16. Flexible Pub/Sub API for Pulsar - Failover Consumer consumer = client.newConsumer() .topic("my-topic") .subscriptionName("stream-1") .subscriptionType(SubType.Failover) .subscribe();
  • 17. Proprietary & Confidential | Data Offloaders (Tiered Storage) Client Libraries StreamNative Pulsar ecosystem hub.streamnative.io Connectors (Sources & Sinks) Protocol Handlers Pulsar Functions (Lightweight Stream Processing) Processing Engines … and more! … and more!
  • 21. Schema Registry Schema Registry schema-1 (value=Avro/Protobuf/JSON) schema-2 (value=Avro/Protobuf/JSON) schema-3 (value=Avro/Protobuf/JSON) Schema Data ID Local Cache for Schemas + Schema Data ID + Local Cache for Schemas Send schema-1 (value=Avro/Protobuf/JSON) data serialized per schema ID Send (register) schema (if not in local cache) Read schema-1 (value=Avro/Protobuf/JSON) data deserialized per schema ID Get schema by ID (if not in local cache) Producers Consumers
  • 24. Pulsar - Spring - Code @Autowired private PulsarTemplate<Observation> pulsarTemplate; this.pulsarTemplate.setSchema(Schema. JSON(Observation.class)); MessageId msgid = pulsarTemplate.newMessage(observation) .withMessageCustomizer((mb) -> mb.key( uuidKey.toString())) .send(); @PulsarListener(subscriptionName = "aq-spring-reader", subscriptionType = Shared, schemaType = SchemaType. JSON, topics = "persistent://public/default/aq-pm25") void echoObservation(Observation message) { this.log.info("PM2.5 Message received: {}", message); }
  • 25. Pulsar - Spring - Configuration spring: pulsar: client: service-url: pulsar+ssl://sn-academy.sndevadvocate.snio.cloud:6651 auth-plugin-class-name: org.apache.pulsar.client.impl.auth.oauth2.AuthenticationOAuth2 authentication: issuer-url: https://auth.streamnative.cloud/ private-key: file:///scr/sndevadvocate-tspann.json audience: urn:sn:pulsar:sndevadvocate:my-instance producer: batching-enabled: false send-timeout-ms: 90000 producer-name: airqualityjava topic-name: persistent://public/default/airquality
  • 26. Spring - Pulsar as Kafka https://www.baeldung.com/spring-kafka @Bean public KafkaTemplate<String, Observation> kafkaTemplate() { KafkaTemplate<String, Observation> kafkaTemplate = new KafkaTemplate<String, Observation>(producerFactory()); return kafkaTemplate; } ProducerRecord<String, Observation> producerRecord = new ProducerRecord<>(topicName, uuidKey.toString(), message); kafkaTemplate.send(producerRecord);
  • 27. Spring - MQTT - Pulsar https://roytuts.com/publish-subscribe-message-onto-mqtt-using-spring/ @Bean public IMqttClient mqttClient( @Value("${mqtt.clientId}") String clientId, @Value("${mqtt.hostname}") String hostname, @Value("${mqtt.port}") int port) throws MqttException { IMqttClient mqttClient = new MqttClient( "tcp://" + hostname + ":" + port, clientId); mqttClient.connect(mqttConnectOptions()); return mqttClient; } MqttMessage mqttMessage = new MqttMessage(); mqttMessage.setPayload(DataUtility.serialize(payload)); mqttMessage.setQos(0); mqttMessage.setRetained(true); mqttClient.publish(topicName, mqttMessage);
  • 28. Spring - AMQP - Pulsar https://www.baeldung.com/spring-amqp rabbitTemplate.convertAndSend(topicName, DataUtility.serializeToJSON(observation)); @Bean public CachingConnectionFactory connectionFactory() { CachingConnectionFactory ccf = new CachingConnectionFactory(); ccf.setAddresses(serverName); return ccf; }
  • 31. REST + Spring Boot + Pulsar + Friends
  • 32. FLiP Stack Weekly This week in Apache Flink, Apache Pulsar, Apache NiFi, Apache Spark, Java and Open Source friends. https://bit.ly/32dAJft
  • 33. Demo
  • 34. ● https://streamnative.io/blog/engineering/2022-11-29-spring-into-pulsar-part-2-spri ng-based-microservices-for-multiple-protocols-with-apache-pulsar/ ● https://streamnative.io/blog/release/2022-09-21-announcing-spring-for-apache-pu lsar/ ● https://docs.spring.io/spring-pulsar/docs/current-SNAPSHOT/reference/html/ ● https://spring.io/blog/2022/08/16/introducing-experimental-spring-support-for-apa che-pulsar ● https://medium.com/@tspann/using-the-new-spring-boot-apache-pulsar-integratio n-8a38447dce7b Spring + Pulsar References
  • 35. ● https://spring.io/guides/gs/spring-boot/ ● https://spring.io/projects/spring-amqp/ ● https://spring.io/projects/spring-kafka/ ● https://github.com/spring-projects/spring-integration-kafka ● https://github.com/spring-projects/spring-integration ● https://github.com/spring-projects/spring-data-relational ● https://github.com/spring-projects/spring-kafka ● https://github.com/spring-projects/spring-amqp Spring Things
  • 36. 36 Apache Pulsar in Action Please enjoy David’s complete book which is the ultimate guide to Pulsar.
  • 38. Notices Apache Pulsar™ Apache®, Apache Pulsar™, Pulsar™, Apache Flink®, Flink®, Apache Spark®, Spark®, Apache NiFi®, NiFi® and the logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks. Copyright © 2021-2022 The Apache Software Foundation. All Rights Reserved. Apache, Apache Pulsar and the Apache feather logo are trademarks of The Apache Software Foundation.