March 2024 · Kafka Summit London
Real-time Geospatial Aircraft Monitoring Using Apache Kafka
Bhaarat Sharma – CTO & Co-Founder Raft
Neil Buesing – CTO & Co-Founder Kinetic Edge
We bridge the gap between humans and data through radical transparency and our obsession with the mission.
Inspired by: Perishable Insights, Mike Gualtieri, Forrester
Data Catalog
Data Pipelines
SQL over Kafka (backed by Pinot)
Fine-grained access to dashboards
Architecture
Architecture - Focus for today
Legacy Data Formats
Challenge 1
Message Format XML
● XML ⇒ JSON ✔
● JSON ⇒ XML ✖
consumers expected XML (& valid against XSD schema)
● Additional Challenges
○ marshaling bytes into XML
○ schema validation speed
○ thread safety
Message Format XML
● Custom Serializers - when XML parsing is "read-only"
○ XMLWrapper(byte[] bytes, Document document)
○ wrapper.serialize(xmlWrapper)
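A minimal sketch of the read-only wrapper idea, using only the JDK. The class shape follows the slide's `XMLWrapper(byte[] bytes, Document document)`; the `deserialize` helper and the accessor are assumptions added for illustration. In a real application the two static methods would likely sit behind Kafka's `Serializer` and `Deserializer` interfaces.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

/** Keeps the original payload next to its parsed form (illustrative sketch). */
final class XMLWrapper {
    private final byte[] bytes;       // payload exactly as it arrived from the topic
    private final Document document;  // parsed view, intended for read-only lookups

    XMLWrapper(byte[] bytes, Document document) {
        this.bytes = bytes;
        this.document = document;
    }

    /** Parse once, but hold on to the raw bytes (hypothetical helper). */
    static XMLWrapper deserialize(byte[] bytes) {
        try {
            // New factory and builder per call: simple and safe, though not the
            // fastest option, because DocumentBuilder instances are not thread-safe.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(bytes));
            return new XMLWrapper(bytes, doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("payload is not well-formed XML", e);
        }
    }

    /** No marshaling on the way out: the untouched original bytes are returned. */
    static byte[] serialize(XMLWrapper wrapper) {
        return wrapper.bytes;
    }

    Document document() {
        return document;
    }
}
```

The trade-off is memory: every record carries both the bytes and the DOM, which only pays off while the document is never modified.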
Message Format XML
● XOM
○ Slightly better thread safety
○ Faster Implementation
○ Still built on existing Parser technologies
■ so thread safety is still not fully solved
Message Format XML
● XSD Validation
○ Only on incoming or outgoing topics.
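One way to keep XSD validation both fast and thread-safe with the JDK's `javax.xml.validation` API is sketched below. The `XsdGate` class name and the string-based inputs are assumptions for illustration, not the speakers' implementation. The key point is that a compiled `Schema` can be shared across threads, while a `Validator` cannot.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

/** Validates XML at the edge of the pipeline (hypothetical helper). */
final class XsdGate {
    private final Schema schema; // compiled once; Schema objects are thread-safe

    XsdGate(String xsd) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            this.schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalArgumentException("invalid XSD", e);
        }
    }

    /** Returns true when the document is well-formed and valid against the schema. */
    boolean isValid(String xml) {
        Validator validator = schema.newValidator(); // not thread-safe, so one per call
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }
}
```

Calling `isValid` only where data enters or leaves the system keeps the schema cost off the internal hops.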
Message Format XML
● Staying with XML is easier than converting back to XML.
● Optimize read-only access for speed by compromising on storage.
● Lessons learned here would apply to other formats; they are just more obvious with XML.
Throughput and Latency
Challenge 2
100k msgs/sec
150+ Sensors
Multiple Sensor Fusion Engines
Enhanced Search
Configurations - Topics
Challenge 2 – Throughput and Latency
● 12 partitions
○ even distribution across availability zones (÷3)
○ even consumer workloads
■ 1/2/3/4/6/12 (6)
● Currently evaluating 24 for some topics
○ also evenly distributed across availability zones (÷3)
○ 1/2/3/4/6/8/12/24 (8)
■ 30->1/2/3/5/6/10/15/30 (8)
■ 36->1/2/3/4/6/9/12/18/36 (9)
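The lists above are the divisors of each partition count: every divisor is a consumer-group size at which each consumer owns the same number of partitions. A small sketch for checking other partition counts (the class and method names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

/** Lists the consumer counts that split a topic's partitions evenly. */
final class PartitionDivisors {
    static List<Integer> evenConsumerCounts(int partitions) {
        List<Integer> counts = new ArrayList<>();
        for (int consumers = 1; consumers <= partitions; consumers++) {
            // An even workload means no remainder when partitions are dealt out.
            if (partitions % consumers == 0) {
                counts.add(consumers);
            }
        }
        return counts;
    }
}
```

For example, `PartitionDivisors.evenConsumerCounts(24)` returns the eight sizes listed for 24 partitions. A count that is also divisible by 3 keeps the spread across three availability zones even.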
Configurations - Producer
Challenge 2 – Throughput and Latency
● batch.size=200_000
○ or more
● linger.ms=10-50
○ balance of latency & throughput
● compression.type=lz4
○ Always test your data against your compression
○ Never compress compressed data
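As a sketch, these settings could be collected in a plain `Properties` object and passed to a producer. This assumes the 200_000 figure refers to the producer's `batch.size` setting; the `linger.ms` value of 20 is just one point in the slide's 10-50 range, and the helper name is hypothetical.

```java
import java.util.Properties;

/** Producer settings aimed at throughput (illustrative values, tune per workload). */
final class ProducerTuning {
    static Properties throughputProps(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        // Larger batches mean fewer, bigger requests (assumed to be the slide's 200_000).
        props.put("batch.size", "200000");
        // Wait up to 20 ms for a batch to fill: trades a little latency for throughput.
        props.put("linger.ms", "20");
        // lz4 is the slide's choice; measure against real payloads before committing.
        props.put("compression.type", "lz4");
        return props;
    }
}
```

Serializer settings are left out on purpose; they depend on the message format in use.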
Configurations - Consumer
Challenge 2 – Throughput and Latency
● 12 partitions
○ even distribution across availability zones (÷3)
○ even consumer workloads
■ 1/2/3/4/6/12
● max.partition.fetch.bytes & fetch.max.bytes
○ adding partitions can increase latency (especially if number of consumers isn't increased)
Configurations - Performance
Challenge 2 – Throughput and Latency
● Start with the producer
● Then the partitions
● Kafka Streams
○ State Store - Caching (Disable for Latency)
○ Commit Interval (Reduce for Visibility and Latency)
○ Threading - depends on topology & number of containers.
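A hedged sketch of how the Kafka Streams items above map to configuration keys. The helper name and the chosen values are assumptions; `cache.max.bytes.buffering` is the long-standing key for the record cache, and newer releases also offer `statestore.cache.max.bytes`.

```java
import java.util.Properties;

/** Kafka Streams settings that favor latency over batching (illustrative values). */
final class StreamsLatencyTuning {
    static Properties lowLatencyProps(String applicationId, String bootstrapServers, int threads) {
        Properties props = new Properties();
        props.put("application.id", applicationId);
        props.put("bootstrap.servers", bootstrapServers);
        // Disable state store caching so each update is forwarded downstream immediately.
        props.put("cache.max.bytes.buffering", "0");
        // Commit more often so results become visible sooner.
        props.put("commit.interval.ms", "100");
        // Threads per container; the right number depends on topology and container count.
        props.put("num.stream.threads", Integer.toString(threads));
        return props;
    }
}
```

Disabling the cache increases downstream traffic, so it is worth measuring both latency and load after the change.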
Authentication and Authorization
Challenge 3
Authentication - Keycloak
Challenge 3 – Authentication & Authorization
● OAuth Callback Handler
○ Kafka 3.2.1+
○ self-signed certificates ...
● Custom Callback Handler
○ error handling considerations
● librdkafka (Non-Java)
○ Leverage callback handler doing RESTful operation (custom callback)
Authorization – Open Policy Agent
Challenge 3 – Authentication & Authorization
● OPA Rego ☞ reminds me of Prolog
● .rego examples
consumer_operations = {
    "TOPIC" : [ "READ", "DESCRIBE" ],
    "GROUP" : [ "READ", "DESCRIBE" ]
}
is_consumer_group {
    user_groups(principal)[_] == topic_consumer_groups(topic_name)[_]
}
allow_consumer_group {
    is_group_resource
    is_consumer_operation
    startswith(group_name, concat("-", [principal, ""]))
}
Pattern of life analysis – Warm Data
Challenge 4
Source: ADSB Exchange
Custom Kafka REST API
Challenge 4 – Pattern of Life Analysis
● One of the easiest parts to build
○ Producer
■ linger.ms - major impact - choose wisely
○ Consumer -- not RESTful ☞ Websockets
● One of the easiest mistakes to make - waiting...
○ producer flushing
○ linger.ms
○ Callback vs. Waiting on Future
Leverage the framework, e.g. Spring's DeferredResult
● HTTP Response Codes
○ 200 - OK
○ 201 - Created (try to avoid using this one, but some clients....)
○ 202 - Accepted
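The "callback vs. waiting on a future" point can be sketched with the JDK alone. Here `AsyncSender` is a hypothetical stand-in for an asynchronous send that reports completion through a callback, and the 500 status for failures is an assumption, not something the slides specify. A framework type such as Spring's DeferredResult would play the role of the `CompletableFuture`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.BiConsumer;

/** Completes the HTTP status from the send callback instead of blocking a request thread. */
final class NonBlockingPublish {

    /** Hypothetical async send: the callback receives (offset, error). */
    interface AsyncSender {
        void send(String record, BiConsumer<Long, Exception> callback);
    }

    static CompletableFuture<Integer> publish(AsyncSender sender, String record) {
        CompletableFuture<Integer> status = new CompletableFuture<>();
        sender.send(record, (offset, error) -> {
            // Runs when the send finishes; no thread sat waiting in the meantime.
            if (error == null) {
                status.complete(200);   // acknowledged
            } else {
                status.complete(500);   // assumed failure mapping
            }
        });
        return status;
    }
}
```

If the endpoint answers before the acknowledgement arrives, 202 - Accepted is the more honest status than 200.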
Real Time Data – Hot Data
Challenge 5
Source: ADSB Exchange
Websocket - Consumer
Challenge 5 – Real Time Data – Hot Data
● consumer.subscribe() vs. consumer.assign()
● handling backpressure
● tried Java 21 and Virtual Threads - did not help...
● 2 implementations
○ web-socket consumption drains queue
○ 30 second eviction (independent of consumption)
● garbage collection
Diagram: topic → poll() → LinkedBlockingQueue → push() → websocket
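A minimal sketch of the queue between the consumer thread and a websocket, combining the two ideas above: a bounded `LinkedBlockingQueue` that the websocket drains, plus age-based eviction that runs regardless of consumption. The class, its drop-oldest policy, and the explicit clock parameter are assumptions for illustration.

```java
import java.util.concurrent.LinkedBlockingQueue;

/** Buffer between poll() and a websocket push (illustrative sketch). */
final class WebsocketBuffer {
    private static final class Timed {
        final String payload;
        final long enqueuedAtMs;
        Timed(String payload, long enqueuedAtMs) {
            this.payload = payload;
            this.enqueuedAtMs = enqueuedAtMs;
        }
    }

    private final LinkedBlockingQueue<Timed> queue;
    private final long maxAgeMs;

    WebsocketBuffer(int capacity, long maxAgeMs) {
        this.queue = new LinkedBlockingQueue<>(capacity);
        this.maxAgeMs = maxAgeMs;
    }

    /** Consumer thread, after poll(): when full, drop the oldest entry rather than block. */
    void offer(String payload, long nowMs) {
        Timed entry = new Timed(payload, nowMs);
        while (!queue.offer(entry)) {
            queue.poll(); // backpressure policy (assumed): discard the stalest message
        }
    }

    /** Evicts entries older than maxAgeMs, independent of websocket consumption. */
    int evictExpired(long nowMs) {
        int evicted = 0;
        Timed head;
        while ((head = queue.peek()) != null && nowMs - head.enqueuedAtMs > maxAgeMs) {
            if (queue.remove(head)) { // false if the websocket thread took it first
                evicted++;
            }
        }
        return evicted;
    }

    /** Websocket thread: next payload to push, or null when nothing is buffered. */
    String next() {
        Timed entry = queue.poll();
        return entry == null ? null : entry.payload;
    }

    int size() {
        return queue.size();
    }
}
```

With `new WebsocketBuffer(capacity, 30_000)`, a scheduled task calling `evictExpired(System.currentTimeMillis())` would give the 30 second eviction; a slow client then loses old positions instead of stalling the consumer.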
Websocket - Consumer
Challenge 5 – Real Time Data – Hot Data
● Data Mashing over Websocket
○ XML - not great
○ JSON - better
○ Apache Arrow - best
Diagram: topic → poll() (consumer thread) → LinkedBlockingQueue → push() (one thread per websocket) → websocket
Real-Time Data Enrichment
Challenge 6
Kafka Streams
Challenge 6 – Real Time Data Enrichment
● Avoid initial rekeying (trust but verify); every rekey adds latency.
filter((k, v) -> {
    // verify the incoming key instead of rekeying
    if (!k.equals(v.id())) {
        log.error("invalid...");
        return false;
    }
    return true;
})
● Global Tables / KTables
Kafka Streams
Challenge 6 – Real Time Data Enrichment
● Commit Interval
○ 100ms - 5000ms
● Threads
○ Containers vs Stream Threads
● If scaling down and up (and not using static membership)
internal.leave.group.on.close = true
● If scaling down and up (and static membership + Kafka Streams 3.3+)
KafkaStreams.CloseOptions closeOptions = new KafkaStreams.CloseOptions()
    .timeout(SHUTDOWN)   // SHUTDOWN: a java.time.Duration
    .leaveGroup(true);
streams.close(closeOptions);
Websocket Demo
Thank you
Questions
