natans@wix.com twitter @NSilnitsky linkedin/natansilnitsky github.com/natansil
Exactly Once Delivery
is a harsh mistress
Natan Silnitsky
Backend Infra Developer, Wix.com
A Scala/Java high-level SDK for Apache Kafka.
Greyhound
As it happens,
we have a distributed system at Wix.
Edit MySite
~1400 micro-services
~1015M messages every day
@NSilnitsky
PurchaseCompleted
Classic ecommerce flow
UpdateInventory(ItemN)
updates
UpdateInventory(Item1)
UpdateInventory(Item2)
Payments Checkout Inventory
Classic ecommerce flow - hard to achieve
PurchaseCompleted
UpdateInventory(ItemN)
updates
UpdateInventory(Item1)
UpdateInventory(Item2)
😳 But... how do we update every item exactly once, even on failures and restarts?
Payments Checkout Inventory
PurchaseCompleted
updates
UpdateInventory(Item1)
UpdateInventory(Item2)
Classic ecommerce flow - hard to achieve
Inventory:
Item1 9 → 8
Item2 5 → 4
PurchaseCompleted
updates
UpdateInventory(Item1)
UpdateInventory(Item2)
Classic ecommerce flow - hard to achieve
Inventory:
Item1 9 → 7
Item2 5 → 3
Payments
* use DB
Achieving Exactly-Once delivery in
distributed systems is NOT easy.
Message delivery over the network
is unreliable.
Message
Consumer
Message
Producer
Broker
Kafka Broker
Topic-1, Topic-2
Partitions
(append-only logs)
0 1 2 3 4 5
Kafka
Consumer
Kafka
Producer
Kafka Broker
Topic-1
0 1 2 3 4 5
The options for
message delivery
Why Exactly-Once
is difficult
Kafka’s solution for
Exactly-Once
delivery
Kafka Broker
Kafka
Producer
0 1 2 3 4 5
😕 Services need to address message duplicates
at-most-once at-least-once exactly-once
Producer retries on
every failure
Kafka Broker
0 1 2 3 4 5
😕 Services need to handle messages exactly once
consumerRecords = consumer.poll
process(consumerRecords)
consumer.commit
at-most-once at-least-once exactly-once
Consumer retries
on every failure
Kafka
Consumer
Kafka Broker
0 1 2 3 4 5
😕 Messages may be lost
Consumer commits
before processing
(AutoCommit - Default)
consumerRecords = consumer.poll
consumer.commit
process(consumerRecords)
at-most-once at-least-once exactly-once
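The two commit orderings shown on the consumer slides can be compared with a small in-memory simulation. This is a sketch, not Kafka client code: `CommitOrderDemo`, `run`, and the single simulated crash are hypothetical names introduced for illustration, the log is a plain list, and the committed offset is a plain integer.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory stand-in for one partition and one consumer (not the Kafka API).
class CommitOrderDemo {
    /**
     * Consumes the log one record at a time and returns the records that were processed.
     * The consumer "crashes" once, between its two steps on the record at crashAtOffset,
     * and then restarts from the last committed offset.
     */
    static List<String> run(List<String> log, boolean commitBeforeProcess, int crashAtOffset) {
        List<String> processed = new ArrayList<>();
        int committed = 0;        // next offset to read after a restart
        boolean crashed = false;  // the crash happens only once
        int position = committed;
        while (position < log.size()) {
            String record = log.get(position);
            boolean crashNow = !crashed && position == crashAtOffset;
            if (commitBeforeProcess) {
                committed = position + 1;   // consumer.commit
                if (crashNow) {             // crash before processing: record is lost
                    crashed = true;
                    position = committed;
                    continue;
                }
                processed.add(record);      // process(record)
            } else {
                processed.add(record);      // process(record)
                if (crashNow) {             // crash before commit: record is redone
                    crashed = true;
                    position = committed;
                    continue;
                }
                committed = position + 1;   // consumer.commit
            }
            position++;
        }
        return processed;
    }
}
```

With the log A, B, C and a crash at offset 1, commit-before-process yields A, C (B is lost), while process-before-commit yields A, B, B, C (B is handled twice).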
Kafka
Consumer
Kafka Broker
0 1 2 3 4 5
Kafka
Consumer
😕 Really really really hard to do
Messages are read and processed
exactly once by the consumer
?
at-most-once at-least-once exactly-once
The options for
message delivery
Why Exactly-Once
is difficult
Kafka’s solution for
Exactly-Once
delivery
The “Two Generals Problem”
thought experiment
Alice Bob
Tomorrow
7 AM
The “Two Generals Problem”
thought experiment
Alice Bob
Tomorrow
7 AM
OK!
Tomorrow
7 AM
The “Two Generals Problem”
thought experiment
Alice Bob
?
The “Two Generals Problem”
thought experiment
Alice Bob
OK!
The “Two Generals Problem”
thought experiment
AliceService BobService
The options for
message delivery
Why Exactly-Once
is difficult
Kafka’s solution for
Exactly-Once
delivery
Kafka Broker
Processor Observer
Consumer-Producer
Consumer
Topic A Topic B
Kafka Broker
Kafka
Idempotent
Producer
0 1 2 3 4 5
0 1
1
Attaches sequence number to
message
enable.idempotence = true
duplicate!
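How a broker can spot the duplicate is sketched below as a toy model. It assumes each producer has an id and numbers its records with an increasing sequence; `DedupPartition` and its methods are hypothetical names, and the real broker logic is stricter (for example, it also rejects gaps in the sequence).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy broker-side deduplication for a single partition (not Kafka's implementation).
class DedupPartition {
    private final List<String> log = new ArrayList<>();
    // Highest sequence number appended so far, per producer id.
    private final Map<Long, Integer> lastSequence = new HashMap<>();

    /** Appends the record unless this producer already wrote this sequence number. */
    boolean append(long producerId, int sequence, String value) {
        int last = lastSequence.getOrDefault(producerId, -1);
        if (sequence <= last) {
            return false; // a retry of a record the partition already holds: drop it
        }
        log.add(value);
        lastSequence.put(producerId, sequence);
        return true;
    }

    /** Returns a read-only snapshot of the partition contents. */
    List<String> records() {
        return List.copyOf(log);
    }
}
```

A producer that retries sequence 1 after a lost acknowledgement gets `false` back, and the log still holds that record only once.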
Kafka Broker
Topic A Topic B
Processor Observer
Consumer-Producer
Consumer
Kafka Broker
consumer.poll
Attaches offset to message +
marks transaction
ABCDEF
Processor Observer
AB
01
Topic A Topic B
isolation.level = "read_committed"
Kafka Broker
consumer.poll
producer.beginTransaction
producer.send
Attaches offset to message +
marks transaction
ABCDEF
Processor Observer
ABC
01
Topic A Topic B
Kafka Broker
consumer.poll
producer.beginTransaction
producer.send
producer.sendOffsets
Attaches offset to message +
marks transaction
ABCDEF
Processor Observer
ABC
012
Topic A Topic B
Kafka Broker
consumer.poll
producer.beginTransaction
producer.send
producer.sendOffsets
producer.commitTransaction
Attaches offset to message +
marks transaction
ABCDEF
Processor Observer
ABC
012
Topic A Topic B
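The effect of the commit step on the Observer can be sketched with a toy topic whose records only become visible to a "read_committed" reader once their transaction is committed. `TxTopic` and its methods are hypothetical names for illustration and are not the Kafka client API. The sketch leaves out offset commits inside the transaction, and unlike a real consumer it does not stop at the first still-open transaction.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of Topic B: records carry the id of the transaction that wrote them.
class TxTopic {
    private static final class Entry {
        final String txId;
        final String value;

        Entry(String txId, String value) {
            this.txId = txId;
            this.value = value;
        }
    }

    private final List<Entry> log = new ArrayList<>();
    private final Set<String> committed = new HashSet<>();

    /** Appends a record as part of the given transaction. */
    void send(String txId, String value) {
        log.add(new Entry(txId, value));
    }

    /** Marks the transaction as committed, making its records visible. */
    void commit(String txId) {
        committed.add(txId);
    }

    /** With readCommitted = true, records of uncommitted transactions are skipped. */
    List<String> poll(boolean readCommitted) {
        List<String> visible = new ArrayList<>();
        for (Entry entry : log) {
            if (!readCommitted || committed.contains(entry.txId)) {
                visible.add(entry.value);
            }
        }
        return visible;
    }
}
```

If the Processor fails before committing, the records it already sent stay in the log but a read_committed Observer never sees them.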
Kafka Broker
consumer.poll
producer.beginTransaction
producer.send
producer.sendOffsets
Attaches offset to message +
marks transaction
ABCDEF
Processor Observer
ABC
01
Topic A Topic B
Kafka Broker
Throughput is between 3% and 25% worse.
The bigger the batch, the better the throughput and the longer the latency.
Processor Observer
Kafka Broker
Processor Observer
No end-to-end exactly-once guarantees
out of the box.
Kafka Broker
Deduplicate using offsets from Kafka Transaction
ABC
012
Topic B
Observer
DB Table
Partition Offset Value
0 2 C
... ... ...
In reality, this is more complex
than how I describe it.
Out of scope:
▪ Two-phase-commit with transaction coordinator
▪ Transaction log
▪ Additional “fencing” data
▪ 1 processor-producer - 1 partition (up to Kafka 2.4.x)
Exactly Once in Kafka Streams
StreamsConfig:
processing.guarantee = "exactly_once"
consumers will be configured with:
● isolation.level = "read_committed"
producers will be configured with:
● enable.idempotence = true
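As a sketch, the settings above can be written as plain Java `Properties`. The keys and values are the ones on the slide; the class and method names are hypothetical, and required settings such as the application id and bootstrap servers are left out. Newer Kafka releases deprecate "exactly_once" in favor of "exactly_once_v2", so check which value your version expects.

```java
import java.util.Properties;

// Collects the exactly-once settings named on the slide as string properties.
class ExactlyOnceSettings {
    /** The single setting an application passes to Kafka Streams. */
    static Properties streams() {
        Properties props = new Properties();
        props.setProperty("processing.guarantee", "exactly_once");
        return props;
    }

    /** The consumer setting that Kafka Streams then applies internally. */
    static Properties consumer() {
        Properties props = new Properties();
        props.setProperty("isolation.level", "read_committed");
        return props;
    }

    /** The producer setting that Kafka Streams then applies internally. */
    static Properties producer() {
        Properties props = new Properties();
        props.setProperty("enable.idempotence", "true");
        return props;
    }
}
```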
Exactly Once (Classic ecommerce) flow at Wix
PurchaseCompleted
UpdateInventory(Item1)
UpdateInventory(Item2)
UpdateInventory(ItemN)
updates
Kafka Broker
Exactly Once (Classic ecommerce) flow at Wix
- Dedup
Topic B
Observer
Inventory
Partition Offset Item Value
0 0 I1 9 → 8
0 1 I2 5 → 4
I1 I2
0 1
UPDATE offset 0, Item1 - 1
UPDATE offset 1, Item2 - 1
Kafka Broker
Exactly Once (Classic ecommerce) flow at Wix
- Dedup
Topic B
Observer
Inventory
Partition Offset Item Value
0 0 I1 8
0 1 I2 4
I1 I2
0 1
UPDATE offset 0, Item1 - 1
UPDATE offset 1, Item2 - 1
UPDATE offset 1, Item2 - 1
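The dedup step can be sketched as a store that remembers the last applied offset per partition and ignores any update at or below it. `InventoryStore` and its methods are hypothetical names, and the maps stand in for the database table. In a real database the stock change and the offset write would need to happen in one DB transaction.

```java
import java.util.HashMap;
import java.util.Map;

// Toy inventory table keyed by item, plus the last applied offset per partition.
class InventoryStore {
    private final Map<String, Integer> stock = new HashMap<>();
    private final Map<Integer, Long> lastOffset = new HashMap<>();

    InventoryStore(Map<String, Integer> initialStock) {
        stock.putAll(initialStock);
    }

    /** Decrements the item once per (partition, offset); replayed updates are ignored. */
    synchronized boolean decrement(int partition, long offset, String item) {
        long last = lastOffset.getOrDefault(partition, -1L);
        if (offset <= last) {
            return false; // this offset was already applied: skip the duplicate
        }
        stock.merge(item, -1, Integer::sum);
        lastOffset.put(partition, offset);
        return true;
    }

    /** Returns the current stock of the item, or 0 if it is unknown. */
    int stockOf(String item) {
        return stock.getOrDefault(item, 0);
    }
}
```

Replaying the update at offset 1 leaves Item2 at 4 instead of dropping it to 3.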
Exactly-Once delivery is...
like the holy grail of message delivery over the network.
It’s a tough nut to crack.
Exactly-Once delivery is...
complex to implement, to use, and it requires fine-tuning.
Exactly-Once delivery is...
crucial for achieving atomic actions in distributed systems.
Thank You
natans@wix.com twitter @NSilnitsky linkedin/natansilnitsky github.com/natansil
Resources
EoS in Kafka talk by Jason Gustafson
Exactly-once Semantics are Possible by Neha Narkhede (Performance)
Transactions in Apache Kafka
Revisiting Exactly Once Semantics by Jason Gustafson, Confluent
Proposal to improve EOS Produce scalability - Kafka 2.5
How Akka works with exactly once message delivery by Hugh McKee
A Scala/Java high-level SDK for Apache Kafka.
github.com/wix/greyhound
Slides & More
slideshare.net/NatanSilnitsky
medium.com/@natansil
twitter.com/NSilnitsky
natansil.com

Exactly Once Delivery with Kafka - Kafka Tel-Aviv Meetup