ZooKeeper vs etcd 3 vs Other Distributed Coordination Systems
ETCD v3
API
● RAFT - Consensus algorithm
● Put
● Get
● Range - Get the values of all keys from one key up to another key
● Transactions - Read, compare, modify, write combinations
● Watch - On a key or a range. Streaming API
● Example : https://coreos.com/etcd/docs/latest/rfc/v3api.html
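The operations listed above can be pictured with a small in-memory model. The sketch below is a toy written for this comparison, not etcd's client API: the class name ToyKV and its method signatures are assumptions, and it has no Raft, persistence, or networking. It only shows how put, get, a half-open range, a compare-then-write transaction, and a range watch relate to a store-wide revision counter.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical callback shape: (revision, key, value)
WatchCallback = Callable[[int, str, str], None]


@dataclass
class ToyKV:
    """In-memory stand-in for a revisioned key-value store (not etcd itself)."""
    data: Dict[str, str] = field(default_factory=dict)
    revision: int = 0
    watchers: List[Tuple[str, str, WatchCallback]] = field(default_factory=list)

    def put(self, key: str, value: str) -> int:
        # Every successful write bumps the store-wide revision.
        self.revision += 1
        self.data[key] = value
        # Notify every watcher whose half-open interval [start, end) covers the key.
        for start, end, callback in self.watchers:
            if start <= key < end:
                callback(self.revision, key, value)
        return self.revision

    def get(self, key: str) -> Optional[str]:
        return self.data.get(key)

    def range(self, start: str, end: str) -> List[Tuple[str, str]]:
        # Keys in the half-open interval [start, end), sorted by key.
        return sorted((k, v) for k, v in self.data.items() if start <= k < end)

    def watch(self, start: str, end: str, callback: WatchCallback) -> None:
        self.watchers.append((start, end, callback))

    def txn(self, key: str, expected: Optional[str], new_value: str) -> bool:
        # Compare, then write, as one step: write only if the current value matches.
        if self.data.get(key) != expected:
            return False
        self.put(key, new_value)
        return True


kv = ToyKV()
seen = []
# "job0" is the smallest string greater than every key starting with "job/".
kv.watch("job/", "job0", lambda rev, k, v: seen.append((rev, k, v)))
kv.put("job/a", "queued")
kv.put("other", "x")
kv.txn("job/a", "queued", "running")
```

In this toy, the watcher only sees the two writes under the "job/" prefix, and the transaction succeeds only because the stored value still equals the expected one.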
Guarantees
● Atomicity
● Consistency
● Sequential Consistency https://en.wikipedia.org/wiki/Sequential_consistency
● Serializable Isolation
● Durability
● Linearizability (Except for watches)
● References:
○ Guarantees - https://coreos.com/etcd/docs/latest/api_v3.html#kv-api-guarantees
Pros
● Has a Java client
● Has a distributed lock implementation in the v3 client. Will be moved to the server side in the 3.2 release.
● Incremental snapshots - Avoid pauses when creating snapshots.
● No garbage collection pauses - Off-heap storage
Pros ...
● Performance of etcd 3 compared with ZooKeeper (snapshots disabled)
● Low latency
● Low storage usage
● Watchers are redesigned, replacing the older event model with one that
streams and multiplexes events over key intervals.
● gRPC is 2x faster than JSON parsing in etcd2
● Leases for TTL. Leases are also multiplexed in a single stream.
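The lease idea can be sketched as follows: instead of giving each key its own TTL, keys attach to a lease, and one keep-alive refreshes all of them. This is a toy model under stated assumptions, not etcd's lease API: the ToyLeases class, its method names, and the manual clock advanced by tick() are all hypothetical and exist only to make the behaviour deterministic.

```python
from typing import Dict, Set


class ToyLeases:
    """Toy model of lease-based TTLs: keys attach to a lease and expire with it."""

    def __init__(self) -> None:
        self.now = 0.0                      # manual clock, advanced by tick()
        self.ttl: Dict[int, float] = {}     # lease id -> granted TTL in seconds
        self.expiry: Dict[int, float] = {}  # lease id -> time at which it expires
        self.keys: Dict[int, Set[str]] = {} # lease id -> keys attached to it
        self.data: Dict[str, str] = {}
        self._next_id = 1

    def grant(self, ttl: float) -> int:
        lease_id = self._next_id
        self._next_id += 1
        self.ttl[lease_id] = ttl
        self.expiry[lease_id] = self.now + ttl
        self.keys[lease_id] = set()
        return lease_id

    def put(self, key: str, value: str, lease_id: int) -> None:
        if lease_id not in self.expiry:
            raise KeyError("unknown or expired lease")
        self.data[key] = value
        self.keys[lease_id].add(key)

    def keep_alive(self, lease_id: int) -> None:
        # One refresh extends the lifetime of every key attached to the lease.
        self.expiry[lease_id] = self.now + self.ttl[lease_id]

    def tick(self, seconds: float) -> None:
        # Advance the clock and drop every lease (and its keys) that has expired.
        self.now += seconds
        expired = [lid for lid, t in self.expiry.items() if t <= self.now]
        for lease_id in expired:
            for key in self.keys.pop(lease_id):
                self.data.pop(key, None)
            del self.expiry[lease_id]
            del self.ttl[lease_id]


leases = ToyLeases()
lease = leases.grant(ttl=10)
leases.put("svc/1", "up", lease)
leases.put("svc/2", "up", lease)
leases.tick(6)
leases.keep_alive(lease)   # both keys now live until t = 16
leases.tick(6)             # t = 12, both keys still present
leases.tick(10)            # t = 22, the lease and both keys are gone
```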
Pros
● Unlike ZooKeeper or Consul, which return one event per watch request, etcd can
continuously watch from the current revision.
● Multiplexes watches on a single connection.
● ZooKeeper loses old events, while etcd3 keeps a sliding window of old
events, so a client that disconnects does not lose the events that occurred before it
reconnects.
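The sliding-window point can be illustrated with a bounded event log that a reconnecting client queries by its last-seen revision. This is a hedged sketch, not etcd's watch implementation: ToyEventLog, events_since, and the fixed-size window are assumptions made for illustration, and the error raised when the requested revision has already been dropped stands in for whatever a real server would return.

```python
from collections import deque
from typing import Deque, List, Tuple

Event = Tuple[int, str, str]  # (revision, key, value)


class ToyEventLog:
    """Keeps the most recent `window` events so a client can resume a watch."""

    def __init__(self, window: int) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        self.events: Deque[Event] = deque(maxlen=window)
        self.revision = 0

    def put(self, key: str, value: str) -> int:
        self.revision += 1
        # With maxlen set, appending silently drops the oldest event when full.
        self.events.append((self.revision, key, value))
        return self.revision

    def events_since(self, last_seen: int) -> List[Event]:
        """Return events with revision > last_seen, or fail if some were dropped."""
        oldest = self.events[0][0] if self.events else self.revision + 1
        if last_seen + 1 < oldest:
            raise LookupError("requested revision is no longer in the window")
        return [e for e in self.events if e[0] > last_seen]


log = ToyEventLog(window=3)
for i in range(5):
    log.put("config", str(i))
# A client that last saw revision 3 catches up on revisions 4 and 5.
missed = log.events_since(3)
```

A client that was away for longer than the window (here, one that last saw revision 1) gets an error instead of a silently incomplete history, and has to re-read the current state.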
Cons
● Note that the client may be uncertain about the status of an operation if it times out, or if there is a network disruption between the client and the etcd member. etcd may also abort operations when there is a leader election. etcd does not send abort responses to clients’ outstanding requests in this event.
● In a network split, the minority side may still serve (serializable) read requests.
Cons
● The Java client does not support watches yet. It is immature and not well
tested.
● Serializable read requests will continue to be served in a network split where
the minority partition has the leader at the time of the split.
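The difference between the two kinds of read during a network split can be sketched with a single member that knows how many peers it can reach. This is a toy under stated assumptions, not etcd code: ToyMember, its reachable counter, and the use of TimeoutError are hypothetical. It only shows that a serializable read is answered from local state and may be stale, while a linearizable read needs a quorum.

```python
from typing import Dict, Optional


class ToyMember:
    """One cluster member with a local copy of the data (illustrative only)."""

    def __init__(self, cluster_size: int) -> None:
        self.cluster_size = cluster_size
        self.reachable = cluster_size   # members it can talk to, including itself
        self.local: Dict[str, str] = {}

    def has_quorum(self) -> bool:
        return self.reachable >= self.cluster_size // 2 + 1

    def read(self, key: str, serializable: bool = False) -> Optional[str]:
        if serializable:
            # Answered from local state without contacting peers; may be stale.
            return self.local.get(key)
        if not self.has_quorum():
            # A linearizable read must be confirmed by a majority first.
            raise TimeoutError("no quorum: linearizable read cannot complete")
        return self.local.get(key)


member = ToyMember(cluster_size=5)
member.local["flag"] = "old"
member.reachable = 2                                 # stranded on the minority side
stale = member.read("flag", serializable=True)       # still answers from local state
```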
References
● etcd CTO’s presentation and slides
● Complete API of etcd
● etcd blog post
Apache Curator (ZooKeeper)
Reference Material
● Zookeeper Tech Notes
● Usually ~10K write ops. More importantly, write speed does not scale
as you increase the number of servers. Read speed scales.
● ZAB for consensus.
● Even though Paxos is beautifully elegant in describing the essence of
distributed consensus, the absence of a comprehensive and prescriptive
specification has rendered it inaccessible and notoriously difficult to
implement in practical systems.
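One way to see why adding servers does not speed up writes in a majority-based protocol such as ZAB: every write must be acknowledged by a majority, and the majority grows with the ensemble. The sketch below is plain arithmetic, not ZooKeeper code, and the function names are chosen for illustration.

```python
def quorum(servers: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    if servers < 1:
        raise ValueError("need at least one server")
    return servers // 2 + 1


def tolerated_failures(servers: int) -> int:
    """How many servers can fail while a majority is still available."""
    return servers - quorum(servers)


# Acknowledgements needed per write, and failures survived, for common sizes.
table = {n: (quorum(n), tolerated_failures(n)) for n in (3, 4, 5, 7)}
```

Going from 3 to 4 servers raises the quorum from 2 to 3 without tolerating any extra failure, which is one reason odd ensemble sizes are the usual choice.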
Pros
● Non-blocking full snapshots (to make eventually consistent)
● Efficient memory management.
● Reliable; has been around for a long time.
● A simplified API
● Automatic ZooKeeper connection management with retries
● Complete, well-tested implementations of ZooKeeper recipes
● A framework that makes writing new ZooKeeper recipes much easier.
● Event support
Pros ...
● In a network partition, both the minority and majority partitions will start a leader
election. The minority partition cannot form a quorum, so it will stop operations.
● In the above scenario, the watchers registered in the minority partition will be
notified with a “KeeperState.Disconnected” event, so they can reconnect
to the operating partition later.
Cons
● Snapshots (where the data is written to disk, if enabled) cause ZooKeeper's
performance to vary and sometimes cause pauses (leader election + snapshot
creation).
Cons ...
● Garbage Collection
● Pauses when creating snapshots
Consul
Overview
● Uses Serf, a solution for cluster membership, failure detection and
orchestration.
● Broadcasts custom events.
● Gossip protocol for communication.
● Client-to-server communication uses RPC
● Complete architecture
● Ability to fire and listen for events
Features
● Has a distributed key-value (KV) store for storing the service database.
● Provides comprehensive service health checking using both built-in solutions
and user-provided custom solutions.
● Provides a REST-based HTTP API for interaction.
● Service database can be queried using DNS.
● Does dynamic load balancing.
● Supports single data center and can be scaled to support multiple data
centers.
● Integrates well with Docker.
Pros
● Has a Java client
● Supports multiple data centers
● Focused on service discovery
● Consul also uses clients to run health checks, allowing developers to create
large clusters without concentrating load on a small set of servers.
Cons
● Uses multicasting and unicasting for member discovery. This can flood the
network.
Hazelcast
Topology
● “One of the main features of Hazelcast is that it does not have a master
member. Each cluster member is configured to be the same in terms of
functionality. The oldest member (the first member created in the cluster)
automatically performs the data assignment to cluster members. If the oldest
member dies, the second oldest member takes over.”
● “Lite members are intended for use in computationally-heavy task executions
and listener registrations. Although they do not own any partitions, they can
access partitions that are owned by other members in the cluster.”
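The two quoted behaviours, succession by age and lite members owning no partitions, can be sketched with a list kept in join order. This is an illustrative toy, not Hazelcast's API or its real assignment algorithm: ToyCluster, its methods, and the round-robin assignment are assumptions.

```python
from typing import Dict, List, Optional, Tuple


class ToyCluster:
    """Members kept in join order; the oldest one performs data assignment."""

    def __init__(self) -> None:
        self.members: List[Tuple[str, bool]] = []  # (name, is_lite), oldest first

    def join(self, name: str, lite: bool = False) -> None:
        self.members.append((name, lite))

    def leave(self, name: str) -> None:
        self.members = [m for m in self.members if m[0] != name]

    def oldest(self) -> Optional[str]:
        # If the oldest member leaves, the second oldest is now first in the list.
        return self.members[0][0] if self.members else None

    def assign(self, partition_count: int) -> Dict[int, str]:
        # Lite members are skipped: they can access partitions but never own one.
        owners = [name for name, lite in self.members if not lite]
        if not owners:
            raise RuntimeError("no data members to own partitions")
        # Round-robin is a stand-in for the real assignment logic.
        return {p: owners[p % len(owners)] for p in range(partition_count)}


cluster = ToyCluster()
cluster.join("a")
cluster.join("b")
cluster.join("c", lite=True)
before = cluster.assign(4)   # only "a" and "b" own partitions
cluster.leave("a")           # "b" becomes the oldest member and takes over
after = cluster.assign(4)
```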
Pros
● Java Client
● Distributed implementations of java.util.{Queue, Set, List, Map}.
● Distributed implementation of java.util.concurrent.locks.Lock.
● Distributed implementation of java.util.concurrent.ExecutorService.
● Distributed MultiMap for one-to-many relationships.
● Distributed Topic for publish/subscribe messaging.
● Distributed Query, MapReduce and Aggregators.
● Synchronous (write-through) and asynchronous (write-behind) persistence.
● Transaction support.
● Specification compliant JCache implementation.
● Native Java, .NET, C++ clients, Memcache and REST clients.
● Socket level encryption support for secure clusters.
● Second level cache provider for Hibernate.
● Monitoring and management of the cluster via JMX.
● Dynamic HTTP session clustering.
● Support for cluster info and membership events.
● Dynamic discovery, scaling, partitioning with backups and fail-over.
Pros ...
● Has inbuilt Event Listeners. We can write new listeners as well.
● Awesome docs
● Almost all the features we want are inbuilt.
● No external dependencies. 1 jar. Written in Java.
● Peer-to-peer
● Keeps a backup of each data entry on multiple members
Cons
● USES ONLY HEAP MEMORY. NO PERSISTENT STORAGE SUPPORT IN THE OPEN
SOURCE EDITION.
● Does sharding, but the docs say that they are keeping redundant copies.
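Sharding and redundant copies are not in conflict: each key maps to one partition, and each partition has an owner plus backups on other members. The sketch below shows that idea under stated assumptions. The CRC32 hash and the "next members in the list" placement are stand-ins chosen for determinism, not Hazelcast's actual hashing or placement; the defaults of 271 partitions and 1 backup mirror Hazelcast's documented defaults.

```python
import zlib
from typing import List


def partition_id(key: str, partition_count: int = 271) -> int:
    """Map a key to a partition. CRC32 is an illustrative hash, not Hazelcast's."""
    return zlib.crc32(key.encode("utf-8")) % partition_count


def replicas(pid: int, members: List[str], backup_count: int = 1) -> List[str]:
    """Return [owner, backup1, ...] for a partition, all on distinct members."""
    if backup_count >= len(members):
        raise ValueError("need more members than backups")
    # Owner is chosen by partition id; backups are the next members in the list.
    return [members[(pid + i) % len(members)] for i in range(backup_count + 1)]


members = ["m1", "m2", "m3"]
pid = partition_id("user:42")
owner, backup = replicas(pid, members, backup_count=1)
```

Each key lives in exactly one partition (the shard), while the partition itself is stored on more than one member (the redundant copy).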
What to choose?
● Depends heavily on the requirements and the developer ecosystem in use.
○ For Java-based/related environments, ZooKeeper will be better.
○ For Go-related environments, etcd will be better.
● If you need other services like service discovery, Consul or Hazelcast will be
better.
● This presentation is intended to list out pros and cons. Since this presentation
was made in the last quarter of 2016, these technologies/tools may have
changed a lot by now.
○ For example, etcd has been massively improved throughout.
Thank you!
