Curator
The Netflix ZooKeeper Client Library




Jordan Zimmerman
Senior Platform Engineer
Netflix, Inc.
jzimmerman@netflix.com
@randgalt
Agenda
• Background
• Overview of Curator
• The Recipes
• Some Low-Level Details
• Q&A
Background
What’s wrong with this code?

ZooKeeper client = new ZooKeeper(...);

client.create("/foo", data, ...);
ZooKeeper Surprise
• Almost no ZK client call is safe
• You cannot assume success
• You must handle exceptions
The Recipes Are Hard
Locks
Fully distributed locks that are globally synchronous, meaning at any snapshot in time no two clients think they hold the same lock. These can be implemented using ZooKeeper. As with priority queues, first define a lock node.

Note
There now exists a Lock implementation in ZooKeeper recipes directory. This is distributed with the release -- src/recipes/lock directory of the release artifact.

Clients wishing to obtain a lock do the following:

  1. Call create() with a pathname of "_locknode_/guid-lock-" and the sequence and ephemeral flags set. The guid is needed in case the create() result is missed. See the note below.
  2. Call getChildren() on the lock node without setting the watch flag (this is important to avoid the herd effect).
  3. If the pathname created in step 1 has the lowest sequence number suffix, the client has the lock and the client exits the protocol.
  4. The client calls exists() with the watch flag set on the path in the lock directory with the next lowest sequence number.
  5. If exists() returns false, go to step 2. Otherwise, wait for a notification for the pathname from the previous step before going to step 2.
The unlock protocol is very simple: clients wishing to release a lock simply delete the node they created in step 1.
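
Steps 3 and 4 come down to ordering the lock node's children by their sequence suffix. The following is a minimal sketch in plain Java with no ZooKeeper dependency. The class and method names are hypothetical, and it assumes child names end in the zero-padded counter that ZooKeeper appends to sequential nodes, after the last hyphen.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical helper for illustration; not part of ZooKeeper or Curator.
final class LockOrderSketch
{
    // Parse the sequence suffix, assuming names look like "<guid>-lock-0000000007".
    static long sequenceOf(String childName)
    {
        return Long.parseLong(childName.substring(childName.lastIndexOf('-') + 1));
    }

    // Step 3: the client holds the lock if its node has the lowest sequence number.
    static boolean holdsLock(List<String> children, String myNode)
    {
        long mine = sequenceOf(myNode);
        return children.stream()
            .mapToLong(LockOrderSketch::sequenceOf)
            .allMatch(seq -> seq >= mine);
    }

    // Step 4: otherwise watch only the node that sits directly ahead of ours.
    static Optional<String> nodeToWatch(List<String> children, String myNode)
    {
        long mine = sequenceOf(myNode);
        return children.stream()
            .filter(name -> sequenceOf(name) < mine)
            .max(Comparator.comparingLong(LockOrderSketch::sequenceOf));
    }
}
```

Watching only the immediate predecessor, rather than the whole lock node, is what keeps a release from waking every waiting client.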

Here are a few things to notice:

  •    The removal of a node will only cause one client to wake up since each node is watched by exactly one client. In this way, you avoid the
       herd effect.
  •    There is no polling or timeouts.
  •    Because of the way you implement locking, it is easy to see the amount of lock contention, break locks, debug locking problems, etc.
Recoverable Errors and the GUID
  •    If a recoverable error occurs calling create() the client should call getChildren() and check for a node containing the guid used in the
       path name. This handles the case (noted above) of the create() succeeding on the server but the server crashing before returning the
       name of the new node.
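
The recovery check described here can be sketched as a lookup over the result of getChildren(). This is an illustration only: the helper name is hypothetical, and the children are passed in as a plain list so the example runs without a ZooKeeper server.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper for illustration; not part of ZooKeeper or Curator.
final class GuidRecoverySketch
{
    // After a recoverable error from create(), look for a child that already
    // carries our guid. Present: the create succeeded on the server.
    // Empty: no such node exists, so create() can be called again.
    static Optional<String> findCreatedNode(List<String> children, String guid)
    {
        return children.stream()
            .filter(name -> name.contains(guid))
            .findFirst();
    }
}
```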
Even the Distribution Has Issues
from org.apache.zookeeper.recipes.lock.WriteLock
if (id == null) {
    long sessionId = zookeeper.getSessionId();
    String prefix = "x-" + sessionId + "-";
    // lets try look up the current ID if we failed
    // in the middle of creating the znode
    findPrefixInChildren(prefix, zookeeper, dir);
    idName = new ZNodeName(id);
}
Bad handling of Ephemeral-Sequential issue!
What About ZKClient?
•   Unclear if it’s still being supported
    Eleven open issues (back to 10/1/2009)
•   README:
    “+ TBD”
•   No docs
•   Little or no retries
•   Design problems:
          •   All exceptions converted to RuntimeException
          •   Recipes/management code highly coupled
          •   Lots of foreground synchronization
          •   Small number of tests
          •   ... etc ...
•   ...
Introducing Curator
Curator n ˈkyo͝orˌātər: a keeper or custodian of a museum or other collection - A ZooKeeper Keeper
Three components:
  Client - A replacement/wrapper for the bundled ZooKeeper class
  Framework - A high-level API that greatly simplifies using ZooKeeper
  Recipes - Implementations of some of the common ZooKeeper "recipes" built on top of the Curator Framework
Overview of Curator
The Curator Stack
• Client
• Framework
• Recipes
• Extensions
[Diagram: the stack, top to bottom]
Curator Recipes
Curator Framework
Curator Client
ZooKeeper
Curator is a platform for writing ZooKeeper Recipes
Curator Client manages the ZooKeeper Connection
Curator Framework uses retry for all operations and provides a friendlier API
Curator Recipes: implementations of all recipes listed on the ZK website (and more)
The Recipes
• Leader Selector
• Distributed Locks
• Queues
• Barriers
• Counters
• Atomics
• ...
CuratorFramework Instance
CuratorFrameworkFactory.newClient(...)
---------------------
CuratorFrameworkFactory.builder()
   .connectString("...")
   ...
   .build()

Usually injected as a singleton
Must Be Started

client.start();


// client is now ready for use
Leader Selector
By far the most common usage of ZooKeeper

Distributed lock with a notification mechanism
Sample
public class CleanupLeader implements LeaderSelectorListener
{
    ...
    @Override
    public void takeLeadership(CuratorFramework client) throws Exception
    {
        // Leadership is held until this method returns
        while ( !Thread.currentThread().isInterrupted() )
        {
            sleepUntilNextPeriod();
            doPeriodicCleanup();
        }
    }
}

...
LeaderSelector leaderSelector =
    new LeaderSelector(client, path, new CleanupLeader());
leaderSelector.start();
Distributed Locks
• InterProcessMutex
• InterProcessReadWriteLock
• InterProcessMultiLock
• InterProcessSemaphore
Very similar to JDK locks
Sample
InterProcessMutex mutex =
    new InterProcessMutex(client, lockPath);

mutex.acquire();
try
{
    // do work in critical section
}
finally
{
    mutex.release();
}
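
For comparison, here is the same acquire / try / finally shape with an in-process JDK lock. This is only a sketch of the analogy using java.util.concurrent.locks.ReentrantLock; the class name and counter are made up for the example. One difference worth keeping in mind is that the distributed version talks to a server, so acquiring can fail in ways a local lock cannot.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// In-process analogue of the InterProcessMutex sample, using only the JDK.
final class JdkLockAnalogue
{
    private final Lock lock = new ReentrantLock();
    private int counter = 0;

    int incrementSafely()
    {
        lock.lock();              // plays the role of mutex.acquire()
        try
        {
            return ++counter;     // work in the critical section
        }
        finally
        {
            lock.unlock();        // plays the role of mutex.release()
        }
    }
}
```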
Low-Level Details
public void process(WatchedEvent event)
{
    boolean wasConnected = isConnected.get();
    boolean newIsConnected = wasConnected;
    if ( event.getType() == Watcher.Event.EventType.None )
    {
        newIsConnected = (event.getState() == Event.KeeperState.SyncConnected);
        if ( event.getState() == Event.KeeperState.Expired )
        {
            handleExpiredSession();
        }
    }

    if ( newIsConnected != wasConnected )
    {
        isConnected.set(newIsConnected);
        connectionStartMs = System.currentTimeMillis();
    }

     ...
}
public static boolean shouldRetry(int rc)
{
    return (rc == KeeperException.Code.CONNECTIONLOSS.intValue()) ||
        (rc == KeeperException.Code.OPERATIONTIMEOUT.intValue()) ||
        (rc == KeeperException.Code.SESSIONMOVED.intValue()) ||
        (rc == KeeperException.Code.SESSIONEXPIRED.intValue());
}

public void takeException(Exception exception) throws Exception
{
    boolean rethrow = true;
    if ( isRetryException(exception) )
    {
        if ( retryPolicy.allowRetry(retryCount++, System.currentTimeMillis() - startTimeMs) )
        {
            rethrow = false;
        }
    }

    if ( rethrow )
    {
        throw exception;
    }
}
byte[]      responseData = RetryLoop.callWithRetry
(
    client.getZookeeperClient(),
    new Callable<byte[]>()
    {
        @Override
        public byte[] call() throws Exception
        {
            byte[]      responseData;
            responseData = client.getZooKeeper().getData(path,
               ...);
            return responseData;
        }
    }
);
return responseData;
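
A loop in the spirit of RetryLoop.callWithRetry can be pictured as follows. This is a simplified sketch and not Curator's implementation: RetryPolicy and RecoverableException are hypothetical stand-ins defined inside the example, and RecoverableException plays the role of the error codes that shouldRetry() accepts.

```java
import java.util.concurrent.Callable;

// Hypothetical, simplified stand-ins; not Curator's actual classes.
final class RetryLoopSketch
{
    // Decides whether another attempt is allowed.
    interface RetryPolicy
    {
        boolean allowRetry(int retryCount, long elapsedMs);
    }

    // Stand-in for a recoverable ZooKeeper error such as connection loss.
    static final class RecoverableException extends Exception
    {
        RecoverableException(String message)
        {
            super(message);
        }
    }

    static <T> T callWithRetry(RetryPolicy policy, Callable<T> proc) throws Exception
    {
        long startMs = System.currentTimeMillis();
        int retryCount = 0;
        while ( true )
        {
            try
            {
                return proc.call();
            }
            catch ( RecoverableException e )
            {
                // Same decision as takeException(): rethrow unless the policy allows a retry.
                if ( !policy.allowRetry(retryCount++, System.currentTimeMillis() - startMs) )
                {
                    throw e;
                }
            }
            // Any other exception is not caught here and propagates to the caller.
        }
    }
}
```

Keeping the policy behind an interface lets callers choose how many attempts to allow without touching the loop itself.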
client.withProtectedEphemeralSequential()
final AtomicBoolean     firstTime = new AtomicBoolean(true);
String                  returnPath = RetryLoop.callWithRetry
(
    client.getZookeeperClient(),
    new Callable<String>()
    {
        @Override
        public String call() throws Exception
        {
           ...

               String createdPath = null;
               if ( !firstTime.get() && doProtectedEphemeralSequential )
               {
                   createdPath = findProtectedNodeInForeground(localPath);
               }
             ...
         }
     }
);
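
The fragment above looks for an existing node only on retries. Why that matters can be shown with an in-memory simulation: the first create succeeds on the "server" but the reply is lost, so the retry has to check for its own node before creating a second one. Every name here is hypothetical, the fake server loses exactly one reply, and the unbounded loop is a simplification that a real client would put under a retry policy.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory simulation; every name here is hypothetical.
final class ProtectedCreateSketch
{
    static final class ReplyLostException extends Exception
    {
    }

    // Fake server: stores the node, then loses the reply on the first call only.
    static final class FakeServer
    {
        final List<String> children = new ArrayList<>();
        private boolean dropNextReply = true;

        String createSequential(String prefix) throws ReplyLostException
        {
            String name = String.format("%s%010d", prefix, children.size());
            children.add(name);
            if ( dropNextReply )
            {
                dropNextReply = false;
                throw new ReplyLostException();
            }
            return name;
        }
    }

    static String protectedCreate(FakeServer server, String guid)
    {
        boolean firstTime = true;
        while ( true )
        {
            if ( !firstTime )
            {
                // On a retry, look for a node that an earlier attempt already created.
                for ( String child : server.children )
                {
                    if ( child.contains(guid) )
                    {
                        return child;
                    }
                }
            }
            firstTime = false;
            try
            {
                return server.createSequential(guid + "-lock-");
            }
            catch ( ReplyLostException e )
            {
                // Recoverable: loop and check for the node before creating again.
            }
        }
    }
}
```

Without the check on retry, the second attempt would leave two sequential nodes behind for the same client.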
public interface ConnectionStateListener
{
    public void stateChanged(CuratorFramework
        client, ConnectionState newState);
}

public enum ConnectionState
{
    SUSPENDED,
    RECONNECTED,
    LOST
}
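
One way a listener might react to these three states is sketched below. The reaction policy is an assumption made for illustration, not something the slides prescribe: pause on SUSPENDED, assume locks or leadership are gone on LOST, and resume on RECONNECTED only if nothing was lost. The enum is copied locally so the example compiles without Curator.

```java
// Hypothetical reaction policy; the enum is a local copy of the one on the slide.
final class ConnectionStateSketch
{
    enum ConnectionState
    {
        SUSPENDED,
        RECONNECTED,
        LOST
    }

    private volatile boolean safeToProceed = true;
    private volatile boolean reacquireNeeded = false;

    void stateChanged(ConnectionState newState)
    {
        switch ( newState )
        {
            case SUSPENDED:
                // Outcome of in-flight operations is unknown: pause work.
                safeToProceed = false;
                break;
            case LOST:
                // Assume the worst: locks or leadership must be obtained again.
                safeToProceed = false;
                reacquireNeeded = true;
                break;
            case RECONNECTED:
                // Resume only if nothing was lost in between.
                safeToProceed = !reacquireNeeded;
                break;
        }
    }

    boolean isSafeToProceed()
    {
        return safeToProceed;
    }

    boolean needsReacquire()
    {
        return reacquireNeeded;
    }
}
```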
if ( e instanceof KeeperException.ConnectionLossException )
  {
      connectionStateManager.addStateChange(ConnectionState.LOST);
  }



private void validateConnection(CuratorEvent curatorEvent)
{
    if ( curatorEvent.getType() == CuratorEventType.WATCHED )
    {
        if ( curatorEvent.getWatchedEvent().getState() ==
          Watcher.Event.KeeperState.Disconnected )
        {
            connectionStateManager.addStateChange(ConnectionState.SUSPENDED);
            internalSync(this, "/", null);
        }
        else if ( curatorEvent.getWatchedEvent().getState() ==
          Watcher.Event.KeeperState.Expired )
        {
            connectionStateManager.addStateChange(ConnectionState.LOST);
        }
        else if ( curatorEvent.getWatchedEvent().getState() ==
          Watcher.Event.KeeperState.SyncConnected )
        {
            connectionStateManager.addStateChange(ConnectionState.RECONNECTED);
        }
    }
}
Testing Utilities
• TestingServer: manages an internally running ZooKeeper server
  // Create the server using a random port
  public TestingServer()

• TestingCluster: manages an internally running ensemble of ZooKeeper servers.
  // Creates an ensemble comprised of n servers.
  // Each server will use a temp directory and
  // random ports
  public TestingCluster(int instanceQty)
Extensions
• Discovery
• Discovery REST Server
• Exhibitor
• ???
[Diagram: Extensions sit alongside the stack of Curator Recipes, Curator Framework, Curator Client and ZooKeeper]
Exhibitor Sneak Peek
March or April 2012
Open Source on GitHub
Netflix GitHub
Netflix’s home for Open Source
Maven Central
Binaries pushed to Maven Central

  <dependency>
      <groupId>com.netflix.curator</groupId>
      <artifactId>curator-recipes</artifactId>
      <version>1.1.0</version>
  </dependency>
Q&A

Much younger - much thinner

Jordan Zimmerman
jzimmerman@netflix.com
@randgalt

Editor's Notes

  • #3-#7:
      * Background - ZK issues, the need for a wrapper, etc. - mention that you can go more in depth on this
      * Why Curator was written, etc.
      * Low-level - details of the client/framework. Error handling, assumptions, etc.
      * Mention that this will be very technical - lots of code
  • #14: Mention that you contributed part on recoverable errors
  • #37: Becomes a persistent, unchanging handle to the ZK ensemble
  • #58: Kishore Gopalakrishna from LinkedIn