Cassandra Drivers
instaclustr.com
@Instaclustr
Who am I and what do I do?
• Ben Bromhead
• Co-founder and CTO of Instaclustr -> www.instaclustr.com
• Instaclustr provides Cassandra-as-a-Service in the cloud
• Currently in AWS, Azure and IBM Softlayer
• We currently manage 400+ nodes
What this talk will cover
• Driver basics
• Sync vs Async
• Driver connection policies and tuning
The driver
• The Cassandra driver contains the logic for connecting to
Cassandra and running queries in a fast and efficient manner
• Focus on the DataStax open source drivers:
• Java
• .NET (C#)
• C/C++
• Python
• Node.js
• Ruby
• PHP
Cassandra Drivers
• All have a similar architecture that consists of:
• Session & pool management
• Chainable policies for managing failure and performance
• Sync vs Async queries
• Failover & Retry
• Tracing
Cassandra Drivers
A basic example in Java:
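import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

// Configure the cluster: contact point(s) and native protocol port
Cluster cluster = Cluster.builder()
    .addContactPoints("52.89.183.67")
    .withPort(9042)
    .build();
// Sessions are created from the cluster and run the queries
Session session = cluster.newSession();
session.execute("SELECT * FROM foo…");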
Cassandra Drivers
A basic example in Python:
from cassandra.cluster import Cluster

cluster = Cluster(contact_points=["52.89.183.67"], port=9042)
session = cluster.connect()
rows = session.execute("SELECT name, age, email FROM users")
Cassandra Drivers
A basic example in Ruby:
require 'cassandra'

cluster = Cassandra.cluster(
  :hosts => ["52.89.183.67", "52.89.99.88", "54.69.217.141"],
  :datacenter => 'AWS_VPC_US_WEST_2'
)
session = cluster.connect()
rows = session.execute("SELECT name, age, email FROM users")
Cassandra Drivers
• Architecture makes the driver similar across languages
• What happens under the hood?
• Cluster object creates configuration (auth, load balancing, contact
points).
• Session object holds the thread pool and manages connections.
• Session object authenticates and maintains connections.
• Session can be shared and is thread-safe!
Different ways of querying
• Synchronous:
session.execute("SELECT * FROM foo..");
• Asynchronous:
ResultSetFuture result = session.executeAsync("SELECT * FROM foo..");
result.get(); // blocks until the result is available
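To make the difference concrete without a live cluster, here is a minimal Python sketch that simulates the two styles with the standard library only. `fake_execute` is a hypothetical stand-in for `session.execute` (it just sleeps briefly), and the query text is made up; the real drivers expose their own future types, such as `ResultSetFuture` above.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_execute(query):
    """Hypothetical stand-in for session.execute: pretend the round trip takes 10 ms."""
    time.sleep(0.01)
    return "rows for " + query


def run_sync(queries):
    # Synchronous style: each query blocks until its result is back.
    return [fake_execute(q) for q in queries]


def run_async(queries, max_in_flight=8):
    # Asynchronous style: submit everything first, then collect the futures.
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        futures = [pool.submit(fake_execute, q) for q in queries]
        return [f.result() for f in futures]  # comparable to result.get() in Java


queries = ["SELECT * FROM foo WHERE id = %d" % i for i in range(16)]
sync_rows = run_sync(queries)
async_rows = run_async(queries)
```

Both calls return the same rows in the same order. The asynchronous version keeps several requests in flight at once, which is the usual reason async clients reach higher op/s.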
How do these perform?
[Chart: operations per second (Op/s, axis 0 to 30000) for Read Sync, Write Sync, Read Async and Write Async]
How do these perform?
[Chart: latency in ms (axis 0 to 80) for Read Sync, Write Sync, Read Async and Write Async]
Different ways of querying
Prepared Statements:
PreparedStatement statement = getSession().prepare(
    "INSERT INTO simplex.songs " +
    "(id, title, album, artist, tags) " +
    "VALUES (?, ?, ?, ?, ?);");
Different ways of querying
// Values bind positionally, one per ? placeholder, and must match the column types
BoundStatement boundStatement = new BoundStatement(statement);
getSession().execute(boundStatement.bind(
    UUID.fromString("2cc9ccb7-6221-4ccb-8387-f22b6a1b354d"),
    UUID.fromString("756716f7-2e54-4715-9f00-91dcbea6cf50"),
    "La Petite Tonkinoise",
    "Bye Bye Blackbird",
    "Joséphine Baker"));
Drivers and consistency
• Within the different ways of querying Cassandra you can also adjust
the consistency level per query.
• Let's have a quick consistency refresher
A brief intro to tuneable consistency
• Cassandra is considered to be a db that favours Availability and
Partition Tolerance.
• Lets you change those characteristics per query to suit your
application requirements
Two consistency levers
• Consistency level - How many acknowledgements/responses from
replicas before a query is considered a success.
• Replication Factor (RF) - How many copies of a record do I store.
Two consistency levers
• Consistency level - Chosen by the client at query time
• Replication Factor (RF) - Determined by the client at schema definition (set per keyspace)
Consistency Levels
• ALL - Every replica
• *QUORUM - (EACH_QUORUM, QUORUM, LOCAL_QUORUM)
• Numbered - (ONE, TWO, THREE, LOCAL_ONE)
• *SERIAL - (SERIAL, LOCAL_SERIAL)
• ANY
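As a rough illustration of what these levels ask for, the sketch below maps a consistency level and a replication factor to the number of replica responses the coordinator waits for. It assumes a single data centre (so the LOCAL_, EACH_ and SERIAL variants are left out), and the function name `required_responses` is made up for this example.

```python
def required_responses(consistency_level, replication_factor):
    """Replica responses needed before a query succeeds (single-DC sketch)."""
    # ANY is writes only; a stored hint also counts as its one response.
    fixed = {"ANY": 1, "ONE": 1, "TWO": 2, "THREE": 3}
    if consistency_level in fixed:
        return fixed[consistency_level]
    if consistency_level == "QUORUM":
        return replication_factor // 2 + 1  # floor(RF / 2) + 1
    if consistency_level == "ALL":
        return replication_factor
    raise ValueError("not covered by this sketch: " + consistency_level)


for cl in ("ONE", "QUORUM", "ALL"):
    print(cl, required_responses(cl, 3))
```

With RF 3 this prints 1, 2 and 3 responses for ONE, QUORUM and ALL respectively.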
What does it all mean
• At the client level (your application) you have total control
• Define implicit and explicit failure handling
• Isolate queries to a single geography
• Trade consistency for latency (a decision is better than no
decision)
How does it all work?
[Diagram, animated over several slides: a write at CL:QUORUM with RF:3 for partition_key: a on a four-node ring (nodes 1 to 4)]
How does CL impact Op/s?
[Chart: operations per second (axis 0 to 30000) for Read Sync, Write Sync, Read Async and Write Async at ONE, QUORUM and ALL]
How does CL impact latency?
[Chart: latency (axis 0 to 120) for Read Sync, Write Sync, Read Async and Write Async at ONE, QUORUM and ALL]
What happens when something goes wrong?
[Diagram, animated over several slides: a write at CL:QUORUM with RF:3 for partition_key: b on a four-node ring (nodes 1 to 4); two replicas acknowledge (✓✓)]
Required responses:
floor(3 * 0.5) + 1 = 2
Success!
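The outage walkthrough can be sketched as a tiny simulation. The replica set and the choice of node 2 as the failed replica are assumptions for illustration, and `coordinate_write` is a hypothetical helper, not a driver or Cassandra API.

```python
def coordinate_write(replicas, down, required):
    """Send a write to every replica; succeed once `required` of them acknowledge."""
    acked = [node for node in replicas if node not in down]
    missed = [node for node in replicas if node in down]
    return len(acked) >= required, acked, missed


replicas = [1, 2, 3]              # RF:3, assumed replica set for the partition key
quorum = len(replicas) // 2 + 1   # floor(3 * 0.5) + 1 = 2

ok_quorum, acked, missed = coordinate_write(replicas, down={2}, required=quorum)
ok_all, _, _ = coordinate_write(replicas, down={2}, required=len(replicas))

print(ok_quorum, acked, missed)   # QUORUM still succeeds, but node 2 missed the write
print(ok_all)                     # ALL cannot be satisfied while node 2 is down
```

The successful QUORUM write leaves one replica behind, which is the inconsistency that repair, hinted handoff and read repair then have to clean up.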
How does an outage impact Op/s?
[Chart: operations per second (axis 0 to 30000) for Read Sync, Write Sync, Read Async and Write Async at ONE, QUORUM and ALL]
How does an outage impact latency?
[Chart: latency (axis 0 to 100) for Read Sync, Write Sync, Read Async and Write Async at ONE, QUORUM and ALL]
We now have a replica that is not consistent
• Anti-entropy repair (only guaranteed way to make things consistent)
• Hinted handoff - let's cover this quickly
• Read repair
What is hinted handoff?
• A performance optimisation for “catching up” nodes that missed
writes.
What isn’t hinted handoff?
• A consistent distribution mechanism
How does it all work?
[Diagram, two slides: a write at CL:QUORUM with RF:3 for partition_key: b on a four-node ring (nodes 1 to 4)]
How does hinted handoff work?
[Diagram, animated over several slides: four-node ring (nodes 1 to 4), a write for partition_key: b, and a table of hosts against keys]
host / key A B
1 ✔ ✔
2 ?
3 ✔ ✔
… ✔
Gossip: 2 is now UP
Node 1: I have stored hints for 2
Some things to keep in mind
• Cassandra will only store hints for a certain period of time, set by
max_hint_window_in_ms (3 hours by default)
• Hints are not a reliable delivery mechanism
• Hint replay will cause counters to overcount
• CL of ANY will cause a hint to be stored even if no replicas are
available. Sometimes called extreme availability… also called who
knows where and if your data is safe?
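A toy model of this behaviour, assuming an in-memory hint store on the coordinator. The class and method names are invented for this sketch; only the three-hour window mirrors the `max_hint_window_in_ms` default. It models the window as "stop creating hints once the target has been down longer than the window".

```python
MAX_HINT_WINDOW_MS = 3 * 60 * 60 * 1000  # 3 hours, the default


class Coordinator:
    """Toy coordinator that keeps hints for replicas that are down."""

    def __init__(self):
        self.down_since = {}   # node -> time (ms) it was marked down
        self.hints = {}        # node -> list of (key, value) writes it missed

    def mark_down(self, node, now_ms):
        self.down_since[node] = now_ms

    def write(self, node, key, value, now_ms, store):
        if node not in self.down_since:
            store.setdefault(node, {})[key] = value   # normal write
        elif now_ms - self.down_since[node] <= MAX_HINT_WINDOW_MS:
            self.hints.setdefault(node, []).append((key, value))
        # else: the node has been down too long, so no hint is stored at all

    def mark_up(self, node, store):
        # Gossip says the node is back: replay whatever hints we hold for it.
        self.down_since.pop(node, None)
        for key, value in self.hints.pop(node, []):
            store.setdefault(node, {})[key] = value


store = {}                      # node -> {key: value}, stands in for replica data
c = Coordinator()
c.mark_down(2, now_ms=0)
c.write(2, "b", "v1", now_ms=60_000, store=store)                  # hinted
c.write(2, "c", "v2", now_ms=MAX_HINT_WINDOW_MS + 1, store=store)  # dropped
c.mark_up(2, store)
print(store)                    # node 2 has "b" but never received "c"
```

The lost write to "c" shows why hints are not a reliable delivery mechanism: only anti-entropy repair would bring node 2 fully back in line.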
Hinted handoff performance
• Causes the same volume of writes to occur in a cluster with
reduced capacity (local write amplification on the co-ordinator
node)
• Hints are written to system.hints; each replica has its hints stored in a
single partition.
• Hints use TTLs and tombstones… the hint table is actually a queue!
• When Cassandra starts compacting or throwing tombstone
warnings on the system.hints table… things are bad
Hinted handoff performance
• Rewritten in Cassandra 3.0 (in beta now)
• Takes a commitlog approach:
• No compaction
• no TTL
• no tombstones
• no memtables
How does this relate to the driver?
• With a node outage the “latency” on the down node becomes
hours/days until it becomes consistent
• Cassandra itself takes over the client portion of ensuring the write
makes it to the node that was down.
• You can control whether C* handles this (via repair, HH etc) or
whether your application controls this (have your client receive an
exception instead).
Driver policies
• Cassandra driver policies allow you to control failure
• Cassandra driver policies allow you to control how the driver routes
requests
• This can reduce your latency and/or increase op/s (in some cases)
Retry Policy
• Default Retry Policy
• Downgrading Consistency Retry Policy
• Fallthrough Retry Policy
• Logging Retry Policy
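To give a feel for what a retry policy decides, here is a hedged sketch of two of these behaviours as plain functions. The function names, arguments and return convention are assumptions made for this example and do not reproduce the driver's actual interface; the downgrading variant only captures the general idea of retrying once at a level the live replicas can still satisfy.

```python
def fallthrough(alive_replicas, retries_so_far):
    # Never retry: hand every error straight back to the application.
    return ("rethrow", None)


def downgrading(alive_replicas, retries_so_far):
    # Retry at most once, at the strongest numbered level still achievable.
    if retries_so_far > 0 or alive_replicas < 1:
        return ("rethrow", None)
    level = {1: "ONE", 2: "TWO"}.get(alive_replicas, "THREE")
    return ("retry", level)


print(fallthrough(2, 0))    # ('rethrow', None)
print(downgrading(2, 0))    # ('retry', 'TWO')
print(downgrading(0, 0))    # ('rethrow', None)
```

The trade-off is explicit: a fallthrough-style policy keeps failure handling in the application, while a downgrading-style policy favours availability and accepts weaker consistency on the retried request.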
Load Balancing Policy
• Round Robin
• DC Aware
• TokenAware
• LatencyAware
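The difference between round-robin and token-aware routing can be sketched as follows. This is a simplified model: the four-node ring, the MD5-based token function and the replica placement are assumptions made for the example (Cassandra's default partitioner is Murmur3, and real drivers learn the ring from cluster metadata).

```python
import hashlib
from itertools import cycle

NODES = [1, 2, 3, 4]   # assumed four-node ring
RF = 3


def replicas_for(partition_key):
    """Assumed placement: hash the key to a starting node, then take RF nodes clockwise."""
    token = int(hashlib.md5(partition_key.encode()).hexdigest(), 16)
    start = token % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(RF)]


_round_robin = cycle(NODES)


def pick_round_robin(partition_key):
    # Ignores the key: the chosen coordinator may not hold the data.
    return next(_round_robin)


def pick_token_aware(partition_key):
    # Routes straight to a replica, saving a network hop inside the cluster.
    return replicas_for(partition_key)[0]


key = "b"
print(replicas_for(key), pick_token_aware(key))
print([pick_round_robin(key) for _ in range(4)])   # [1, 2, 3, 4]
```

With round robin, some requests land on a node that is not a replica for the key and must be forwarded; token-aware routing always picks a node that owns the data.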
Driver policies impact latency?
[Chart: latency (axis 0 to 1.2) for Read Sync and Write Sync under Round Robin, Token Aware and Latency Aware]
Last but not least
• Use one Cluster instance per (physical) cluster (per application
lifetime)
• Use at most one Session per keyspace, or use a single Session and
explicitly specify the keyspace in your queries
• If you execute a statement more than once, consider using a
PreparedStatement
• You can reduce the number of network roundtrips and also have
atomic operations by using Batches
Thank you!
Questions?

