www.informatik-aktuell.de
Patrick Guillebert Solutions Architect
pguillebert@datastax.com
Availability, scalability and consistency
with Cassandra
Cassandra and the CAP Theorem
In a distributed system, only 2 of these attributes can be fulfilled at a time:
Consistency, Availability, Partition tolerance
Availability and partition tolerance are built deep into the design.
Consistency is set programmatically and is tunable.
©2014 DataStax Confidential. Do not distribute without consent. 2
Cassandra is an AP system
Cassandra’s design goals
Massive and predictable scaling
Always-On
Tunable consistency
Geographically distributed
Low latency
Operationally simple
2016: + User friendly
©2015 DataStax. Do not distribute without consent. 3
Numbers and facts
from production deployments
Scale at Apple
Cassandra can scale from 3 to 1000+ nodes
• Footprint @ Apple
• 75,000+ nodes
• Tens of petabytes of data
• Millions of operations per second
• Largest cluster: 1000+ nodes
Apple Inc.: Cassandra at Apple for Massive Scale
Video: https://www.youtube.com/watch?v=Bc4ql9TDzyg
From Cassandra Summit, USA, September 2014
Availability at Netflix
Consistency at ING Bank
Video: https://www.youtube.com/watch?v=-sD3x8-tuDU
Scalability
Linear and predictable scaling
Need to store more data? Add more nodes.
Need more throughput? Add more nodes.
Cassandra
• Is “future proof”
• By scaling out linearly on commodity hardware.
Partitioning
• The dataset is distributed across all nodes of the cluster
• Data subsets are called “token ranges”
• With vnodes, a node can own several small token ranges, 256 by default
[Figure: ring of four nodes (Node 1 to Node 4) covering the Murmur3 token range from -2^63 to +2^63]
The partitioner
[Figure: the key "@DataStax" is fed to the hash function, which outputs a token; example tokens shown: 97203, -2245462676723223822, 7723358927203680754]
• The partitioner is responsible for calculating the token, which is the hash of the partition key of the data to store
• With Murmur3Partitioner, the token is just a number between -2^63 and 2^63 - 1
• Partitioners: Murmur3Partitioner (Murmur3), RandomPartitioner (MD5), ByteOrderedPartitioner
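The role of the partitioner can be sketched in a few lines of Python. This is an illustration only: Murmur3 is not part of the Python standard library, so the sketch uses the first 8 bytes of a SHA-256 digest as a stand-in hash. The name `toy_token` is hypothetical, and the tokens it produces will not match those computed by a real Cassandra cluster.

```python
import hashlib

MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1

def toy_token(partition_key: str) -> int:
    """Map a partition key to a signed 64-bit token.

    Stand-in for a real partitioner: SHA-256 is used here instead of
    Murmur3, so the values differ from what Cassandra would compute.
    """
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as a signed 64-bit integer.
    return int.from_bytes(digest[:8], byteorder="big", signed=True)

token = toy_token("@DataStax")
# The same key always yields the same token, and the token is on the ring.
print(token == toy_token("@DataStax"), MIN_TOKEN <= token <= MAX_TOKEN)
```

The point of the sketch is only that the token is a deterministic function of the key, so any node can compute where a row belongs without asking a central directory.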
Data distribution within the cluster
[Figure: the partitioner hashes the key '@Datastax' to token 91 on a simplified four-node ring labelled 0 to 100; the figure labels read “Token 91” and “Node 1”]
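Finding the node for a token can be sketched as a lookup on a sorted ring. The sketch below uses the simplified 0 to 100 token scale of the slides. The assignment of tokens to nodes (`RING`) and the function names are assumptions made for illustration, and the replica walk imitates a SimpleStrategy-style placement without modelling racks or vnodes.

```python
from bisect import bisect_left

# Hypothetical toy ring: each node is listed with the highest token it owns.
# Node 1 owns 76..100 and 0, Node 2 owns 1..25, and so on (assumed layout).
RING = [(0, "Node 1"), (25, "Node 2"), (50, "Node 3"), (75, "Node 4")]
TOKENS = [token for token, _ in RING]

def primary_node(token: int) -> str:
    """Return the node that owns the range containing `token`."""
    # First node whose token is >= the key's token; wrap around past the end.
    idx = bisect_left(TOKENS, token) % len(RING)
    return RING[idx][1]

def replicas(token: int, rf: int) -> list[str]:
    """Primary node plus the next rf - 1 nodes clockwise on the ring."""
    start = bisect_left(TOKENS, token) % len(RING)
    return [RING[(start + i) % len(RING)][1] for i in range(rf)]

print(primary_node(91))   # Node 1
print(replicas(91, 3))    # ['Node 1', 'Node 2', 'Node 3']
```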
Availability
Node level availability
A node failure has no impact
• No failover
• 0 downtime
• 0 data loss
• Consistent responses
[Figure: ring of five nodes; parallel writes go to Node 1 (1st copy), Node 2 (2nd copy) and Node 3 (3rd copy); RF=3, CL=QUORUM]
If 2 of the 3 replicas respond, the request succeeds.
Immediate consistency is ensured.
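The "2 out of 3" figure follows from the quorum formula, a majority of the replicas. A minimal sketch, with hypothetical function names:

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of the replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1

def request_succeeds(acks: int, required: int) -> bool:
    """A request succeeds once enough replicas have responded."""
    return acks >= required

rf = 3
needed = quorum(rf)
print(needed)                                     # 2
print(request_succeeds(acks=2, required=needed))  # True: one replica may be down
print(request_succeeds(acks=1, required=needed))  # False: two replicas down
```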
Rack level availability
A rack failure has no impact on the service
[Figure: write at CL=QUORUM to Node 1 (1st copy), Node 2 (2nd copy) and Node 3 (3rd copy) in a five-node ring spread over RAC 1, RAC 2 and RAC 3; RF=3]
Immediate consistency (RF 3 / CL QUORUM):
Nodes are distributed across at least 3 racks.
If a rack fails, a quorum of replicas remains available in the remaining racks and the request succeeds.
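The rack argument can be checked with a small sketch. The placement of one replica per rack is taken from the slide; the dictionary layout and function names are hypothetical.

```python
def quorum(replication_factor: int) -> int:
    return replication_factor // 2 + 1

def survives_rack_failure(replica_racks: dict[str, str], failed_rack: str) -> bool:
    """True if a quorum of replicas is still reachable after one rack fails."""
    alive = [node for node, rack in replica_racks.items() if rack != failed_rack]
    return len(alive) >= quorum(len(replica_racks))

# One replica per rack (RF=3), as on the slide.
spread = {"Node 1": "RAC 1", "Node 2": "RAC 2", "Node 3": "RAC 3"}
# Counter-example: two replicas share a rack.
packed = {"Node 1": "RAC 1", "Node 2": "RAC 1", "Node 3": "RAC 2"}

for rack in ("RAC 1", "RAC 2", "RAC 3"):
    print(rack, survives_rack_failure(spread, rack))  # True for every rack

print(survives_rack_failure(packed, "RAC 1"))  # False: only 1 of 3 replicas left
```

The counter-example shows why replicas have to be spread over the racks: with two copies in the same rack, losing that rack leaves a single replica, which is below the quorum of 2.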
Data Centre level availability
[Figure: two data centres, Frankfurt and Dublin; each is a five-node ring holding three copies of the data on Node 1 (1st copy), Node 2 (2nd copy) and Node 3 (3rd copy)]
A DC failure has no impact on the service
Internals and practices that enable availability
• Cluster topology awareness (“NetworkTopologyStrategy”) in replication
A cluster is described as a group of data centres, containing racks, containing nodes.
This topology is used to place replicas in a “smart” fashion.
• Peer-to-peer architecture
All nodes are strictly equal on the read/write path; any node can be queried for any data.
Seeds have a special function only in the context of node bootstrap.
Failover never happens.
• Trade-off on consistency
Response consistency is ensured by requiring a quorum of replicas to agree.
There is no single “master” that knows the latest write.
• Live operations
Whatever the setup (single or multi-DC), operations can be performed live, with no downtime.
Single DC replication
[Figure: client driver writes a row with token 91 to a single data centre of four nodes owning the token ranges 1-25, 26-50, 51-75 and 76-0]
CREATE KEYSPACE demo
WITH REPLICATION = {
'class':'SimpleStrategy',
'replication_factor':3
};
Primary range for token 91
Multi-DC replication
[Figure: client driver writes a row with token 91 to two data centres, each a four-node ring split across Rack 1 and Rack 2; token ranges shown: 1-13, 14-25, 26-38, 39-50, 51-63, 64-75, 76-88, 89-100; a remote coordinator handles the write in the second data centre; one label reads “Data Center - West”]
CREATE KEYSPACE demo
WITH REPLICATION = {
'class':'NetworkTopologyStrategy',
'dc-east':2,
'dc-west':3
};
Primary range for token 91
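With a replication setting of 'dc-east': 2 and 'dc-west': 3, the number of replicas a quorum requires depends on whether it is counted per data centre or across the whole cluster. A short sketch of that arithmetic (variable names are hypothetical):

```python
# Replication settings from the CREATE KEYSPACE statement.
REPLICATION = {"dc-east": 2, "dc-west": 3}

def quorum(replica_count: int) -> int:
    return replica_count // 2 + 1

# LOCAL_QUORUM: majority of the replicas in the coordinator's data centre.
local_quorum = {dc: quorum(rf) for dc, rf in REPLICATION.items()}

# QUORUM: majority of all replicas, whatever their data centre.
global_quorum = quorum(sum(REPLICATION.values()))

print(local_quorum)   # {'dc-east': 2, 'dc-west': 2}
print(global_quorum)  # 3
```

Note that with only 2 replicas in dc-east, a local quorum there needs both of them, so a single node failure among those replicas blocks LOCAL_QUORUM requests in that data centre.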
What is different from a master-slave system?
Failover never happens. Because there is no master, irreparable data corruption in a split-brain situation cannot occur.
Consistency
Consistency types
• Immediate consistency
The only consistency type a relational database provides.
A read always returns the last written data, whichever node is queried.
• Eventual consistency
A read will eventually return the last written data.
Depending on the node queried, responses may differ.
Cassandra offers both types of consistency; the choice is tunable.
Consistency Levels
A consistency level (CL) is set on each individual request. It determines the number of replicas that must acknowledge a write or agree on the value of the queried data.
Consistency levels can differ from table to table or change over time.
• Common CLs: ONE, QUORUM, ALL
• Multi-DC CLs: LOCAL_ONE, LOCAL_QUORUM, EACH_QUORUM
• Special CLs: ANY, SERIAL, LOCAL_SERIAL
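To make the "per request" point concrete, the sketch below maps the three common levels to the number of replicas they require and attaches a level to each request. It is a model of the idea, not driver code: the `CL` enum, `required_replicas` and the request descriptions are all hypothetical, and the multi-DC and special levels are left out.

```python
from enum import Enum

class CL(Enum):
    ONE = "ONE"
    QUORUM = "QUORUM"
    ALL = "ALL"

def required_replicas(cl: CL, rf: int) -> int:
    """Number of replicas that must respond for the given level."""
    if cl is CL.ONE:
        return 1
    if cl is CL.QUORUM:
        return rf // 2 + 1
    if cl is CL.ALL:
        return rf
    raise ValueError(f"unsupported consistency level: {cl}")

RF = 3
# The level travels with each request, so two requests against the
# same table can make different trade-offs.
requests = [("write payment", CL.QUORUM), ("read dashboard counter", CL.ONE)]
for description, cl in requests:
    print(description, cl.name, required_replicas(cl, RF))
```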
General consistency levels
ONE
QUORUM
ALL
ANY
SERIAL
Multi-DC consistency levels
LOCAL_ONE
LOCAL_QUORUM
EACH_QUORUM
LOCAL_SERIAL
The rule to tune consistency
Here CL READ and CL WRITE stand for the number of replicas involved in the read and in the write.
Immediate consistency:
CL READ + CL WRITE > REPLICATION FACTOR
READ QUORUM + WRITE QUORUM
READ ONE + WRITE ALL
READ ALL + WRITE ONE
Eventual consistency:
CL READ + CL WRITE <= REPLICATION FACTOR
READ ONE + WRITE ONE
READ ONE + WRITE ANY
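The rule can be written as a one-line check. The sketch assumes RF = 3, so that ONE, QUORUM and ALL correspond to 1, 2 and 3 replicas; the function name is hypothetical.

```python
def is_immediately_consistent(read_replicas: int, write_replicas: int, rf: int) -> bool:
    """True when every read set must overlap every write set."""
    return read_replicas + write_replicas > rf

RF = 3
ONE, QUORUM, ALL = 1, RF // 2 + 1, RF

print(is_immediately_consistent(QUORUM, QUORUM, RF))  # True  (2 + 2 > 3)
print(is_immediately_consistent(ONE, ALL, RF))        # True  (1 + 3 > 3)
print(is_immediately_consistent(ALL, ONE, RF))        # True  (3 + 1 > 3)
print(is_immediately_consistent(ONE, ONE, RF))        # False (1 + 1 <= 3)
```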
Time-based conflict resolution
[Figure: client driver and a four-node ring; three replicas hold versions with timestamps … 523, … 511 and … 542]
The last write wins!
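"Last write wins" amounts to picking the version with the highest write timestamp. A minimal sketch follows; the `Cell` type, the values "a", "b", "c" and the short timestamps are hypothetical stand-ins for the truncated numbers in the figure, and tie-breaking between equal timestamps is not modelled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cell:
    value: str
    timestamp: int  # write time attached to the value

def resolve(versions: list[Cell]) -> Cell:
    """Return the version with the most recent write timestamp."""
    return max(versions, key=lambda cell: cell.timestamp)

# One version per replica, as in the figure.
versions = [Cell("a", 523), Cell("b", 511), Cell("c", 542)]
winner = resolve(versions)
print(winner.value, winner.timestamp)  # c 542
```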
Empirical verification of the rule
[Figure: four-node ring with RF = 3; a CL QUORUM write checked against all combinations of CL QUORUM]
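The same verification can be done by brute force: enumerate every set of replicas a write could reach and every set a read could query, and check that they always share at least one replica. The replica names and the function name are assumptions for the sketch.

```python
from itertools import combinations

REPLICAS = ("Node 1", "Node 2", "Node 3")  # RF = 3
QUORUM = len(REPLICAS) // 2 + 1            # 2

def every_read_sees_write(replicas, write_count: int, read_count: int) -> bool:
    """True if each possible read set overlaps each possible write set."""
    return all(
        set(written) & set(read)
        for written in combinations(replicas, write_count)
        for read in combinations(replicas, read_count)
    )

print(every_read_sees_write(REPLICAS, QUORUM, QUORUM))  # True
print(every_read_sees_write(REPLICAS, 1, 1))            # False
```

With QUORUM writes and QUORUM reads, any two sets of 2 replicas out of 3 overlap, so at least one replica in every read holds the latest write. With ONE and ONE, a read can land on a replica the write never reached.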
Use cases
Domains of use cases
• IoT
• Web / mobile
• Gaming
• Bank, finance
• Telecoms
Using C* to back a global reference data service
Requirements
• Always on
• High throughput
• Low latency
• Multi-region replication
Little or big data
Extending use cases with DSE
Thank you!

Patrick Guillebert – IT-Tage 2015 – Cassandra NoSQL - Architektur und Anwendungsfälle
