Cassandra Summit: C* Keys - Partitioning, Clustering, & CrossFit

This document summarizes key concepts for partitioning, clustering, and composite partition keys in Apache Cassandra. It begins with an introduction of the presenter and his company DataScale. It then defines partitioning as how nodes in a Cassandra cluster are assigned tokens to determine data distribution. The partition key is explained as the first column in the primary key that maps data to nodes, while clustering columns within each partition sort and organize the data. Composite partition keys using multiple columns for the token hash are also introduced. Examples of using clustering columns for hierarchical and time series data are provided.

C* Keys: Partitioning, Clustering, & CrossFit
Adam Hutson - Data Architect, DataScale Inc.

© DataStax, All Rights Reserved.
Who am I & What do we do?
Adam Hutson
Data Architect @ DataScale -> www.datascale.io
DataStax MVP for Apache Cassandra
DataScale provides hosted data platforms as a service
Offering Cassandra & Spark, with more to come
Currently hosted in Amazon & Azure

1 Why
2 Partition
3 Partition Key
4 Composite Partition Key
5 Clustering Columns

Why give this presentation?
Partitioning & Clustering should be the foundation of the data model.
They are too often glossed over.
They have the biggest impact on the performance of the cluster.

Partition Explained
• Token values range from -2^63 to 2^63 - 1 (with the default Murmur3Partitioner).
• Without virtual nodes, each node in the cluster/ring is assigned a single token.
• A node is responsible for its token value and the range extending back to the previous node's token.
• A Partitioner decides where a partition key maps onto the cluster/ring.
Example: Node #3 holds token -1844674407370955162 and is responsible for the tokens from there back to (but not including) the previous node's token, -5534023222112865485.
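One hedged way to see tokens in practice is CQL's built-in token() function, which returns the partitioner's token for a row's partition key. The sketch below assumes the member_log table defined later in this deck (partition key: member); the token values returned depend on the data and the partitioner, so none are shown here.

```cql
-- Show the token that each row's partition key hashes to.
-- Rows with the same member share one token, and therefore one partition.
SELECT token(member), member, workout_date
FROM member_log
LIMIT 10;
```

Comparing a returned token against each node's token range shows which node owns that partition.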

Partition Key Explained
The Partition Key is:
• responsible for distribution of data amongst the nodes
• the first column defined in the PRIMARY KEY (or the first parenthesized group of columns, for a composite partition key)
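As a minimal sketch of the points above, assume a hypothetical members table (the table name, column names, and values are illustrative and not from the slides). member_id is the only column in the PRIMARY KEY, so it is the partition key, and each member's row lives on the node that owns the token of that member_id.

```cql
-- Hypothetical table: all names are assumptions for illustration.
CREATE TABLE members (
    member_id uuid,
    name      text,
    gym       text,
    PRIMARY KEY (member_id)   -- single column, so member_id is the partition key
);

-- The partition key is fully specified, so the coordinator can route
-- the query straight to the replicas that own this partition.
SELECT name, gym
FROM members
WHERE member_id = 123e4567-e89b-12d3-a456-426614174000;

-- A query that filters only on a non-key column (for example WHERE gym = 'Downtown')
-- cannot be routed by token; Cassandra rejects it unless an index or
-- ALLOW FILTERING is used.
```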

Composite Partition Key Explained
A composite partition key uses multiple columns together to compute the token hash value.
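A hedged sketch of a composite partition key, using a hypothetical workouts_by_gym_day table (all names and values are assumptions for illustration). The inner parentheses group gym and workout_day into the partition key, so both values are hashed together into a single token.

```cql
-- Hypothetical table: all names are assumptions for illustration.
CREATE TABLE workouts_by_gym_day (
    gym         text,
    workout_day date,
    member      text,
    workout     text,
    -- (gym, workout_day) is the partition key; member is a clustering column.
    PRIMARY KEY ((gym, workout_day), member)
);

-- Every partition key column must be supplied to locate the partition.
SELECT member, workout
FROM workouts_by_gym_day
WHERE gym = 'Downtown' AND workout_day = '2016-09-08';
```

Splitting a gym's history by day keeps each partition bounded in size, at the cost of needing both values on every read.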

Clustering Columns Explained
Clustering Columns are:
• responsible for sorting within the partition
• any column added to the PRIMARY KEY after the partition key

Clustering Columns Explained
Can be used for hierarchically structured data.
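One way this might look, assuming a hypothetical gym_roster table (the names and values are illustrative, not from the slides): the clustering columns form a hierarchy under the partition key, and queries restrict them from left to right.

```cql
-- Hypothetical table: all names are assumptions for illustration.
CREATE TABLE gym_roster (
    gym        text,
    program    text,
    class_name text,
    member     text,
    -- gym is the partition key; program > class_name > member is the hierarchy.
    PRIMARY KEY (gym, program, class_name, member)
);

-- Drill down one level at a time; each query reads a contiguous slice
-- of the partition.
SELECT * FROM gym_roster
WHERE gym = 'Downtown' AND program = 'CrossFit';

SELECT * FROM gym_roster
WHERE gym = 'Downtown' AND program = 'CrossFit' AND class_name = 'WOD 6am';

-- Skipping a level (gym and class_name without program) is not a contiguous
-- slice, so Cassandra rejects that query as written.
```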

Clustering Columns Explained
Can be used for time-series data.
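CREATE TABLE member_log (
    member           text,       -- partition key: one partition per member
    workout_date     timestamp,  -- clustering column: orders rows within the partition
    workout_duration text,
    PRIMARY KEY (member, workout_date)
) WITH CLUSTERING ORDER BY (workout_date DESC);  -- newest workouts stored first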
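A few hedged example queries against the member_log table above (the member name and dates are made-up values). Because rows are clustered by workout_date in descending order, the newest workouts come back first without any sorting at read time.

```cql
-- Latest five workouts for one member (rows are already stored newest-first).
SELECT workout_date, workout_duration
FROM member_log
WHERE member = 'adam'
LIMIT 5;

-- Workouts within a date range: a contiguous slice of the partition.
SELECT workout_date, workout_duration
FROM member_log
WHERE member = 'adam'
  AND workout_date >= '2016-09-01'
  AND workout_date <  '2016-10-01';

-- Reverse the stored order for a single query if needed.
SELECT workout_date, workout_duration
FROM member_log
WHERE member = 'adam'
ORDER BY workout_date ASC
LIMIT 5;
```

All three queries name the partition key (member), so each one touches a single partition.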

Thank You!
Questions?
Adam Hutson @AdamHutson
adam@datascale.io @DataScaleInc
