CASSANDRA SUMMIT 2016
CQL PERFORMANCE WITH APACHE
CASSANDRA 3.0
Aaron Morton
@aaronmorton
CEO
Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C* Summit 2016
How We Got Here
Storage Engine 3.0
Read Path
How We Got Here
Way back in 2011…
2011
Blog: Cassandra Query Plans
http://thelastpickle.com/blog/2011/07/04/Cassandra-Query-Plans.html
2012
Talk: Technical Deep Dive -
Query Performance
https://www.youtube.com/watch?v=gomOKhMV0zc
2012
Explain Read & Write
performance in 45 minutes.
Skip Forward to 2016
Blog: Introduction To The
Apache Cassandra 3.x Storage
Engine
http://thelastpickle.com/blog/2016/03/04/introductiont-to-the-apache-cassandra-3-storage-engine.html
Skip Forward to 2016
“Why don’t I do another talk
about Cassandra
performance.”
Skip Forward to 2016
It was a busy 4 years…
Skip Forward to 2016
CQL 3, Collection Types,
UDTs, UDFs, UDAs,
Materialised Views, Triggers,
SASI,…
Skip Forward to 2016
Explain Read & Write
performance in 45 minutes.
So Let's Avoid
CQL 3, Collection Types,
UDTs, UDFs, UDAs,
Materialised Views, Triggers,
SASI,…
How We Got Here
Storage Engine 3.0
Read Path
High Level Storage Engine 3.0
Storage Engine 3.0 Files
Data.db
Index.db
Filter.db
Storage Engine 3.0 Files
CompressionInfo.db
Statistics.db
Digest.crc32
CRC.db
Summary.db
TOC.txt
CQL Recap
create table my_table (
partition_1 text,
cluster_1 text,
foo text,
bar text,
baz text,
PRIMARY KEY (partition_1, cluster_1)
);
CQL Recap
WARNING:
FAKE DATA AHEAD
CQL With Thrift Pre 3.0
[default@dev] list my_table;
-------------------
RowKey: part_a
=> (column=clust_a:, value=, timestamp=1357…739000)
=> (column=clust_a:foo, value=some foo, timestamp=1357…739000)
=> (column=clust_a:bar, value=and bar, timestamp=1357…739000)
=> (column=clust_a:baz, value=no baz, timestamp=1357…739000)
=> (column=clust_b:, value=, timestamp=1357…739000)
=> (column=clust_b:foo, value=no foo, timestamp=1357…739000)
=> (column=clust_b:bar, value=no bar, timestamp=1357…739000)
=> (column=clust_b:baz, value=lots baz, timestamp=1357…739000)
CQL Pre 3.0
Clustering Keys Repeated
Column Names Repeated
Timestamps Repeated
Fixed Width Encoding
No Knowledge Of Row Contents
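To make the repetition concrete, here is a minimal sketch that rebuilds the cell names from the fake listing above. The class and method names are hypothetical and this is not Cassandra code; it only models the idea that every pre-3.0 cell name carries the clustering value again.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: models how a pre-3.0 partition repeats the
// clustering value in every cell name. Hypothetical names, not Cassandra code.
final class Pre30LayoutSketch
{
    // Builds the flat list of cell names for one partition: a row marker
    // ("clustering:") followed by one "clustering:column" cell per column.
    static List<String> cellNames(Map<String, List<String>> rows)
    {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, List<String>> row : rows.entrySet())
        {
            names.add(row.getKey() + ":"); // row marker cell
            for (String column : row.getValue())
                names.add(row.getKey() + ":" + column);
        }
        return names;
    }

    // Counts how many cell names carry the given clustering value.
    static long timesWritten(List<String> names, String clustering)
    {
        return names.stream().filter(n -> n.startsWith(clustering + ":")).count();
    }

    public static void main(String[] args)
    {
        Map<String, List<String>> rows = new LinkedHashMap<>();
        rows.put("clust_a", Arrays.asList("foo", "bar", "baz"));
        rows.put("clust_b", Arrays.asList("foo", "bar", "baz"));
        List<String> names = cellNames(rows);
        names.forEach(System.out::println);
        System.out.println("clust_a written " + timesWritten(names, "clust_a") + " times");
    }
}
```

With three regular columns per row, each clustering value is written four times (once for the row marker, once per column), which is the overhead the 3.0 engine sets out to remove.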
Storage Engine 3.0 Improvements
Delta Encoding
Variable Int Encoding
Clustering Written Once
Aggregated Metadata
Cell Presence
SerializationHeader
For each SSTable*.
Stored in each SSTable.
Held in memory.
SerializationHeader
public class SerializationHeader
{
    private final AbstractType<?> keyType;
    private final List<AbstractType<?>> clusteringTypes;
    private final PartitionColumns columns;
    private final EncodingStats stats;
    …
}
EncodingStats
Collected on the fly by the
Memtable.
EncodingStats
public class EncodingStats
{
    public final long minTimestamp;
    public final int minLocalDeletionTime;
    public final int minTTL;
    …
}
SerializationHeader
public class SerializationHeader
{
    public void writeTimestamp(long timestamp, DataOutputPlus out) throws IOException
    {
        out.writeUnsignedVInt(timestamp - stats.minTimestamp);
    }
    …
}
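The effect of writing `timestamp - stats.minTimestamp` can be seen with a small standalone sketch. The class below is hypothetical and the timestamps are made-up microsecond values; it only shows that deltas against a shared minimum need far fewer significant bytes than the raw values.

```java
// Illustrative sketch: delta-encode timestamps against the minimum seen.
// Hypothetical helper, not part of Cassandra.
final class DeltaEncodingSketch
{
    // Smallest value in the batch, standing in for EncodingStats.minTimestamp.
    static long minTimestamp(long[] timestamps)
    {
        long min = Long.MAX_VALUE;
        for (long t : timestamps)
            min = Math.min(min, t);
        return min;
    }

    // Each timestamp becomes its distance from the minimum.
    static long[] toDeltas(long[] timestamps, long min)
    {
        long[] deltas = new long[timestamps.length];
        for (int i = 0; i < timestamps.length; i++)
            deltas[i] = timestamps[i] - min;
        return deltas;
    }

    // Reverses toDeltas by adding the minimum back.
    static long[] fromDeltas(long[] deltas, long min)
    {
        long[] timestamps = new long[deltas.length];
        for (int i = 0; i < deltas.length; i++)
            timestamps[i] = deltas[i] + min;
        return timestamps;
    }

    // Bytes needed for a non-negative value once leading zero bytes are dropped.
    static int significantBytes(long value)
    {
        int bits = 64 - Long.numberOfLeadingZeros(value);
        return Math.max(1, (bits + 7) / 8);
    }

    public static void main(String[] args)
    {
        // Made-up microsecond timestamps, close together as in one flush.
        long[] timestamps = { 1_500_000_000_000_250L, 1_500_000_000_000_000L, 1_500_000_000_000_100L };
        long min = minTimestamp(timestamps);
        long[] deltas = toDeltas(timestamps, min);
        for (int i = 0; i < timestamps.length; i++)
            System.out.println(timestamps[i] + " needs " + significantBytes(timestamps[i])
                               + " bytes, delta " + deltas[i] + " needs " + significantBytes(deltas[i]));
    }
}
```

Small deltas only pay off when paired with a variable-length integer encoding, which is what the next slide covers.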
VIntCoding
public class VIntCoding
{
    public static void writeUnsignedVInt(long value, DataOutput output) throws IOException
    {
        int size = VIntCoding.computeUnsignedVIntSize(value);
        if (size == 1)
        {
            output.write((int) value);
            return;
        }
        output.write(VIntCoding.encodeVInt(value, size), 0, size);
    }
    …
}
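The slide does not show `computeUnsignedVIntSize` or `encodeVInt`, so the following is a self-contained sketch of one variable-length scheme in the same spirit: the number of extra bytes is stored as leading 1 bits in the first byte, and the value follows big-endian. The class and its methods are hypothetical; treat it as an illustration of the idea, not as the Cassandra implementation.

```java
// Illustrative unsigned variable-length integer codec. Hypothetical sketch:
// the first byte's leading 1 bits count the extra bytes that follow.
final class UnsignedVIntSketch
{
    // Encoded size in bytes: 7 value bits per byte up to 8 bytes, else 9.
    static int size(long value)
    {
        int bits = 64 - Long.numberOfLeadingZeros(value | 1); // 1..64
        return bits > 56 ? 9 : (bits + 6) / 7;
    }

    static byte[] encode(long value)
    {
        int size = size(value);
        byte[] out = new byte[size];
        long v = value;
        // Fill from the last byte backwards so the result is big-endian.
        for (int i = size - 1; i >= 0; i--)
        {
            out[i] = (byte) v;
            v >>>= 8;
        }
        // Mark (size - 1) extra bytes as leading 1 bits in the first byte.
        out[0] |= (byte) (0xFF << (8 - (size - 1)));
        return out;
    }

    static long decode(byte[] in)
    {
        int first = in[0] & 0xFF;
        int extra = 0;
        while (extra < 8 && (first & (0x80 >> extra)) != 0)
            extra++;
        // Remaining low bits of the first byte are the top of the value.
        long value = first & (0xFF >> extra);
        for (int i = 1; i <= extra; i++)
            value = (value << 8) | (in[i] & 0xFF);
        return value;
    }

    public static void main(String[] args)
    {
        long[] samples = { 0L, 127L, 128L, 16_384L, Long.MAX_VALUE };
        for (long sample : samples)
            System.out.println(sample + " -> " + encode(sample).length + " byte(s), decoded " + decode(encode(sample)));
    }
}
```

Values below 128 fit in a single byte, which is why a delta-encoded timestamp or TTL can often shrink from eight or four bytes to one.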
Storage Engine 3.0 Improvements
Delta Encoding
Variable Int Encoding
Clustering Written Once
Aggregated Metadata
Cell Presence
CQL With Thrift Pre 3.0
[default@dev] list my_table;
-------------------
RowKey: part_a
=> (column=clust_a:, value=, timestamp=1357…739000)
=> (column=clust_a:foo, value=some foo, timestamp=1357…739000)
=> (column=clust_a:bar, value=and bar, timestamp=1357…739000)
=> (column=clust_a:baz, value=no baz, timestamp=1357…739000)
=> (column=clust_b:, value=, timestamp=1357…739000)
=> (column=clust_b:foo, value=no foo, timestamp=1357…739000)
=> (column=clust_b:bar, value=no bar, timestamp=1357…739000)
=> (column=clust_b:baz, value=lots baz, timestamp=1357…739000)
Storage Engine 3.0 Data.db
Storage Engine 3.0 Partition Header
Storage Engine 3.0 Row
Storage Engine 3.0 Clustering Block
Storage Engine 3.0 Improvements
Delta Encoding
Variable Int Encoding
Clustering Written Once
Aggregated Cell Metadata
Cell Presence
CQL With Thrift Pre 3.0
[default@dev] list my_table;
-------------------
RowKey: part_a
=> (column=clust_a:, value=, timestamp=1357…739000)
=> (column=clust_a:foo, value=some foo, timestamp=1357…739000)
=> (column=clust_a:bar, value=and bar, timestamp=1357…739000)
=> (column=clust_a:baz, value=no baz, timestamp=1357…739000)
=> (column=clust_b:, value=, timestamp=1357…739000)
=> (column=clust_b:foo, value=no foo, timestamp=1357…739000)
=> (column=clust_b:bar, value=no bar, timestamp=1357…739000)
=> (column=clust_b:baz, value=lots baz, timestamp=1357…739000)
Aggregated Cell Metadata
Only store the Cell Timestamp, TTL, and
Local Deletion Time if they differ from
the Row's.
Aggregated Cell Metadata
Simple Cell Component                        Byte Size
Flags                                        1
Optional Cell Timestamp (delta)              varint 1…n
Optional Cell Local Deletion Time (delta)    varint 1…n
Optional Cell TTL (delta)                    varint 1…n
Optional Cell Value                          See Below

Fixed Width Cell Value                       Byte Size
Value                                        1…n

Variable Width Cell Value                    Byte Size
Value Length                                 varint 1…n
Value                                        1…n

Apache Cassandra 3.0 Storage Engine
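A rough sketch of the "only if different to the Row" rule follows. The flag constant, class, and byte layout are assumptions made for illustration, and the delta is written as a fixed 8-byte long to keep the code short where the layout above uses a varint.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative sketch: a flags byte records that the cell shares the row's
// timestamp, so the timestamp is written only when it differs.
// Hypothetical layout and names, not the Cassandra serializer.
final class CellMetadataSketch
{
    // Hypothetical flag bit meaning "use the row's timestamp".
    static final int USE_ROW_TIMESTAMP = 0x08;

    static byte[] writeCell(long cellTimestamp, long rowTimestamp, long minTimestamp, String value)
    {
        byte[] valueBytes = value.getBytes(StandardCharsets.UTF_8);
        if (valueBytes.length > 127)
            throw new IllegalArgumentException("sketch only handles values up to 127 bytes");

        boolean sameAsRow = cellTimestamp == rowTimestamp;
        ByteBuffer buffer = ByteBuffer.allocate(1 + Long.BYTES + 1 + valueBytes.length);
        buffer.put((byte) (sameAsRow ? USE_ROW_TIMESTAMP : 0));  // flags
        if (!sameAsRow)
            buffer.putLong(cellTimestamp - minTimestamp);        // delta; fixed width for brevity
        buffer.put((byte) valueBytes.length);                    // value length
        buffer.put(valueBytes);                                  // value
        return Arrays.copyOf(buffer.array(), buffer.position());
    }

    public static void main(String[] args)
    {
        // Made-up timestamps: the first cell matches the row, the second does not.
        System.out.println("same as row:   " + writeCell(100L, 100L, 90L, "abc").length + " bytes");
        System.out.println("differs:       " + writeCell(105L, 100L, 90L, "abc").length + " bytes");
    }
}
```

When every cell in a row was written by the same INSERT, all of them take the short path and the timestamp is stored once at the row level.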
Storage Engine 3.0 Improvements
Delta Encoding
Variable Int Encoding
Clustering Written Once
Aggregated Cell Metadata
Cell Presence
Cell Presence
SSTable stores a list of the Cells in this
SSTable.
Each Row stores a bitmap of the Cells in that
Row, with reference to the SSTable's list.
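The bitmap idea can be sketched as follows, assuming at most 64 columns so that one `long` is enough. The class and method names are hypothetical, and the real on-disk encoding differs in detail; the sketch only shows how a row can say which cells it holds without repeating column names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch: the SSTable keeps one ordered column list, and each
// row stores a bitmap of which of those columns it actually contains.
// Hypothetical names; assumes no more than 64 columns.
final class CellPresenceSketch
{
    // Bit i is set when the i-th SSTable-level column has a cell in this row.
    static long presenceBitmap(List<String> sstableColumns, List<String> rowColumns)
    {
        if (sstableColumns.size() > 64)
            throw new IllegalArgumentException("sketch supports at most 64 columns");
        long bitmap = 0L;
        for (String column : rowColumns)
        {
            int index = sstableColumns.indexOf(column);
            if (index < 0)
                throw new IllegalArgumentException("unknown column " + column);
            bitmap |= 1L << index;
        }
        return bitmap;
    }

    // Recovers the row's column names from the bitmap and the shared list.
    static List<String> columnsPresent(List<String> sstableColumns, long bitmap)
    {
        List<String> present = new ArrayList<>();
        for (int i = 0; i < sstableColumns.size(); i++)
            if ((bitmap & (1L << i)) != 0)
                present.add(sstableColumns.get(i));
        return present;
    }

    public static void main(String[] args)
    {
        List<String> sstableColumns = Arrays.asList("bar", "baz", "foo");
        long bitmap = presenceBitmap(sstableColumns, Arrays.asList("foo", "bar"));
        System.out.println("bitmap  = " + Long.toBinaryString(bitmap));
        System.out.println("present = " + columnsPresent(sstableColumns, bitmap));
    }
}
```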
Storage Engine 3.0 Row
Remember Where We Came From
[default@dev] list my_table;
-------------------
RowKey: part_a
=> (column=clust_a:, value=, timestamp=1357…739000)
=> (column=clust_a:foo, value=some foo, timestamp=1357…739000)
=> (column=clust_a:bar, value=and bar, timestamp=1357…739000)
=> (column=clust_a:baz, value=no baz, timestamp=1357…739000)
=> (column=clust_b:, value=, timestamp=1357…739000)
=> (column=clust_b:foo, value=no foo, timestamp=1357…739000)
=> (column=clust_b:bar, value=no bar, timestamp=1357…739000)
=> (column=clust_b:baz, value=lots baz, timestamp=1357…739000)
How We Got Here
Storage Engine 3.0
Read Path
Read Paths
Ignoring Index Read paths.
Read Commands
PartitionRangeReadCommand
SinglePartitionReadCommand
AbstractClusteringIndexFilter
ClusteringIndexNamesFilter
(When we know the column names.)
ClusteringIndexSliceFilter
(When we do not know the column names.)
ClusteringIndexNamesFilter
When we know what
Columns to select, we know
when the search is over.
ClusteringIndexNamesFilter
1. Get Partition from Memtables.
2. Filter named columns into a temporary result.
3. Select SSTables that may contain the Partition Key.
4. Order SSTables by max timestamp, descending.
5. Read from SSTables in order.
Names Filter Short Circuits
If result has a Partition Deletion
newer than next SSTable max
timestamp.
Stop Search.
Names Filter Short Circuits
If read all Columns and max
timestamp of next SSTable less than
selected Columns min timestamp.
Stop Search.
Names Filter Short Circuits
If search clustering value not within
clustering range in the SSTable.
Skip SSTable.
Names Filter Short Circuits
If SSTable Cell not in search set.
Skip reading value.
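The first two short circuits can be sketched with a toy model. Everything here is hypothetical (the `SSTable`, `Cell` and `Result` types, and the use of `Long.MIN_VALUE` for "no partition deletion"); Memtables, the clustering-range check and value skipping are left out to keep the loop readable.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative model of a names-filter read over SSTables only.
// Hypothetical types, not Cassandra classes.
final class NamesFilterSketch
{
    static final class Cell
    {
        final String value;
        final long timestamp;
        Cell(String value, long timestamp) { this.value = value; this.timestamp = timestamp; }
    }

    static final class SSTable
    {
        final long maxTimestamp;       // newest write in this SSTable
        final long partitionDeletion;  // Long.MIN_VALUE when there is none
        final Map<String, Cell> cells; // column name -> cell for the row being read
        SSTable(long maxTimestamp, long partitionDeletion, Map<String, Cell> cells)
        {
            this.maxTimestamp = maxTimestamp;
            this.partitionDeletion = partitionDeletion;
            this.cells = cells;
        }
    }

    static final class Result
    {
        final Map<String, Cell> cells;
        final int sstablesRead;
        Result(Map<String, Cell> cells, int sstablesRead) { this.cells = cells; this.sstablesRead = sstablesRead; }
    }

    static Result read(List<SSTable> sstables, Set<String> wanted)
    {
        List<SSTable> ordered = new ArrayList<>(sstables);
        ordered.sort(Comparator.comparingLong((SSTable s) -> s.maxTimestamp).reversed());

        Map<String, Cell> result = new HashMap<>();
        long partitionDeletion = Long.MIN_VALUE;
        int read = 0;
        for (SSTable sstable : ordered)
        {
            // Stop: a partition deletion already seen is newer than anything left.
            if (partitionDeletion > sstable.maxTimestamp)
                break;
            // Stop: all wanted columns found, each newer than anything left.
            if (result.keySet().containsAll(wanted) && minTimestamp(result) > sstable.maxTimestamp)
                break;

            read++;
            partitionDeletion = Math.max(partitionDeletion, sstable.partitionDeletion);
            for (String name : wanted)
            {
                Cell candidate = sstable.cells.get(name);
                Cell existing = result.get(name);
                if (candidate != null && (existing == null || candidate.timestamp > existing.timestamp))
                    result.put(name, candidate);
            }
        }
        // Cells at or before the partition deletion are shadowed by it.
        final long deletion = partitionDeletion;
        result.values().removeIf(cell -> cell.timestamp <= deletion);
        return new Result(result, read);
    }

    private static long minTimestamp(Map<String, Cell> cells)
    {
        long min = Long.MAX_VALUE;
        for (Cell cell : cells.values())
            min = Math.min(min, cell.timestamp);
        return min;
    }
}
```

Calling `read` with every wanted column present in the newest SSTable returns after reading one SSTable; asking for a column that was never written forces every SSTable to be read.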
ClusteringIndexSliceFilter
When we do not know which
columns to select, the search
ends when it is exhausted.
ClusteringIndexSliceFilter
Used with:
Distinct.
Not all clustering columns
restricted.
ClusteringIndexSliceFilter
1. Get Partition From Memtables.
2. Create Iterators for Partitions.
3. Select SSTables that may contain Partition
Key.
4. Order in reverse max timestamp order.
5. Create Iterators for SSTables in order.
Slice Filter Short Circuits
If SSTable max timestamp is before
max seen Partition Deletion
timestamp.
Stop Search.
Slice Filter Short Circuits
If search clustering value not within
clustering range in the SSTable.
Skip SSTable.
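The slice path can be sketched the same way. The types below are hypothetical, clustering values are plain strings, and the slice is an inclusive range; the sketch reads the partition deletion from SSTable-level metadata before the range check, which is a simplification.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative model of a slice read over SSTables only.
// Hypothetical types, not Cassandra classes.
final class SliceFilterSketch
{
    static final class Row
    {
        final String value;
        final long timestamp;
        Row(String value, long timestamp) { this.value = value; this.timestamp = timestamp; }
    }

    static final class SSTable
    {
        final long maxTimestamp;          // newest write in this SSTable
        final long partitionDeletion;     // Long.MIN_VALUE when there is none
        final TreeMap<String, Row> rows;  // clustering value -> row
        SSTable(long maxTimestamp, long partitionDeletion, TreeMap<String, Row> rows)
        {
            this.maxTimestamp = maxTimestamp;
            this.partitionDeletion = partitionDeletion;
            this.rows = rows;
        }

        // True when [start, end] overlaps this SSTable's clustering range.
        boolean mayContain(String start, String end)
        {
            return !rows.isEmpty()
                   && rows.firstKey().compareTo(end) <= 0
                   && rows.lastKey().compareTo(start) >= 0;
        }
    }

    static SortedMap<String, Row> read(List<SSTable> sstables, String start, String end)
    {
        List<SSTable> ordered = new ArrayList<>(sstables);
        ordered.sort(Comparator.comparingLong((SSTable s) -> s.maxTimestamp).reversed());

        TreeMap<String, Row> merged = new TreeMap<>();
        long partitionDeletion = Long.MIN_VALUE;
        for (SSTable sstable : ordered)
        {
            // Stop: everything left is older than a partition deletion already seen.
            if (sstable.maxTimestamp < partitionDeletion)
                break;
            partitionDeletion = Math.max(partitionDeletion, sstable.partitionDeletion);
            // Skip: the slice does not overlap this SSTable's clustering range.
            if (!sstable.mayContain(start, end))
                continue;
            for (Map.Entry<String, Row> entry : sstable.rows.subMap(start, true, end, true).entrySet())
                merged.merge(entry.getKey(), entry.getValue(),
                             (kept, next) -> kept.timestamp >= next.timestamp ? kept : next);
        }
        // Rows at or before the partition deletion are shadowed by it.
        final long deletion = partitionDeletion;
        merged.values().removeIf(row -> row.timestamp <= deletion);
        return merged;
    }
}
```

Unlike the names filter, there is no "all columns found" exit: the loop only ends early on a partition deletion, so a slice normally touches every SSTable whose clustering range overlaps the query.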
Thanks.
Aaron Morton
@aaronmorton
Co-Founder & Principal Consultant
www.thelastpickle.com
