CQL: SQL for Cassandra
      Cassandra NYC
     December 6, 2011

           Eric Evans
       eric@acunu.com
     @jericevans, @acunu
●   Overview, history, motivation
●   Performance characteristics
●   Coming soon (?)
●   Drivers status
What?
●   Cassandra Query Language
    ●   aka CQL
    ●   aka /ˈsēkwəl/
●   Exactly like SQL (except where it's not)
●   Introduced in Cassandra 0.8.0
●   Ready for production use
SQL? Almost.

-- Inserts or updates
INSERT INTO Standard1 (KEY, col0, col1)
VALUES (key, value0, value1)

vs.

-- Inserts or updates
UPDATE Standard1
SET col0=value0, col1=value1 WHERE KEY=key
SQL? Almost.
-- Get columns for a row
SELECT col0,col1 FROM Standard1 WHERE KEY=key

-- Range of columns for a row
SELECT col0..colN
    FROM Standard1 WHERE KEY=key

-- First 10 results from a range of columns
SELECT FIRST 10 col0..colN
    FROM Standard1 WHERE KEY=key

-- Invert the sorting of results
SELECT REVERSED col0..colN
    FROM Standard1 WHERE KEY=key
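
The column-range forms above can be pictured with a small model. This is only a sketch of the semantics, not how Cassandra implements slices: it assumes a row behaves like a map of columns sorted by name, that range bounds are inclusive, and that REVERSED simply flips the order of the selected range. The class and method names (ColumnSliceSketch, slice) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class ColumnSliceSketch {
    // Hypothetical model: a row is a map of column name -> value, sorted by name.
    // Returns the names of at most `first` columns between start and end (inclusive).
    static List<String> slice(NavigableMap<String, String> row,
                              String start, String end, int first, boolean reversed) {
        NavigableMap<String, String> range = row.subMap(start, true, end, true);
        if (reversed) {
            // Assumption: REVERSED walks the same range in descending order.
            range = range.descendingMap();
        }
        List<String> names = new ArrayList<>();
        for (String name : range.keySet()) {
            if (names.size() == first) {
                break; // FIRST n caps the number of columns returned
            }
            names.add(name);
        }
        return names;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = new TreeMap<>();
        for (int i = 0; i < 5; i++) {
            row.put("col" + i, "value" + i);
        }
        // SELECT col1..col3
        System.out.println(slice(row, "col1", "col3", Integer.MAX_VALUE, false));
        // SELECT FIRST 2 col0..col4
        System.out.println(slice(row, "col0", "col4", 2, false));
        // SELECT FIRST 2 REVERSED col0..col4
        System.out.println(slice(row, "col0", "col4", 2, true));
    }
}
```

The point of the model is that FIRST and REVERSED act on the columns within one row, not on rows, which is one of the places where CQL is "not exactly SQL".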
Why?
Interface Instability
(Un)ease of use
// One column: name, value, timestamp
Column col = new Column(ByteBuffer.wrap("name".getBytes()));
col.setValue(ByteBuffer.wrap("value".getBytes()));
col.setTimestamp(System.currentTimeMillis());

// Wrap the column in a mutation
ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
cosc.setColumn(col);
Mutation mutation = new Mutation();
mutation.setColumnOrSuperColumn(cosc);
List<Mutation> mutations = new ArrayList<Mutation>();
mutations.add(mutation);

// row key -> column family -> mutations
Map<ByteBuffer, Map<String, List<Mutation>>> mutations_map =
    new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
Map<String, List<Mutation>> cf_map = new HashMap<String, List<Mutation>>();
cf_map.put("Standard1", mutations);
mutations_map.put(ByteBuffer.wrap("key".getBytes()), cf_map);
CQL
INSERT INTO Standard1 (KEY, col0)
    VALUES (key, value0)
Why? How about...
●   Better stability guarantees
●   Easier to use (you already know it)
●   Better code readability / maintainability
●   Irritates the NoSQL purists
●   (Still) irritates the SQL purists
Performance
Thrift RPC
// One column: name, value, timestamp
Column col = new Column(ByteBuffer.wrap("name".getBytes()));
col.setValue(ByteBuffer.wrap("value".getBytes()));
col.setTimestamp(System.currentTimeMillis());

// Wrap the column in a mutation
ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
cosc.setColumn(col);
Mutation mutation = new Mutation();
mutation.setColumnOrSuperColumn(cosc);
List<Mutation> mutations = new ArrayList<Mutation>();
mutations.add(mutation);

// row key -> column family -> mutations
Map<ByteBuffer, Map<String, List<Mutation>>> mutations_map =
    new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
Map<String, List<Mutation>> cf_map = new HashMap<String, List<Mutation>>();
cf_map.put("Standard1", mutations);
mutations_map.put(ByteBuffer.wrap("key".getBytes()), cf_map);
Your query, it's a graph
CQL

INSERT INTO Standard1 (KEY, col0)
    VALUES (key, value0)
Hotspot
                  Quoted string literals


UPDATE table SET 'name' = 'value'
    WHERE KEY = 'somekey'


●   Anything that appears between quotes
●   Inlined Java constructs a StringBuilder to store
    the contents (slow, not fast)
●   Incurred multiple times per statement
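
To make the cost concrete, here is a rough sketch of what "construct a StringBuilder for everything between the quotes" looks like. It is not Cassandra's lexer code. The class and method names are invented, and the rule that a doubled quote ('') stands for one literal quote is an assumption made for the example.

```java
public class QuotedLiteralSketch {
    // Hypothetical helper: copies the characters between the outer quotes
    // into a freshly allocated StringBuilder, one character at a time.
    static String unquote(String literal) {
        int end = literal.length() - 1;
        if (literal.length() < 2 || literal.charAt(0) != '\'' || literal.charAt(end) != '\'') {
            throw new IllegalArgumentException("not a quoted literal: " + literal);
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i < end; i++) {
            char c = literal.charAt(i);
            sb.append(c);
            // Assumed escape rule: '' inside a literal means a single quote,
            // so skip the second quote of the pair.
            if (c == '\'' && i + 1 < end && literal.charAt(i + 1) == '\'') {
                i++;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The UPDATE above carries three quoted literals, so this work
        // (allocation plus per-character copy) happens three times.
        String[] literals = {"'name'", "'value'", "'somekey'"};
        for (String literal : literals) {
            System.out.println(unquote(literal));
        }
    }
}
```

A single short statement pays this price once per quoted term, which is why it shows up as a hotspot.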
Hotspot
                        Marshalling


UPDATE table SET 'clear' = 'abffaadd10'
    WHERE KEY = 'acfe12ff'
              ascii                   blob


●   Terms are marshalled to bytes by type
●   String.getBytes is slow (AsciiType)
●   Hex conversion is faster (BytesType)
●   Incurred multiple times per statement
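
A minimal sketch of marshalling by type, assuming just two types: ascii terms are turned into bytes with String.getBytes, and blob terms are decoded from hex. The enum, class and method names are hypothetical and are not Cassandra's AsciiType or BytesType classes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class MarshalSketch {
    // Hypothetical stand-ins for the two types labelled on the slide.
    enum TermType { ASCII, BLOB }

    // Convert the text of a term to bytes according to its type.
    static ByteBuffer marshal(String term, TermType type) {
        switch (type) {
            case ASCII:
                return ByteBuffer.wrap(term.getBytes(StandardCharsets.US_ASCII));
            case BLOB:
                return ByteBuffer.wrap(hexToBytes(term));
            default:
                throw new IllegalArgumentException("unknown type: " + type);
        }
    }

    // Decode a hex string two characters at a time.
    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd-length hex string: " + hex);
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not hex: " + hex);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        // 'clear' as ascii, the hex strings as blobs
        System.out.println(marshal("clear", TermType.ASCII).remaining());
        System.out.println(marshal("abffaadd10", TermType.BLOB).remaining());
        System.out.println(marshal("acfe12ff", TermType.BLOB).remaining());
    }
}
```

Every term in a statement goes through one of these conversions, so the per-type cost is paid several times per query.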
Hotspot
                   Copying / Conversion


execute_cql_query(
    ByteBuffer query, enum compression)
●   Query is binary to support compression (is it worth it?)
●   And don't forget the String → ByteBuffer conversion on
    the client-side
●   Incurred only once per statement!
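
The conversion in question is roughly the round trip below: the client encodes the query text into a ByteBuffer, and the server decodes it back into a String before parsing. This is a sketch under the assumption that the text is UTF-8 and uncompressed; the helper names are invented for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class QueryRoundTripSketch {
    // Client side (assumed): String -> ByteBuffer, copying the query into a byte array.
    static ByteBuffer encode(String query) {
        return ByteBuffer.wrap(query.getBytes(StandardCharsets.UTF_8));
    }

    // Server side (assumed): ByteBuffer -> String, a second copy before parsing.
    // duplicate() leaves the caller's buffer position untouched.
    static String decode(ByteBuffer buffer) {
        return StandardCharsets.UTF_8.decode(buffer.duplicate()).toString();
    }

    public static void main(String[] args) {
        String query = "UPDATE table SET 'name' = 'value' WHERE KEY = 'somekey'";
        ByteBuffer wire = encode(query);
        System.out.println(wire.remaining() + " bytes on the wire");
        System.out.println(decode(wire).equals(query));
    }
}
```

Unlike the quoting and marshalling costs, this copy happens once per statement regardless of how many terms the statement contains.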
Achtung!
             (These tests weren't perfect)

●   Unneeded String → ByteBuffer → String
●   No query compression implemented
●   Co-located client and server
Insert 20M rows, 5 columns

       Avg rate          Avg latency
RPC    20,953/s          1.6ms
CQL    19,176/s (-8%)    1.7ms (+9%)
Insert 10M rows, 5 cols (indexed)

       Avg rate          Avg latency
RPC    9,850/s           5.3ms
CQL    9,290/s (-6%)     5.5ms (+4%)
Counts, 10M rows, 5 cols

       Avg rate          Avg latency
RPC    18,052/s          1.7ms
CQL    17,635/s (-2%)    1.7ms
Reading 20M rows, 5 cols

       Avg rate          Avg latency
RPC    22,726/s          2.0ms
CQL    20,272/s (-11%)   2.3ms (+10%)
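
The rate percentages in the tables above can be checked with a one-line formula: (CQL - RPC) / RPC, rounded to a whole percent. The sketch below assumes that is how they were derived and reads the RPC read rate as 22,726/s. The latency percentages are left out because the latencies are only shown rounded to 0.1 ms, so they may not be reproducible from the figures on the slides.

```java
public class RelativeChangeSketch {
    // Percent change of `measured` relative to `baseline`, rounded to a whole percent.
    static long percentChange(double baseline, double measured) {
        return Math.round((measured - baseline) / baseline * 100.0);
    }

    public static void main(String[] args) {
        // { RPC rate, CQL rate } per benchmark, in operations per second
        double[][] rates = {
            {20953, 19176}, // insert 20M rows
            {9850, 9290},   // insert 10M rows, indexed
            {18052, 17635}, // counts
            {22726, 20272}, // reads
        };
        for (double[] pair : rates) {
            System.out.println(percentChange(pair[0], pair[1]) + "%");
        }
    }
}
```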
In Summary
Don't step over dollars to pick up pennies!
Coming Soon(ish)
Roadmap
●   Prepared statements (CASSANDRA-2475)
●   Compound columns (CASSANDRA-2474)
●   Custom transport / protocol (CASSANDRA-2478)
●   Performance testing (CASSANDRA-2268)
●   Schema introspection (CASSANDRA-2477)
●   Multiget support (CASSANDRA-3069)
Drivers
●   Hosted on Apache Extras (Google Code)
●   Tagged cassandra and cql
●   Licensed using Apache License 2.0
●   Conforming to a standard for database
    connectivity (if applicable)
●   Coming soon: automated testing and
    acceptance criteria
Drivers
Driver                           Platform                 Status
cassandra-jdbc                   Java                     Good
cassandra-dbapi2                 Python                   Good
cassandra-ruby                   Ruby                     New
cassandra-pdo                    PHP                      New
cassandra-node                   Node.js                  Good

http://code.google.com/a/apache-extras.org/hosting/search?q=label%3aCassandra
The End
