SlideShare a Scribd company logo
©2015 DataStax Confidential. Do not distribute without consent.
@RachelPedreschi & @PatrickMcFadin
Rachel Pedreschi & Patrick McFadin

Evangelists for Apache Cassandra
Oracle to Cassandra Core Concepts Guide
Part 2: Data Model Strikes Back
1
Where are we now?
Pt. 1 - Transition from Oracle to Cassandra
How does it work? What are the differences?
Pt. 2 - Cassandra Data Model
All the details on how to use Cassandra Query Language (CQL)
Pt. 3 - Building a Cassandra Application
Wrap it all up with a top down application design. Combining everything we’ve
learned plus a more.
Application
hash(key) => token(43)
replication factor = 3
80
10
3050
70
60
40
20
“Static” Table
CREATE TABLE videos (

videoid uuid,

userid uuid,

name varchar,

description varchar,

location text,

location_type int,

preview_thumbnails map<text,text>,

tags set<varchar>,

added_date timestamp,

PRIMARY KEY (videoid)

);
Table Name
Column Name
Column CQL Type
Primary Key Designation Partition Key
Row
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Partition
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Table Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 2
Column
2
Column
3
Column
4
Column
1
Column
2
Column
3
Column
4
Column
1
Column
2
Column
3
Column
4
Column
1
Column
2
Column
3
Column
4
Partition
Key 2
Partition
Key 2
Partition
Key 2
Keyspace
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 2
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 2
Column
2
Column
3
Column
4
Column
1
Partition
Key 2
Column
2
Column
3
Column
4
Column
1
Partition
Key 2
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 2
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 2
Column
2
Column
3
Column
4
Column
1
Partition
Key 2
Column
2
Column
3
Column
4
Column
1
Partition
Key 2
Column
2
Column
3
Column
4
Table 1 Table 2
Keyspace 1
Insert
INSERT INTO videos (videoid, name, userid, description, location, location_type, preview_thumbnails, tags, added_date, metadata)

VALUES (06049cbb-dfed-421f-b889-5f649a0de1ed,'The data model is dead. Long live the data model.',
9761d3d7-7fbd-4269-9988-6cfd4e188678, 

'First in a three part series for Cassandra Data Modeling','http://www.youtube.com/watch?v=px6U2n74q3g',1,

{'YouTube':'http://www.youtube.com/watch?v=px6U2n74q3g'},{'cassandra','data model','relational','instruction'},

'2013-05-02 12:30:29');
Table Name
Fields
Values
Partition Key: Required
Partition keys
06049cbb-dfed-421f-b889-5f649a0de1ed Murmur3 Hash Token = 7224631062609997448
873ff430-9c23-4e60-be5f-278ea2bb21bd Murmur3 Hash Token = -6804302034103043898
Consistent hash. 128 bit number
between 2-63
and 264
INSERT INTO videos (videoid, name, userid, description)

VALUES (06049cbb-dfed-421f-b889-5f649a0de1ed,'The data model is dead. Long live the data model.’,
9761d3d7-7fbd-4269-9988-6cfd4e188678, 'First in a three part series for Cassandra Data Modeling');
INSERT INTO videos (videoid, name, userid, description)

VALUES (873ff430-9c23-4e60-be5f-278ea2bb21bd,'Become a Super Modeler’,
9761d3d7-7fbd-4269-9988-6cfd4e188678, 'Second in a three part series for Cassandra Data Modeling');
“Dynamic” Table
CREATE TABLE videos_by_tag (

tag text,

videoid uuid,

added_date timestamp,

name text,

preview_image_location text,

tagged_date timestamp,

PRIMARY KEY (tag, videoid)

);
Partition Key Clustering Column
Primary key relationship
PRIMARY KEY (tag,videoid)
Primary key relationship
Partition Key
PRIMARY KEY (tag,videoid)
Primary key relationship
Partition Key Clustering Column
PRIMARY KEY (tag,videoid)
Primary key relationship
Partition Key
data model
PRIMARY KEY (tag,videoid)
Clustering Column
-5.6
06049cbb-dfed-421f-b889-5f649a0de1ed
Primary key relationship
Partition Key
2013-05-16 16:50:002013-05-02 12:30:29
873ff430-9c23-4e60-be5f-278ea2bb21bd
PRIMARY KEY (tag,videoid)
Clustering Column
data model
49f64d40-7d89-4890-b910-dbf923563a33
2013-06-11 11:00:00
Select
name | description | added_date

---------------------------------------------------+----------------------------------------------------------+--------------------------

The data model is dead. Long live the data model. | First in a three part series for Cassandra Data Modeling | 2013-05-02 12:30:29-0700
SELECT name, description, added_date

FROM videos

WHERE videoid = 06049cbb-dfed-421f-b889-5f649a0de1ed;
Fields
Table Name
Primary Key: Partition Key Required
Controlling Order
CREATE TABLE raw_weather_data (

wsid text,

year int,

month int,

day int,

hour int,

temperature double,

PRIMARY KEY ((wsid), year, month, day, hour)

) WITH CLUSTERING ORDER BY (year DESC, month DESC, day DESC, hour DESC);
INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)

VALUES (‘10010:99999’,2005,12,1,10,-5.6);
INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)

VALUES (‘10010:99999’,2005,12,1,9,-5.1);
INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)

VALUES (‘10010:99999’,2005,12,1,8,-4.9);
INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)

VALUES (‘10010:99999’,2005,12,1,7,-5.3);
Clustering
200510010:99999 12 1 10
200510010:99999 12 1 9
raw_weather_data
-5.6
-5.1
200510010:99999 12 1 8
200510010:99999 12 1 7
-4.9
-5.3
Order By
DESC
Write Path
Client
INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)

VALUES (‘10010:99999’,2005,12,1,7,-5.3);
year 1wsid 1 month 1 day 1 hour 1
year 2wsid 2 month 2 day 2 hour 2
Memtable
SSTable
SSTable
SSTable
SSTable
Node
Commit Log Data * Compaction *
Temp
Temp
Storage Model - Logical View
2005:12:1:10
-5.6
2005:12:1:9
-5.1
2005:12:1:8
-4.9
10010:99999
10010:99999
10010:99999
wsid hour temperature
2005:12:1:7
-5.3
10010:99999
SELECT wsid, hour, temperature

FROM raw_weather_data

WHERE wsid=‘10010:99999’

AND year = 2005 AND month = 12 AND day = 1;
2005:12:1:10
-5.6 -5.3-4.9-5.1
Storage Model - Disk Layout
2005:12:1:9 2005:12:1:8
10010:99999
2005:12:1:7
Merged, Sorted and Stored Sequentially
SELECT wsid, hour, temperature

FROM raw_weather_data

WHERE wsid=‘10010:99999’

AND year = 2005 AND month = 12 AND day = 1;
2005:12:1:10
-5.6
2005:12:1:11
-4.9 -5.3-4.9-5.1
Storage Model - Disk Layout
2005:12:1:9 2005:12:1:8
10010:99999
2005:12:1:7
Merged, Sorted and Stored Sequentially
SELECT wsid, hour, temperature

FROM raw_weather_data

WHERE wsid=‘10010:99999’

AND year = 2005 AND month = 12 AND day = 1;
2005:12:1:10
-5.6
2005:12:1:11
-4.9 -5.3-4.9-5.1
Storage Model - Disk Layout
2005:12:1:9 2005:12:1:8
10010:99999
2005:12:1:7
Merged, Sorted and Stored Sequentially
SELECT wsid, hour, temperature

FROM raw_weather_data

WHERE wsid=‘10010:99999’

AND year = 2005 AND month = 12 AND day = 1;
2005:12:1:12
-5.4
Read Path
Client
SSTable
SSTable
SSTable
Node
Data
SELECT wsid,hour,temperature

FROM raw_weather_data

WHERE wsid='10010:99999'

AND year = 2005 AND month = 12 AND day = 1 

AND hour >= 7 AND hour <= 10;
year 1wsid 1 month 1 day 1 hour 1
year 2wsid 2 month 2 day 2 hour 2
Memtable
Temp
Temp
Query patterns
• Range queries
• “Slice” operation on disk
Single seek on disk
10010:99999
Partition key for locality
SELECT wsid,hour,temperature

FROM raw_weather_data

WHERE wsid='10010:99999'

AND year = 2005 AND month = 12 AND day = 1 

AND hour >= 7 AND hour <= 10;
2005:12:1:10
-5.6 -5.3-4.9-5.1
2005:12:1:9 2005:12:1:8 2005:12:1:7
Query patterns
• Range queries
• “Slice” operation on disk
Programmers like this
Sorted by event_time
2005:12:1:10
-5.6
2005:12:1:9
-5.1
2005:12:1:8
-4.9
10010:99999
10010:99999
10010:99999
weather_station hour temperature
2005:12:1:7
-5.3
10010:99999
SELECT weatherstation,hour,temperature
FROM temperature
WHERE weatherstation_id=‘10010:99999'
AND year = 2005 AND month = 12 AND day = 1
AND hour >= 7 AND hour <= 10;
Other New and Not-so-New-but-different things
Collections
Set
tags set<varchar>
CQL Type: For Ordering
Column Name CREATE TABLE videos (

videoid uuid,

userid uuid,

name varchar,

description varchar,

location text,

location_type int,

preview_thumbnails map<text,text>,

tags set<varchar>,

added_date timestamp,

PRIMARY KEY (videoid)

);
Collections
Set
List
Column Name
Column Name
CQL Type
CREATE TABLE videos (

videoid uuid,

userid uuid,

name varchar,

description varchar,

location text,

location_type int,

preview_thumbnails map<text,text>,

tags set<varchar>,

added_date timestamp,

PRIMARY KEY (videoid)

);
tags set<varchar>
CQL Type: For Ordering
tags set<varchar>
Collections
Set
List
Map
preview_thumbnails map<text,text>
Column Name
Column Name
CQL Key Type CQL Value Type
Column Name CREATE TABLE videos (

videoid uuid,

userid uuid,

name varchar,

description varchar,

location text,

location_type int,

preview_thumbnails map<text,text>,

tags set<varchar>,

added_date timestamp,

PRIMARY KEY (videoid)

);
CQL Type
tags set<varchar>
tags set<varchar>
CQL Type: For Ordering
Aggregates (Sort of)
*As of Cassandra 2.2
•Built-in: avg, min, max, count(<column name>)
•Runs on server
•Always use with partition key
User Defined Functions
CREATE FUNCTION maxI(current int, candidate int)

CALLED ON NULL INPUT

RETURNS int LANGUAGE java AS

'if (current == null) return candidate; else return Math.max(current, candidate);' ;



CREATE AGGREGATE maxAgg(int)

SFUNC maxI

STYPE int

INITCOND null;
CQL Type
Pure Function
SELECT maxAgg(temperature)

FROM raw_weather_data

WHERE wsid='10010:99999' 

AND year = 2005 AND month = 12 AND day = 1
Aggregate using
function over
partition
Lightweight Transactions
Don’t overwrite!
INSERT INTO videos (videoid, name, userid, description, location, location_type, preview_thumbnails, tags, added_date, metadata)

VALUES (06049cbb-dfed-421f-b889-5f649a0de1ed,'The data model is dead. Long live the data model.',
9761d3d7-7fbd-4269-9988-6cfd4e188678, 

'First in a three part series for Cassandra Data Modeling','http://www.youtube.com/watch?v=px6U2n74q3g',1,

{'YouTube':'http://www.youtube.com/watch?v=px6U2n74q3g'},{'cassandra','data model','relational','instruction'},

'2013-05-02 12:30:29’)
IF NOT EXISTS;
Lightweight Transactions
No-op. Don’t throw error
CREATE TABLE IF NOT EXISTS videos_by_tag (

tag text,

videoid uuid,

added_date timestamp,

name text,

preview_image_location text,

tagged_date timestamp,

PRIMARY KEY (tag, videoid)

);
Regular Update
UPDATE videos

SET name = 'The data model is dead. Long live the data model.'

WHERE id = 06049cbb-dfed-421f-b889-5f649a0de1ed;
Table Name
Fields to Update: Not in Primary Key
Primary Key
Lightweight Transactions
Don’t overwrite!
UPDATE videos

SET name = 'The data model is dead. Long live the data model.'

WHERE id = 06049cbb-dfed-421f-b889-5f649a0de1ed
IF userid = 9761d3d7-7fbd-4269-9988-6cfd4e188678;
Deleting Data
Delete
DELETE FROM videos

WHERE id = 06049cbb-dfed-421f-b889-5f649a0de1ed;
Table Name
Primary Key: Required
Expiring Data
Time To Live = TTL
INSERT INTO videos (videoid, name, userid, description, location, location_type, preview_thumbnails, tags, added_date, metadata)

VALUES (06049cbb-dfed-421f-b889-5f649a0de1ed,'The data model is dead. Long live the data model.',
9761d3d7-7fbd-4269-9988-6cfd4e188678, 

'First in a three part series for Cassandra Data Modeling','http://www.youtube.com/watch?v=px6U2n74q3g',1,

{'YouTube':'http://www.youtube.com/watch?v=px6U2n74q3g'},{'cassandra','data model','relational','instruction'},

'2013-05-02 12:30:29’)
USING TTL = 2592000
Expire Data: 30 Days
What about joins??
Oracle to Cassandra Core Concepts Guide Pt. 3
Tired of timeouts? Cursing your cursors? Join the distributed revolution and bring
your dev team into application nirvana. You won’t believe how easy it is to be code
complete on your next big project. We will show you how to lead your devs away
from the clutches of the DBA and be in control of their own data destiny. Discover
the methodology that will make your Cassandra project epic.
Stay tuned!
RachelP50 or PatrickM50- 50% off Priority Pass
RachelPCert or PatrickMCert- 25% Certification

More Related Content

What's hot (20)

PDF
DataStax Training – Everything you need to become a Cassandra Rockstar
DataStax
 
PDF
Data Pipelines with Spark & DataStax Enterprise
DataStax
 
PPTX
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd Known
DataStax
 
PPTX
Introducing DataStax Enterprise 4.7
DataStax
 
PPTX
Azure + DataStax Enterprise (DSE) Powers Office365 Per User Store
DataStax Academy
 
PDF
The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...
DataStax
 
PPTX
Productizing a Cassandra-Based Solution (Brij Bhushan Ravat, Ericsson) | C* S...
DataStax
 
PDF
Workshop - How to benchmark your database
ScyllaDB
 
PPTX
Webinar | Introducing DataStax Enterprise 4.6
DataStax
 
PDF
DataStax | DataStax Tools for Developers (Alex Popescu) | Cassandra Summit 2016
DataStax
 
PDF
Migration Best Practices: From RDBMS to Cassandra without a Hitch
DataStax Academy
 
PPT
Reporting from the Trenches: Intuit & Cassandra
DataStax
 
PDF
Cold Storage That Isn't Glacial (Joshua Hollander, Protectwise) | Cassandra S...
DataStax
 
PPTX
Making Every Drop Count: How i20 Addresses the Water Crisis with the IoT and ...
DataStax
 
PPTX
Webinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStax
DataStax
 
PDF
Building a Digital Bank
DataStax
 
PPTX
Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...
DataStax
 
PPTX
Webinar: DataStax Training - Everything you need to become a Cassandra Rockstar
DataStax
 
PDF
Keeping your application’s latency SLAs no matter what
ScyllaDB
 
PDF
Cassandra CLuster Management by Japan Cassandra Community
Hiromitsu Komatsu
 
DataStax Training – Everything you need to become a Cassandra Rockstar
DataStax
 
Data Pipelines with Spark & DataStax Enterprise
DataStax
 
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd Known
DataStax
 
Introducing DataStax Enterprise 4.7
DataStax
 
Azure + DataStax Enterprise (DSE) Powers Office365 Per User Store
DataStax Academy
 
The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...
DataStax
 
Productizing a Cassandra-Based Solution (Brij Bhushan Ravat, Ericsson) | C* S...
DataStax
 
Workshop - How to benchmark your database
ScyllaDB
 
Webinar | Introducing DataStax Enterprise 4.6
DataStax
 
DataStax | DataStax Tools for Developers (Alex Popescu) | Cassandra Summit 2016
DataStax
 
Migration Best Practices: From RDBMS to Cassandra without a Hitch
DataStax Academy
 
Reporting from the Trenches: Intuit & Cassandra
DataStax
 
Cold Storage That Isn't Glacial (Joshua Hollander, Protectwise) | Cassandra S...
DataStax
 
Making Every Drop Count: How i20 Addresses the Water Crisis with the IoT and ...
DataStax
 
Webinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStax
DataStax
 
Building a Digital Bank
DataStax
 
Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...
DataStax
 
Webinar: DataStax Training - Everything you need to become a Cassandra Rockstar
DataStax
 
Keeping your application’s latency SLAs no matter what
ScyllaDB
 
Cassandra CLuster Management by Japan Cassandra Community
Hiromitsu Komatsu
 

Viewers also liked (20)

PDF
Building a Mobile Data Platform with Cassandra - Apigee Under the Hood (Webcast)
Apigee | Google Cloud
 
PDF
Cassandra Summit 2014: Cassandra in Large Scale Enterprise Grade xPatterns De...
DataStax Academy
 
PDF
Cassandra Summit 2014: Social Media Security Company Nexgate Relies on Cassan...
DataStax Academy
 
PDF
Cassandra Summit 2014: META — An Efficient Distributed Data Hub with Batch an...
DataStax Academy
 
PDF
Apache Cassandra at Narmal 2014
DataStax Academy
 
PDF
Cassandra Summit 2014: A Train of Thoughts About Growing and Scalability — Bu...
DataStax Academy
 
PDF
Introduction to Dating Modeling for Cassandra
DataStax Academy
 
PPTX
Cassandra Summit 2014: Apache Cassandra at Telefonica CBS
DataStax Academy
 
PDF
Coursera's Adoption of Cassandra
DataStax Academy
 
PDF
Cassandra Summit 2014: Monitor Everything!
DataStax Academy
 
PDF
Production Ready Cassandra (Beginner)
DataStax Academy
 
PDF
Cassandra Summit 2014: The Cassandra Experience at Orange — Season 2
DataStax Academy
 
PDF
The Last Pickle: Distributed Tracing from Application to Database
DataStax Academy
 
PDF
New features in 3.0
DataStax Academy
 
PDF
Introduction to .Net Driver
DataStax Academy
 
PPTX
Spark Cassandra Connector: Past, Present and Furure
DataStax Academy
 
PDF
Playlists at Spotify
DataStax Academy
 
PPTX
Lessons Learned with Cassandra and Spark at the US Patent and Trademark Office
DataStax Academy
 
PPTX
Using Event-Driven Architectures with Cassandra
DataStax Academy
 
PDF
Signal Digital: The Skinny on Wide Rows
DataStax Academy
 
Building a Mobile Data Platform with Cassandra - Apigee Under the Hood (Webcast)
Apigee | Google Cloud
 
Cassandra Summit 2014: Cassandra in Large Scale Enterprise Grade xPatterns De...
DataStax Academy
 
Cassandra Summit 2014: Social Media Security Company Nexgate Relies on Cassan...
DataStax Academy
 
Cassandra Summit 2014: META — An Efficient Distributed Data Hub with Batch an...
DataStax Academy
 
Apache Cassandra at Narmal 2014
DataStax Academy
 
Cassandra Summit 2014: A Train of Thoughts About Growing and Scalability — Bu...
DataStax Academy
 
Introduction to Dating Modeling for Cassandra
DataStax Academy
 
Cassandra Summit 2014: Apache Cassandra at Telefonica CBS
DataStax Academy
 
Coursera's Adoption of Cassandra
DataStax Academy
 
Cassandra Summit 2014: Monitor Everything!
DataStax Academy
 
Production Ready Cassandra (Beginner)
DataStax Academy
 
Cassandra Summit 2014: The Cassandra Experience at Orange — Season 2
DataStax Academy
 
The Last Pickle: Distributed Tracing from Application to Database
DataStax Academy
 
New features in 3.0
DataStax Academy
 
Introduction to .Net Driver
DataStax Academy
 
Spark Cassandra Connector: Past, Present and Furure
DataStax Academy
 
Playlists at Spotify
DataStax Academy
 
Lessons Learned with Cassandra and Spark at the US Patent and Trademark Office
DataStax Academy
 
Using Event-Driven Architectures with Cassandra
DataStax Academy
 
Signal Digital: The Skinny on Wide Rows
DataStax Academy
 
Ad

Similar to Oracle to Cassandra Core Concepts Guide Pt. 2 (20)

PDF
Cassandra lesson learned - extended
Andrzej Ludwikowski
 
PDF
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax Academy
 
PDF
Introduction to Data Modeling with Apache Cassandra
DataStax Academy
 
PDF
Cassandra - lesson learned
Andrzej Ludwikowski
 
PDF
Introduction to data modeling with apache cassandra
Patrick McFadin
 
PDF
Cassandra Day Atlanta 2015: Data Modeling 101
DataStax Academy
 
PDF
Cassandra Day Chicago 2015: Apache Cassandra Data Modeling 101
DataStax Academy
 
PDF
Cassandra Day London 2015: Data Modeling 101
DataStax Academy
 
PPTX
Presentation
Dimitris Stripelis
 
PDF
Introduction to Data Modeling with Apache Cassandra
Luke Tillman
 
PDF
Cassandra Day SV 2014: Fundamentals of Apache Cassandra Data Modeling
DataStax Academy
 
PDF
Apache Cassandra & Data Modeling
Massimiliano Tomassi
 
PDF
Become a super modeler
Patrick McFadin
 
PDF
1 Dundee - Cassandra 101
Christopher Batey
 
PDF
Apache cassandra & apache spark for time series data
Patrick McFadin
 
PDF
Cassandra Community Webinar | Become a Super Modeler
DataStax
 
PDF
Cassandra Data Modeling
Ben Knear
 
PDF
Time series with Apache Cassandra - Long version
Patrick McFadin
 
PDF
Dublin Meetup: Cassandra anti patterns
Christopher Batey
 
PDF
Advanced Data Modeling with Apache Cassandra
DataStax Academy
 
Cassandra lesson learned - extended
Andrzej Ludwikowski
 
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax Academy
 
Introduction to Data Modeling with Apache Cassandra
DataStax Academy
 
Cassandra - lesson learned
Andrzej Ludwikowski
 
Introduction to data modeling with apache cassandra
Patrick McFadin
 
Cassandra Day Atlanta 2015: Data Modeling 101
DataStax Academy
 
Cassandra Day Chicago 2015: Apache Cassandra Data Modeling 101
DataStax Academy
 
Cassandra Day London 2015: Data Modeling 101
DataStax Academy
 
Presentation
Dimitris Stripelis
 
Introduction to Data Modeling with Apache Cassandra
Luke Tillman
 
Cassandra Day SV 2014: Fundamentals of Apache Cassandra Data Modeling
DataStax Academy
 
Apache Cassandra & Data Modeling
Massimiliano Tomassi
 
Become a super modeler
Patrick McFadin
 
1 Dundee - Cassandra 101
Christopher Batey
 
Apache cassandra & apache spark for time series data
Patrick McFadin
 
Cassandra Community Webinar | Become a Super Modeler
DataStax
 
Cassandra Data Modeling
Ben Knear
 
Time series with Apache Cassandra - Long version
Patrick McFadin
 
Dublin Meetup: Cassandra anti patterns
Christopher Batey
 
Advanced Data Modeling with Apache Cassandra
DataStax Academy
 
Ad

More from DataStax Academy (20)

PDF
Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
DataStax Academy
 
PPTX
Introduction to DataStax Enterprise Graph Database
DataStax Academy
 
PPTX
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
DataStax Academy
 
PPTX
Cassandra on Docker @ Walmart Labs
DataStax Academy
 
PDF
Cassandra 3.0 Data Modeling
DataStax Academy
 
PPTX
Cassandra Adoption on Cisco UCS & Open stack
DataStax Academy
 
PDF
Data Modeling for Apache Cassandra
DataStax Academy
 
PDF
Coursera Cassandra Driver
DataStax Academy
 
PDF
Production Ready Cassandra
DataStax Academy
 
PDF
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
DataStax Academy
 
PPTX
Cassandra @ Sony: The good, the bad, and the ugly part 1
DataStax Academy
 
PPTX
Cassandra @ Sony: The good, the bad, and the ugly part 2
DataStax Academy
 
PDF
Standing Up Your First Cluster
DataStax Academy
 
PDF
Real Time Analytics with Dse
DataStax Academy
 
PDF
Cassandra Core Concepts
DataStax Academy
 
PPTX
Enabling Search in your Cassandra Application with DataStax Enterprise
DataStax Academy
 
PPTX
Bad Habits Die Hard
DataStax Academy
 
PDF
Advanced Cassandra
DataStax Academy
 
PDF
Apache Cassandra and Drivers
DataStax Academy
 
PDF
Getting Started with Graph Databases
DataStax Academy
 
Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
DataStax Academy
 
Introduction to DataStax Enterprise Graph Database
DataStax Academy
 
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
DataStax Academy
 
Cassandra on Docker @ Walmart Labs
DataStax Academy
 
Cassandra 3.0 Data Modeling
DataStax Academy
 
Cassandra Adoption on Cisco UCS & Open stack
DataStax Academy
 
Data Modeling for Apache Cassandra
DataStax Academy
 
Coursera Cassandra Driver
DataStax Academy
 
Production Ready Cassandra
DataStax Academy
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
DataStax Academy
 
Cassandra @ Sony: The good, the bad, and the ugly part 1
DataStax Academy
 
Cassandra @ Sony: The good, the bad, and the ugly part 2
DataStax Academy
 
Standing Up Your First Cluster
DataStax Academy
 
Real Time Analytics with Dse
DataStax Academy
 
Cassandra Core Concepts
DataStax Academy
 
Enabling Search in your Cassandra Application with DataStax Enterprise
DataStax Academy
 
Bad Habits Die Hard
DataStax Academy
 
Advanced Cassandra
DataStax Academy
 
Apache Cassandra and Drivers
DataStax Academy
 
Getting Started with Graph Databases
DataStax Academy
 

Recently uploaded (20)

PDF
CIFDAQ Market Wrap for the week of 4th July 2025
CIFDAQ
 
PPTX
Designing_the_Future_AI_Driven_Product_Experiences_Across_Devices.pptx
presentifyai
 
PPTX
Future Tech Innovations 2025 – A TechLists Insight
TechLists
 
PDF
Kit-Works Team Study_20250627_한달만에만든사내서비스키링(양다윗).pdf
Wonjun Hwang
 
PPTX
MuleSoft MCP Support (Model Context Protocol) and Use Case Demo
shyamraj55
 
PDF
Newgen 2022-Forrester Newgen TEI_13 05 2022-The-Total-Economic-Impact-Newgen-...
darshakparmar
 
PPTX
The Project Compass - GDG on Campus MSIT
dscmsitkol
 
DOCX
Python coding for beginners !! Start now!#
Rajni Bhardwaj Grover
 
PPTX
Agentforce World Tour Toronto '25 - Supercharge MuleSoft Development with Mod...
Alexandra N. Martinez
 
PDF
Transforming Utility Networks: Large-scale Data Migrations with FME
Safe Software
 
PDF
Transcript: Book industry state of the nation 2025 - Tech Forum 2025
BookNet Canada
 
PDF
Newgen Beyond Frankenstein_Build vs Buy_Digital_version.pdf
darshakparmar
 
PPTX
Agentforce World Tour Toronto '25 - MCP with MuleSoft
Alexandra N. Martinez
 
PDF
“Squinting Vision Pipelines: Detecting and Correcting Errors in Vision Models...
Edge AI and Vision Alliance
 
PPT
Ericsson LTE presentation SEMINAR 2010.ppt
npat3
 
PDF
Bitcoin for Millennials podcast with Bram, Power Laws of Bitcoin
Stephen Perrenod
 
PDF
🚀 Let’s Build Our First Slack Workflow! 🔧.pdf
SanjeetMishra29
 
PDF
The 2025 InfraRed Report - Redpoint Ventures
Razin Mustafiz
 
PPTX
AI Penetration Testing Essentials: A Cybersecurity Guide for 2025
defencerabbit
 
PDF
AI Agents in the Cloud: The Rise of Agentic Cloud Architecture
Lilly Gracia
 
CIFDAQ Market Wrap for the week of 4th July 2025
CIFDAQ
 
Designing_the_Future_AI_Driven_Product_Experiences_Across_Devices.pptx
presentifyai
 
Future Tech Innovations 2025 – A TechLists Insight
TechLists
 
Kit-Works Team Study_20250627_한달만에만든사내서비스키링(양다윗).pdf
Wonjun Hwang
 
MuleSoft MCP Support (Model Context Protocol) and Use Case Demo
shyamraj55
 
Newgen 2022-Forrester Newgen TEI_13 05 2022-The-Total-Economic-Impact-Newgen-...
darshakparmar
 
The Project Compass - GDG on Campus MSIT
dscmsitkol
 
Python coding for beginners !! Start now!#
Rajni Bhardwaj Grover
 
Agentforce World Tour Toronto '25 - Supercharge MuleSoft Development with Mod...
Alexandra N. Martinez
 
Transforming Utility Networks: Large-scale Data Migrations with FME
Safe Software
 
Transcript: Book industry state of the nation 2025 - Tech Forum 2025
BookNet Canada
 
Newgen Beyond Frankenstein_Build vs Buy_Digital_version.pdf
darshakparmar
 
Agentforce World Tour Toronto '25 - MCP with MuleSoft
Alexandra N. Martinez
 
“Squinting Vision Pipelines: Detecting and Correcting Errors in Vision Models...
Edge AI and Vision Alliance
 
Ericsson LTE presentation SEMINAR 2010.ppt
npat3
 
Bitcoin for Millennials podcast with Bram, Power Laws of Bitcoin
Stephen Perrenod
 
🚀 Let’s Build Our First Slack Workflow! 🔧.pdf
SanjeetMishra29
 
The 2025 InfraRed Report - Redpoint Ventures
Razin Mustafiz
 
AI Penetration Testing Essentials: A Cybersecurity Guide for 2025
defencerabbit
 
AI Agents in the Cloud: The Rise of Agentic Cloud Architecture
Lilly Gracia
 

Oracle to Cassandra Core Concepts Guide Pt. 2

  • 1. ©2015 DataStax Confidential. Do not distribute without consent. @RachelPedreschi & @PatrickMcFadin Rachel Pedreschi & Patrick McFadin
 Evangelists for Apache Cassandra Oracle to Cassandra Core Concepts Guide Part 2: Data Model Strikes Back 1
  • 2. Where are we now? Pt. 1 - Transition from Oracle to Cassandra How does it work? What are the differences? Pt. 2 - Cassandra Data Model All the details on how to use Cassandra Query Language (CQL) Pt. 3 - Building a Cassandra Application Wrap it all up with a top down application design. Combining everything we’ve learned plus a more.
  • 3. Application hash(key) => token(43) replication factor = 3 80 10 3050 70 60 40 20
  • 4. “Static” Table CREATE TABLE videos (
 videoid uuid,
 userid uuid,
 name varchar,
 description varchar,
 location text,
 location_type int,
 preview_thumbnails map<text,text>,
 tags set<varchar>,
 added_date timestamp,
 PRIMARY KEY (videoid)
 ); Table Name Column Name Column CQL Type Primary Key Designation Partition Key
  • 7. Table Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 2 Column 2 Column 3 Column 4 Column 1 Column 2 Column 3 Column 4 Column 1 Column 2 Column 3 Column 4 Column 1 Column 2 Column 3 Column 4 Partition Key 2 Partition Key 2 Partition Key 2
  • 8. Keyspace Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 2 Column 2 Column 3 Column 4 Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 2 Column 2 Column 3 Column 4 Column 1 Partition Key 2 Column 2 Column 3 Column 4 Column 1 Partition Key 2 Column 2 Column 3 Column 4 Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 2 Column 2 Column 3 Column 4 Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 2 Column 2 Column 3 Column 4 Column 1 Partition Key 2 Column 2 Column 3 Column 4 Column 1 Partition Key 2 Column 2 Column 3 Column 4 Table 1 Table 2 Keyspace 1
  • 9. Insert INSERT INTO videos (videoid, name, userid, description, location, location_type, preview_thumbnails, tags, added_date, metadata)
 VALUES (06049cbb-dfed-421f-b889-5f649a0de1ed,'The data model is dead. Long live the data model.', 9761d3d7-7fbd-4269-9988-6cfd4e188678, 
 'First in a three part series for Cassandra Data Modeling','http://www.youtube.com/watch?v=px6U2n74q3g',1,
 {'YouTube':'http://www.youtube.com/watch?v=px6U2n74q3g'},{'cassandra','data model','relational','instruction'},
 '2013-05-02 12:30:29'); Table Name Fields Values Partition Key: Required
  • 10. Partition keys 06049cbb-dfed-421f-b889-5f649a0de1ed Murmur3 Hash Token = 7224631062609997448 873ff430-9c23-4e60-be5f-278ea2bb21bd Murmur3 Hash Token = -6804302034103043898 Consistent hash. 128 bit number between 2-63 and 264 INSERT INTO videos (videoid, name, userid, description)
 VALUES (06049cbb-dfed-421f-b889-5f649a0de1ed,'The data model is dead. Long live the data model.’, 9761d3d7-7fbd-4269-9988-6cfd4e188678, 'First in a three part series for Cassandra Data Modeling'); INSERT INTO videos (videoid, name, userid, description)
 VALUES (873ff430-9c23-4e60-be5f-278ea2bb21bd,'Become a Super Modeler’, 9761d3d7-7fbd-4269-9988-6cfd4e188678, 'Second in a three part series for Cassandra Data Modeling');
  • 11. “Dynamic” Table CREATE TABLE videos_by_tag (
 tag text,
 videoid uuid,
 added_date timestamp,
 name text,
 preview_image_location text,
 tagged_date timestamp,
 PRIMARY KEY (tag, videoid)
 ); Partition Key Clustering Column
  • 12. Primary key relationship PRIMARY KEY (tag,videoid)
  • 13. Primary key relationship Partition Key PRIMARY KEY (tag,videoid)
  • 14. Primary key relationship Partition Key Clustering Column PRIMARY KEY (tag,videoid)
  • 15. Primary key relationship Partition Key data model PRIMARY KEY (tag,videoid) Clustering Column
  • 16. -5.6 06049cbb-dfed-421f-b889-5f649a0de1ed Primary key relationship Partition Key 2013-05-16 16:50:002013-05-02 12:30:29 873ff430-9c23-4e60-be5f-278ea2bb21bd PRIMARY KEY (tag,videoid) Clustering Column data model 49f64d40-7d89-4890-b910-dbf923563a33 2013-06-11 11:00:00
  • 17. Select name | description | added_date
 ---------------------------------------------------+----------------------------------------------------------+--------------------------
 The data model is dead. Long live the data model. | First in a three part series for Cassandra Data Modeling | 2013-05-02 12:30:29-0700 SELECT name, description, added_date
 FROM videos
 WHERE videoid = 06049cbb-dfed-421f-b889-5f649a0de1ed; Fields Table Name Primary Key: Partition Key Required
  • 18. Controlling Order CREATE TABLE raw_weather_data (
 wsid text,
 year int,
 month int,
 day int,
 hour int,
 temperature double,
 PRIMARY KEY ((wsid), year, month, day, hour)
 ) WITH CLUSTERING ORDER BY (year DESC, month DESC, day DESC, hour DESC); INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
 VALUES (‘10010:99999’,2005,12,1,10,-5.6); INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
 VALUES (‘10010:99999’,2005,12,1,9,-5.1); INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
 VALUES (‘10010:99999’,2005,12,1,8,-4.9); INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
 VALUES (‘10010:99999’,2005,12,1,7,-5.3);
  • 19. Clustering 200510010:99999 12 1 10 200510010:99999 12 1 9 raw_weather_data -5.6 -5.1 200510010:99999 12 1 8 200510010:99999 12 1 7 -4.9 -5.3 Order By DESC
  • 20. Write Path Client INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
 VALUES (‘10010:99999’,2005,12,1,7,-5.3); year 1wsid 1 month 1 day 1 hour 1 year 2wsid 2 month 2 day 2 hour 2 Memtable SSTable SSTable SSTable SSTable Node Commit Log Data * Compaction * Temp Temp
  • 21. Storage Model - Logical View 2005:12:1:10 -5.6 2005:12:1:9 -5.1 2005:12:1:8 -4.9 10010:99999 10010:99999 10010:99999 wsid hour temperature 2005:12:1:7 -5.3 10010:99999 SELECT wsid, hour, temperature
 FROM raw_weather_data
 WHERE wsid=‘10010:99999’
 AND year = 2005 AND month = 12 AND day = 1;
  • 22. 2005:12:1:10 -5.6 -5.3-4.9-5.1 Storage Model - Disk Layout 2005:12:1:9 2005:12:1:8 10010:99999 2005:12:1:7 Merged, Sorted and Stored Sequentially SELECT wsid, hour, temperature
 FROM raw_weather_data
 WHERE wsid=‘10010:99999’
 AND year = 2005 AND month = 12 AND day = 1;
  • 23. 2005:12:1:10 -5.6 2005:12:1:11 -4.9 -5.3-4.9-5.1 Storage Model - Disk Layout 2005:12:1:9 2005:12:1:8 10010:99999 2005:12:1:7 Merged, Sorted and Stored Sequentially SELECT wsid, hour, temperature
 FROM raw_weather_data
 WHERE wsid=‘10010:99999’
 AND year = 2005 AND month = 12 AND day = 1;
  • 24. 2005:12:1:10 -5.6 2005:12:1:11 -4.9 -5.3-4.9-5.1 Storage Model - Disk Layout 2005:12:1:9 2005:12:1:8 10010:99999 2005:12:1:7 Merged, Sorted and Stored Sequentially SELECT wsid, hour, temperature
 FROM raw_weather_data
 WHERE wsid=‘10010:99999’
 AND year = 2005 AND month = 12 AND day = 1; 2005:12:1:12 -5.4
  • 25. Read Path Client SSTable SSTable SSTable Node Data SELECT wsid,hour,temperature
 FROM raw_weather_data
 WHERE wsid='10010:99999'
 AND year = 2005 AND month = 12 AND day = 1 
 AND hour >= 7 AND hour <= 10; year 1wsid 1 month 1 day 1 hour 1 year 2wsid 2 month 2 day 2 hour 2 Memtable Temp Temp
  • 26. Query patterns • Range queries • “Slice” operation on disk Single seek on disk 10010:99999 Partition key for locality SELECT wsid,hour,temperature
 FROM raw_weather_data
 WHERE wsid='10010:99999'
 AND year = 2005 AND month = 12 AND day = 1 
 AND hour >= 7 AND hour <= 10; 2005:12:1:10 -5.6 -5.3-4.9-5.1 2005:12:1:9 2005:12:1:8 2005:12:1:7
  • 27. Query patterns • Range queries • “Slice” operation on disk Programmers like this Sorted by event_time 2005:12:1:10 -5.6 2005:12:1:9 -5.1 2005:12:1:8 -4.9 10010:99999 10010:99999 10010:99999 weather_station hour temperature 2005:12:1:7 -5.3 10010:99999 SELECT weatherstation,hour,temperature FROM temperature WHERE weatherstation_id=‘10010:99999' AND year = 2005 AND month = 12 AND day = 1 AND hour >= 7 AND hour <= 10;
  • 28. Other New and Not-so-New-but-different things
  • 29. Collections Set tags set<varchar> CQL Type: For Ordering Column Name CREATE TABLE videos (
 videoid uuid,
 userid uuid,
 name varchar,
 description varchar,
 location text,
 location_type int,
 preview_thumbnails map<text,text>,
 tags set<varchar>,
 added_date timestamp,
 PRIMARY KEY (videoid)
 );
  • 30. Collections Set List Column Name Column Name CQL Type CREATE TABLE videos (
 videoid uuid,
 userid uuid,
 name varchar,
 description varchar,
 location text,
 location_type int,
 preview_thumbnails map<text,text>,
 tags set<varchar>,
 added_date timestamp,
 PRIMARY KEY (videoid)
 ); tags set<varchar> CQL Type: For Ordering tags set<varchar>
  • 31. Collections Set List Map preview_thumbnails map<text,text> Column Name Column Name CQL Key Type CQL Value Type Column Name CREATE TABLE videos (
 videoid uuid,
 userid uuid,
 name varchar,
 description varchar,
 location text,
 location_type int,
 preview_thumbnails map<text,text>,
 tags set<varchar>,
 added_date timestamp,
 PRIMARY KEY (videoid)
 ); CQL Type tags set<varchar> tags set<varchar> CQL Type: For Ordering
  • 32. Aggregates (Sort of) *As of Cassandra 2.2 •Built-in: avg, min, max, count(<column name>) •Runs on server •Always use with partition key
  • 33. User Defined Functions CREATE FUNCTION maxI(current int, candidate int)
 CALLED ON NULL INPUT
 RETURNS int LANGUAGE java AS
 'if (current == null) return candidate; else return Math.max(current, candidate);' ;
 
 CREATE AGGREGATE maxAgg(int)
 SFUNC maxI
 STYPE int
 INITCOND null; CQL Type Pure Function SELECT maxAgg(temperature)
 FROM raw_weather_data
 WHERE wsid='10010:99999' 
 AND year = 2005 AND month = 12 AND day = 1 Aggregate using function over partition
  • 34. Lightweight Transactions Don’t overwrite! INSERT INTO videos (videoid, name, userid, description, location, location_type, preview_thumbnails, tags, added_date, metadata)
 VALUES (06049cbb-dfed-421f-b889-5f649a0de1ed,'The data model is dead. Long live the data model.', 9761d3d7-7fbd-4269-9988-6cfd4e188678, 
 'First in a three part series for Cassandra Data Modeling','http://www.youtube.com/watch?v=px6U2n74q3g',1,
 {'YouTube':'http://www.youtube.com/watch?v=px6U2n74q3g'},{'cassandra','data model','relational','instruction'},
 '2013-05-02 12:30:29’) IF NOT EXISTS;
  • 35. Lightweight Transactions No-op. Don’t throw error CREATE TABLE IF NOT EXISTS videos_by_tag (
 tag text,
 videoid uuid,
 added_date timestamp,
 name text,
 preview_image_location text,
 tagged_date timestamp,
 PRIMARY KEY (tag, videoid)
 );
  • 36. Regular Update UPDATE videos
 SET name = 'The data model is dead. Long live the data model.'
 WHERE id = 06049cbb-dfed-421f-b889-5f649a0de1ed; Table Name Fields to Update: Not in Primary Key Primary Key
  • 37. Lightweight Transactions Don’t overwrite! UPDATE videos
 SET name = 'The data model is dead. Long live the data model.'
 WHERE id = 06049cbb-dfed-421f-b889-5f649a0de1ed IF userid = 9761d3d7-7fbd-4269-9988-6cfd4e188678;
  • 39. Delete DELETE FROM videos
 WHERE id = 06049cbb-dfed-421f-b889-5f649a0de1ed; Table Name Primary Key: Required
  • 40. Expiring Data Time To Live = TTL INSERT INTO videos (videoid, name, userid, description, location, location_type, preview_thumbnails, tags, added_date, metadata)
 VALUES (06049cbb-dfed-421f-b889-5f649a0de1ed,'The data model is dead. Long live the data model.', 9761d3d7-7fbd-4269-9988-6cfd4e188678, 
 'First in a three part series for Cassandra Data Modeling','http://www.youtube.com/watch?v=px6U2n74q3g',1,
 {'YouTube':'http://www.youtube.com/watch?v=px6U2n74q3g'},{'cassandra','data model','relational','instruction'},
 '2013-05-02 12:30:29’) USING TTL = 2592000 Expire Data: 30 Days
  • 42. Oracle to Cassandra Core Concepts Guide Pt. 3 Tired of timeouts? Cursing your cursors? Join the distributed revolution and bring your dev team into application nirvana. You won’t believe how easy it is to be code complete on your next big project. We will show you how to lead your devs away from the clutches of the DBA and be in control of their own data destiny. Discover the methodology that will make your Cassandra project epic. Stay tuned!
  • 43. RachelP50 or PatrickM50- 50% off Priority Pass RachelPCert or PatrickMCert- 25% Certification