SlideShare a Scribd company logo
CQL3 &
Data Modeling 101
with Apache Cassandra
San Diego Cassandra Meetup
Feb 27, 2014
Not how you’ve done it before

In the beginning

There was
the Row, and
the Column
And the Row was fast to
ïŹnd and scale,
And the Column was
fast to order.
Cassandra Properties
C*
‱

Column Oriented

‱

Log Structured

‱

Distributed Database
Column Oriented
‱

Columns actually hold the
data

‱

Key/Value pair

‱

Name can be used to store
meaning as well
Distributed Database
‱

Rows are used to distribute

‱

C* pulls the entire row into
memory

‱

Can pull out individual parts
or write to individual parts, but
it’s still considered together
Log Structured Updates
‱

Commitlog and sstables are
log structured

‱

Oriented around appending
(streaming at a know location)

‱

==> Writes quickly

‱

And you want to avoid
rewrites
Random Reads
‱

Data is scattered around the
store (have to get location and
random read to look it up)

‱

Some indexing, and hopefully
it’s in the vfs page cache, but
still.

‱

==> Reads “slower”
General Rules of Thumb
‱

De-normalize Everything

‱

Duplicate your data

‱

Organize it for reading
Containers
Keyspace
‱

For modeling, not much

‱

All data lives inside of
Keyspaces
Columnfamily
‱

aka Table

‱

Grouping of similar data

‱

Has unique key/row space

‱

Where some structure is
applied
Row
‱

Unique inside of a column
family

‱

Key/Value where the Value is
all of the columns in the row

‱

Can handle some additional
meaning to the row name
‱

Typically “bucketing”
Column
‱

Holds data values

‱

Key/Value

‱

Can have meaning in the
column name as well
Thrift Interface
‱

Operates on the raw rows and
columns

‱

Many different language
drivers

‱

Can use cassandra-cli to do
this on the command line
Data Patterns
Users CF
mac@mac
.com

NAME

TWITTER

TAGS

mac

@macmceniry

admin,super,cool

jsmith@mac
.com

NAME

ICQ

Employer

Hobby

John

89403270

Smithco

Miniature Horses

bb@example
.com

NAME

IRC

Bobtholomew

bb@DAL

NAME

TWITTER

Food

TAGS

Elizabeth

@liz

Cheesecake

admin

NAME

IRC

Steven

steve@DARK

liz@example
.com
steve@my
.net
Lookup by chat handle
ICQ CF

IRC CF

EMAIL

89403270

EMAIL

bb@DAL
bb@example
.com

jsmith@mac.com

EMAIL

steve@
DARK
steve@my
.net
Lookup by chat handle
ICQ CF
EMAIL

IRC CF
Name

89403270

EMAIL

Name

bb@example
.com

Bobtholomew

EMAIL

Name

steve@my
.net

Steve

bb@DAL
jsmith@mac.com

John

steve@
DARK
HandleCF
EMAIL

NAME

jsmith@mac.com

John

EMAIL

NAME

bb@example.com

Bobtholomew

EMAIL

NAME

steve@my.net

Steven

89403270

bb@DAL

steve@DARK
HandleCF
TYPE

EMAIL

NAME

ICQ

jsmith@mac.com

John

TYPE

EMAIL

NAME

IRC

bb@example.com

Bobtholomew

TYPE

EMAIL

NAME

IRC

steve@my.net

Steven

89403270

bb@DAL

steve@DARK
How do I create these?
[default@userdb] create column family usersCF;
5ecec19a-3a43-3490-8c9a-3eb2901e2e97
Waiting for schema agreement...
... schemas agree across the cluster
[default@userdb] create column family handleCF;
df82135c-eb1f-3abf-b9df-02c605d571d5
Waiting for schema agreement...
... schemas agree across the cluster
How do I insert data?
[default@userdb] set handleCF[utf8(‘bb@DAL’)]

 [utf8(‘NAME’)] = utf8('Bobtholomew');
Value inserted.
Elapsed time: 22 msec(s).
[default@userdb] set handleCF[utf8(‘bb@DAL')]

 [utf8(‘EMAIL’)] = utf8(‘bb@example.com’);
Value inserted.
Elapsed time: 3.43 msec(s).
Users CF - TAGS
mac@mac
.com

NAME

TWITTER

TAGS

mac

@macmceniry

admin,super,cool

jsmith@mac
.com

NAME

ICQ

Employer

Hobby

John

89403270

Smithco

Miniature Horses

bb@example
.com

NAME

IRC

Bobtholomew

bb@DAL

NAME

TWITTER

Food

TAGS

Elizabeth

@liz

Cheesecake

admin

NAME

IRC

Steven

steve@DARK

liz@example
.com
steve@my
.net
Users CF - TAGS
mac@mac
.com

jsmith@mac
.com

bb@example
.com

liz@example
.com

steve@my
.net

NAME

TWITTER

TAGS:admin

TAGS:cool

mac

@macmceniry

NAME

ICQ

Employer

Hobby

John

89403270

Smithco

Miniature Horses

NAME

IRC

Bobtholomew

bb@DAL

NAME

TWITTER

Food

TAGS:admin

Elizabeth

@liz

Cheesecake

NAME

IRC

Steven

steve@DARK

TAGS:super
“Types”
[default@userdb] set handleCF[utf8(‘bb@DAL’)]
!
‱

Key Validator

‱

(Column) Comparator

‱

(Column Value) Default Validator, Metadata

‱

BytesType, AsciiType, UTF8Type, IntegerType,
Int32Type, LongType, UUIDType, TimeUUIDType,
DateType, BooleanType, FloatType, DoubleType,
DecimalType, CounterColumnType (, CompositeTypes)
What’s in a name?
‱

Can use row names and
column names to add meaning

‱

Row name meaning creates a
new distribution bin

‱

Column name meaning can
create a data hierarchy

‱

No real change to the column
family creation in the thrift
interface (well, types
depending on what you’re
doing)
EventCF
mac@mac
.com:
20140203

08:10:15

08:15:15

09:10:15

join

update

logout

mac@mac
.com:
20140204

08:11:23

08:14:57

18:45:12

18:50:52

19:01:29

login

logout

login

logout

logout

mac@mac
.com:
20140205

09:23:23

09:57:44

login

logout

liz@example
.com:
20140203

11:22:33

11:44:55

22:10:05

22:52:02

login

logout

login

logout

liz@example
.com:
20140205

08:11:23

08:14:57

login

logout
That was then, this is

Now
‱

Same underlying structure none of that has changed
‱

‱

‱

Rows - reference quickly use for searching
Columns - scan quickly user for ordering

But now have usage patterns
‱

Some have been codiïŹed
into CQL
CQL
‱

Thrift alternative

‱

Simpler API

‱

Hides the structure of the
internal storage

‱

3 generations
‱

Only looking at CQL3 here

‱

cqlsh [-3]
How does handle look here?
cqlsh:userdb> CREATE TABLE handles (

 handlename VARCHAR,

 email VARCHAR,

 name VARCHAR,

 PRIMARY KEY (handlename)

 );
cqlsh:userdb> INSERT INTO handles

 (handlename, email, name) VALUES

 (‘bb@DAL’, ‘bb@example.com’, ‘Bobtholomew’);
handles Table
email

name

bb@example.com

Bobtholomew

bb@DAL
How does handle look here?
cqlsh:userdb> SELECT * FROM handles;
handlename | email
| name
————————————|————————————————|—————————————
bb@DAL | bb@example.com | Bobtholomew
cqlsh:userdb> SELECT * FROM handles WHERE

 handlename = ‘bb@DAL’;
handlename | email
| name
————————————|————————————————|—————————————
bb@DAL | bb@example.com | Bobtholomew
How do I change it?
cqlsh:userdb> UPDATE handles SET email=‘none’

 WHERE handlename = ‘bb@DAL’;
cqlsh:userdb> SELECT * FROM handles;
handlename | email
| name
————————————|————————————————|—————————————
bb@DAL |
none | Bobtholomew
upsert
‱

Update instead of Insert
‱

‱

Does the same thing (as
long as it’s not a key)

Insert instead of Update
‱

Overwrites data if it’s
already there
What about our event
buckets from earlier?
‱

Can do the same thing

‱

Creating a composite key
‱

‱

USERNAME:DATE

Creating a composite column
‱

hh:mm:ss
EventCF
mac@mac
.com:
20140203

08:10:15

08:15:15

09:10:15

join

update

logout

mac@mac
.com:
20140204

08:11:23

08:14:57

18:45:12

18:50:52

19:01:29

login

logout

login

logout

logout

mac@mac
.com:
20140205

09:23:23

09:57:44

login

logout

liz@example
.com:
20140203

11:22:33

11:44:55

22:10:05

22:52:02

login

logout

login

logout

liz@example
.com:
20140205

08:11:23

08:14:57

login

logout
cqlsh:userdb> CREATE TABLE events (

 username VARCHAR,

 d VARCHAR,

 hr INT,

 min INT,

 sec INT,

 event VARCHAR,

 PRIMARY KEY ( (username,d), hr, min, sec ) );

 PRIMARY KEY ( (username,d), hr, min, sec ) );

ROW NAME
(C* 1.2)

COLUMN NAME
Tags
‱

CQL has collections
‱

map, list, set

‱

Collections are build similar to
small/special composite
columns

‱

Can add to our existing
handle table
cqlsh:userdb> ALTER TABLE handles ADD tags SET;
cqlsh:userdb> UPDATE TABLE handles

 SET tags = (‘admin’, ‘foo’);
email

name

bb@example.com

Bobtholomew

bb@DAL

tags:admin

tags:foo
Design the data model so
that it’s idempotent (eBay)
‱

Counter versus Collection (what question is
being asked?)
Count
100
Count
200
Count
300

Likes A
Likes B
Likes C
Likes A
Likes B
Likes C

+user11
1393287359
+user12
1393287359
+user11
1393287359

+user12
1393280912

-user11
1393281942

+user13
1393212345

1393287100
+user12
1393287100

1393287100
+user13
1393287100

1393287100
+user14
1393287100
Go Forth and Model!
Thank You!

PS
 Sony Network is hiring!

More Related Content

Similar to CQL3 and Data Modeling 101 with Apache Cassandra (20)

PPT
My sql presentation
Nikhil Jain
 
PPTX
Rails, Postgres, Angular, and Bootstrap: The Power Stack
David Copeland
 
PPT
98765432345671223Intro-to-PostgreSQL.ppt
HastavaramDineshKuma
 
PPT
Diva10
diva23
 
PPTX
Training on Microsoft SQL Server(older version).pptx
naibedyakar00
 
PPT
MySql slides (ppt)
webhostingguy
 
PDF
Coffeescript: An Opinionated Introduction
Joe Fleming
 
PPT
Intro to tsql unit 1
Syed Asrarali
 
PPT
Intro To TSQL - Unit 1
iccma
 
PPT
mysqlHiep.ppt
webhostingguy
 
PPT
MySQL Database System Hiep Dinh
webhostingguy
 
PPT
asal12 sqliii
Hannif Mardani
 
PPTX
SQL
Shahriar Robbani
 
ODP
Meetup cassandra for_java_cql
zznate
 
PPTX
DNN Database Tips & Tricks
Will Strohl
 
ODP
Web Developement Workshop (Oct 2009) Slides
Manish Sinha
 
KEY
Using Regular Expressions and Staying Sane
Carl Brown
 
PDF
225523359001djcj4_DBMS_LAB_THEORY_DML.pdf
sahilurrahemankhan
 
PDF
1.1 Intro to WinDDI.pdf
ssuser8b6c85
 
PPTX
Mysql
lotlot
 
My sql presentation
Nikhil Jain
 
Rails, Postgres, Angular, and Bootstrap: The Power Stack
David Copeland
 
98765432345671223Intro-to-PostgreSQL.ppt
HastavaramDineshKuma
 
Diva10
diva23
 
Training on Microsoft SQL Server(older version).pptx
naibedyakar00
 
MySql slides (ppt)
webhostingguy
 
Coffeescript: An Opinionated Introduction
Joe Fleming
 
Intro to tsql unit 1
Syed Asrarali
 
Intro To TSQL - Unit 1
iccma
 
mysqlHiep.ppt
webhostingguy
 
MySQL Database System Hiep Dinh
webhostingguy
 
asal12 sqliii
Hannif Mardani
 
Meetup cassandra for_java_cql
zznate
 
DNN Database Tips & Tricks
Will Strohl
 
Web Developement Workshop (Oct 2009) Slides
Manish Sinha
 
Using Regular Expressions and Staying Sane
Carl Brown
 
225523359001djcj4_DBMS_LAB_THEORY_DML.pdf
sahilurrahemankhan
 
1.1 Intro to WinDDI.pdf
ssuser8b6c85
 
Mysql
lotlot
 

More from Chris McEniry (8)

PDF
Evolving for Kubernetes
Chris McEniry
 
PDF
On the Edge Systems Administration with Golang
Chris McEniry
 
PDF
LISA2017 Kubernetes: Hit the Ground Running
Chris McEniry
 
PDF
LISA2017 Big Three Cloud Networking
Chris McEniry
 
PDF
Go for SysAdmins - LISA 2015
Chris McEniry
 
PDF
OSCON2014 : Quick Introduction to System Tools Programming with Go
Chris McEniry
 
PDF
Intro to linux performance analysis
Chris McEniry
 
PDF
Value streammapping cascadiait2014-mceniry
Chris McEniry
 
Evolving for Kubernetes
Chris McEniry
 
On the Edge Systems Administration with Golang
Chris McEniry
 
LISA2017 Kubernetes: Hit the Ground Running
Chris McEniry
 
LISA2017 Big Three Cloud Networking
Chris McEniry
 
Go for SysAdmins - LISA 2015
Chris McEniry
 
OSCON2014 : Quick Introduction to System Tools Programming with Go
Chris McEniry
 
Intro to linux performance analysis
Chris McEniry
 
Value streammapping cascadiait2014-mceniry
Chris McEniry
 
Ad

Recently uploaded (20)

PPTX
The Project Compass - GDG on Campus MSIT
dscmsitkol
 
PDF
How do you fast track Agentic automation use cases discovery?
DianaGray10
 
PDF
Peak of Data & AI Encore AI-Enhanced Workflows for the Real World
Safe Software
 
PDF
“Voice Interfaces on a Budget: Building Real-time Speech Recognition on Low-c...
Edge AI and Vision Alliance
 
PDF
ICONIQ State of AI Report 2025 - The Builder's Playbook
Razin Mustafiz
 
PDF
Dev Dives: Accelerating agentic automation with Autopilot for Everyone
UiPathCommunity
 
PDF
UiPath DevConnect 2025: Agentic Automation Community User Group Meeting
DianaGray10
 
PDF
Software Development Company Keene Systems, Inc (1).pdf
Custom Software Development Company | Keene Systems, Inc.
 
PDF
Survival Models: Proper Scoring Rule and Stochastic Optimization with Competi...
Paris Women in Machine Learning and Data Science
 
PDF
What’s my job again? Slides from Mark Simos talk at 2025 Tampa BSides
Mark Simos
 
PPTX
Wondershare Filmora Crack Free Download 2025
josanj305
 
PDF
Next Generation AI: Anticipatory Intelligence, Forecasting Inflection Points ...
dleka294658677
 
PPTX
Digital Circuits, important subject in CS
contactparinay1
 
PDF
Go Concurrency Real-World Patterns, Pitfalls, and Playground Battles.pdf
Emily Achieng
 
PPTX
Manual Testing for Accessibility Enhancement
Julia Undeutsch
 
PDF
“ONNX and Python to C++: State-of-the-art Graph Compilation,” a Presentation ...
Edge AI and Vision Alliance
 
PDF
Automating Feature Enrichment and Station Creation in Natural Gas Utility Net...
Safe Software
 
PPTX
Securing Model Context Protocol with Keycloak: AuthN/AuthZ for MCP Servers
Hitachi, Ltd. OSS Solution Center.
 
PDF
🚀 Let’s Build Our First Slack Workflow! 🔧.pdf
SanjeetMishra29
 
PPTX
Mastering ODC + Okta Configuration - Chennai OSUG
HathiMaryA
 
The Project Compass - GDG on Campus MSIT
dscmsitkol
 
How do you fast track Agentic automation use cases discovery?
DianaGray10
 
Peak of Data & AI Encore AI-Enhanced Workflows for the Real World
Safe Software
 
“Voice Interfaces on a Budget: Building Real-time Speech Recognition on Low-c...
Edge AI and Vision Alliance
 
ICONIQ State of AI Report 2025 - The Builder's Playbook
Razin Mustafiz
 
Dev Dives: Accelerating agentic automation with Autopilot for Everyone
UiPathCommunity
 
UiPath DevConnect 2025: Agentic Automation Community User Group Meeting
DianaGray10
 
Software Development Company Keene Systems, Inc (1).pdf
Custom Software Development Company | Keene Systems, Inc.
 
Survival Models: Proper Scoring Rule and Stochastic Optimization with Competi...
Paris Women in Machine Learning and Data Science
 
What’s my job again? Slides from Mark Simos talk at 2025 Tampa BSides
Mark Simos
 
Wondershare Filmora Crack Free Download 2025
josanj305
 
Next Generation AI: Anticipatory Intelligence, Forecasting Inflection Points ...
dleka294658677
 
Digital Circuits, important subject in CS
contactparinay1
 
Go Concurrency Real-World Patterns, Pitfalls, and Playground Battles.pdf
Emily Achieng
 
Manual Testing for Accessibility Enhancement
Julia Undeutsch
 
“ONNX and Python to C++: State-of-the-art Graph Compilation,” a Presentation ...
Edge AI and Vision Alliance
 
Automating Feature Enrichment and Station Creation in Natural Gas Utility Net...
Safe Software
 
Securing Model Context Protocol with Keycloak: AuthN/AuthZ for MCP Servers
Hitachi, Ltd. OSS Solution Center.
 
🚀 Let’s Build Our First Slack Workflow! 🔧.pdf
SanjeetMishra29
 
Mastering ODC + Okta Configuration - Chennai OSUG
HathiMaryA
 
Ad

CQL3 and Data Modeling 101 with Apache Cassandra