Brian O'Neill, Lead Architect, Health Market Science




                                                bone@alumni.brown.edu
                                                @boneill42
 Background
 Setup
 Data Model / Schema
 Naughty List (Astyanax)
 Toy List (CQL)
Our Problem




 Good, bad doctors? Dead doctors?
 Prescriber eligibility and remediation.
The World-Wide Globally Scalable Naughty List!
   How about a Naughty and
    Nice list for Santa?

   1.9 billion children
     That will fit in a single row!


   Queries to support:
     Children can log in and check
      their standing.
     Santa can find nice children
      by country, state or zip.
C*ollege Credit: Creating Your First App in Java with Cassandra
Installation
   As easy as…
     Download
     http://cassandra.apache.org/download/

     Uncompress
     tar -xvzf apache-cassandra-1.2.0-beta3-bin.tar.gz

     Run
     bin/cassandra -f
      (-f puts it in foreground)
Configuration
   conf/cassandra.yaml
start_native_transport: true # CHANGE THIS TO TRUE
commitlog_directory: /var/lib/cassandra/commitlog



   conf/log4j-server.properties
log4j.appender.R.File=/var/log/cassandra/system.log
Data Model
 Schema (a.k.a. Keyspace)
 Table (a.k.a. Column Family)
 Row
     Has an arbitrary number of columns
     Validator for keys (e.g. UTF8Type)
   Column
     Validator for values and keys
     Comparator for keys (e.g. DateType or BYOC)

    (http://www.youtube.com/watch?v=bKfND4woylw)
Distributed Architecture
   Nodes form a token ring.

   Nodes partition the ring by initial token
     initial_token: (in cassandra.yaml)


   Partitioners map row keys to tokens.
     Usually randomly, to evenly distribute the data


   All columns for a row are stored together on disk
    in sorted order.
Visually
Row     Hash   Token/Hash Range : 0-99
Alice   50
Bob     3
Eve     15




                                  (1-33)
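
The ring lookup can be sketched in plain Java with a TreeMap. This is a minimal in-memory illustration, not Cassandra's implementation: the three node names and their tokens (33, 66, 99) are assumptions chosen to fit the 0-99 range above, and the hash values are the ones listed on the slide rather than the output of a real partitioner.

```java
import java.util.Map;
import java.util.TreeMap;

/** In-memory sketch of a token ring over the 0-99 range (illustrative only). */
final class TokenRingSketch {
    // token -> node; each node owns the range (previous token, its own token].
    // Node names and tokens are assumptions, not taken from the slides.
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    TokenRingSketch() {
        ring.put(33, "node1");
        ring.put(66, "node2");
        ring.put(99, "node3");
    }

    /** Returns the node that owns the given hash, wrapping around the ring. */
    String nodeFor(int hash) {
        Map.Entry<Integer, String> owner = ring.ceilingEntry(hash);
        // Past the highest token: wrap around to the first node on the ring.
        return (owner != null ? owner : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        TokenRingSketch sketch = new TokenRingSketch();
        System.out.println("Alice (50) -> " + sketch.nodeFor(50));
        System.out.println("Bob   (3)  -> " + sketch.nodeFor(3));
        System.out.println("Eve   (15) -> " + sketch.nodeFor(15));
    }
}
```

With these assumed tokens, Bob and Eve land on the same node while Alice lands on the next one, which is the kind of imbalance a random partitioner is meant to even out over many keys.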
Java Interpretation
 Each table is a Distributed HashMap
 Each row is a SortedMap.

Cassandra provides a massively scalable version of:
HashMap<rowKey, SortedMap<columnKey, columnValue>>


   Implications:
     Direct row fetch is fast.
     Searching a range of rows can be costly.
     Searching a range of columns is cheap.
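
The analogy above can be written out as a small single-process sketch. The class and method names here are hypothetical and exist only for illustration; nothing is distributed or persisted.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** In-memory analogy for a table: a hash map of rows, each row a sorted map of columns. */
final class TableSketch {
    private final Map<String, SortedMap<String, String>> rows = new HashMap<>();

    /** Writes one column into a row, creating the row on first use. */
    void put(String rowKey, String columnKey, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(columnKey, value);
    }

    /** Direct row fetch: a single hash lookup. */
    SortedMap<String, String> row(String rowKey) {
        return rows.getOrDefault(rowKey, new TreeMap<>());
    }

    /** Column range within one row: cheap because the columns are kept sorted. */
    SortedMap<String, String> columnSlice(String rowKey, String fromInclusive, String toExclusive) {
        return row(rowKey).subMap(fromInclusive, toExclusive);
    }
}
```

There is deliberately no method for scanning a range of row keys: a HashMap has no key order, which mirrors why row-range searches are the costly case.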
Two Tables
 Children     Table
     Store all the children in the world.
     One row per child.
     One column per attribute.


   NaughtyOrNice Table
     Supports the queries we anticipate
     Wide-Row Strategy
Details of the NaughtyOrNice List
   One row per standing:country
     Ensures all children in a country are grouped together
      on disk.

   One column per child using a compound key
     Ensures the columns are sorted to support our search
      at varying levels of granularity
      ○ e.g. All nice children in the US.
      ○ e.g. All naughty children in PA.
Visually
   (1) Go to the row.
   (2) Get the column slice.

   Node 1   Nice:USA
              CA:94333:johny.b.good
              CA:94333:richie.rich
   Node 2   Nice:IRL
              D:EI33:collin.oneill
              D:EI33:owen.oneill
   Node 3   Naughty:USA
              CA:94111:bart.simpson
              CA:94222:dennis.menace
              PA:18964:michael.myers

   Watch out for:
     • Hot spotting
     • Unbalanced clusters
Our Schema
   bin/cqlsh -3
       CREATE KEYSPACE northpole WITH replication = {'class':'SimpleStrategy',
        'replication_factor':1};

       create table children ( childId varchar, firstName varchar, lastName varchar, timezone varchar,
        country varchar, state varchar, zip varchar, primary key (childId ) ) WITH COMPACT STORAGE;

       create table naughtyOrNiceList ( standingByCountry varchar, state varchar, zip varchar,
        childId varchar, primary key (standingByCountry, state, zip, childId) );




   bin/cassandra-cli
     (the “old school” interface)
The CQL->Data Model Rules
   The first component of the primary key becomes the row key.

   Subsequent components of the primary key
    form a composite column name.

   One column is then written for each
    non-primary-key column.
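
These rules can be sketched as two small functions. This is an illustration of the mapping only, with hypothetical names; the ':' separator is simply how cassandra-cli displays composite components, and treating the last component as the CQL column name (empty when a table has no non-key columns) is an assumption based on the CLI output in this deck.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of how one CQL row maps onto a storage row key and a composite column name. */
final class CqlMappingSketch {
    /** Rule 1: the first primary key component is used as the row key. */
    static String rowKey(List<String> primaryKeyValues) {
        return primaryKeyValues.get(0);
    }

    /**
     * Rules 2 and 3: the remaining primary key components, followed by the
     * CQL column name, form the composite column name.
     */
    static String columnName(List<String> primaryKeyValues, String cqlColumn) {
        List<String> parts =
                new ArrayList<>(primaryKeyValues.subList(1, primaryKeyValues.size()));
        parts.add(cqlColumn);
        return String.join(":", parts);
    }
}
```

For the primary key values naughty:USA, CA, 94111, bart.simpson and an empty column name, the sketch yields the row key naughty:USA and the column name CA:94111:bart.simpson: (with the trailing separator).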
CQL View
cqlsh:northpole> select * from naughtyornicelist ;

 standingbycountry | state | zip   | childid
-------------------+-------+-------+---------------
       naughty:USA |    CA | 94111 |  bart.simpson
       naughty:USA |    CA | 94222 | dennis.menace
          nice:IRL |     D |  EI33 | collin.oneill
          nice:IRL |     D |  EI33 |   owen.oneill
          nice:USA |    CA | 94333 |  johny.b.good
          nice:USA |    CA | 94333 |   richie.rich
CLI View
[default@northpole] list naughtyornicelist;
Using default limit of 100
Using default column limit of 100
-------------------
RowKey: naughty:USA
=> (column=CA:94111:bart.simpson:, value=, timestamp=1355168971612000)
=> (column=CA:94222:dennis.menace:, value=, timestamp=1355168971614000)
-------------------
RowKey: nice:IRL
=> (column=D:EI33:collin.oneill:, value=, timestamp=1355168971604000)
=> (column=D:EI33:owen.oneill:, value=, timestamp=1355168971601000)
-------------------
RowKey: nice:USA
=> (column=CA:94333:johny.b.good:, value=, timestamp=1355168971610000)
=> (column=CA:94333:richie.rich:, value=, timestamp=1355168971606000)
Data Model Implications
select * from children where childid='owen.oneill';

select * from naughtyornicelist where childid='owen.oneill';
Bad Request:

select * from naughtyornicelist where
standingbycountry='nice:IRL' and state='D' and zip='EI33'
and childid='owen.oneill';
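
Why the second query is rejected while the third is accepted can be sketched as a rule check. This is a deliberately simplified model, not Cassandra's actual validation: it ignores secondary indexes, range relations and ALLOW FILTERING, and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Set;

/** Simplified check of whether a WHERE clause can be served from the primary key. */
final class WhereClauseSketch {
    /**
     * The row key (first primary key component) must be restricted, and the
     * restricted clustering columns must form a prefix of the declared order.
     */
    static boolean isServable(List<String> primaryKey, Set<String> restricted) {
        if (!restricted.contains(primaryKey.get(0))) {
            return false; // without the row key there is no row to go to
        }
        boolean gapSeen = false;
        for (String column : primaryKey.subList(1, primaryKey.size())) {
            if (!restricted.contains(column)) {
                gapSeen = true;
            } else if (gapSeen) {
                return false; // restricted column after an unrestricted one
            }
        }
        return true;
    }
}
```

Restricting only childid fails the first test, because childid is the last component of the composite column name and says nothing about which row to read.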
No, seriously. Let's code!
   What API should we use?
                      Production-Readiness   Potential   Momentum
    Thrift                     10               -1          -1
    Hector                     10                8           8
    Astyanax                    8                9          10
    Kundera (JPA)               6                9           9
    Pelops                      7                6           7
    Firebrand                   8               10           8
    PlayORM                     5                8           7
    GORA                        6                9           7
    CQL Driver                  ?                ?           ?

                    Astyanax FTW!
Connect
this.astyanaxContext = new AstyanaxContext.Builder()
         .forCluster("ClusterName")
         .forKeyspace(keyspace)
         .withAstyanaxConfiguration(…)
         .withConnectionPoolConfiguration(…)
         .buildKeyspace(ThriftFamilyFactory.getInstance());


   Specify:
       Cluster Name (arbitrary identifier)
       Keyspace
       Node Discovery Method
       Connection Pool Information


Write/Update
MutationBatch mutation = keyspace.prepareMutationBatch();
// Row keys and column names are both serialized as strings.
ColumnFamily<String, String> columnFamily = new ColumnFamily<String, String>(columnFamilyName,
          StringSerializer.get(), StringSerializer.get());
mutation.withRow(columnFamily, rowKey)
         .putColumn(entry.getKey(), entry.getValue(), null);   // null TTL: the column does not expire
mutation.execute();


   Process:
     Create a mutation
     Specify the Column Family with Serializers
     Put your columns.
     Execute
Composite Types
   Composite (a.k.a. Compound)

public class ListEntry {
  @Component(ordinal = 0)
  public String state;
  @Component(ordinal = 1)
  public String zip;
  @Component(ordinal = 2)
  public String childId;
}
Range Builders
range = entitySerializer.buildRange()
.withPrefix(state)
.greaterThanEquals("")
.lessThanEquals("99999");

Then...

.withColumnRange(range).execute();
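
What this range selects can be emulated in memory with a sorted set. The sketch below is not Astyanax code: the class names are hypothetical, the comparator stands in for the composite ordering, and the upper bound uses "\uffff" as an approximate "largest childId" sentinel.

```java
import java.util.Comparator;
import java.util.NavigableSet;
import java.util.TreeSet;

/** In-memory emulation of a prefix-plus-range slice over composite column names. */
final class RangeSketch {
    /** Mirrors the state / zip / childId components of the composite column. */
    static final class ListEntryKey {
        final String state;
        final String zip;
        final String childId;

        ListEntryKey(String state, String zip, String childId) {
            this.state = state;
            this.zip = zip;
            this.childId = childId;
        }
    }

    /** Composite ordering: compare component by component, left to right. */
    static final Comparator<ListEntryKey> ORDER = (a, b) -> {
        int c = a.state.compareTo(b.state);
        if (c == 0) c = a.zip.compareTo(b.zip);
        if (c == 0) c = a.childId.compareTo(b.childId);
        return c;
    };

    /** Entries for one state whose zip lies between "" and "99999", inclusive. */
    static NavigableSet<ListEntryKey> slice(NavigableSet<ListEntryKey> columns, String state) {
        ListEntryKey from = new ListEntryKey(state, "", "");
        ListEntryKey to = new ListEntryKey(state, "99999", "\uffff"); // approximate upper bound
        return columns.subSet(from, true, to, true);
    }

    /** Convenience factory so callers get a set that uses the composite ordering. */
    static NavigableSet<ListEntryKey> newColumns() {
        return new TreeSet<>(ORDER);
    }
}
```

In this sketch a non-numeric code such as EI33 sorts after 99999 in a string comparison, so it falls outside the range; an upper bound chosen for US zip codes may need adjusting for other countries.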
CQL Collections!
http://www.datastax.com/dev/blog/cql3_collections

   Set
     UPDATE users SET emails = emails + {'fb@friendsofmordor.org'} WHERE
      user_id = 'frodo';

   List
     UPDATE users SET top_places = [ 'the shire' ] + top_places WHERE
      user_id = 'frodo';

   Maps
     UPDATE users SET todo['2012-10-2 12:10'] = 'die' WHERE user_id =
      'frodo';
CQL vs. Thrift
http://www.datastax.com/dev/blog/thrift-to-cql3

   Thrift is the legacy API on which all of the Java
    APIs are built.

   CQL is the new native protocol and driver.
Let's get back to cranking…
   Recreate the schema (to be CQL friendly)
   UPDATE children SET toys = toys + [ 'legos' ] WHERE childId = 'owen.oneill';



   Crank out a Dao layer to use CQL collections
    operations.
Shameless Shoutout(s)
 Virgil
 https://github.com/boneill42/virgil
     REST interface for Cassandra


   https://github.com/boneill42/storm-cassandra
     Distributed Processing on Cassandra
     (Webinar in January)