NoSQL Introduction
Slides and Feedback at: http://joind.in/11012
NoSQL Introduction
• Understand what NoSQL is and what it is not.
• Why would you use NoSQL in your project, and which NoSQL database would you choose?
• Explore the relationships between NoSQL and RDBMS.
• Understand how to choose between an RDBMS (MySQL and PostgreSQL), a document database (MongoDB), a key-value store, a graph database, and a columnar database, or a combination of them.
Thursday May 8th 2014, 3:00pm-3:50pm SB 139
NoSQL
• History
• Popular NoSQL Databases
• NoSQL Database Comparisons
• Terminology
• Consistency, Replication, Performance
• NoSQL Implementation CRUD Operations
NoSQL Introduction
• NoSQL is a commonly adopted misnomer
• Typically does not use ANSI SQL
– SQL = Structured Query Language
– Structure exists but is more Flexible
– Queries are performed
– Language is closer to Programming Languages
NoSQL History
http://www.w3resource.com/mongodb/nosql.php
NoSQL History
• 1998: Carlo Strozzi names his command-line relational database "NoSQL" because it has no SQL interface
• June 11, 2009: meetup on open source, distributed, non-relational databases
– Organized by Johan Oskarsson (Last.fm)
– "NoSQL" name suggested by Eric Evans (Rackspace)
NoSQL History
• Bad name, but it stuck!
• Not a definitive term
• Generally, newer databases solving new and different problems
• Not Only SQL: http://blog.sym-link.com/2009/10/30/nosql_whats_in_a_name.html
NoSQL Origination
• Problems not solved by RDBMSs
• Limitations of RDBMSs, not of SQL
Most Popular Databases
http://db-engines.com/en/ranking
Ranking by: Web Content, Web Searches, Technical Discussion, Jobs, Resumes
Most Popular NoSQL
• MongoDB: Document Store
• Cassandra: Wide Column Store
• Solr: Search Engine
• Redis: Key-Value Store
• HBase: Wide Column Store
• Memcached: Key-Value Store
• CouchDB: Document Store
• Neo4j: Graph Database
• Riak: Key-Value Store
• SimpleDB: Key-Value Store within Amazon Cloud
NoSQL vs RDBMS
Image Reference: http://blogs.the451group.com/information_management/2012/11/02/updated-database-landscape-graphic/
Reading Recommendations
Great Overview of NoSQL:
Seven Databases in Seven Weeks
Eric Redmond and Jim Wilson
NoSQL “Bleeding Edge”
• Several solutions are mature and stable enough to run large-scale production environments
• Not all permutations have been considered
• Several (but not all) optimization strategies have been published
• Crucial elements such as security may be secondary add-ons, traded off in favor of performance
NoSQL “Bleeding Edge”
Sun Microsystems csh man page:
“Although robust enough for
general use, adventures into the
esoteric periphery of the C shell
may reveal unexpected quirks.”
NoSQL Comparison
Take note of patterns: recent release, open source, used at high-volume sites
Variety of formats: key-value, wide-column, document, graph
http://db-engines.com/en/ranking
NoSQL Database Types
• Key-Value
• Column-Oriented (Columnar)
• Graph
• Document
• Search Database: Solr
• Key-Value Web Optimization: Memcached
Key-Value Stores
Code bucket
Key               | Value
code:java         | 17.316% Lowest rank on Feb 2014
code:C            | 18.334% Lowest rank on August 2013
code:Objective-C  | Lowest rank on Dec 2007 11.341%
code:C++          | {"score":"6.892%", "low rank": "Feb 2008"}

Drink bucket
Key               | Value
drink:java        | coffee
drink:punch       | Sprite + pineapple juice
drink:pop         | Carbonated Soda

Source: http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html
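The two buckets can be pictured as named dictionaries that map a key to an opaque value. A minimal sketch in plain Python, reading `code:java` as bucket `code`, key `java`; the `put` and `get` helpers are hypothetical names for illustration and are not the API of any particular product:

```python
# In-memory stand-in for a key-value store: bucket name -> {key: value}.
# The store never interprets a value; it only stores and returns it.
buckets = {"code": {}, "drink": {}}

def put(bucket, key, value):
    """Store value under key in the named bucket, replacing any old value."""
    buckets[bucket][key] = value

def get(bucket, key):
    """Return the value stored under key, or None if the key is absent."""
    return buckets[bucket].get(key)

put("code", "java", "17.316% Lowest rank on Feb 2014")
put("code", "C++", {"score": "6.892%", "low rank": "Feb 2008"})
put("drink", "java", "coffee")

# The same key can live in different buckets without conflict.
print(get("code", "java"))
print(get("drink", "java"))
```

Values of different shapes (a free-form string, a structured object) sit side by side because the store treats them all as opaque.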
[Figure: Column Oriented Database]
[Figure: Neo4j]
Document Oriented Database
{
  "_id" : 1,
  "name" : { "first" : "John", "last" : "Backus" },
  "contribs" : [ "Fortran", "ALGOL", "FP" ],
  "awards" : [
    { "award" : "W.W. McDowell Award",
      "year" : 1967,
      "by" : "IEEE Computer Society" },
    { "award" : "Draper Prize",
      "year" : 1993,
      "by" : "National Academy of Engineering" }
  ]
}
Document Oriented Database
{ "faculty" :
  [
    {
      "_id" : 1,
      "name" : { "first" : "John", "last" : "Backus" },
      "contribs" : [ "Fortran", "ALGOL", "FP" ],
      "awards" : [
        { "award" : "W.W. McDowell Award",
          "year" : 1967,
          "by" : "IEEE Computer Society" },
        { "award" : "Draper Prize",
          "year" : 1993,
          "by" : "National Academy of Engineering" }
      ]
    },
    {
      "_id" : 2,
      "name" : { "first" : "David", "last" : "Williams" },
      "contribs" : [ "C#", "Java", "PHP" ],
      "awards" : [
        { "award" : "Sherman Peabody Award II",
          "year" : 2095,
          "location" : "Paris",
          "by" : "Intergalactic Continuum" },
        { "award" : "Sherman Peabody Award IX",
          "year" : 2090,
          "location" : "Paris",
          "by" : "Intergalactic Continuum" },
        { "award" : "Sherman Peabody Award IV",
          "year" : 2093,
          "location" : "Paris",
          "by" : "Intergalactic Continuum" }
      ]
    }
  ]
}
Document Oriented Database
http://chris.photobooks.com/json/
http://visualizer.json2html.com/
NoSQL Comparison
No ANSI SQL Standards, No Predefined Schemas, Replication,
Eventual Consistency, Rarely Foreign Keys, Data Types not required
Newer Concepts: Sharding, REST API, JSON, MapReduce
NoSQL Characteristics
No Predefined Schemas
• Data may be inserted without first creating a table
• Schema Versions (v1.5, v1.6, v1.7, …)
Rarely Foreign Keys
• No JOIN operations
• Relationships are not automatically maintained
Eventual Consistency
• Old copies are replaced by new records over time
• Data may be inconsistent until all replacements are complete
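Eventual consistency can be illustrated with a toy model. The sketch below assumes three in-memory replicas and hypothetical `write` and `replicate` helpers; real databases propagate changes asynchronously and resolve conflicts in ways this example does not attempt to show:

```python
# Three replicas of the same record; a write lands on one replica first
# and reaches the others only when replication runs.
replicas = [{"salary": 125000} for _ in range(3)]

def write(replica_index, key, value):
    """Apply a write to a single replica (the others are now stale)."""
    replicas[replica_index][key] = value

def replicate(source_index):
    """Copy the source replica's data to every replica."""
    for replica in replicas:
        replica.update(replicas[source_index])

write(0, "salary", 130000)
stale_reads = [r["salary"] for r in replicas]    # replicas disagree here
replicate(0)
settled_reads = [r["salary"] for r in replicas]  # replicas agree again
print(stale_reads, settled_reads)
```

A reader that hits replica 1 or 2 between the write and the replication step sees the old salary, which is the window of inconsistency the slide describes.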
Download NoSQL v95141.3
Released 4/1/2014
http://www.nosql.org/downloads/ymbkm.zip
NoSQL Terminology and Concepts
Sharding
Partitions: data distributed across disks
Sharding: data distributed across servers
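One common way to decide which server holds a key is to hash the key. The following is a minimal sketch, assuming three hypothetical server names and simple modulo placement; production systems often use consistent hashing instead, so that adding a server moves fewer keys:

```python
import hashlib

SERVERS = ["server-a", "server-b", "server-c"]  # hypothetical server names

def shard_for(key, servers=SERVERS):
    """Pick a server by hashing the key, so a key always maps to the same server."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]

keys = ["user:jim", "user:krishna", "course:CIS2120"]
placement = {key: shard_for(key) for key in keys}
print(placement)
```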
Map Reduce
Divides work across distributed systems
Parallel processing of large data sets
Divide, Conquer, Consolidate
Often implemented by defining Map and Reduce classes or functions
Example: 1+2+3+6+7+8+9 = ?
  Divide: [2, 6, 8] and [1, 7, 3, 9]
  Conquer: 2+6+8 = 16 and 1+7+3+9 = 20
  Consolidate: 16+20 = 36
Google's MapReduce Programming Model – Revisited, Ralf Lämmel, Microsoft, 2008
http://www.sciencedirect.com/science/article/pii/S0167642307001281
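The sum example can be written as a tiny divide, conquer, consolidate pipeline. This is a single-process sketch with hypothetical function names; a real MapReduce framework would run the per-chunk step on separate machines and typically works on key-value pairs rather than bare numbers:

```python
from functools import reduce

numbers = [1, 2, 3, 6, 7, 8, 9]
chunks = [[2, 6, 8], [1, 7, 3, 9]]  # the same numbers, divided between two workers

def map_chunk(chunk):
    """Conquer step: each worker sums its own chunk independently."""
    return sum(chunk)

def reduce_sums(left, right):
    """Consolidate step: combine two partial sums."""
    return left + right

partial_sums = [map_chunk(chunk) for chunk in chunks]  # could run in parallel
total = reduce(reduce_sums, partial_sums)
print(partial_sums, total)
```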
JSON
JSON = JavaScript Object Notation, a data format based on a subset of JavaScript syntax
Like XML, a method for representing data
Syntax
Name : Value pairs
  "salary" : "125000"
Values are: number, string, Boolean, array, object, or null
Objects can store objects, arrays can store arrays
Separate pairs with commas
  "salary" : "125000", "gender" : "male"
Curly braces denote objects
  { "salary" : "125000", "gender" : "male" }
Square brackets denote arrays
  "phone" : ["555-1212", "555-3344"]
  "phone" : [ {"office" : "555-1212"}, {"mobile" : "555-3344"} ]
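To see how these pieces map onto a programming language, the sketch below parses a small JSON object with Python's standard `json` module. The `manager` field is a made-up addition to show `null`, and the salary is written as a number rather than a string to show the number type:

```python
import json

text = '''
{
  "salary": 125000,
  "gender": "male",
  "phone": [ {"office": "555-1212"}, {"mobile": "555-3344"} ],
  "manager": null
}
'''

record = json.loads(text)               # JSON object -> dict, array -> list
print(type(record["phone"]).__name__)   # the array becomes a Python list
print(record["phone"][1]["mobile"])     # objects nest inside arrays
round_trip = json.dumps(record, sort_keys=True)  # back to JSON text
```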
JSON Example
{
  "_id" : 1,
  "name" : { "first" : "John", "last" : "Backus" },
  "contribs" : [ "Fortran", "ALGOL", "FP" ],
  "awards" : [
    { "award" : "W.W. McDowell Award",
      "year" : 1967,
      "by" : "IEEE Computer Society" },
    { "award" : "Draper Prize",
      "year" : 1993,
      "by" : "National Academy of Engineering" }
  ]
}
http://www.mongodb.com/json-and-bson
REST API
CRUD (Create, Read, Update, Delete) operations through the web
HTTP Methods
  GET (List/Read)
  POST (Create; some APIs also use it to update)
  PUT (Create or replace at a known URL)
  DELETE (Delete)
Example API: http://www.blinksale.com/api/
List/Read data via HTTP GET to
  http://www.blinksale.com/invoices
  http://www.blinksale.com/invoices/invoice_id/payments
  http://www.blinksale.com/invoices/?start=2006&end=2008
Returns XML results
REST API
Update data via HTTP POST to
  http://www.blinksale.com/invoices/invoice_id/payments
<?xml version="1.0" encoding="UTF-8"?>
<payment xmlns="http://www.blinksale.com/api">
  <amount>1000.00</amount>
  <date>2006-09-27</date>
</payment>
REST = REpresentational State Transfer
Twitter example:
  https://dev.twitter.com/docs/api/1.1 (GET and POST only)
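The mapping from CRUD operations to HTTP methods can be sketched with Python's standard `urllib.request`. The host `api.example.com`, the invoice id `42`, and the `build_request` helper are placeholders for illustration, not a real service; the requests are only constructed here, never sent:

```python
from urllib.request import Request

BASE = "http://api.example.com"  # placeholder host, not a real service

def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the given method and path."""
    data = body.encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/xml"} if body is not None else {}
    return Request(BASE + path, data=data, headers=headers, method=method)

list_invoices = build_request("GET", "/invoices")
add_payment = build_request(
    "POST", "/invoices/42/payments",
    "<payment><amount>1000.00</amount></payment>")
remove_invoice = build_request("DELETE", "/invoices/42")

for req in (list_invoices, add_payment, remove_invoice):
    print(req.get_method(), req.full_url)
# Passing a request to urllib.request.urlopen() would actually send it.
```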
Database SELECT Statements
Oracle
  SELECT * FROM relationships
MongoDB
  db.relationships.find()
Cassandra (CQL)
  SELECT * FROM relationships
Database SELECT Statements
Redis: Key-Value Store
  SMEMBERS relationships
Riak: Key-Value Store with REST API (+ proprietary drivers)
  http://localhost:8091/riak/relationships/likes
Neo4j (Cypher)
  MATCH (n)-[r:LIKES]->(m) RETURN n,r,m
JOINS without Foreign Keys
// Store the employer's _id in each person document (a manual reference).
original_id = ObjectId()
db.employer.insert({
  "_id": original_id,
  "name": "Broadway Tech",
  "url": "bc.example.net" })
db.people.insert({
  "name": "Erin",
  "employer_id": original_id,
  "url": "bc.example.net/Erin" })
"Erin" works at "Broadway Tech"
One of the employees at "Broadway Tech" is "Erin"
http://docs.mongodb.org/manual/reference/database-references/#document-references
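Without foreign keys or JOIN, the application follows the stored reference itself, usually with a second lookup. A plain-Python sketch of that idea, using in-memory structures in place of the two collections; the id `"e1"` and the helper names are hypothetical:

```python
employers = {}  # _id -> employer document
people = []     # list of person documents

employers["e1"] = {"_id": "e1", "name": "Broadway Tech", "url": "bc.example.net"}
people.append({"name": "Erin", "employer_id": "e1", "url": "bc.example.net/Erin"})

def employer_of(person):
    """Follow the manual reference from a person to an employer (second lookup)."""
    return employers.get(person["employer_id"])

def employees_of(employer_id):
    """Reverse direction: scan people for documents that reference the employer."""
    return [p for p in people if p["employer_id"] == employer_id]

print(employer_of(people[0])["name"])
print([p["name"] for p in employees_of("e1")])
```

Nothing stops a person document from holding an `employer_id` that no longer exists; keeping references valid is the application's job.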
Replication Challenge is Write Consistency
ACID, BASE, CAP, CPR
1979 Gray, 1983 Reuter & Härder: ACID
  Atomic (all or nothing, with rollback), Consistent (follows the rules), Isolated (simultaneous transactions do not interfere), Durable (no dropped commits)
1997 Brewer: BASE
  Basically Available, Soft-state, Eventually consistent
2000 Brewer: CAP (Pick Two)
  Consistency, Availability, Partition Tolerance
CPR (Pick Two)
  Consistency, Performance, Replication/Redundancy
  Contrived; stretches the definitions
CPR: Consistency, Redundancy, Performance (Pick Two)
Consistency + Performance (no Redundancy)
  A, D, C, B: data spread across storage devices or computers
Redundancy + Performance (no Consistency)
  Replicas: ABCE, ABCD, ABCD, ABCE
  Updates may be inconsistent across devices
Consistency + Redundancy (no Performance)
  Replicas: ABCD, ABCD, ABCD, ABCD
  One update locks all nodes
CRUD
Create
Read
Update
Delete
SQL CRUD
Create
  INSERT INTO my_table (column1, column2) VALUES (9, 'string');
Read
  SELECT column1, column2 FROM my_table;
Update
  UPDATE my_table SET column2 = 'text' WHERE column1 = 9;
Delete
  DELETE FROM my_table WHERE column2 = 'text';
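The four statements can be tried end to end with Python's built-in `sqlite3` module and an in-memory database. In this sketch the table is called `my_table`, a placeholder name chosen because `table` itself is a reserved word in SQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway database, nothing written to disk
conn.execute("CREATE TABLE my_table (column1 INTEGER, column2 TEXT)")

# Create
conn.execute("INSERT INTO my_table (column1, column2) VALUES (?, ?)", (9, "string"))
# Read
after_create = conn.execute("SELECT column1, column2 FROM my_table").fetchall()
# Update
conn.execute("UPDATE my_table SET column2 = ? WHERE column1 = ?", ("text", 9))
after_update = conn.execute("SELECT column1, column2 FROM my_table").fetchall()
# Delete
conn.execute("DELETE FROM my_table WHERE column2 = ?", ("text",))
after_delete = conn.execute("SELECT column1, column2 FROM my_table").fetchall()

print(after_create, after_update, after_delete)
conn.close()
```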
Key-Value Stores
Redis CRUD
http://redis.io/commands
http://redis.io/topics/data-types-intro
http://openmymind.net/2011/11/8/Redis-Zero-To-Master-In-30-Minutes-Part-1/
Redis is an in-memory key-value store which stores:
  Strings, Hashes, Lists, Sets, or Sorted Sets
Strings: the value is opaque to Redis, so changing one field means rewriting the whole value
  SET user:jim '{"lastname": "Mathews", "salary": 125000}'
  GET user:jim
Hashes: allow modification and retrieval of individual fields (run DEL user:jim first, since a key holds only one type)
  HSET user:jim salary 125000
  HSET user:jim lastname Mathews
  HGET user:jim salary
Redis CRUD
Lists: one-dimensional arrays with insert, append, pop, and push (client-library calls)
  redis.lpush('users:employees', 'user:jim')
  redis.mget(redis.lrange('users:employees', 0, 5))
Sets: unordered collections with no duplicate values (SADD = Set Add)
  SADD users:employees jim
  SADD users:employees krishna
  SMEMBERS users:employees
Sorted Sets: sets with an added score used for ordering
  ZADD users:employees 125000 jim
  ZADD users:employees 157000 krishna
  ZRANGEBYSCORE users:employees 100000 180000
Each group assumes users:employees does not already hold another type; Redis rejects commands against a key of the wrong type.
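The behavior of a sorted set can be modeled in a few lines of plain Python. This is only a model to show the semantics of the commands above; it is not a Redis client, and the lowercase function names simply mirror the command names:

```python
# Plain-Python model of a sorted set: member -> score. Not a Redis client.
salaries = {}

def zadd(sorted_set, score, member):
    """Add a member with a score; re-adding a member just updates its score."""
    sorted_set[member] = score

def zrangebyscore(sorted_set, low, high):
    """Members with low <= score <= high, ordered by ascending score."""
    hits = [(score, member) for member, score in sorted_set.items()
            if low <= score <= high]
    return [member for score, member in sorted(hits)]

zadd(salaries, 125000, "jim")
zadd(salaries, 157000, "krishna")
zadd(salaries, 125000, "jim")  # duplicate member: still stored once
print(zrangebyscore(salaries, 100000, 180000))
```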
Riak CRUD
Easy to install and configure a test cluster
REST Queries
Create/PUT a "course:CIS2120" row
  curl -v -X PUT http://localhost:8091/riak/course/CIS2120 \
    -H "Content-Type: application/json" \
    -d '{"name":"Database Coding", "days":"MWF"}'
Read/GET the value for "course:CIS2120"
  curl -X GET http://localhost:8091/riak/course/CIS2120
  curl http://localhost:8091/riak/course/CIS2120
Key             | Value
course:CIS2120  | {"name":"Database Coding", "days":"MWF"}
Riak Links
Riak can link one key:value pair to another with a tagged relationship
  curl -v -X PUT http://localhost:8091/riak/student/sorensen \
    -H "Content-Type: application/json" \
    -H 'Link: </riak/course/CIS2120>; riaktag="enrolled"' \
    -d '{"firstname":"Conner"}'
Links are one-way: this does not automatically create a link back from "CIS2120" to "sorensen"
Neo4j: Graph Database
http://www.neo4j.org/learn/try
http://docs.neo4j.org/refcard/2.0/
  MATCH (n)-[r:LIKES]->(m) RETURN n,r,m
Matches a person "n" that likes person "m"
https://gist.github.com/peterneubauer/6019125
http://gist.neo4j.org/?6019125
Neo4j CRUD
Try dragging nodes at: http://www.neo4j.org/learn/try
  MATCH (user {name:"Bill"})-[:KNOWS]->(colleague)
  WHERE colleague.employer = "LinkedIn"
  RETURN user, colleague
  ORDER BY colleague.name LIMIT 10
http://docs.neo4j.org/refcard/2.0/
  MATCH (n)-[r:LIKES]->(m) RETURN n,r,m
Matches a person "n" that likes person "m"
  MATCH (n)-[r]->(m) RETURN n,r,m
Matches any relationship from "n" to "m"
http://www.neo4j.org/learn/cypher
Neo4j
CREATE (LUKE {name:"Luke Skywalker"}),
(HAN {name:"Han Solo"}),
(LEIA {name:"Princess Leia Organa"}),
(OBI_WAN {name:"Obi Wan Kenobi"}),
(YODA {name : "Yoda"}),
(VADER {name:"Darth Vader"}),
(C3PO {name:"C3PO", droid:true}),
(R2D2 {name:"R2D2", droid:true}),
(CHEWBACCA {name:"Chewbacca"}),
(TATOOINE {name:"Tatooine", distance:13184}),
(DAGOBAH {name:"Dagobah", distance:15407}),
(JEDI {name:"Jedi"}),
(SITH {name:"Sith"}),
(REBELLION {name:"Rebellion"}),
(EMPIRE {name:"Empire"}),
(DARK_SIDE {name:"Dark Side"}),
(LIGHT_SIDE {name:"Light Side"}),
…
(LUKE)-[:FRIENDS_WITH]->(HAN),
(LUKE)-[:FRIENDS_WITH]->(LEIA),
(HAN)-[:FRIENDS_WITH]->(CHEWBACCA),
(YODA)-[:TEACHES]->(OBI_WAN),
(YODA)-[:TEACHES]->(LUKE),
(OBI_WAN)-[:TEACHES]->(LUKE),
(OBI_WAN)-[:KNOWS]->(VADER),
(LUKE)-[:KNOWS]->(R2D2),
(R2D2)-[:KNOWS]->(C3PO),
(LUKE)-[:LIVED_ON]->(TATOOINE),
(HAN)-[:LIVED_ON]->(CORELLIA),
(LEIA)-[:LIVED_ON]->(ALDERAAN),
(YODA)-[:LIVED_ON]->(DAGOBAH),
(LUKE)-[:DEVOTED_TO]->(JEDI),
(LUKE)-[:DEVOTED_TO]->(REBELLION),
(LUKE)-[:DEVOTED_TO]->(LIGHT_SIDE),
(VADER)-[:DEVOTED_TO]->(SITH),
(VADER)-[:DEVOTED_TO]->(EMPIRE),
(VADER)-[:DEVOTED_TO]->(DARK_SIDE),
(LEIA)-[:DEVOTED_TO]->(REBELLION),
(HAN)-[:DEVOTED_TO]->(REBELLION)
…
https://gist.github.com/peterneubauer/6019125
http://gist.neo4j.org/?6019125
MATCH (y)-[r]-(other)
WHERE y.name = 'Yoda'
RETURN y.name, type(r), other.name
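What the Yoda query asks for, every relationship touching one node in either direction, can be sketched over a plain list of triples. The triples below are a small subset copied from the Star Wars listing above, and `neighbors` is a hypothetical helper; a graph database indexes these connections instead of scanning a list:

```python
# (start, relationship type, end) triples taken from the Star Wars listing
relationships = [
    ("YODA", "TEACHES", "OBI_WAN"),
    ("YODA", "TEACHES", "LUKE"),
    ("OBI_WAN", "TEACHES", "LUKE"),
    ("YODA", "LIVED_ON", "DAGOBAH"),
    ("LUKE", "FRIENDS_WITH", "HAN"),
]

def neighbors(node):
    """Relationships touching node in either direction, as (type, other node)."""
    found = []
    for start, rel_type, end in relationships:
        if start == node:
            found.append((rel_type, end))
        elif end == node:
            found.append((rel_type, start))
    return found

print(neighbors("YODA"))
print(neighbors("LUKE"))
```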
Google BigTable
• White Paper published in 2006
• Many databases based upon BigTable
• 13 pages, readable for many non-techies
• Insightful into the early days of NoSQL
http://static.googleusercontent.com/media/research.google.com/en/us/archive/bigtable-osdi06.pdf
HBase
Large-scale, column-oriented database
Consistency, Performance, Fault Tolerance, ACID via locking
Tables are created before initial data is added
Tables have
  row keys: indexed row identifier strings
  column families: contain one or more columns
  timestamps: for version control
HBase
The row key ties together a row's values across column families.
If a row does not insert values in a column family, no disk space is used for it within that column family.
Columns are identified by column_family:column_name
  text:
  revision:author
  revision:comment
Write-Ahead Logging (WAL): similar to file system journaling
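The row key, column family, and timestamp layering can be pictured as nested dictionaries. This is a conceptual sketch in plain Python with hypothetical `put` and `get` helpers; it says nothing about how HBase lays data out on disk:

```python
import time

table = {}  # row key -> {"family:qualifier" -> [(timestamp, value), ...]}

def put(row_key, column, value, timestamp=None):
    """Append a new version of a cell; older versions are kept."""
    ts = timestamp if timestamp is not None else time.time()
    table.setdefault(row_key, {}).setdefault(column, []).append((ts, value))

def get(row_key, column):
    """Return the newest value for the cell, or None if it was never written."""
    versions = table.get(row_key, {}).get(column, [])
    return max(versions)[1] if versions else None

put("first page", "text:", "Hello", timestamp=1)
put("first page", "text:", "Hello, world", timestamp=2)
put("first page", "revision:author", "jim", timestamp=2)

print(get("first page", "text:"))
print(get("first page", "revision:comment"))  # never written, so nothing is stored
```

Cells that were never written take no space in the model, which mirrors the sparse-storage point above.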
HBase CRUD
create 'wiki_table', 'text_column_family', 'revision_column_family'
create 'wiki', 'text', 'revision'
put 'wiki', 'first page', 'text:', '…'
put 'wiki', 'first page', 'revision:author', '…'
get 'wiki', 'first page', ['revision:author', 'revision:comment']
delete 'wiki', 'first page', 'revision:author'
scan 'wiki'    (equivalent to SELECT * FROM wiki)
Seven Databases in Seven Weeks, Redmond & Wilson 2012
MongoDB Document Store
db.courses.insert({
  name: "CIS2120",
  description: "Database Coding",
  instructor: {
    name: "David Williams",
    email: "david.williams@usu.edu"
  },
  instructor2: {
    name: "John Kerley-Weeks",
    office: "JQL230"
  },
  subjects: ["Python", "MongoDB", "3NF", "ETL", "Star Schema"]
})
MongoDB vs SQL
http://docs.mongodb.org/manual/reference/sql-comparison/
MongoDB -> SQL Terminology
  Collection -> Table
  Document -> Row
  Field -> Column
db.courses.find()    = SELECT * FROM courses
db.courses.count()   = SELECT COUNT(*) FROM courses
db.courses.find({name: "CIS2120"})   = SELECT * FROM courses WHERE name = 'CIS2120'
MongoDB Simple Database
http://media.mongodb.org/zips.json
{"city": "ACMAR", "loc": [-86.51557, 33.584132], "pop": 6055, "state": "AL", "_id": "35004"}
{"city": "ADAMSVILLE", "loc": [-86.959727, 33.588437], "pop": 10616, "state": "AL", "_id": "35005"}
{"city": "ADGER", "loc": [-87.167455, 33.434277], "pop": 3205, "state": "AL", "_id": "35006"}
{"city": "KEYSTONE", "loc": [-86.812861, 33.236868], "pop": 14218, "state": "AL", "_id": "35007"}
{"city": "NEW SITE", "loc": [-85.951086, 32.941445], "pop": 19942, "state": "AL", "_id": "35010"}
{"city": "ALPINE", "loc": [-86.208934, 33.331165], "pop": 3062, "state": "AL", "_id": "35014"}
{"city": "ARAB", "loc": [-86.489638, 34.328339], "pop": 13650, "state": "AL", "_id": "35016"}
{"city": "BAILEYTON", "loc": [-86.621299, 34.268298], "pop": 1781, "state": "AL", "_id": "35019"}
{"city": "BESSEMER", "loc": [-86.947547, 33.409002], "pop": 40549, "state": "AL", "_id": "35020"}
{"city": "HUEYTOWN", "loc": [-86.999607, 33.414625], "pop": 39677, "state": "AL", "_id": "35023"}
{"city": "BLOUNTSVILLE", "loc": [-86.568628, 34.092937], "pop": 9058, "state": "AL", "_id": "35031"}
{"city": "BREMEN", "loc": [-87.004281, 33.973664], "pop": 3448, "state": "AL", "_id": "35033"}
{"city": "BRENT", "loc": [-87.211387, 32.93567], "pop": 3791, "state": "AL", "_id": "35034"}
{"city": "BRIERFIELD", "loc": [-86.951672, 33.042747], "pop": 1282, "state": "AL", "_id": "35035"}
{"city": "Logan, UT", "additionally": ["Nibley, UT", "River Heights, UT"], "state": "UT", "version": "2.1", "_id": "84321"}
{"city": "Olivehurst, CA", "additionally": ["Arboga, CA", "Plumas Lake, CA", "West Linda, CA"], "state": "CA", "version": "2.1", "_id": "95961"}
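Because documents in one collection need not share the same fields, query code has to tolerate missing ones. A small sketch over three of the records above, trimmed of their `loc` and `additionally` fields for brevity; the `find` helper is a hypothetical stand-in that only imitates equality matching, not MongoDB's query engine:

```python
import json

lines = [
    '{"city": "ACMAR", "pop": 6055, "state": "AL", "_id": "35004"}',
    '{"city": "ADAMSVILLE", "pop": 10616, "state": "AL", "_id": "35005"}',
    '{"city": "Logan, UT", "state": "UT", "version": "2.1", "_id": "84321"}',
]
documents = [json.loads(line) for line in lines]

def find(docs, **criteria):
    """Return documents whose fields equal every given criterion."""
    return [d for d in docs
            if all(d.get(field) == value for field, value in criteria.items())]

alabama = find(documents, state="AL")
versioned = find(documents, version="2.1")           # field absent in older docs
total_pop = sum(d.get("pop", 0) for d in documents)  # missing "pop" counts as 0
print([d["city"] for d in alabama], total_pop)
```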
Cassandra Characteristics
Scalable, high-availability wide-column datastore
Peer-to-peer rather than master-slave clusters
Tunable consistency: can read/write to a single node, a quorum of nodes, or all nodes
Recommends static and dynamic column families
  Static column families contain pre-defined columns
    Contact info: phone, address, email, web
  Dynamic column families have variable numbers of similar columns
    Students enrolled in a course
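A common rule of thumb for tunable consistency is that a read is guaranteed to see the latest write when the number of nodes read plus the number of nodes written exceeds the number of replicas. A minimal sketch of that arithmetic, with hypothetical function names and a replica count of 3 chosen only for illustration:

```python
def overlaps(n_replicas, write_nodes, read_nodes):
    """True when every read set must share at least one node with every write set."""
    return read_nodes + write_nodes > n_replicas

def quorum(n_replicas):
    """Smallest majority of the replicas."""
    return n_replicas // 2 + 1

N = 3
quorum_both = overlaps(N, quorum(N), quorum(N))  # quorum writes + quorum reads
single_both = overlaps(N, 1, 1)                  # fastest, but reads may be stale
write_all = overlaps(N, N, 1)                    # slow writes, cheap consistent reads
print(quorum_both, single_both, write_all)
```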
Cassandra CRUD
http://www.datastax.com/docs/0.8/references/cql
http://cassandra.apache.org/doc/cql3/CQL.html#selectStmt
CREATE TABLE course (
  name text PRIMARY KEY,
  instructor text,
  maxstudents int
);
INSERT INTO course (name, instructor, maxstudents) VALUES
  ('CIS2120', 'Williams', 28);
UPDATE course SET maxstudents = 26 WHERE name = 'CIS2120';
-- maxstudents is not part of the primary key, so this query needs ALLOW FILTERING
SELECT name, instructor FROM course WHERE maxstudents > 20 ALLOW FILTERING;
Cassandra CRUD
No JOIN operations or FOREIGN KEYS
CREATE TABLE people (
  name text PRIMARY KEY,
  email text,
  phones map<text, text>
);
INSERT INTO people (name, email, phones)
  VALUES ('John Weeks', 'john.weeks@usu.edu',
  {'mobile' : '555-1212', 'office' : '797-7133', 'fax' : '555-1212'});
-- UPDATE must identify the row by its primary key (name)
UPDATE people SET phones['office'] = '555-1212'
  WHERE name = 'John Weeks';
Questions
???
65 Slides and Feedback at: http://joind.in/11012

More Related Content

Similar to NoSQL Introduction (20)

PPTX
NoSQL and MongoDB Introdction
Brian Enochson
 
PDF
Considerations for using NoSQL technology on your next IT project - Akmal Cha...
jaxconf
 
PDF
NoSQL on microsoft azure april 2014
Brian Benz
 
PDF
Considerations for using NoSQL technology on your next IT project - Akmal Cha...
BCS Data Management Specialist Group
 
PDF
Using MongoDB to Build a Fast and Scalable Content Repository
MongoDB
 
PDF
Intro Couchdb
selvamanisampath
 
PDF
[SSA] 03.newsql database (2014.02.05)
Steve Min
 
PDF
ER/Studio and DB PowerStudio Launch Webinar: Big Data, Big Models, Big News!
Embarcadero Technologies
 
PPTX
SQL to NoSQL: Top 6 Questions
Mike Broberg
 
PPTX
Building your first MEAN application
FITC
 
PDF
Play framework
Andrew Skiba
 
PDF
SQL? NoSQL? NewSQL?!? What’s a Java developer to do? - JDC2012 Cairo, Egypt
Chris Richardson
 
PPTX
Where are yours vertexes and what are they talking about?
Roberto Franchini
 
PDF
Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016
Ivan Ermilov
 
PPTX
Building your First MEAN App
MongoDB
 
PDF
Tech Talk: DevOps at LeanIX @ Startup Camp Berlin
LeanIX GmbH
 
PPTX
Riviera jug apicassandra
Cédrick Lunven
 
PPTX
Tackle Containerization Advisor (TCA) for Legacy Applications
Konveyor Community
 
PDF
2.28.17 Introducing DSpace 7 Webinar Slides
DuraSpace
 
PPTX
Intro to node and mongodb 1
Mohammad Qureshi
 
NoSQL and MongoDB Introdction
Brian Enochson
 
Considerations for using NoSQL technology on your next IT project - Akmal Cha...
jaxconf
 
NoSQL on microsoft azure april 2014
Brian Benz
 
Considerations for using NoSQL technology on your next IT project - Akmal Cha...
BCS Data Management Specialist Group
 
Using MongoDB to Build a Fast and Scalable Content Repository
MongoDB
 
Intro Couchdb
selvamanisampath
 
[SSA] 03.newsql database (2014.02.05)
Steve Min
 
ER/Studio and DB PowerStudio Launch Webinar: Big Data, Big Models, Big News!
Embarcadero Technologies
 
SQL to NoSQL: Top 6 Questions
Mike Broberg
 
Building your first MEAN application
FITC
 
Play framework
Andrew Skiba
 
SQL? NoSQL? NewSQL?!? What’s a Java developer to do? - JDC2012 Cairo, Egypt
Chris Richardson
 
Where are yours vertexes and what are they talking about?
Roberto Franchini
 
Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016
Ivan Ermilov
 
Building your First MEAN App
MongoDB
 
Tech Talk: DevOps at LeanIX @ Startup Camp Berlin
LeanIX GmbH
 
Riviera jug apicassandra
Cédrick Lunven
 
Tackle Containerization Advisor (TCA) for Legacy Applications
Konveyor Community
 
2.28.17 Introducing DSpace 7 Webinar Slides
DuraSpace
 
Intro to node and mongodb 1
Mohammad Qureshi
 

Recently uploaded (20)

PDF
Newgen Beyond Frankenstein_Build vs Buy_Digital_version.pdf
darshakparmar
 
PPTX
WooCommerce Workshop: Bring Your Laptop
Laura Hartwig
 
PPTX
AUTOMATION AND ROBOTICS IN PHARMA INDUSTRY.pptx
sameeraaabegumm
 
PDF
Exolore The Essential AI Tools in 2025.pdf
Srinivasan M
 
PPTX
Webinar: Introduction to LF Energy EVerest
DanBrown980551
 
PDF
Presentation - Vibe Coding The Future of Tech
yanuarsinggih1
 
PDF
How Startups Are Growing Faster with App Developers in Australia.pdf
India App Developer
 
PPTX
OpenID AuthZEN - Analyst Briefing July 2025
David Brossard
 
PDF
Blockchain Transactions Explained For Everyone
CIFDAQ
 
PDF
"AI Transformation: Directions and Challenges", Pavlo Shaternik
Fwdays
 
PDF
SWEBOK Guide and Software Services Engineering Education
Hironori Washizaki
 
PPTX
UiPath Academic Alliance Educator Panels: Session 2 - Business Analyst Content
DianaGray10
 
PDF
Log-Based Anomaly Detection: Enhancing System Reliability with Machine Learning
Mohammed BEKKOUCHE
 
PDF
July Patch Tuesday
Ivanti
 
PDF
LLMs.txt: Easily Control How AI Crawls Your Site
Keploy
 
PDF
Chris Elwell Woburn, MA - Passionate About IT Innovation
Chris Elwell Woburn, MA
 
PDF
Reverse Engineering of Security Products: Developing an Advanced Microsoft De...
nwbxhhcyjv
 
PPTX
✨Unleashing Collaboration: Salesforce Channels & Community Power in Patna!✨
SanjeetMishra29
 
PDF
Fl Studio 24.2.2 Build 4597 Crack for Windows Free Download 2025
faizk77g
 
PDF
[Newgen] NewgenONE Marvin Brochure 1.pdf
darshakparmar
 
Newgen Beyond Frankenstein_Build vs Buy_Digital_version.pdf
darshakparmar
 
WooCommerce Workshop: Bring Your Laptop
Laura Hartwig
 
AUTOMATION AND ROBOTICS IN PHARMA INDUSTRY.pptx
sameeraaabegumm
 
Exolore The Essential AI Tools in 2025.pdf
Srinivasan M
 
Webinar: Introduction to LF Energy EVerest
DanBrown980551
 
Presentation - Vibe Coding The Future of Tech
yanuarsinggih1
 
How Startups Are Growing Faster with App Developers in Australia.pdf
India App Developer
 
OpenID AuthZEN - Analyst Briefing July 2025
David Brossard
 
Blockchain Transactions Explained For Everyone
CIFDAQ
 
"AI Transformation: Directions and Challenges", Pavlo Shaternik
Fwdays
 
SWEBOK Guide and Software Services Engineering Education
Hironori Washizaki
 
UiPath Academic Alliance Educator Panels: Session 2 - Business Analyst Content
DianaGray10
 
Log-Based Anomaly Detection: Enhancing System Reliability with Machine Learning
Mohammed BEKKOUCHE
 
July Patch Tuesday
Ivanti
 
LLMs.txt: Easily Control How AI Crawls Your Site
Keploy
 
Chris Elwell Woburn, MA - Passionate About IT Innovation
Chris Elwell Woburn, MA
 
Reverse Engineering of Security Products: Developing an Advanced Microsoft De...
nwbxhhcyjv
 
✨Unleashing Collaboration: Salesforce Channels & Community Power in Patna!✨
SanjeetMishra29
 
Fl Studio 24.2.2 Build 4597 Crack for Windows Free Download 2025
faizk77g
 
[Newgen] NewgenONE Marvin Brochure 1.pdf
darshakparmar
 
Ad

NoSQL Introduction

  • 1. NoSQL Introduction Slides and Feedback at: http://joind.in/11012
  • 2. NoSQL Introduction • Understand what NoSQL is and what it is not. • Why would you want to use NoSQL within your project and which NoSQL database would you utilize? • Explore the relationships between NoSQL and RDBMS. • Understand how to select between an RDBMs (MySQL and PostgreSQL), Document Database (MongoDB), Key- Value Store, Graph Database, and Columnar databases or combinations of the above. Thursday May 8th 2014, 3:00pm-3:50pm SB 139 Slides and Feedback at: http://joind.in/11012 2
  • 3. NoSQL • History • Popular NoSQL Databases • NoSQL Database Comparisons • Terminology • Consistency, Replication, Performance • NoSQL Implementation CRUD Operations 3 Slides and Feedback at: http://joind.in/11012
  • 4. NoSQL Introduction • NoSQL is a commonly adopted misnomer • Typically does not use ANSI SQL – SQL = Structured Query Language – Structure exists but is more Flexible – Queries are performed – Language is closer to Programming Languages 4 Slides and Feedback at: http://joind.in/11012
  • 6. NoSQL History • 1998 Carlo Strozzi Command Line Database • June 11, 2009 Meetup – Open Source, Distributed, Non-Relational DB – Eric Evans (Rackspace) – Johan Oskarsson (Last.fm) 6
  • 8. NoSQL History • Bad name, but it stuck! • Not a definitive term • Generally, Newer databases solving new and different problems • Not Only SQL http://blog.sym- link.com/2009/10/30/nosql_whats_in_a_name.html 8 Slides and Feedback at: http://joind.in/11012
  • 9. NoSQL Origination • Problems not solved by RDBMs • Limitations of RDBMs, not SQL 9 Slides and Feedback at: http://joind.in/11012
  • 11. Most Popular NoSQL • MongoDB - Document Store • Cassandra – Wide Column Store • Solr – Search Engine • Redis – Key-value store • Hbase – Wide Column Store • Memcached – Key-value Store • CouchDB – Document Store • Neo4j – Graph Database • Riak – Key-value Store • SimpleDB – Key-value Store within Amazon Cloud 11 Slides and Feedback at: http://joind.in/11012
  • 12. 12 NoSQL vs RDBMs Image Reference: http://blogs.the451group.com/information_management/2012/11/02/updated-database-landscape-graphic/
  • 13. Reading Recommendations Great Overview of NoSQL: Seven Databases in Seven Weeks Eric Redmond and Jim Wilson 13
  • 14. NoSQL “Bleeding Edge” • Several solutions are mature and stable enough to run large scale production environments • Not all permutations have been considered • Several (but not all) optimization strategies have been published • Crucial elements such as Security may be a secondary add-on in favor of performance. 14
  • 15. NoSQL “Bleeding Edge” Sun Microsystems csh man page: “Although robust enough for general use, adventures into the esoteric periphery of the C shell may reveal unexpected quirks.” 15 Slides and Feedback at: http://joind.in/11012
  • 16. NoSQL Comparison 16 Take note of patterns: Recent Release, Open Source, Utilized at High-Volume sites Variety of Formats: Key-Value, Wide-Column, Document, Graph http://db-engines.com/en/ranking
  • 17. NoSQL Database Types 17 • Key-Value • Column Oriented Databases (Columnar) • Graph • Document • Search Database - Solr • Key-Value Web Optimization - Memcached
  • 18. Key-Value Stores 18 Key Value code:java 17.316% Lowest rank on Feb 2014 code:C 18.334% Lowest rank on August 2013 code:Objective-C Lowest rank on Dec 2007 11.341% code:C++ {“score”:”6.892%”, “low rank”: “Feb 2008”} Key Value drink:java coffee drink:punch Sprite + pineapple juice drink:pop Carbonated Soda http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html Codebucketdrinkbucket
  • 21. Document Oriented Database 21 { "_id" : 1, "name" : { "first" : "John", "last" : "Backus" }, "contribs" : [ "Fortran", "ALGOL", "FP" ], "awards" : [ { "award" : "W.W. McDowell Award", "year" : 1967, "by" : "IEEE Computer Society" }, { "award" : "Draper Prize", "year" : 1993, "by" : "National Academy of Engineering" } ] }
  • 22. Document Oriented Database 22 { "facutly" : [ { "_id" : 1, "name" : { "first" : "John", "last" : "Backus" }, "contribs" : [ "Fortran", "ALGOL", "FP" ], "awards" : [ { "award" : "W.W. McDowell Award", "year" : 1967, "by" : "IEEE Computer Society" }, { "award" : "Draper Prize", "year" : 1993, "by" : "National Academy of Engineering" } ] }, { "_id" : 2, "name" : { "first" : "David", "last" : "Williams" }, "contribs" : [ "C#", "Java", "PHP" ], "awards" : [ { "award" : "Sherman Peabody Award II", "year" : 2095, "location" : "Paris", "by" : "Intergalactic Continuum" }, { "award" : "Sherman Peabody Award IX", "year" : 2090, "location" : "Paris", "by" : "Intergalactic Continuum" }, { "award" : "Sherman Peabody Award IV", "year" : 2093, "location" : "Paris", "by" : "Intergalactic Continuum" } ] } ] }
  • 23. Document Oriented Database: http://chris.photobooks.com/json/
  • 25. NoSQL Comparison: No ANSI SQL Standards, No Predefined Schemas, Replication, Eventual Consistency, Rarely Foreign Keys, Data Types not required. Newer Concepts: Sharding, REST API, JSON, MapReduce
  • 26. NoSQL Characteristics: No Predefined Schemas: • May insert data without creating a table • Schema Versions (v1.5, v1.6, v1.7,…) Rarely Foreign Keys: • No JOIN operations • Relationships are not automatically maintained Eventual Consistency: • Old copies are replaced by new records over time • Data may be inconsistent until all replacements are complete
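The eventual-consistency bullets above can be pictured with a toy simulation. This is a minimal sketch in plain Python, not the replication protocol of any particular database; the names `replicas`, `pending`, `write`, `replicate_one`, and `read` are made up for illustration.

```python
# Toy model: three replicas of the same record, updated one at a time.
replicas = [{"price": 10}, {"price": 10}, {"price": 10}]
pending = []  # replication work that has not been applied yet


def write(key, value):
    """Apply the write to replica 0 and queue it for the other replicas."""
    replicas[0][key] = value
    for index in range(1, len(replicas)):
        pending.append((index, key, value))


def replicate_one():
    """Apply the oldest queued update, if any, to its target replica."""
    if pending:
        index, key, value = pending.pop(0)
        replicas[index][key] = value


def read(index, key):
    """Read from one specific replica, which may still hold the old copy."""
    return replicas[index][key]


write("price", 12)
stale = read(2, "price")  # replica 2 has not received the update yet
while pending:
    replicate_one()
fresh = read(2, "price")  # every replica has now been replaced
```

Between the write and the end of the loop, a reader that happens to hit replica 2 sees the old value; afterwards all replicas agree.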
  • 27. Download NoSQL v95141.3: Released 4/1/2014 http://www.nosql.org/downloads/ymbkm.zip
  • 28. NoSQL Terminology and Concepts
  • 29. Sharding: Partitions – Data distributed across disks. Sharding – Data distributed across servers.
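One common way to decide which server holds a given key is to hash the key and take the result modulo the number of servers. The sketch below assumes that scheme; the server names and the `shard_for` helper are hypothetical, and real systems often use more elaborate approaches such as consistent hashing or range-based sharding.

```python
import hashlib


def shard_for(key, servers):
    """Pick a server for a key by hashing the key (hypothetical helper)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# Hypothetical server names, for illustration only.
servers = ["db-a", "db-b", "db-c"]
keys = ["user:jim", "user:krishna", "course:CIS2120"]
placement = {key: shard_for(key, servers) for key in keys}
```

The same key always maps to the same server, so reads know where to look. The drawback of plain modulo hashing is that changing the number of servers moves most keys.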
  • 30. Map Reduce: Divides work across distributed systems. Parallel processing of large data sets. Divide – Conquer – Consolidate. Often implemented by defining Map and Reduce classes or functions. Example: 1+2+3+6+7+8+9=? Split the numbers into (2, 6, 8) and (1, 7, 3, 9), compute the partial sums 16 and 20, then combine them into 36. Google’s MapReduce Programming Model – Revisited, Ralf Lämmel, Microsoft, 2008 http://www.sciencedirect.com/science/article/pii/S0167642307001281
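The summation example on this slide can be sketched in a few lines of Python. This runs on one machine and only illustrates the divide, conquer, consolidate idea; `map_phase` and `reduce_phase` are illustrative names, not part of any MapReduce framework.

```python
from functools import reduce


def map_phase(chunk):
    """Each worker computes a partial sum over its own chunk."""
    return sum(chunk)


def reduce_phase(partials):
    """Consolidate the partial results into one total."""
    return reduce(lambda a, b: a + b, partials, 0)


# Divide: the numbers from the slide, split across two workers.
chunks = [[2, 6, 8], [1, 7, 3, 9]]

# Conquer: each chunk is processed independently.
partials = [map_phase(chunk) for chunk in chunks]

# Consolidate: combine the partial sums.
total = reduce_phase(partials)
```

Here `partials` is `[16, 20]` and `total` is `36`. In a real cluster each `map_phase` call would run on a different node.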
  • 31. JSON: JavaScript Object Notation, a subset of JavaScript syntax. Similar to XML as a method for representing data. Syntax: Name : Value pairs, e.g. "salary" : "125000". Values are: number, string, Boolean, array, object, or null. Objects can store objects; arrays can store arrays. Separate pairs by commas: "salary" : "125000", "gender" : "male". Curly braces denote objects: { "salary" : "125000", "gender" : "male" }. Square brackets denote arrays: "phone" : ["555-1212", "555-3344"] or "phone" : [ {"office" : "555-1212"}, {"mobile" : "555-3344"} ]
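To see how these syntax rules map onto a programming language, here is a small sketch using Python's standard `json` module. The `manager` field is an invented addition to show how `null` is handled; the other fields follow the slide.

```python
import json

# A JSON object with a number, a string, a nested array of objects, and null.
text = """
{
  "salary": 125000,
  "gender": "male",
  "phone": [{"office": "555-1212"}, {"mobile": "555-3344"}],
  "manager": null
}
"""

record = json.loads(text)             # JSON text -> Python dict
mobile = record["phone"][1]["mobile"]  # arrays become lists, objects become dicts
round_trip = json.loads(json.dumps(record))  # serialize and parse again
```

JSON objects become dictionaries, arrays become lists, and `null` becomes `None`.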
  • 32. JSON Example: { "_id" : 1, "name" : { "first" : "John", "last" : "Backus" }, "contribs" : [ "Fortran", "ALGOL", "FP" ], "awards" : [ { "award" : "W.W. McDowell Award", "year" : 1967, "by" : "IEEE Computer Society" }, { "award" : "Draper Prize", "year" : 1993, "by" : "National Academy of Engineering" } ] } http://www.mongodb.com/json-and-bson
  • 33. REST API: CRUD (Create, Read, Update, Delete) operations through the web. HTTP methods: GET (List/Read), POST (Create; some APIs also use it for updates), PUT (Create or replace at a known URL), DELETE (Delete). Example API: http://www.blinksale.com/api/ List/Read data via HTTP GET to: http://www.blinksale.com/invoices http://www.blinksale.com/invoices/invoice_id/payments http://www.blinksale.com/invoices/?start=2006&end=2008 Returns XML results.
  • 34. REST API: Update data via HTTP POST to http://www.blinksale.com/invoices/invoice_id/payments with the body: <?xml version="1.0" encoding="UTF-8"?> <payment xmlns="http://www.blinksale.com/api"> <amount>1000.00</amount> <date>2006-09-27</date> </payment> REST = REpresentational State Transfer. Twitter example: https://dev.twitter.com/docs/api/1.1 (GET and POST only)
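The POST request above can be sketched with Python's standard library. This only builds the request object and does not send it. The host `api.example.com`, the invoice id, and the helper `build_payment_request` are assumptions for illustration, not the actual Blinksale API.

```python
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host, not a real service


def build_payment_request(invoice_id, amount, date):
    """Build (but do not send) an HTTP POST carrying an XML payment body."""
    body = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<payment>"
        f"<amount>{amount:.2f}</amount>"
        f"<date>{date}</date>"
        "</payment>"
    )
    return urllib.request.Request(
        url=f"{BASE_URL}/invoices/{invoice_id}/payments",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


request = build_payment_request(42, 1000.00, "2006-09-27")
```

Passing `request` to `urllib.request.urlopen` would actually send it; a real API would also need authentication, which is omitted here.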
  • 35. Database SELECT Statements: Oracle: SELECT * FROM relationships | MongoDB: db.relationships.find() | Cassandra (CQL): SELECT * FROM relationships
  • 36. Database SELECT Statements: Redis (Key-Value Store): SMEMBERS relationships | Riak (Key-Value Store with REST API, plus proprietary drivers): http://localhost:8091/riak/relationships/likes | Neo4j (Cypher): MATCH (n)-[r:LIKES]->(m) RETURN n,r,m
  • 37. JOINS without Foreign Keys: original_id = ObjectId() db.employer.insert({ "_id": original_id, "name": "Broadway Tech", "url": "bc.example.net" }) db.people.insert({ "name": "Erin", "employer_id": original_id, "url": "bc.example.net/Erin" }) "Erin" works at "Broadway Tech". One of the employees at "Broadway Tech" is "Erin". The application, not the database, follows the stored employer_id to resolve the relationship. http://docs.mongodb.org/manual/reference/database-references/#document-references
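Because the database does not perform the join, the application resolves the reference itself. The sketch below mimics the two inserts above with plain Python dictionaries; it does not use a MongoDB driver, `uuid` stands in for `ObjectId()`, and the helpers `employer_of` and `employees_of` are hypothetical.

```python
import uuid

employers = {}  # _id -> employer document
people = []     # list of person documents

original_id = uuid.uuid4().hex  # stands in for MongoDB's ObjectId()
employers[original_id] = {
    "_id": original_id, "name": "Broadway Tech", "url": "bc.example.net",
}
people.append(
    {"name": "Erin", "employer_id": original_id, "url": "bc.example.net/Erin"}
)


def employer_of(person_name):
    """Follow a person's employer_id to the employer document."""
    for person in people:
        if person["name"] == person_name:
            return employers[person["employer_id"]]["name"]
    return None


def employees_of(employer_name):
    """Find every person whose employer_id points at the named employer."""
    ids = {doc["_id"] for doc in employers.values() if doc["name"] == employer_name}
    return [person["name"] for person in people if person["employer_id"] in ids]
```

Each lookup is a second query made by the application. Nothing stops an `employer_id` from pointing at a document that no longer exists, which is the cost of not having foreign keys.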
  • 39. ACID, BASE, CAP, CPR: 1979 Gray, 1983 Reuter & Härder - ACID: Atomic (Rollback: All or Nothing), Consistent (Follows Rules), Isolated (Simultaneous), Durable (No Drops). 1997 Brewer - BASE: Basically Available, Soft-state, Eventually consistent. 2000 Brewer - CAP (Pick Two): Consistency, Availability, Partition Tolerance. CPR (Pick Two): Consistency, Performance, Replication/Redundancy. Contrived - Stretch Definitions.
  • 41. CPR: [Figure: Consistency, Performance, and Redundancy; data blocks A, B, C, D spread across storage or computers]
  • 44. CRUD: Create, Read, Update, Delete
  • 45. SQL CRUD: Create: INSERT INTO table (column1, column2) VALUES (9, 'string'); Read: SELECT column1, column2 FROM table; Update: UPDATE table SET column2 = 'text' WHERE column1 = 9; Delete: DELETE FROM table WHERE column2 = 'text';
  • 46. Key-Value Stores: Code bucket (Key | Value): code:java | 17.316% Lowest rank on Feb 2014; code:C | 18.334% Lowest rank on August 2013; code:Objective-C | Lowest rank on Dec 2007 11.341%; code:C++ | {"score":"6.892%", "low rank": "Feb 2008"}. Drink bucket (Key | Value): drink:java | coffee; drink:punch | Sprite + pineapple juice; drink:pop | Carbonated Soda. http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html
  • 47. Redis CRUD: http://redis.io/commands http://redis.io/topics/data-types-intro http://openmymind.net/2011/11/8/Redis-Zero-To-Master-In-30-Minutes-Part-1/ Redis is an in-memory Key-Value Store which stores: Strings, Hashes, Lists, Sets, or Sorted Sets. Strings: the value is stored as one opaque string, so individual fields inside it cannot be altered separately: SET user:jim "{lastname: 'Mathews', salary: 125000}" GET user:jim Hashes: allow modification and retrieval of individual values: HSET user:jim salary 125000 HSET user:jim lastname Mathews HGET user:jim salary
  • 48. Redis CRUD: Lists: one-dimensional arrays with insert, append, pop, and push: redis.lpush('users:employees', 'user:jim') redis.mget(redis.lrange('users:employees', 0, 5)) Sets: lists with no duplicate values (SADD = Set Add): SADD users:employees jim SADD users:employees krishna SMEMBERS users:employees Sorted Sets: sets with an added sorting value (score): ZADD users:employees 125000 jim ZADD users:employees 157000 krishna ZRANGEBYSCORE users:employees 100000 180000
  • 49. Riak CRUD: Easy to install and configure a test cluster. REST queries. Create/PUT a "course:CIS2120" row: curl -v -X PUT http://localhost:8091/riak/course/CIS2120 -H "Content-Type: application/json" -d '{"name":"Database Coding", "days":"MWF"}' Read/GET the value for "course:CIS2120": curl -X GET http://localhost:8091/riak/course/CIS2120 or simply curl http://localhost:8091/riak/course/CIS2120 Result (Key | Value): course:CIS2120 | {"name":"Database Coding", "days":"MWF"}
  • 50. Riak Links: Riak can link one key:value pair to another with a tagged relationship: curl -v -X PUT http://localhost:8091/riak/student/sorensen -H "Content-Type: application/json" -H "Link: </riak/course/CIS2120>; riaktag=\"enrolled\"" -d '{"firstname":"Conner"}' The link is one-directional: this does not automatically create a link from "CIS2120" back to "sorensen".
  • 52. Neo4j – Graph Database: http://www.neo4j.org/learn/try http://docs.neo4j.org/refcard/2.0/ MATCH (n)-[r:LIKES]->(m) RETURN n,r,m Matches a person "n" that likes person "m". https://gist.github.com/peterneubauer/6019125 http://gist.neo4j.org/?6019125
  • 53. Neo4j CRUD: Must try dragging nodes at: http://www.neo4j.org/learn/try MATCH (user {name:"Bill"})-[:KNOWS]->(colleague) WHERE colleague.employer="LinkedIn" RETURN user,colleague ORDER BY colleague.name LIMIT 10 http://docs.neo4j.org/refcard/2.0/ MATCH (n)-[r:LIKES]->(m) RETURN n,r,m Matches a person "n" that likes person "m". MATCH (n)-[r]->(m) RETURN n,r,m Matches any relationship from "n" to "m". http://www.neo4j.org/learn/cypher
  • 54. Neo4j: (LUKE {name:"Luke Skywalker"}), (HAN {name:"Han Solo"}), (LEIA {name:"Princess Leia Organa"}), (OBI_WAN {name:"Obi Wan Kenobi"}), (YODA {name : "Yoda"}), (VADER {name:"Darth Vader"}), (C3PO {name:"C3PO", droid:true}), (R2D2 {name:"R2D2", droid:true}), (CHEWBACCA {name:"Chewbacca"}), (TATOOINE {name:"Tatooine", distance:13184}), (DAGOBAH {name:"Dagobah", distance:15407}), (JEDI {name:"Jedi"}), (SITH {name:"Sith"}), (REBELLION {name:"Rebellion"}), (EMPIRE {name:"Empire"}), (DARK_SIDE {name:"Dark Side"}), (LIGHT_SIDE {name:"Light Side"}), … (LUKE)-[:FRIENDS_WITH]->(HAN), (LUKE)-[:FRIENDS_WITH]->(LEIA), (HAN)-[:FRIENDS_WITH]->(CHEWBACCA), (YODA)-[:TEACHES]->(OBI_WAN), (YODA)-[:TEACHES]->(LUKE), (OBI_WAN)-[:TEACHES]->(LUKE), (OBI_WAN)-[:KNOWS]->(VADER), (LUKE)-[:KNOWS]->(R2D2), (R2D2)-[:KNOWS]->(C3PO), (LUKE)-[:LIVED_ON]->(TATOOINE), (HAN)-[:LIVED_ON]->(CORELLIA), (LEIA)-[:LIVED_ON]->(ALDERAAN), (YODA)-[:LIVED_ON]->(DAGOBAH), (LUKE)-[:DEVOTED_TO]->(JEDI), (LUKE)-[:DEVOTED_TO]->(REBELLION), (LUKE)-[:DEVOTED_TO]->(LIGHT_SIDE), (VADER)-[:DEVOTED_TO]->(SITH), (VADER)-[:DEVOTED_TO]->(EMPIRE), (VADER)-[:DEVOTED_TO]->(DARK_SIDE), (LEIA)-[:DEVOTED_TO]->(REBELLION), (HAN)-[:DEVOTED_TO]->(REBELLION) … https://gist.github.com/peterneubauer/6019125 http://gist.neo4j.org/?6019125 MATCH y-[r]-other WHERE y.name='Yoda' return y.name, type(r), other.name
  • 55. Google BigTable: • White Paper published in 2006 • Many databases based upon BigTable • 13 pages, readable for many non-techies • Insightful into the early days of NoSQL http://static.googleusercontent.com/media/research.google.com/en/us/archive/bigtable-osdi06.pdf
  • 56. HBase: Large-scale, column-oriented database. Consistency, Performance, Fault-Tolerant, ACID via Locking. Tables are created before initial data is added. Tables have: row keys, which are indexed row identifier strings; column families, which contain one or more columns; timestamps for version control.
  • 57. HBase: The row key is a unifier for column families. If a row does not insert values in a column family, no disk space is used within that column family. Columns are identified by column_family:column_name, e.g. text:, revision:author, revision:comment. Write-Ahead Logging (WAL) is similar to file system journaling.
  • 58. HBase CRUD: General form: create 'wiki_table', 'text_column_family', 'revision_column_family' Example: create 'wiki', 'text', 'revision' put 'wiki', 'first page', 'text:', '…' put 'wiki', 'first page', 'revision:author', '…' get 'wiki', 'first page', ['revision:author', 'revision:comment'] delete 'wiki', 'first page', 'revision:author' scan 'wiki' (equivalent to SELECT * FROM wiki) Seven Databases in Seven Weeks, Redmond & Wilson 2012
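The row key, column family, and timestamp model behind these shell commands can be pictured as nested dictionaries. This is a toy in-memory model for illustration only; it is not how HBase stores data on disk, and the `put`, `get`, and `delete` functions are hypothetical stand-ins for the shell commands of the same name.

```python
# row key -> {"family:qualifier" -> {timestamp -> value}}
table = {}


def put(row, column, value, timestamp):
    """Store a new version of a cell; older versions are kept."""
    table.setdefault(row, {}).setdefault(column, {})[timestamp] = value


def get(row, column):
    """Return the newest version of a cell, or None if the cell is absent."""
    versions = table.get(row, {}).get(column)
    if not versions:
        return None
    return versions[max(versions)]


def delete(row, column):
    """Drop every version of a cell (real HBase writes a delete marker)."""
    table.get(row, {}).pop(column, None)


put("first page", "text:", "draft", timestamp=1)
put("first page", "text:", "final", timestamp=2)
put("first page", "revision:author", "jim", timestamp=2)
latest_text = get("first page", "text:")
missing = get("first page", "revision:comment")  # never written, so no storage used
```

A column that was never written simply has no entry, which mirrors the point that an unused column family takes no disk space for that row.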
  • 59. MongoDB Document Store: db.courses.insert({ name: "CIS2120", description: "Database Coding", instructor: { name: "David Williams", email: "[email protected]" }, instructor2: { name: "John Kerley-Weeks", office: "JQL230" }, subjects: ["Python", "MongoDB", "3NF", "ETL", "Star Schema"] })
  • 60. MongoDB vs SQL: http://docs.mongodb.org/manual/reference/sql-comparison/ MongoDB -> SQL Terminology: Collection -> Table, Document -> Row, Field -> Column. db.courses.find() = SELECT * FROM courses | db.courses.count() = SELECT COUNT(*) FROM courses | db.courses.find({name: "CIS2120"}) = SELECT * FROM courses WHERE name = 'CIS2120'
  • 61. MongoDB Simple Database: http://media.mongodb.org/zips.json {"city": "ACMAR", "loc": [-86.51557, 33.584132], "pop": 6055, "state": "AL", "_id": "35004"} {"city": "ADAMSVILLE", "loc": [-86.959727, 33.588437], "pop": 10616, "state": "AL", "_id": "35005"} {"city": "ADGER", "loc": [-87.167455, 33.434277], "pop": 3205, "state": "AL", "_id": "35006"} {"city": "KEYSTONE", "loc": [-86.812861, 33.236868], "pop": 14218, "state": "AL", "_id": "35007"} {"city": "NEW SITE", "loc": [-85.951086, 32.941445], "pop": 19942, "state": "AL", "_id": "35010"} {"city": "ALPINE", "loc": [-86.208934, 33.331165], "pop": 3062, "state": "AL", "_id": "35014"} {"city": "ARAB", "loc": [-86.489638, 34.328339], "pop": 13650, "state": "AL", "_id": "35016"} {"city": "BAILEYTON", "loc": [-86.621299, 34.268298], "pop": 1781, "state": "AL", "_id": "35019"} {"city": "BESSEMER", "loc": [-86.947547, 33.409002], "pop": 40549, "state": "AL", "_id": "35020"} {"city": "HUEYTOWN", "loc": [-86.999607, 33.414625], "pop": 39677, "state": "AL", "_id": "35023"} {"city": "BLOUNTSVILLE", "loc": [-86.568628, 34.092937], "pop": 9058, "state": "AL", "_id": "35031"} {"city": "BREMEN", "loc": [-87.004281, 33.973664], "pop": 3448, "state": "AL", "_id": "35033"} {"city": "BRENT", "loc": [-87.211387, 32.93567], "pop": 3791, "state": "AL", "_id": "35034"} {"city": "BRIERFIELD", "loc": [-86.951672, 33.042747], "pop": 1282, "state": "AL", "_id": "35035"} {"city": "Logan, UT", "additionally": ["Nibley, UT", "River Heights, UT"], "state": "UT", "version": "2.1", "_id": "84321"} {"city": "Olivehurst, CA", "additionally": ["Arboga, CA", "Plumas Lake, CA", "West Linda, CA"], "state": "CA", "version": "2.1", "_id": "95961"}
  • 62. Cassandra Characteristics: Scalable, high-availability, wide-column datastore. Peer-to-peer rather than master-slave clusters. Tunable consistency: can read/write to a single node, a quorum of nodes, or all nodes. Recommends static and dynamic column families. Static column families contain pre-defined columns, e.g. Contact Info: phone, address, email, web. Dynamic column families have variable numbers of similar columns, e.g. students enrolled in a course.
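The single node, quorum, or all choice trades latency against consistency. A widely used rule of thumb for replicated stores is that a read is guaranteed to overlap the latest write when the number of replicas read plus the number written exceeds the replication factor. The sketch below only does that arithmetic; the function names are illustrative and this is not Cassandra driver code.

```python
def quorum(replication_factor):
    """Smallest majority of replicas, e.g. 2 out of 3."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor, write_replicas, read_replicas):
    """True when every read set must share at least one replica with the write set."""
    return read_replicas + write_replicas > replication_factor


n = 3
quorum_both = read_overlaps_write(n, quorum(n), quorum(n))  # quorum write + quorum read
one_and_one = read_overlaps_write(n, 1, 1)                  # single-node write + read
write_all_read_one = read_overlaps_write(n, n, 1)           # write to all, read from one
```

With three replicas, quorum reads and writes always overlap, while single-node reads and writes can miss each other, which is where eventual consistency shows up.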
  • 63. Cassandra CRUD: http://www.datastax.com/docs/0.8/references/cql http://cassandra.apache.org/doc/cql3/CQL.html#selectStmt CREATE TABLE course ( name text PRIMARY KEY, instructor text, maxstudents int ); INSERT INTO course (name, instructor, maxstudents) VALUES ('CIS2120', 'Williams', 28); UPDATE course SET maxstudents=26 WHERE name='CIS2120'; SELECT name, instructor FROM course WHERE maxstudents > 20; (because maxstudents is not part of the primary key, this last query typically needs a secondary index or ALLOW FILTERING, depending on the Cassandra version)
  • 64. Cassandra CRUD: No JOIN operations or FOREIGN KEYS. CREATE TABLE people ( name text PRIMARY KEY, email text, phones map<text, text> ); INSERT INTO people (name, email, phones) VALUES ('John Weeks', '[email protected]', {'mobile' : '555-1212', 'office' : '797-7133', 'fax' : '555-1212'}); UPDATE people SET phones['office'] = '555-1212' WHERE name = 'John Weeks'; (an UPDATE must identify the row by its primary key, here name, rather than by email)
  • 65. Questions ???