BRITAIN’S


    DATA
OR
YOU’RE DOING IT WRONG
IAN PLOSKER
Basho Technologies

@dstroyallmodels
WHO IS




basho
   ?
WE MAKE
Next Top Data Model by Ian Plosker
DATABASE
THIS WORKS AS LONG AS YOU HAVE GOBS OF MEMORY




ORMs & ODMs
NOSQL
DON’T LISTEN TO NOSQL CLOWNS
Who say that all projects must use this newfangled NoSQL
OR NOSQL BROS
Who say that their NoSQL DB is right for every project
THERE IS NO SUCH THING AS NOSQL
THERE'S JUST DATABASES
MAKING DIFFERENT TRADEOFFS
PERSISTENCE STRATEGY

In-Memory     Persistent
              Periodic      Immediate

Memcache      MongoDB       Riak
Redis         Redis         Cassandra
Hana
(PRIMARY) QUERY MODEL

Rich Query    Key-Value
              Pure          Document    Tablet

Relational    Riak          MongoDB     Cassandra
Vertica       BerkeleyDB    Couch       HBase
Datomic       Voldemort     Redis       Bigtable
REPLICATION

Master-Slave     Masterless

  Oracle DB         Riak

   MySQL          Cassandra

 PostgreSQL       Voldemort

    Redis

  MongoDB
DISTRIBUTION

  BYO             Sharded         Ring

Oracle DB         MongoDB       Cassandra

 MySQL          MySQL Cluster     Riak

PostgreSQL                      Voldemort

  Redis
DATA MODEL

Relational    Object      Column-Family

Oracle DB     Riak        Cassandra
MySQL         MongoDB     HBase
PostgreSQL    Couch       Bigtable
              Redis       Hypertable
              Datomic
THERE'S JUST DATABASES
AND QUERIES
CORE ONLINE QUERY TYPES
             Key-Value     Search           Geo              Graph/Relation    Event

Scale-up     BerkeleyDB    Solr             PostGIS          Neo4j             MySQL
             CouchDB       Sphinx           MongoDB
             MongoDB                        Solr
             MySQL

Scale-out    Riak          Elasticsearch    Elasticsearch    ???               HBase
             Cassandra
HOW TO DATA MODEL
HAVE A THINK
STORE YOUR DATA RIGHT
FIT YOUR DATA MODEL TO YOUR APP
YOUR DATA AND QUERY MODEL SHOULD LIVE IN HARMONY
SELECT SUM(offerTotal) AS theOfferTotal,
       SUM(lienTotal) AS theLienTotal,
       SUM(CLVtotal) AS theCLVtotal,
       SUM(estGrossProfitTotal) AS theESTGPtotal
FROM ((
  SELECT COALESCE(SUM(COALESCE(offerAmount, 0)), 0) AS offerTotal,
         COALESCE(SUM(COALESCE(amount, 0) + COALESCE(legalFees, 0) + COALESCE(costs, 0)), 0) AS lienTotal,
         COALESCE(SUM(((amount + legalFees + costs) * (1 + (rateOfInterest / 100) * (FLOOR((UNIX_TIMESTAMP(NOW()) - UNIX_TIMESTAMP(dateOfAttachment)) / 86400) / 365)))), 0) AS CLVtotal,
         COALESCE(SUM((((amount + legalFees + costs) * (1 + (rateOfInterest / 100) * (FLOOR((UNIX_TIMESTAMP(NOW()) - UNIX_TIMESTAMP(dateOfAttachment)) / 86400) / 365))) - COALESCE(offerAmount, 0))), 0) AS estGrossProfitTotal
  FROM lienTable AS theLienTable, propertyTable, property_lien, stateInterestTable, data, judgementLienTable
  WHERE theLienTable.lienID = property_lien.lienID
    AND propertyTable.propertyID = property_lien.propertyID
    AND propertyTable.state = stateInterestTable.state
    AND theLienTable.lienID = judgementLienTable.lienID
    AND theLienTable.lienStatusID IN (65, 70, 75)
    AND data.id = (SELECT data.id
                   FROM lienTable, data, data_lien
                   WHERE lienTable.lienID = data_lien.lienID
                     AND data_lien.id = data.id
                     AND category = 15
                     AND lienTable.lienID = theLienTable.lienID
                   ORDER BY data.id DESC
                   LIMIT 1)
    AND dateOfAttachment != 0
    AND UNIX_TIMESTAMP(NOW()) > UNIX_TIMESTAMP(dateOfAttachment)
    AND FLOOR((UNIX_TIMESTAMP(NOW()) - UNIX_TIMESTAMP(dateOfAttachment)) / 86400) > 0
    AND rateOfInterest > 0
) UNION (
  SELECT COALESCE(SUM(COALESCE(offerAmount, 0)), 0) AS offerTotal,
         COALESCE(SUM(COALESCE(amount, 0) + COALESCE(legalFees, 0) + COALESCE(costs, 0)), 0) AS lienTotal,
         COALESCE(SUM(((amount + legalFees + costs) * (1 + (rateOfInterest / 100) * (FLOOR((UNIX_TIMESTAMP(NOW()) - UNIX_TIMESTAMP(judgementDate)) / 86400) / 365)))), 0) AS CLVtotal,
         COALESCE(SUM((((amount + legalFees + costs) * (1 + (rateOfInterest / 100) * (FLOOR((UNIX_TIMESTAMP(NOW()) - UNIX_TIMESTAMP(dateOfAttachment)) / 86400) / 365))) - COALESCE(offerAmount, 0))), 0) AS estGrossProfitTotal
  FROM lienTable AS theLienTable, propertyTable, property_lien, stateInterestTable, data, judgementLienTable
  WHERE theLienTable.lienID = property_lien.lienID
    AND propertyTable.propertyID = property_lien.propertyID
    AND propertyTable.state = stateInterestTable.state
    AND theLienTable.lienID = judgementLienTable.lienID
    AND theLienTable.lienStatusID IN (65, 70, 75)
    AND data.id = (SELECT data.id
                   FROM lienTable, data, data_lien
                   WHERE lienTable.lienID = data_lien.lienID
                     AND data_lien.id = data.id
                     AND category = 15
                     AND lienTable.lienID = theLienTable.lienID
                   ORDER BY data.id DESC
                   LIMIT 1)
    AND COALESCE(dateOfAttachment, 0) = 0
    AND judgementDate != 0
    AND UNIX_TIMESTAMP(NOW()) > UNIX_TIMESTAMP(judgementDate)
    AND FLOOR((UNIX_TIMESTAMP(NOW()) - UNIX_TIMESTAMP(judgementDate)) / 86400) > 0
    AND rateOfInterest > 0
)) AS theBigTable;
THIS IS NOT HARMONY
YOUR DATA AND QUERY MODEL
DON’T DENORMALIZE FOR THE SAKE OF DENORMALIZING
DATA QUERIED TOGETHER SHOULD BE STORED TOGETHER
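
As a rough illustration of storing together what is queried together, the sketch below keeps the fields that the big SQL query joins across several tables in one lien record, and computes the per-lien value with the same arithmetic as its CLVtotal expression. The field names come from that query. The single-record layout and the function name `current_lien_value` are assumptions made for this example, not something the slides prescribe.

```python
import math
from datetime import datetime


def current_lien_value(lien: dict, now: datetime) -> float:
    """Apply the CLVtotal arithmetic from the SQL query to one lien record.

    (amount + legalFees + costs) * (1 + (rateOfInterest / 100) * (days / 365))
    where days is the whole number of days since dateOfAttachment.
    """
    principal = lien["amount"] + lien["legalFees"] + lien["costs"]
    elapsed_seconds = (now - lien["dateOfAttachment"]).total_seconds()
    days = math.floor(elapsed_seconds / 86400)
    return principal * (1 + (lien["rateOfInterest"] / 100) * (days / 365))


# Hypothetical record: everything the calculation needs lives in one value.
lien = {
    "lienID": 1,
    "amount": 1000.0,
    "legalFees": 0.0,
    "costs": 0.0,
    "rateOfInterest": 10.0,
    "dateOfAttachment": datetime(2010, 7, 20),
}

# Exactly 365 days after attachment, so one full year of simple interest.
value = current_lien_value(lien, datetime(2011, 7, 20))
print(round(value, 2))
```

With the record shaped this way, reading one key returns everything the calculation needs, and no join is involved.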
TIME BOXING
EXAMPLE: TIMEBOX

Key: "2012-07-20 11:30"
Value: {
  "2012-07-20 11:30": 10,
  "2012-07-20 11:31": 8,
  "2012-07-20 11:32": 28,
  "2012-07-20 11:33": 1,
  "2012-07-20 11:34": 13
}
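
The timebox layout above can be sketched in a few lines: each per-minute count is filed under the key of the five-minute box that contains it. The five-minute width and the timestamp format are read off the example. The helper names `box_key` and `timebox` are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"   # timestamp format used in the slide
BOX_MINUTES = 5          # box width implied by the slide's example


def box_key(ts: datetime, width: int = BOX_MINUTES) -> str:
    """Return the key of the box containing ts: ts floored to the box width."""
    floored = ts.replace(minute=ts.minute - ts.minute % width,
                         second=0, microsecond=0)
    return floored.strftime(FMT)


def timebox(per_minute: dict) -> dict:
    """Group {minute: count} entries into {box key: {minute: count}}."""
    boxes = defaultdict(dict)
    for minute, count in per_minute.items():
        boxes[box_key(datetime.strptime(minute, FMT))][minute] = count
    return dict(boxes)


per_minute = {
    "2012-07-20 11:30": 10,
    "2012-07-20 11:31": 8,
    "2012-07-20 11:32": 28,
    "2012-07-20 11:33": 1,
    "2012-07-20 11:34": 13,
}

boxes = timebox(per_minute)
print(boxes)
```

Reading a five-minute window is then a single key lookup rather than five.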
TIERED DATA MODEL
ROLL UPS
EXAMPLE: ROLLUPS
Key: "2012-07-20 11:30"      Key: "2012-07-20 11:35"
Value: {                     Value: {
  "2012-07-20 11:30": 10,      "2012-07-20 11:35": 4,
  "2012-07-20 11:31": 8,       "2012-07-20 11:36": 9,
  "2012-07-20 11:32": 28,      "2012-07-20 11:37": 3,
  "2012-07-20 11:33": 1,       "2012-07-20 11:38": 12,
  "2012-07-20 11:34": 13       "2012-07-20 11:39": 10
  }                          }

                 Key: "2012-07-20 11:40"
                 Value: {
                   "2012-07-20 11:40": 24,
                   "2012-07-20 11:41": 30,
                   "2012-07-20 11:42": 12,
                   "2012-07-20 11:43": 8,
                   "2012-07-20 11:44": 7
                 }
EXAMPLE: ROLLUPS

  Key: "2012-07-20 11:30"
  Value: {
    "2012-07-20 11:30": 60,
    "2012-07-20 11:35": 38,
    "2012-07-20 11:40": 81,
    "2012-07-20 11:45": 58,
    "2012-07-20 11:50": 34,
    "2012-07-20 11:55": 110
    }
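
A minimal sketch of the rollup step, assuming the same five-minute boxes: per-minute counts are summed into one total per box, which becomes an entry in the next tier. The function name `rollup` is hypothetical. The input values are the fifteen per-minute counts from the previous slide.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"  # timestamp format used in the slides


def rollup(minute_counts: dict, width: int = 5) -> dict:
    """Sum per-minute counts into one total per width-minute box."""
    totals = {}
    for minute, count in minute_counts.items():
        ts = datetime.strptime(minute, FMT)
        start = ts.replace(minute=ts.minute - ts.minute % width)
        key = start.strftime(FMT)
        totals[key] = totals.get(key, 0) + count
    return totals


minute_counts = {
    "2012-07-20 11:30": 10, "2012-07-20 11:31": 8, "2012-07-20 11:32": 28,
    "2012-07-20 11:33": 1, "2012-07-20 11:34": 13,
    "2012-07-20 11:35": 4, "2012-07-20 11:36": 9, "2012-07-20 11:37": 3,
    "2012-07-20 11:38": 12, "2012-07-20 11:39": 10,
    "2012-07-20 11:40": 24, "2012-07-20 11:41": 30, "2012-07-20 11:42": 12,
    "2012-07-20 11:43": 8, "2012-07-20 11:44": 7,
}

five_minute_totals = rollup(minute_counts)
print(five_minute_totals)
```

The three totals it produces (60, 38, 81) match the first three entries of the rolled-up value shown above.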
USE NATURAL KEYS
bucket: user
user_id: f47ac10b-58cc-4372-a567-0e02b2c3d479




              SERIOUSLY?
bucket: user
user_id: iplosker




         ISN'T THIS SIMPLER?
IT ISN’T JUST YOUR DATABASE THAT NEEDS TO BE SCALABLE
YOUR DATA MODEL NEEDS TO BE SCALABLE
CONFLICT-FREE REPLICATED DATA TYPES
EXAMPLE: OR-SET

{                       {
    observed: ["A"],        observed: [],
    removed: []             removed: []
}                       }



                ["A"]
EXAMPLE: OR-SET

{                      {
    observed: ["A"],       observed: ["B"],
    removed: []            removed: []
}                      }




               ["A", "B"]
EXAMPLE: OR-SET

{                          {
    observed: ["A","B"],       observed: ["A", "B"],
    removed: []                removed: []
}                          }




                    ["A", "B"]
EXAMPLE: OR-SET

{                          {
    observed: ["A","B"],       observed: ["A", "B"],
    removed: ["B"]             removed: []
}                          }




                    ["A","B"]
EXAMPLE: OR-SET

{                          {
    observed: ["A","B"],       observed: ["A","B"],
    removed: ["B"]             removed: []
}                          }




                       ["A"]
EXAMPLE: OR-SET

{                          {
    observed: ["A","B"],       observed: ["A","B"],
    removed: ["B"]             removed: ["B"]
}                          }




                       ["A"]
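
The sequence above can be replayed with a small sketch that mirrors the slides' simplified representation: each replica holds an `observed` set and a `removed` set, a merge is the union of both, and the visible value is `observed` minus `removed`. The `Replica` class and `converged_value` helper are hypothetical names. Note that this simplified form is an assumption-laden stand-in: a full OR-Set tags every add with a unique identifier so an element can be re-added after removal, which this sketch does not model.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    observed: set = field(default_factory=set)
    removed: set = field(default_factory=set)

    def add(self, item):
        self.observed.add(item)

    def remove(self, item):
        # Only items this replica has observed can be removed.
        if item in self.observed:
            self.removed.add(item)

    def merge(self, other: "Replica"):
        # Set union is commutative, associative and idempotent,
        # so replicas converge regardless of merge order.
        self.observed |= other.observed
        self.removed |= other.removed

    def value(self) -> set:
        return self.observed - self.removed


def converged_value(a: Replica, b: Replica) -> set:
    """Value a reader sees after combining both replicas."""
    return (a.observed | b.observed) - (a.removed | b.removed)


left, right = Replica(), Replica()

left.add("A")                          # left observes A
step1 = converged_value(left, right)   # {"A"}

right.add("B")                         # right observes B concurrently
step2 = converged_value(left, right)   # {"A", "B"}

left.merge(right)                      # replicas exchange state
right.merge(left)

left.remove("B")                       # left removes B
step3 = converged_value(left, right)   # {"A"}

right.merge(left)                      # removal propagates to right
print(step1, step2, step3, right.value())
```

Because merging is a plain union, applying the same update twice or in a different order gives the same result, which is what lets the replicas resolve without coordination.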
PICK THE SOLUTION THAT FITS YOUR PROBLEM
ABOVE ALL
KISS
