Oracle NoSQL Database
Dave Rubin
Director – NoSQL Database Development
The following is intended to outline our general
product direction. It is intended for information
purposes only, and may not be incorporated into any
contract. It is not a commitment to deliver any
material, code, or functionality, and should not be
relied upon in making purchasing decisions.
The development, release, and timing of any
features or functionality described for Oracle’s
products remains at the sole discretion of Oracle.
Agenda


• NoSQL Use Case
• Oracle NoSQL Database
   • Architecture
   • Integration with the RDBMS
   • Benchmark Results
Use Case – Online Display Advertising
• Problem
  • Very low latency requirements – Publishers require 50 – 60 ms response
    time from the ad serving platform
  • Extreme data velocity – Multi-millions of requests per second
  • Highly available – 24/7 sites
  • Revenue maximization – Deliver the most relevant ad to maximize
    revenue


• Solution – Where to use a NoSQL Database?
  • Cookie store – NoSQL database used to store cookies and associated
    behavioral segments
  • Track behavioral data – Beacons utilized during browsing to store
    timestamp, frequency, and behavioral segments by cookie
  • Optimize ad delivery – Recency, frequency, and behavioral segments
    used to determine optimal ad to deliver to user
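
The recency and frequency bullets above can be pictured with a small sketch. It is illustrative only: the `CookieProfile` record, its field names, and the `pick_ad` rule (first matching ad under a frequency cap) are assumptions made for this example, not the ad platform's actual logic.

```python
from dataclasses import dataclass, field


@dataclass
class CookieProfile:
    """Hypothetical per-cookie record; field names are illustrative only."""
    segments: set = field(default_factory=set)
    impressions: dict = field(default_factory=dict)  # ad_id -> list of timestamps

    def record_impression(self, ad_id, ts):
        """Store the timestamp at which ad_id was shown to this cookie."""
        self.impressions.setdefault(ad_id, []).append(ts)

    def frequency(self, ad_id, now, window):
        """Number of times ad_id was shown within the last `window` seconds."""
        return sum(1 for ts in self.impressions.get(ad_id, []) if now - ts <= window)

    def recency(self, ad_id, now):
        """Seconds since ad_id was last shown, or None if it was never shown."""
        shown = self.impressions.get(ad_id)
        return now - max(shown) if shown else None


def pick_ad(profile, candidates, now, cap=3, window=86400):
    """Return the first candidate whose segment matches and is under the cap.

    `candidates` is a list of (ad_id, segment) pairs; `cap` and `window`
    are assumed tuning knobs, not values from the slides.
    """
    for ad_id, segment in candidates:
        if segment in profile.segments and profile.frequency(ad_id, now, window) < cap:
            return ad_id
    return None
```

Each lookup touches a single cookie's record, which is why a key-value store keyed by cookie fits this access pattern.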
Online Display Advertising Overall Solution
[Diagram. Components shown: Ad Server, RDBMS, Hadoop Cluster, Real Time Reporting and Campaign Management, Multi Dimensional Reporting]
Online Display Advertising – Usage Characteristics
• NoSQL Database
  • Low latency, high volume
     • Millions of ad serving requests per minute or second
     • Stringent latency requirements from publishers
  • Loose consistency
     • Cookie data used for ad targeting – increases the probability that the user will click on the ad
• Relational Database
  • Campaign booking information – hundreds of users
  • Real time business metrics for publishers and advertisers
  • Business financials for ad serving company
     • Year to date revenue, quarter over quarter, etc.
     • Billing
     • SOX reporting for public companies

• Hadoop
  • Unique visits (select count(distinct)) over many terabytes of data
  • Inventory forecasting across behavioral segments
Agenda


• NoSQL Use Case
• Oracle NoSQL Database
   • Architecture
   • Integration with the RDBMS
   • Benchmark Results
A Distributed, Scalable Key-Value Database

• Simple Data Model
   • Key-value pair with major+minor-key paradigm
   • CRUD + range scans
• Scalability
   • Dynamic data partitioning and distribution
   • Optimized data access via intelligent driver
• High availability
   • One or more replicas
   • Resilient to partition failures
   • Disaster recovery through location of replicas
   • No single point of failure
• Transparent load balancing
   • Reads from master or replicas
   • Driver is network topology & latency aware
• Elastic Expansion
   • Online addition/removal of storage nodes and automatic data redistribution

[Diagram: Applications, each with a NoSQL DB Driver, connected to Storage Nodes in Data Center A and Data Center B]
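
A minimal in-memory sketch of the major+minor key idea, with CRUD and a range scan. It is a toy, not the Oracle NoSQL Database API: the `ToyKVStore` class, its method names, and the CRC32-based partitioning are assumptions chosen for illustration. Only the major key is hashed, so all minor keys under one major key land in the same partition.

```python
import bisect
import zlib


class ToyKVStore:
    """In-memory illustration only; not the Oracle NoSQL Database API."""

    def __init__(self, num_partitions=4):
        # Each partition maps major key -> sorted list of (minor key, value).
        self.partitions = [{} for _ in range(num_partitions)]

    def _partition(self, major):
        # Hash only the major key so all of its minor keys are co-located.
        return self.partitions[zlib.crc32(major.encode()) % len(self.partitions)]

    def _find(self, rows, minor):
        # (minor,) sorts just before any (minor, value), so this is the slot.
        i = bisect.bisect_left(rows, (minor,))
        found = i < len(rows) and rows[i][0] == minor
        return i, found

    def put(self, major, minor, value):
        """Create the record, or update it if the key already exists."""
        rows = self._partition(major).setdefault(major, [])
        i, found = self._find(rows, minor)
        if found:
            rows[i] = (minor, value)
        else:
            rows.insert(i, (minor, value))

    def get(self, major, minor):
        """Return the value, or None if the key is absent."""
        rows = self._partition(major).get(major, [])
        i, found = self._find(rows, minor)
        return rows[i][1] if found else None

    def delete(self, major, minor):
        """Remove the record; return True if it existed."""
        rows = self._partition(major).get(major, [])
        i, found = self._find(rows, minor)
        if found:
            del rows[i]
        return found

    def range_scan(self, major, start, end):
        """Return (minor, value) pairs with start <= minor < end, in key order."""
        rows = self._partition(major).get(major, [])
        lo = bisect.bisect_left(rows, (start,))
        hi = bisect.bisect_left(rows, (end,))
        return rows[lo:hi]
```

Keeping minor keys sorted under their major key is what makes the range scan a contiguous slice rather than a full search.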
Architecture – The Application’s Perspective
[Diagram: an Application with the NoSQL DB Driver connects to Shard 1, Shard 2, ... Shard N; each shard has a Master and Replicas]
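
One way to picture the driver's role in this layout is the sketch below. It assumes a hash of the major key selects the shard, writes go to the shard's master, and reads go to whichever node currently shows the lowest latency. `ToyDriver`, `Node`, and the `latency_ms` field are hypothetical names for illustration, not the product's driver.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    is_master: bool
    latency_ms: float  # hypothetical latency as observed by the driver


class ToyDriver:
    """Illustrative request router; not the real NoSQL DB Driver."""

    def __init__(self, shards):
        self.shards = shards  # list of shards, each a list of Node

    def shard_for(self, major_key):
        # Assumed routing rule: hash of the major key picks the shard.
        return self.shards[zlib.crc32(major_key.encode()) % len(self.shards)]

    def node_for_write(self, major_key):
        # Writes always go to the shard's master.
        return next(n for n in self.shard_for(major_key) if n.is_master)

    def node_for_read(self, major_key, master_only=False):
        # Reads may be served by the master or a replica; pick the fastest.
        nodes = self.shard_for(major_key)
        if master_only:
            nodes = [n for n in nodes if n.is_master]
        return min(nodes, key=lambda n: n.latency_ms)
```

The `master_only` flag stands in for a read that cannot tolerate replica lag; reads without it trade freshness for latency.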
Transactions

• ACID transactions at shard granularity
• Transaction Scope
  • Single API call
  • All records must have the same major key
  • Multiple operations within a transaction via collections
• Can be relaxed for increased performance on a per-operation basis
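
The transaction scope can be sketched as a single call that takes a collection of operations, rejects it unless every operation shares one major key, and applies all of them or none. The `execute` function, the tuple format, and the plain `dict` store are assumptions for illustration, not the product API.

```python
class MajorKeyMismatch(ValueError):
    """Raised when a batch spans more than one major key."""


def execute(store, operations):
    """Apply ("put" | "delete", major, minor, value) tuples atomically.

    `store` is a plain dict keyed by (major, minor), a hypothetical
    stand-in for one shard's data.
    """
    majors = {op[1] for op in operations}
    if len(majors) > 1:
        raise MajorKeyMismatch(f"batch spans major keys: {sorted(majors)}")

    staged = dict(store)  # work on a copy so a failure leaves store untouched
    for kind, major, minor, value in operations:
        if kind == "put":
            staged[(major, minor)] = value
        elif kind == "delete":
            if (major, minor) not in staged:
                raise KeyError((major, minor))
            del staged[(major, minor)]
        else:
            raise ValueError(f"unknown operation: {kind}")

    # Every operation succeeded; publish the staged state in one step.
    store.clear()
    store.update(staged)
```

Restricting a batch to one major key keeps it inside a single shard, which matches the "shard granularity" bullet above.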
Simple Data Model
 ACID Transactions – Configurability

• Configurable Durability Policy




• Configurable Consistency Policy
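
As a rough illustration of a configurable durability policy, the sketch below computes how many replica acknowledgements a master would wait for under the 'Majority' and 'None' acknowledgement policies named in the benchmark slides. The `required_acks` helper, the extra `ALL` option, and the rule that the master counts toward the majority are assumptions of this sketch.

```python
def required_acks(policy, replication_factor):
    """Replica acknowledgements the master waits for before a write returns.

    Assumption: the master itself counts toward the majority, so only
    the remainder must come from replicas.
    """
    replicas = replication_factor - 1
    if policy == "NONE":
        return 0
    if policy == "MAJORITY":
        majority = replication_factor // 2 + 1
        return majority - 1  # the master already holds the write
    if policy == "ALL":
        return replicas
    raise ValueError(f"unknown policy: {policy}")
```

Under these assumptions a replication factor of 3 needs one replica acknowledgement for 'Majority' and none for 'None'. The write then no longer waits on a network round trip, at the cost of weaker durability.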
Integration with the RDBMS and Other Products

• Oracle External Tables
   • Export data directly from NoSQL database and create Oracle
     External Table
   • Pre-packaged utility


• Oracle Loader for Hadoop
   • Parallel map reduce job
   • Utilizes InputFormat


• Oracle Event Processing
   • NoSQL data available through OEP query language (CQL)
Benchmarks – General Configuration

• YCSB-based QA/benchmarking
    • Key ≈ 10 bytes, Data = 1108 bytes
• Configurations of 6-30 nodes
    • Typical Replication Factor of 3 (master + 2 replicas)
    • 200 million records per shard, 2 billion records in total
    • 2 replication nodes per storage node
    • SSD storage, two SSDs per host
• Minimal I/O overhead
    • B+Tree fits in memory => one I/O per record read
    • Buffered writes + log-structured storage => fast write throughput
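
The configuration figures above imply a few derived numbers for the largest cluster. The arithmetic below is a back-of-the-envelope sketch. It ignores index and log overhead, and it assumes the record counts refer to the largest configuration. The slides do not say whether "6-30 nodes" counts replication nodes or storage nodes, so the storage-node figure is only what the stated ratio would give.

```python
KEY_BYTES = 10                     # "Key ~= 10 bytes"
VALUE_BYTES = 1108                 # "Data = 1108 bytes"
RECORDS_PER_SHARD = 200_000_000
TOTAL_RECORDS = 2_000_000_000
REPLICATION_FACTOR = 3
REP_NODES_PER_STORAGE_NODE = 2

# Shards needed to hold the full data set.
shards = TOTAL_RECORDS // RECORDS_PER_SHARD

# One master plus two replicas per shard.
replication_nodes = shards * REPLICATION_FACTOR

# Hosts implied by the stated 2 replication nodes per storage node.
storage_nodes = replication_nodes // REP_NODES_PER_STORAGE_NODE

# Raw key + value bytes, before and after replication (no overhead included).
logical_bytes = TOTAL_RECORDS * (KEY_BYTES + VALUE_BYTES)
replicated_bytes = logical_bytes * REPLICATION_FACTOR

print(shards, replication_nodes, storage_nodes)
print(f"{logical_bytes / 1e12:.3f} TB logical, {replicated_bytes / 1e12:.3f} TB replicated")
```

Ten shards at a replication factor of 3 gives 30 replication nodes, which is consistent with the "30 (10x3)" cluster label in the result charts.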
Benchmark Results

• 2 billion records
• 226K ops/sec
• HA ack. policy = ‘Majority’
• Low latency
• Highly Scalable

[Chart: Insert Throughput. X-axis: Cluster Size at 6 (2x3), 12 (4x3), 24 (8x3), 30 (10x3). Axes: Throughput (ops/sec), 0 to 250,000, and Average Latency (ms), 0 to 4. Series: Throughput (insert/sec), Write Latency (ms).]
Benchmark Results (cont.)

• 95% read, 5% update
• 2 billion records
• 1.25M ops/sec
• HA ack. policy = ‘Majority’
• Low read/write latency
• Highly Scalable

[Chart: Mixed Throughput. X-axis: Cluster Size at 6 (2x3), 12 (4x3), 24 (8x3), 30 (10x3). Axes: Throughput (ops/sec), 0 to 1,400,000, and Average Latency (ms), 0 to 4. Series: Throughput (ops/sec), Write Latency (ms), Read Latency (ms).]
Benchmark Results (cont.)

• Changed ack-policy from ‘MAJORITY’ to ‘NONE’
• Throughput increased from 226K to 407K ops/sec
• 80% improvement

[Chart: Insert Throughput at cluster size 30 (10x3). Axis: Throughput (ops/sec), 0 to 500,000. Series: Majority, None.]
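
The 80% figure follows directly from the two throughput numbers on this slide. A short check, using the rounded values as quoted:

```python
majority_ops = 226_000  # inserts/sec with ack policy 'MAJORITY'
none_ops = 407_000      # inserts/sec with ack policy 'NONE'

# Relative gain of 'NONE' over 'MAJORITY'.
improvement = (none_ops - majority_ops) / majority_ops
print(f"{improvement:.1%}")  # 80.1%
```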
Questions
