Oracle no sql overview brief

Oracle NoSQL Database
Dave Rubin
Director – NoSQL Database Development

The following is intended to outline our general
product direction. It is intended for information
purposes only, and may not be incorporated into any
contract. It is not a commitment to deliver any
material, code, or functionality, and should not be
relied upon in making purchasing decisions.
The development, release, and timing of any
features or functionality described for Oracle’s
products remains at the sole discretion of Oracle.

Agenda

• NoSQL Use Case
• Oracle NoSQL Database
• Architecture
• Integration with the RDBMS
• Benchmark Results

Use Case – Online Display Advertising
• Problem
• Very low latency requirements – Publishers require 50 – 60 ms response
time from the ad serving platform
• Extreme data velocity – Multi-millions of requests per second
• Highly available – 24/7 sites
• Revenue maximization – Deliver the most relevant ad to maximize
revenue

• Solution – Where to use a NoSQL Database?
• Cookie store – NoSQL database used to store cookies and associated
behavioral segments
• Track behavioral data – Beacons utilized during browsing to store
timestamp, frequency, and behavioral segments by cookie
• Optimize ad delivery – Recency, frequency, and behavioral segments
used to determine optimal ad to deliver to user
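
The cookie-store flow above can be illustrated with a minimal in-memory sketch. Everything here is an assumption made for illustration: the `CookieStore` class, its method names, and the one-hour recency window are hypothetical and are not part of any Oracle NoSQL Database API.

```python
import time
from collections import defaultdict


class CookieStore:
    """In-memory stand-in for a cookie store (hypothetical, not a NoSQL client)."""

    def __init__(self):
        # One profile per cookie: last-seen timestamp, visit count, segments.
        self._profiles = defaultdict(
            lambda: {"last_seen": None, "frequency": 0, "segments": set()}
        )

    def record_beacon(self, cookie_id, segment, now=None):
        """Store timestamp, frequency, and behavioral segment for a cookie."""
        profile = self._profiles[cookie_id]
        profile["last_seen"] = time.time() if now is None else now
        profile["frequency"] += 1
        profile["segments"].add(segment)
        return profile

    def choose_ad(self, cookie_id, ads, now, max_age_seconds=3600):
        """Return the first ad matching a recent cookie's segments, else None."""
        profile = self._profiles.get(cookie_id)  # .get() does not create entries
        if profile is None or now - profile["last_seen"] > max_age_seconds:
            return None  # unknown or stale cookie: no targeted ad
        for ad in ads:
            if ad["segment"] in profile["segments"]:
                return ad["name"]
        return None
```

A real deployment would replace the dictionary with reads and writes against the key-value store, which is where the latency requirement applies.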

Online Display Advertising Overall Solution
[Diagram components: Ad Server, RDBMS, Hadoop Cluster, Real Time Reporting and Campaign Management, Multi Dimensional Reporting]

Online Display Advertising – Usage Characteristics
• NoSQL Database
• Low latency high volume
• Millions of ad serving requests per minute or second
• Stringent latency requirements from publishers
• Loose consistency
• Cookie data used for ad targeting – Increase probability that user will click on ad.
• Relational Database
• Campaign booking information – hundreds of users
• Real time business metrics for publishers and advertisers
• Business financials for ad serving company
• Year to date revenue, quarter over quarter etc.
• Billing
• SOX reporting for public companies

• Hadoop
• Unique visits (select count(distinct)) over many terabytes of data
• Inventory forecasting across behavioral segments

A Distributed, Scalable Key-Value Database

• Simple Data Model
• Key-value pair with major+minor-key paradigm
• CRUD + range scans
• Scalability
• Dynamic data partitioning and distribution
• Optimized data access via intelligent driver
• High availability
• One or more replicas
• Resilient to partition failures
• Disaster recovery through location of replicas
• No single point of failure
• Transparent load balancing
• Reads from master or replicas
• Driver is network topology & latency aware
• Elastic Expansion
• Online addition/removal of storage nodes and automatic data redistribution
[Diagram: Applications, each with a NoSQL DB Driver, connected to Storage Nodes in Data Center A and Data Center B]
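
The major+minor-key model and the idea of partitioning by key can be sketched as follows. This is a toy illustration under stated assumptions: the `KeyValueStore` class is hypothetical, shards are plain dictionaries, and hashing the major key with SHA-256 is an assumed placement rule, not the product's actual partitioning scheme.

```python
import hashlib


class KeyValueStore:
    """Toy key-value store with major/minor keys (hypothetical, in memory)."""

    def __init__(self, num_shards):
        self.shards = [dict() for _ in range(num_shards)]

    def _shard_for(self, major):
        # Assumption: placement depends only on the major key, so all
        # records sharing a major key land on the same shard.
        digest = hashlib.sha256(major.encode("utf-8")).digest()
        return self.shards[int.from_bytes(digest[:4], "big") % len(self.shards)]

    def put(self, major, minor, value):
        self._shard_for(major)[(major, minor)] = value

    def get(self, major, minor):
        return self._shard_for(major).get((major, minor))

    def delete(self, major, minor):
        shard = self._shard_for(major)
        key = (major, minor)
        existed = key in shard
        shard.pop(key, None)
        return existed

    def range_scan(self, major, start_minor, end_minor):
        """Return (minor, value) pairs under one major key, ordered by minor key."""
        shard = self._shard_for(major)
        keys = sorted(
            k for k in shard
            if k[0] == major and start_minor <= k[1] <= end_minor
        )
        return [(k[1], shard[k]) for k in keys]
```

Because a range scan only touches the shard that owns the major key, it never needs to contact the other shards in this sketch.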

Architecture – The Application’s Perspective
[Diagram: Application with NoSQL DB Driver connected to Shard 1, Shard 2, ... Shard N; each shard has a Master and Replicas]

Transactions

• ACID transactions at shard granularity

• Transaction Scope
• Single API call
• All records must have the same major key
• Multiple operations within a transaction via collections

• Can be relaxed for increased performance on a per-operation basis
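
The transaction scope described above (one call, one major key, several operations) can be sketched as an all-or-nothing batch. The `execute` function, the tuple format of the operations, and the `MajorKeyMismatch` exception are hypothetical names chosen for this illustration; they do not reproduce the product's API.

```python
class MajorKeyMismatch(ValueError):
    """Raised when a batch mixes records with different major keys."""


def execute(store, operations):
    """Apply every operation or none of them.

    store: dict keyed by (major, minor).
    operations: list of (op, major, minor, value) with op "put" or "delete".
    """
    majors = {major for _, major, _, _ in operations}
    if len(majors) > 1:
        raise MajorKeyMismatch("all records must share one major key")

    # Work on a copy so a failure leaves the original store untouched.
    staged = dict(store)
    for op, major, minor, value in operations:
        key = (major, minor)
        if op == "put":
            staged[key] = value
        elif op == "delete":
            if key not in staged:
                raise KeyError(key)
            del staged[key]
        else:
            raise ValueError(f"unknown operation: {op}")

    # Commit: swap in the staged contents only after every step succeeded.
    store.clear()
    store.update(staged)
```

Copying the whole store is only acceptable in a toy example; the point is that the same-major-key rule keeps every record of the batch on a single shard, which is what makes shard-level ACID sufficient.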

Simple Data Model
ACID Transactions – Configurability

• Configurable Durability Policy

• Configurable Consistency Policy
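
As an illustration of what a configurable durability policy trades off, the sketch below computes how many replica acknowledgements a write waits for. The policy names 'MAJORITY' and 'NONE' appear in the benchmark slides; 'ALL' is an assumed third option, and the function name and the counting rule (a majority of the whole replication group, master included) are assumptions for this sketch.

```python
def replica_acks_required(policy, replication_factor):
    """Number of replicas that must acknowledge a write before it returns.

    replication_factor counts the master plus its replicas.
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    replicas = replication_factor - 1
    if policy == "NONE":
        return 0  # master does not wait for any replica
    if policy == "MAJORITY":
        # Majority of the group is rf // 2 + 1 nodes; the master is one of them.
        return replication_factor // 2
    if policy == "ALL":
        return replicas  # assumed option: wait for every replica
    raise ValueError(f"unknown policy: {policy}")
```

With the typical replication factor of 3, this rule means a 'MAJORITY' write waits for one replica and a 'NONE' write waits for none, which is consistent with the higher insert throughput reported for 'NONE'.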

Integration with the RDBMS and Other Products

• Oracle External Tables
• Export data directly from NoSQL database and create Oracle
External Table
• Pre-packaged utility

• Oracle Loader for Hadoop
• Parallel map reduce job
• Utilizes InputFormat

• Oracle Event Processing
• NoSQL data available through OEP query language (CQL)

Benchmarks – General Configuration

• YCSB-based QA/benchmarking
• Key ~= 10 bytes, Data = 1108 bytes
• Configurations of 6-30 nodes
• Typical Replication Factor of 3 (master + 2 replicas)
• 200m records per shard, 2 billion records in total
• 2 replication nodes per storage node
• Used SSDs - Two of them per host
• Minimal I/O overhead
• B+Tree fits in memory => one I/O per record read
• Writes are buffered + log structured storage system == fast write throughput
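
The configuration numbers can be cross-checked with a little arithmetic. The sketch assumes that the largest cluster, labelled 30 (10x3) in the charts, means 10 shards with replication factor 3, and it ignores per-record storage overhead; the function names are illustrative only.

```python
KEY_BYTES = 10            # approximate key size from the slide
VALUE_BYTES = 1108        # data size from the slide
RECORDS_PER_SHARD = 200_000_000
REPLICATION_FACTOR = 3    # master + 2 replicas


def total_records(shards):
    """Records stored across all shards (one logical copy)."""
    return shards * RECORDS_PER_SHARD


def raw_bytes(shards, replicated=False):
    """Key + data bytes, optionally counting every replica's copy."""
    size = total_records(shards) * (KEY_BYTES + VALUE_BYTES)
    return size * REPLICATION_FACTOR if replicated else size


# 10 shards x 200M records = 2 billion records, matching the slide.
records = total_records(10)
# About 2.2 TB for one logical copy, about 6.7 TB across three copies.
logical_tb = raw_bytes(10) / 1e12
replicated_tb = raw_bytes(10, replicated=True) / 1e12
```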

Benchmark Results

Insert Throughput
• 2 billion records
• 226K ops/sec
• HA ack. policy = ‘Majority’
• Low latency
• Highly Scalable
[Chart: Throughput (insert/sec) and Write Latency (ms) by Cluster Size: 6 (2x3), 12 (4x3), 24 (8x3), 30 (10x3)]


Benchmark Results (cont.)

Mixed Throughput
• 95% read, 5% update
• 2 billion records
• 1.25M ops/sec
• HA ack. policy = ‘Majority’
• Low read/write latency
• Highly Scalable
[Chart: Throughput (ops/sec), Write Latency (ms), and Read Latency (ms) by Cluster Size: 6 (2x3), 12 (4x3), 24 (8x3), 30 (10x3)]

Benchmark Results (cont.)

Insert Throughput
• Changed ack-policy from ‘MAJORITY’ to ‘NONE’
• Throughput increased from 226K to 407K ops/sec
• 80% improvement
[Chart: Insert throughput (ops/sec) for ack policies Majority and None at cluster size 30 (10x3)]
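
The 80% figure follows directly from the two throughput numbers on the slide. A short check, with a helper name chosen only for this example:

```python
def percent_improvement(before, after):
    """Relative gain of `after` over `before`, in percent."""
    return (after - before) / before * 100


# 226K ops/sec with 'MAJORITY' versus 407K ops/sec with 'NONE'.
gain = percent_improvement(226_000, 407_000)
rounded_gain = round(gain)  # about 80
```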
