Grab some coffee and enjoy the pre-show banter before the top of the hour!
Hadoop and the Relational Database: The Best of Both Worlds 
The Briefing Room
Twitter Tag: #briefr 
The Briefing Room 
Welcome 
Host: Eric Kavanagh
eric.kavanagh@bloorgroup.com 
@eric_kavanagh
Mission
• Reveal the essential characteristics of enterprise software, good and bad
• Provide a forum for detailed analysis of today’s innovative technologies
• Give vendors a chance to explain their product to savvy analysts
• Allow audience members to pose serious questions... and get answers!
Topics
This Month: BIG DATA ECOSYSTEM
September: INTEGRATION & DATA FLOW
October: ANALYTIC PLATFORMS
2014 Editorial Calendar at www.insideanalysis.com/webcasts/the-briefing-room
Executive Summary
• Scale out is the new Agile
• Business needs constant flexibility
• No time for down time
• Grow as quickly as you can sell
Analyst: Robin Bloor
Robin Bloor is Chief Analyst at The Bloor Group
robin.bloor@bloorgroup.com
@robinbloor
Splice Machine
• Splice Machine is a SQL-on-Hadoop database
• The product is ACID-compliant and can power both OLAP and OLTP workloads
• Splice Machine is built on Java-based Apache Derby and HBase/Hadoop
Guests: John Leach & Rich Reimer 
John Leach, Co-Founder and Chief Technology Officer 
With over 15 years of software experience under his belt, John’s expertise in 
analytics and BI drives his role as Chief Technology Officer. Prior to Splice 
Machine, John founded Incite Retail in June 2008 and led the company’s strategy 
and development efforts. Prior to Incite Retail, he ran the business intelligence 
practice at Blue Martini Software and built strategic partnerships with 
integration partners. His focus at Blue Martini was helping clients incorporate 
decision support knowledge into their current business processes utilizing 
advanced algorithms and machine learning. 
Rich Reimer, VP of Marketing and Product Management 
Rich has over 15 years of sales, marketing and management experience in high-tech
companies. Before joining Splice Machine, Rich worked at Zynga as the
Treasure Isle studio head, where he used petabytes of data from millions of daily
users to optimize the business in real-time. Prior to Zynga, he was the COO and
co-founder of a social media platform named Grouply. Before founding Grouply,
Rich held executive positions at Siebel Systems, Blue Martini Software and Oracle
Corporation as well as sales and marketing positions at General Electric and Bell
Atlantic.
Affordable Scale-Out
August 5, 2014
Data Doubling Every 2 Years…
Driven by web, social, mobile, and Internet of Things
Source: 2013 IBM Briefing Book
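As a rough illustration of what doubling every 2 years implies, the sketch below projects data volume under a constant doubling period. The 100 TB starting volume and the function name projected_volume are hypothetical, chosen only for the example; they do not come from the briefing.

```python
def projected_volume(start_tb: float, years: float,
                     doubling_period_years: float = 2.0) -> float:
    """Volume after `years`, assuming it doubles every `doubling_period_years`."""
    return start_tb * 2 ** (years / doubling_period_years)


if __name__ == "__main__":
    start_tb = 100.0  # hypothetical starting volume, not a figure from the slides
    for years in (2, 4, 10):
        volume = projected_volume(start_tb, years)
        print(f"after {years:>2} years: {volume:,.0f} TB")
```

Under this assumption a store grows 32-fold in ten years, which is the kind of growth the scale-up complaints on the next slide are reacting to.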
Traditional RDBMSs Overwhelmed…
Scale-up becoming cost-prohibitive
• Oracle is too darn expensive!
• My DB is hitting the wall
• Users keep getting those spinning beach balls
• We have to throw data away
• Our reports take forever
Case Study: Harte-Hanks
Overview
• Digital marketing services provider
• Real-time campaign management
• Complex OLTP and OLAP environment
Challenges
• Oracle RAC too expensive to scale
• Queries too slow – even up to ½ hour
• Getting worse – expect 30-50% data growth
• Looked for 9 months for a cost-effective solution
Solution Diagram
Cross-Channel Campaigns, Real-Time Personalization, Real-Time Actions
Initial Results
• 10-20x price/perf with no application, BI or ETL rewrites
• ¼ cost with commodity scale out
• 3-7x faster through parallelized queries
Scale-Out: The Future of Databases
Dramatic improvement in price/performance
Scale Up (Increase server size): $
vs.
Scale Out (More small servers): $ $ $ $ $
Who are We?
THE ONLY HADOOP RDBMS
Replace your old RDBMS with a scale-out SQL database
• Affordable, Scale-Out
• ACID Transactions
• No Application Rewrites
10x Better Price/Perf
Customer Performance Benchmarks
Typically 10x price/performance improvement
[Chart: SPEED VS. PRICE/PERFORMANCE; values shown: 30x, 3-7x, 10-20x, 10x, 20x, 10-15x, 7x, 5x]
Use Cases
• Digital Marketing
• Campaign management
• Unified Customer Profile
• Real-time personalization
• Data Lake
• Operational reporting and analytics
• Operational Data Stores
• Fraud Detection
• Personalized Medicine
• Internet of Things
• Network monitoring
• Cyber-threat security
• Wearables and sensors
Seasoned Team
Successful Serial Entrepreneurs
Enterprise Software Experience
Database & Big Data Experience
Big Data Research & Community Leadership
Hadoop User Group
What People are Saying…
Recognized as a key innovator in databases
Quotes
“Scaling out on Splice Machine presented some major benefits over Oracle ...automatic balancing between clusters...avoiding the costly licensing issues.”
“An alternative to today’s RDBMSes, Splice Machine effectively combines traditional relational database technology with the scale-out capabilities of Hadoop.”
“The unique claim of … Splice Machine is that it can run transactional applications as well as support analytics on top of Hadoop.”
Awards
Proven Building Blocks: Hadoop and Derby
APACHE DERBY
• ANSI SQL-99 RDBMS
• Java-based
• ODBC/JDBC Compliant
APACHE HBASE/HDFS
• Auto-sharding
• Real-time updates
• Fault-tolerance
• Scalability to 100s of PBs
• Data replication
HBase: Proven Scale-Out
• Auto-sharding
• Scales with commodity hardware
• Cost-effective from GBs to PBs
• High availability thru failover and replication
• LSM-trees
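The LSM-tree idea behind the last bullet can be sketched in a few lines: writes land in an in-memory table, are flushed as immutable sorted runs, and reads check the newest data first. This is a toy, single-process illustration under stated assumptions; the class name ToyLSM and its methods are hypothetical, and HBase's real storage path (MemStore, HFiles, compactions) is far more involved.

```python
import bisect


class ToyLSM:
    """Toy log-structured merge store: a mutable memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit
        self.memtable = {}   # newest writes, held in memory
        self.runs = []       # flushed runs, newest first; each a sorted list of (key, value)

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Writes become one sorted, immutable run instead of in-place updates.
        self.runs.insert(0, sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:  # newest run wins
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one, letting newer values overwrite older ones.
        merged = {}
        for run in reversed(self.runs):
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []
```

The design choice to note is that updates never rewrite old data in place; stale versions are only discarded when runs are merged.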
Distributed, Parallelized Query Execution
• Parallelized computation across cluster
• Moves computation to the data
• Utilizes HBase co-processors
• No MapReduce
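A minimal sketch of "moving computation to the data": each shard computes a small partial aggregate locally, and only those partials are merged centrally. Threads stand in for cluster nodes here, and the shard contents and function names are hypothetical. This is not the HBase co-processor API, only the general scatter-gather pattern.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards: each list stands in for the rows held by one node.
SHARDS = [
    [("east", 10.0), ("west", 5.0)],
    [("east", 2.5), ("north", 7.0)],
    [("west", 1.0)],
]


def local_aggregate(rows):
    """Runs next to the data: per-region (sum, count) for one shard."""
    partial = {}
    for region, amount in rows:
        total, count = partial.get(region, (0.0, 0))
        partial[region] = (total + amount, count + 1)
    return partial


def merge_partials(partials):
    """Combines the small per-shard results; raw rows never leave their shard."""
    merged = {}
    for partial in partials:
        for region, (total, count) in partial.items():
            t, c = merged.get(region, (0.0, 0))
            merged[region] = (t + total, c + count)
    return merged


def average_by_region(shards):
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(local_aggregate, shards))
    merged = merge_partials(partials)
    return {region: total / count for region, (total, count) in merged.items()}


if __name__ == "__main__":
    print(average_by_region(SHARDS))
```

Shipping (sum, count) pairs rather than rows is what keeps network traffic proportional to the number of groups instead of the number of rows.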
ANSI SQL-99 Coverage
• Data types – e.g., INTEGER, REAL, CHARACTER, DATE, BOOLEAN, BIGINT
• DDL – e.g., CREATE TABLE, CREATE SCHEMA, ALTER TABLE, DELETE, UPDATE
• Predicates – e.g., IN, BETWEEN, LIKE, EXISTS
• DML – e.g., INSERT, DELETE, UPDATE, SELECT
• Query specification – e.g., SELECT DISTINCT, GROUP BY, HAVING
• SET functions – e.g., UNION, ABS, MOD, ALL, CHECK
• Aggregation functions – e.g., AVG, MAX, COUNT
• String functions – e.g., SUBSTRING, concatenation, UPPER, LOWER, POSITION, TRIM, LENGTH
• Conditional functions – e.g., CASE, searched CASE
• Privileges – e.g., privileges for SELECT, DELETE, INSERT, EXECUTE
• Cursors – e.g., updatable, read-only, positioned DELETE/UPDATE
• Joins – e.g., INNER JOIN, LEFT OUTER JOIN
• Transactions – e.g., COMMIT, ROLLBACK, READ COMMITTED, REPEATABLE READ, READ UNCOMMITTED, Snapshot Isolation
• Sub-queries
• Triggers
• User-defined functions (UDFs)
• Views – including grouped views
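To make a few of the listed constructs concrete, the sketch below runs a query that combines a LEFT OUTER JOIN, an IN predicate, GROUP BY, HAVING, an aggregate and a searched CASE. It uses Python's built-in sqlite3 module purely as a stand-in SQL engine so the example runs anywhere; it does not connect to Splice Machine, and the tables, columns and data are hypothetical.

```python
import sqlite3

SCHEMA_AND_DATA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name VARCHAR(40), region VARCHAR(10));
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Ada', 'east'), (2, 'Bob', 'west'), (3, 'Cy', 'east');
INSERT INTO orders VALUES (1, 1, 120.0), (2, 1, 30.0), (3, 2, 15.0);
"""

QUERY = """
SELECT c.region,
       COUNT(o.id) AS order_count,
       CASE WHEN SUM(o.amount) >= 100 THEN 'high' ELSE 'low' END AS tier
FROM customers c
LEFT OUTER JOIN orders o ON o.customer_id = c.id
WHERE c.region IN ('east', 'west')
GROUP BY c.region
HAVING COUNT(o.id) > 0
ORDER BY c.region
"""


def run_demo():
    """Builds an in-memory database and returns the query's rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA_AND_DATA)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

The point of the "no application rewrites" claim is that statements of this shape are meant to run unchanged; whether a given engine accepts every construct still has to be checked against its own documentation.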
Lockless, ACID transactions
State-of-the-Art Snapshot Isolation
• Adds multi-row, multi-table transactions to HBase with rollback
• Fast, lockless, high concurrency
• ZooKeeper coordination
• Extends research from Google Percolator, Yahoo Labs, U of Waterloo
[Diagram: Transaction A, Transaction B, Transaction C; timestamps Ts and Tc]
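Snapshot isolation in general can be sketched as follows: each transaction reads the versions committed at or before its start timestamp, buffers its writes without taking locks, and aborts at commit if another transaction has committed a newer version of a key it wrote. This is a generic, single-process textbook sketch, not Splice Machine's implementation; the class names are hypothetical, and start_ts and commit_ts are assumed to play the roles the diagram labels Ts and Tc.

```python
class WriteConflict(Exception):
    """Raised when a concurrent transaction committed a write to the same key first."""


class ToySnapshotStore:
    def __init__(self):
        self.clock = 0       # single logical clock for start and commit timestamps
        self.versions = {}   # key -> list of (commit_ts, value), ascending by commit_ts

    def begin(self):
        self.clock += 1
        return Transaction(self, start_ts=self.clock)


class Transaction:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts
        self.writes = {}     # buffered writes; no locks are taken

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        history = self.store.versions.get(key, [])
        visible = [value for ts, value in history if ts <= self.start_ts]
        return visible[-1] if visible else None

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Abort if any written key has a version newer than our snapshot.
        for key in self.writes:
            history = self.store.versions.get(key, [])
            if history and history[-1][0] > self.start_ts:
                raise WriteConflict(key)
        self.store.clock += 1
        commit_ts = self.store.clock
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append((commit_ts, value))
        return commit_ts
```

Readers never block writers in this scheme, which is where the high-concurrency claim comes from; the cost is that one of two overlapping writers to the same key must roll back.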
BI and SQL tool support via ODBC
No application rewrites needed
Who are We?
THE ONLY HADOOP RDBMS
Replace your old RDBMS with a scale-out SQL database
• Affordable, Scale-Out
• ACID Transactions
• No Application Rewrites
10x Better Price/Perf
Thank You!
Perceptions & Questions
Analyst: Robin Bloor
Hadoop as a Data Refinery?
Robin Bloor, PhD
Data Flow – A Set of Principles 
u The data layer is one logical collection of data, 
both external and internal 
u The data flows, from ingest through a refining 
process to a point of application 
u It is best if data doesn’t flow much 
u “Vanilla Hadoop” is a viable catching & refining 
vehicle 
u Beyond that a database is required to manage 
workloads
Big Data Architecture
Data Refining
The Data Engines 
STREAMING DATA 
OLTP 
LARGE QUERY 
LARGE ANALYTICAL QUERY 
SQL, JSON, SPARQL QUERIES
u How does Splice Machine organize its data? 
u Is this an OLTP database or a BI database? Or can 
it be both at the same time? 
u What do you see as the sweet spot for this 
database: 
• In respect of Big Data? 
• In respect of business applications?
u Is Splice Machine also suited for analytical 
applications? 
u Do you also find yourselves competing with 
NoSQL products? 
u In respect of scale, what is your largest 
implementation by data volume and by 
transaction rate?
Upcoming Topics
This Month: BIG DATA ECOSYSTEM
September: INTEGRATION & DATA FLOW
October: ANALYTIC PLATFORMS
www.insideanalysis.com/webcasts/the-briefing-room
2014 Editorial Calendar at www.insideanalysis.com
THANK YOU for your ATTENTION!
Opening slide image courtesy of Wikimedia Commons

