This presentation includes information that is confidential and
proprietary to Basho Technologies and should not be forwarded or
distributed without Basho's prior written consent. © 2014. Basho
Technologies, Inc. All Rights Reserved.
Matt Brender
Developer Advocate
Taming Big Data with NoSQL
Relational databases are
not bad
Data Scientist
Big Data
Big Data is
And it’s a distributed
systems problem
Beyond the scope of one tool
Beyond the scope of one file system
Beyond the scope of one database
Ergo, NoSQL
NoSQL is
For Good Reason
Consistency Level
Conflict Resolution
Partitioning Strategy
Consistency Level
Eventually Consistent
C = Consistency
A = Availability
P = Partition Tolerance
[Diagram: two clients, three DB replicas, and a network partition between them]
The CAP theorem states that a distributed system can provide at most
2 of these 3 properties at the same time
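One common way a distributed store exposes a consistency level is quorum tuning: each key lives on N replicas, a write waits for W acknowledgements, and a read waits for R answers. The slides do not spell this out, so the names N, R, W and the QuorumConfig class below are illustrative assumptions, and the overlap rule is a simplification of what real systems do.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class QuorumConfig:
    """Hypothetical per-request consistency settings (not a real client API)."""
    n: int  # replicas that hold each key
    r: int  # replicas that must answer a read
    w: int  # replicas that must acknowledge a write

    def read_write_sets_overlap(self) -> bool:
        # If R + W > N, every read set shares at least one replica
        # with every write set, so a read can observe the latest write.
        return self.r + self.w > self.n

    def tolerated_failures(self) -> Tuple[int, int]:
        # (replicas that may be down for reads, replicas that may be down for writes)
        return self.n - self.r, self.n - self.w


strict = QuorumConfig(n=3, r=2, w=2)    # leans toward consistency
eventual = QuorumConfig(n=3, r=1, w=1)  # leans toward availability

for name, cfg in (("strict", strict), ("eventual", eventual)):
    print(name, cfg.read_write_sets_overlap(), cfg.tolerated_failures())
```

The eventual setting keeps accepting reads and writes with two of three replicas unreachable, at the price of reads that may return stale data until the replicas converge.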
Conflict Resolution
Last Write Wins
vs.
Causal Context
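The difference between the two strategies can be sketched in a few lines. Last write wins compares wall-clock timestamps and silently drops the loser. A causal context (modeled here as a simple vector clock, which is an assumption about what the slide means) can tell whether one write has seen the other, and keeps both values when they are concurrent. All function names and sample values are hypothetical.

```python
from typing import Dict, List, Tuple

VClock = Dict[str, int]
Versioned = Tuple[VClock, str]    # (causal context, value)
Timestamped = Tuple[float, str]   # (wall-clock time, value)


def last_write_wins(a: Timestamped, b: Timestamped) -> Timestamped:
    # The later timestamp wins; the other write is discarded.
    return a if a[0] >= b[0] else b


def descends(a: VClock, b: VClock) -> bool:
    # True if clock a has seen every update recorded in clock b.
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def resolve(a: Versioned, b: Versioned) -> List[Versioned]:
    if descends(a[0], b[0]):
        return [a]      # a supersedes b
    if descends(b[0], a[0]):
        return [b]      # b supersedes a
    return [a, b]       # concurrent writes: keep both as siblings


# Two clients update the same key without seeing each other's write.
cart_a: Versioned = ({"client_a": 1}, "book")
cart_b: Versioned = ({"client_b": 1}, "pen")
# A later write that has read both versions and merged them.
merged: Versioned = ({"client_a": 1, "client_b": 1}, "book,pen")

print(last_write_wins((10.0, "book"), (11.0, "pen")))  # one value survives
print(resolve(cart_a, cart_b))                         # both values survive
print(resolve(merged, cart_a))                         # merged value supersedes
```

With last write wins the application never learns that "book" was lost. With a causal context the conflict is surfaced, and the application (or a data type designed for merging) decides how to combine the siblings.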
Partition Strategy
Master
Slave Slave Slave
OR
Node 1   Node 2   Node 3
Partition Strategy
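One way to spread keys across equal peer nodes, instead of routing everything through a master, is consistent hashing. The sketch below is a minimal illustration under that assumption. The HashRing class, the node names, and the choice of 64 virtual nodes are hypothetical and do not describe any particular product's implementation.

```python
import bisect
import hashlib
from typing import Iterable


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes: Iterable[str], vnodes: int = 64) -> None:
        # Each physical node claims several points on the ring so that
        # load stays roughly even when nodes join or leave.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        # First 8 bytes of SHA-1 give a stable 64-bit position on the ring.
        return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")

    def owner(self, key: str) -> str:
        # Walk clockwise to the first point after the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, self._hash(key)) % len(self._points)
        return self._ring[idx][1]


keys = [f"user:{i}" for i in range(1000)]
three = HashRing(["node1", "node2", "node3"])
four = HashRing(["node1", "node2", "node3", "node4"])

before = {k: three.owner(k) for k in keys}
after = {k: four.owner(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved after adding node4")
```

The property worth noticing is that adding a node only moves keys onto the new node. Keys never shuffle between the existing nodes, which is what lets a cluster grow without a full reshuffle.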
“data is powerful when
stored and analyzed”
Relational databases are not
bad
Storing
Report on this
vs
A single scalable system
Analyzing
What kind of questions
do you need to ask?
Error Analysis?
As simple as NoSQL + Solr
Patterns?
Machine Learning?
Multi-client writes to NoSQL
& HDFS
OR
NoSQL + ETL process =>
other datastore + Spark
and/or Hadoop M/R
OR
NoSQL & Apache Storm to
Kafka to HDFS
Lambda Architecture
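Assuming this slide refers to the Lambda Architecture, the idea is to keep an append-only master dataset, recompute a batch view from it periodically, maintain a small real-time view for events that arrived since the last batch run, and merge the two at query time. The sketch below uses in-memory stand-ins for every layer. The class and method names are hypothetical and no real Hadoop, Storm, or Kafka API is involved.

```python
from collections import Counter
from typing import List


class LambdaCounts:
    """Toy event counter with a batch layer and a speed layer (illustrative only)."""

    def __init__(self) -> None:
        self.master_log: List[str] = []          # append-only events (the HDFS role)
        self.batch_view: Counter = Counter()     # rebuilt from scratch by the batch job
        self.speed_view: Counter = Counter()     # incremental counts since the last batch run

    def ingest(self, event_type: str) -> None:
        # Every event goes to the master log and to the speed layer.
        self.master_log.append(event_type)
        self.speed_view[event_type] += 1

    def run_batch(self) -> None:
        # Recompute the batch view over all data, then reset the speed layer,
        # since the batch view now covers everything it had counted.
        self.batch_view = Counter(self.master_log)
        self.speed_view.clear()

    def query(self, event_type: str) -> int:
        # Serving layer: merge the precomputed view with recent increments.
        return self.batch_view[event_type] + self.speed_view[event_type]


pipeline = LambdaCounts()
for event in ("error", "error", "login"):
    pipeline.ingest(event)
count_before_batch = pipeline.query("error")

pipeline.run_batch()
pipeline.ingest("error")
count_after_batch = pipeline.query("error")
print(count_before_batch, count_after_batch)
```

Queries stay correct whether or not the batch job has run recently. The batch job only changes how much of the answer comes from the precomputed view.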
So we agree.
NoSQL is helpful.
Side Note: NoSQL is a
terrible term
In Review
You can’t analyze what you
don’t have.
And you don’t want an analysis
system to be unreliable.
THE COST OF DOWNTIME
everything works
at small scale
Nothing matters if...
Summary
•  Hadoop
A framework that allows for the distributed processing of large data sets across clusters
•  Spark
A fast, general engine for large-scale data processing
•  Storm
A distributed real-time computation system
•  NoSQL
A collection of highly scalable, highly available systems that make different trade-offs under the CAP theorem
•  Solr
Apache project for indexing text for search
•  Kafka
Distributed, scalable pub/sub message queue
Thank You!
Matt Brender
@mjbrender
