© 2015 MapR Technologies 1
MapR-DB: New Options For Creating Breakthrough
Next Gen Apps with NoSQL And Hadoop
© 2015 MapR Technologies 2
NoSQL Was Designed For Big Data
• RDBMSs have been the default
choice for applications
– But face cost/time challenges for
rapidly growing, varying data sets
• NoSQL was designed for big data
– E.g., User transaction data, sensor
data, IoT data, time series data, etc.
RDBMS
NoSQL
© 2015 MapR Technologies 3
Known NoSQL Database Challenges Today
With Other NoSQL Databases
• Data loss
• Data inconsistency
• Long maintenance downtime (e.g.,
compactions, anti-entropy)
• Coarse-grained access controls
• Cluster/silo sprawl
– Maintenance pains
– Complexity, more error prone
• Constant data movement between
database and analytics cluster
– Excessive bandwidth utilization
– Delays in accessing data
• Modeling of complex data
– Longer app development cycles
– Higher chance of coding errors
• Multiple databases for multiple kinds
of applications
© 2015 MapR Technologies 4
Requirements to Resolve Today’s Challenges
• Tighter Hadoop integration
– Reduce cluster sprawl
– Reduce data movement
– Enable real-time analytics on live data
– Lower administrative overhead
• Flexible JSON data model
• Automatic optimizations
– Less maintenance downtime
– Consistent high performance
• Fine-grained access controls
– More than simply table/document level
• Globally consistent deployment capability
Hadoop NoSQL
Data Platform
© 2015 MapR Technologies 5
MapR-DB Architectural Principles
Dramatically Simpler, High-Performance at Global Scale
• Self-healing from HW and SW failures
– Replicated state and data for instant recovery
– Automated re-replication of data
• High performance and low latency
– Integrated system with fewer software layers
– Single hop to data
– No compactions, low I/O amplification (patented secret sauce)
• Minimal administration
– Single namespace for files and tables (and streams going forward)
– Built-in data management & protection
– Automatic splits and merges as data grows and shrinks
• Global low-latency replication for disaster recovery
© 2015 MapR Technologies 6
Built into Hadoop = Real-Time
Hadoop NoSQL
Churn
Analysis
Offers
Fraud
Detection
Customer
Profiles
Log files IoT Data
Batch Copies
Analytical Operational
MapR Distribution
Churn
Analysis
Offers
Fraud
Detection
Customer
Profiles
Log files IoT Data
Analytical + Operational
Analytics as it happens, no cross-cluster copying
Hadoop MapR-DB
Non-MapR:
• Batch-only
• Cluster sprawl
With MapR:
• Real-time data access
• Multi-use-case platform
Revenue
Optimization
Predictive
Analytics
Sentiment
analysis
Click
streams
Call logs
Social
media
© 2015 MapR Technologies 7
Real-Time Integration with Other Systems
MapR-DB replication engine is extensible for
integration with any external systems
MapR-DB
Streaming
Real-Time
Reliable Transport
Storm
Elasticsearch
Remote MapR-DB Tables
Future
© 2015 MapR Technologies 8
Designed For Global Deployments
Multi-master (a.k.a. active/active) replication
Active Read/Write
End Users
• Faster data access – minimize network
latency on global data with local clusters
• Reduced risk of data loss – real-time,
bi-directional replication for synchronized
data across active clusters
• Application failover – upon any cluster
failure, applications continue via
redirection to another cluster
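The failover bullet above can be illustrated with a minimal client-side sketch. It is not MapR's mechanism: the `ClusterUnavailable` exception, the `run_with_failover` helper, and the `read_profile` operation are all hypothetical names invented for this example, and the cluster names are just labels. The sketch only shows the idea of retrying an operation against the next active cluster when the preferred one fails.

```python
class ClusterUnavailable(Exception):
    """Hypothetical error raised when a target cluster cannot be reached."""


def run_with_failover(clusters, operation):
    """Try `operation` on each cluster in order; return (cluster, result)."""
    errors = {}
    for cluster in clusters:
        try:
            return cluster, operation(cluster)
        except ClusterUnavailable as err:
            errors[cluster] = err  # remember why this cluster was skipped
    raise ClusterUnavailable(f"no cluster available: {errors}")


# Usage: the nearest cluster is listed first; London is simulated as down.
DOWN = {"london"}


def read_profile(cluster):
    """Stand-in for a real read; fails if the cluster is marked as down."""
    if cluster in DOWN:
        raise ClusterUnavailable(cluster)
    return {"served_by": cluster, "user": "mary"}


served_by, profile = run_with_failover(
    ["london", "new-york", "singapore"], read_profile
)
print(served_by, profile)
```

Because every cluster accepts reads and writes in an active/active setup, the order of the list is only a latency preference, not a primary/secondary designation.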
© 2015 MapR Technologies 9
Real-Time Analytics With Hadoop
Distributed clusters close to the end
users, with real-time analytics at a central
cluster
MapR-DB cluster
(London)
MapR-DB cluster
(New York)
MapR-DB cluster
(Singapore)
MapR-DB/Hadoop
cluster
Hadoop analytics
Operational and analytical workloads
combined in a single cluster in a
single datacenter
Operationally efficient,
consolidated MapR cluster
Database
operations
Hadoop
analytics
Active Read/Write
End Users
© 2015 MapR Technologies 10
Granular Security
Use Access Control Expressions (ACEs) to set granular
permissions.
Example: user:mary | (group:admins & group:VP) & user:!bob
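To make the example expression concrete, here is a minimal sketch of how such a boolean expression could be evaluated for a given user. It is an illustration only, not MapR's implementation: the `evaluate` function is hypothetical, and the precedence (`!` binds tightest, then `&`, then `|`) is an assumption based on common boolean conventions. The slide's `user:!bob` form is read as "any user except bob".

```python
import re

# Tokens: parentheses, operators, and user:/group: terms (optionally negated).
TOKEN = re.compile(r"\s*(\(|\)|\||&|!|(?:user|group):!?[\w.-]+)")


def tokenize(expr):
    tokens, pos = [], 0
    expr = expr.strip()
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if not match:
            raise ValueError(f"unexpected text at position {pos}: {expr[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(expr, user, groups):
    """Return True if `user` (a member of `groups`) satisfies `expr`."""
    tokens = tokenize(expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def parse_or():  # lowest precedence (assumed)
        value = parse_and()
        while peek() == "|":
            take()
            rhs = parse_and()
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == "&":
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():  # prefix negation, e.g. !group:VP
        if peek() == "!":
            take()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        if peek() is None:
            raise ValueError("unexpected end of expression")
        tok = take()
        if tok == "(":
            value = parse_or()
            if peek() != ")":
                raise ValueError("missing ')'")
            take()
            return value
        if ":" not in tok:
            raise ValueError(f"unexpected token {tok!r}")
        kind, name = tok.split(":", 1)
        negated = name.startswith("!")  # the slide's user:!bob form
        name = name.lstrip("!")
        matched = (name == user) if kind == "user" else (name in groups)
        return matched != negated

    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result


ACE = "user:mary | (group:admins & group:VP) & user:!bob"
print(evaluate(ACE, "mary", set()))
print(evaluate(ACE, "bob", {"admins", "VP"}))
print(evaluate(ACE, "alice", {"admins", "VP"}))
```

Under these assumptions, mary is always granted access, bob is denied even though he belongs to both groups, and any other user needs membership in both `admins` and `VP`.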
© 2015 MapR Technologies 11
Open Source OJAI API for JSON-Based Applications on
Hadoop
Open JSON Application Interface (OJAI)
Databases Other Systems
MapR-DB
MapR-Client
{JSON}
File Systems
© 2015 MapR Technologies 12
Single Cluster Data Lake Capabilities
MapR-DB MapR-FS
MapR Data Platform
Distribution including
Apache Hadoop
MapR-DB: relational,
time series,
structured data
MapR-FS: emails,
blogs, tweets, log
files, unstructured
data
Agile, self-service
data exploration
ETL into operational
reporting formats
(e.g., Parquet)
Multi-tenancy:
job/data placement
control, volumes
Access controls:
file, table, column,
column family, doc,
sub-doc levels
Sources
RELATIONAL,
SAAS,
MAINFRAME
DOCUMENTS,
EMAILS
LOG FILES,
CLICKSTREAM
SENSORS
BLOGS,
TWEETS,
LINK DATA
DATA
WAREHOUSES,
DATA MARTS
Auditing:
compliance, analyze
user accesses
Snapshots:
track data lineage
and history
Table Replication:
global multi-master,
business continuity
© 2015 MapR Technologies 13
Q&A
@mapr maprtech
@mapr.com
Engage with us!
MapR
maprtech
mapr-technologies

MapR-DB – The First In-Hadoop Document Database


Editor's Notes

  • #3: What is NoSQL used for (one slide): applications for non-relational data, rapidly growing data sets, time series data, and consolidating disparate data sets.
  • #4: Typical pain points (one slide). Versus RDBMS: scaling challenges (leading to loss of performance and higher costs) and data modeling challenges. Versus existing NoSQL: data integrity/reliability, high maintenance, inability to handle 24x7 environments, and limited security capabilities.
  • #5: What we think needs to be resolved (one slide). Hadoop integration for “big data capabilities”: predictive analytics, anomaly detection, large-scale processing. Enterprise-grade reliability. Reduced cluster sprawl, real-time access to data, reduced data movement, and reduced administration on disparate technologies. Automatic optimizations: no compactions, garbage collection, anti-entropy, or complex HA configurations. Security (access controls) at a granular level.