© 2017 MapR TechnologiesMapR Confidential 1
MapR Product Update
Spring 2017
MapR Product Team
April 27, 2017
Before We Begin
• This webinar is being recorded. Later this week, you will
receive an email on how to get the recording and slide deck.
• If you can’t hear the audio from your computer, a dial-in phone
number is in the chat window in the lower left corner of your
screen.
• If you have any questions during the webinar, please type them
in the chat window.
Introducing Our Speakers from MapR
Mitesh Shah
Sr. Product Marketing Manager
Rachel Silver
Technical Product Manager,
Apache Spark and Ecosystem Projects
Saurabh Mahapatra
Sr. Product Manager, Apache Drill
Agenda
• MapR Persistent Application Client Containers (PACC)
• MapR Edge for Internet of Things
• MapR Ecosystem Pack 3.0
– Apache Hive 2.1.1 with significantly faster performance
– Spark 2.1.0 with many stability and security enhancements
– New connectors and APIs
• Drill 1.10 new features
• Q & A
MAPR CONVERGED DATA
PLATFORM FOR DOCKER
Announcement
MapR Announces the Converged Data
Platform for Docker
Supports containerization of existing
and new applications by providing containers with
persistent data access from anywhere
MapR Persistent Application Client Container (PACC)
Easily persist data as files, tables, documents, or streams from any Docker container
MapR PACC: a pre-built, optimized container image for connecting to MapR services.
Diagram: the MapR PACC container holds the MapR POSIX Client for Containers, the MapR Converged Client for Containers, and space for the customer application.
• Streamlined. All the necessary bits – no more, no less.
• Secure. Container-level authentication, encrypted communications.
• Easy. Immediate connectivity to all services.
• Customizable. Docker image available on Docker Hub. Dockerfiles available soon on GitHub.
Short Primer on Containers
Container Benefits – Physical World
Diagram labels: Pack, Ship, Load, Hope
World Without Containers: goods are loaded ad hoc and may not reach their final destination.
World With Containers: goods are packed in a more consistent way, greatly improving portability and minimizing the chance of loss.
Container Benefits – Virtual World
World Without Docker (Developer and IT Admin)
1. My app is done. Can you deploy it?
2. Sure, give me two weeks.
3. Provision stuff (System Admin, Storage Admin, Network Admin)
5. It didn’t work. Can you try again?
World With Docker (Developer and IT Admin)
1. My containerized app is done. Can you deploy it?
2. Deploy anywhere (On Prem, Public Cloud, Private Cloud)
3. Done.
Containerization Benefits
Key Benefits of Containers
• Repeatability
• Improved Hardware Utilization
• Testability
• Portability
• Fault Tolerance
• Scalability / Elasticity
• More Efficient than VM-Based Virtualization
Challenges with Containers Today
How Do You Persist State in Docker?
• Go stateless? Not always desirable or feasible (logs?)
• Use local storage (volumes)? Hard to find again when you re-deploy
• Use a SAN or NAS? Expensive, and single-purpose (files only)
• Use separate filers, databases, and message queues?
Complex, expensive, and you’re not done yet (analytics?)
When a container is deleted, data within that container is erased.
Summary of Containerization Challenges
Existing/Legacy Apps
• No Out-of-Box Way to Let Applications Persist State
• In case of application or hardware failure, all data written by applications is lost
Existing/legacy applications could benefit from containers, but require persistent state.
Microservices
• No Ability to Handle:
– Logs, Streams of Application History
– Database for Operational State
– Streams for Communications between Microservices
Newly developed microservices require persistence, but their need is more “converged”.
MapR Solves Container Challenges
MapR Persistent Application Client Container (PACC)
Easily persist data as files, tables, documents, or streams from any Docker container
MapR PACC: a pre-built, optimized container image for connecting to MapR services.
Diagram: the MapR PACC container holds the MapR POSIX Client for Containers, the MapR Converged Client for Containers, and space for the customer application.
• Streamlined. All the necessary bits – no more, no less.
• Secure. Container-level authentication, encrypted communications.
• Easy. Immediate connectivity to all services.
• Customizable. Docker image available on Docker Hub. Dockerfiles available soon on GitHub.
MapR Converged Data Platform for Docker
Provides persistent storage with a powerful Docker client for fast access
of any data from any node (including nodes outside the MapR cluster)
MapR Converged Data Platform for Docker
Flexible Hybrid/Multi Cloud Infrastructure: Datacenter Servers, Private Cloud, Public Cloud
MapR Converged Data Platform
• Deploy existing stateful apps in containers on any node
• No special configuration for nodes
• Data access is built into containers
• Leverage on-premises and/or cloud
[Diagram: app containers running across multiple nodes]
Three Example Use Cases
Use Case #1 – Storage for Containerized Apps
MapR Cluster
MariaDB
Volume
Logs Volume
Advantages
• Containers can survive application or hardware
failures by restarting and accessing data
• Containers can move across data centers using
MapR data replication capability
Example Apps
• RDBMS (MySQL, Postgres, Vertica, SAP)
• Source control (Git, Mercurial)
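The first advantage above (a restarted container picking up where the failed one left off) can be sketched in a few lines. This is a minimal illustration, not MapR code: it assumes the application keeps its state in a directory that outlives the container, and the `APP_STATE_DIR` variable, the `state.json` file name, and the function names are all hypothetical.

```python
import json
import os
import tempfile


def load_state(state_dir):
    """Return the saved state, or a fresh one if no checkpoint exists yet."""
    path = os.path.join(state_dir, "state.json")
    if not os.path.exists(path):
        return {"processed": 0}
    with open(path) as f:
        return json.load(f)


def save_state(state_dir, state):
    """Write to a temp file, then rename it over the old checkpoint."""
    path = os.path.join(state_dir, "state.json")
    fd, tmp_path = tempfile.mkstemp(dir=state_dir)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def run_once(state_dir, batch_size):
    """Simulate one container lifetime: resume, do some work, checkpoint."""
    state = load_state(state_dir)
    state["processed"] += batch_size
    save_state(state_dir, state)
    return state["processed"]


if __name__ == "__main__":
    # APP_STATE_DIR is a hypothetical variable; in a container it would point
    # at storage that survives the container rather than its own filesystem.
    state_dir = os.environ.get("APP_STATE_DIR") or tempfile.mkdtemp()
    print(run_once(state_dir, 100))
    print(run_once(state_dir, 50))
```

Each call to `run_once` stands in for a separate container run. Because the checkpoint lives outside the container, the second run continues from the first run's count instead of starting at zero.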
Use Case #2 - Shared Storage for App & Analytics
MapR Cluster
MariaDB
Volume
Logs Volume
Advantages
• Shared data repository for multiple apps,
operations & analytics
• Decoupled scaling of compute & storage
Example Apps
• Rapid-ingest Logging (Impressions,
Clicks)
• Image Store (Thumbnails, High Res)
Use Case #3 - Microservices
Advantages
• Efficiency: data services are offered by the platform,
not deployed ad-hoc by developers
• Scale, reliability inherited by all apps
Use Cases
• Stateful microservices
• Inter-microservice Communication
• Per-microservice database
MapR Cluster
MapR-FS
QUESTIONS?
MAPR EDGE
Announcement
MapR Extends Convergence
to the IoT Edge
MapR Edge is a Small Footprint Edition of the MapR
Converged Data Platform that Addresses the Need to Capture,
Process, and Analyze Data Generated by IoT Devices Close
to the Source.
“Act Locally, Learn Globally” with MapR Edge
KEY TO REAL-TIME AT SCALE: GLOBAL CLOUD PROCESSING
• Flexible processing where change is the norm
• Distributed processing across clusters, data centers and public and private cloud environments
• Supports global apps that can scale arbitrarily
DISTRIBUTED PROCESSING (DEVICE, EDGE, CLOUD, ON PREMISES)
[Figure: data streams flowing from devices (chip/device) through the edge to public cloud, private cloud, and on-premises environments]
Location driven by
• Data gravity
• Safe harbor
• Costs
• Resource availability
EXAMPLE: CONNECTED CAR
HIGH FREQUENCY DECISIONING USE CASES
• Advanced Driver Assistance Systems (ADAS)
• Computer Aided Driving
• Vehicle Healthcare
• Fleet Management
By 2020, more than 250 million vehicles will be connected globally (Gartner)
AUTOMOTIVE IOT EXAMPLES
Categories:
• Advanced Driver Assistance Systems (ADAS)
• Computer Aided Driving
• Vehicle Healthcare
• Insurance and Financial Services
• Fleet Management: delivery services, utilities/trucking companies
Examples:
• Lane change
• Traffic sign recognition
• Pedestrian recognition
• Collision avoidance
• Automatic parking
• Detecting traffic jams
• Road condition warnings
• Dynamic maps
• Exhaust monitoring
• Engine function
• Predictive maintenance
• Accident/breakdown management
• Pay as you drive
• Pay how you drive
• Pay where you drive
MapR Edge Solves IoT Analytics
Challenges
MAPR EDGE FOR INTERNET-OF-THINGS
MAPR SOLVES BIG DATA AT THE EDGE
DESIGNED FOR EDGE LOCATIONS
• Sites with slow or occasionally connected network access
• Data sources that create huge volumes of data
• E.g., oil rigs, hospitals, vehicles, remote offices, etc.
Space-constrained locations requiring small footprints
• 3-5 node cluster, storage capacity limits, 16GB RAM
• Optimized for mini PCs (e.g., Intel NUC mini PCs, 8.3” x 4.6” x 1.1”)
EXTEND THE POWER OF THE MAPR PLATFORM
MapR Edge supports:
• Access to files, tables, documents, streams
• Multiple compute engines – Spark, Drill, Hive, etc.
• Data management capabilities such as volumes, quotas, compression, MapR Control System
• Business continuity via replication, mirroring, and snapshots
• Data distribution via global, bandwidth-aware, incremental replication capabilities
MAPR EDGE KEY FEATURES AND BENEFITS
Distributed data aggregation
• Reduce bandwidth requirements
• Maintain data privacy
• Comply with data location regulations
• Reliably consolidate data from edge sites
• Minimize space requirements
Bandwidth-awareness
• Transport data reliably even in slow or occasional connections
Global data plane
• Simplify development/deployment with a global view of all data
in a single namespace
MAPR EDGE KEY FEATURES AND BENEFITS (CONT.)
Converged analytics
• Gain faster time-to-insight at the edge.
Unified security
• Protect data stored at the edge as well as in motion
• Reduce complexity with a consistent security framework
Standards-based
• Simplify code development with standards-based interfaces
Enterprise-grade reliability
• Reduce costly downtime
Two Example Use Cases
Use Case #1 – Oil & Gas
Use Case #1 – Oil & Gas (Before and After MapR Edge)
Thousands of oil and gas sources (Source 1, Source 2, … Source 1000)
Before MapR Edge
• Manual process
• Time to insight (48 hrs)
With MapR Edge
• Automated process
• Down-sampling prior to delivery to core cluster
• Time to insight (< 2 hrs)
[Diagram elements: sources, Internet, MapR Core Cluster]
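The slide does not say how the down-sampling is done. One common approach is to average readings into fixed time windows at the edge and forward only the averages. The sketch below shows that idea under stated assumptions: the `downsample` function, the 60-second window, and the sample readings are all hypothetical, and none of this is MapR Edge code.

```python
from statistics import mean


def downsample(readings, window_seconds):
    """Average (timestamp, value) readings into fixed-width time windows.

    Returns one (window_start, mean_value) pair per non-empty window,
    ordered by window start.
    """
    buckets = {}
    for timestamp, value in readings:
        window_start = timestamp - (timestamp % window_seconds)
        buckets.setdefault(window_start, []).append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical sensor readings: (seconds since start, pressure)
raw = [(0, 10.0), (20, 12.0), (40, 14.0), (60, 20.0), (90, 22.0)]
summary = downsample(raw, window_seconds=60)
print(summary)  # [(0, 12.0), (60, 21.0)]
```

Five raw readings become two summary points, so less data has to cross the link to the core cluster. A real deployment would choose the window and the aggregate (mean, max, last value) per signal.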
Use Case #2 – Automotive
Use Case #2 – Automotive (Before and After Edge)
Thousands of test cars running 24/7 (Source 1, Source 2, … Source 1000)
Before MapR Edge
• Manual processes, scripts, requires high bandwidth network
• Time to insight (24 hrs)
With MapR Edge
• Automated process
• Capture data around exceptions
• Time to insight (< 5 mins)
[Diagram elements: sources, 1-5 TB/day, Internet, MapR Core Cluster]
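"Capture data around exceptions" can be read as keeping only the samples near an unusual event and dropping the rest before upload. The sketch below takes that reading as an assumption. The function name, the threshold of 100, and the two-sample context window are hypothetical choices for illustration.

```python
def capture_around_exceptions(samples, is_exception, context):
    """Keep only samples within `context` positions of an exceptional sample."""
    keep = set()
    for i, sample in enumerate(samples):
        if is_exception(sample):
            lo = max(0, i - context)
            hi = min(len(samples), i + context + 1)
            keep.update(range(lo, hi))
    return [samples[i] for i in sorted(keep)]


# Hypothetical brake-temperature trace; readings above 100 count as exceptions.
trace = [70, 71, 72, 73, 120, 74, 75, 76, 77, 78]
captured = capture_around_exceptions(trace, lambda t: t > 100, context=2)
print(captured)  # [72, 73, 120, 74, 75]
```

Only five of the ten samples are kept: the exceptional reading plus two on either side. Overlapping windows merge because the kept positions are collected in a set.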
QUESTIONS?
MAPR ECOSYSTEM PACK
AGENDA
What We’re Covering Today
• MEP Overview
• MEP 3.0 Updates
• New Spark Connectors
• New Streams APIs
MAPR CONVERGED DATA PLATFORM 2017
• Deployment: On-Premise, In the Cloud, Hybrid
• APIs: HDFS API; POSIX, NFS; HBase API; JSON API; Kafka API
• Web-Scale Storage: MapR-FS
• Database: MapR-DB
• Event Streaming: MapR Streams
• Enterprise-Grade Platform Services: High Availability, Real Time, Unified Security, Multi-tenancy, Disaster Recovery, Global Namespace
MAPR ECOSYSTEM PACKS (MEPs)
• Extended Ecosystem: outside support (vendor or community).
• MapR Core: fully supported, updates tied to MapR core.
• Ecosystem (MEP): fully supported, updates follow MEP process.
WE DO ECOSYSTEM BETTER
Competitor process: All-or-nothing
● Must upgrade full stack to receive any updates
● Infrequent opportunities for upgrade: ~2/year
Upgrades are disruptive and infrequent!
MapR Ecosystem Packs (MEP) Process:
● Reduce upgrade effort – upgrade only at the level you need, instead of your entire stack
● Frequent (quarterly) opportunities for upgrade
Less disruption to production environments!
[Diagram: MEP 1.0, MEP 2.0, MEP ?]
MEP RELEASE CADENCE AND CORE SUPPORT
The hypothetical release schedule
[Timeline figure, Q3’16 through Q4’17: core releases MapR 5.2 and MapR ?; MEP releases 1.0, 2.0, 3.0, 4.0, and ?.?; maintenance releases MEP 1.0.1 through 1.0.5]
Maintenance releases assumed to follow this pattern for every MEP release until EOL of that MEP
MAINTENANCE RELEASE SUPPORT FOR MEPS
• We recommend upgrading to the latest compatible MEP release when the core maintenance release (MR) is updated
• New MEP releases will be supported on older maintenance releases of a supported core release. See diagram:
[Diagram: MEP releases layered over MapR 5.2, 5.2.1, and 5.2.2, and over a future MapR release (MapR ?) with maintenance releases X.Y.Z]
MEP 3.0 Updates
MEP 3.0 UPGRADES
NEW FEATURES PATCHES
• Flume 1.6 -> 1.7
• Oozie 4.2 -> 4.3
• Impala 2.5 -> 2.7
• Sentry 1.6 -> 1.7
• HBase 1.1.2 -> 1.1.8
• Drill 1.9 -> Drill 1.10
• Hive 1.2.1 -> Hive 2.1.1
• Spark 2.0.1 -> 2.1.0
– Added MapR-SASL support to Thrift Server
• Hue 3.10 -> Hue 3.12
– Experimental Drill integration via JDBC notebook
FASTER HIVE 2.1.1
• 2X Faster ETL through a smarter Cost-Based Optimizer (CBO),
faster type conversions and dynamic partition pruning.
• Procedural SQL support.
• New HiveServer UI with new diagnostics and monitoring tools
• Dynamically partitioned hash joins work on unsorted inputs,
eliminating the sort step.
• Vectorized query execution greatly reduces the CPU usage for
typical query operations like scans, filters, aggregates, and
joins.
Faster processing, lower latency, and higher throughput
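The batch-at-a-time idea behind vectorized execution can be sketched as follows. This is plain Python, not Hive code, and it shows only the control flow (filter a whole batch, then aggregate the surviving positions), not the CPU savings. The batch size, column names, and query are assumptions for illustration.

```python
BATCH_SIZE = 1024  # assumed batch size, for illustration only


def batches(column, size=BATCH_SIZE):
    """Yield successive slices of a column, one batch at a time."""
    for start in range(0, len(column), size):
        yield column[start:start + size]


def filtered_sum(prices, quantities, min_quantity):
    """SELECT SUM(price) WHERE quantity >= min_quantity, evaluated per batch."""
    total = 0
    for price_batch, qty_batch in zip(batches(prices), batches(quantities)):
        # Step 1: evaluate the filter over the whole batch, producing the
        # positions of the rows that pass (a "selection vector").
        selected = [i for i, q in enumerate(qty_batch) if q >= min_quantity]
        # Step 2: aggregate only the selected positions of the price column.
        total += sum(price_batch[i] for i in selected)
    return total


# Hypothetical columns: 3,000 rows, every other row has quantity 1.
prices = list(range(3000))
quantities = [i % 2 for i in range(3000)]
print(filtered_sum(prices, quantities, min_quantity=1))
```

Working on one column slice at a time keeps each inner loop tight and simple, which is where engines that implement this natively reduce per-row overhead.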
SPARK 2.1.0
• 1200 JIRAs patched on 2.X branch
• Provides for secure connections using MapR-SASL in addition
to Kerberos for:
– Inbound client connections to the Spark SQL Thrift server
– Spark connections to Hive Metastore
Big improvements in stability and security
MEP IS CONTINUOUSLY EVOLVING
The MEP charter has expanded to include connectors and APIs
Connectors to enhance &
enable native integrations
for ecosystem projects
Developer APIs to provide
agile interactions with your
platform
MEP 3.0 Connectors & APIs
CONNECTORS
• Spark OJAI Connector for MapR-DB JSON
– Phase 1: support for RDDs only
– Read and write JSON to/from MapR-DB
• Spark HBase Connector for MapR-DB Binary
– Phase 1: support for RDDs only
– Read and write using HBase Contexts
APIs
• MapR Streams C APIs
• MapR Streams Python APIs
Spark Connectors
SPARK OJAI CONNECTOR FOR MAPR-DB JSON
• Two new APIs that allow you to:
– Load data from a MapR-DB JSON table to a Spark RDD
– Save a Spark RDD to a MapR-DB JSON table
• Data locality:
– When the connector reads data from MapR-DB, it uses the data
locality feature of MapR-DB to spawn the Spark executors.
• A custom partitioner that allows you to partition data for better
performance
Build real-time or batch pipelines between your data and MapR-DB
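The custom partitioner mentioned above can be illustrated with a generic range partitioner, which groups records by key range so that each partition maps to a contiguous slice of the key space. This is not the connector's API. The `RangePartitioner` class, the split points, and the row keys are hypothetical, and the sketch assumes the performance benefit comes from aligning partitions with key ranges.

```python
from bisect import bisect_right


class RangePartitioner:
    """Assign keys to partitions using sorted split points.

    Keys below the first split go to partition 0; keys at or above the
    last split go to the final partition.
    """

    def __init__(self, split_points):
        self.split_points = sorted(split_points)

    @property
    def num_partitions(self):
        return len(self.split_points) + 1

    def partition_for(self, key):
        return bisect_right(self.split_points, key)


def partition_records(records, partitioner):
    """Group (key, value) records by target partition."""
    groups = [[] for _ in range(partitioner.num_partitions)]
    for key, value in records:
        groups[partitioner.partition_for(key)].append((key, value))
    return groups


# Hypothetical row keys and split points, for illustration only.
partitioner = RangePartitioner(["g", "p"])
records = [("alice", 1), ("henry", 2), ("zoe", 3), ("bob", 4), ("paula", 5)]
groups = partition_records(records, partitioner)
print(groups)
```

With split points "g" and "p", keys starting before "g" land in partition 0, keys from "g" up to "p" in partition 1, and the rest in partition 2.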
SPARK OJAI CONNECTOR FOR MAPR-DB JSON
Leverages the OJAI API internally to talk to MapR-DB JSON tables
THE SPARK HBASE CONNECTOR FOR MAPR-DB BINARY
Enabling HBase contexts for Spark and Spark Streaming
• Spark applications can now consume and use MapR-DB &
HBase binary tables
• Provides for bulk insert into HBase HFiles
– Basic bulk load functionality
– Thin-record bulk load option
MapR Streams APIs
MAPR STREAMS C APPLICATIONS
A distribution of librdkafka that can work with MapR Streams
• The MapR Streams C Client supports a majority of the
librdkafka API plus these configuration options:
– streams.consumer.default.stream: specifies a default consumer
stream path and name.
– streams.producer.default.stream: specifies a default producer stream
path and name.
– streams.parallel.flushers.per.partition: enables multiple parallel send
requests to the server for each topic partition
MAPR STREAMS PYTHON APPLICATIONS
A Python client binding of librdkafka that can work with MapR Streams
• Allows for writing Python Streams Applications
• Includes all additional configuration options provided by the
MapR Streams C API
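A Python Streams application might be shaped roughly as below. The sketch assumes the client exposes a librdkafka-style producer with `produce(topic, value)` and `flush()` methods, as other Python bindings of librdkafka do. To run offline it uses a stand-in class, `RecordingProducer`, rather than the real client. The stream path, topic name, and helper functions are hypothetical, and the default-stream option is written in lowercase on the assumption that it follows the usual librdkafka property style.

```python
def producer_config(default_stream):
    """Build a client configuration that sets a default producer stream."""
    return {"streams.producer.default.stream": default_stream}


def send_events(producer, topic, events):
    """Publish each event as UTF-8 bytes, then flush pending messages."""
    for event in events:
        producer.produce(topic, event.encode("utf-8"))
    producer.flush()
    return len(events)


class RecordingProducer:
    """Stand-in with the same produce/flush shape, so the sketch runs offline."""

    def __init__(self, config):
        self.config = config
        self.sent = []
        self.flushed = False

    def produce(self, topic, value):
        self.sent.append((topic, value))

    def flush(self):
        self.flushed = True


# "/apps/telemetry" and "sensor-readings" are hypothetical names.
producer = RecordingProducer(producer_config("/apps/telemetry"))
count = send_events(producer, "sensor-readings", ["temp=21", "temp=22"])
print(count, producer.sent)
```

Passing the producer in as an argument keeps `send_events` independent of any one client library, so the stand-in can be swapped for a real producer object without changing the function.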
QUESTIONS?
APACHE DRILL 1.10
Drill Release 1.10
• Native connector to Tableau
– Temporary Tables
• Hive 2.1.1 integration
– Drill queries are forward compatible
• Security
– Kerberos authentication & MapR-SASL between client and Drillbit
• Performance
– Asynchronous parquet reader, hash join fix
• Compatibility
– Support for INT96 timestamp types
Resources
• Visit www.mapr.com
• Visit the MapR Community
community.mapr.com
• Contact us at maprisr@mapr.com
Q&A
ENGAGE WITH US
@mapr
@mapr.com

More Related Content

PPTX
How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...
MapR Technologies
 
PPTX
Best Practices for Data Convergence in Healthcare
MapR Technologies
 
PPTX
Data Warehouse Modernization: Accelerating Time-To-Action
MapR Technologies
 
PPTX
Geo-Distributed Big Data and Analytics
MapR Technologies
 
PDF
An Introduction to the MapR Converged Data Platform
MapR Technologies
 
PPTX
3 Benefits of Multi-Temperature Data Management for Data Analytics
MapR Technologies
 
PDF
Meruvian - Introduction to MapR
The World Bank
 
PPTX
Bringing Structure, Scalability, and Services to Cloud-Scale Storage
MapR Technologies
 
How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...
MapR Technologies
 
Best Practices for Data Convergence in Healthcare
MapR Technologies
 
Data Warehouse Modernization: Accelerating Time-To-Action
MapR Technologies
 
Geo-Distributed Big Data and Analytics
MapR Technologies
 
An Introduction to the MapR Converged Data Platform
MapR Technologies
 
3 Benefits of Multi-Temperature Data Management for Data Analytics
MapR Technologies
 
Meruvian - Introduction to MapR
The World Bank
 
Bringing Structure, Scalability, and Services to Cloud-Scale Storage
MapR Technologies
 

What's hot (20)

PPTX
Self-Service Data Science for Leveraging ML & AI on All of Your Data
MapR Technologies
 
PPTX
Enabling Real-Time Business with Change Data Capture
MapR Technologies
 
PDF
Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...
MapR Technologies
 
PPTX
Machine Learning Success: The Key to Easier Model Management
MapR Technologies
 
PPTX
Converging your data landscape
MapR Technologies
 
PDF
Live Machine Learning Tutorial: Churn Prediction
MapR Technologies
 
PPTX
ML Workshop 2: Machine Learning Model Comparison & Evaluation
MapR Technologies
 
PDF
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Carol McDonald
 
PPTX
MapR Streams and MapR Converged Data Platform
MapR Technologies
 
PPTX
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...
MapR Technologies
 
PPTX
State of the Art Robot Predictive Maintenance with Real-time Sensor Data
Mathieu Dumoulin
 
PPTX
ML Workshop 1: A New Architecture for Machine Learning Logistics
MapR Technologies
 
PDF
Spark and MapR Streams: A Motivating Example
Ian Downard
 
PDF
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
Mathieu Dumoulin
 
PPTX
Cisco & MapR bring 3 Superpowers to SAP HANA Deployments
MapR Technologies
 
PPTX
Converged and Containerized Distributed Deep Learning With TensorFlow and Kub...
Mathieu Dumoulin
 
PDF
Demystifying AI, Machine Learning and Deep Learning
Carol McDonald
 
PPTX
MapR on Azure: Getting Value from Big Data in the Cloud -
MapR Technologies
 
PPTX
CEP - simplified streaming architecture - Strata Singapore 2016
Mathieu Dumoulin
 
PPTX
Xactly: How to Build a Successful Converged Data Platform with Hadoop, Spark,...
MapR Technologies
 
Self-Service Data Science for Leveraging ML & AI on All of Your Data
MapR Technologies
 
Enabling Real-Time Business with Change Data Capture
MapR Technologies
 
Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...
MapR Technologies
 
Machine Learning Success: The Key to Easier Model Management
MapR Technologies
 
Converging your data landscape
MapR Technologies
 
Live Machine Learning Tutorial: Churn Prediction
MapR Technologies
 
ML Workshop 2: Machine Learning Model Comparison & Evaluation
MapR Technologies
 
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Carol McDonald
 
MapR Streams and MapR Converged Data Platform
MapR Technologies
 
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...
MapR Technologies
 
State of the Art Robot Predictive Maintenance with Real-time Sensor Data
Mathieu Dumoulin
 
ML Workshop 1: A New Architecture for Machine Learning Logistics
MapR Technologies
 
Spark and MapR Streams: A Motivating Example
Ian Downard
 
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
Mathieu Dumoulin
 
Cisco & MapR bring 3 Superpowers to SAP HANA Deployments
MapR Technologies
 
Converged and Containerized Distributed Deep Learning With TensorFlow and Kub...
Mathieu Dumoulin
 
Demystifying AI, Machine Learning and Deep Learning
Carol McDonald
 
MapR on Azure: Getting Value from Big Data in the Cloud -
MapR Technologies
 
CEP - simplified streaming architecture - Strata Singapore 2016
Mathieu Dumoulin
 
Xactly: How to Build a Successful Converged Data Platform with Hadoop, Spark,...
MapR Technologies
 
Ad

Similar to MapR Product Update - Spring 2017 (20)

PDF
Big Data LDN 2017: How to leverage the cloud for Business Solutions
Matt Stubbs
 
PDF
Containers and Kubernetes without limits
Antje Barth
 
PDF
Container and Kubernetes without limits
Antje Barth
 
PPTX
Progress for big data in Kubernetes
Ted Dunning
 
PDF
Containerization Use Cases.pdf
Simform
 
PPTX
MapR and Cisco Make IT Better
MapR Technologies
 
PPTX
Keys for Success from Streams to Queries
DataWorks Summit/Hadoop Summit
 
PPTX
Real-time Hadoop: The Ideal Messaging System for Hadoop
DataWorks Summit/Hadoop Summit
 
PPTX
Real time-hadoop
Ted Dunning
 
PDF
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
Chris Fregly
 
PDF
HUG Italy meet-up with Fabian Wilckens, MapR EMEA Solutions Architect
SpagoWorld
 
PPTX
How to Get Going with Kubernetes
Ted Dunning
 
PDF
Big Data LDN 2018: PROGRESS FOR BIG DATA IN KUBERNETES
Matt Stubbs
 
PDF
Big Data LDN 2018: 7 SUCCESSFUL HABITS FOR DATA-INTENSIVE APPLICATIONS IN PRO...
Matt Stubbs
 
PDF
Streaming in the Extreme
Julius Remigio, CBIP
 
PPTX
Evolving Beyond the Data Lake: A Story of Wind and Rain
MapR Technologies
 
PPTX
What is the past future tense of data?
Ted Dunning
 
PPTX
Integrating Hadoop into your enterprise IT environment
MapR Technologies
 
PDF
Google Tech Talk with Dr. Eric Brewer in Korea Apr.27.2015
Chris Jang
 
PDF
Unlocking Opportunities on the Cloud Through Container Technology
Skillmine Technology Consulting
 
Big Data LDN 2017: How to leverage the cloud for Business Solutions
Matt Stubbs
 
Containers and Kubernetes without limits
Antje Barth
 
Container and Kubernetes without limits
Antje Barth
 
Progress for big data in Kubernetes
Ted Dunning
 
Containerization Use Cases.pdf
Simform
 
MapR and Cisco Make IT Better
MapR Technologies
 
Keys for Success from Streams to Queries
DataWorks Summit/Hadoop Summit
 
Real-time Hadoop: The Ideal Messaging System for Hadoop
DataWorks Summit/Hadoop Summit
 
Real time-hadoop
Ted Dunning
 
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
Chris Fregly
 
HUG Italy meet-up with Fabian Wilckens, MapR EMEA Solutions Architect
SpagoWorld
 
How to Get Going with Kubernetes
Ted Dunning
 
Big Data LDN 2018: PROGRESS FOR BIG DATA IN KUBERNETES
Matt Stubbs
 
Big Data LDN 2018: 7 SUCCESSFUL HABITS FOR DATA-INTENSIVE APPLICATIONS IN PRO...
Matt Stubbs
 
Streaming in the Extreme
Julius Remigio, CBIP
 
Evolving Beyond the Data Lake: A Story of Wind and Rain
MapR Technologies
 
What is the past future tense of data?
Ted Dunning
 
Integrating Hadoop into your enterprise IT environment
MapR Technologies
 
Google Tech Talk with Dr. Eric Brewer in Korea Apr.27.2015
Chris Jang
 
Unlocking Opportunities on the Cloud Through Container Technology
Skillmine Technology Consulting
 
Ad

More from MapR Technologies (10)

PDF
Live Tutorial – Streaming Real-Time Events Using Apache APIs
MapR Technologies
 
PPTX
Evolving from RDBMS to NoSQL + SQL
MapR Technologies
 
PDF
Open Source Innovations in the MapR Ecosystem Pack 2.0
MapR Technologies
 
PPTX
How Spark is Enabling the New Wave of Converged Cloud Applications
MapR Technologies
 
PDF
MapR 5.2: Getting More Value from the MapR Converged Data Platform
MapR Technologies
 
PDF
Handling the Extremes: Scaling and Streaming in Finance
MapR Technologies
 
PDF
Baptist Health: Solving Healthcare Problems with Big Data
MapR Technologies
 
PDF
The Keys to Digital Transformation
MapR Technologies
 
PDF
Insight Platforms Accelerate Digital Transformation
MapR Technologies
 
PPTX
Design Patterns for working with Fast Data
MapR Technologies
 
Live Tutorial – Streaming Real-Time Events Using Apache APIs
MapR Technologies
 
Evolving from RDBMS to NoSQL + SQL
MapR Technologies
 
Open Source Innovations in the MapR Ecosystem Pack 2.0
MapR Technologies
 
How Spark is Enabling the New Wave of Converged Cloud Applications
MapR Technologies
 
MapR 5.2: Getting More Value from the MapR Converged Data Platform
MapR Technologies
 
Handling the Extremes: Scaling and Streaming in Finance
MapR Technologies
 
Baptist Health: Solving Healthcare Problems with Big Data
MapR Technologies
 
The Keys to Digital Transformation
MapR Technologies
 
Insight Platforms Accelerate Digital Transformation
MapR Technologies
 
Design Patterns for working with Fast Data
MapR Technologies
 

MapR Product Update - Spring 2017

  • 1. MapR Product Update Spring 2017 MapR Product Team April 27, 2017
  • 2. Before We Begin • This webinar is being recorded. Later this week, you will receive an email on how to get the recording and slide deck. • If you can’t hear the audio from your computer, a dial-in phone number is in the chat window in the lower left corner of your screen. • If you have any questions during the webinar, please type them in the chat window.
  • 3. Introducing Our Speakers from MapR: Mitesh Shah, Sr. Product Marketing Manager; Rachel Silver, Technical Product Manager, Apache Spark and Ecosystem Projects; Saurabh Mahapatra, Sr. Product Manager, Apache Drill
  • 4. Agenda • MapR Persistent Application Client Containers (PACC) • MapR Edge for Internet of Things • MapR Ecosystem Pack 3.0 – Apache Hive 2.1.1 with significantly faster performance – Spark 2.1.0 with many stability and security enhancements – New connectors and APIs • Drill 1.10 new features • Q & A
  • 5. MAPR CONVERGED DATA PLATFORM FOR DOCKER
  • 6. Announcement: MapR Announces the Converged Data Platform for Docker. Supports containerization of existing and new applications by providing containers with persistent data access from anywhere
  • 7. MapR Persistent Application Client Container (PACC): Easily persist data as files, tables, documents, or streams from any Docker container. MapR PACC: A pre-built, optimized container image for connecting to MapR services. • Streamlined. All the necessary bits – no more, no less. • Secure. Container-level authentication, encrypted communications. • Easy. Immediate connectivity to all services. • Customizable. Docker Image available on Docker Hub. Dockerfiles available soon on GitHub. Diagram labels: MapR POSIX Client for Containers; MapR Converged Client for Containers; Space for Customer Application.
  • 8. Short Primer on Containers
  • 9. Container Benefits – Physical World. Without containers, goods are loaded ad-hoc and may not reach their final destination. With containers, goods are packed in a more consistent way, greatly improving portability and minimizing the chance of loss. Diagram labels: World Without Containers; World With Containers; Pack, Ship, Load, Hope.
  • 10. Container Benefits – Virtual World. World Without Docker (Developer, IT Admin, System Admin, Storage Admin, Network Admin): 1. My app is done. Can you deploy it? 2. Sure, give me two weeks. 3. Provision stuff 5. It didn’t work. Can you try again? World With Docker (Developer, IT Admin; On Prem, Public Cloud, Private Cloud): 1. My containerized app is done. Can you deploy it? 2. Deploy Anywhere 3. Done.
  • 11. Containerization Benefits. Key Benefits of Containers: Repeatability; Improved Hardware Utilization; Testability; Portability; Fault Tolerance; Scalability / Elasticity; More Efficient than VM-Based Virtualization
  • 12. Challenges with Containers Today
  • 13. How Do You Persist State in Docker? • Go stateless? Not always desirable or feasible (logs?) • Use local storage (volumes)? Hard to find again when you re-deploy • Use a SAN or NAS? Expensive, and single-purpose (files only) • Use separate filers, databases, and message queues? Complex, expensive, and you’re not done yet (analytics?) When a container is deleted, data within that container is erased.
  • 14. Summary of Containerization Challenges • No Out-of-Box Way to Let Applications Persist State • In case of application or hardware failure, all data written by applications is lost • No Ability to Handle: Logs, Streams of Application History; Database for Operational State; Streams for Communications between Microservices. Existing/Legacy Apps: Existing/legacy applications could benefit from containers, but require persistent state. Microservices: Newly developed microservices require persistence, but their need is more “converged”.
  • 15. MapR Solves Container Challenges
  • 16. MapR Persistent Application Client Container (PACC): Easily persist data as files, tables, documents, or streams from any Docker container. MapR PACC: A pre-built, optimized container image for connecting to MapR services. • Streamlined. All the necessary bits – no more, no less. • Secure. Container-level authentication, encrypted communications. • Easy. Immediate connectivity to all services. • Customizable. Docker Image available on Docker Hub. Dockerfiles available soon on GitHub. Diagram labels: MapR POSIX Client for Containers; MapR Converged Client for Containers; Space for Customer Application.
  • 17. MapR Converged Data Platform for Docker: Provides persistent storage with a powerful Docker client for fast access of any data from any node (including nodes outside the MapR cluster). Flexible Hybrid/Multi Cloud Infrastructure: Datacenter Servers, Private Cloud, Public Cloud
  • 18. MapR Converged Data Platform • Deploy existing stateful apps in containers on any node • No special configuration for nodes • Data access is built into containers • Leverage on-premises and/or cloud. Diagram: multiple App Containers.
  • 19. Three Example Use Cases
  • 20. Use Case #1 – Storage for Containerized Apps. Diagram: MapR Cluster with MariaDB Volume and Logs Volume. Advantages • Containers can survive application or hardware failures by restarting and accessing data • Containers can move across data centers using MapR data replication capability Example Apps • RDBMS (MySQL, Postgres, Vertica, SAP) • Source control (Git, Mercurial)
  • 21. Use Case #2 - Shared Storage for App & Analytics. Diagram: MapR Cluster with MariaDB Volume and Logs Volume. Advantages • Shared data repository for multiple apps, operations & analytics • Decoupled scaling of compute & storage Example Apps • Rapid-ingest Logging (Impressions, Clicks) • Image Store (Thumbnails, High Res)
  • 22. Use Case #3 - Microservices. Advantages • Efficiency: data services are offered by the platform, not deployed ad-hoc by developers • Scale, reliability inherited by all apps Use Cases • Stateful microservices • Inter-microservice Communication • Per-microservice database. Diagram: MapR Cluster, MapR-FS.
  • 23. QUESTIONS?
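
Slide 13 notes that data written inside a container is erased when the container is deleted. A minimal Python sketch of the alternative follows, assuming the application writes its state to a directory backed by storage outside the container. The environment variable APP_STATE_DIR, the fallback path, and the function names are hypothetical and are not part of any MapR API; the sketch uses only ordinary file calls.

```python
import json
import os
from pathlib import Path
from typing import Optional


def state_file(base_dir: Optional[str] = None) -> Path:
    """Locate the state file under a directory assumed to outlive the container."""
    # APP_STATE_DIR is a hypothetical variable naming the persistent mount point.
    base = Path(base_dir or os.environ.get("APP_STATE_DIR", "/tmp/app-state"))
    base.mkdir(parents=True, exist_ok=True)
    return base / "state.json"


def load_state(path: Path) -> dict:
    """Read previously saved state, or start fresh if none exists."""
    if path.exists():
        return json.loads(path.read_text())
    return {"runs": 0}


def save_state(path: Path, state: dict) -> None:
    """Write to a temporary file, then rename, so readers never see a partial file."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state))
    os.replace(tmp, path)


def record_run(base_dir: Optional[str] = None) -> int:
    """Increment and persist a run counter; returns the new count."""
    path = state_file(base_dir)
    state = load_state(path)
    state["runs"] += 1
    save_state(path, state)
    return state["runs"]
```

If the directory lives only in the container's writable layer, the counter restarts at 1 after every redeploy; if it is backed by persistent storage, the count carries over to the replacement container.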
  • 24. MAPR EDGE
  • 25. Announcement: MapR Extends Convergence to the IoT Edge. MapR Edge is a Small Footprint Edition of the MapR Converged Data Platform that Addresses the Need to Capture, Process, and Analyze Data Generated by IoT Devices Close to the Source. “Act Locally, Learn Globally” with MapR Edge
  • 26. KEY TO REAL-TIME AT SCALE: GLOBAL CLOUD PROCESSING • Flexible processing where change is the norm • Distributed processing across clusters, data centers and public and private cloud environments • Supports global apps that can scale arbitrarily
  • 27. DISTRIBUTED PROCESSING (DEVICE, EDGE, CLOUD, ON PREMISES). Diagram labels: Devices (Chip/Device, Edge); Public, Private, On Premise. Location driven by • Data gravity • Safe harbor • Costs • Resource availability
  • 28. EXAMPLE: CONNECTED CAR • HIGH FREQUENCY DECISIONING USE CASES • Advanced Driver Assistance Systems (ADAS) • Computer Aided Driving • Vehicle Healthcare • Fleet Management By 2020, more than 250 million vehicles will be connected globally - Gartner
  • 29. AUTOMOTIVE IOT EXAMPLES ADVANCED DRIVER ASSISTANCE SYSTEMS (ADAS) COMPUTER AIDED DRIVING VEHICLE HEALTHCARE INSURANCE AND FINANCIAL SERVICES Lane change Traffic sign recognition Pedestrian recognition Collision avoidance Automatic parking Detecting traffic jams Road condition warnings Dynamic maps Exhaust monitoring Engine function Predictive maintenance Accident/breakdown management Pay as you drive Pay how you drive Pay where you drive Fleet Management Delivery services Utilities/Trucking companies
  • 30. MapR Edge Solves IoT Analytics Challenges
  • 31. MAPR EDGE FOR INTERNET-OF-THINGS
  • 32. MAPR SOLVES BIG DATA AT THE EDGE. DESIGNED FOR EDGE LOCATIONS • Sites with slow or occasionally connected network access • Data sources that create huge volumes of data • E.g., oil rigs, hospitals, vehicles, remote offices, etc. INTEL NUC MINI PCS (8.3” X 4.6” X 1.1”) • Space constrained locations requiring small footprints • 3-5 node cluster, storage capacity limits, 16GB RAM • Optimized for mini PCs (e.g., Intel NUCs)
  • 33. EXTEND THE POWER OF THE MAPR PLATFORM. MapR Edge supports: • Access to files, tables, documents, streams • Multiple compute engines – Spark, Drill, Hive, etc. • Data management capabilities such as volumes, quotas, compression, MapR Control System • Business continuity via replication, mirroring, and snapshots • Data distribution via global, bandwidth-aware, incremental replication capabilities
  • 34. MAPR EDGE KEY FEATURES AND BENEFITS Distributed data aggregation • Reduce bandwidth requirements • Maintain data privacy • Comply with data location regulations • Reliably consolidate data from edge sites • Minimize space requirements Bandwidth-awareness • Transport data reliably even over slow or occasional connections Global data plane • Simplify development/deployment with a global view of all data in a single namespace
  • 35. MAPR EDGE KEY FEATURES AND BENEFITS (CONT.) Converged analytics • Gain faster time-to-insight at the edge Unified security • Protect data stored at the edge as well as in motion • Reduce complexity with a consistent security framework Standards-based • Simplify code development with standards-based interfaces Enterprise-grade reliability • Reduce costly downtime
  • 36. Two Example Use Cases
  • 37. Use Case #1 – Oil & Gas
  • 38. Use Case #1 – Oil & Gas (Before and After MapR Edge). Thousands of oil and gas sources (Source 1 through Source 1000). Before MapR Edge: manual process; time to insight 48 hrs. With MapR Edge: automated process; down-sampling prior to delivery to core cluster; time to insight < 2 hrs. Diagram labels: Internet, MapR Core Cluster.
  • 39. Use Case #2 – Automotive
  • 40. Use Case #2 – Automotive (Before and After Edge). Thousands of test cars running 24/7 (Source 1 through Source 1000). Before MapR Edge: manual processes, scripts, requires high bandwidth network; time to insight 24 hrs. With MapR Edge: automated process; capture data around exceptions; time to insight < 5 mins. Diagram labels: 1-5 TB/day, Internet, MapR Core Cluster.
  • 41. QUESTIONS?
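
Slides 38 and 40 name two ways an edge site can shrink what it sends to the core cluster: down-sampling before delivery, and capturing data only around exceptions. The Python sketch below illustrates both ideas on a plain list of sensor readings. It is an illustration under stated assumptions, not MapR Edge code: the function names, the averaging strategy, and the threshold rule for what counts as an exception are all hypothetical.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def downsample(readings: Sequence[float], factor: int) -> List[float]:
    """Average each consecutive group of `factor` readings into one value."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    return [
        mean(readings[i:i + factor])
        for i in range(0, len(readings), factor)
    ]


def capture_around_exceptions(
    readings: Sequence[float], threshold: float, context: int
) -> List[Tuple[int, float]]:
    """Keep only readings within `context` samples of a value above `threshold`.

    Returns (index, value) pairs so the core cluster can tell where gaps are.
    """
    keep = set()
    for i, value in enumerate(readings):
        if value > threshold:  # hypothetical definition of an "exception"
            lo = max(0, i - context)
            hi = min(len(readings), i + context + 1)
            keep.update(range(lo, hi))
    return [(i, readings[i]) for i in sorted(keep)]
```

For example, downsample([1.0, 2.0, 3.0, 4.0, 5.0], 2) yields three values instead of five, and capture_around_exceptions([0, 0, 9, 0, 0, 0], threshold=5, context=1) keeps only the spike and its two neighbours.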
  • 42. MAPR ECOSYSTEM PACK
  • 43. AGENDA What We’re Covering Today • MEP Overview • MEP 3.0 Updates • New Spark Connectors • New Streams APIs
  • 44. MAPR CONVERGED DATA PLATFORM 2017. On-Premise, In the Cloud, Hybrid. APIs: HDFS API; POSIX, NFS; HBASE API; JSON API; KAFKA API. Web-Scale Storage: MapR-FS. Database: MapR-DB. Event Streaming: MapR Streams. Enterprise-Grade Platform Services: High Availability, Real Time, Unified Security, Multi-tenancy, Disaster Recovery, Global Namespace.
  • 45. MAPR ECOSYSTEM PACKS (MEPs) • Extended Ecosystem: Outside support: vendor or community. • MapR Core: Fully supported, updates tied to MapR core. • Ecosystem MEP: Fully supported, updates follow MEP process.
  • 46. WE DO ECOSYSTEM BETTER. Competitor process: All-or-nothing ● Must upgrade full stack to receive any updates ● Infrequent opportunities for upgrade: ~2/year. Upgrades are disruptive and infrequent! MapR Ecosystem Packs (MEP) Process: ● Reduce upgrade effort – upgrade only at the level you need, instead of your entire stack ● Frequent (quarterly) opportunities for upgrade (MEP 1.0, MEP 2.0, MEP ?). Less disruption to production environments!
  • 47. MEP RELEASE CADENCE AND CORE SUPPORT. The hypothetical release schedule. Maintenance releases assumed to follow this pattern for every MEP release until EOL of that MEP. Timeline labels: Q3’16 Q4’16 Q1’17 Q2’17 Q3’17 Q4’17; MapR 5.2, MapR ?; MEP 1.0, MEP 1.0.1, MEP 1.0.2, MEP 1.0.3, MEP 1.0.4, MEP 1.0.5, MEP 2.0 ..., MEP 3.0, MEP 4.0, MEP ?.?
  • 48. MAINTENANCE RELEASE SUPPORT FOR MEPS • It is recommended to upgrade to the latest compatible MEP release when core MR is updated • We will support new MEP releases with older versions of maintenance releases on a supported core release. See diagram. Diagram labels: MapR 5.2, MapR 5.2.1, MapR 5.2.2, MapR X.Y.Z, MapR ?, MEP.
  • 49. MEP 3.0 Updates
  • 50. MEP 3.0 UPGRADES NEW FEATURES PATCHES • Flume 1.6 -> 1.7 • Oozie 4.2 -> 4.3 • Impala 2.5 -> 2.7 • Sentry 1.6 -> 1.7 • HBase 1.1.2 -> 1.1.8 • Drill 1.9 -> Drill 1.10 • Hive 1.2.1 -> Hive 2.1.1 • Spark 2.0.1 -> 2.1.0 – Added MapR-SASL support to Thrift Server • Hue 3.10 -> Hue 3.12 – Experimental Drill integration via JDBC notebook
  • 51. FASTER HIVE 2.1.1 • 2X Faster ETL through a smarter Cost-Based Optimizer (CBO), faster type conversions and dynamic partition pruning. • Procedural SQL support. • New HiveServer UI with new diagnostics and monitoring tools • Dynamically partitioned hash joins provide unsorted inputs in order to eliminate the sorting. • Vectorized query execution greatly reduces the CPU usage for typical query operations like scans, filters, aggregates, and joins. Faster processing, lower latency, and higher throughput
  • 52. SPARK 2.1.0 • 1200 JIRAs patched on 2.X branch • Provides for secure connections using MapR-SASL in addition to Kerberos for: – Inbound client connections to the Spark SQL Thrift server – Spark connections to Hive Metastore Big improvements in stability and security
  • 53. MEP IS CONTINUOUSLY EVOLVING The MEP charter has expanded to include connectors and APIs • Connectors to enhance & enable native integrations for ecosystem projects • Developer APIs to provide agile interactions with your platform
  • 54. MEP 3.0 Connectors & APIs. APIs: • MapR Streams C APIs • MapR Streams Python APIs. CONNECTORS: • Spark OJAI Connector for MapR-DB JSON – Phase 1: support for RDDs only – Read and write JSON to/from MapR-DB • Spark HBase Connector for MapR-DB Binary – Phase 1: support for RDDs only – Read and write using HBase Contexts
  • 55. Spark Connectors
  • 56. SPARK OJAI CONNECTOR FOR MAPR-DB JSON • Two new APIs that allow you to: – Load data from a MapR-DB JSON table to a Spark RDD – Save a Spark RDD to a MapR-DB JSON table • Data locality: – When the connector reads data from MapR-DB, it uses the data locality feature of MapR-DB to spawn the Spark executors. • A custom partitioner that allows you to partition data for better performance Build real-time or batch pipelines between your data and MapR-DB
  • 57. SPARK OJAI CONNECTOR FOR MAPR-DB JSON Leverages the OJAI API internally to talk to MapR-DB JSON tables
  • 58. THE SPARK HBASE CONNECTOR FOR MAPR-DB BINARY Enabling HBase contexts for Spark and Spark Streaming • Spark applications can now consume and use MapR-DB & HBase binary tables • Provides for bulk insert into HBase HFiles – Basic bulk load functionality – Thin-record bulk load option
  • 59. MapR Streams APIs
  • 60. MAPR STREAMS C APPLICATIONS A distribution of librdkafka that can work with MapR Streams • The MapR Streams C Client supports a majority of the librdkafka API plus: – streams.consumer.default.stream: specifies a default consumer stream path and name. – streams.producer.default.stream: specifies a default producer stream path and name. – streams.parallel.flushers.per.partition: enables multiple parallel send requests to the server for each topic partition
  • 61. MAPR STREAMS PYTHON APPLICATIONS A Python client binding of librdkafka that can work with MapR Streams • Allows for writing Python Streams Applications • Includes all additional configuration options provided by the MapR Streams C API
  • 62. QUESTIONS?
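
Slides 60 and 61 list MapR-specific configuration options that the C and Python clients accept on top of the librdkafka settings. The Python sketch below shows how such options might be collected into a configuration mapping and how a default stream could be used to qualify a bare topic name. Several points are assumptions rather than documented behavior: the keys are written in lowercase on the assumption that the capital S on the slide is a formatting artifact, boolean options are assumed to be passed as the strings "true" and "false", the "/stream/path:topic" naming form is assumed, and the stream path /apps/telemetry and both function names are hypothetical. No client library is imported, so nothing here calls the real API.

```python
from typing import Dict

# Option names taken from slide 60 (lowercase spelling is an assumption).
PRODUCER_DEFAULT_STREAM = "streams.producer.default.stream"
PARALLEL_FLUSHERS = "streams.parallel.flushers.per.partition"


def build_producer_config(default_stream: str,
                          parallel_flushers: bool = True) -> Dict[str, str]:
    """Return a config mapping using the MapR-specific keys named on slide 60."""
    if not default_stream.startswith("/"):
        raise ValueError("stream path is expected to be absolute, e.g. /apps/telemetry")
    return {
        PRODUCER_DEFAULT_STREAM: default_stream,
        # Assumption: boolean options are passed as "true"/"false" strings.
        PARALLEL_FLUSHERS: "true" if parallel_flushers else "false",
    }


def resolve_topic(topic: str, config: Dict[str, str]) -> str:
    """Qualify a bare topic name with the default stream (assumed "/path:topic" form)."""
    if topic.startswith("/"):
        return topic  # already fully qualified
    default_stream = config.get(PRODUCER_DEFAULT_STREAM)
    if default_stream is None:
        raise ValueError("no default stream configured for bare topic name")
    return f"{default_stream}:{topic}"
```

With a default stream configured, application code can refer to short topic names such as "clicks" and leave the stream path to deployment configuration, which is the convenience the default-stream options on slide 60 appear to offer.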
  • 63. APACHE DRILL 1.10
  • 64. Drill Release 1.10 • Native connector to Tableau – Temporary Tables • Hive 2.1.1 integration – Drill queries are forward compatible • Security – Kerberos authentication & MapR-SASL between client and Drillbit • Performance – Asynchronous Parquet reader, hash join fix • Compatibility – Support for INT96 timestamp types
  • 65. Resources • Visit www.mapr.com • Visit the MapR Community community.mapr.com • Contact us at [email protected]
  • 66. Q&A ENGAGE WITH US @mapr @mapr.com