MapReduce to Apache Spark:
An Ecosystem Evolves
Doug Cutting (@cutting)
Chief Architect, Cloudera & Co-founder of Apache Hadoop
Hadoop’s Original Architecture
MapReduce
Data Processing and Resource Management
HDFS
Filesystem/Storage
The MapReduce Breakthrough
Key advances in MapReduce:
• Data locality: Automatic split computation and appropriate launch of mappers
• Fault tolerance: Writing out intermediate results and using restartable mappers provide the ability to run on commodity hardware
• Linear scalability: The combination of locality and the programming model forces developers to write generally scalable solutions
[Diagram: twelve Map tasks feeding four Reduce tasks]
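The map, shuffle, and reduce flow named on this slide can be sketched in plain Python. This is only an illustration of the programming model, not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample input are assumptions made for the example, and everything runs in one process rather than across a cluster.

```python
from collections import defaultdict


def map_phase(split):
    """Mapper: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for line in split for word in line.split()]


def shuffle(mapped_outputs):
    """Group intermediate values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for output in mapped_outputs:
        for key, value in output:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: combine all values that share one key."""
    return key, sum(values)


def run_job(splits):
    # Each split is mapped independently of the others, which is what lets a
    # real framework rerun a single failed mapper on its own split.
    mapped = [map_phase(split) for split in splits]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: two splits, each a list of lines.
splits = [
    ["Hadoop stores data in HDFS"],
    ["MapReduce processes data in Hadoop"],
]
counts = run_job(splits)
print(counts["hadoop"], counts["data"])  # prints: 2 2
```

Because mappers share no state, adding splits only adds independent map calls, which is the property the scalability bullet relies on.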
Apache Spark: A Better MapReduce
Easy, Expressive API
• Rich API (Java, Scala, and Python)
• Interactive shell
• 2-5x less code needed than MR
Fast Execution
• General execution graphs
• In-memory storage
• Order-of-magnitude improvement
over MR
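As a rough illustration of the two "Fast Execution" ideas, chained transformations that run only when a result is requested and intermediate results kept in memory for reuse, here is a toy sketch in plain Python. The `LazyDataset` class and its methods are hypothetical names invented for this example; they imitate the style of such an API but are not Spark's implementation.

```python
class LazyDataset:
    """Toy lazily evaluated dataset. An illustration only, not Spark's API."""

    def __init__(self, compute):
        self._compute = compute    # zero-argument function returning a list
        self._use_cache = False
        self._cached = None

    def map(self, fn):
        # Build a new node in the graph; nothing is computed yet.
        return LazyDataset(lambda: [fn(x) for x in self._materialize()])

    def filter(self, pred):
        return LazyDataset(lambda: [x for x in self._materialize() if pred(x)])

    def cache(self):
        # Ask for this node's result to be kept in memory after first use.
        self._use_cache = True
        return self

    def _materialize(self):
        if not self._use_cache:
            return self._compute()
        if self._cached is None:
            self._cached = self._compute()
        return self._cached

    def collect(self):
        # An "action": this is the point where the graph actually runs.
        return list(self._materialize())


reads = {"count": 0}


def read_source():
    """Stand-in for an expensive read from storage; counts how often it runs."""
    reads["count"] += 1
    return [1, 2, 3, 4, 5, 6]


evens = LazyDataset(read_source).filter(lambda x: x % 2 == 0).cache()
squares = evens.map(lambda x: x * x).collect()   # first action reads the source
total = sum(evens.collect())                     # second action reuses the cache
print(squares, total, reads["count"])  # prints: [4, 16, 36] 12 1
```

Two results are computed from `evens`, but the source is read once. A chain of separate MapReduce jobs would typically write the intermediate data to storage and read it back between jobs.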
Big Data Developers are Rapidly Sparking Up
Source: Typesafe Apache Spark Adoption Survey, Jan. 2015
• 82% have replaced MapReduce with Spark
• 78% need faster processing for large data sets
• 62% load data into Spark via HDFS
• 22% of respondents run CDH, more than twice as many as any other Hadoop platform
Spark is now an important part of the Hadoop Platform
A Platform That Just Won’t Stop Growing
[Timeline chart, 2006 to 2015: each year stacks new projects on top of the existing ones. * = CDH supported]
• 2006: Core Hadoop* (HDFS, MapReduce)
• 2007: Solr*, Pig*
• 2008: HBase*, ZooKeeper*
• 2009: Hive*, Mahout*
• 2010: Sqoop*, Avro*
• 2011: Flume*, Bigtop*, Oozie*, HCatalog*, Hue*, YARN*
• 2012: Spark*, Tez, Impala*, Kafka*, Drill
• 2013: Parquet*, Sentry*
• 2014: Knox, Flink
• 2015: Kudu*, RecordService*, Ibis*, Falcon
Hadoop’s Next 10 Years
• Interest in public-cloud deployments is driving native support for them into the platform.
• Rapid hardware advances are forcing the community to rethink Hadoop’s foundations.
• Data sources are more numerous, distributed, and diverse (IoT), and Hadoop will adapt.
Learn More
cloudera.com/hadoop10

Keynote – From MapReduce to Spark: An Ecosystem Evolves by Doug Cutting, Chief Architect, Cloudera

Editor's Notes

  • #6: This data is from Typesafe’s 2015 survey of 2100+ developers, data scientists, and IT executives whose orgs are either running or researching Spark
  • #9: What does the future hold for Hadoop? There are many possible permutations, but these are just a couple of the obvious influences going forward.