Grab some coffee and enjoy the pre-show banter before the top of the hour!
The Briefing Room
The Hadoop Guarantee: Keeping Analytics Running On Time
Twitter Tag: #briefr The Briefing Room
Welcome
Host:
Eric Kavanagh
eric.kavanagh@bloorgroup.com
@eric_kavanagh
Mission
•  Reveal the essential characteristics of enterprise software, good and bad
•  Provide a forum for detailed analysis of today's innovative technologies
•  Give vendors a chance to explain their product to savvy analysts
•  Allow audience members to pose serious questions... and get answers!
Topics
September: HADOOP 2.0
October: DATA MANAGEMENT
November: ANALYTICS
The Holy Grail of Hadoop
Ø Mixed Workloads!
Ø Deep visibility into the
cluster
Ø Ability to define &
meet SLAs
Analyst: Robin Bloor
Robin Bloor is Chief Analyst at The Bloor Group
robin.bloor@bloorgroup.com
@robinbloor
Pepperdata
•  Pepperdata offers a platform for managing and optimizing Hadoop clusters
•  The platform monitors and balances resources across multiple workloads and/or clusters in real time
•  Pepperdata provides an interactive dashboard with real-time visualizations and reports on hardware usage
Guest: Sean Suchter
Sean Suchter, Cofounder, CEO of Pepperdata
Sean was the founding GM of Microsoft’s Silicon Valley Search Technology Center, where he led the integration of Facebook and Twitter content into Bing search. Prior to Microsoft, Sean managed the Yahoo Search Technology team, the first production user of Hadoop. Sean joined Yahoo through the acquisition of Inktomi, and holds a B.S. in Engineering and Applied Science from Caltech.
©2015 Pepperdata
Pepperdata: Bringing Predictability & Reliability to Hadoop
Sean Suchter, CEO & Cofounder
September 15, 2015
Agenda
•  Market trends
•  Customer demands
•  Where Pepperdata fits
•  Q&A
Market Reality
Many organizations state that big data is a high priority for them, but many will fail to see a competitive advantage due to issues such as:
•  Unreliability of Hadoop
•  Growing skills gap
•  Multitude of vendors & tools in ecosystem
Symptoms and their consequences:
•  Unpredictable jobs: bottlenecks, missed SLAs
•  Poor visibility: lengthy troubleshooting, “flying blind”
•  Inefficient cluster allocation: overbuilding, costs
Mature deployments have increasing requirements
Organizations today demand:
•  Multi-tenancy (multiple workloads, multiple tenants)
•  Internal deployments of Hadoop-as-a-Service
•  Guaranteed SLAs
You need more than YARN
When jobs are scheduled:
•  YARN: Schedule jobs; pre-allocate memory, CPU
Once jobs are running:
•  Pepperdata: Allocate resources dynamically (maximize utilization)
•  Pepperdata: Control hardware usage (priority jobs complete on time)
•  Pepperdata: Prevent rogue jobs from harming high-priority jobs
During & after job runtime:
•  YARN: Node-level metrics
•  Pepperdata: Node-level metrics
•  Pepperdata: Real-time metrics by queue, user, job, task
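The idea of controlling hardware usage so that priority jobs complete on time can be illustrated with a minimal sketch. This is not Pepperdata's or YARN's actual algorithm; the `Task` type, the `allocate` function, the job names, and the "higher number means higher priority" convention are all assumptions made for illustration. The sketch grants memory on one node in priority order, so a low-priority rogue job only receives whatever is left over.

```python
from dataclasses import dataclass


@dataclass
class Task:
    job: str         # hypothetical job name
    priority: int    # assumed convention: higher number = more important
    demand_mb: int   # memory the task is asking for right now


def allocate(tasks, capacity_mb):
    """Grant memory in descending priority order until capacity is used up."""
    grants = {}
    remaining = capacity_mb
    for task in sorted(tasks, key=lambda t: t.priority, reverse=True):
        grant = min(task.demand_mb, remaining)
        grants[task.job] = grant
        remaining -= grant
    return grants


tasks = [
    Task("nightly-report", priority=10, demand_mb=6000),
    Task("adhoc-query", priority=5, demand_mb=3000),
    Task("rogue-scan", priority=1, demand_mb=8000),
]
grants = allocate(tasks, capacity_mb=10000)
for job, mb in grants.items():
    print(f"{job}: {mb} MB")
```

A real system would rerun a decision like this continuously as demand changes, and would cover CPU, disk, and network as well as memory; the sketch shows a single decision for a single resource.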
Time and sweat won’t solve the problem
No human can make the thousands of decisions a second necessary for dynamic, real-time hardware resource management.
Pepperdata lets enterprises rely on Hadoop
Companies can now:
•  Provide mission-critical applications in multi-tenant environments
•  Monitor and control hardware usage dynamically and in real time
•  Enable SLAs, increase throughput, and improve visibility
DEMO
Pepperdata: unmatched visibility
Thank you.
Perceptions & Questions
Analyst: Robin Bloor
Hadoop Performance
Robin Bloor, PhD
Hadoop Had a Dream
The Biological Analog
•  Our human control system works at different speeds:
    •  Internal systems – Enteric nervous system
    •  Instant external reflex – Spinal cord
    •  Fast external response – Motor systems
    •  Considered response – The brain
•  Swift external response is predictive analytics & triggers
•  Considered response is analytics
A While Ago…
The Hadoop Disruption
Hadoop Evolution
•  HDFS & MapReduce: Serial Single Batch
•  HDFS, YARN, MapReduce: Serial Multiple Batch
•  HDFS, YARN, Spark: Serial Multiple Microbatch
The Spark Dynamic
•  Spark has become the de facto vehicle for many distinct Hadoop projects: analytics and data integration
•  It can do “microbatch streaming,” but it is not ideal for very low latency applications
•  It has in-memory capability (claimed speedups over MapReduce of up to 100x in memory, 10x on disk)
•  Speed of development
•  Spark SQL
So What’s Missing?
•  Resource allocation
•  Resource management by “job”
•  Dynamic prioritization of workloads
•  Real-time monitoring
•  Service management: performance and throughput feedback and controls
•  Capacity planning
Operational Control
Hadoop has the potential to be the “scale-out OS” for data as soon as it can manage its resources
•  How easy is Pepperdata to implement? What’s the process?
•  What is (roughly) the most complex environment in respect to workloads where Pepperdata is deployed? Please describe.
•  What is the Pepperdata proposition in respect to ROI?
•  Are there any competing products?
•  Which specific companies/products do you complement?
•  Is there any Hadoop distribution that you prefer? If so, why?
Upcoming Topics
www.insideanalysis.com
September: HADOOP 2.0
October: DATA MANAGEMENT
November: ANALYTICS
THANK YOU for your ATTENTION!
Some images provided courtesy of Wikimedia Commons
and http://desvadgama.com/wp-content/uploads/2012/11/holy-grail.jpg
