The Enterprise Data Hub in the Cloud
Eli Collins, Chief Technologist

©2014 Cloudera, Inc. All rights reserved.

What we’re really talking about
Host: Customer | Vendor
Manage: Customer | Vendor
• AWS, GCE, SoftLayer, …
• T-Systems, Accenture, …
• EMR, AltiScale, …
1. Primarily dedicated physical on-prem infrastructure
2. Alternatives emerging

Engineering Perspective
• Long-running
  • Long-running batch jobs
  • Cluster stores the data and provides services (Impala, Search, HBase, Accumulo, etc.)
• Ephemeral
  • Self-service, demos
  • Test/Dev, POC
  • Periodic batch

Product Thinking
• Many EDH environments will be hybrid
  • Valid reasons for/against cloud deployments
  • Private/public capabilities will converge
• Run Cloudera anywhere
  • EDH works with multiple deployment models

Portability is KEY
• Multiple deployment options
  • Cloud Connect: AWS, SoftLayer, Savvis, T-Systems, Verizon
  • Integrated support offerings
  • Growing provider, SI, and MSP ecosystem
• Multiple pricing models
  • Traditional
  • Usage-based
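One way to compare the two pricing models is a simple break-even calculation. The sketch below is only an illustration: it assumes "traditional" means a flat annual fee and "usage-based" means a per-hour rate, which the slide does not spell out, and the function names and all numbers are hypothetical.

```python
def break_even_hours(annual_fee: float, hourly_rate: float) -> float:
    """Hours per year at which usage-based cost equals the flat annual fee.

    Both inputs are hypothetical figures, not real price points.
    """
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return annual_fee / hourly_rate


def cheaper_model(annual_fee: float, hourly_rate: float, expected_hours: float) -> str:
    """Return which assumed pricing model costs less for the expected usage."""
    usage_cost = hourly_rate * expected_hours
    return "usage-based" if usage_cost < annual_fee else "traditional"


if __name__ == "__main__":
    # Hypothetical numbers chosen only to make the arithmetic easy to follow.
    print(break_even_hours(8760.0, 2.0))
    print(cheaper_model(8760.0, 2.0, expected_hours=1000))
```

Under these assumptions, a cluster that runs only occasionally (the ephemeral case) tends to favor usage-based pricing, while one that runs most of the year tends to favor the flat fee.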

Functionality is KEY too
• Enterprise Data Hub functionality & innovation
  • Impala, Search, Sentry, Spark, …
  • ISV ecosystem
• Management
  • Cloudera Manager, Navigator, and BDR

Our Reference Architecture

Cloudera Leveraging AWS
• Elastic Compute Cloud (EC2)
• Simple Storage Service (S3)
• Relational Database Service (RDS)
• Elastic Block Store (EBS)
• Direct Connect
• Virtual Private Cloud (VPC)

Private VPC Subnet

Public VPC Subnet

Private and Public Subnets
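The diagram for this slide did not survive extraction. As a rough illustration of the idea only, the sketch below carves a VPC address range into one public and one private subnet using Python's standard `ipaddress` module. The CIDR block, the /24 subnet size, and the `split_vpc` helper are assumptions for illustration, not values from the reference architecture.

```python
import ipaddress


def split_vpc(vpc_cidr: str, new_prefix: int = 24) -> dict:
    """Carve the first two subnets of size /new_prefix out of a VPC range.

    The first is labeled "public" and the second "private"; the labels are
    an illustrative convention, not something the address math enforces.
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = list(vpc.subnets(new_prefix=new_prefix))
    if len(subnets) < 2:
        raise ValueError("VPC range is too small for two subnets")
    return {"public": subnets[0], "private": subnets[1]}


if __name__ == "__main__":
    layout = split_vpc("10.0.0.0/16")  # hypothetical VPC range
    print(layout["public"])   # 10.0.0.0/24
    print(layout["private"])  # 10.0.1.0/24
```

In AWS terms, what makes a subnet public is a route to an internet gateway in its route table; the address split above only reserves the ranges.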

Instance Types and Roles

What’s coming?
• Automated deployment
  • Joint reference architectures
  • Extend this with your IT
• Self-service (via service providers)
• More platforms and providers

Taking Full Advantage of the Cloud
• Enhanced transient clusters
  • Grow/shrink, compute-only instances, spot instances
  • Improved S3 and Swift support
• Hybrid environments
  • Cross-DC operation
  • Centralized discovery & management
  • Bursting
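The grow/shrink idea for transient clusters can be sketched as a sizing rule that maps pending work to a worker count within fixed bounds. This is a minimal, hypothetical example: the function name, the tasks-per-worker heuristic, and the bounds are assumptions for illustration and do not describe any Cloudera feature.

```python
import math


def target_worker_count(pending_tasks: int, tasks_per_worker: int,
                        min_workers: int, max_workers: int) -> int:
    """Pick a worker count for the current backlog, clamped to [min, max].

    All parameters are hypothetical tuning knobs for this sketch.
    """
    if tasks_per_worker <= 0:
        raise ValueError("tasks_per_worker must be positive")
    if min_workers > max_workers:
        raise ValueError("min_workers must not exceed max_workers")
    needed = math.ceil(pending_tasks / tasks_per_worker)
    return max(min_workers, min(max_workers, needed))


if __name__ == "__main__":
    print(target_worker_count(95, 10, min_workers=2, max_workers=20))    # grows to 10
    print(target_worker_count(0, 10, min_workers=2, max_workers=20))     # shrinks to the floor, 2
    print(target_worker_count(1000, 10, min_workers=2, max_workers=20))  # capped at 20
```

Keeping a floor and a ceiling separate from the demand estimate makes it easy to reason about cost limits independently of load.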

Cloudera Federal Forum 2014: Cloud Deployment for the Enterprise Data Hub
