1
Moving Cold Data to Hadoop
2
2 Trends
Forcing a revolution in enterprise architecture
3
Trend 1: Industry Leaders Compete and Win with Data
More Data Beats Better Algorithms
Collecting interaction data from e-commerce, social media, offline channels, and call centers enables a “customer 360 view” and consumer intimacy.
Competitive Advantage Is Decided by 0.5%
Consumer financial services: a 1% improvement in fraud detection means hundreds of millions of dollars.
Advertising and retail: a 0.5% improvement in lift means a profitability increase of millions of dollars.
4
Trend 2: Big Data is Overwhelming Traditional Systems
[Diagram: Enterprise Data Architecture, showing enterprise users, outside sources, operational systems, and analytical systems]
Operational systems, production requirements:
• Mission-critical reliability
• Transaction guarantees
• Deep security
• Real-time performance
• Backup and recovery
Analytical systems, production requirements:
• Interactive SQL
• Rich analytics
• Workload management
• Data governance
• Backup and recovery
5
And 2 Realities
6
Reality 1: Hadoop Relieves the Pressure from Enterprise Systems
[Diagram: enterprise users, operational systems, analytical systems]
• Data staging
• Archive
• Data transformation
• Data exploration
• Streaming, interactions
Keys for Production Success
1 Reliability and DR
2 Interoperability
3 High performance
4 Supports operations and analytics
7
Reality 2: Architecture Matters for Success
FOUNDATION
• Data protection & security
• High performance
• Multi-tenancy
• Real-time operational & analytical apps
• Open standards for integration
NEW APPLICATIONS | SLAs | TRUSTED INFORMATION | LOWER TCO
8
Data Warehouse Optimization
9
TDWI: Evolving Data Warehouse Architectures
Hadoop Uses in Data Warehouse Environment
1 Data Staging & Archive
2 ETL
3 Big Data Analytics
Source: TDWI April 2014
10
The MapR Advantage
• Scale Reliability Across the Enterprise
– Advanced multi-tenancy
– Business continuity – HA, DR
• Speed
– 2-7x faster than other Hadoop distros
– Ultra-fast data ingest (100M data points per sec)
– NFS & R/W file system
• Real-time & Self-Service Data Exploration
– On-the-fly SQL without up-front schema
– Fast lookups and queries
Best Hadoop Platform for Data Warehouse Optimization & Analytics
[Diagram: MapR Data Platform, built on MapR-FS and MapR-DB]
Diagram labels: integrated commercial engines, tools, and compute engines; batch, interactive, real-time, online, and other workloads; security, streaming, NoSQL & search, provisioning & coordination, ML, graph, workflow & data governance, batch, SQL; management (operations, governance, audits, security)
11
Attunity Solutions
Right Data. Right Place. Right Time.
12
Attunity – Growing, Modular Portfolio
Delivering Big Data for Analytics
13
Data Warehouse Optimization with Hadoop
1 Assess and identify data and workloads to rebalance on Hadoop
2 Develop a roadmap to move data and workloads
3 Implement the roadmap incrementally and iteratively
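As a rough illustration of step 1, the sketch below flags "cold" tables as candidates to move, based on how recently and how often they are queried. It is not taken from the slides or from any Attunity or MapR product: the `TableUsage` record, the `find_cold_tables` function, the thresholds, and the sample figures are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class TableUsage:
    """Hypothetical usage record for one warehouse table."""
    name: str
    size_gb: float
    last_queried: date
    queries_last_year: int


def find_cold_tables(tables, today, max_idle_days=180, max_queries=5):
    """Return rarely used tables, largest first.

    A table counts as cold when it has been idle for at least
    max_idle_days and was queried at most max_queries times in the
    past year. Both thresholds are assumptions, not recommendations.
    """
    cold = [
        t for t in tables
        if (today - t.last_queried).days >= max_idle_days
        and t.queries_last_year <= max_queries
    ]
    # Largest tables first: moving them frees the most warehouse capacity.
    return sorted(cold, key=lambda t: t.size_gb, reverse=True)


# Made-up sample data.
usage = [
    TableUsage("orders_2009", 820.0, date(2013, 1, 15), 0),
    TableUsage("orders_current", 140.0, date(2014, 5, 30), 4200),
    TableUsage("clickstream_raw", 2300.0, date(2013, 9, 1), 2),
]

candidates = find_cold_tables(usage, today=date(2014, 6, 1))
for table in candidates:
    print(f"{table.name}: {table.size_gb:.0f} GB")
```

Sorting by size is one possible way to feed step 2: the largest cold tables are natural first entries on a roadmap, and the list can be recomputed after each incremental move in step 3.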
14
Attunity Visibility – The Data Dashboard
Completely analyze workloads and data usage
Reduce costs | Optimize performance | Justify investments
The Data Dashboard: User Activity | Data Usage | Workload Performance
15
Attunity Replicate
• Real-time data movement
• Change Data Capture (CDC)
• Broadest platform support
• Files - MF - RDBMS - Hadoop
• Non-intrusive architecture
• Automation of standard maintenance tasks
• “Click-to-Load” design
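For readers unfamiliar with Change Data Capture, the general idea can be sketched as follows: instead of reloading whole tables, only the inserts, updates, and deletes recorded at the source are replayed against the target. This is a generic illustration, not Attunity Replicate's implementation or API; the event layout (`op`, `key`, `row`) and the `apply_changes` function are assumptions made for the example.

```python
def apply_changes(replica, changes):
    """Replay ordered change events against a keyed replica, in place.

    replica maps primary key -> row; each change is a dict with an
    "op" of "insert", "update", or "delete" (a hypothetical format).
    """
    for change in changes:
        op = change["op"]
        key = change["key"]
        if op in ("insert", "update"):
            # Inserts and updates both leave the latest row image in place.
            replica[key] = change["row"]
        elif op == "delete":
            # Ignore deletes for keys the replica never received.
            replica.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return replica


# Made-up target state and change log.
replica = {101: {"status": "open"}}
change_log = [
    {"op": "insert", "key": 102, "row": {"status": "open"}},
    {"op": "update", "key": 101, "row": {"status": "shipped"}},
    {"op": "delete", "key": 102},
]

apply_changes(replica, change_log)
print(replica)
```

Events must be applied in the order they were captured; replaying the same log out of order could leave the replica in a state the source never had.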
16
MapR and Attunity
17
MapR and Attunity Are a Great Partnership
• Complementary set of enterprise-grade features
– Focus on Data
• Movement
• Identification
• Usage
• High availability
• Scale
• Data Warehouse Optimization
– Experience across broad set of use cases/workloads
• Customer 360 view
• Telco
• Internet of Things (IoT)
18
Additional Resources
• Go to: www.Attunity.com/mapr
• Find us on Twitter:
– @mapR
– @attunity
• Watch our video
• View the Moving Cold Data to Hadoop webinar

Which data should you move to Hadoop?
