Denodo Data Virtualization
Stop collecting, start connecting
Ravi Shankar, CMO
August 2017
2
• Competition from a low-cost vendor
• Lower the price, affecting margins?
• Or, maintain high price, but differentiate in other ways?
3
Large Heavy Equipment Manufacturer
Self-service / Predictive Analytics – IoT Integration
Benefits
• Improved asset performance and proactive maintenance
• Increased revenue from sale of services and parts
• Reduced warranty costs of parts failure
4
“Big Data Challenges Impede Business Insights”
5
Big Data Fabric Architecture – Forrester Research, 2016
6
Big Data Fabric – Data Abstraction Layer
• Abstracts access to disparate data sources
• Acts as a single (virtual) repository
• Makes data available in real-time to consumers
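The three points above can be illustrated with a minimal Python sketch. It is not Denodo's API: the two sources (an in-memory SQLite table and a CSV string), the function names, and the sample values are all hypothetical. It only shows the idea of a view that joins sources at read time and keeps no copy of the data.

```python
import csv
import io
import sqlite3

# Source 1 (hypothetical): a relational database of customers.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])

# Source 2 (hypothetical): a CSV feed of device readings.
SENSOR_CSV = "customer_id,temperature\n1,71\n2,64\n"


def read_customers():
    """Query the database at call time; nothing is copied ahead of time."""
    return db.execute("SELECT id, name FROM customer ORDER BY id").fetchall()


def read_sensors():
    """Parse the CSV feed at call time into (customer_id, temperature) pairs."""
    reader = csv.DictReader(io.StringIO(SENSOR_CSV))
    return [(int(r["customer_id"]), int(r["temperature"])) for r in reader]


def customer_temperature_view():
    """Virtual view: joins both sources on demand and stores no data itself."""
    temps = dict(read_sensors())
    return [
        {"id": cid, "name": name, "temperature": temps.get(cid)}
        for cid, name in read_customers()
    ]


before = customer_temperature_view()

# A change at the source is visible on the next read, with no reload step.
db.execute("UPDATE customer SET name = 'Acme Corp' WHERE id = 1")
after = customer_temperature_view()

print(before[0]["name"], "->", after[0]["name"])
```

Consumers call only the view function, so they never need to know which source holds which field.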
7
DATA VIRTUALIZATION: CONNECT, COMBINE, CONSUME
1. Connect to disparate data sources
• Disparate data sources, from more structured to less structured: Databases & Warehouses, Cloud/SaaS Applications, Big Data, NoSQL, Web, XML, Excel, PDF, Word...
• Library of wrappers, web automation, any data or content, read & write
2. Combine related data into views
• Normalized views of disparate data
• Discover, Transform, Prepare, Improve Quality, Integrate
3. Consume in business applications
• Data consumers (analytical and operational): Enterprise Applications, Reporting, BI, Portals, ESB, Mobile, Web, Users, IoT/Streaming Data
• Data Services (Real-time & On-demand)
• Share, Deliver, Publish, Govern, Collaborate
• Multiple protocols and formats; linked data services (query, search, browse); request/reply and event driven; secure delivery
Data Virtualization: Design Tools, Optimization Engine, Data Discovery & Search, In-memory Fabric, Cache, Scheduler
Data catalog / Metadata, Governance, Security, Management & Monitoring
8
Logical Data Lake – Use Cases
• Data Warehouse Offloading
• IoT Integration
9
Big Data Queries Faster with Denodo Platform
1. Data Virtualization delivers better performance without needing to replicate data into Hadoop.
2. Data Virtualization leverages Data Source Architectures for what they are good at.
Performance comparison of 5 different queries
Query | Impala Hadoop-only Runtime (s) | Denodo Runtime (s) | Denodo Runtime w/ Cache (s)
Query 1 | 199 | 120 | 68
Query 2 | 187 | 96 | 88
Query 3 | 120 | 212 | 115
Query 4 | timeout | 328 | 69
Query 5 | 46 | 91 | 56
Data Volumes
• Queries 1, 2, 3, 5: Exadata Row Count ~5M; Impala Row Count ~500k
• Query 4: Exadata Row Count ~5M; Impala Row Count ~2M
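To read the runtimes listed above as ratios, the short Python sketch below divides the Impala Hadoop-only runtime by each Denodo runtime. The figures are copied from the slide; the `speedup` helper and the `report` structure are illustrative names, and the column order is assumed to follow the order of the slide's headers. A ratio above 1 means Denodo was faster, below 1 means it was slower, and Query 4 has no ratio because the Impala run timed out.

```python
# Runtimes in seconds, copied from the slide as
# (Impala Hadoop-only, Denodo, Denodo with cache).
# None marks the Impala timeout on Query 4.
runtimes = {
    "Query 1": (199, 120, 68),
    "Query 2": (187, 96, 88),
    "Query 3": (120, 212, 115),
    "Query 4": (None, 328, 69),
    "Query 5": (46, 91, 56),
}


def speedup(baseline, candidate):
    """Return baseline / candidate rounded to 2 places, or None without a baseline."""
    if baseline is None:
        return None
    return round(baseline / candidate, 2)


report = {
    name: {
        "denodo": speedup(impala, denodo),
        "denodo_cache": speedup(impala, cached),
    }
    for name, (impala, denodo, cached) in runtimes.items()
}

for name, ratios in report.items():
    print(name, ratios)
```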
10
Denodo Dynamic Query Optimizer
System | Execution Time | Data Transferred | Optimization Technique
Denodo | 9 sec. | 4 M | Aggregation push-down
Tableau | 125 sec. | 292 M | None: full scan
SELECT c.id, SUM(s.amount) as total
FROM customer c JOIN sales s
ON c.id = s.customer_id
GROUP BY c.id
[Diagram: two query plans. Without push-down: Sales (290 M) and Customer (2 M) are joined, then grouped. With push-down: Sales is grouped first (2 M), then joined with Customer (2 M).]
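A small Python sketch of aggregation push-down, using the query above against an in-memory SQLite database. This is not the Denodo optimizer: the toy rows and variable names are hypothetical, and the rewrite is valid here only because `customer.id` is unique, so each sales row matches at most one customer. The point is that grouping at the source first means one row per customer reaches the join instead of every sales row.

```python
import sqlite3

# Hypothetical toy data standing in for the slide's Sales and Customer tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY);
    CREATE TABLE sales (customer_id INTEGER, amount INTEGER);
""")
conn.executemany("INSERT INTO customer VALUES (?)", [(i,) for i in range(1, 4)])
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 10), (1, 5), (2, 7), (3, 1), (3, 2), (3, 3)],
)

# Naive plan: every sales row is shipped to the join, then grouped.
naive_rows_moved = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
naive = conn.execute("""
    SELECT c.id, SUM(s.amount) AS total
    FROM customer c JOIN sales s ON c.id = s.customer_id
    GROUP BY c.id ORDER BY c.id
""").fetchall()

# Push-down plan: aggregate sales at the source, so only one row per
# customer crosses over to the join.
partial = conn.execute(
    "SELECT customer_id, SUM(amount) FROM sales GROUP BY customer_id"
).fetchall()
pushed_rows_moved = len(partial)

# Join the pre-aggregated totals with the customer table.
totals = dict(partial)
customer_ids = [row[0] for row in conn.execute("SELECT id FROM customer ORDER BY id")]
pushed = [(cid, totals[cid]) for cid in customer_ids if cid in totals]

print("same result:", naive == pushed)
print("sales rows moved:", naive_rows_moved, "vs", pushed_rows_moved)
```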
11
Data Virtualization as the Big Data Fabric
• Simplify access to Big Data
• Extend traditional data warehouses
• Capitalize on in-memory computing
• Exploit distributed query processing
• Bring benefits of Big Data to business users
“Plan to evolve your big data lake into a fabric over time by adding services like in-memory caching, data virtualization, or metadata cataloging.”
Source: Forrester Research, “The Anatomy Of A System Of Insight,” 2017
12
ROI and TCO of Data Virtualization
Customer-reported projected savings by percentage
Data Integration Cost reduction
• 60-70% savings
Traditional Call Centres, Portals
• 30-70% savings
BI and Reporting
• 40-60% savings
ETL and Data Warehousing
• Project timelines of 6-12 months reduced to 3-6 months
• 85% time reduction
13
Big Data Fabric Vendors
The Forrester Wave™: Big Data Fabric, Q4 2016
“Denodo’s key strength is delivering a unified and centralized data services fabric with security and real-time integration across multiple traditional and big data sources, including Hadoop, NoSQL, cloud, and software-as-a-service (SaaS). Customers like its easy-to-use, simple yet sophisticated data modeling capabilities, search, and support for various big data sources.”
– Analyst Noel Yuhanna, Forrester Research
14
Data Virtualization, Federation, ETL, ESB Compared
Capability | Virtualization | Federation | ETL | ESB
Data abstraction | Full | Partial | Partial | Full
Robust performance | Full | Limited to a few data sources | Primarily in batch mode | Limited to a few data sources
Zero replication | Full | Partial | None | Partial
Real-time information | Full | Limited to a few data sources | Primarily batch | Limited to a few data sources
Self-service data services | Full | None | None | Partial
Centralized metadata, security, and governance | Full | None | Partial | None
Solutions | Denodo, Cisco, RedHat | Tableau, QueryGrid, SAP SDA | Informatica PowerCenter, IBM DataStage, Talend Data Fabric | TIBCO ESB, Mulesoft
15
Denodo
The Leader in Data Virtualization
LEADERSHIP
• Longest continuous focus on data virtualization – since 1999
• 400+ customers
• Winner of numerous awards
Customer Awards
• AUTODESK: Finalist in 2017 Excellence Awards
• SEACOAST BANK: 2016 Business Leadership Award
• CIT BANK: 2016 Premier 100 Technology Leader
• ULTRA MOBILE: 2017 Best Practices Award
• AUTODESK: 2017 CIO 100 Award
• ASURION: 2017 Best Practices Award
• BIOSTORAGE: 2016 Business Leadership Award

Big Data Fabric for At-Scale Real-Time Analysis by Edwin Robbins
