Confidential and Proprietary
Key Architecture & Performance Principles to Optimize Data Management
May 18, 2017
Webinar Goals
• Outline key architecture principles for backup/recovery & test data management in a modern data world
• Illustrate the importance of these principles using real customer examples
• Identify specific tradeoffs that can be made when deploying your data management infrastructure
Attributes of Modern Data Platforms
• Scale out to petabyte workloads
• Analytics-driven intelligence
• Storage optimization across diverse data & environments
• Minimize copies
• Create storage and compute pools on commodity H/W
Why Incremental Forever: Backups
Day(s)       Traditional Approach    Big Data Platform Approach
Day 1        Full backup             Full backup
Days 2-7     Incremental backups     Incremental backups
Day 8        Full backup             Incremental backup
Days 9-14    Incremental backups     Incremental backups
Day 15       Full backup             Incremental backup
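The two schedules on this slide can be sketched in a few lines of Python. This is only an illustration of the schedule as drawn: the function names, the policy labels, and the 7-day `full_interval` default are assumptions read off the Day 1 / Day 8 / Day 15 pattern, not anything the deck defines.

```python
def backup_type(day, policy, full_interval=7):
    """Return 'full' or 'incremental' for a 1-based day number.

    `policy` is a hypothetical label: 'traditional' repeats a full backup
    every `full_interval` days; 'incremental_forever' takes one full
    backup on day 1 and incrementals from then on.
    """
    if day < 1:
        raise ValueError("day must be >= 1")
    if policy == "traditional":
        # Full backups land on days 1, 8, 15, ... when full_interval is 7.
        return "full" if (day - 1) % full_interval == 0 else "incremental"
    if policy == "incremental_forever":
        # Only the very first backup is a full one.
        return "full" if day == 1 else "incremental"
    raise ValueError(f"unknown policy: {policy}")


def count_fulls(days, policy):
    """Count how many full backups a policy takes over the first `days` days."""
    return sum(backup_type(d, policy) == "full" for d in range(1, days + 1))


if __name__ == "__main__":
    for policy in ("traditional", "incremental_forever"):
        print(policy, "full backups in 15 days:", count_fulls(15, policy))
```

Over the 15 days shown, the traditional schedule takes three full backups and the incremental-forever schedule takes one.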
Why Incremental Forever: Restores
Day                             1      2         3         . . . .   80       81
Data size                       1 TB   1.02 TB   1.03 TB   . . . .   1.2 TB   Developer error
Changed data from last backup   -      50 GB     50 GB     . . . .   50 GB
Backup type                     Full   Incr      Incr      . . . .   Incr

Data recovered by traditional approach: 1 TB + 79 x 50 GB = 4.95 TB
Data recovered by big data approach: 1.2 TB
Key concept: the notion of a “virtualized full” image
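The restore arithmetic above can be checked with a short sketch. It assumes decimal units (1 TB = 1000 GB), and the function names are illustrative only. It models the traditional restore as replaying the day-1 full plus every incremental from days 2 through 80, and the "virtualized full" restore as reading only the data set as it stood on day 80.

```python
GB_PER_TB = 1000  # decimal units, as the slide's 4.95 TB figure implies


def traditional_restore_gb(full_gb, incremental_gb, incremental_count):
    """Data read when restoring a full backup and replaying each incremental."""
    return full_gb + incremental_gb * incremental_count


def virtualized_full_restore_gb(current_size_gb):
    """Data read when the restore point is already presented as a full image."""
    return current_size_gb


if __name__ == "__main__":
    # Day 1 full (1 TB), then 79 incrementals of 50 GB each (days 2..80).
    traditional = traditional_restore_gb(1 * GB_PER_TB, 50, 79)
    virtual = virtualized_full_restore_gb(1200)  # 1.2 TB on day 80
    print(f"traditional:      {traditional / GB_PER_TB:.2f} TB")
    print(f"virtualized full: {virtual / GB_PER_TB:.2f} TB")
    print(f"ratio:            {traditional / virtual:.3f}x")
```

With the slide's numbers the traditional path reads 4950 GB against 1200 GB, a little over four times as much data to recover from the same developer error.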
The Importance of Parallelism
Test                                                           Platform Utility    Talena             Differential
Full backup                                                    8 hours, 17 min     2 hours, 20 min    3.5x
Incremental backup                                             4 hours, 55 min     26 min, 7 sec      11.3x
Full restore to different cluster                              40 hours, 28 min    14 hours, 55 min   2.7x
Full restore to same cluster                                   6 hours, 21 min     1 hour, 58 min     2.2x
Full restore using incremental restore point to same cluster   21 hours, 28 min    2 hours, 5 min     10.3x

• Eliminate choke points
• Tradeoff between backup/restore performance versus production cluster
• Bandwidth efficiency
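The Differential column is the platform-utility time divided by the Talena time. The sketch below recomputes it from the durations on the slide; the dictionary layout and the `speedup` helper are illustrative assumptions. For four of the five rows the result agrees with the slide. For "Full restore to same cluster" the stated times work out to roughly 3.2x rather than the listed 2.2x, so that row is worth checking against the original slide.

```python
from datetime import timedelta

# (platform utility time, Talena time) as stated on the slide.
RESULTS = {
    "Full backup": (
        timedelta(hours=8, minutes=17), timedelta(hours=2, minutes=20)),
    "Incremental backup": (
        timedelta(hours=4, minutes=55), timedelta(minutes=26, seconds=7)),
    "Full restore to different cluster": (
        timedelta(hours=40, minutes=28), timedelta(hours=14, minutes=55)),
    "Full restore to same cluster": (
        timedelta(hours=6, minutes=21), timedelta(hours=1, minutes=58)),
    "Full restore using incremental restore point to same cluster": (
        timedelta(hours=21, minutes=28), timedelta(hours=2, minutes=5)),
}


def speedup(baseline, candidate):
    """Ratio of two durations; dividing timedeltas yields a float."""
    return baseline / candidate


if __name__ == "__main__":
    for test, (utility, talena) in RESULTS.items():
        print(f"{test}: {speedup(utility, talena):.2f}x")
```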
Elastic Scaling: What Are The Issues
Year 1: Cassandra Cluster, 50 nodes, 125 TB (Data Center #1)
Year 2: Multi-DC Cassandra Cluster, 100 nodes, 320 TB (Data Center #1 and Data Center #2), with an ARCHIVE component

Topic                            Key consideration
Scaling backup infrastructure    Just adding nodes, or a forklift upgrade
Agents/listeners                 Manageability
Multi-DC awareness               Minimize WAN bandwidth overhead
The Cloud Effect
[Diagram: Production Cluster (NoSQL/Hadoop/EDW) with Local Storage, Object Storage, and Cold Storage]
• Storage tiering
• Transparent access
• Bandwidth impact
The Evolution of Data Management
[Diagram: Data Platforms and Data Management, THE TRADITIONAL WORLD versus THE NEXT 25 YEARS]
The Talena Architecture
• Deep de-duplication and compression with app-aware architecture
• Incremental-forever backup architecture
• High availability via erasure coding in distributed cluster architecture
Smart Storage Optimizer
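To make the link between de-duplication and incremental-forever backups concrete, here is a minimal sketch of a content-addressed chunk store. It is not Talena's implementation. The `ChunkStore` class, the fixed-size chunking, and the tiny chunk size are all assumptions chosen for illustration, and an app-aware design would presumably pick chunk boundaries from the data format instead. Each backup is stored as a "recipe" of chunk digests, so every restore point can be read back as a full image while unchanged chunks are stored only once.

```python
import hashlib


class ChunkStore:
    """Toy content-addressed store: identical chunks are kept once."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size  # unrealistically small, for the demo
        self.chunks = {}              # SHA-256 hex digest -> chunk bytes

    def put(self, data):
        """Store `data` and return its recipe (ordered list of digests)."""
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Only chunks not seen before consume new space.
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Rebuild the full image for one restore point from its recipe."""
        return b"".join(self.chunks[digest] for digest in recipe)

    def stored_bytes(self):
        """Total bytes physically held after de-duplication."""
        return sum(len(chunk) for chunk in self.chunks.values())


if __name__ == "__main__":
    store = ChunkStore(chunk_size=4)
    day1 = store.put(b"AAAABBBBCCCC")
    day2 = store.put(b"AAAABBBBDDDD")  # only the last chunk changed
    print(store.get(day2), store.stored_bytes())
```

In the demo, two 12-byte backups occupy 16 bytes in total because the second backup adds only its one changed chunk, yet either day can be restored in full from its recipe.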
The Talena Architecture
• Native querying and analytics via active compute layer
• Unbounded scale with a Hadoop-native architecture
Smart Storage Optimizer
Active Compute Services, Distributed File System
The Talena Architecture
• Google-like catalog shortens data recovery time
• Automatic schema generation for mirroring and backups
• Granular recovery at an object level
• Recovery to multiple topologies
• Native integration with LDAP and Kerberos for authentication
• Role-based access control defines specific privileges
• Transparent data encryption
• Masking for PII data
Smart Storage Optimizer
Active Compute Services, Distributed File System
Metadata Catalog, Data Orchestration Services, Security Services
The Talena Architecture
• ‘Single pane of glass’ for multiple use cases and data platforms
• Agentless architecture minimizes management overhead
• GUI, CLI, REST-based Talena API options
GUI, CLI, API
Smart Storage Optimizer
Active Compute Services, Distributed File System
Metadata Catalog, Data Orchestration Services, Security Services
Q&A
• We’ll send you a link to our architecture white paper
• Additional resources: talena-inc.com/resources and talena-inc.com/blog
• Ping us with any additional questions: info@talena-inc.com
Thank You


Editor's Notes

  • #10: Starting over 20 years ago, the traditional database market became the foundation of enterprise applications. A whole ecosystem of data management products emerged to provide capabilities like backup/recovery (Veritas), storage pooling (Data Domain), test/dev management (Delphix), and archiving (Iron Mountain). But companies had to purchase separate products to assemble a full data management solution for their enterprise. Over the past few years and into the foreseeable future, modern data platforms are becoming the new hubs of enterprise applications. These modern data platforms also need data management capabilities, just as traditional databases did. (Click for build) Our vision is to help companies with their critical data management needs in a single software product, one that is optimized specifically for these modern Big Data environments.
  • #11: The next few slides will introduce the unique Talena architecture and highlight how this architecture delivers on these core business benefits. One of the most significant components of our architecture is our Smart Storage Optimizer. By integrating compute and storage management into our storage optimizer, we’re able to deliver significant cost savings. Our application-aware architecture enables us to do deep de-duplication and compression. Our backup process is incremental-forever, saving on storage costs, and by incorporating erasure coding we also ensure high availability no matter how large a Talena cluster you choose to deploy.
  • #13: Supports transparent data encryption in the security services section
  • #14: Our agentless architecture makes Talena an ideal solution for big data architectures and minimizes your operational overhead. Furthermore, Talena can support multiple data platforms, versions, and use cases in a single deployment of Talena, thereby providing a “single pane of glass” for all your big data management needs. While most of our clients work within our user interface, we also provide a REST-based API to accomplish the same tasks.