Cloud Data Integration Best Practices
  • Kurt Messersmith, Amazon Web Services
  • David S. Linthicum, www.davidlinthicum.com
  • Darren Cunningham, Informatica Cloud
Today’s Agenda
  • Amazon Web Services
  • Data Integration for “Cloud”
  • Informatica Cloud
AMAZON WEB SERVICES
  • Kurt Messersmith, Sr Manager, AWS
AMAZON’S THREE BUSINESSES
  • Consumer (Retail) Business: tens of millions of active customer accounts; seven countries (US, UK, Germany, Japan, France, Canada, China)
  • Seller Business: sell on Amazon websites; use Amazon technology for your own retail website; leverage Amazon’s massive fulfillment center network
  • Developers & IT Professionals: on-demand infrastructure for hosting web-scale solutions; hundreds of thousands of registered customers
AWS USAGE GRAPH
  • 2007: AWS bandwidth usage surpassed that of the Amazon.com global websites
  • Today: AWS bandwidth usage is 30% greater than that of the Amazon.com global websites
  [Chart: Bandwidth Usage]
CLOUD ATTRIBUTES
  • Abstract Resources: not tied to physical hardware, and can flex as your needs demand.
  • On-Demand Provisioning: ask for what you need, exactly when you need it. Pay only for what you use.
  • Scalability: scale up or down depending on usage needs.
  • No Up-Front Costs: no contracts or long-term commitments. Pay only for what you use.
  • Efficiency of Experts: utilize the skills, knowledge and resources of experts.
OPPORTUNITY COSTS: OWNED VS. CLOUD
  • Owned Infrastructure: The Heavy Lifting: server hosting; contract negotiation; bandwidth management; purchase decisions; moving facilities; scaling and managing physical growth; heterogeneous hardware; legacy software; coordinating large teams
  • Cloud Computing: The 70/30 Switch: 30% of time, energy and dollars on differentiated value creation
PREDICTIONS COST MONEY
  [Chart: infrastructure cost ($) versus time. Series: Predicted Demand, Traditional Hardware, Actual Demand, Automated Virtualization. Annotations: “Large Capital Expenditure”, “You just lost customers”.]
AGILITY EXAMPLE: COST NEUTRAL EQUATION
This graphic compares running the same 10,000 jobs on 2 servers versus 1,000 servers, assuming each server can process 10 jobs/hour. With AWS (and RightScale) the cost is the same in either scenario, but the elapsed time differs by 499 hours.
  • 2 server cloud: 10,000 jobs in, output data out; total processing time: 500 hours
  • 1,000 server cloud: 10,000 jobs in, output data out; total processing time: 1 hour
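The arithmetic behind this slide can be checked with a short sketch. It assumes perfectly parallel jobs and a flat per-server-hour price, neither of which the slide states explicitly; the function names are illustrative only.

```python
def elapsed_hours(jobs: int, servers: int, jobs_per_server_hour: int) -> float:
    """Wall-clock hours to finish all jobs when the servers work in parallel."""
    return jobs / (servers * jobs_per_server_hour)


def server_hours(jobs: int, servers: int, jobs_per_server_hour: int) -> float:
    """Billable server-hours: number of servers times the hours each one runs."""
    return servers * elapsed_hours(jobs, servers, jobs_per_server_hour)


JOBS = 10_000
RATE = 10  # jobs per server per hour, as stated on the slide

small_fleet_hours = elapsed_hours(JOBS, 2, RATE)      # 2 servers
large_fleet_hours = elapsed_hours(JOBS, 1000, RATE)   # 1,000 servers

small_fleet_cost_units = server_hours(JOBS, 2, RATE)
large_fleet_cost_units = server_hours(JOBS, 1000, RATE)

print(small_fleet_hours, large_fleet_hours)            # 500.0 1.0
print(small_fleet_cost_units, large_fleet_cost_units)  # 1000.0 1000.0
```

Both fleets consume 1,000 server-hours, which is why the cost is neutral under hourly pricing while the elapsed time drops from 500 hours to 1 hour.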
DIVERSE ENTERPRISE USE CASES
  • Web site hosting
  • Application hosting
  • Internal IT application hosting
  • Quick and effective marketing campaigns
  • Content delivery and media distribution
  • High performance computing, batch data processing, and large scale analytics
  • Storage, backup, and disaster recovery
  • Development and test environments
DIVERSE ENTERPRISE CUSTOMER ROSTER
Data Integration for “Cloud.” David S. Linthicum www.davidlinthicum.com [email_address]
So, Why Data Integration and “Cloud”
  • Improved Adaptability and Agility: respond to business needs in near real-time
  • Functional Reusability: eliminate the need for large scale rip and replace
  • Independent Change Management: focus on configuration rather than programming
  • Interoperability instead of point-to-point integration: loosely-coupled framework, services in network
  • Orchestrate rather than integrate: configuration rather than development to deliver business needs
Understand Cloud Provider Interfaces
  [Diagram labels: New Accounts, Commission Calculation, Data Cleaning, Sales Order Update, Finance/Operations, Sales]
Evolving Migration to Public Cloud Computing Providers
  [Diagram labels: Public Cloud, Traditional Data Center]
Data Integration
  [Diagram labels: SaaS, IaaS, PaaS, APIs, ERP, Legacy, Data, On Premise, On Demand]
Understanding the Problem
Cloud services must integrate with existing enterprise systems to become more valuable. However, internal integration needs to be in place to ensure:
  • Production and consumption of structured information
  • Semantic mediation
  • Security mediation
  • Service enablement
  • Firewall management
  • Transactional integrity
  • Holistic management of the complete integration chain
Getting Ready
So, how do you prepare yourself? I have a few suggestions:
  • First, accept the notion that it's okay to leverage services that are hosted on the Internet as part of your SOA. Normal security management needs to apply, of course.
  • Second, create a strategy for the consumption and management of outside-in services, including how you'll deal with semantic management, security, transactions, etc.
  • Finally, create a proof of concept now. This does a few things, including getting you through the initial learning process and providing proof points as to the feasibility of leveraging outside-in services.
Remember, there are a few technical issues that you must address…
  • Semantic and metadata management: the management of the different information representations among the external services and internal systems.
  • Transformation and routing: accounting for those data differences at run time.
  • Governance across all systems: not giving up the notion of security and control when extending your SOA to the global SOA.
  • Discovery and service management: how to find and leverage services inside or outside of your enterprise, and how to keep track of those services through their maturation.
  • Information consumption, processing, and delivery: how to effectively move information to and from all interested systems.
  • Connectivity and adapter management: how to externalize and internalize information and services from very old and proprietary systems.
  • Process orchestration, and service and process abstraction: the ability to abstract the services and information flows into bound processes, thus creating a solution.
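As one illustration of the semantic management and transformation-and-routing items, the sketch below maps an external record into an internal representation and picks a destination for it. Every field name, queue name, and converter here is hypothetical and chosen only for the example; the slides do not prescribe a schema or a tool.

```python
# Hypothetical mapping: external field name -> (internal field name, converter).
FIELD_MAP = {
    "AccountName": ("customer_name", lambda v: str(v).strip()),
    "AnnualRevenue": ("annual_revenue_usd", float),
    "BillingCountry": ("country_code", lambda v: str(v).strip().upper()),
}

# Hypothetical routing table: record type -> internal destination.
ROUTES = {"account": "crm_queue", "order": "erp_queue"}


def transform(external: dict) -> dict:
    """Rename and convert the fields we know about; ignore the rest."""
    internal = {}
    for source_name, (target_name, convert) in FIELD_MAP.items():
        if source_name in external:
            internal[target_name] = convert(external[source_name])
    return internal


def route(record_type: str) -> str:
    """Pick a destination; unknown types go to a holding queue for review."""
    return ROUTES.get(record_type, "dead_letter_queue")


record = {"AccountName": " Acme ", "AnnualRevenue": "1250.5", "BillingCountry": " us"}
mapped = transform(record)
destination = route("account")
```

Keeping the mapping in data rather than in code reflects the deck's emphasis on configuration over programming: a new external field becomes a new table entry, not a new function.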
Core Issues that Architects Must Consider when Integrating with “Clouds.”
  • The ability to handle larger data sets.
  • The ability to handle and resolve data inaccuracies and inconsistencies.
  • The ability to do data manipulation efficiently and inexpensively.
  • The ability to provide visibility into the lineage of data.
  • The ability to decouple data access from the implementation.
Limitations of Existing Integration Approaches
  • Inefficient consumption of data by the integration engine from the source systems.
  • Lack of validation and transformation of the data for the correct format and structure.
  • No early detection of data inaccuracies and inconsistencies, leading to error-prone business processes.
  • Inability to handle data quality issues.
  • No tracking of data to ensure data traceability and lineage.
  • Content transformation, on messages and large sets of data.
  • Inefficient provisioning of the data from the integration and processing engine to the target system.
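Two of these gaps, early detection of bad data and lineage tracking, can be illustrated with a small sketch. It is a minimal example under assumed rules (the required fields, the email check, and the lineage keys are all hypothetical), not a description of any particular integration product.

```python
from dataclasses import dataclass, field

REQUIRED_FIELDS = ("id", "email")  # hypothetical rule set for the example


@dataclass
class IngestResult:
    accepted: list = field(default_factory=list)
    rejected: list = field(default_factory=list)


def validate(record: dict) -> list:
    """Return a list of problems; an empty list means the record is usable."""
    errors = [f"missing {name}" for name in REQUIRED_FIELDS
              if record.get(name) in (None, "")]
    email = record.get("email")
    if email and "@" not in email:
        errors.append("malformed email")
    return errors


def ingest(records: list, source: str, batch_id: str) -> IngestResult:
    """Validate before loading, and tag every record with where it came from."""
    result = IngestResult()
    for position, record in enumerate(records):
        lineage = {"source": source, "batch_id": batch_id, "position": position}
        errors = validate(record)
        if errors:
            result.rejected.append(
                {"record": record, "errors": errors, "lineage": lineage})
        else:
            result.accepted.append({**record, "_lineage": lineage})
    return result


batch = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "not-an-address"},
    {"email": "c@example.com"},
]
outcome = ingest(batch, source="crm", batch_id="batch-001")
```

Rejected records keep their lineage too, so a data steward can trace a failure back to the source system and batch instead of discovering it later in a downstream business process.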
Issues You Need to Consider when Selecting Data Integration Technology for Enterprise-to-Cloud
  • Lack of support for complex data transformations.
  • Challenges in handling large data volumes.
  • Lack of support for handling varying data latencies, including batch, trickle-feed and real-time.
  • Difficulty in determining the origin of data or how it’s utilized.
  • Lack of standards-based approaches and limited re-use across projects.
  • Lacking mechanisms to handle data quality issues across sources.
  • No protection against changes to underlying data sources.
  • The requirement for manual handling of diverse data structures, formats, access, etc.
  • Limited support for metadata and impact analysis.
  • Lacking a mechanism to automatically detect changes to the data.
  • Lack of support for batch and trickle-feed (CDC) data movement.
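To make the change-detection item concrete, here is a minimal sketch that compares row fingerprints between two snapshots. This is snapshot diffing, a simpler stand-in for log-based change data capture, and the row shape and the `id` key are assumptions for the example.

```python
import hashlib
import json


def fingerprint(row: dict) -> str:
    """Stable hash of a row's content, independent of key order."""
    payload = json.dumps(row, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def detect_changes(previous: dict, current_rows: list, key: str = "id"):
    """Compare the last snapshot's fingerprints with the current rows.

    Returns the change sets and the new fingerprint state to store for next time.
    """
    current = {row[key]: fingerprint(row) for row in current_rows}
    changes = {
        "inserted": sorted(k for k in current if k not in previous),
        "updated": sorted(k for k in current
                          if k in previous and current[k] != previous[k]),
        "deleted": sorted(k for k in previous if k not in current),
    }
    return changes, current


first_snapshot = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
second_snapshot = [{"id": 2, "name": "B"}, {"id": 3, "name": "c"}]

initial_changes, state = detect_changes({}, first_snapshot)
later_changes, state = detect_changes(state, second_snapshot)
```

Only the keys reported in `later_changes` need to move to the target, which is the trickle-feed behavior the slide asks for. The trade-off is that every run still reads the whole source, so this approach suits modest volumes.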
Create the Information Model
  [Diagram labels. Steps: Understand Ontologies, Understand the Data, Catalog the Data, Build Information Model. Artifacts: Ontologies, Data Dictionary & Metadata, Data Catalog, Legacy Metadata, External Metadata (B2B), Information Model.]
Start with the Architecture
Understand:
  • Business drivers
  • Information under management
  • Existing services under management
  • Core business processes
The Informatica Cloud www.informaticacloud.com Darren Cunningham, Informatica Cloud Marketing
Primary Cloud Integration Use Cases:
  • Replicate Data
  • Load Data
  • Synchronize Data
  • Cleanse Data
  [Diagram label: Your Company]
Cloud Integration Options
  • 1: Hand Code
  • 2: On-Premise Tools
  • 3: Outsource
  • 4: Cloud Services
“You need to consider integration for what it is: the mother of all single points of failure.” David Linthicum, Author, Cloud Computing and SOA Convergence in Your Enterprise
The Informatica Cloud
The Industry’s Broadest Cloud Integration Portfolio
  • Informatica Cloud Services (Business Managers): Migrate, Validate, Monitor, Synch, Replicate
  • Informatica Cloud Editions & Options (IT)
  • Informatica Cloud Platform (SIs, ISVs, Developers): Custom
Data Integration as a Service Advantages
www.informaticacloud.com
  • +500 customers; +20K jobs/day; +5B rows/month
  • Services: Migrate, Monitor, Replicate, Synch, Custom
  • For Customers: Rapid Deployment; Utility Pricing; Minimal Training; Fewer IT Resources; Seamless Upgrades; Usage Tracking
  • For ISVs: Reduced Dev Costs; Rapid Innovation; Best of Breed Tech; Greater Scalability; Expand Your Market; Focus on Your Core
Data Replication as a Cloud Service
“We’re using Informatica Cloud Services to replicate millions of rows of data from Salesforce to a centralized database running on Amazon EC2.”
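The replication pattern in this quote can be sketched in a few lines. This is not how Informatica Cloud or the Salesforce API works; it is a generic incremental copy using a last-modified watermark. An in-memory list stands in for the SaaS source, SQLite stands in for the centralized database, and the `account` table and its columns are hypothetical.

```python
import sqlite3


def replicate(source_rows, conn, watermark):
    """Copy rows modified after `watermark` into the target; return the new watermark."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS account ("
        "id TEXT PRIMARY KEY, name TEXT, last_modified TEXT)"
    )
    # ISO 8601 strings in one fixed format sort chronologically as plain text.
    changed = [row for row in source_rows if row["last_modified"] > watermark]
    conn.executemany(
        "INSERT OR REPLACE INTO account (id, name, last_modified) "
        "VALUES (:id, :name, :last_modified)",
        changed,
    )
    conn.commit()
    return max((row["last_modified"] for row in changed), default=watermark)


# Stand-in data; a real source would be queried through the provider's API.
source = [
    {"id": "001", "name": "Acme", "last_modified": "2010-01-05T10:00:00Z"},
    {"id": "002", "name": "Globex", "last_modified": "2010-01-06T09:30:00Z"},
]
target = sqlite3.connect(":memory:")

watermark = replicate(source, target, "")         # first run copies everything
source[0] = {"id": "001", "name": "Acme Corp",
             "last_modified": "2010-01-07T08:00:00Z"}
watermark = replicate(source, target, watermark)  # second run copies only row 001
```

Carrying the watermark between runs keeps each run proportional to the amount of change rather than to the size of the source. The strict `>` comparison can miss a row that shares the watermark's exact timestamp, so a production job would need a tie-breaking rule.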
Contacts
  • David S. Linthicum, www.davidlinthicum.com, [email_address]
  • Kurt Messersmith, Amazon Web Services, [email_address]
  • Darren Cunningham, www.informaticacloud.com, [email_address]

Cloud Data Integration Best Practices

  • 1. Cloud Data Integration Best Practices Kurt Messersmith, Amazon Web Services David S. Linthicum, www.davidlinthicum.com Darren Cunningham, Informatica Cloud
  • 2. Today’s Agenda Amazon Web Services Data Integration for “Cloud” Informatica Cloud
  • 3. AMAZON WEB SERVICES Kurt Messersmith, Sr Manager, AWS
  • 4. AMAZON’S THREE BUSINESSES Consumer (Retail) Business Tens of millions of active customer accounts Seven countries: US, UK, Germany, Japan, France, Canada, China Seller Business Sell on Amazon websites Use Amazon technology for your own retail website Leverage Amazon’s massive fulfillment center network Developers & IT Professionals On-demand infrastructure for hosting web-scale solutions Hundreds of thousands of registered customers
  • 5. AWS USAGE GRAPH 2007: AWS bandwidth usage surpassed Amazon.com global websites Today: AWS bandwidth usage 30% greater than Amazon.com global websites Bandwidth Usage:
  • 6. CLOUD ATTRIBUTES Abstract Resources Not tied to physical hardware and can be flexible as your needs demand. On-Demand Provisioning Ask for what you need, exactly when you need it. Pay only for what you use. Scalability Scale up or down depending on usage needs. No Up-Front Costs No contracts or long-term commitments. Pay only for what you use. Efficiency of Experts Utilize the skills, knowledge and resources of experts.
  • 7. Owned Infrastructure: The Heavy Lifting - Server hosting - Contract negotiation - Bandwidth management - Purchase decisions - Moving facilities OPPORTUNITY COSTS: OWNED VS. CLOUD Scaling and managing physical growth Heterogeneous hardware Legacy software Coordinating large teams Cloud Computing: The 70/30 Switch 30% of time, energy and dollars on differentiated value creation
  • 8. PREDICTIONS COST MONEY Infrastructure Cost $ time Large Capital Expenditure You just lost customers Predicted Demand Traditional Hardware Actual Demand Automated Virtualization
  • 9. AGILITY EXAMPLE—COST NEUTRAL EQUATION This graphic compares running the same 10,000 jobs on 2 servers versus 1000 servers. The cost is the same for either scenario in using AWS (and RightScale), but the difference in elapsed time is 499 hours. (assuming each server can process 10 jobs/hour) 2 server cloud 10,000 jobs 10,000 jobs 1000 server cloud Output data Output data Total processing time:500 hours Total processing time:1 hour
  • 10. Web site hosting Application hosting Internal IT application hosting Quick and effective marketing campaigns Content delivery and media distribution High performance computing, batch data processing, and large scale analytics Storage, backup, and disaster recovery Development and test environments DIVERSE ENTERPRISE USE CASES
  • 12. Data Integration for “Cloud.” David S. Linthicum www.davidlinthicum.com [email_address]
  • 13. So, Why Data Integration and “Cloud” Improved Adaptability and Agility Respond to business needs in near real-time Functional Reusability Eliminate the need for large scale rip and replace Independent Change Management Focus on configuration rather than programming Interoperability instead of point-to-point integration Loosely-coupled framework, services in network Orchestrate rather than integrate Configuration rather than development to deliver business needs
  • 14. Understand Cloud Provider Interfaces New Accounts Commission Calculation Data Cleaning Sales Order Update Finance/ Operations Sales
  • 15. Public Cloud Traditional Data Center Evolving Migration to Public Cloud Computing Providers
  • 16. Public Cloud Traditional Data Center Evolving Migration to Public Cloud Computing Providers
  • 17. SaaS IaaS PaaS APIs ERP Legacy Data On Premise On Demand Data Integration
  • 18. Understanding the Problem Cloud services must integrate with existing enterprise systems to become more valuable. However, existing internal integration needs to exist to ensure: Production and consumption of structured information Semantic mediation Security mediation Service enablement Firewall management Transactional integrity Holistic management of complete integration chain
  • 19. Getting Ready So, how do you prepare yourself? I have a few suggestions: First, accept the notion that it's okay to leverage services that are hosted on the Internet as part of your SOA. Normal security management needs to apply, of course. Second, create a strategy for the consumption and management of outside-in services , including how you'll deal with semantic management, security, transactions, etc. Finally, create a proof of concept now. This does a few things including getting you through the initial learning process and providing proof points as to the feasibility of leveraging outside-in services.
  • 20. Remember, there are a few technical issues that you must address… Semantic and metadata management , or, the management of the different information representations amount the external services and internal systems. Transformation and routing , or, accounting for those data differences during run time. Governance across all systems , meaning, not giving up the notion of security and control when extending your SOA to the global SOA. Discovery and service management , meaning, how to find and leverage services inside or outside of your enterprise, and how to keep track of those services through their maturation. Information consumption, processing, and delivery , or, how to effectively move information to and from all interested systems. Connectivity and adapter management , or, how to externalize and internalize information and services from very old and proprietary systems. Process orchestration and service, and process abstraction , or, the ability to abstract the services and information flows into bound processes, thus creating a solution
  • 21. Core Issues that Architects Must Consider when Integrating with “Clouds.” The ability to handle larger data sets. The ability to handle and resolve data inaccuracies and inconsistencies. The ability to do data manipulation efficiently and inexpensively. The ability to provide visibility into the lineage of data. The ability to decouple data access from the implementation
  • 22. Limitations of Existing Integration Approaches Inefficient consumption of data by the integration engine from the source systems. Lack of validation and transformation of the data for the correct format and structure. No early detection of data inaccuracies and inconsistencies leading to error-prone business processes Inability to handle data quality issues No tracking of data to insure data traceability and lineage Content transformation, on message and large set of data Inefficient provisioning of the data from the integration and processing engine to the target system.
  • 23. Issues You Need to Consider when Selecting Data Integration Technology for Enterprise-to-Cloud Lack of support for complex data transformations. Challenges in handling large data volumes. Lack of support for handling varying data latencies including batch, trickle-feed and real-time. Difficulty in determining the origin of data or how it’s utilized. Lack of standards-based approaches and limited re-use across projects. Lacking mechanisms to handle data quality issues across sources. No protection against changes to underlying data sources. The requirement for manual handling of diverse data structures, formats, access, etc. Limited support for metadata and impact analysis. Lacking a mechanism to automatically detect changes to the data. Lack of support for batch and trickle-feed (CDC) data movement.
  • 24. Create the Information Model Ontologies Understand Ontologies Understand the Data Data Dictionary & Metadata Catalog the Data Data Catalog Legacy Metadata External Metadata (B2B) Build Information Model Information Model
  • 25. Start with the Architecture Understand: Business drivers Information under management Existing services under management Core business processes
  • 26. The Informatica Cloud www.informaticacloud.com Darren Cunningham, Informatica Cloud Marketing
  • 27. Replicate Data Primary Cloud Integration Use Cases: Your Company Load Data Synchronize Data Cleanse Data
  • 28. Cloud Integration Options Outsource Cloud Services On-Premise Tools 3 4 2 Hand Code You need to consider integration for what it is: the mother of all single points of failure . “ ” David Linthicum Author, Cloud Computing and SOA Convergence in Your Enterprise 1
  • 29. The Informatica Cloud The Industry’s Broadest Cloud Integration Portfolio Informatica Cloud Services Business Managers Migrate Validate Monitor Synch Replicate Informatica Cloud Editions & Options IT Informatica Cloud Platform SIs, ISVs, Developers Custom
  • 30. Data Integration as a Service Advantages www.informaticacloud.com +500 customers +20K jobs/day + 5B rows/month Migrate Monitor Replicate Synch For Customers Rapid Deployment Utility Pricing Minimal Training Fewer IT Resources Seamless Upgrades Usage Tracking For ISVs Reduced Dev Costs Rapid Innovation Best of Breed Tech Greater Scalability Expand Your Market Focus on Your Core Custom
  • 31. Data Replication as a Cloud Service We’re using Informatica Cloud Services to replicate millions of rows of data from Salesforce to a centralized database running on Amazon EC2. ” “
  • 32. Contacts: David S. Linthicum, www.davidlinthicum.com, [email_address]; Kurt Messersmith, Amazon Web Services, [email_address]; Darren Cunningham, www.informaticacloud.com, [email_address]
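
Slides 23 and 31 mention trickle-feed (CDC) data movement and replicating cloud data into a central database. One common way to approximate this is a "last modified" watermark: each run copies only the rows changed since the previous run. The sketch below is illustrative only and is not how Informatica Cloud is implemented. It assumes two in-memory SQLite databases standing in for the cloud source and the replica, and the `accounts` table, the `modified_at` column and the `replicate_changes` function are hypothetical names chosen for the example.

```python
import sqlite3


def replicate_changes(source, target, last_sync):
    """Copy rows modified after last_sync from source to target.

    Returns the new watermark (the latest modified_at value copied),
    or last_sync unchanged when nothing was copied.
    """
    rows = source.execute(
        "SELECT id, name, modified_at FROM accounts "
        "WHERE modified_at > ? ORDER BY modified_at",
        (last_sync,),
    ).fetchall()
    # Upsert so that updated rows overwrite their earlier copies.
    target.executemany(
        "INSERT OR REPLACE INTO accounts (id, name, modified_at) VALUES (?, ?, ?)",
        rows,
    )
    target.commit()
    return rows[-1][2] if rows else last_sync


def make_db():
    """Create an in-memory database with the hypothetical accounts table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT, modified_at TEXT)"
    )
    return db


source, target = make_db(), make_db()
source.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [(1, "Acme", "2010-01-01T00:00:00"), (2, "Globex", "2010-01-02T00:00:00")],
)
source.commit()

# First run: an empty watermark copies every row (initial load).
watermark = replicate_changes(source, target, "")

# A later change in the source is picked up by the next run only.
source.execute(
    "UPDATE accounts SET name = ?, modified_at = ? WHERE id = ?",
    ("Acme Corp", "2010-01-03T00:00:00", 1),
)
source.commit()
watermark = replicate_changes(source, target, watermark)
```

A watermark like this does not detect deleted rows and depends on the source maintaining a reliable modification timestamp, which is part of why slide 23 lists automatic change detection as a selection criterion.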

Editor's Notes

  • #5: First, some context: at the highest levels of the company, we think of Amazon.com as three distinct macro businesses: our Consumer/Retail business, our Seller business, and our Developer business.
  • #12: Stanford: Stanford University uses Moonwalk’s Enterprise Data Management System to store and back up its data in Amazon S3. Moonwalk is an ISV partner of AWS. CA: Management solutions from CA provide a business-driven management approach and add comprehensive support for Amazon EC2 with real-time automation, provisioning, application performance, service, and database management. Adobe: Adobe offers its LiveCycle Developer Express program on AWS, giving its enterprise developer community ready access to its document workflow solution for developing solutions. Adobe also now offers ColdFusion 9, its application development platform, on AWS. Microsoft: Microsoft has used AWS over the years for various tasks, including software delivery, human intelligence tasks, and application hosting. Mailtrust (Rackspace): Mailtrust archives its mail servers in S3. NYTimes: NYTimes has used AWS extensively over the years for data processing pipelines, data analysis, application hosting, etc. Early projects included the TimesMachine, among others. SanDisk: SanDisk uses S3 as the cloud-based storage mechanism to back up and share data from its flash drive products. NASDAQ: NASDAQ uses S3 to store and deliver its ticker symbol data for the NASDAQ Market Replay application. ESPN: ESPN uses EC2 and S3 to host several of its social networking and mobile properties. Intuit: Intuit used SOASTA’s CloudTest service to run load testing on its tax software, utilizing 2200 EC2 cores. It also hosts some applications on EC2 during tax season, including Intuit TaxCaster. Netflix: Netflix is using AWS for a variety of mission-critical applications and services, and will continue to look for ways to leverage the Amazon cloud to service its customers. Autodesk: Autodesk hosts several applications on EC2, including Autodesk Seek. Autodesk Seek is the online source for architects and building engineers to search and download manufacturer design information (i.e., reusable CAD models that can be dropped directly into design projects). Pfizer: Pfizer has done antibody docking for drug design on EC2, using up to 500 c1.medium instances at a time for the modeling. New York Life: New York Life has created a financial planning application that will help its employees run 'what-if' scenarios for their customers. It will look at things like income, debt, expenses, etc., and come up with a customized plan.
  • #14: What should you expect from this approach? It is key to adaptability and agility. Investment is required in strategy, not technology: how to use current technology to achieve goals. Investment is also needed in re-orienting the thinking of IT staff. How do we provide services the same way FedEx enables overnight versus 2-day delivery? FedEx has built a flexible architecture to enable business-level services.
  • #28, #29: So the cloud is taking off, but it has also become the #1 driver of data fragmentation in the enterprise. As one of our customers said, “a SaaS application without integration is like a beautiful island that nobody can actually get to.” In fact, Forrester Research has shown that 65% of IT managers recognize integration issues as the top barriers to success.
  • #30: That’s why Informatica has put such a focus on the cloud. We’re now delivering deployment options across the different flavors of cloud computing: SaaS, PaaS, and Infrastructure as a Service. Informatica Cloud Services are purpose-built applications designed for non-technical line-of-business users (often the SaaS administrator, for example). We’ve initially focused on a set of specific use cases that we see as the primary requirements today for SaaS application customers: data migration (loading data in), data synchronization (keeping systems and processes unified on a real-time basis), data quality, and data replication (keeping a local copy of cloud data, typically for on-premise business intelligence). Last year we extended our cloud offerings in two important ways. First, we introduced the Informatica Cloud Platform, which allows our customers to build and share more complex mappings and functions as a custom cloud service. Second, we added support for IaaS deployments such as Amazon EC2. This means you can sign up to use Informatica PowerCenter or Data Quality on an hourly basis or deploy your software directly on their servers.
  • #31: http://www.destinationcrm.com/Articles/Web-Exclusives/Viewpoints/Want-SaaS-Get-Integration-First-51857.aspx By 2010, 76% of US organizations will use at least one SaaS-delivered application for business.