The new dominant companies are running on data
Take your company to the next level of value and efficiency
Rich Dill, Enterprise Solutions Architect, rdill@snaplogic.com
©2017 SnapLogic, Inc. All Rights Reserved Confidential Content
What problem do we want to solve?
How do we get value from all this data?
What is the solution?
Sometimes it is not obvious to everyone involved
Decisions made without facts are opinions
◦ “What are the facts? Again and again and again – what are the facts? Shun wishful thinking, ignore divine revelation, forget what
“the stars foretell,” avoid opinion, care not what the neighbors think, never mind the unguessable “verdict of history” – what are
the facts, and to how many decimal places? You pilot always into an unknown future; facts are your single clue. Get the facts!”
Robert A. Heinlein
Turn your latent assets into liquid to realize their value
- No longer latent but now liquid
◦ Data has to be on the move
- It must be leveraged by the masses
The business goal
◦ Actually deliver on the promise of transforming data into actionable information
◦ Predictive analytics improve forecasting
◦ Prescriptive analytics can guide business behaviour
◦ Geolocation analytics can improve resource utilization and inventory turns
What are the results?
- Delivering insights to executives yields direction
- Delivering insights to line workers yields results
Not everyone has the same problem
Use cases are variations on a common theme
Sampling of Industry Focused Use Cases
Use cases: Fraud Detection, Upsell & Cross-sell, Customer360, Fault Prediction, Sentiment Analysis, Personalization, M & A, Management Consulting
Umbrella Industry
Manufacturing X X
Retail X X X X X X
Healthcare X X
Financial Services X X X X X
Energy X X
Logistics & Transportation X X X
Services X X
CPG X X X
Computer Software X X
Telecom X X X X X X X
Deployment Pattern: Data Refinery or Data Lake Pop., Hub-and-Spoke, Hub-and-Spoke, Data Refinery, Data Refinery or Data Lake Pop., Data Refinery or Data Lake Pop., Common Data Modeling, Common Data Modeling
Data Lake Population
[Diagram: Source Systems 1 to N (streaming, database, SaaS app, file); ingestion (pull, push, stream); processing/transformation; storage (S3, HDFS); Data Lake]
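The Data Lake Population pattern above can be sketched in a few lines of Python. This is a minimal illustration only: the `Source` class, the function names, and the in-memory dictionary standing in for S3 or HDFS storage are all assumptions made for the example, not part of the original deck or of any SnapLogic API.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class Source:
    """Hypothetical source system: streaming, database, SaaS app, or file."""
    name: str
    kind: str   # "streaming", "database", "saas", or "file"
    mode: str   # "pull", "push", or "stream"
    records: List[dict] = field(default_factory=list)


def ingest(sources: Iterable[Source]) -> List[dict]:
    """Ingestion: collect raw records and tag each with its origin."""
    raw = []
    for src in sources:
        for rec in src.records:
            raw.append({"source": src.name, "mode": src.mode, "payload": rec})
    return raw


def transform(raw: List[dict]) -> List[dict]:
    """Processing/transformation: drop empty payloads, lowercase key names."""
    cleaned = []
    for rec in raw:
        if not rec["payload"]:
            continue
        payload = {str(k).lower(): v for k, v in rec["payload"].items()}
        cleaned.append({**rec, "payload": payload})
    return cleaned


def store(records: List[dict], lake: Dict[str, List[dict]]) -> None:
    """Storage: append payloads to the lake, partitioned by source name.
    The dict is a stand-in for S3 or HDFS (an assumption for this sketch)."""
    for rec in records:
        lake.setdefault(rec["source"], []).append(rec["payload"])


def populate_lake(sources: Iterable[Source], lake: Dict[str, List[dict]]) -> Dict[str, List[dict]]:
    """Run the three stages in order: ingest, transform, store."""
    store(transform(ingest(sources)), lake)
    return lake
```

Keeping the three stages as separate functions mirrors the layering in the diagram, so a new source type only touches ingestion and a new storage target only touches `store`.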
Data Refinery
[Diagram: Source Systems 1 to N (streaming, database, SaaS app, file); ingestion (pull, push, stream); processing/transformation; storage (S3, HDFS); Data Lake; push to OLAP]
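One way to picture the refinery step, where lake data is refined and pushed to OLAP, is a simple roll-up of raw records into aggregates keyed by dimension. The sketch below is illustrative: the `refine` function and the record fields are hypothetical, and a real deployment would use an OLAP engine rather than a Python dictionary.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Tuple


def refine(records: Iterable[dict],
           dimensions: Sequence[str],
           measure: str) -> Dict[Tuple, float]:
    """Aggregate raw lake records into a cube-like summary.

    Records missing the measure or any dimension are skipped, which is
    the 'refining' part of this sketch."""
    cube: Dict[Tuple, float] = defaultdict(float)
    for rec in records:
        if measure not in rec or any(d not in rec for d in dimensions):
            continue
        key = tuple(rec[d] for d in dimensions)
        cube[key] += float(rec[measure])
    return dict(cube)
```

For example, calling `refine(rows, ("region", "product"), "sales")` would total sales per region and product, ready to be pushed to a reporting layer.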
Common Data Model
[Diagram: several groups of Source Systems 1 to N (streaming, database, SaaS app, file); ingestion (pull, push, stream); staging (HDFS, S3, Blob); processing/transformation; storage (S3, HDFS); Data Lake; push to Downstream Apps]
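The core idea of a common data model is that each source keeps its own field names while downstream apps see one shared shape. A minimal sketch follows, assuming two hypothetical sources (`crm` and `billing`) and a three-field customer model; none of these names come from the original deck.

```python
from typing import Dict

# Hypothetical per-source mappings from native field names to the common model.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "crm": {"cust_id": "customer_id", "full_name": "name", "mail": "email"},
    "billing": {"CustomerNumber": "customer_id", "Name": "name", "EmailAddress": "email"},
}

# The shared shape every downstream app can rely on (assumed for this sketch).
COMMON_FIELDS = ("customer_id", "name", "email")


def to_common_model(source: str, record: dict) -> dict:
    """Rename a source record's fields to the common model.

    Unmapped source fields are dropped; common fields the source does not
    supply are set to None so every output has the same keys."""
    mapping = FIELD_MAPS[source]
    renamed = {mapping[k]: v for k, v in record.items() if k in mapping}
    return {f: renamed.get(f) for f in COMMON_FIELDS}
```

Adding a new source system then means adding one mapping entry rather than changing every downstream consumer.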
Hub-and-Spoke
[Diagram: Source Systems 1 to N (streaming, database, SaaS app, file); ingestion (pull, push, stream); processing/transformation; storage (S3, HDFS); Data Lake; push to EDW; three Data Marts; Data Science Workbench]
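The hub-and-spoke idea can be illustrated with a small router: one hub receives each record once and fans it out to whichever spokes want it. The `Hub` class and the spoke names below are hypothetical, chosen only to echo the data marts and workbench in the diagram.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class Hub:
    """Central hub that delivers each published record to matching spokes."""

    def __init__(self) -> None:
        # Spoke name -> predicate deciding whether the spoke accepts a record.
        self.spokes: Dict[str, Callable[[dict], bool]] = {}
        # Spoke name -> records delivered so far (stand-in for a data mart).
        self.delivered: Dict[str, List[dict]] = defaultdict(list)

    def add_spoke(self, name: str, accepts: Callable[[dict], bool]) -> None:
        self.spokes[name] = accepts

    def publish(self, record: dict) -> List[str]:
        """Deliver the record to every accepting spoke; return their names."""
        targets = [n for n, accepts in self.spokes.items() if accepts(record)]
        for name in targets:
            self.delivered[name].append(record)
        return targets
```

Because sources only ever talk to the hub, adding a new data mart is one `add_spoke` call and no source system has to change.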
The first solution: custom built
Michelangelo@Uber
Welcome my son to the machine…
The problem
◦ “There were no systems in place to build reliable, uniform, and reproducible pipelines for creating and
managing training and prediction data at scale.”
The solution: Machine Learning as a Service
◦ ML-as-a-service platform that democratizes machine learning and makes scaling AI to meet the needs of
business as easy as requesting a ride.
Michelangelo consists of a mix of open source systems and components built in-house. The
primary open source components used are HDFS, Spark, Samza, Cassandra, MLlib, XGBoost, and TensorFlow.
Cost
◦ Two years
◦ $60 million
Results
◦ A Wall Street Journal report claims SoftBank has been in touch with Uber with the apparent goal of buying a
“multi-billion dollar stake” in the company. To date, Uber has raised close to $12 billion from investors, with its
most recent valuation reportedly above $60 billion. July 25, 2017
A Model feature report
Building on success
Both the systems and staff continue to learn and evolve
“As the platform layers mature, we plan to invest in higher level tools and services to
drive democratization of machine learning and better support the needs of our
business”
For more information
◦ https://eng.uber.com/michelangelo/
The second solution: custom integration
The five year plan
Rome was not built in a day
The problem
◦ A large multinational corporation grew in part by acquisition
◦ Technology stacks and silos as far as the eye can see
◦ They had one or more of every kind of technology
◦ They had hundreds of data warehouses and data marts
The cost
◦ Implementing any new business process was blindingly expensive, took too long, and did not deliver what the users expected
or needed
The solution
◦ Simplify, standardize, consolidate and adopt a cloud strategy
◦ Insert a Data Lake into the data lifecycle
◦ Adopt a Citizen Integrator model wherever possible
The business result
◦ The combination of migration from a perpetual software license model to SaaS and the reduced labor costs of the Citizen
Integrator model resulted in savings in the millions
The evolving data lifecycle
[Diagram: two data lifecycles side by side, each fed by groups of Source Systems 1 to N.
Two stages: OLTP to DW and data marts (EDW, three Data Marts).
Three stages: OLTP to Data Lake, then to on shore data marts and DW (Data Lake, EDW, three Data Marts, Data Science Workbench).]
Results
Happy productive business users
Faster time to market for new programs with agility and LOB alignment
Over 500 users from almost all business units
Savings in the millions
A more agile business environment
The third is a solution
The solution approach
Business goals drive the architectural requirements
The problem/business goal
◦ Obtain a customer 360 view by removing the constraints of an on-premises environment and moving to a cloud-first
environment where multiple departments/constituents can access data and obtain insights.
Key Characteristics of a cloud-first enterprise stack:
◦ Scalable
◦ Collaborative
◦ Promotes easy data sharing
◦ Reduces on-premises maintenance overhead with auto updates
The process
◦ Upgrade the cloud data warehouse
◦ Move legacy BI to a modern tool like Tableau or PowerBI, for greater data fluency
◦ Create a foundation for an AI/ML workbench for predictive analytics
◦ Use an ML framework like TensorFlow from Google, which provides a Java API so trained models can be run from JVM applications
Proposed Enterprise Stack
[Diagram: Amazon S3; Amazon EMR; SnapLogic (AWS Deployed) with pull, push, and stream; push to Tableau; Machine Learning Integration Point]
Sources & Targets
Streaming: Kafka, JMS
Database: HBase, Hive, Dynamo, Mongo, Redshift, SQLServer, AzureSQL, Aurora, MySQL
Webservices: REST, SOAP
File: Flat Files, XML, JSON, Excel, Word doc, PDF, S3, FTP/SFTP, ORC, Parquet
Analytics: SAS, Cognos
Social Media: Facebook, LinkedIn, Twitter
Key Benefits of Proposed Architecture
Enables migration in phases rather than all at once
Promotes data re-use and reduces time to insight across the organization
Scalable and flexible to accommodate company’s changing needs
Reduced maintenance costs to enable IT to stay focused on enabling the business
Complete view of the customer with real-time data updates
Better focused marketing programs (less waste, higher performance)
Greater customer loyalty due to more relevant customer engagement
Observations from the field
Some observations and a few of Rich’s rules of technology
Technology is a tool, use the right one for the job
◦ It amazes me how some engineers have almost religious beliefs in their favorite technology
- If the only tool you have is a hammer…
Software evolves like a funnel
◦ Early releases have limitations that are fixed with later releases
We work in an industry where change is constant
◦ Absolute truths can change every 5-10 years
◦ The rate of change can make you old, or keep you young. As the Iron Giant said, choose!
Different technologies require different approaches and techniques
◦ I don’t code Scala like C or Cobol
◦ “A mind is like a parachute: it only functions when it is open” (Thomas Dewar)
The adoption curve entails risk… and costs
◦ There is a reason we call it the bleeding edge
Open source is not free
◦ The money you save on license cost, you will spend on additional labor, plus 25%
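Taken literally, the rule of thumb above is simple arithmetic. The sketch below only restates it; the function name and the example figure are made up for illustration, and the 25% is the speaker's rule, not a measured result.

```python
def open_source_labor_cost(license_savings: float, overhead: float = 0.25) -> float:
    """Extra labor spend implied by the rule: the license savings plus 25%."""
    return license_savings * (1 + overhead)


def net_effect(license_savings: float, overhead: float = 0.25) -> float:
    """Savings minus the implied labor cost; negative means a net loss."""
    return license_savings - open_source_labor_cost(license_savings, overhead)
```

Under this rule, avoiding a hypothetical 100,000 in license fees implies 125,000 in additional labor, a net cost of 25,000.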
Q & A
Thank You
San Mateo, CA
Boulder, CO
New York, NY
London, UK
Melbourne, AUS
Hyderabad, India
www.snaplogic.com
Rich Dill, Enterprise Solutions Architect
rdill@snaplogic.com
