ZERO DOWNTIME APP DEPLOYMENT USING HADOOP
Hadoop Summit 2016 – San Jose
Heman Duraiswamy
Solutions Engineer
Wei Wang
Data Scientist
Jun 30, 2016
Agenda
 Introduction
 Our Story that ends with “Happily ever after”: “Zero downtime app deployment”
 Reference Architecture
 Demo
Introduction: Why?
 Expedia revenue in 2015 == $6.67B
 > $18.2MM/day, or > $750,000/hr
 Cost of a 15-minute deployment window with one deployment every other week?
 ~$5MM per year
 Same figure for Amazon? ~$80MM
 And of course, customer trust & confidence!
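The arithmetic behind these figures can be checked with a short sketch. It assumes 26 deployments a year (one every other week) and revenue earned at a flat rate around the clock. The function name and its parameters are illustrative, and the Amazon revenue of roughly $107B for 2015 is an outside figure that the slides do not state.

```python
HOURS_PER_YEAR = 365 * 24


def annual_downtime_cost(annual_revenue, window_minutes=15, deployments_per_year=26):
    """Revenue at risk per year if every deployment takes the site down.

    Assumes revenue is earned at a constant rate, which real traffic is not.
    """
    revenue_per_hour = annual_revenue / HOURS_PER_YEAR
    downtime_hours = deployments_per_year * window_minutes / 60
    return revenue_per_hour * downtime_hours


expedia_2015 = 6.67e9   # from the slide
amazon_2015 = 107e9     # assumption: approximate 2015 revenue, not in the slides

print(f"Expedia: ${expedia_2015 / HOURS_PER_YEAR:,.0f}/hr, "
      f"${annual_downtime_cost(expedia_2015) / 1e6:.1f}MM/yr")
print(f"Amazon:  ${annual_downtime_cost(amazon_2015) / 1e6:.1f}MM/yr")
```

Under these assumptions both results land close to the rounded numbers on the slide.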
Introduction: How?
Innovate (tools, technology and architecture)
Monitor (in real time)
React (actionable intelligence)
Zero downtime app deployment using Hadoop
Our Story: Once upon a time…
Our Story: then they evolved
Rolling deployment
 Near Zero downtime deployment
 Hampers Innovation
 Operational overhead
End game…
Using HADOOP
Innovate -- Microservices
Header module
Search module
localDest module
topDeal module
Loyalty module
Deploy-at-will -- Continuous Delivery
Continuous Integration
Jenkins: Build
Glu: Deployment
Learn & Succeed (or fail)
Canary Deployment
A/B testing
Operational monitoring
24x7 instant feedback
Reference Infrastructure layout
Header module
Search module
localDest module
topDeal module
Loyalty module
Server001 Server002 Server003
Server004 Server005
Server006 Server007
Server012 Server013 Server014
Server008 Server009 Server010 Server011
Use case 1: server in bad state, meaning it serves a higher proportion of 404 & 503 pages
Use case 2: server is slow and takes longer to process requests
Use case 3: bad app version deployed, which serves a high proportion of application CRIT and ERROR messages
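The three use cases can be read as three threshold checks over the logs of one server. The following is a minimal sketch of that reading, not the presenters' implementation: the `ServerWindow` record, the `diagnose` function and all three thresholds are hypothetical.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class ServerWindow:
    """Log events seen for one server in one time window (hypothetical shape)."""
    server: str
    status_codes: list = field(default_factory=list)   # from access logs
    response_ms: list = field(default_factory=list)    # from access logs
    log_levels: list = field(default_factory=list)     # from application logs


def share(items, bad):
    """Fraction of items that fall in the `bad` set; 0.0 for an empty list."""
    return sum(1 for i in items if i in bad) / len(items) if items else 0.0


def diagnose(w, max_http_error=0.05, max_mean_ms=500.0, max_app_error=0.10):
    """Return the use cases (1, 2, 3) that this window trips.

    All three thresholds are illustrative assumptions.
    """
    problems = []
    if share(w.status_codes, {404, 503}) > max_http_error:
        problems.append(1)   # bad state: too many 404/503 pages
    if w.response_ms and mean(w.response_ms) > max_mean_ms:
        problems.append(2)   # slow server
    if share(w.log_levels, {"CRIT", "ERROR"}) > max_app_error:
        problems.append(3)   # bad app version: too many CRIT/ERROR messages
    return problems


healthy = ServerWindow("Server001", [200] * 99 + [404], [120.0] * 100, ["INFO"] * 50)
sick = ServerWindow("Server004", [200] * 80 + [503] * 20, [900.0] * 100,
                    ["INFO"] * 6 + ["ERROR"] * 3 + ["CRIT"])
print(diagnose(healthy), diagnose(sick))
```

A fixed threshold is the simplest choice. Comparing each server against its peers running the same module would be a natural refinement.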
Reference Architecture
HDF (NiFi)
Kafka
Storm Topology
Application log files
Access log files
Servers
Servers
Solr
Hive
HDFS
Banana view
Ansible script
DEMO: Log prep
DEMO: HDF flow
DEMO: Storm Topology
Storm
Kafka Bolt
Aggregation Bolt
Timer Bolt
Calculation Bolt
Solr Bolt
Hive Bolt
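The slides name the bolts but do not describe what each one does. One plausible reading is that events are aggregated per server over fixed time windows and a ratio is then calculated for each window. The sketch below illustrates that idea in plain Python rather than with the Storm API; the window size, function names and event shape are all assumptions.

```python
from collections import Counter, defaultdict

WINDOW_SECONDS = 60  # assumption: the slides do not give a window size


def window_start(ts):
    """Align a Unix timestamp to the start of its tumbling window."""
    return ts - ts % WINDOW_SECONDS


def aggregate(events):
    """Count status codes per (window, server), as an aggregation step might.

    `events` is an iterable of (timestamp, server, status_code) tuples.
    """
    counts = defaultdict(Counter)
    for ts, server, status in events:
        counts[(window_start(ts), server)][status] += 1
    return counts


def error_ratios(counts, bad=(404, 503)):
    """Turn raw counts into a 404/503 ratio per (window, server)."""
    return {
        key: sum(c[s] for s in bad) / sum(c.values())
        for key, c in counts.items()
    }


events = [
    (0, "Server001", 200), (10, "Server001", 200), (59, "Server001", 503),
    (61, "Server001", 200), (5, "Server002", 404), (30, "Server002", 404),
]
for (start, server), ratio in sorted(error_ratios(aggregate(events)).items()):
    print(start, server, round(ratio, 2))
```

In a real topology the window would be closed by a timer rather than by processing a finished list, and the ratios would be written to a store such as Solr or Hive.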
DEMO: Analytics Results in Banana
DEMO…
THANK YOU!
hduraiswamy@hortonworks.com
@hduraiswamy
https://www.linkedin.com/in/hemananthan
wwang@hortonworks.com
https://www.linkedin.com/in/wei-wang-0957902
https://github.com/heman-duraiswamy/ZeroDowntimeDeployment
https://github.com/ww2265columbia/hadoopsummit2016/


Editor's Notes

  • #4: Refer to Orbitz, which deploys ~1000 times/year across a host of 150 different applications
  • #5: Of course, one way of achieving zero downtime is doing NOTHING, but that is going to kill your company from the inside out
  • #10: Challenges are aplenty... not even going to get there.