Architecting for Failure
in a Containerized World
Tom Faulhaber
Infolace
How can container tech help us
build robust systems?
Key takeaway: an architectural
toolkit for building robust
systems with containers
The Rules
Decomposition
Orchestration and Synchronization
Managing Stateful Apps
Simplicity
Simple means:
“Do one thing!”
The opposite of
simple is complex
Complexity exists
within
components
Complexity exists
between
components
Example: a counter
[Diagram: Counter Service instances, each emitting 1 2 3 4 50 …; one frame carries an x mark]
Example: a counter
[Diagram: two Counter Service instances behind a Load Balancer, each emitting 1 2 3 4 50 …]
State + composition =
complexity
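A minimal sketch of the counter example, assuming a round-robin balancer in front of two independent instances. The class names and the balancing policy are illustrative assumptions, not taken from the talk. Each instance is simple on its own; the trouble appears only once the two are composed.

```python
from itertools import cycle


class CounterService:
    """A stateful service: each instance keeps its own private count."""

    def __init__(self) -> None:
        self._count = 0

    def next_value(self) -> int:
        self._count += 1
        return self._count


class RoundRobinBalancer:
    """Hypothetical load balancer that alternates between instances."""

    def __init__(self, instances) -> None:
        self._instances = cycle(instances)

    def next_value(self) -> int:
        return next(self._instances).next_value()


# One instance: every caller sees a unique, increasing value.
single = CounterService()
single_values = [single.next_value() for _ in range(6)]

# Two instances behind a balancer: each value is handed out twice.
balanced = RoundRobinBalancer([CounterService(), CounterService()])
balanced_values = [balanced.next_value() for _ in range(6)]

print(single_values)    # [1, 2, 3, 4, 5, 6]
print(balanced_values)  # [1, 1, 2, 2, 3, 3]
```

Adding a second instance for robustness breaks the uniqueness guarantee, because the state lives inside each component rather than in one constrained place.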
Part 1:
Decomposition
Rule:
Decompose vertically
[Diagram labels: App Server; Service #1; Service #2; Service #3; App Server]
Rule:
Separation of concerns
Example: Logging
[Diagram labels: App; Core Code; Logging Driver; Config; Logging Server]
Example: Logging
[Diagram labels: Logger; App; Core Code; Logging Driver; Config; Logging Server; StdOut]
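One way to read the logging example is that the app's core code only writes lines to StdOut, and a separate Logger component owns the driver and config needed to reach the Logging Server. The sketch below shows the app side only, under that assumption. The logger name, format string, and `handle_request` function are hypothetical.

```python
import io
import logging
import sys


def build_logger(stream=sys.stdout) -> logging.Logger:
    """Return a logger that writes plain lines to the given stream.

    The app knows nothing about the logging server; shipping the lines
    elsewhere is left to a separate component.
    """
    logger = logging.getLogger("app.core")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s %(message)s"))
    logger.addHandler(handler)
    return logger


def handle_request(logger: logging.Logger, order_id: int) -> str:
    """Hypothetical piece of core code that logs as a side concern."""
    logger.info("processing order %d", order_id)
    return f"order {order_id} done"


# In a container the stream would be stdout; a buffer is used here so the
# emitted line can be inspected.
buffer = io.StringIO()
result = handle_request(build_logger(buffer), 42)
captured = buffer.getvalue()
print(captured, end="")
```

With this split, changing the logging server or its driver does not require rebuilding or restarting the app.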
Aspect-oriented programming
Rule:
Constrain state
Relational DB
Session Store
Rule:
Battle-tested tools
Redis
MySQL
Rule:
High code churn → Easy restart
Rule:
No start-up order!
[Diagram sequence: items a, b, c, d drawn against a time axis; several frames carry x marks]
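A common way to avoid depending on start-up order is for each service to retry its dependencies until they answer, instead of assuming they are already up. The sketch below assumes exponential back-off and a dependency that signals unavailability with `ConnectionError`. `call_with_retry` and `FlakyDependency` are hypothetical names.

```python
import time


def call_with_retry(connect, attempts=5, base_delay=0.01, sleep=time.sleep):
    """Call connect() until it succeeds, backing off exponentially between tries."""
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # give up; a restart by the orchestrator can try again
            sleep(base_delay * 2 ** attempt)


class FlakyDependency:
    """Hypothetical dependency that only becomes reachable on the third call."""

    def __init__(self, ready_after=3):
        self.calls = 0
        self.ready_after = ready_after

    def connect(self):
        self.calls += 1
        if self.calls < self.ready_after:
            raise ConnectionError("dependency not up yet")
        return "connected"


dependency = FlakyDependency()
status = call_with_retry(dependency.connect)
print(status, dependency.calls)
```

The same loop covers a dependency that starts late and one that restarts in the middle of a run, so no component needs to know which came first.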
Rule:
Consider higher-order failure
The Rules
Decomposition
Decompose vertically
Separation of concerns
Constrain state
Battle-tested tools
High code churn, easy restart
No start-up order!
Consider higher-order failure
Orchestration and Synchronization
Managing Stateful Apps
Part 2:
Orchestration and
Synchronization
Rule:
Use Framework Restarts
• Mesos: Marathon always restarts
• Kubernetes: RestartPolicy=Always
• Docker: Swarm always restarts
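What "always restart" means can be sketched as a small supervision loop. This is an illustration of the idea only, not how Marathon, Kubernetes, or Swarm implement it. `supervise` and `CrashyWorker` are hypothetical, and the restart cap exists only to keep the sketch finite.

```python
def supervise(task, max_restarts=10):
    """Run task() and restart it after every crash, up to max_restarts times.

    Returns a (result, restarts) pair once the task completes.
    """
    restarts = 0
    while True:
        try:
            return task(), restarts
        except Exception:
            if restarts >= max_restarts:
                raise
            restarts += 1


class CrashyWorker:
    """Hypothetical worker that crashes twice before completing."""

    def __init__(self):
        self.runs = 0

    def __call__(self):
        self.runs += 1
        if self.runs <= 2:
            raise RuntimeError("simulated crash")
        return "finished"


worker = CrashyWorker()
outcome, restart_count = supervise(worker)
print(outcome, restart_count)
```

Leaning on the framework for this loop keeps recovery logic out of the application, provided the application tolerates being restarted at any point.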
Rule:
Create your own framework
[Diagram: Mesos Master with Framework Driver; three Mesos Agents, each with a Framework Executor]
Rule:
Use Synchronized State
Synchronized State
Tools:
- ZooKeeper
- etcd
- Consul
Patterns:
- leader election
- shared counters
- peer awareness
- work partitioning
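The leader election pattern can be sketched on top of a single atomic "create if absent" operation. `InMemoryStore` below is a hypothetical in-process stand-in for a coordination service and does not mirror any real client API. The key name and node names are also assumptions.

```python
import threading


class InMemoryStore:
    """Hypothetical stand-in for a coordination service; not a real client API."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def create_if_absent(self, key, value):
        """Atomically set key if it is unset; return True only for the winner."""
        with self._lock:
            if key in self._data:
                return False
            self._data[key] = value
            return True

    def get(self, key):
        with self._lock:
            return self._data.get(key)


def campaign(store, node_id, results):
    """Each node tries to claim the leader key; exactly one claim succeeds."""
    results[node_id] = store.create_if_absent("/leader", node_id)


store = InMemoryStore()
results = {}
threads = [
    threading.Thread(target=campaign, args=(store, f"node-{i}", results))
    for i in range(5)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

leader = store.get("/leader")
winners = [node for node, won in results.items() if won]
print(leader, winners)
```

A production setup would also need the leader key to disappear when its owner dies, so that the remaining peers can run a new election. That part is omitted here.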
Rule:
Minimize Synchronized State
Even battle-tested state management is a headache.
(Source: http://blog.cloudera.com/blog/2014/03/zookeeper-resilience-at-pinterest/)
The Rules
Decomposition
Decompose vertically
Separation of concerns
Constrain state
Battle-tested tools
High code churn, easy restart
No start-up order!
Consider higher-order failure
Orchestration and Synchronization
Use framework restarts
Create your own framework
Use synchronized state
Minimize synchronized state
Managing Stateful Apps
Part 3:
Managing Stateful Apps
Rule (repeat!):
Always use battle-tested tools!
(State is the weak point)
Rule:
Choose the DB architecture
Option 1: External DB
Execution cluster
Database cluster
Option 1: External DB
Pros
• Somebody else’s problem!
• Can use a DB designed for clustering directly
• Can use DB as a service
Cons
• Not really somebody else’s problem!
• Higher latency/no reference locality
• Can’t leverage orchestration, etc.
Option 2: Run on Raw HW
[Diagram: three nodes, each labeled HDFS, Mesos, Marathon, App]
Option 2: Run on Raw HW
Pros
• Use existing recipes
• Have local data
• Manage a single cluster
Cons
• Orchestration doesn’t help with failure
• Increased management complexity
Option 3: In-memory DB
[Diagram: three nodes, each labeled Mesos, Marathon, App, MemSQL]
Option 3: In-memory DB
Pros
• No need for volume tracking
• Fast
• Have local data
• Manage a single cluster
Cons
• Bets all machines won’t go down
• Bets on orchestration framework
Option 4: Use Orchestration
[Diagram: three nodes, each labeled Mesos, Marathon, App, Cassandra]
Option 4: Use Orchestration
Pros
• Orchestration manages volumes
• One model for all programs
• Have local data
• Single cluster
Cons
• Currently the least mature
• Not well supported by vendors
Option 5: Roll Your Own
[Diagram: three nodes, each labeled Mesos, Marathon, App, ImageMgr; plus a Mesos Master with Framework]
Option 5: Roll Your Own
Pros
• Very precise control
• You decide whether to use containers
• Have local data
• Can be system aware
Cons
• You’re on your own!
• Wedded to a single orchestration platform
• Not battle tested
Rule:
Have replication
The Rules
Decomposition
Decompose vertically
Separation of concerns
Constrain state
Battle-tested tools
High code churn, easy restart
No start-up order!
Consider higher-order failure
Orchestration and Synchronization
Use framework restarts
Create your own framework
Use synchronized state
Minimize synchronized state
Managing Stateful Apps
Battle-tested tools
Choose the DB architecture
Have replication
Fin
References
• Rich Hickey:
  “Are We There Yet?” (https://www.infoq.com/presentations/Are-We-There-Yet-Rich-Hickey)
  “Simple Made Easy” (https://www.infoq.com/presentations/Simple-Made-Easy-QCon-London-2012)
• David Greenberg, Building Applications on Mesos, O’Reilly, 2016
• Joe Johnston, et al., Docker in Production: Lessons from the Trenches, Bleeding Edge Press, 2015
