Docker at SpareFoot
Lessons From a Journey to Production
DevOps Days Austin
May 3, 2016
Who am I?
Steve Woodruff
❏ Director of DevOps at SpareFoot, implementing CI/CD
❏ Spent 10+ years at Motorola doing embedded development (C, C++)
❏ Spent 5 years at IBM as a sys admin in a large server farm (Linux, AIX, Solaris)
swoodruff@sparefoot.com
Twitter: @sjwoodr
GitHub: sjwoodr
● Think Hotels.com for self storage*
● All infrastructure in AWS
● 40 Developers on 7 Teams
○ Continuous Delivery
● Docker in production since 2014
*This kind of storage:
The Beginning: SpareFoot + Docker
Hackathon! Docker + Fig (now Compose) allowed us to run production architecture locally.
Yim - Call Center Application
Used exclusively by our call center
Chrome ONLY
Node version n+1
React + Flux
CI and deployments
Janky shell scripts… slow builds, etc…
Used Bamboo to build images
feature branches were built/deployed to Dev
master branch was built/deployed to Staging
Dynamically created custom container start script
Tried to auto-detect when the containers started to begin post-deploy test
Build times were rather long
Spent an awfully long time doing docker push (to our registry) and docker pull (on the target hosts)
OK, so Docker feels like a solution
… and we kind of know how to do this. But...
● Continuous Integration / Delivery?
○ Docker Registry
○ Bamboo
○ Deployments
● Host Volumes and Port Forwarding rules?
○ Not saved with the source code
● Get Docker to run in local, dev, staging, and production environments?
○ Configuration?
Docker in Production (technically)!
We had 2 load-balanced EC2 instances running a node app.
Now we have 2 load-balanced EC2 instances running docker containers that run a node app!
[Diagram: ELB on port 443 in front of two App 1 instances on port 3000, before and after the move to containers]
Yim: Trouble in Docker Paradise
Hosting our own Docker registry was a bad idea
Stability was a problem
No level of access control on the registry itself
Mimicking servers - 1 container per host. Need orchestration please!
Amazon Linux AMI -> old version of Docker… doh!
Docker push/pull to/from registry was very slow
build - push to registry
deploy - pull from registry to each host, serially
Performance was fine….
But stability was the issue
This internal-facing nodejs app was moved to a pair of EC2 instances and out of Docker after about 4 months of pain and suffering
Yim: Lessons Learned
We need orchestration
Rolling our own docker deployments was confusing to OPS and to the Dev team
Our own docker registry is a bad idea
Stability was a problem
No level of access control on the registry itself
Our S3 backend would grow by several GB per month with no automated cleanup
No easy way to roll back failed deploys
Just fix it and deploy again...
All this culminated in a poor build process and affected CI velocity
Longer builds, longer deploys, no real gain
Like everyone else....
...we were “deconstructing the monolith”
[Diagram, before: Application, Monolithic Library, Data]
[Diagram, after: Application, API Gateway, four Microservices (REST API, Data)]
A Better Docker Registry
With Yim we learned that rolling our own Registry was a bad idea.
Limited Access Control
We have to maintain it
Let’s try Quay...
Has Access Control
Robots, yusss!
We don’t have to maintain it
We’ve learned some things...
● Easier than we thought
● Quay was the glue we needed
○ Use an off-the-shelf solution.
○ We like Quay.io
● Bolting on to our existing CI pipeline worked really well.
○ Developers didn’t have to learn a new process
○ Microservice consumers can pull tagged versions
○ We can automate tests against all versions
Now we talk containers from local -> dev -> staging but NOT in production.
[Diagram: BRANCH A and MASTER build Service1 images for Dev, Staging, and Production, tagged service1:dev-branch-name, service1:stage, and service1:prod]
Production - What is still needed
Orchestration
Yim sucked because we tried to do this ourselves
Better Deployments
With rollbacks
Configuration Management
We have things to hide
Production - Orchestration
Production - Software Selection
Choosing orchestration software / container service in early 2015
StackEngine
Lacked docker-compose support
Kubernetes
PhD Required
Mesosphere
Nice, but slow to deploy
EC2 Container Service
Lacked docker-compose support and custom AMIs
Tutum (now Docker Cloud)
Rancher
Production - Enter Rancher
After running proof-of-concepts of both Tutum and Rancher, we decided to continue down our path to production deploys with Rancher.
Had more mature support for docker-compose files.
Tutum added this after our evaluation had ended
Did not require us to orchestrate the deployments through their remote endpoint
Rancher server runs on our EC2 instances and we are in full control of all the things
Had a full API we can work with in addition to the custom rancher-compose cli
Had a very active user community and a beta forum where the Rancher development team was active in answering questions and even troubleshooting configuration problems.
Overlaying Docker on AWS
[Diagram layers: ELB, EC2, Containers]
Why the extra HAProxy layer?
Allows us to create the ELBs and leave them alone
When we deploy new versioned services we update the service alias / haproxy links
Allows for fast rollback to previous version of the service
Deployments and Rollbacks
Developers can deploy to production whenever they want
HipChat bot commands to deploy and rollback/revert
Deployments to each of the 3 environments use rancher-compose to:
Deploy new versioned services / containers
Create or update service aliases / haproxy links
Delete previous versioned services except for current and previous
When things go haywire…
We simply roll back
Production deploy creates a docker-compose-rollback.yml file
Query Rancher API to get list of running services
Allows us to change haproxy and service alias links back to the previous version
Super fast to roll back, no containers need to be spun up!
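The reason a rollback needs no new containers can be sketched in a few lines. This is not Rancher or HAProxy code, only an in-memory model under stated assumptions: the `AliasRouter` class, its method names, and the service names are all hypothetical, and the real deploy works through rancher-compose and the Rancher API rather than a Python dict.

```python
from dataclasses import dataclass, field


@dataclass
class AliasRouter:
    """Hypothetical model of service alias / haproxy links (not a Rancher API)."""
    links: dict = field(default_factory=dict)     # alias -> live versioned service
    previous: dict = field(default_factory=dict)  # alias -> live-1 versioned service

    def deploy(self, alias, new_service):
        # Remember what the alias pointed at, then repoint it to the new version.
        if alias in self.links:
            self.previous[alias] = self.links[alias]
        self.links[alias] = new_service

    def rollback(self, alias):
        # The previous version's containers are still running, so rolling back
        # is only a link change: nothing has to be started.
        if alias not in self.previous:
            raise LookupError(f"no previous version recorded for {alias}")
        self.links[alias] = self.previous.pop(alias)
        return self.links[alias]


router = AliasRouter()
router.deploy("bookingservice", "bookingservice-prod-v1")
router.deploy("bookingservice", "bookingservice-prod-v2")
restored = router.rollback("bookingservice")
```

Keeping exactly one previous version running is what makes the link switch possible, which matches the cleanup rule of deleting everything except current and previous.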
Secret Management
We’re already using SaltStack to manage our EC2 minions (VMs)
Salt Grains are used for some common variables used in salt states
Salt Pillar data holds configuration that is available only to certain minions
This Salt Pillar data is already broken down by environment (dev/stage/prod)
We should just use this data to dynamically create the docker-compose and rancher-compose files!
Technical Challenge - docker-compose
We needed to support a single docker-compose.yml file, maintained by developers of an app or service
They don’t want to maintain local, dev, stage, and prod versions of this file
Changes to multiple files would be error-prone
Must support differences in the architecture or configuration of services across environments
Secret Secret, I’ve got a Secret
A templated rancher-compose file
{% set sf_env = grains['bookingservice-env'] %}
{% set version = grains['bookingservice-version'] %}
bookingservice-{{ sf_env }}-{{ version }}:
  scale: 1
We use a scale of 1 because we use global host scheduling combined with host affinity so that one container of this service is deployed to each VM of the specified environment (dev/stage/prod). This allows us to spin up a new Rancher host and easily deploy to the new host VM.
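The effect of global scheduling plus host affinity can be pictured with a small sketch. It does not call Rancher; the `sf_env` host label, the host names, and the `global_placements` function are hypothetical and only illustrate "one container on every host of the target environment".

```python
def global_placements(hosts, target_env):
    """Return the hosts that would each receive one container of the service.

    hosts maps a host name to its labels; the "sf_env" label is an assumed
    stand-in for whatever label the host affinity rule matches on.
    """
    return sorted(
        name for name, labels in hosts.items()
        if labels.get("sf_env") == target_env
    )


HOSTS = {
    "rancher-host-1": {"sf_env": "prod"},
    "rancher-host-2": {"sf_env": "prod"},
    "rancher-host-3": {"sf_env": "dev"},
}

prod_before = global_placements(HOSTS, "prod")

# Adding a new prod host grows the placement without touching the scale value.
HOSTS["rancher-host-4"] = {"sf_env": "prod"}
prod_after = global_placements(HOSTS, "prod")
```

In this model the number of containers follows the number of matching hosts, which is why the template can leave the scale at 1.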
A templated docker-compose file
A Closer Look
MYSQL_SPAREFOOT_HOST: {{ salt['pillar.get']('bookingservice-dev:MYSQL_SPAREFOOT_HOST') }}
MYSQL_SPAREFOOT_DB: {{ salt['pillar.get']('bookingservice-dev:MYSQL_SPAREFOOT_DB') }}
MYSQL_SPAREFOOT_USER: {{ salt['pillar.get']('bookingservice-dev:MYSQL_SPAREFOOT_USER') }}
MYSQL_SPAREFOOT_PASS: {{ salt['pillar.get']('bookingservice-dev:MYSQL_SPAREFOOT_PASS') }}
MYSQL_SPAREFOOT_PORT: {{ salt['pillar.get']('bookingservice-dev:MYSQL_SPAREFOOT_PORT') }}
APP_LOG_FILE: {{ salt['pillar.get']('bookingservice-dev:APP_LOG_FILE') }}
REDIS_HOST: {{ salt['pillar.get']('bookingservice-dev:REDIS_HOST') }}
REDIS_PORT: {{ salt['pillar.get']('bookingservice-dev:REDIS_PORT') }}
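The lookups above use colon-delimited keys of the form `service-env:NAME`. As a rough illustration of how such a key resolves against nested pillar data, here is a stand-alone sketch. The `pillar_get` helper is a simplified stand-in written for this example, not Salt's implementation, and the pillar contents and host names are invented.

```python
def pillar_get(pillar, key, default="", delimiter=":"):
    """Walk a nested dict using a colon-delimited key (simplified stand-in)."""
    node = pillar
    for part in key.split(delimiter):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


# Invented example data: one pillar per service, split by target environment.
PILLAR = {
    "bookingservice-dev": {"REDIS_HOST": "redis.dev.example.internal", "REDIS_PORT": 6379},
    "bookingservice-prod": {"REDIS_HOST": "redis.prod.example.internal", "REDIS_PORT": 6379},
}


def render_environment(pillar, service, env, names):
    """Build the environment block for one service in one target environment."""
    return {name: pillar_get(pillar, f"{service}-{env}:{name}") for name in names}


dev_env = render_environment(PILLAR, "bookingservice", "dev", ["REDIS_HOST", "REDIS_PORT"])
prod_env = render_environment(PILLAR, "bookingservice", "prod", ["REDIS_HOST", "REDIS_PORT"])
```

The same list of variable names yields different values per environment, so a single compose template can serve dev, stage, and prod.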
Deployments with rancher-compose
Deployments to Dev and Staging are done via Bamboo
Deployments to Production are done by developers via HipChat commands
In the end, everything is invoking our salt-deploy.py script
Set some salt grains for target env, version, buildid, image tag in quay.io
Services get versioned with a timestamp and bamboo build id
Render jinja2 / Inject Salt grains and pillar data via salt minion python code
caller.sminion.functions['cp.get_template'](cwd + '/docker-compose.yml', cwd + '/docker-compose-salt.yml')
caller.sminion.functions['cp.get_template'](cwd + '/rancher-compose.yml', cwd + '/rancher-compose-salt.yml')
Invokes rancher-compose create / up
Cleanup keeps the live version of a service and the live-1 version. The rest are purged.
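The versioning and cleanup steps can be sketched as follows. The slides do not show the actual name format or the cleanup code, so everything here is an assumption: the `service-env-timestamp-buildid` layout, the function names, and the simplification that the newest version is always the live one.

```python
from datetime import datetime, timezone


def versioned_name(service, env, build_id, now):
    """Compose a versioned service name from a timestamp and a build id (assumed format)."""
    stamp = now.strftime("%Y%m%d%H%M%S")  # fixed width, so names sort chronologically
    return f"{service}-{env}-{stamp}-{build_id}"


def services_to_purge(names, keep=2):
    """Return every versioned service except the newest `keep` (live and live-1)."""
    ordered = sorted(names)
    if keep <= 0:
        return ordered
    return ordered[:-keep]


deployed = [
    versioned_name("bookingservice", "prod", 415, datetime(2016, 5, 1, 9, 0, 0, tzinfo=timezone.utc)),
    versioned_name("bookingservice", "prod", 416, datetime(2016, 5, 2, 9, 0, 0, tzinfo=timezone.utc)),
    versioned_name("bookingservice", "prod", 417, datetime(2016, 5, 3, 12, 0, 0, tzinfo=timezone.utc)),
]
purge = services_to_purge(deployed)
```

A real cleanup would need to ask Rancher which version the alias currently points at, since after a rollback the live version is no longer the newest one.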
Surprise! Rancher Adds Variable Support
Does the support for interpolating variables, added in Rancher 0.41, deprecate the work we've done with Salt and rendering jinja2 templates?
No. We already maintain data in grains and pillars so we just reuse that data.
Rancher's implementation uses the environment variables on the host running rancher-compose to fill in the blanks.
It would require logic to load those env variables based on the target env (dev/stage/prod), so we might as well get the data out of Salt pillar, which has separate pillars for each service, broken down by target environment.
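The extra logic mentioned above can be illustrated with Python's `string.Template`, used here only as a stand-in for variable interpolation. The `${VAR}` placeholder style, the `SETTINGS` table, and the `interpolate` function are assumptions for this sketch, not Rancher's implementation.

```python
from string import Template

# Invented per-environment values; with plain interpolation something has to
# choose the right set before the compose file is processed.
SETTINGS = {
    "dev": {"REDIS_HOST": "redis.dev.example.internal", "REDIS_PORT": "6379"},
    "prod": {"REDIS_HOST": "redis.prod.example.internal", "REDIS_PORT": "6379"},
}


def interpolate(template_text, target_env, settings):
    """Fill ${VAR} placeholders from the values for one target environment."""
    return Template(template_text).substitute(settings[target_env])


line = "REDIS_HOST: ${REDIS_HOST}"
dev_line = interpolate(line, "dev", SETTINGS)
prod_line = interpolate(line, "prod", SETTINGS)
```

The per-environment table in this sketch is exactly the data that already lives in Salt pillar, which is the argument for continuing to render from pillar instead.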
So we deployed our first microservice and...
...Everything worked...
… Until it didn’t.
The Day Rancher Died
[Diagram layers: ELB, EC2, Containers]
Where are we now?
52 Microservices in production with Rancher + Docker
5-10 Deployments per day on average
Busiest services handling around 50 requests / second
Consumer-facing applications being containerized in development
New teams cutting their teeth
Keep on “Strangling”*
* DO NOT: google image search for “strangling hands”
Finally
Start small
Fail (a lot)
Move on and apply what you learned
Thank you!
These Slides: http://bit.ly/1SVGaRA
Reach out:
● Steve (swoodruff@sparefoot.com, Twitter @sjwoodr)
Questions?
