Tupperware:
Containerized Deployment at FB
Aravind Narayanan
aravindn@fb.com
DockerCon 2014
Scale makes everything harder
• Running single instance: easy
• Running at scale in production: messy and complicated
Provision machines
Distribute binaries
Daemonize process
Monitoring
Failover
Geo-distribution
Machine decoms
[Chart: time spent on application logic vs. time spent getting the app to run in prod]
Tupperware to the rescue
“This is my binary. Run it on X machines!”
• Engineer is hands-off
• Doesn’t need to worry about machines in prod
• Handles failover, when machines go bad
• Efficient use of infrastructure
• 300,000+ processes, spread over 15,000+ services
Agenda
1. Architecture
2. Sandboxes
3. Ecosystem
4. Lessons learnt
Facebook Datacenters
[Map: Facebook datacenter locations: PRN, SNC, FRC, ASH, LLA]
Terminology
• A DC has one or more clusters
• A cluster has multiple racks
• A rack has multiple machines
• A TW job is equivalent to a service
• A job has multiple tasks, each an instance of the service
Architecture
[Diagram: twdeploy sends config.tw to the Scheduler; the Scheduler starts tasks (QuoteService, Server, DB) on Host1–Host3; binaries are fetched from a BitTorrent-based binary package store]
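The deck doesn't show what goes inside config.tw. As a purely hypothetical sketch (every field name here is invented), the declarative "this is my binary, run it on X machines" spec might look like:

```python
# Hypothetical sketch of a Tupperware-style job config (all names invented).
# The engineer declares WHAT to run and how many replicas; the scheduler
# decides WHERE.

quote_service_job = {
    "name": "QuoteService",
    "package": "quote_service-1.0",   # fetched via the BitTorrent package store
    "command": "./quote_service --port=12345",
    "replicas": 3,                    # "Run it on X machines!"
    "limits": {"cpu": 2, "ram_mb": 4096, "disk_mb": 10240},
}

def validate(job):
    """Minimal validation: required fields present, positive replica count."""
    required = {"name", "package", "command", "replicas"}
    missing = required - job.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if job["replicas"] < 1:
        raise ValueError("replicas must be >= 1")
    return True
```

The point of the shape is that nothing in the spec names a machine: placement stays entirely in the scheduler's hands.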
[Chart: machine “failures” per hour]
Failover
[Diagram: Host1’s QuoteService stops heartbeating; the Scheduler starts a replacement QuoteService task on Host3, pulling the binary from the BitTorrent-based binary package store]
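The failover flow in the diagram can be sketched as a toy loop (not FB's code; the timeout value and class shape are assumptions): the scheduler tracks each task's last heartbeat and reschedules any task that has gone quiet for too long.

```python
# Toy sketch of heartbeat-driven failover. The scheduler marks a task dead
# after it misses heartbeats for longer than a timeout, then starts a
# replacement on another host. All names and values are illustrative.

HEARTBEAT_TIMEOUT = 30.0  # seconds (assumed value)

class Scheduler:
    def __init__(self, hosts):
        self.hosts = hosts          # available hosts (toy model)
        self.tasks = {}             # task_id -> (host, last_heartbeat_time)

    def start_task(self, task_id, host, now):
        self.tasks[task_id] = (host, now)

    def heartbeat(self, task_id, now):
        host, _ = self.tasks[task_id]
        self.tasks[task_id] = (host, now)

    def check_failures(self, now):
        """Restart every task whose heartbeat is older than the timeout."""
        replaced = []
        for task_id, (host, last) in list(self.tasks.items()):
            if now - last > HEARTBEAT_TIMEOUT:
                new_host = next(h for h in self.hosts if h != host)
                self.start_task(task_id, new_host, now)
                replaced.append((task_id, host, new_host))
        return replaced
```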
Painless Hardware maintenance
• Notify scheduler of impending operations
• Scheduler can preemptively move tasks
• Graceful migration for stateless services
• Stateful services may endure maintenance
Expressive allocation policies
[Diagrams: spread MyBigJob’s tasks (QuoteService, QuoteAggregator) across production hosts; avoid placing multiple NetworkHogJob tasks under the same top-of-rack switch; control how jobs M and N are co-located]
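One of the policies above, spreading a job's tasks across failure domains, can be sketched as follows (a toy model, not the real allocator; rack names and the function shape are invented):

```python
# Toy sketch of a "spread" allocation policy: place each replica in a
# distinct rack so one top-of-rack switch failure can't take out the whole
# job. hosts_by_rack maps rack name -> list of free hosts (invented layout).

def place_spread_by_rack(replicas, hosts_by_rack):
    """Return one host per replica, no two in the same rack."""
    placement = []
    used_racks = set()
    for rack, hosts in hosts_by_rack.items():
        if rack in used_racks or not hosts:
            continue
        placement.append(hosts[0])
        used_racks.add(rack)
        if len(placement) == replicas:
            return placement
    raise RuntimeError("not enough distinct racks to satisfy the policy")
```

A real scheduler combines many such constraints (racks, switches, co-location, anti-affinity); this shows only the single-constraint case.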
TW Agent
[Diagram: on each production host, the TW Agent process contains the Package Manager, Resource Manager, and Task Manager API and heartbeats with the Scheduler; each task (A, B, C) runs under its own Agent Helper]
Agent Helper process
• Heartbeats with the Agent
• Prevents zombies
[Diagram: the Agent Helper sits between the TW Agent process and Task A]
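The zombie-prevention half of the helper's job comes down to always reaping the task process. A minimal sketch of that idea (the heartbeating back to the agent is elided, and the function is invented for illustration):

```python
import os

# Toy sketch of the Agent-Helper idea: a small parent process forks the
# task, then blocks in waitpid() so the child is always reaped and can
# never linger as a zombie. POSIX-only (uses fork/exec).

def run_under_helper(argv):
    """Fork, exec argv in the child, reap it in the parent; return exit code."""
    pid = os.fork()
    if pid == 0:                      # child: become the task
        os.execvp(argv[0], argv)
    _, status = os.waitpid(pid, 0)    # parent: reap -> no zombie left behind
    return os.waitstatus_to_exitcode(status)
```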
Logging
• Compress on the fly
• Persistent logs
• Instantaneous rotation
[Diagram: Task A’s stdout and stderr flow through the Agent Helper to the Log Catcher, which writes compressed log files]
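A toy sketch of the two logging properties above (class and file-naming scheme invented): compressing as lines arrive, and making rotation just a close-and-reopen rather than a copy.

```python
import gzip

# Toy sketch of a compress-on-the-fly log catcher. Output is written
# straight through a gzip stream, and "instantaneous rotation" is simply
# closing one compressed file and opening the next.

class LogCatcher:
    def __init__(self, path):
        self.path = path
        self.gen = 0
        self.out = gzip.open(f"{path}.{self.gen}.gz", "wt")

    def write(self, line):
        self.out.write(line + "\n")

    def rotate(self):
        """Instantaneous rotation: no copying, just start a new file."""
        self.out.close()
        self.gen += 1
        self.out = gzip.open(f"{self.path}.{self.gen}.gz", "wt")

    def close(self):
        self.out.close()
```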
Sandboxing
Initially, used chroots to contain processes
• No isolation
• Not secure
Linux Containers (LXC)
• As tech matured, we switched
• Separate process and file
namespaces, set up by Helper
• Mount required resources
directly into container
• Secure & isolated
Service permissions
• Every container runs sshd
• SSH directly into the container
• Regulate access
[Diagram: SSH connects directly into Task A’s container via the Agent Helper, not to the production host itself]
Configuring the container
Resource limits
• CPU, RAM & disk limits
• Implemented with cgroups
• Agent handles memory limits
with cgroup notification API
• Adaptive limits
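In the cgroup v1 layout of that era, limits like these are plain files under a cgroup directory, and setting a limit means writing a value into the file. A hedged sketch (paths and values are illustrative; actually applying them needs root and a mounted cgroupfs):

```python
# Toy sketch of enforcing CPU/RAM limits via cgroup v1 control files.
# cpu.shares and memory.limit_in_bytes are real v1 file names; the helper
# functions and the cgroup directory layout used here are illustrative.

def cgroup_writes(cgroup_dir, cpu_shares, ram_bytes):
    """Return the (path, value) pairs that would set the limits."""
    return [
        (f"{cgroup_dir}/cpu/cpu.shares", str(cpu_shares)),
        (f"{cgroup_dir}/memory/memory.limit_in_bytes", str(ram_bytes)),
    ]

def apply_limits(cgroup_dir, cpu_shares, ram_bytes):
    for path, value in cgroup_writes(cgroup_dir, cpu_shares, ram_bytes):
        with open(path, "w") as f:   # requires root and a mounted cgroupfs
            f.write(value)
```

Splitting the "what to write" from the "write it" keeps the limit computation testable without privileges; the cgroup notification API mentioned above would additionally let the agent react before a hard kill.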
Resource Limits in action
[Chart: watchdog-service memory usage (tw.mem.rss_bytes)]
Migrate from Chroots to Containers
• No-op for most services
• But new namespaces posed problems for some
• Major hurdle was social, not technical
Service Discovery
[Diagram: the Scheduler updates the Service Registry as tasks start and die. Initially Host1:12345 and Host2:12345 are ALIVE for QuoteService; after Host1 fails, Host1:12345 is marked DEAD, the Scheduler starts a replacement on Host3, and the registry shows Host2:12345 and Host3:12345 ALIVE]
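The registry flow in the diagram can be sketched as a tiny in-memory model (class and method names invented): the scheduler records each endpoint's state, and clients only ever ask for the live set.

```python
# Toy sketch of a Service Registry: the scheduler writes endpoint states,
# clients read only the ALIVE endpoints for a service.

class ServiceRegistry:
    def __init__(self):
        self.entries = {}  # service -> {endpoint: state}

    def set_state(self, service, endpoint, state):
        self.entries.setdefault(service, {})[endpoint] = state

    def alive(self, service):
        """The endpoints a client should connect to right now."""
        return sorted(ep for ep, st in self.entries.get(service, {}).items()
                      if st == "ALIVE")
```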
Monitoring & Alerting
Alternatives to Tupperware
• Why not use Docker / CoreOS?
• They didn’t exist
• TW integrates with other FB systems
• Why not use VMs?
• Performance penalty
• Hypervisor makes debugging harder
Lessons learnt
Releases are scary!
• Release often
• Dry runs
• Canaries are your friends
• Manage dependencies
Sane defaults
• Users shouldn’t have to read entire manual
• Choose what makes sense for most services
What went wrong?
• Hard to understand why TW did something
• It’s not about “what went wrong”, but “what should I do next?”
Tupperware
• Automated deployment
• Less work for engineers
• Containers for security
and isolation
• Increased efficiency
Questions?