SlideShare a Scribd company logo
Containerd Internals:
Building a Core Container
Runtime
Stephen Day, Docker
Phil Estes, IBM
September 11, 2017
#OSSummit
A Brief History
APRIL 2016 Containerd “0.2” announced, Docker 1.11
DECEMBER 2016Announce expansion of containerd OSS project
Management/Supervisor for the OCI runc executor
Containerd 1.0: A core container runtime project for the industry
MARCH 2017 Containerd project contributed to
CNCF
runc
containerd
Why Containerd 1.0?
▪ Continue projects spun out
from monolithic Docker engine
▪ Expected use beyond Docker
engine (Kubernetes CRI)
▪ Donation to foundation for
broad industry collaboration
▫ Similar to runc/libcontainer
and the OCI
Technical Goals/Intentions
▪ Clean gRPC-based API + client library
▪ Full OCI support (runtime and image spec)
▪ Stability and performance with tight, well-
defined core of container function
▪ Decoupled systems (image, filesystem,
runtime) for pluggability, reuse
Requirements
- A la carte: use only what is required
- Runtime agility: fits into different platforms
- Pass-through container configuration (direct OCI)
- Decoupled
- Use known-good technology
- OCI container runtime and images
- gRPC for API
- Prometheus for Metrics
Runtimes
Metadata
Architecture
ContainersContent DiffSnapshot Tasks EventsImages
GRPC Metrics
Runtimes
Storage
OS
Architecture
containerd
OS (Storage, FS, Networking Runtimes
API Client
(moby, containerd-cri, etc.)
Containerd: Rich Go API
Getting Started
https://github.com/containerd/containerd/blob/master/docs/getting-started.md
GoDoc
https://godoc.org/github.com/containerd/containerd
containerd
Events
EventsPublish Subscribe
# HELP container_blkio_io_service_bytes_recursive_bytes The blkio io service bytes recursive
# TYPE container_blkio_io_service_bytes_recursive_bytes gauge
container_blkio_io_service_bytes_recursive_bytes{container_id="foo4",device="/dev/nvme0n1",major="259",minor="0",namespace="default",op="Async"} 1.07159552e+08
container_blkio_io_service_bytes_recursive_bytes{container_id="foo4",device="/dev/nvme0n1",major="259",minor="0",namespace="default",op="Read"} 0
container_blkio_io_service_bytes_recursive_bytes{container_id="foo4",device="/dev/nvme0n1",major="259",minor="0",namespace="default",op="Sync"} 81920
container_blkio_io_service_bytes_recursive_bytes{container_id="foo4",device="/dev/nvme0n1",major="259",minor="0",namespace="default",op="Total"} 1.07241472e+08
container_blkio_io_service_bytes_recursive_bytes{container_id="foo4",device="/dev/nvme0n1",major="259",minor="0",namespace="default",op="Write"} 1.07241472e+08
# HELP container_blkio_io_serviced_recursive_total The blkio io servied recursive
# TYPE container_blkio_io_serviced_recursive_total gauge
container_blkio_io_serviced_recursive_total{container_id="foo4",device="/dev/nvme0n1",major="259",minor="0",namespace="default",op="Async"} 892
container_blkio_io_serviced_recursive_total{container_id="foo4",device="/dev/nvme0n1",major="259",minor="0",namespace="default",op="Read"} 0
container_blkio_io_serviced_recursive_total{container_id="foo4",device="/dev/nvme0n1",major="259",minor="0",namespace="default",op="Sync"} 888
container_blkio_io_serviced_recursive_total{container_id="foo4",device="/dev/nvme0n1",major="259",minor="0",namespace="default",op="Total"} 1780
container_blkio_io_serviced_recursive_total{container_id="foo4",device="/dev/nvme0n1",major="259",minor="0",namespace="default",op="Write"} 1780
# HELP container_cpu_kernel_nanoseconds The total kernel cpu time
# TYPE container_cpu_kernel_nanoseconds gauge
container_cpu_kernel_nanoseconds{container_id="foo4",namespace="default"} 2.6e+08
# HELP container_cpu_throttle_periods_total The total cpu throttle periods
# TYPE container_cpu_throttle_periods_total gauge
container_cpu_throttle_periods_total{container_id="foo4",namespace="default"} 0
# HELP container_cpu_throttled_periods_total The total cpu throttled periods
# TYPE container_cpu_throttled_periods_total gauge
container_cpu_throttled_periods_total{container_id="foo4",namespace="default"} 0
# HELP container_cpu_throttled_time_nanoseconds The total cpu throttled time
# TYPE container_cpu_throttled_time_nanoseconds gauge
container_cpu_throttled_time_nanoseconds{container_id="foo4",namespace="default"} 0
# HELP container_cpu_total_nanoseconds The total cpu time
# TYPE container_cpu_total_nanoseconds gauge
container_cpu_total_nanoseconds{container_id="foo4",namespace="default"} 1.003301578e+09
# HELP container_cpu_user_nanoseconds The total user cpu time
# TYPE container_cpu_user_nanoseconds gauge
container_cpu_user_nanoseconds{container_id="foo4",namespace="default"} 7e+08
# HELP container_hugetlb_failcnt_total The hugetlb failcnt
# TYPE container_hugetlb_failcnt_total gauge
container_hugetlb_failcnt_total{container_id="foo4",namespace="default",page="1GB"} 0
container_hugetlb_failcnt_total{container_id="foo4",namespace="default",page="2MB"} 0
# HELP container_hugetlb_max_bytes The hugetlb maximum usage
# TYPE container_hugetlb_max_bytes gauge
container_hugetlb_max_bytes{container_id="foo4",namespace="default",page="1GB"} 0
container_hugetlb_max_bytes{container_id="foo4",namespace="default",page="2MB"} 0
# HELP container_hugetlb_usage_bytes The hugetlb usage
# TYPE container_hugetlb_usage_bytes gauge
container_hugetlb_usage_bytes{container_id="foo4",namespace="default",page="1GB"} 0
container_hugetlb_usage_bytes{container_id="foo4",namespace="default",page="2MB"} 0
# HELP container_memory_active_anon_bytes The active_anon amount
# TYPE container_memory_active_anon_bytes gauge
container_memory_active_anon_bytes{container_id="foo4",namespace="default"} 2.658304e+06
# HELP container_memory_active_file_bytes The active_file amount
# TYPE container_memory_active_file_bytes gauge
container_memory_active_file_bytes{container_id="foo4",namespace="default"} 7.319552e+06
# HELP container_memory_cache_bytes The cache amount used
# TYPE container_memory_cache_bytes gauge
container_memory_cache_bytes{container_id="foo4",namespace="default"} 5.0597888e+07
# HELP container_memory_dirty_bytes The dirty amount
# TYPE container_memory_dirty_bytes gauge
container_memory_dirty_bytes{container_id="foo4",namespace="default"} 28672
Metrics
Pulling an Image
What do runtimes need?
{
"schemaVersion": 2,
"mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
"manifests": [
{
"mediaType": "application/vnd.docker.distribution.manifest.v1+json",
"size": 2094,
"digest": "sha256:7820f9a86d4ad15a2c4f0c0e5479298df2aa7c2f6871288e2ef8546f3e7b6783",
"platform": {
"architecture": "ppc64le",
"os": "linux"
}
},
{
"mediaType": "application/vnd.docker.distribution.manifest.v1+json",
"size": 1922,
"digest": "sha256:ae1b0e06e8ade3a11267564a26e750585ba2259c0ecab59ab165ad1af41d1bdd",
"platform": {
"architecture": "amd64",
"os": "linux",
"features": [
"sse"
]
}
},
{
"mediaType": "application/vnd.docker.distribution.manifest.v1+json",
"size": 2084,
"digest": "sha256:e4c0df75810b953d6717b8f8f28298d73870e8aa2a0d5e77b8391f16fdfbbbe2",
"platform": {
"architecture": "s390x",
"os": "linux"
}
},
{
"mediaType": "application/vnd.docker.distribution.manifest.v1+json",
"size": 2084,
"digest": "sha256:07ebe243465ef4a667b78154ae6c3ea46fdb1582936aac3ac899ea311a701b40",
"platform": {
"architecture": "arm",
"os": "linux",
"variant": "armv7"
}
},
{
"mediaType": "application/vnd.docker.distribution.manifest.v1+json",
"size": 2090,
"digest": "sha256:fb2fc0707b86dafa9959fe3d29e66af8787aee4d9a23581714be65db4265ad8a",
"platform": {
"architecture": "arm64",
"os": "linux",
"variant": "armv8"
}
}
]
Image Formats
Index (Manifest List)
linux amd64
linux ppc64le
windows amd64
Manifests:
Manifest
linux arm64
Layers:
Config:
L0
L1
Ln
Root Filesystem
/usr
/bin
/dev
/etc
/home
/lib
C
OCI Spec
process
args
env
cwd
…
root
mounts
Docker and OCI
Content Addressability
digest.FromString(“foo”) ->
“sha256:2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae”
digest.FromString(“foo tampered”) ->
“sha256:51f7f1d1f6bebed72b936c8ea257896cb221b91d303c5b5c44073fce33ab8dd8”
digest.FromString(“bar sha256:2c...”) ->
“sha256:2e94890c66fbcccca9ad680e1b1c933cc323a5b4bcb14cc8a4bc78bb88d41055”
“foo”
“bar sha256:2c…”
“foo tampered”
“bar sha256:2c…”
{
"schemaVersion": 2,
"mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
"manifests": [
{
"mediaType": "application/vnd.docker.distribution.manifest.v1+json",
"size": 2094,
"digest": "sha256:7820f9a86d4ad15a2c4f0c0e5479298df2aa7c2f6871288e2ef8546f3e7b6783",
"platform": {
"architecture": "ppc64le",
"os": "linux"
}
},
{
"mediaType": "application/vnd.docker.distribution.manifest.v1+json",
"size": 1922,
"digest": "sha256:ae1b0e06e8ade3a11267564a26e750585ba2259c0ecab59ab165ad1af41d1bdd",
"platform": {
"architecture": "amd64",
"os": "linux",
"features": [
"sse"
]
}
},
{
"mediaType": "application/vnd.docker.distribution.manifest.v1+json",
"size": 2084,
"digest": "sha256:e4c0df75810b953d6717b8f8f28298d73870e8aa2a0d5e77b8391f16fdfbbbe2",
"platform": {
"architecture": "s390x",
"os": "linux"
}
},
{
"mediaType": "application/vnd.docker.distribution.manifest.v1+json",
"size": 2084,
"digest": "sha256:07ebe243465ef4a667b78154ae6c3ea46fdb1582936aac3ac899ea311a701b40",
"platform": {
"architecture": "arm",
"os": "linux",
"variant": "armv7"
}
},
{
"mediaType": "application/vnd.docker.distribution.manifest.v1+json",
"size": 2090,
"digest": "sha256:fb2fc0707b86dafa9959fe3d29e66af8787aee4d9a23581714be65db4265ad8a",
"platform": {
"architecture": "arm64",
"os": "linux",
"variant": "armv8"
}
}
]
Image Formats
Docker and OCI
Index (Manifest List)
linux amd64
linux ppc64le
windows amd64
Manifests:
Manifest
linux arm64
Layers:
Config:
L0
Ln
C
Digest
Layer File 0
Layer File 1
Layer File N
L1
Resolution
Getting a digest from a name:
15
ubuntu
sha256:71cd81252a3563a03ad8daee81047b62ab5d892ebbfbf71cf53415f29c130950
Pulling an Image
Data Flow
Content Images Snapshots
Pull
Fetch Unpack
Events
Remote
Mounts
Remotes
Locators and Resolution
type Fetcher interface {
Fetch(ctx context.Context, id string, hints ...string) (io.ReadCloser,
error)
}
type Resolver interface {
Resolve(ctx context.Context, locator string) (Fetcher, error)
}
fetcher := resolver.Resolve("docker.io/library/ubuntu")
Endlessly Configurable!
(hint: think git remotes)
Example: Pull an Image
Via ctr client:
$ export 
CONTAINERD_NAMESPACE=example
$ ctr pull 
docker.io/library/redis:alpine
$ ctr image ls
...
import (
"context"
"github.com/containerd/containerd"
"github.com/containerd/containerd/namespaces"
)
// connect to our containerd daemon
client, err := containerd.New("/run/containerd/containerd.sock")
defer client.Close()
// set our namespace to “example”:
ctx := namespaces.WithNamespace(context.Background(), "example")
// pull the alpine-based redis image from DockerHub:
image, err := client.Pull(ctx,
"docker.io/library/redis:alpine",
containerd.WithPullUnpack)
Snapshotters
How do you build a container root filesystem?
Snapshots
● No mounting, just returns mounts!
● Explicit active (rw) and committed (ro)
● Commands represent lifecycle
● Reference key chosen by caller (allows
using content addresses)
● No tars and no diffs
Evolved from Graph Drivers
● Simple layer relationships
● Small and focused interface
● Non-opinionated string keys
● External Mount Lifecycle
type Snapshotter interface {
Stat(key string) (Info, error)
Mounts(key string) ([]containerd.Mount, error)
Prepare(key, parent string) ([]containerd.Mount,
error)
View(key, parent string) ([]containerd.Mount,
error)
Commit(name, key string) error
Remove(key string) error
Walk(fn func(Info) error) error
}
type Info struct {
Name string // name or key of snapshot
Parent string
Kind Kind
}
type Kind int
const (
KindView
KindActive
KindCommitted
)
Active Committed
Prepare(a, P0)
Commit(P1, a′)
Snapshot Model
P0a
a′
a′′
P1
P2
Commit(P2, a′′)
Remove(c)
Prepare(a′′, P1)
Example: Investigating Root
Filesystem
$ ctr snapshot ls
…
$ ctr snapshot tree
…
$ ctr snapshot mounts <target> <id>
Pulling an Image
1.Resolve manifest or index (manifest list)
2.Download all the resources referenced by the
manifest
3.Unpack layers into snapshots
4.Register the mappings between manifests and
constituent resources
Pulling an Image
Data Flow
Content Images Snapshots
Pull
Fetch Unpack
Events
Remote
Mounts
Starting a Container
1.Initialize a root filesystem (RootFS) from
snapshot
2.Setup OCI configuration (config.json)
3.Use metadata from container and snapshotter
to specify config and mounts
4.Start process via the task service
Starting a Container
Images Snapshot
Run
Initialize Start
Events
Running
Containers
Containers Tasks
Setup
Example: Run a Container
Via ctr client:
$ export 
CONTAINERD_NAMESPACE=example
$ ctr run -t 
docker.io/library/redis:alpine 
redis-server
$ ctr c
...
// create our container object and config
container, err := client.NewContainer(ctx,
"redis-server",
containerd.WithImage(image),
containerd.WithNewSpec(containerd.WithImageConfig(ima
ge)),
)
defer container.Delete()
// create a task from the container
task, err := container.NewTask(ctx, containerd.Stdio)
defer task.Delete(ctx)
// make sure we wait before calling start
exitStatusC, err := task.Wait(ctx)
// call start on the task to execute the redis server
if err := task.Start(ctx); err != nil {
return err
}
Example: Kill a Task
Via ctr client:
$ export 
CONTAINERD_NAMESPACE=example
$ ctr t kill redis-server
$ ctr t ls
...
// make sure we wait before calling start
exitStatusC, err := task.Wait(ctx)
time.Sleep(3 * time.Second)
if err := task.Kill(ctx, syscall.SIGTERM); err != nil {
return err
}
// retrieve the process exit status from the channel
status := <-exitStatusC
code, exitedAt, err := status.Result()
if err != nil {
return err
}
// print out the exit code from the process
fmt.Printf("redis-server exited with status: %dn", code)
Example: Customize OCI Configuration
// WithHtop configures a container to monitor the host via `htop`
func WithHtop(s *specs.Spec) error {
// make sure we are in the host pid namespace
if err := containerd.WithHostNamespace(specs.PIDNamespace)(s); err != nil
{
return err
}
// make sure we set htop as our arg
s.Process.Args = []string{"htop"}
// make sure we have a tty set for htop
if err := containerd.WithTTY(s); err != nil {
return err
}
return nil
}
With{func} functions cleanly separate modifiers
Customization
▪ Linux Namespaces -> WithLinuxNamespace
▪ Networking -> WithNetwork
▪ Volumes -> WithVolume
Use cases
- CURRENT
- Docker (moby)
- Kubernetes (cri-containerd)
- SwarmKit (experimental)
- LinuxKit
- BuildKit
- FUTURE/POTENTIAL
- IBM Cloud/Bluemix
- OpenFaaS
- {your project here}
Evolution
containerd
Going further with containerd
▪ Contributing: https://github.com/containerd/containerd
▫ Bug fixes, adding tests, improving docs, validation
▪ Using: See the getting started documentation in the docs
folder of the repo
▪ Porting/testing: Other architectures & OSs, stress testing
(see bucketbench, containerd-stress):
▫ git clone <repo>, make binaries, sudo make install
▪ K8s CRI: incubation project to use containerd as CRI
▫ In alpha today; e2e tests, validation, contributing
Moby Summit at OSS NA
Thursday, September 14, 2017
“An open framework to assemble specialized
container systems without reinventing the wheel.”
Tickets:
https://www.eventbrite.com/e/moby-summit-los-angeles-tickets-
35930560273
Bella Center, Copenhagen
16-19 October, 2017
https://europe-2017.dockercon.com/
10% discount code: CaptainPhil
Thank You! Questions?
▪ Stephen Day
▫ https://github.com/stevvooe
▫ stephen@docker.com
▫ Twitter: @stevvooe
▪ Phil Estes
▫ https://github.com/estesp
▫ estesp@gmail.com
▫ Twitter: @estesp
BACKUP
Image Names in Docker
Reference Type CLI Canonical
Repository ubuntu docker.io/library/ubuntu
Untagged ubuntu docker.io/libary/ubuntu:latest
Tagged ubuntu:16.04 docker.io/library/ubuntu:16.04
Content Trust ubuntu:latest docker.io/library/ubuntu@sha256:...
By digest ubuntu@sha256:.... docker.io/library/ubuntu@sha256:...
Unofficial tagged stevvooe/ubuntu:latest docker.io/stevvooe/ubuntu:latest
Private registry tagged myregistry.com/repo:latest myregistry.com/repo:latest
Other Approaches to Naming
▪ Self Describing
▫ Massive collisions
▫ Complex trust scenarios
▪ URI Schemes: docker://docker.io/library/ubuntu
▪ Redundant
▪ Confuses protocols and formats
▪ Operationally Limiting
▫ let configuration choose protocol and format
Locators
(docker.io/library/ubuntu, latest)
Schema-less URIs
ubuntu (Docker name)
docker.io/library/ubuntu:latest (Docker canonical)
locator object
Docker Graph Driver
▪ History
▫ AUFS - union filesystem model
for layers
▫ Graph Driver interface
▫ Block level snapshots
(devicemapper, btrfs, zfs)
▫ Union filesystems (aufs,
overlay)
▫ Content Addressability (1.10.0)
▫ No changes to graph driver
▫ Layerstore - content
Docker Storage Architecture
Graph Driver
“layers” “mounts”
Layer Store
“content addressable layers”
Image Store
“image configs”
Containers
“container configs”
Reference Store
“names to image”
Daemon
Containerd Storage Architecture
Snapshotter
“layer snapshots”
Content Store
“content addressed blobs”
Metadata Store
“references”
ctr
Config
Rootfs (mounts)

More Related Content

What's hot (20)

PPTX
root権限無しでKubernetesを動かす
Akihiro Suda
 
PDF
Docker internals
Rohit Jnagal
 
PPTX
Introduction To Docker, Docker Compose, Docker Swarm
An Nguyen
 
PDF
[Container Plumbing Days 2023] Why was nerdctl made?
Akihiro Suda
 
PDF
Kubernetes Networking with Cilium - Deep Dive
Michal Rostecki
 
PDF
Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C...
Odinot Stanislas
 
PDF
Helm Charts Security 101
Deep Datta
 
PDF
From Android NDK To AOSP
Min-Yih Hsu
 
PDF
Android OTA updates
Gary Bisson
 
PDF
Kubernetes Networking - Sreenivas Makam - Google - CC18
CodeOps Technologies LLP
 
PDF
Android起動周りのノウハウ
chancelab
 
PDF
Introduction to Docker
Aditya Konarde
 
PDF
DevOps Days Kyiv 2019 -- Victoria Metrics // Artem Navoiev
Mykola Marzhan
 
PDF
Présentation docker et kubernetes
Kiwi Backup
 
PDF
Ceph RBD Update - June 2021
Ceph Community
 
PDF
Introduction to docker
John Willis
 
PPTX
Linux Network Stack
Adrien Mahieux
 
PPTX
Ceph Performance and Sizing Guide
Jose De La Rosa
 
PDF
Kernel Recipes 2017: Using Linux perf at Netflix
Brendan Gregg
 
PDF
Harbor RegistryのReplication機能
Masanori Nara
 
root権限無しでKubernetesを動かす
Akihiro Suda
 
Docker internals
Rohit Jnagal
 
Introduction To Docker, Docker Compose, Docker Swarm
An Nguyen
 
[Container Plumbing Days 2023] Why was nerdctl made?
Akihiro Suda
 
Kubernetes Networking with Cilium - Deep Dive
Michal Rostecki
 
Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C...
Odinot Stanislas
 
Helm Charts Security 101
Deep Datta
 
From Android NDK To AOSP
Min-Yih Hsu
 
Android OTA updates
Gary Bisson
 
Kubernetes Networking - Sreenivas Makam - Google - CC18
CodeOps Technologies LLP
 
Android起動周りのノウハウ
chancelab
 
Introduction to Docker
Aditya Konarde
 
DevOps Days Kyiv 2019 -- Victoria Metrics // Artem Navoiev
Mykola Marzhan
 
Présentation docker et kubernetes
Kiwi Backup
 
Ceph RBD Update - June 2021
Ceph Community
 
Introduction to docker
John Willis
 
Linux Network Stack
Adrien Mahieux
 
Ceph Performance and Sizing Guide
Jose De La Rosa
 
Kernel Recipes 2017: Using Linux perf at Netflix
Brendan Gregg
 
Harbor RegistryのReplication機能
Masanori Nara
 

Viewers also liked (20)

PPTX
Kubernetes CRI containerd integration by Lantao Liu (Google)
Docker, Inc.
 
PPTX
State of Builder and Buildkit by Tonis Tiigi (Docker)
Docker, Inc.
 
PPTX
LinuxKit: the first five months by Justin Cormack & Riyaz Faizullabhoy (Docker)
Docker, Inc.
 
PDF
Bucketbench: Benchmarking Container Runtime Performance
Phil Estes
 
PDF
What's New in Docker 1.12?
Ajeet Singh Raina
 
PDF
Modernizing .NET Apps
Docker, Inc.
 
PDF
Modernizing Java Apps with Docker
Docker, Inc.
 
PDF
Container-relevant Upstream Kernel Developments
Docker, Inc.
 
PDF
Deeper Dive in Docker Overlay Networks
Docker, Inc.
 
PDF
Practical Design Patterns in Docker Networking
Docker, Inc.
 
PDF
Docker on Docker
Docker, Inc.
 
PDF
Docker summit 2015: 以 Docker Swarm 打造多主機叢集環境
謝 宗穎
 
PDF
Plug-ins: Building, Shipping, Storing, and Running - Nandhini Santhanam and T...
Docker, Inc.
 
PDF
Service Discovery & Load-Balancing under Docker 1.12.0 @ Docker Meetup #22
Ajeet Singh Raina
 
PDF
Monitoring Dell Infrastructure using Docker & Microservices
Ajeet Singh Raina
 
PDF
Deep Dive into Docker Swarm Mode
Ajeet Singh Raina
 
PDF
Introduction to Docker - IndiaOpsUG
Ajeet Singh Raina
 
PDF
Container Orchestration from Theory to Practice
Docker, Inc.
 
PDF
Kubernetes in Docker
Docker, Inc.
 
PDF
Under the Hood with Docker Swarm Mode - Drew Erny and Nishant Totla, Docker
Docker, Inc.
 
Kubernetes CRI containerd integration by Lantao Liu (Google)
Docker, Inc.
 
State of Builder and Buildkit by Tonis Tiigi (Docker)
Docker, Inc.
 
LinuxKit: the first five months by Justin Cormack & Riyaz Faizullabhoy (Docker)
Docker, Inc.
 
Bucketbench: Benchmarking Container Runtime Performance
Phil Estes
 
What's New in Docker 1.12?
Ajeet Singh Raina
 
Modernizing .NET Apps
Docker, Inc.
 
Modernizing Java Apps with Docker
Docker, Inc.
 
Container-relevant Upstream Kernel Developments
Docker, Inc.
 
Deeper Dive in Docker Overlay Networks
Docker, Inc.
 
Practical Design Patterns in Docker Networking
Docker, Inc.
 
Docker on Docker
Docker, Inc.
 
Docker summit 2015: 以 Docker Swarm 打造多主機叢集環境
謝 宗穎
 
Plug-ins: Building, Shipping, Storing, and Running - Nandhini Santhanam and T...
Docker, Inc.
 
Service Discovery & Load-Balancing under Docker 1.12.0 @ Docker Meetup #22
Ajeet Singh Raina
 
Monitoring Dell Infrastructure using Docker & Microservices
Ajeet Singh Raina
 
Deep Dive into Docker Swarm Mode
Ajeet Singh Raina
 
Introduction to Docker - IndiaOpsUG
Ajeet Singh Raina
 
Container Orchestration from Theory to Practice
Docker, Inc.
 
Kubernetes in Docker
Docker, Inc.
 
Under the Hood with Docker Swarm Mode - Drew Erny and Nishant Totla, Docker
Docker, Inc.
 
Ad

Similar to Containerd internals: building a core container runtime (20)

PDF
Containerd Internals: Building a Core Container Runtime
Phil Estes
 
PPTX
The state of containerd
Docker, Inc.
 
PDF
The State of containerd
Moby Project
 
PPTX
containerd the universal container runtime
Docker, Inc.
 
PDF
Hands on kubernetes_container_orchestration
Amir Hossein Sorouri
 
PDF
Docker 0.11 at MaxCDN meetup in Los Angeles
Jérôme Petazzoni
 
PDF
Docker Introduction + what is new in 0.9
Jérôme Petazzoni
 
PDF
Docker Introduction, and what's new in 0.9 — Docker Palo Alto at RelateIQ
Jérôme Petazzoni
 
PPTX
State of the Container Ecosystem
Vinay Rao
 
PDF
[KubeCon EU 2020] containerd Deep Dive
Akihiro Suda
 
PPTX
Introduction to containers
Nitish Jadia
 
PDF
Docker and Containers for Development and Deployment — SCALE12X
Jérôme Petazzoni
 
PDF
Container Performance Analysis Brendan Gregg, Netflix
Docker, Inc.
 
PDF
Intro to containerization
Balint Pato
 
PDF
Introduction to Docker at SF Peninsula Software Development Meetup @Guidewire
dotCloud
 
PDF
Introduction to Docker and all things containers, Docker Meetup at RelateIQ
dotCloud
 
PDF
A Gentle Introduction To Docker And All Things Containers
Jérôme Petazzoni
 
PDF
Introduction to Docker (as presented at December 2013 Global Hackathon)
Jérôme Petazzoni
 
PDF
Scaling Up Logging and Metrics
Ricardo Lourenço
 
PDF
Container Runtimes and Tooling
Kublr
 
Containerd Internals: Building a Core Container Runtime
Phil Estes
 
The state of containerd
Docker, Inc.
 
The State of containerd
Moby Project
 
containerd the universal container runtime
Docker, Inc.
 
Hands on kubernetes_container_orchestration
Amir Hossein Sorouri
 
Docker 0.11 at MaxCDN meetup in Los Angeles
Jérôme Petazzoni
 
Docker Introduction + what is new in 0.9
Jérôme Petazzoni
 
Docker Introduction, and what's new in 0.9 — Docker Palo Alto at RelateIQ
Jérôme Petazzoni
 
State of the Container Ecosystem
Vinay Rao
 
[KubeCon EU 2020] containerd Deep Dive
Akihiro Suda
 
Introduction to containers
Nitish Jadia
 
Docker and Containers for Development and Deployment — SCALE12X
Jérôme Petazzoni
 
Container Performance Analysis Brendan Gregg, Netflix
Docker, Inc.
 
Intro to containerization
Balint Pato
 
Introduction to Docker at SF Peninsula Software Development Meetup @Guidewire
dotCloud
 
Introduction to Docker and all things containers, Docker Meetup at RelateIQ
dotCloud
 
A Gentle Introduction To Docker And All Things Containers
Jérôme Petazzoni
 
Introduction to Docker (as presented at December 2013 Global Hackathon)
Jérôme Petazzoni
 
Scaling Up Logging and Metrics
Ricardo Lourenço
 
Container Runtimes and Tooling
Kublr
 
Ad

More from Docker, Inc. (20)

PDF
Containerize Your Game Server for the Best Multiplayer Experience
Docker, Inc.
 
PDF
How to Improve Your Image Builds Using Advance Docker Build
Docker, Inc.
 
PDF
Build & Deploy Multi-Container Applications to AWS
Docker, Inc.
 
PDF
Securing Your Containerized Applications with NGINX
Docker, Inc.
 
PDF
How To Build and Run Node Apps with Docker and Compose
Docker, Inc.
 
PDF
Hands-on Helm
Docker, Inc.
 
PDF
Distributed Deep Learning with Docker at Salesforce
Docker, Inc.
 
PDF
The First 10M Pulls: Building The Official Curl Image for Docker Hub
Docker, Inc.
 
PDF
Monitoring in a Microservices World
Docker, Inc.
 
PDF
COVID-19 in Italy: How Docker is Helping the Biggest Italian IT Company Conti...
Docker, Inc.
 
PDF
Predicting Space Weather with Docker
Docker, Inc.
 
PDF
Become a Docker Power User With Microsoft Visual Studio Code
Docker, Inc.
 
PDF
How to Use Mirroring and Caching to Optimize your Container Registry
Docker, Inc.
 
PDF
Monolithic to Microservices + Docker = SDLC on Steroids!
Docker, Inc.
 
PDF
Kubernetes at Datadog Scale
Docker, Inc.
 
PDF
Labels, Labels, Labels
Docker, Inc.
 
PDF
Using Docker Hub at Scale to Support Micro Focus' Delivery and Deployment Model
Docker, Inc.
 
PDF
Build & Deploy Multi-Container Applications to AWS
Docker, Inc.
 
PDF
From Fortran on the Desktop to Kubernetes in the Cloud: A Windows Migration S...
Docker, Inc.
 
PDF
Developing with Docker for the Arm Architecture
Docker, Inc.
 
Containerize Your Game Server for the Best Multiplayer Experience
Docker, Inc.
 
How to Improve Your Image Builds Using Advance Docker Build
Docker, Inc.
 
Build & Deploy Multi-Container Applications to AWS
Docker, Inc.
 
Securing Your Containerized Applications with NGINX
Docker, Inc.
 
How To Build and Run Node Apps with Docker and Compose
Docker, Inc.
 
Hands-on Helm
Docker, Inc.
 
Distributed Deep Learning with Docker at Salesforce
Docker, Inc.
 
The First 10M Pulls: Building The Official Curl Image for Docker Hub
Docker, Inc.
 
Monitoring in a Microservices World
Docker, Inc.
 
COVID-19 in Italy: How Docker is Helping the Biggest Italian IT Company Conti...
Docker, Inc.
 
Predicting Space Weather with Docker
Docker, Inc.
 
Become a Docker Power User With Microsoft Visual Studio Code
Docker, Inc.
 
How to Use Mirroring and Caching to Optimize your Container Registry
Docker, Inc.
 
Monolithic to Microservices + Docker = SDLC on Steroids!
Docker, Inc.
 
Kubernetes at Datadog Scale
Docker, Inc.
 
Labels, Labels, Labels
Docker, Inc.
 
Using Docker Hub at Scale to Support Micro Focus' Delivery and Deployment Model
Docker, Inc.
 
Build & Deploy Multi-Container Applications to AWS
Docker, Inc.
 
From Fortran on the Desktop to Kubernetes in the Cloud: A Windows Migration S...
Docker, Inc.
 
Developing with Docker for the Arm Architecture
Docker, Inc.
 

Recently uploaded (20)

PPTX
Top iOS App Development Company in the USA for Innovative Apps
SynapseIndia
 
PPTX
Webinar: Introduction to LF Energy EVerest
DanBrown980551
 
PDF
Jak MŚP w Europie Środkowo-Wschodniej odnajdują się w świecie AI
dominikamizerska1
 
PDF
Presentation - Vibe Coding The Future of Tech
yanuarsinggih1
 
PPTX
WooCommerce Workshop: Bring Your Laptop
Laura Hartwig
 
PPTX
Building a Production-Ready Barts Health Secure Data Environment Tooling, Acc...
Barts Health
 
PDF
Chris Elwell Woburn, MA - Passionate About IT Innovation
Chris Elwell Woburn, MA
 
PDF
Achieving Consistent and Reliable AI Code Generation - Medusa AI
medusaaico
 
PPTX
✨Unleashing Collaboration: Salesforce Channels & Community Power in Patna!✨
SanjeetMishra29
 
PDF
Smart Air Quality Monitoring with Serrax AQM190 LITE
SERRAX TECHNOLOGIES LLP
 
PDF
Complete Network Protection with Real-Time Security
L4RGINDIA
 
PDF
How Startups Are Growing Faster with App Developers in Australia.pdf
India App Developer
 
PDF
Transcript: New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
PDF
NewMind AI Journal - Weekly Chronicles - July'25 Week II
NewMind AI
 
PDF
Blockchain Transactions Explained For Everyone
CIFDAQ
 
PDF
SWEBOK Guide and Software Services Engineering Education
Hironori Washizaki
 
PDF
Complete JavaScript Notes: From Basics to Advanced Concepts.pdf
haydendavispro
 
PDF
New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
PDF
CIFDAQ Weekly Market Wrap for 11th July 2025
CIFDAQ
 
PDF
Human-centred design in online workplace learning and relationship to engagem...
Tracy Tang
 
Top iOS App Development Company in the USA for Innovative Apps
SynapseIndia
 
Webinar: Introduction to LF Energy EVerest
DanBrown980551
 
Jak MŚP w Europie Środkowo-Wschodniej odnajdują się w świecie AI
dominikamizerska1
 
Presentation - Vibe Coding The Future of Tech
yanuarsinggih1
 
WooCommerce Workshop: Bring Your Laptop
Laura Hartwig
 
Building a Production-Ready Barts Health Secure Data Environment Tooling, Acc...
Barts Health
 
Chris Elwell Woburn, MA - Passionate About IT Innovation
Chris Elwell Woburn, MA
 
Achieving Consistent and Reliable AI Code Generation - Medusa AI
medusaaico
 
✨Unleashing Collaboration: Salesforce Channels & Community Power in Patna!✨
SanjeetMishra29
 
Smart Air Quality Monitoring with Serrax AQM190 LITE
SERRAX TECHNOLOGIES LLP
 
Complete Network Protection with Real-Time Security
L4RGINDIA
 
How Startups Are Growing Faster with App Developers in Australia.pdf
India App Developer
 
Transcript: New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
NewMind AI Journal - Weekly Chronicles - July'25 Week II
NewMind AI
 
Blockchain Transactions Explained For Everyone
CIFDAQ
 
SWEBOK Guide and Software Services Engineering Education
Hironori Washizaki
 
Complete JavaScript Notes: From Basics to Advanced Concepts.pdf
haydendavispro
 
New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
CIFDAQ Weekly Market Wrap for 11th July 2025
CIFDAQ
 
Human-centred design in online workplace learning and relationship to engagem...
Tracy Tang
 

Containerd internals: building a core container runtime

  • 1. Containerd Internals: Building a Core Container Runtime Stephen Day, Docker Phil Estes, IBM September 11, 2017 #OSSummit
  • 2. A Brief History APRIL 2016 Containerd “0.2” announced, Docker 1.11 DECEMBER 2016Announce expansion of containerd OSS project Management/Supervisor for the OCI runc executor Containerd 1.0: A core container runtime project for the industry MARCH 2017 Containerd project contributed to CNCF
  • 3. runc containerd Why Containerd 1.0? ▪ Continue projects spun out from monolithic Docker engine ▪ Expected use beyond Docker engine (Kubernetes CRI) ▪ Donation to foundation for broad industry collaboration ▫ Similar to runc/libcontainer and the OCI
  • 4. Technical Goals/Intentions ▪ Clean gRPC-based API + client library ▪ Full OCI support (runtime and image spec) ▪ Stability and performance with tight, well- defined core of container function ▪ Decoupled systems (image, filesystem, runtime) for pluggability, reuse
  • 5. Requirements - A la carte: use only what is required - Runtime agility: fits into different platforms - Pass-through container configuration (direct OCI) - Decoupled - Use known-good technology - OCI container runtime and images - gRPC for API - Prometheus for Metrics
  • 6. Runtimes Metadata Architecture ContainersContent DiffSnapshot Tasks EventsImages GRPC Metrics Runtimes Storage OS
  • 7. Architecture containerd OS (Storage, FS, Networking Runtimes API Client (moby, containerd-cri, etc.)
  • 8. Containerd: Rich Go API Getting Started https://github.com/containerd/containerd/blob/master/docs/getting-started.md GoDoc https://godoc.org/github.com/containerd/containerd
  • 10. # HELP container_blkio_io_service_bytes_recursive_bytes The blkio io service bytes recursive # TYPE container_blkio_io_service_bytes_recursive_bytes gauge container_blkio_io_service_bytes_recursive_bytes{container_id="foo4",device="/dev/nvme0n1",major="259",minor="0",namespace="default",op="Async"} 1.07159552e+08 container_blkio_io_service_bytes_recursive_bytes{container_id="foo4",device="/dev/nvme0n1",major="259",minor="0",namespace="default",op="Read"} 0 container_blkio_io_service_bytes_recursive_bytes{container_id="foo4",device="/dev/nvme0n1",major="259",minor="0",namespace="default",op="Sync"} 81920 container_blkio_io_service_bytes_recursive_bytes{container_id="foo4",device="/dev/nvme0n1",major="259",minor="0",namespace="default",op="Total"} 1.07241472e+08 container_blkio_io_service_bytes_recursive_bytes{container_id="foo4",device="/dev/nvme0n1",major="259",minor="0",namespace="default",op="Write"} 1.07241472e+08 # HELP container_blkio_io_serviced_recursive_total The blkio io servied recursive # TYPE container_blkio_io_serviced_recursive_total gauge container_blkio_io_serviced_recursive_total{container_id="foo4",device="/dev/nvme0n1",major="259",minor="0",namespace="default",op="Async"} 892 container_blkio_io_serviced_recursive_total{container_id="foo4",device="/dev/nvme0n1",major="259",minor="0",namespace="default",op="Read"} 0 container_blkio_io_serviced_recursive_total{container_id="foo4",device="/dev/nvme0n1",major="259",minor="0",namespace="default",op="Sync"} 888 container_blkio_io_serviced_recursive_total{container_id="foo4",device="/dev/nvme0n1",major="259",minor="0",namespace="default",op="Total"} 1780 container_blkio_io_serviced_recursive_total{container_id="foo4",device="/dev/nvme0n1",major="259",minor="0",namespace="default",op="Write"} 1780 # HELP container_cpu_kernel_nanoseconds The total kernel cpu time # TYPE container_cpu_kernel_nanoseconds gauge container_cpu_kernel_nanoseconds{container_id="foo4",namespace="default"} 2.6e+08 # HELP container_cpu_throttle_periods_total The total cpu throttle periods # TYPE container_cpu_throttle_periods_total gauge container_cpu_throttle_periods_total{container_id="foo4",namespace="default"} 0 # HELP container_cpu_throttled_periods_total The total cpu throttled periods # TYPE container_cpu_throttled_periods_total gauge container_cpu_throttled_periods_total{container_id="foo4",namespace="default"} 0 # HELP container_cpu_throttled_time_nanoseconds The total cpu throttled time # TYPE container_cpu_throttled_time_nanoseconds gauge container_cpu_throttled_time_nanoseconds{container_id="foo4",namespace="default"} 0 # HELP container_cpu_total_nanoseconds The total cpu time # TYPE container_cpu_total_nanoseconds gauge container_cpu_total_nanoseconds{container_id="foo4",namespace="default"} 1.003301578e+09 # HELP container_cpu_user_nanoseconds The total user cpu time # TYPE container_cpu_user_nanoseconds gauge container_cpu_user_nanoseconds{container_id="foo4",namespace="default"} 7e+08 # HELP container_hugetlb_failcnt_total The hugetlb failcnt # TYPE container_hugetlb_failcnt_total gauge container_hugetlb_failcnt_total{container_id="foo4",namespace="default",page="1GB"} 0 container_hugetlb_failcnt_total{container_id="foo4",namespace="default",page="2MB"} 0 # HELP container_hugetlb_max_bytes The hugetlb maximum usage # TYPE container_hugetlb_max_bytes gauge container_hugetlb_max_bytes{container_id="foo4",namespace="default",page="1GB"} 0 container_hugetlb_max_bytes{container_id="foo4",namespace="default",page="2MB"} 0 # HELP container_hugetlb_usage_bytes The hugetlb usage # TYPE container_hugetlb_usage_bytes gauge container_hugetlb_usage_bytes{container_id="foo4",namespace="default",page="1GB"} 0 container_hugetlb_usage_bytes{container_id="foo4",namespace="default",page="2MB"} 0 # HELP container_memory_active_anon_bytes The active_anon amount # TYPE container_memory_active_anon_bytes gauge container_memory_active_anon_bytes{container_id="foo4",namespace="default"} 2.658304e+06 # HELP container_memory_active_file_bytes The active_file amount # TYPE container_memory_active_file_bytes gauge container_memory_active_file_bytes{container_id="foo4",namespace="default"} 7.319552e+06 # HELP container_memory_cache_bytes The cache amount used # TYPE container_memory_cache_bytes gauge container_memory_cache_bytes{container_id="foo4",namespace="default"} 5.0597888e+07 # HELP container_memory_dirty_bytes The dirty amount # TYPE container_memory_dirty_bytes gauge container_memory_dirty_bytes{container_id="foo4",namespace="default"} 28672 Metrics
  • 11. Pulling an Image What do runtimes need?
  • 12. { "schemaVersion": 2, "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json", "manifests": [ { "mediaType": "application/vnd.docker.distribution.manifest.v1+json", "size": 2094, "digest": "sha256:7820f9a86d4ad15a2c4f0c0e5479298df2aa7c2f6871288e2ef8546f3e7b6783", "platform": { "architecture": "ppc64le", "os": "linux" } }, { "mediaType": "application/vnd.docker.distribution.manifest.v1+json", "size": 1922, "digest": "sha256:ae1b0e06e8ade3a11267564a26e750585ba2259c0ecab59ab165ad1af41d1bdd", "platform": { "architecture": "amd64", "os": "linux", "features": [ "sse" ] } }, { "mediaType": "application/vnd.docker.distribution.manifest.v1+json", "size": 2084, "digest": "sha256:e4c0df75810b953d6717b8f8f28298d73870e8aa2a0d5e77b8391f16fdfbbbe2", "platform": { "architecture": "s390x", "os": "linux" } }, { "mediaType": "application/vnd.docker.distribution.manifest.v1+json", "size": 2084, "digest": "sha256:07ebe243465ef4a667b78154ae6c3ea46fdb1582936aac3ac899ea311a701b40", "platform": { "architecture": "arm", "os": "linux", "variant": "armv7" } }, { "mediaType": "application/vnd.docker.distribution.manifest.v1+json", "size": 2090, "digest": "sha256:fb2fc0707b86dafa9959fe3d29e66af8787aee4d9a23581714be65db4265ad8a", "platform": { "architecture": "arm64", "os": "linux", "variant": "armv8" } } ] Image Formats Index (Manifest List) linux amd64 linux ppc64le windows amd64 Manifests: Manifest linux arm64 Layers: Config: L0 L1 Ln Root Filesystem /usr /bin /dev /etc /home /lib C OCI Spec process args env cwd … root mounts Docker and OCI
  • 13. Content Addressability digest.FromString(“foo”) -> “sha256:2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae” digest.FromString(“foo tampered”) -> “sha256:51f7f1d1f6bebed72b936c8ea257896cb221b91d303c5b5c44073fce33ab8dd8” digest.FromString(“bar sha256:2c...”) -> “sha256:2e94890c66fbcccca9ad680e1b1c933cc323a5b4bcb14cc8a4bc78bb88d41055” “foo” “bar sha256:2c…” “foo tampered” “bar sha256:2c…”
  • 14. { "schemaVersion": 2, "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json", "manifests": [ { "mediaType": "application/vnd.docker.distribution.manifest.v1+json", "size": 2094, "digest": "sha256:7820f9a86d4ad15a2c4f0c0e5479298df2aa7c2f6871288e2ef8546f3e7b6783", "platform": { "architecture": "ppc64le", "os": "linux" } }, { "mediaType": "application/vnd.docker.distribution.manifest.v1+json", "size": 1922, "digest": "sha256:ae1b0e06e8ade3a11267564a26e750585ba2259c0ecab59ab165ad1af41d1bdd", "platform": { "architecture": "amd64", "os": "linux", "features": [ "sse" ] } }, { "mediaType": "application/vnd.docker.distribution.manifest.v1+json", "size": 2084, "digest": "sha256:e4c0df75810b953d6717b8f8f28298d73870e8aa2a0d5e77b8391f16fdfbbbe2", "platform": { "architecture": "s390x", "os": "linux" } }, { "mediaType": "application/vnd.docker.distribution.manifest.v1+json", "size": 2084, "digest": "sha256:07ebe243465ef4a667b78154ae6c3ea46fdb1582936aac3ac899ea311a701b40", "platform": { "architecture": "arm", "os": "linux", "variant": "armv7" } }, { "mediaType": "application/vnd.docker.distribution.manifest.v1+json", "size": 2090, "digest": "sha256:fb2fc0707b86dafa9959fe3d29e66af8787aee4d9a23581714be65db4265ad8a", "platform": { "architecture": "arm64", "os": "linux", "variant": "armv8" } } ] Image Formats Docker and OCI Index (Manifest List) linux amd64 linux ppc64le windows amd64 Manifests: Manifest linux arm64 Layers: Config: L0 Ln C Digest Layer File 0 Layer File 1 Layer File N L1
  • 15. Resolution Getting a digest from a name: 15 ubuntu sha256:71cd81252a3563a03ad8daee81047b62ab5d892ebbfbf71cf53415f29c130950
  • 16. Pulling an Image Data Flow Content Images Snapshots Pull Fetch Unpack Events Remote Mounts
  • 17. Remotes Locators and Resolution type Fetcher interface { Fetch(ctx context.Context, id string, hints ...string) (io.ReadCloser, error) } type Resolver interface { Resolve(ctx context.Context, locator string) (Fetcher, error) } fetcher := resolver.Resolve("docker.io/library/ubuntu") Endlessly Configurable! (hint: think git remotes)
  • 18. Example: Pull an Image Via ctr client: $ export CONTAINERD_NAMESPACE=example $ ctr pull docker.io/library/redis:alpine $ ctr image ls ... import ( "context" "github.com/containerd/containerd" "github.com/containerd/containerd/namespaces" ) // connect to our containerd daemon client, err := containerd.New("/run/containerd/containerd.sock") defer client.Close() // set our namespace to “example”: ctx := namespaces.WithNamespace(context.Background(), "example") // pull the alpine-based redis image from DockerHub: image, err := client.Pull(ctx, "docker.io/library/redis:alpine", containerd.WithPullUnpack)
  • 19. Snapshotters How do you build a container root filesystem?
  • 20. Snapshots ● No mounting, just returns mounts! ● Explicit active (rw) and committed (ro) ● Commands represent lifecycle ● Reference key chosen by caller (allows using content addresses) ● No tars and no diffs Evolved from Graph Drivers ● Simple layer relationships ● Small and focused interface ● Non-opinionated string keys ● External Mount Lifecycle type Snapshotter interface { Stat(key string) (Info, error) Mounts(key string) ([]containerd.Mount, error) Prepare(key, parent string) ([]containerd.Mount, error) View(key, parent string) ([]containerd.Mount, error) Commit(name, key string) error Remove(key string) error Walk(fn func(Info) error) error } type Info struct { Name string // name or key of snapshot Parent string Kind Kind } type Kind int const ( KindView KindActive KindCommitted )
  • 21. Active Committed Prepare(a, P0) Commit(P1, a′) Snapshot Model P0a a′ a′′ P1 P2 Commit(P2, a′′) Remove(c) Prepare(a′′, P1)
  • 22. Example: Investigating Root Filesystem $ ctr snapshot ls … $ ctr snapshot tree … $ ctr snapshot mounts <target> <id>
  • 23. Pulling an Image 1.Resolve manifest or index (manifest list) 2.Download all the resources referenced by the manifest 3.Unpack layers into snapshots 4.Register the mappings between manifests and constituent resources
  • 24. Pulling an Image Data Flow Content Images Snapshots Pull Fetch Unpack Events Remote Mounts
  • 25. Starting a Container 1.Initialize a root filesystem (RootFS) from snapshot 2.Setup OCI configuration (config.json) 3.Use metadata from container and snapshotter to specify config and mounts 4.Start process via the task service
  • 26. Starting a Container Images Snapshot Run Initialize Start Events Running Containers Containers Tasks Setup
  • 27. Example: Run a Container Via ctr client: $ export CONTAINERD_NAMESPACE=example $ ctr run -t docker.io/library/redis:alpine redis-server $ ctr c ... // create our container object and config container, err := client.NewContainer(ctx, "redis-server", containerd.WithImage(image), containerd.WithNewSpec(containerd.WithImageConfig(ima ge)), ) defer container.Delete() // create a task from the container task, err := container.NewTask(ctx, containerd.Stdio) defer task.Delete(ctx) // make sure we wait before calling start exitStatusC, err := task.Wait(ctx) // call start on the task to execute the redis server if err := task.Start(ctx); err != nil { return err }
  • 28. Example: Kill a Task Via ctr client: $ export CONTAINERD_NAMESPACE=example $ ctr t kill redis-server $ ctr t ls ... // make sure we wait before calling start exitStatusC, err := task.Wait(ctx) time.Sleep(3 * time.Second) if err := task.Kill(ctx, syscall.SIGTERM); err != nil { return err } // retrieve the process exit status from the channel status := <-exitStatusC code, exitedAt, err := status.Result() if err != nil { return err } // print out the exit code from the process fmt.Printf("redis-server exited with status: %dn", code)
  • 29. Example: Customize OCI Configuration // WithHtop configures a container to monitor the host via `htop` func WithHtop(s *specs.Spec) error { // make sure we are in the host pid namespace if err := containerd.WithHostNamespace(specs.PIDNamespace)(s); err != nil { return err } // make sure we set htop as our arg s.Process.Args = []string{"htop"} // make sure we have a tty set for htop if err := containerd.WithTTY(s); err != nil { return err } return nil } With{func} functions cleanly separate modifiers
  • 30. Customization ▪ Linux Namespaces -> WithLinuxNamespace ▪ Networking -> WithNetwork ▪ Volumes -> WithVolume
  • 31. Use cases - CURRENT - Docker (moby) - Kubernetes (cri-containerd) - SwarmKit (experimental) - LinuxKit - BuildKit - FUTURE/POTENTIAL - IBM Cloud/Bluemix - OpenFaaS - {your project here}
  • 33. Going further with containerd ▪ Contributing: https://github.com/containerd/containerd ▫ Bug fixes, adding tests, improving docs, validation ▪ Using: See the getting started documentation in the docs folder of the repo ▪ Porting/testing: Other architectures & OSs, stress testing (see bucketbench, containerd-stress): ▫ git clone <repo>, make binaries, sudo make install ▪ K8s CRI: incubation project to use containerd as CRI ▫ In alpha today; e2e tests, validation, contributing
  • 34. Moby Summit at OSS NA Thursday, September 14, 2017 “An open framework to assemble specialized container systems without reinventing the wheel.” Tickets: https://www.eventbrite.com/e/moby-summit-los-angeles-tickets- 35930560273
  • 35. Bella Center, Copenhagen 16-19 October, 2017 https://europe-2017.dockercon.com/ 10% discount code: CaptainPhil
  • 36. Thank You! Questions? ▪ Stephen Day ▫ https://github.com/stevvooe ▫ [email protected] ▫ Twitter: @stevvooe ▪ Phil Estes ▫ https://github.com/estesp ▫ [email protected] ▫ Twitter: @estesp
  • 38. Image Names in Docker Reference Type CLI Canonical Repository ubuntu docker.io/library/ubuntu Untagged ubuntu docker.io/libary/ubuntu:latest Tagged ubuntu:16.04 docker.io/library/ubuntu:16.04 Content Trust ubuntu:latest docker.io/library/ubuntu@sha256:... By digest ubuntu@sha256:.... docker.io/library/ubuntu@sha256:... Unofficial tagged stevvooe/ubuntu:latest docker.io/stevvooe/ubuntu:latest Private registry tagged myregistry.com/repo:latest myregistry.com/repo:latest
  • 39. Other Approaches to Naming ▪ Self Describing ▫ Massive collisions ▫ Complex trust scenarios ▪ URI Schemes: docker://docker.io/library/ubuntu ▪ Redundant ▪ Confuses protocols and formats ▪ Operationally Limiting ▫ let configuration choose protocol and format
  • 40. Locators (docker.io/library/ubuntu, latest) Schema-less URIs ubuntu (Docker name) docker.io/library/ubuntu:latest (Docker canonical) locator object
  • 41. Docker Graph Driver ▪ History ▫ AUFS - union filesystem model for layers ▫ Graph Driver interface ▫ Block level snapshots (devicemapper, btrfs, zfs) ▫ Union filesystems (aufs, overlay) ▫ Content Addressability (1.10.0) ▫ No changes to graph driver ▫ Layerstore - content
  • 42. Docker Storage Architecture Graph Driver “layers” “mounts” Layer Store “content addressable layers” Image Store “image configs” Containers “container configs” Reference Store “names to image” Daemon
  • 43. Containerd Storage Architecture Snapshotter “layer snapshots” Content Store “content addressed blobs” Metadata Store “references” ctr Config Rootfs (mounts)

Editor's Notes

  • #5: This is compared to “container systems of the past” that were monolithic and tightly coupled Example: hard to reuse components; e.g. take a Docker graphdriver and use it to implement a volume driver
  • #21: Active/Committed lifecycle Simple to make overlay act like a snapshot than vice versa How to handle no diffs for union filesystems which are “optimized” for diff
  • #27: Talk about task model -- runtime component of a container
  • #42: Graph driver interface developed to allow different layering backends. Block level snapshot drivers added, required mounting to satisfy changes interfaces.