Easy Cloud Native Transformation
with HashiCorp Nomad
Bram Vogelaar
@attachmentgenie
$ whoami
• Used to be a Molecular Biologist
• Then became a Dev
• Now an Ops
• Currently Cloud Engineer @ The Factory
• Amsterdam HUG organizer
Moving it all to the cloud
Vertical Scaling
Horizontal Scaling / Load Balancers
And then stuff got complicated…
The story starts with my personal website
Nomad
• Open Source tool for dynamic workload scheduling
• Batch, containerized, and non-containerized applications
• Has native Consul and Vault integrations
• Has token-based access setup
• Jobs written in (H)ashiCorp (C)onfiguration (L)anguage
https://www.nomadproject.io/
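The "batch" and "non-containerized" bullets can be sketched with a small job. This is a minimal example and not from the talk: the job, group and task names and the shell command are made up, and it assumes the exec driver is available on the client.

```hcl
# Hypothetical batch job: runs once to completion, no container involved.
job "nightly-report" {
  datacenters = ["aws"]
  type        = "batch"

  group "report" {
    task "generate" {
      # exec runs a plain binary in an isolated environment (Linux clients).
      driver = "exec"

      config {
        command = "/bin/sh"
        args    = ["-c", "echo report generated"]
      }
    }
  }
}
```

The only structural differences from a service job are the type and the driver; the job, group and task nesting stays the same.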
job "blog" {
  datacenters = ["aws"]
  type        = "service"

  group "hugo" {
    network {
      port "http" {
        to = 80
      }
    }

    task "nginx" {
      driver = "docker"

      config {
        image = "${PRIVATE}.dkr.ecr.us-east-1.amazonaws.com/blog:19"
        ports = ["http"]
      }
    }
  }
}
Deploy the blog
1 == None
job "blog" {
  datacenters = ["aws"]
  type        = "service"

  group "hugo" {
    count = 2
    # ... network and task blocks as before
  }
}
job "blog" {
  datacenters = ["aws"]
  type        = "service"

  group "hugo" {
    count = 2

    constraint {
      operator = "distinct_hosts"
      value    = "true"
    }
    # ... network and task blocks as before
  }
}
Force onto different hardware
job "blog" {
  datacenters = ["aws"]
  type        = "service"

  group "hugo" {
    count = 2

    spread {
      attribute = "${node.datacenter}"
    }
    # ... network and task blocks as before
  }
}
Suggest onto different hardware
/etc/nomad.d/config.hcl
client {
  enabled = true

  meta {
    "rack" = "his"
  }
}
Based on custom meta-data
job "blog" {
  datacenters = ["aws"]
  type        = "service"

  group "hugo" {
    count = 2

    spread {
      attribute = "${meta.rack}"

      target "his" {
        percent = 50
      }

      target "her" {
        percent = 50
      }
    }
    # ... network and task blocks as before
  }
}
Based on custom meta-data
service {
  name     = "blog"
  provider = "nomad"
  port     = "http"
}
Service Definition
template {
  data = <<EOH
http {
  server {
    listen 80;
    location / {
      {{ range nomadService "blog" }}
      proxy_pass http://{{ .Address }}:{{ .Port }};
      {{ end }}
    }
  }
}
EOH

  destination = "local/api-servers"
}
Service Usage
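With count = 2 the range loop above renders one proxy_pass line per instance, and nginx accepts only one proxy_pass per location. A possible variant, sketched under assumptions, renders the instances into an upstream block instead. The upstream name, the destination path and the reload signal are illustrative choices and not from the talk, and the rendered file is assumed to be included from the http context of the main nginx configuration.

```hcl
template {
  data = <<EOH
# One "server" line per registered instance of the blog service.
upstream blog {
{{ range nomadService "blog" }}
  server {{ .Address }}:{{ .Port }};
{{ end }}
}

server {
  listen 80;

  location / {
    proxy_pass http://blog;
  }
}
EOH

  # Hypothetical path; it has to match an include in nginx.conf.
  destination   = "local/conf.d/blog.conf"

  # Re-render and reload nginx when instances come and go.
  change_mode   = "signal"
  change_signal = "SIGHUP"
}
```

nginx refuses an upstream block without servers, so this sketch assumes at least one healthy instance is registered when the template renders.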
Nomad Pack
• Levant
• Templating and packaging tool
• Easily deploy popular applications to Nomad
• Re-use common patterns across internal applications
• Find and share job specifications with the Nomad community
• Nightlies only right now!
https://github.com/hashicorp/nomad-pack-community-registry
Nomad Pack
• nomad-pack registry list
• nomad-pack run hello_world
• nomad-pack run hello_world --var message=hola
https://github.com/hashicorp/nomad-pack
https://github.com/attachmentgenie/vagrant-scheduler
Try it yourself
Consul
• Open-Source Service Discovery Tool
• Built-in KV store
• Service Mesh tool
https://www.consul.io/
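Consul's KV store can be read from a Nomad job through a template block. A minimal sketch, assuming Nomad's Consul integration is configured; the key blog/title, the default value and the variable name are made up for illustration.

```hcl
template {
  # keyOrDefault falls back to the default instead of blocking
  # until the key exists in Consul.
  data = <<EOH
BLOG_TITLE={{ keyOrDefault "blog/title" "My blog" }}
EOH

  destination = "local/blog.env"

  # Expose the rendered KEY=value pairs to the task as environment variables.
  env = true
}
```

Changing the key in Consul re-renders the template, which by default restarts the task.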
service {
  name = "blog"
  port = "http"

  check {
    type     = "tcp"
    interval = "10s"
    timeout  = "2s"
  }
}
Service Definition
check {
  type     = "tcp"
  interval = "10s"
  timeout  = "2s"

  check_restart {
    limit           = 3
    grace           = "10s"
    ignore_warnings = false
  }
}
Stampeding herd
group "hugo" {
  restart {
    interval = "10m"
    attempts = 2
    delay    = "15s"
    mode     = "fail"
  }

  task "nginx" {
    # ... as before
  }
}
Restart failed jobs
group "hugo" {
  count = 2

  reschedule {
    delay          = "30s"
    delay_function = "constant" # constant, exponential, fibonacci
    unlimited      = true       # or max_delay = "1h"
  }

  task "nginx" {
    # ... as before
  }
}
Reschedule a job
• https://github.com/ThomasObenaus/dummy-services
• https://github.com/Shopify/toxiproxy
Testing your assumptions
group "hugo" {
  count = 10

  update {
    max_parallel     = 2
    min_healthy_time = "30s"
    healthy_deadline = "5m"
  }

  task "nginx" {
    # ... as before
  }
}
Updates
group "hugo" {
  count = 10

  update {
    max_parallel     = 1
    canary           = 10 # equal to count: a full second set of instances
    min_healthy_time = "30s"
    healthy_deadline = "10m"
    auto_revert      = true
    auto_promote     = false
  }

  task "nginx" {
    # ... as before
  }
}
Blue/Green Release
group "hugo" {
  count = 5

  update {
    max_parallel     = 1
    canary           = 1
    min_healthy_time = "30s"
    healthy_deadline = "10m"
    auto_revert      = true
    auto_promote     = true
  }

  task "nginx" {
    # ... as before
  }
}
Canary Release
service {
  name = "blog"
  tags = ["v2"]
}
$version++
group "hugo" {
  count = 5

  update {
    max_parallel     = 1
    canary           = 1
    min_healthy_time = "30s"
    healthy_deadline = "10m"
    auto_revert      = true
    auto_promote     = false # promote by hand once the canary looks good
  }

  task "nginx" {
    # ... as before
  }
}
Canary Release++
kind = "service-router"
name = "blog"
routes = [
  {
    match {
      http {
        header = [
          {
            name  = "group"
            exact = "test"
          },
        ]
      }
    }
    destination {
      service        = "blog"
      service_subset = "v2"
    }
  },
]
Consul to the rescue
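The service_subset "v2" in the router only resolves if a subset with that name is defined for the service, which is done in a separate service-resolver config entry. A sketch, assuming the new instances carry the tag "v2" as in the service block shown earlier; the default subset name "v1" and its filter are assumptions and not from the talk.

```hcl
kind           = "service-resolver"
name           = "blog"
default_subset = "v1" # traffic without the "group: test" header lands here

subsets = {
  # Hypothetical: everything that is not tagged v2 counts as v1.
  v1 = {
    filter = "\"v2\" not in Service.Tags"
  }

  # Instances registered with tags = ["v2"].
  v2 = {
    filter = "\"v2\" in Service.Tags"
  }
}
```

Both entries can be loaded with consul config write. Header-based routing applies to service mesh traffic and expects the service protocol to be set to http.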
● Introduced with Nomad 0.11
● (Currently) on an independent release cycle
● Gaining new functionality every release
● Built-in functionality for horizontal and vertical scaling
● But extensible with your own (community) plugins
Nomad autoscaler
● Makes decisions based on checks
● Checks are a combination of
  • Data queried from an APM
  • A defined STRATEGY
  • An attempt to approach a TARGET value
● Multiple checks can be combined
  • The answer with the most resources wins!
  • ScaleOut and ScaleIn => ScaleOut
  • ScaleOut and ScaleNone => ScaleOut
  • ScaleIn and ScaleNone => ScaleNone
  • ScaleOut(10) and ScaleOut(9) => ScaleOut(10)
  • ScaleIn(3) and ScaleIn(4) => ScaleIn(4)
Auto-scaling TLDR
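Combining checks can be sketched as a scaling block with two check blocks inside one policy. The first check reuses the Traefik query from the blog job in this deck. The second one is hypothetical: the check name, the metric and its label are assumptions about what Prometheus scrapes from Nomad and may need adjusting.

```hcl
# Goes inside the group block of the job.
scaling {
  enabled = true
  min     = 1
  max     = 20

  policy {
    cooldown = "20s"

    # Check 1: average open connections per instance.
    check "avg_instance_sessions" {
      source = "prometheus"
      query  = "scalar(avg(traefik_service_open_connections{service=\"blog@consulcatalog\"}))"

      strategy "target-value" {
        target = 5
      }
    }

    # Check 2 (hypothetical): average CPU usage of the group's allocations.
    check "avg_cpu" {
      source = "prometheus"
      query  = "scalar(avg(nomad_client_allocs_cpu_total_percent{task_group=\"hugo\"}))"

      strategy "target-value" {
        target = 70
      }
    }
  }
}
```

If the connection check asks to scale in while the CPU check asks to scale out, the rules above pick the scale-out.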
job "autoscaler" {
  type        = "service"
  datacenters = ["aws"]

  group "autoscaler" {
    count = 1

    task "autoscaler" {
      driver = "docker"

      config {
        image   = "hashicorp/nomad-autoscaler:0.3.6"
        command = "nomad-autoscaler"
        args = [
          "agent",
          "-config",
          "${NOMAD_TASK_DIR}/config.hcl",
          "-http-bind-address",
          "0.0.0.0",
        ]
      }
    }
  }
}
Deploy the autoscaler
/etc/nomad.d/config.hcl
nomad {
  address = "http://{{env "attr.unique.network.ip-address" }}:4646"
}

apm "prometheus" {
  driver = "prometheus"
  config = {
    address = "http://prometheus.service.consul:9090"
  }
}

strategy "target-value" {
  driver = "target-value"
}
Config for the autoscaler
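The {{env ...}} expression in the address is template syntax, so this config is presumably rendered by a template block inside the autoscaler task rather than written by hand. A sketch of that wiring, assuming the file is rendered to the path that the job passes to -config:

```hcl
task "autoscaler" {
  # ... driver and config as in the autoscaler job

  template {
    # Rendered on the client; {{env ...}} is filled in with the node's IP.
    data = <<EOF
nomad {
  address = "http://{{env "attr.unique.network.ip-address" }}:4646"
}

apm "prometheus" {
  driver = "prometheus"
  config = {
    address = "http://prometheus.service.consul:9090"
  }
}

strategy "target-value" {
  driver = "target-value"
}
EOF

    # Same path as the -config argument of the autoscaler agent.
    destination = "${NOMAD_TASK_DIR}/config.hcl"
  }
}
```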
Metrics
https://prometheus.io/
job "blog" {
  datacenters = ["aws"]
  type        = "service"

  group "hugo" {
    count = 3

    scaling {
      enabled = true
      min     = 1
      max     = 20

      policy {
        cooldown = "20s"

        check "avg_instance_sessions" {
          source = "prometheus"
          query  = "scalar(avg(traefik_service_open_connections{service=\"blog@consulcatalog\"}))"

          strategy "target-value" {
            target = 5
          }
        }
      }
    }
    # ... network and task blocks as before
  }
}
Enable autoscaling for the blog
Dashboards
https://grafana.com/oss/grafana/
Enable autoscaling
Observe scaling down event
agent: querying APM: policy_id=248f6157-ca37-f868-a0ab-cabbc67fec1d source=prometheus strategy=target-value target=local-nomad
agent: calculating new count: policy_id=248f6157-ca37-f868-a0ab-cabbc67fec1d source=prometheus strategy=target-value target=local-nomad
agent: next count outside limits: policy_id=248f6157-ca37-f868-a0ab-cabbc67fec1d source=prometheus strategy=
agent: updated count to be within limits: policy_id=248f6157-ca37-f868-a0ab-cabbc67fec1d source=prometheus strategy=target-value target=local-nomad from=3 to=1 min=1 max=10
agent: scaling target: policy_id=248f6157-ca37-f868-a0ab-cabbc67fec1d source=prometheus strategy=target-value
Observe the autoscaler
hey -z 1m -c 30 http://127.0.0.1:8000
Apply load
Remove load
Logs
https://grafana.com/oss/loki/
group "autoscaler" {
  count = 1

  task "autoscaler" {
    driver = "docker"

    config {
      image   = "hashicorp/nomad-autoscaler:0.3.6"
      command = "nomad-autoscaler"

      # Ship container logs straight to Loki via the Docker logging plugin.
      logging {
        type = "loki"
        config {
          loki-url = "http://loki.service.consul:3100/api/prom/push"
          tag      = "loki"
        }
      }
    }
  }
}

docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions
Direct to loki
task "promtail" {
  driver = "docker"

  # Start before the main task and keep running next to it.
  lifecycle {
    hook    = "prestart"
    sidecar = true
  }

  config {
    image = "grafana/promtail:2.5.0"
    args = [
      "-config.file",
      "local/promtail.yaml",
    ]
  }
}
Promtail sidecar
scrape_configs:
  - job_name: system
    entry_parser: raw
    static_configs:
      - targets:
          - localhost
        labels:
          task: autoscaler
          __path__: /alloc/logs/autoscaler*
    pipeline_stages:
      - match:
          selector: '{task="autoscaler"}'
          stages:
            - regex:
                expression: '.*policy_id=(?P<policy_id>[a-zA-Z0-9_-]+).*source=(?P<source>[a-zA-Z0-9_-]+).*strategy=(?P<strategy>[a-zA-Z0-9_-]+).*target=(?P<target>[a-zA-Z0-9_-]+).*Group:(?P<group>[a-zA-Z0-9]+).*Job:(?P<job>[a-zA-Z0-9_-]+).*Namespace:(?P<namespace>[a-zA-Z0-9_-]+)'
https://grafana.com/docs/loki/latest/clients/promtail/
Promtail sidecar
Annotate your graphs
Correlate events with metrics
apm "prometheus" {
  driver = "prometheus"
  config = {
    address = "http://prometheus.service.consul:9090"
  }
}

target "aws-asg" {
  driver = "aws-asg"
  config = {
    aws_region = "{{ $x := env "attr.platform.aws.placement.availability-zone" }}{{ $length := len $x |subtract 1 }}{{ slice $x 0 $length}}"
  }
}
Grow into your platform
scaling "cluster_policy" {
  policy {
    cooldown            = "2m"
    evaluation_interval = "1m"

    check "cpu_allocated_percentage" {
      source = "prometheus"
      query  = "scalar(sum(nomad_client_allocated_cpu{node_class=\"hashistack\"}*100/(nomad_client_unallocated_cpu{node_class=\"hashistack\"}+nomad_client_allocated_cpu{node_class=\"hashistack\"}))/count(nomad_client_allocated_cpu{node_class=\"hashistack\"}))"

      strategy "target-value" {
        target = 70
      }
    }

    target "aws-asg" {
      dry-run             = "false"
      aws_asg_name        = "${client_asg_name}"
      node_class          = "hashistack"
      node_drain_deadline = "5m"
    }
  }
}
Grow into your platform
agent.worker.check_handler: querying source: check=mem_allocated_percentage policy_id=bf68649a-d087-2e69-362e-bbe71b5544f7 source=prometheus strategy=target-value target=a
agent.worker.check_handler: calculating new count: check=mem_allocated_percentage policy_id=bf68649a-d087-2e69-362e-bbe71b5544f7 source=prometheus strategy=target-value ta
agent.worker.check_handler: scaling target: check=mem_allocated_percentage policy_id=bf68649a-d087-2e69-36
internal_plugin.aws-asg: successfully performed and verified scaling out: action=scale_out asg_name=hashistack-n
agent.worker.check_handler: successfully submitted scaling action to target: check=mem_allocated_percentage policy_id=bf68649a-d087-2e69-362e-bbe71b5544f7 source=prometheus
Observe the autoscaler again
https://github.com/hashicorp/nomad-autoscaler/tree/master/demo/remote
Try it yourself
Moving it all to the cloud – QED
Contact
bram@attachmentgenie.com
@attachmentgenie
https://www.slideshare.net/attachmentgenie
https://hashiconf.com/europe/ <= Running Trusted Payloads With Nomad and Waypoint
Questions?
The Floor is yours…
