Chaos Engineering at Jet.com
Rachel Reese | @rachelreese | rachelree.se
Jet Technology | @JetTechnology | tech.jet.com
Watch the video with slide synchronization on InfoQ.com!
http://www.infoq.com/presentations/jet-microservices-testing
Presented at QCon London
www.qconlondon.com
Why do you need chaos testing?
The world is naturally chaotic
But do we need more testing?
Unit, Sanity, Random, Continuous,
Usability, A/B, Localization, Acceptance,
Regression, Performance, Integration, Security
You’ve already tested all your
components in multiple ways.
Microservices Chaos Testing at Jet
It’s super important to test the interactions in your
environment
Jet? Jet who?
Taking on Amazon!
Launched July 22
• Both Apple & Android named our app one of their top apps for 2015
• Over 20k orders per day
• Over 10.5 million SKUs
• #4 marketplace worldwide
• 700 microservices
We’re hiring!
http://jet.com/about-us/working-at-jet
Azure, Web sites, Cloud services, VMs, Service bus queues, Service bus topics,
Blob storage, Table storage, Queues, Hadoop, DNS, Active directory, SQL Azure,
R, F#, Paket, FSharp.Data, Chessie, Unquote, SQLProvider, Python, Deedle, FAKE,
FSharp.Async, React, Node, Angular, SAS, Storm, Elastic Search, Xamarin,
Microservices, Consul, Kafka, PDW, Splunk, Redis, SQL, Puppet, Jenkins,
Apache Hive, Apache Tez
Microservices at Jet
Microservices
• An application of the single responsibility principle at the service level:
“A class should have one, and only one, reason to change.”
• Has an input, produces an output.
Benefits
• Easy scalability
• Independent releasability
• More even distribution of complexity
What is chaos engineering?
It’s just wreaking havoc with your code
for fun, right?
Chaos Engineering is…
Controlled experiments on a distributed system
that help you build confidence in the system’s
ability to tolerate the inevitable failures.
Principles of Chaos Engineering
1. Define “normal”
2. Assume “normal” will continue in both a control group
and an experimental group.
3. Introduce chaos: servers that crash, hard drives that
malfunction, network connections that are severed, etc.
4. Look for a difference in behavior between the control
group and the experimental group.
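The four steps above can be sketched in a few lines of F#. This is only an illustration: the `Observation` record, the order-success metric, the sample numbers, and the 2% tolerance are all assumptions and do not come from the talk.

```fsharp
// Hypothetical names and numbers throughout; none of this is Jet's code.
type Observation = { SuccessfulOrders: int; TotalOrders: int }

// 1. Define "normal": here, the fraction of orders that succeed.
let steadyState (obs: Observation list) =
    let ok = obs |> List.sumBy (fun o -> o.SuccessfulOrders)
    let total = obs |> List.sumBy (fun o -> o.TotalOrders)
    if total = 0 then 1.0 else float ok / float total

// 4. Look for a difference between the control and experimental groups.
let differs tolerance control experiment =
    abs (steadyState control - steadyState experiment) > tolerance

// 2. Both groups are expected to look "normal" ...
let control = [ { SuccessfulOrders = 990; TotalOrders = 1000 } ]
// 3. ... until chaos is introduced into the experimental group.
let experiment = [ { SuccessfulOrders = 930; TotalOrders = 1000 } ]

printfn "Behavior differs: %b" (differs 0.02 control experiment)
```

A difference larger than the tolerance suggests the injected failure was not absorbed by the system.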
Going further
Build a Hypothesis around Normal Behavior
Vary Real-world Events
Run Experiments in Production
Automate Experiments to Run Continuously
From http://principlesofchaos.org/
Benefits of chaos engineering
• You're awake
• Design for failure
• Healthy systems
• Self service
Current examples of chaos engineering
Maybe you meant Netflix’s Chaos Monkey?
How is Jet different?
We’re not testing in prod (yet).
SQL restarts & geo-replication
Start
- Checks the source db for write access
- Renames db on destination server (to create a new one)
- Creates a geo-replication in the destination region
Stop
- Shuts down cloud services writing to source db
- Sets source db as read-only
- Ends continuous copy
- Allows writes to secondary db
Azure & F#
Why F#?
What FP means to us
• Prefer immutability: avoid state changes, side effects, and mutable data.
• Use data in → data out transformations: think about mapping inputs to outputs.
• Look at problems recursively: consider successively smaller chunks of the same problem.
• Treat functions as the unit of work: higher-order functions.
“The F# solution offers us an order of magnitude increase in productivity and allows one developer to perform the work [of] a team of dedicated developers…”
Yan Cui, Lead Server Engineer, Gamesys
Concise and powerful code
C#
    public abstract class Transport { }

    public abstract class Car : Transport {
        public string Make { get; private set; }
        public string Model { get; private set; }
        public Car (string make, string model) {
            this.Make = make;
            this.Model = model;
        }
    }

    public abstract class Bus : Transport {
        public int Route { get; private set; }
        public Bus (int route) {
            this.Route = route;
        }
    }

    public class Bicycle : Transport {
        public Bicycle() {
        }
    }
F#
    type Transport =
        | Car of Make:string * Model:string
        | Bus of Route:int
        | Bicycle
Trivial to pattern match on!
F# pattern matching vs. C#
F#, with a Train case added:
    type Transport =
        | Car of Make:string * Model:string
        | Bus of Route:int
        | Bicycle
        | Train of Line:int

    let getThereVia (transport:Transport) =
        match transport with
        | Car (make,model) -> ...
        | Bus route -> ...
        | Bicycle -> ...

    Warning FS0025: Incomplete pattern matches on this expression. For example, the value 'Train' may indicate a case not covered by the pattern(s)
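To show how the warning goes away, here is a minimal sketch in which every case is handled. The travel descriptions returned from each branch are made up for illustration; the slides elide the branch bodies.

```fsharp
type Transport =
    | Car of Make:string * Model:string
    | Bus of Route:int
    | Bicycle
    | Train of Line:int

// Every case is handled, so the compiler no longer reports FS0025.
let getThereVia (transport: Transport) =
    match transport with
    | Car (make, model) -> sprintf "Drive the %s %s" make model
    | Bus route -> sprintf "Take bus route %d" route
    | Bicycle -> "Cycle"
    | Train line -> sprintf "Take train line %d" line

printfn "%s" (getThereVia (Train 7))
```

Adding a further case to `Transport` later would bring the warning back at every match that does not cover it, which is the safety net the slide is pointing at.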
Units of Measure
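The slide carries only its title, so the following is a small sketch of the language feature rather than anything shown in the talk. The `USD` and `item` measures and the quantities are assumptions chosen for illustration.

```fsharp
// Hypothetical measures; not taken from the talk.
[<Measure>] type USD
[<Measure>] type item

let costPer = 1.96M<USD/item>
let quantity = 3M<item>

// Units are checked at compile time: USD/item * item = USD.
let total : decimal<USD> = costPer * quantity

// let wrong = costPer + quantity   // would not compile: the units do not match

// Dividing by one unit of USD yields a plain decimal for printing.
printfn "Total: %M USD" (total / 1M<USD>)
```

Units exist only at compile time, so this kind of check adds no runtime cost.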
TickSpec – an F# project
Thanks to Scott Wlaschin for his post, Cycles and modularity in the wild
SpecFlow – a comparable C# project
Thanks to Scott Wlaschin for his post, Cycles and modularity in the wild
Chaos code!
What do our services look like?
    // Define inputs & outputs
    type Input =
        | Product of Product

    type Output =
        | ProductPriceNile of Product * decimal
        | ProductPriceCheckFailed of PriceCheckFailed

    // Define how input transforms to output
    let handle (input:Input) =
        async {
            return Some(ProductPriceNile({Sku="343434"; ProductId = 17; ProductDescription = "My amazing product"; CostPer=1.96M}, 3.96M))
        }

    // Define what to do with output
    let interpret id output =
        match output with
        | Some (Output.ProductPriceNile (e, price)) -> async {()} // write to event store
        | Some (Output.ProductPriceCheckFailed e) -> async {()} // log failure
        | None -> async.Return ()

    // Read events, handle, & interpret
    let consume = EventStoreQueue.consume (decodeT Input.Product) handle interpret
Our code!
    let selectRandomInstance compute hostedService = async {
        try
            let! details = getHostedServiceDetails compute hostedService.ServiceName
            let deployment = getProductionDeployment details
            let instance =
                deployment.RoleInstances
                |> Seq.toArray
                |> randomPick
            return details.ServiceName, deployment.Name, instance
        with e ->
            log.error "Failed selecting random instance\n%A" e
            reraise e
    }
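The `randomPick` helper is not shown in the slides. A minimal stand-in, assuming it simply returns one uniformly chosen element of an array, might look like this. The instance names are invented.

```fsharp
open System

// Hypothetical stand-in for randomPick; the original is not shown in the slides.
let rng = Random()

// Returns one element chosen uniformly at random; fails on an empty array.
let randomPick (items: 'a[]) =
    if Array.isEmpty items then invalidArg "items" "no instances to pick from"
    items.[rng.Next(items.Length)]

// Made-up role instance names, for illustration only.
let candidates = [| "web_IN_0"; "web_IN_1"; "web_IN_2" |]
let instance = randomPick candidates
printfn "Picked %s" instance
```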
Our code!
    let restartRandomInstance compute hostedService = async {
        try
            let! serviceName, deploymentId, roleInstance =
                selectRandomInstance compute hostedService
            match roleInstance.PowerState with
            | RoleInstancePowerState.Stopped ->
                log.info "Service=%s Instance=%s is stopped...ignoring..."
                    serviceName roleInstance.InstanceName
            | _ ->
                do! restartInstance compute serviceName deploymentId roleInstance.InstanceName
        with e ->
            log.error "%s" e.Message
    }
Our code!
    compute
    |> getHostedServices
    |> Seq.filter ignoreList
    |> knuthShuffle
    |> Seq.distinctBy (fun a -> a.ServiceName)
    |> Seq.map (fun hostedService -> async {
        try
            return! restartRandomInstance compute hostedService
        with e ->
            log.warn "failed: service=%s . %A" hostedService.ServiceName e
            return ()
        })
    |> Async.ParallelIgnore 1
    |> Async.RunSynchronously
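The pipeline above relies on a `knuthShuffle` helper that the slides do not show. As a sketch, assuming it is a standard Fisher-Yates (Knuth) shuffle that returns a shuffled copy of its input, it could be written as follows. The service names are invented.

```fsharp
open System

// Hypothetical stand-in for knuthShuffle: a Fisher-Yates (Knuth) shuffle
// that copies the input into an array and shuffles the copy.
let knuthShuffle (items: seq<'a>) =
    let rng = Random()
    let arr = Seq.toArray items
    for i in arr.Length - 1 .. -1 .. 1 do
        let j = rng.Next(i + 1)   // 0 <= j <= i
        let tmp = arr.[i]
        arr.[i] <- arr.[j]
        arr.[j] <- tmp
    arr

// Made-up service names, for illustration only.
let services = [ "cart"; "pricing"; "search"; "checkout" ]
let shuffled = knuthShuffle services
printfn "%A" shuffled
```

Shuffling before the restarts means the services are not hit in the same order on every run.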
Has it helped?
Elasticsearch restart
Additional chaos finds
- Redis
- Checkpointing
If availability matters, you should be
testing for it.
Azure + F# + Chaos = <3
Chaos Engineering at Jet.com
Rachel Reese | @rachelreese | rachelree.se
Jet Technology | @JetTechnology | tech.jet.com
Nora Jones | @nora_js
