Abraham Marin-Perez
@AbrahamMarin
fromfragiletoagile.com
Keeping Your CI / CD Pipeline
as Fast as It Needs to Be
#JavaOne @AbrahamMarin @EqualExperts
About Me
https://goo.gl/I0lbhi
• Continuous Integration: check everything is still working after every commit
• Continuous Deployment: every successful commit turns into a release
What is CI / CD?
About This Talk
SUPER
APP
# Files: 75
# Tests: 800
Build Time: 4 min
Output: superapp.war
SUPER
APP
# Files: 113
# Tests: 1200
Build Time: 6 min
Output: superapp.war
SUPER
APP
# Files: 169
# Tests: 1800
Build Time: 9 min
Output: superapp.war
Slow feedback
Broken builds mask issues
Development paralysis
Impact on ability to meet our SLAs
Missed business opportunities
The Problems Of Size
Live with it
Partial CD: only quick tests
Phased CD: split into components
Test Deprecation Policy
Microservices
How Organisations Manage Size
Microservices
SUPER
APP
# Files: 169
# Tests: 1800
Build Time: 9 min
Output: superapp.war
SUPER APP
# Files: 115
# Tests: 1200
Build Time: 6 min
Output: superapp.war
APP BACKEND
# Files: 72
# Tests: 800
Build Time: 4 min
Output: appbackend.jar
https://goo.gl/LvkkRq
Scalable Continuous Deployment
With Maven
A real case scenario
[Figure: build dependency graph with three WAR files, a Logging module and a Parent POM, built up over several slides and annotated with percentages (28%, 28%, 28%, 20%, and 48% on one slide)]
• Build Time (BT): time an individual build takes to run
• Change Rate (CR): percentage of all commits to the system that affect an individual build
Useful Metrics
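As a rough illustration of Change Rate, the sketch below counts how many commits touched each build and divides by the total. The build names and the commit log are made up for the example; in practice the data would come from your CI system or version control history.

```python
from collections import Counter


def change_rates(commit_targets):
    """Return each build's share of all commits, as a fraction between 0 and 1."""
    counts = Counter(commit_targets)
    total = sum(counts.values())
    return {build: n / total for build, n in counts.items()}


# Hypothetical commit log: one entry per commit, naming the build it touched.
commits = ["war-a"] * 7 + ["war-b"] * 7 + ["logging"] * 5 + ["parent-pom"] * 1
rates = change_rates(commits)

for build, rate in rates.items():
    print(f"CR({build}) = {rate:.0%}")
```

With this made-up log, the parent POM accounts for 1 commit in 20, so its Change Rate is 5%.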
[Figure: the same build dependency graph (WAR files, Logging, Parent POM), with one build annotated 28%]
• Impact Time (IT): total time to run a build and all the builds that will be triggered as a result
Useful Metrics
No dependants: IT(A) = BT(A)
Useful Metrics
Serial execution (A triggers B and C): IT(A) = BT(A) + IT(B) + IT(C)
Useful Metrics
Parallel execution (A triggers B and C): IT(A) = BT(A) + max(IT(B), IT(C))
Useful Metrics
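The three Impact Time cases can be sketched as one recursive function. This is a minimal illustration, assuming build times in minutes and a hypothetical `dependants` map from each build to the builds it triggers; it follows the slide formulas literally, so in the serial case a build reachable through two different paths is counted twice.

```python
def impact_time(build, build_time, dependants, parallel=True):
    """Impact Time of `build`: its own Build Time plus that of the builds it triggers."""
    downstream = [
        impact_time(d, build_time, dependants, parallel)
        for d in dependants.get(build, [])
    ]
    if not downstream:
        # No dependants: IT(A) = BT(A)
        return build_time[build]
    if parallel:
        # Parallel execution: IT(A) = BT(A) + max(IT(B), IT(C))
        return build_time[build] + max(downstream)
    # Serial execution: IT(A) = BT(A) + IT(B) + IT(C)
    return build_time[build] + sum(downstream)


# Hypothetical example: A triggers B and C; times are in minutes.
build_time = {"A": 4, "B": 6, "C": 9}
dependants = {"A": ["B", "C"]}

serial = impact_time("A", build_time, dependants, parallel=False)
concurrent = impact_time("A", build_time, dependants, parallel=True)
print(serial, concurrent)
```

For these numbers the serial Impact Time of A is 4 + 6 + 9 = 19 minutes, and the parallel one is 4 + max(6, 9) = 13 minutes.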
Weighted Impact Time (WIT): impact time of a build
weighted according to its change rate
WIT(A) = IT(A) * CR(A)
Useful Metrics
Average Impact Time (AIT): total time needed, on
average, to execute all necessary builds after any
given commit anywhere in the system
AIT = WIT(A) + WIT(B) + ... + WIT(Z)
Useful Metrics
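As a worked example of WIT and AIT, the sketch below multiplies each build's Impact Time by its Change Rate and sums the results. All build names and figures are invented for illustration, with Impact Times in minutes and Change Rates as fractions that add up to 1.

```python
def weighted_impact_time(impact_time, change_rate):
    """WIT(A) = IT(A) * CR(A)"""
    return impact_time * change_rate


def average_impact_time(impact_times, change_rates):
    """AIT = WIT(A) + WIT(B) + ... + WIT(Z)"""
    return sum(
        weighted_impact_time(impact_times[build], change_rates[build])
        for build in impact_times
    )


# Hypothetical figures: Impact Time in minutes, Change Rate as a fraction.
impact_times = {"parent-pom": 20, "logging": 16, "war-a": 6, "war-b": 4}
change_rates = {"parent-pom": 0.05, "logging": 0.25, "war-a": 0.35, "war-b": 0.35}

ait = average_impact_time(impact_times, change_rates)
print(f"AIT = {ait:.1f} min")
```

Here the parent POM has the highest Impact Time (20 minutes) but contributes only 1 minute to the average because it changes rarely, while the logging module contributes 4 minutes.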
Sample Thresholds
Average Impact Time
Average Impact Time is what indicates how well you
have scaled your system
Sample Thresholds
Maximum Impact Time
In a worst-case scenario, a build won’t take longer
than this.
Sample Thresholds
Maximum Impact Time for Critical Components
The same, but only for your most sensitive modules
(log-in, payment gateway, etc.)
Beware of dependencies!
Sample Thresholds
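The three sample thresholds could be checked automatically along these lines. This is only a sketch: the limits, build names and figures are hypothetical, and "critical" is interpreted here as a plain set of build names whose own Impact Time is checked.

```python
def check_thresholds(impact_times, change_rates, critical,
                     max_ait, max_it, max_critical_it):
    """Return a list of (rule, build, value) tuples, one per threshold violation."""
    violations = []

    # Average Impact Time across the whole system.
    ait = sum(impact_times[b] * change_rates[b] for b in impact_times)
    if ait > max_ait:
        violations.append(("max_ait", "system", ait))

    # Maximum Impact Time: the worst case for any single build.
    for build, it in impact_times.items():
        if it > max_it:
            violations.append(("max_it", build, it))

    # Maximum Impact Time for critical components only.
    for build in critical:
        if impact_times[build] > max_critical_it:
            violations.append(("max_critical_it", build, impact_times[build]))

    return violations


# Hypothetical figures: Impact Time in minutes, Change Rate as a fraction.
impact_times = {"parent-pom": 20, "logging": 16, "login-war": 6, "shop-war": 4}
change_rates = {"parent-pom": 0.05, "logging": 0.25, "login-war": 0.35, "shop-war": 0.35}

violations = check_thresholds(
    impact_times, change_rates, critical={"login-war"},
    max_ait=10, max_it=15, max_critical_it=8,
)
print(violations)
```

With these numbers the average (8.5 minutes) and the critical log-in build (6 minutes) are within limits, but the parent POM and logging builds exceed the 15-minute maximum.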
Manual processing
takes time...
• Most CI systems provide an API
• Calculations aren’t complex
• Multiple graphical tools available
Automating Build Analysis
github.com/quiram/build-hotspots
Build Hotspots
https://commons.wikimedia.org/wiki/File:2012_Italian_GP_-_Lotus_wheel.jpg
Thank You
@EqualExperts
equalexperts.com
fromfragiletoagile.com
@AbrahamMarin
#FastCI #JavaOne

Keeping Your CI/CD Pipeline as Fast as It Needs to Be

Editor's Notes

  • #35 EITHER BY FOLLOWING THIS, OR BECAUSE YOU HAVE IT, THE NEXT STEP IS MANAGING THE NETWORK
  • #46 This is good to measure what you should change if you need to change something. But do you need to? PERFORMANCE -> ESTABLISH THRESHOLD, MEASURE, CHANGE IF ABOVE
  • #50 CALCULATE IN DIFFERENT WAYS, DEPENDING ON OUR PARALLEL EXECUTION CAPABILITIES. If we don’t have the ability to run builds in parallel, then we’ll run A and then B and C (or C and B). In any case, the impact time will be the sum of all of them. CLICK
  • #51 If we allow parallel execution, then both B and C will be triggered at the same time after A, which means we’ll only have to wait for the slowest of the two. CLICK
  • #53 Bear in mind these are only approximations. In real life it can be that your ability to run things in parallel is limited by total number of slaves (maybe you can only run up to 5 builds in parallel) or other shared resources (maybe you only have one staging database and two builds cannot get hold of it at the same time). But, despite being approximations, they are a good way to establish a baseline to track and compare. CLICK
  • #54 There is something interesting to note about Impact Time: it grows as you go up in the hierarchy. This graph shows the Build Time as the size of the bubbles, but the Impact Time of each bubble will include, directly or indirectly, that of its dependants. This means that the Parent POM file will be the build with the highest Impact Time, since whenever we change that build we have to rebuild absolutely everything. Now, is that a problem? Maybe not, because it’s also the least modified build (hence its colour). This leads us to conclude that we need to assess the relationship between Impact Time and Change Rate, which brings us to the next metric. CLICK
  • #55 This value allows us to compare which builds are the ones causing the highest impact over a period of time, letting us know when an impactful build is infrequent enough so as not to be a problem. And then, by combining all the weighted impact times. CLICK
  • #56 We get to the Average Impact Time, which will tell us how long, on average, it takes for our build system to rebuild all the necessary modules after a commit anywhere in the system. Now we’re really getting on to something, because now that we have all these metrics we have a way to define (CLICK) useful thresholds for us.
  • #61 Now, let’s take a moment to reflect on all this. We’re defining metrics based on build duration, but also on change rate. And we are considering architectural changes, restructuring of modules, based on these data. But let’s take a closer look at this temperature graph. It is driven by dependencies among builds, but also by where I am making changes. That means that some of the attributes of this graph will change over time as developers focus on different parts of the system so as to develop different features. That means that the optimal shape of the system will change according to the data of our build, and what was a good idea yesterday may not be such a good idea today. Let’s also note that all these graphs are created manually. And I also had to do the analysis manually. I had to do this manually because there aren’t any tools (that I know of) that can provide this information for you. And, useful as this is, you can’t do it too often because CLICK manual processing takes time.
  • #65 CI/CD can be your worst bottleneck. Keeping your CI/CD fast is a performance tuning activity; approach it as such. No proper tools available; help me build them.