© COPYRIGHT 2016 COVEROS, INC. ALL RIGHTS RESERVED.
Agility. Security. Delivered.
DevOps in a Regulated and
Embedded Environment
By: Arjun Comar
(Was DevOps on a Legacy Project)
twitter: @arjuncomar email: arjun.comar@coveros.com
Agenda
• About Me
• Agile, DevOps, and Medical Devices: What’s the Problem?
• Git Flow in a Regulated World
• Expect to Deploy
• Scaling for Success and Resource Management
• Questions
About Me
• B.S. in Computer Science from the Rose-Hulman Institute of Technology
• Worked on everything from the Linux
kernel to computer vision.
• Interested in software quality and
correctness.
• Been with Coveros for ~2.5 years.
• Run the local HaskellDC meetup group.
About Coveros
• Coveros builds security-critical applications using
agile methods.
• Coveros Services
• Agile transformations
• Agile development and testing
• DevOps and continuous integration
• Application security analysis
• Agile & Security training
• Government qualifications
• DCAA approved rates and accounting
• TS facility clearance
Areas of Expertise
Medical Devices and the Law
• It isn’t sufficient to write the code; release requires regulatory
approval.
• Approval is per feature (epic)
• Contingent on development, testing, risk mitigation, etc.
• We want short-lived branches, but…
• If we don’t get approval for one feature, business still wants to release
the others
• Unmerge all the feature branches that went into an epic?
• Further requirements around documentation, especially:
• Design
• Testing
• Risk Management
Legacy Problems
• C code, embedded device target
• cross compilation: Windows -> QNX
• Some modules only built on WinXP
• Manual build, deploy, test process
• Custom hardware, custom firmware
• Old codebase, not written to be unit tested
• Unit test execution requires target environment
• Codebase of roughly 200 KLOC (order of magnitude)
• Hardware platform ~25 years old
Integration and Deployment
• Manual builds, deploy to unit test?
• Unmaintained deployment scripts
• Written by a contractor in ksh
• Last maintainer had already left the company
• Working deployments meant flashing the unit with a USB stick and a physical dongle
• Rewrite with Chef? ... Ansible? ... Bash?
• Try: sh run over telnet
• No Ruby, Python, Perl, bash, ssh, or DHCP on the target
• Network deployments/updates to a device that goes in a human
being…?
Feedback Cycles
• Deployments took ~30 minutes and required physical interaction
through the process
• Testing involved long protocols with detailed and very particular
steps
• A full test pass took the test team ~5-6 weeks: at least 3-4, sometimes as many as 8.
• Release cycle on the order of years.
Resource Needs and Team Size
• Business wanted multiple features in development in parallel
• Different tests take different lengths of time to run
• even when automated
• seconds -> weeks
• Business needed 4 teams like the one they had
• Continuous integration targets, unit test targets, deployment
testing targets, full functional test targets, partially automated test
targets
• Performance, reliability, security, durability, etc.?
Solutions
One thing at a time...
Git Flow
in a Regulated World
Git Workflows
• Linux Kernel: benevolent dictator, many trusted lieutenants, an
insane number of contributors.
• GitHub: a single maintainer (or a small team of maintainers); contributors
submit pull requests
• Corporate git usage: Trusted team of developers, co-maintain
shared repository
Enter: Git Flow
But I can’t merge back daily...
• No, really. Daily merges back to develop means pulling an epic out
requires a virtually impossible unmerge.
• Might be legally required not to go forward with a feature
• Can’t get approval until feature is developed and tested with
known risks documented and mitigated
• Business still wants to release what they can
Can’t not integrate...
• Long lived lines of development, all separate
• Tested independently prior to release
• Business wants to release, integrate necessary branches and…
• Disaster: merge conflicts, retest everything, unknown interactions
everywhere
Extending the workflow to deal with regulation
• Extend the git flow model
• Keep epic specific code in ‘develop/epic-name’ branches
• Use ‘feature/epic-name/feature-name’ branches for daily work
• Merge these back daily!
• Epic branches get merged back for a release
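The naming scheme on this slide can be sketched as a small classifier that says where each branch merges back to. This is an illustration only: `BranchInfo` and `classify` are hypothetical names, not part of the talk or of any git tooling.

```python
from typing import NamedTuple, Optional


class BranchInfo(NamedTuple):
    kind: str                # "epic" or "feature"
    epic: str                # epic name shared by the epic branch and its features
    feature: Optional[str]   # feature name, or None for an epic branch
    merges_into: str         # branch this one gets merged back into


def classify(branch: str) -> BranchInfo:
    """Classify a branch under the develop/<epic> and feature/<epic>/<feature> scheme."""
    parts = branch.split("/")
    if len(parts) == 2 and parts[0] == "develop" and parts[1]:
        # Epic branches are merged back into develop for a release.
        return BranchInfo("epic", parts[1], None, "develop")
    if len(parts) == 3 and parts[0] == "feature" and all(parts[1:]):
        # Feature branches are merged back into their epic branch daily.
        return BranchInfo("feature", parts[1], parts[2], f"develop/{parts[1]}")
    raise ValueError(f"not an epic or feature branch: {branch!r}")
```

One caveat if the names are adopted literally: Git stores branches as paths, so a repository cannot hold a branch called `develop` alongside branches under `develop/`; a real setup may need a variant of the prefix.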
Integrating Continuously
• Use tooling to manage the problem for you
• Have Jenkins (or your CI stack of choice) do builds by merging
develop with the epic branches first
• develop holds code that will be released, features that conflict must be
fixed
• Run the normal deployment and testing cycle on these builds
• merge conflicts are failed builds
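A minimal sketch of the "merge conflicts are failed builds" step, assuming the CI job runs inside a clean clone of the repository. The helper names (`run_git`, `merges_cleanly`) are hypothetical; the git subcommands and flags used are standard ones.

```python
import subprocess
from typing import Callable, Sequence

# A runner executes a git subcommand and returns its exit code.
Runner = Callable[[Sequence[str]], int]


def run_git(args: Sequence[str]) -> int:
    """Default runner: invoke git in the current working directory."""
    return subprocess.run(["git", *args]).returncode


def merges_cleanly(base: str, branch: str, run: Runner = run_git) -> bool:
    """Trial-merge `branch` into a detached checkout of `base`.

    On success the working tree holds the merged, uncommitted result so the
    normal build and test cycle can run on it. On failure the merge is
    aborted and False is returned, which the CI job treats as a failed build.
    """
    if run(["checkout", "--detach", base]) != 0:
        raise RuntimeError(f"cannot check out {base}")
    if run(["merge", "--no-commit", "--no-ff", branch]) == 0:
        return True
    run(["merge", "--abort"])  # leave the workspace clean for the next job
    return False
```

Passing the runner as a parameter keeps the merge logic testable without a real repository.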
Integrating even more continuously
• Still need to know if there are potential conflicts between epic
branches
• fail early, fail often, right?
• Take all the epic branches and merge them with develop
• Run a full build/deploy/test cycle on this mess as well.
• Any failures found -> failed build
• If it doesn’t cleanly merge, we can’t release, right?
• The software should always be ready to release; make it a business
decision, not a technical one.
Digging deeper to unearth conflict
• Better error detection and reporting:
• If we merge everything together, it looks like the later branches cause
conflicts more often
• Branches that conflict exclude each other
• Find conflicting pairs and report them both as failed
• Conflicts may only show up with the interaction of 3+ branches
• But this gets exponentially hard to detect
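To put rough numbers on "exponentially hard": checking every pair of n branches needs n(n-1)/2 trial merges, while checking every combination of two or more branches needs 2^n - n - 1. The function names below are illustrative only.

```python
from math import comb


def pair_checks(n: int) -> int:
    """Trial merges needed to test every pair of n branches."""
    return comb(n, 2)


def all_subset_checks(n: int) -> int:
    """Trial merges needed to test every subset of two or more branches."""
    return 2 ** n - n - 1


for n in (4, 8, 16):
    print(n, pair_checks(n), all_subset_checks(n))
# prints:
# 4 6 11
# 8 28 247
# 16 120 65519
```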
Do what you can
• Merge all possible epic branch pairs together, track+report failures
• Report these failures once or the team will ignore you...
• Branches that cleanly merge with everything get merged together
with develop and built
• This assesses the health of the software as it exists at this moment
• This might be expensive, so do it overnight.
• Shortcuts:
• If ‘A’ merges with ‘B’, then ‘B’ merges with ‘A’
• ‘A’ always merges with ‘A’
• (You only need the top half of the n x n matrix)
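The pairwise check and its shortcuts can be sketched as follows. `pairwise_report` is a hypothetical name, and `merges_cleanly` stands for whatever trial-merge check the CI job provides; `itertools.combinations` visits only the top half of the matrix, so symmetric and self pairs are never tested.

```python
from itertools import combinations
from typing import Callable, Iterable, List, Tuple


def pairwise_report(
    branches: Iterable[str],
    merges_cleanly: Callable[[str, str], bool],
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Return (conflicting pairs, branches that merge cleanly with all others)."""
    names = sorted(set(branches))
    # combinations() yields each unordered pair once: the upper triangle only.
    conflicts = [(a, b) for a, b in combinations(names, 2) if not merges_cleanly(a, b)]
    # Both members of a conflicting pair are reported as failed.
    conflicted = {name for pair in conflicts for name in pair}
    clean = [name for name in names if name not in conflicted]
    return conflicts, clean
```

The `clean` list is the set that would then be merged together with develop for the nightly build; the `conflicts` list is what gets reported to the team.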
This is a lot of work...
• Long-lived branches are hard to deal with.
• You could even go further and build the sets of conflicting
branches that can be merged together
• This is really hard; it’s easier to ask the team to fix the mess.
• If you don’t have to do it, don’t.
• You probably don’t unless regulatory constraints make you.
Expect to Deploy
What a lifesaver
Expect?
• Expect: a Tcl-based scripting tool used to automate interactive programs
• ...like telnet and ftp
• Was used to automate testing way back in the day
• Turns out to be rather perfect for scripting deployments, testing,
etc. in this tool restricted environment
• sh, ksh, telnet, ftp
• not: bash, python, ruby, ssh, perl, etc.
Wait, why not use ...
• Yes, we could have tried to beat that wall down
• Lots of effort/expertise to produce a working build of python for
the target environment
• QNX support would probably have been willing to help
• But loading new software onto the target environment to increase
its capabilities is fundamentally risky
• Business was understandably risk averse
• The DevOps team at this point was rather limited: me, myself, and I.
A little expect script
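$ cat login.expect
#!/usr/bin/expect
# Usage: login.expect <addr> <user> <pass>
set timeout 20
set addr [lindex $argv 0]
set user [lindex $argv 1]
set pass [lindex $argv 2]
# Start telnet and answer the login prompts; \r sends the Enter key.
spawn telnet $addr
expect "login:"
send "$user\r"
expect "Password:"
send "$pass\r"
# Wait for the shell prompt, then hand control to the user.
expect "#"
interact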
Adding a little abstraction
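proc login { addr user pass } {
    # spawn sets spawn_id in the local scope; declare it global so the
    # caller can keep talking to the session after login returns.
    global spawn_id
    spawn telnet $addr
    expect {
        timeout { send_user "Could not connect\n"; exit 1 }
        eof { send_user "Connection refused\n"; exit 1 }
        "login:"
    }
    send "$user\r"
    expect "Password:"
    send "$pass\r"
    expect {
        timeout { send_user "Failed to login.\n"; exit 1 }
        "#"
    }
}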
Separation of Concerns
• It only takes minor modifications to use the same logic to connect
to ftp
• Use ftp to upload deployment archive, install sh script
• Use telnet to set permissions and execute install script on archive
• Deployment logic is now separate from connecting, setup, etc.
• “talking to the target” vs “doing stuff on the target”
• This is exactly the separation chef/puppet/ansible provide
• (They also provide a whole lot of other value as well, but it’s nice to
recover any of it!)
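The split between "talking to the target" and "doing stuff on the target" can be sketched as an interface plus a deployment routine written against it. Everything here is hypothetical (the `Target` protocol, `deploy`, `RecordingTarget`, the `/tmp` staging directory); on the real project the transport was Expect driving ftp and telnet.

```python
import posixpath
from typing import List, Protocol, Tuple


class Target(Protocol):
    """Talking to the target: how files and commands reach the device."""

    def upload(self, local_path: str, remote_path: str) -> None: ...

    def run(self, command: str) -> None: ...


def deploy(target: Target, archive: str, installer: str, remote_dir: str = "/tmp") -> None:
    """Doing stuff on the target: the deployment steps, independent of transport."""
    remote_archive = posixpath.join(remote_dir, posixpath.basename(archive))
    remote_installer = posixpath.join(remote_dir, posixpath.basename(installer))
    target.upload(archive, remote_archive)        # e.g. over ftp
    target.upload(installer, remote_installer)
    target.run(f"chmod +x {remote_installer}")    # e.g. over telnet
    target.run(f"sh {remote_installer} {remote_archive}")


class RecordingTarget:
    """Stand-in transport that records what would be sent; handy for dry runs."""

    def __init__(self) -> None:
        self.log: List[Tuple[str, ...]] = []

    def upload(self, local_path: str, remote_path: str) -> None:
        self.log.append(("upload", local_path, remote_path))

    def run(self, command: str) -> None:
        self.log.append(("run", command))
```

Swapping `RecordingTarget` for a transport that really speaks ftp and telnet leaves `deploy` unchanged, which is the point of the separation.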
Towards a deployment framework
• How many environments like this are out there?
• limited tooling, embedded platform, etc.
• If there are a lot… we have the start of a deployment framework to
target these environments
• Dependencies are very minimal, can be used to target virtually
anything
• With work, we could get something idempotent with clean
modularity and composability.
• A whole lot of work… Is there a market that needs this?
Scaling for Success
and Resource Management
Resource Needs
• Embedded device with potential hardware attachments for
particular tests -- virtualization is out.
• Unit tests need to run in the target environment so one target is
needed at a minimum just for rapid feedback CI.
• Basic integration testing (i.e. devint env) takes ~1 min to ~10 mins
• Fully automated functional testing takes ~10 mins to 1+ hours to
run (i.e. test env)
• Partially automated tests require interaction, need another target.
• Longer term testing (i.e. stress, durability, performance, etc.) takes
weeks and needs its own target.
• ~5 targets minimum to support development for basic CI/CD
Tackling Resource Allocation
• If a new build kicks off and reaches deployment testing while the
previous round of smoke testing is still on-going, what happens?
• Probably: target gets bricked as OS level code is updated while the
machine is in use.
• Even if the pipeline is built carefully so these things can’t happen,
there’s always PEBKAC
• Deployment and testing tools need to be smart enough to check if
a console is available before attempting to use it
• We need a resource allocator...
Making a first pass
• Track the target state on the target
• Use an old Unix trick -- drop a lock file in a well-known spot, and
make tools attempt to acquire the lock before using the target
• Pros: Extremely simple to implement and use; it’s a really simple
pair of shell scripts.
• Cons: If the lockfile isn’t cleaned up, the target is unavailable; if the
tool (user) doesn’t check for the lock, they could still cause
problems. It’s hard to track what targets are in use where, there’s
no centralized management.
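The lock-file trick itself is small. The project used a pair of shell scripts on the target; the sketch below shows the same idea on a local filesystem, with hypothetical function names. The key detail is that creating the file and checking for it happen in one atomic step.

```python
import os


def acquire_lock(path: str, owner: str) -> bool:
    """Atomically create the lock file; return False if someone already holds it."""
    try:
        # O_CREAT | O_EXCL fails if the file exists, so check-and-create is one step.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(owner + "\n")  # record who holds the target
    return True


def release_lock(path: str) -> None:
    """Remove the lock file; a missing file is treated as already released."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass
```

The cons listed above still apply: a crashed job leaves the file behind, and nothing stops a tool that never calls `acquire_lock`.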
Aside: Jenkins Pipeline
• Specifying the pipeline in Groovy instead of shell/Jenkins XML
prevented a lot of bugs.
• acquireLock and releaseLock have simple contracts and provide
strong guarantees with the try/finally idiom.
• This is tricky/hard to achieve with traditional Jenkins.
def locking(target, action) {
    try {
        acquireLock(target)
        action()
    } finally {
        releaseLock(target)
    }
}
downloadTests(latest)
locking(targetAddr) {
    deploy(targetAddr)
    runTests(targetAddr, myBuild, testTags)
}
Multiple teams, multiple workstreams
• Goal is to reduce cycle time. If one team has to wait for another
team’s build to finish before getting feedback, we’re wasting time.
• Key takeaway: we can’t effectively share environments between
parallel streams of development.
• Business wanted ~4 streams of work progressing in parallel.
• Team needs to be able to support old releases via hotfixes (~2 old,
previous release, current stream of development).
• Hardware/firmware platform changes between releases
• Test automation team needs an environment to test their tests.
• DevOps team needs to be able to test pipeline changes.
• ~40 target machines to effectively support CI/CD pipeline.
That’s a lot of equipment...
• Where do you put it all?
• Shelving/rackspace, cooling, switches, networking…
• Units are expensive; if they aren’t in use/needed, business is going
to get annoyed.
• Hard to track utilization, load, etc. from a really decentralized
place.
• We might also be able to save money / use fewer targets if we’re
more intelligent about allocating them; i.e. allocate on demand.
• Centralization also means we can start hitting nice-to-haves:
• console access from the web browser for debugging
• status/health check daemon reporting to the manager
Centralized Resource Management
• Pool available targets, expose REST API to acquire a target for use,
release a target, check a target, etc.
• Track target status, usage metrics, target requester statistics in
backend database.
• Set up a simple frontend to display statistics about usage, provide
a manual form to acquire a target for manual/ad-hoc testing, etc.
• Like a library; acquire target for duration, get grumpy emails if it’s
not returned in time.
• Can be easily expanded to provide additional services over time.
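The core of such a manager can be sketched as an in-memory pool with library-style leases. This is an illustration under assumptions: `TargetPool`, `Lease`, and their methods are hypothetical names, and the REST API, database, and email reminders from the slide are left out.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass
class Lease:
    target: str
    holder: str
    expires_at: float  # clock value after which the lease is overdue


class TargetPool:
    """Pool of targets that are checked out for a duration and returned."""

    def __init__(self, targets: Iterable[str], clock: Callable[[], float] = time.monotonic) -> None:
        self._free: List[str] = list(targets)
        self._leases: Dict[str, Lease] = {}
        self._clock = clock

    def acquire(self, holder: str, duration_s: float) -> Optional[str]:
        """Hand out a free target, or None when every target is in use."""
        if not self._free:
            return None
        target = self._free.pop(0)
        self._leases[target] = Lease(target, holder, self._clock() + duration_s)
        return target

    def release(self, target: str) -> None:
        """Return a target to the pool; releasing an unleased target is a no-op."""
        if self._leases.pop(target, None) is not None:
            self._free.append(target)

    def check(self, target: str) -> Optional[Lease]:
        """Return the current lease on a target, or None if it is free."""
        return self._leases.get(target)

    def overdue(self) -> List[Lease]:
        """Leases past their due time: the ones that would get a grumpy email."""
        now = self._clock()
        return [lease for lease in self._leases.values() if lease.expires_at < now]
```

Injecting the clock keeps lease expiry testable without sleeping; a service wrapper would map `acquire`, `release`, and `check` onto its endpoints.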
Lightning Quick Recap
• Integrate continuously to keep software testable, increase quality,
and build confidence.
• Prioritize the delivery of working software.
• Fail early, fail often.
• Make your tools serve your needs.
• Set yourself up for success -- plan ahead to cover scaling needs.
That was fast...
• There’s a lot more I’d love to talk about.
• Please feel free to ask me questions during the break or
afterwards.
• Thanks for your time!
twitter: @arjuncomar email: arjun.comar@coveros.com

More Related Content

What's hot (18)

PDF
Better Security Testing: Using the Cloud and Continuous Delivery
Gene Gotimer
 
PDF
The Future of Security and Productivity in Our Newly Remote World
DevOps.com
 
PPTX
Experiences Bringing CD to a DoD Project
Gene Gotimer
 
PPT
Securing Apache Web Servers
Information Technology
 
PDF
Increasing Quality with DevOps
Coveros, Inc.
 
PDF
Connect Ops and Security with Flexible Web App and API Protection
DevOps.com
 
PDF
DevSecOps: What Why and How : Blackhat 2019
NotSoSecure Global Services
 
PDF
Standardizing Jenkins with CloudBees Jenkins Team
Deborah Schalm
 
PDF
PKI in DevOps: How to Deploy Certificate Automation within CI/CD
DevOps.com
 
PDF
The DevSecOps Builder’s Guide to the CI/CD Pipeline
James Wickett
 
PPTX
Tests your pipeline might be missing
Gene Gotimer
 
PPTX
Continuous Testing and New Tools for Automation - Presentation from StarWest ...
Sauce Labs
 
PDF
DevSecOps: Taking a DevOps Approach to Security
Alert Logic
 
PPT
Code Quality - Security
sedukull
 
PPTX
Rapid software testing and conformance with static code analysis
Rogue Wave Software
 
PDF
DevSecCon London 2017: Permitting agility whilst enforcing security by Alina ...
DevSecCon
 
PPTX
DevOps & Security: Here & Now
Checkmarx
 
PDF
Security in CI/CD Pipelines: Tips for DevOps Engineers
DevOps.com
 
Better Security Testing: Using the Cloud and Continuous Delivery
Gene Gotimer
 
The Future of Security and Productivity in Our Newly Remote World
DevOps.com
 
Experiences Bringing CD to a DoD Project
Gene Gotimer
 
Securing Apache Web Servers
Information Technology
 
Increasing Quality with DevOps
Coveros, Inc.
 
Connect Ops and Security with Flexible Web App and API Protection
DevOps.com
 
DevSecOps: What Why and How : Blackhat 2019
NotSoSecure Global Services
 
Standardizing Jenkins with CloudBees Jenkins Team
Deborah Schalm
 
PKI in DevOps: How to Deploy Certificate Automation within CI/CD
DevOps.com
 
The DevSecOps Builder’s Guide to the CI/CD Pipeline
James Wickett
 
Tests your pipeline might be missing
Gene Gotimer
 
Continuous Testing and New Tools for Automation - Presentation from StarWest ...
Sauce Labs
 
DevSecOps: Taking a DevOps Approach to Security
Alert Logic
 
Code Quality - Security
sedukull
 
Rapid software testing and conformance with static code analysis
Rogue Wave Software
 
DevSecCon London 2017: Permitting agility whilst enforcing security by Alina ...
DevSecCon
 
DevOps & Security: Here & Now
Checkmarx
 
Security in CI/CD Pipelines: Tips for DevOps Engineers
DevOps.com
 

Similar to DevOps in a Regulated and Embedded Environment (AgileDC) (20)

PDF
DevOps in an Embedded and Regulated Environment
TechWell
 
PDF
Continuous Delivery in a Legacy Shop - One Step at a Time
Coveros, Inc.
 
PDF
Using DevOps to Improve Software Quality in the Cloud
TechWell
 
PDF
Integrating Automated Testing into DevOps
TechWell
 
PPTX
A better faster pipeline for software delivery, even in the government
Gene Gotimer
 
PDF
Git in the Enterprise: How to succeed at DevOps using Git and a monorepo
Perforce
 
PDF
Git in the Enterprise: How to succeed at DevOps using Git and a monorepo
Gina Bustos
 
PDF
Delivering Quality at Speed with GitOps
Weaveworks
 
PDF
Be a Happier Developer with Git / Productive Team #gettinggitright
Shunsuke (Sean) Osawa
 
PDF
Facilitating continuous delivery in a FinTech world with Salt, Jenkins, Nexus...
Chocolatey Software
 
PDF
Facilitating continuous delivery in a FinTech world with Salt, Jenkins, Nexus...
Michel Buczynski
 
PDF
Continuous Delivery in a Legacy Shop—One Step at a Time
TechWell
 
PPTX
Testing in a Continuous Delivery Pipeline - Better, Faster, Cheaper
Coveros, Inc.
 
PPTX
Roslyn on GitHub
Immo Landwerth
 
PDF
DevOps Patterns to Enable Success in Microservices
Rich Mills
 
PPTX
Git in Continuous Deployment
Brett Child
 
PPTX
Automating the Quality
Dejan Vukmirovic
 
PDF
DCVCS using GIT
Pravat Sutar
 
PDF
Updated non-lab version of Level Up. Delivered at LOPSA-East, May 3, 2014.
Mandi Walls
 
PDF
A Better, Faster Pipeline for Software Delivery
Gene Gotimer
 
DevOps in an Embedded and Regulated Environment
TechWell
 
Continuous Delivery in a Legacy Shop - One Step at a Time
Coveros, Inc.
 
Using DevOps to Improve Software Quality in the Cloud
TechWell
 
Integrating Automated Testing into DevOps
TechWell
 
A better faster pipeline for software delivery, even in the government
Gene Gotimer
 
Git in the Enterprise: How to succeed at DevOps using Git and a monorepo
Perforce
 
Git in the Enterprise: How to succeed at DevOps using Git and a monorepo
Gina Bustos
 
Delivering Quality at Speed with GitOps
Weaveworks
 
Be a Happier Developer with Git / Productive Team #gettinggitright
Shunsuke (Sean) Osawa
 
Facilitating continuous delivery in a FinTech world with Salt, Jenkins, Nexus...
Chocolatey Software
 
Facilitating continuous delivery in a FinTech world with Salt, Jenkins, Nexus...
Michel Buczynski
 
Continuous Delivery in a Legacy Shop—One Step at a Time
TechWell
 
Testing in a Continuous Delivery Pipeline - Better, Faster, Cheaper
Coveros, Inc.
 
Roslyn on GitHub
Immo Landwerth
 
DevOps Patterns to Enable Success in Microservices
Rich Mills
 
Git in Continuous Deployment
Brett Child
 
Automating the Quality
Dejan Vukmirovic
 
DCVCS using GIT
Pravat Sutar
 
Updated non-lab version of Level Up. Delivered at LOPSA-East, May 3, 2014.
Mandi Walls
 
A Better, Faster Pipeline for Software Delivery
Gene Gotimer
 
Ad

More from Coveros, Inc. (7)

PDF
Which Development Metrics Should I Watch?
Coveros, Inc.
 
PDF
10 Things You Might Not Know: Continuous Integration
Coveros, Inc.
 
PDF
Better Security Testing: Using the Cloud and Continuous Delivery
Coveros, Inc.
 
PDF
Create Disposable Test Environments with Vagrant and Puppet
Coveros, Inc.
 
PDF
Compatibility Testing of Your Web Apps - Tips and Tricks for Debugging Locall...
Coveros, Inc.
 
PPTX
Tests Your Pipeline Might Be Missing
Coveros, Inc.
 
PDF
Web Application Security Testing: Kali Linux Is the Way to Go
Coveros, Inc.
 
Which Development Metrics Should I Watch?
Coveros, Inc.
 
10 Things You Might Not Know: Continuous Integration
Coveros, Inc.
 
Better Security Testing: Using the Cloud and Continuous Delivery
Coveros, Inc.
 
Create Disposable Test Environments with Vagrant and Puppet
Coveros, Inc.
 
Compatibility Testing of Your Web Apps - Tips and Tricks for Debugging Locall...
Coveros, Inc.
 
Tests Your Pipeline Might Be Missing
Coveros, Inc.
 
Web Application Security Testing: Kali Linux Is the Way to Go
Coveros, Inc.
 
Ad

Recently uploaded (20)

PPTX
Why Businesses Are Switching to Open Source Alternatives to Crystal Reports.pptx
Varsha Nayak
 
PDF
HiHelloHR – Simplify HR Operations for Modern Workplaces
HiHelloHR
 
PDF
How to Hire AI Developers_ Step-by-Step Guide in 2025.pdf
DianApps Technologies
 
PPTX
Agentic Automation Journey Session 1/5: Context Grounding and Autopilot for E...
klpathrudu
 
PDF
Open Chain Q2 Steering Committee Meeting - 2025-06-25
Shane Coughlan
 
PDF
Build It, Buy It, or Already Got It? Make Smarter Martech Decisions
bbedford2
 
PDF
Unlock Efficiency with Insurance Policy Administration Systems
Insurance Tech Services
 
PDF
SAP Firmaya İade ABAB Kodları - ABAB ile yazılmıl hazır kod örneği
Salih Küçük
 
PDF
Top Agile Project Management Tools for Teams in 2025
Orangescrum
 
PDF
SciPy 2025 - Packaging a Scientific Python Project
Henry Schreiner
 
PDF
Alexander Marshalov - How to use AI Assistants with your Monitoring system Q2...
VictoriaMetrics
 
PDF
Digger Solo: Semantic search and maps for your local files
seanpedersen96
 
PDF
Empower Your Tech Vision- Why Businesses Prefer to Hire Remote Developers fro...
logixshapers59
 
PPTX
Tally software_Introduction_Presentation
AditiBansal54083
 
PPTX
Finding Your License Details in IBM SPSS Statistics Version 31.pptx
Version 1 Analytics
 
PDF
The 5 Reasons for IT Maintenance - Arna Softech
Arna Softech
 
PDF
Why Businesses Are Switching to Open Source Alternatives to Crystal Reports.pdf
Varsha Nayak
 
PPTX
Agentic Automation: Build & Deploy Your First UiPath Agent
klpathrudu
 
PPTX
Home Care Tools: Benefits, features and more
Third Rock Techkno
 
PPTX
Foundations of Marketo Engage - Powering Campaigns with Marketo Personalization
bbedford2
 
Why Businesses Are Switching to Open Source Alternatives to Crystal Reports.pptx
Varsha Nayak
 
HiHelloHR – Simplify HR Operations for Modern Workplaces
HiHelloHR
 
How to Hire AI Developers_ Step-by-Step Guide in 2025.pdf
DianApps Technologies
 
Agentic Automation Journey Session 1/5: Context Grounding and Autopilot for E...
klpathrudu
 
Open Chain Q2 Steering Committee Meeting - 2025-06-25
Shane Coughlan
 
Build It, Buy It, or Already Got It? Make Smarter Martech Decisions
bbedford2
 
Unlock Efficiency with Insurance Policy Administration Systems
Insurance Tech Services
 
SAP Firmaya İade ABAB Kodları - ABAB ile yazılmıl hazır kod örneği
Salih Küçük
 
Top Agile Project Management Tools for Teams in 2025
Orangescrum
 
SciPy 2025 - Packaging a Scientific Python Project
Henry Schreiner
 
Alexander Marshalov - How to use AI Assistants with your Monitoring system Q2...
VictoriaMetrics
 
Digger Solo: Semantic search and maps for your local files
seanpedersen96
 
Empower Your Tech Vision- Why Businesses Prefer to Hire Remote Developers fro...
logixshapers59
 
Tally software_Introduction_Presentation
AditiBansal54083
 
Finding Your License Details in IBM SPSS Statistics Version 31.pptx
Version 1 Analytics
 
The 5 Reasons for IT Maintenance - Arna Softech
Arna Softech
 
Why Businesses Are Switching to Open Source Alternatives to Crystal Reports.pdf
Varsha Nayak
 
Agentic Automation: Build & Deploy Your First UiPath Agent
klpathrudu
 
Home Care Tools: Benefits, features and more
Third Rock Techkno
 
Foundations of Marketo Engage - Powering Campaigns with Marketo Personalization
bbedford2
 

DevOps in a Regulated and Embedded Environment (AgileDC)

  • 1. © COPYRIGHT 2016 COVEROS, INC. ALL RIGHTS 1 Agility. Security. Delivered. DevOps in a Regulated and Embedded Environment By: Arjun Comar (Was DevOps on a Legacy Project) twitter: @arjuncomar email: [email protected]
  • 2. © COPYRIGHT 2016 COVEROS, INC. ALL RIGHTS 2 Agenda • About Me • Agile, DevOps, and Medical Devices: What’s the Problem? • Git Flow in a Regulated World • Expect to Deploy • Scaling for Success and Resource Management • Questions twitter: @arjuncomar email: [email protected]
  • 3. © COPYRIGHT 2016 COVEROS, INC. ALL RIGHTS RESERVED. 3 About Me • B.S. in Computer Science from the Rose- Hulman Institute of Technology • Worked on everything from the Linux kernel to computer vision. • Interested in software quality and correctness. • Been with Coveros for ~2.5 years. • Run the local HaskellDC meetup group. twitter: @arjuncomar email: [email protected]
  • 4. © COPYRIGHT 2016 COVEROS, INC. ALL RIGHTS RESERVED. 4 About Coveros • Coveros builds security-critical applications using agile methods. • Coveros Services • Agile transformations • Agile development and testing • DevOps and continuous integration • Application security analysis • Agile & Security training • Government qualifications • DCAA approved rates and accounting • TS facility clearance Areas of Expertise twitter: @arjuncomar email: [email protected]
  • 5. © COPYRIGHT 2016 COVEROS, INC. ALL RIGHTS RESERVED. 5 Select Clients twitter: @arjuncomar email: [email protected]
  • 6. © COPYRIGHT 2016 COVEROS, INC. ALL RIGHTS 6 Medical Devices and the Law • It isn’t sufficient to write the code, release requires regulatory approval. • Approval is per feature (epic) • Contingent on development, testing, risk mitigation, etc. • We want short-lived branches, but… • If we don’t get approval for one feature, business still wants to release the others • Unmerge all the feature branches that went into an epic? • Further requirements around documentation, especially: • Design • Testing • Risk Management twitter: @arjuncomar email: [email protected]
  • 7. © COPYRIGHT 2016 COVEROS, INC. ALL RIGHTS 7 Legacy Problems • C code, embedded device target • cross compilation: Windows -> QNX • Some modules only built on WinXP • Manual build, deploy, test process • Custom hardware, custom firmware • Old codebase, not written to be unit tested • Unit test execution requires target environment • Rough order of magnitude, 200 kloc codebase • Hardware platform ~25 years old twitter: @arjuncomar email: [email protected]
  • 8. © COPYRIGHT 2016 COVEROS, INC. ALL RIGHTS 8 Integration and Deployment • Manual builds, deploy to unit test? • Unmaintained deployment scripts • Written by a contractor in ksh, • Last maintainer had already left the company • Working deployments flashed unit with usb stick and physical dongle • Rewrite with Chef? ...Ansible? … Bash? • try: sh run over telnet • No ruby, python, perl, bash, ssh, dhcp • Network deployments/updates to a device that goes in a human being…? twitter: @arjuncomar email: [email protected]
  • 9. © COPYRIGHT 2016 COVEROS, INC. ALL RIGHTS 9 Feedback Cycles • Deployments took ~30 minutes and required physical interaction through the process • Testing involved long protocols with detailed and very particular steps • ~5-6 weeks for the test team, maybe 8 weeks, but at least 3-4. • Release cycle on the order of years. twitter: @arjuncomar email: [email protected]
  • 10. © COPYRIGHT 2016 COVEROS, INC. ALL RIGHTS 10 Resource Needs and Team Size • Business wanted multiple features in development in parallel • Different tests take different lengths of time to run • even when automated • seconds -> weeks • Business needed 4 teams like the one they had • Continuous integration targets, unit test targets, deployment testing targets, full functional test targets, partially automated test targets • Performance, reliability, security, durability, etc.? twitter: @arjuncomar email: [email protected]
  • 11. © COPYRIGHT 2016 COVEROS, INC. ALL RIGHTS 11 Solutions One thing at a time... twitter: @arjuncomar email: [email protected]
  • 12. © COPYRIGHT 2016 COVEROS, INC. ALL RIGHTS 12 Git Flow in a Regulated World twitter: @arjuncomar email: [email protected]
  • 13. © COPYRIGHT 2016 COVEROS, INC. ALL RIGHTS 13 Git Workflows • Linux Kernel: benevolent dictator, many trusted lieutenants, an insane number of contributors. • GitHub: Single (or small team) of maintainers, contributors submit pull-requests • Corporate git usage: Trusted team of developers, co-maintain shared repository twitter: @arjuncomar email: [email protected]
  • 14. © COPYRIGHT 2016 COVEROS, INC. ALL RIGHTS 14 Enter: Git Flow twitter: @arjuncomar email: [email protected]
  • 15. © COPYRIGHT 2016 COVEROS, INC. ALL RIGHTS 15 But I can’t merge back daily... • No, really. Daily merges back to develop means pulling an epic out requires a virtually impossible unmerge. • Might be legally required not to go forward with a feature • Can’t get approval until feature is developed and tested with known risks documented and mitigated • Business still wants to release what they can twitter: @arjuncomar email: [email protected]
  • 16. © COPYRIGHT 2016 COVEROS, INC. ALL RIGHTS 16 Can’t not integrate... • Long lived lines of development, all separate • Tested independently prior to release • Business wants to release, integrate necessary branches and… • Disaster: merge conflicts, retest everything, unknown interactions everywhere twitter: @arjuncomar email: [email protected]
  • 17. Extending the workflow to deal with regulation • Extend the git flow model • Keep epic-specific code in ‘develop/epic-name’ branches • Use ‘feature/epic-name/feature-name’ branches for daily work • Merge these back daily! • Epic branches get merged back for a release
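The naming convention on this slide implies a simple rule for where each branch merges. A minimal sketch of that rule, assuming branch names follow the two patterns exactly; the function name `merge_target` and the example epic and feature names are hypothetical, not part of the talk:

```python
def merge_target(branch: str) -> str:
    """Return the branch that `branch` merges back into under the extended model."""
    parts = branch.split("/")
    if parts[0] == "feature" and len(parts) == 3:
        # feature/<epic>/<feature> merges back daily into its epic branch.
        return f"develop/{parts[1]}"
    if parts[0] == "develop" and len(parts) == 2:
        # develop/<epic> merges into develop only when the epic is released.
        return "develop"
    raise ValueError(f"not an epic or feature branch: {branch!r}")


# Hypothetical branch names, for illustration only.
print(merge_target("feature/dosing/alarm-ui"))
print(merge_target("develop/dosing"))
```

Daily work still lands somewhere every day (the epic branch), while the decision to put an epic into develop stays tied to its approval.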
  • 18. Integrating Continuously • Use tooling to manage the problem for you • Have Jenkins (or your CI stack of choice) do builds by merging develop with the epic branches first • develop holds code that will be released, features that conflict must be fixed • Run the normal deployment and testing cycle on these builds • merge conflicts are failed builds
  • 19. Integrating even more continuously • Still need to know if there are potential conflicts between epic branches • fail early, fail often, right? • Take all the epic branches and merge them with develop • Run a full build/deploy/test cycle on this mess as well. • Any failures found -> failed build • If it doesn’t cleanly merge, we can’t release, right? • The software should always be ready to release; make it a business decision, not a technical one.
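Taken together, the last two slides describe two kinds of CI build: develop merged with each epic on its own, and develop merged with every epic at once. A rough sketch of how a CI job might enumerate those merges; the function `integration_builds` and the epic names are illustrative assumptions, and the actual merging and building is left to the CI tool:

```python
from typing import List, Tuple


def integration_builds(epics: List[str], trunk: str = "develop") -> List[Tuple[str, ...]]:
    """List the branch combinations CI should merge, build, and test."""
    # One build per epic: trunk merged with that epic alone.
    plans: List[Tuple[str, ...]] = [(trunk, epic) for epic in epics]
    # One extra build with every epic merged in, to surface cross-epic conflicts.
    if len(epics) > 1:
        plans.append((trunk, *epics))
    return plans


# Hypothetical epic branches.
for plan in integration_builds(["develop/epic-a", "develop/epic-b"]):
    print(" + ".join(plan))
```

A merge conflict in any of these combinations would be treated as a failed build, exactly like a failing test.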
  • 20. Digging deeper to unearth conflict • Better error detection and reporting: • If we merge everything together, it looks like the later branches cause conflicts more often • Branches that conflict exclude each other • Find conflicting pairs and report them both as failed • Conflicts may only show up with the interaction of 3+ branches • But this gets exponentially hard to detect
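To put numbers on "exponentially hard": checking every pair of n branches takes n(n-1)/2 trial merges, while checking every combination of two or more branches takes 2^n - n - 1. A small sketch of that arithmetic (the helper names are made up for illustration):

```python
from math import comb


def pair_checks(n: int) -> int:
    """Trial merges needed to test every pair of n branches."""
    return comb(n, 2)


def all_subset_checks(n: int) -> int:
    """Trial merges needed to test every subset of size >= 2."""
    # 2**n subsets in total, minus the empty set and the n single-branch subsets.
    return 2**n - n - 1


for n in (4, 8, 16):
    print(f"{n} branches: {pair_checks(n)} pairs, {all_subset_checks(n)} subsets")
```

With 16 epic branches that is 120 pairwise merges against 65,519 subset merges, which is why stopping at pairs is a reasonable compromise.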
  • 21. Do what you can • Merge all possible epic branch pairs together, track+report failures • Report these failures once or the team will ignore you... • Branches that cleanly merge with everything get merged together with development and built • This assesses the health of the software as it exists at this moment • This might be expensive, so do it overnight. • Shortcuts: • If ‘A’ merges with ‘B’, then ‘B’ merges with ‘A’ • ‘A’ always merges with ‘A’ • (You only need the top half of the n x n matrix)
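The pairwise check and its shortcuts can be sketched as below. This is only an outline under stated assumptions: `merges_cleanly` is a hypothetical callback standing in for whatever trial merge the CI job performs, and the branch names are invented. `itertools.combinations` yields each unordered pair once and never pairs a branch with itself, which is the "top half of the matrix".

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple


def check_pairs(
    branches: Sequence[str],
    merges_cleanly: Callable[[str, str], bool],
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Return (conflicting pairs, branches that merge cleanly with everything)."""
    conflicts = [
        (a, b) for a, b in combinations(branches, 2) if not merges_cleanly(a, b)
    ]
    # Both members of a conflicting pair are reported as failed.
    conflicted = {name for pair in conflicts for name in pair}
    clean = [b for b in branches if b not in conflicted]
    return conflicts, clean


# Stand-in for a real trial merge: pretend only epic-a and epic-c conflict.
known_conflicts = {frozenset({"epic-a", "epic-c"})}
conflicts, clean = check_pairs(
    ["epic-a", "epic-b", "epic-c", "epic-d"],
    lambda a, b: frozenset({a, b}) not in known_conflicts,
)
print(conflicts, clean)
```

The `clean` list is the set of branches that would then be merged together with develop for the nightly health build.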
  • 22. This is a lot of work... • Long-lived branches are hard to deal with. • You could even go further and build the sets of conflicting branches that can be merged together • This is really hard; it’s easier to ask the team to fix the mess. • If you don’t have to do it, don’t. • You probably don’t unless regulatory constraints make you.
  • 23. Expect to Deploy What a lifesaver
  • 24. Expect? • Tcl-based scripting language used to automate interactive programs • ...like telnet and ftp • Was used to automate testing way back in the day • Turns out to be rather perfect for scripting deployments, testing, etc. in this tool-restricted environment • available: sh, ksh, telnet, ftp • not available: bash, python, ruby, ssh, perl, etc.
  • 25. Wait, why not use ... • Yes, we could have tried to beat that wall down • Lots of effort/expertise to produce a working build of python for the target environment • QNX support would probably have been willing to help • But loading new software onto the target environment to increase its capabilities is fundamentally risky • Business was understandably risk averse • The DevOps team at this point was rather limited: me, myself, and I.
  • 26. A little expect script
      $ cat login.expect
      #!/usr/bin/expect
      # Usage: login.expect ADDR USER PASS
      set timeout 20
      set addr [lindex $argv 0]
      set user [lindex $argv 1]
      set pass [lindex $argv 2]
      spawn telnet $addr
      # Answer the login and password prompts, then wait for the root shell prompt.
      expect "login:"
      send "$user\r"
      expect "Password:"
      send "$pass\r"
      expect "#"
      # Hand the session over to the user.
      interact
  • 27. Adding a little abstraction
      # Log in over telnet; exit with an error if the target is unreachable or rejects the login.
      proc login { addr user pass } {
          spawn telnet $addr
          expect {
              timeout { send_user "Could not connect\n"; exit 1 }
              eof { send_user "Connection refused\n"; exit 1 }
              "login:"
          }
          send "$user\r"
          expect "Password:"
          send "$pass\r"
          expect {
              timeout { send_user "Failed to login.\n"; exit 1 }
              "#"
          }
      }
  • 28. Separation of Concerns • It only takes minor modifications to use the same logic to connect to ftp • Use ftp to upload the deployment archive and the install sh script • Use telnet to set permissions and execute the install script on the archive • Deployment logic is now separate from connecting, setup, etc. • “talking to the target” vs “doing stuff on the target” • This is exactly the separation chef/puppet/ansible provide • (They provide a whole lot of other value as well, but it’s nice to recover any of it!)
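The split between "talking to the target" and "doing stuff on the target" can be illustrated independently of Expect. In this sketch the deployment steps only see two callables, `upload` and `run`; in the talk's setup these would be backed by the ftp and telnet Expect procedures. All names, the `/tmp` staging directory, and the `chmod +x` step are assumptions for illustration.

```python
import os
import posixpath
from typing import Callable

Upload = Callable[[str, str], None]  # (local_path, remote_path)
Run = Callable[[str], None]          # (shell command on the target)


def deploy(upload: Upload, run: Run, archive: str, installer: str,
           remote_dir: str = "/tmp") -> None:
    """Deployment logic only; how we reach the target is the caller's concern."""
    remote_archive = posixpath.join(remote_dir, os.path.basename(archive))
    remote_installer = posixpath.join(remote_dir, os.path.basename(installer))
    # Transfer step (ftp in the talk).
    upload(archive, remote_archive)
    upload(installer, remote_installer)
    # Remote command step (telnet in the talk).
    run(f"chmod +x {remote_installer}")
    run(f"{remote_installer} {remote_archive}")


# Dry run with recording stand-ins instead of real connections.
log = []
deploy(lambda src, dst: log.append(("put", src, dst)),
       lambda cmd: log.append(("run", cmd)),
       "build/app.tar", "install.sh")
for entry in log:
    print(entry)
```

Swapping the transport (or a dry-run recorder, as here) does not touch the deployment steps, which is the property the slide is pointing at.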
  • 29. Towards a deployment framework • How many environments like this are out there? • limited tooling, embedded platform, etc. • If there are a lot… we have the start of a deployment framework to target these environments • Dependencies are very minimal, can be used to target virtually anything • With work, we could get something idempotent with clean modularity and composability. • A whole lot of work… Is there a market that needs this?
  • 30. Scaling for Success and Resource Management
  • 31. Resource Needs • Embedded device with potential hardware attachments for particular tests -- virtualization is out. • Unit tests need to run in the target environment, so at least one target is needed just for rapid-feedback CI. • Basic integration testing (i.e. devint env) takes ~1 min to ~10 mins • Fully automated functional testing (i.e. test env) takes ~10 mins to 1+ hours to run • Partially automated tests require interaction, need another target. • Longer term testing (i.e. stress, durability, performance, etc.) takes weeks and needs its own target. • ~5 targets minimum to support development for basic CI/CD
  • 32. Tackling Resource Allocation • If a new build kicks off and reaches deployment testing while the previous round of smoke testing is still ongoing, what happens? • Probably: target gets bricked as OS-level code is updated while the machine is in use. • Even if the pipeline is built carefully so these things can’t happen, there’s always PEBKAC • Deployment and testing tools need to be smart enough to check if a console is available before attempting to use it • We need a resource allocator...
  • 33. Making a first pass • Track the target state on the target • Use an old Unix trick -- drop a lock file in a well-known spot, and make tools attempt to acquire the lock before using the target • Pros: Extremely simple to implement and use; it’s a really simple pair of shell scripts. • Cons: If the lockfile isn’t cleaned up, the target is unavailable; if the tool (user) doesn’t check for the lock, they could still cause problems. It’s hard to track which targets are in use where; there’s no centralized management.
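The lock-file trick depends on creating the file atomically, so that two tools racing for the same target cannot both succeed. The talk used a pair of shell scripts on the target itself; the sketch below shows the same idea on a local filesystem in Python, using `O_CREAT | O_EXCL` for the atomic create. The function names, the lock path, and the owner string are assumptions.

```python
import os


def acquire_lock(path: str, owner: str) -> bool:
    """Try to take the lock; return False if someone else already holds it."""
    try:
        # O_EXCL makes creation fail if the file already exists, so the
        # existence check and the create happen as one atomic step.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(owner + "\n")  # record who holds the target, for debugging
    return True


def release_lock(path: str) -> None:
    """Drop the lock; a missing file is treated as already released."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass
```

This has the same weaknesses the slide lists: a crashed job leaves a stale lock behind, and nothing stops a tool that never calls `acquire_lock`.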
  • 34. Aside: Jenkins Pipeline • Specifying the pipeline in Groovy instead of shell/Jenkins XML prevented a lot of bugs. • acquireLock and releaseLock have simple contracts and provide strong guarantees with the try/finally idiom. • This is tricky/hard to achieve with traditional Jenkins.
      def locking(target, action) {
          try {
              acquireLock(target)
              action()
          } finally {
              releaseLock(target)
          }
      }

      downloadTests(latest)
      locking(targetAddr) {
          deploy(targetAddr)
          runTests(targetAddr, myBuild, testTags)
      }
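For readers more familiar with Python, the same try/finally guarantee can be expressed as a context manager. This is an analogue, not the pipeline code from the talk; `acquire` and `release` are passed in as hypothetical callables. One deliberate difference: the acquire call sits outside the `try`, so a failed acquire does not release a lock held by someone else.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def locking(target: str,
            acquire: Callable[[str], None],
            release: Callable[[str], None]) -> Iterator[str]:
    """Hold the target lock for the duration of the with-block."""
    acquire(target)  # if this raises, we never held the lock, so nothing to release
    try:
        yield target
    finally:
        # Runs whether the block finished normally or raised.
        release(target)


# Stand-in lock functions that just record what happened.
events = []
try:
    with locking("target-1", lambda t: events.append(f"acquire {t}"),
                 lambda t: events.append(f"release {t}")):
        raise RuntimeError("deployment failed")
except RuntimeError:
    pass
print(events)
```

Even though the block raises, the release is recorded, which is the guarantee the slide attributes to the try/finally idiom.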
  • 35. Multiple teams, multiple workstreams • Goal is to reduce cycle time. If one team has to wait for another team’s build to finish before getting feedback, we’re wasting time. • Key takeaway: we can’t effectively share environments between parallel streams of development. • Business wanted ~4 streams of work progressing in parallel. • Team needs to be able to support old releases via hotfixes (~2 old, previous release, current stream of development). • Hardware/firmware platform changes between releases • Test automation team needs an environment to test their tests. • DevOps team needs to be able to test pipeline changes. • ~40 target machines to effectively support the CI/CD pipeline.
  • 36. That’s a lot of equipment... • Where do you put it all? • Shelving/rackspace, cooling, switches, networking… • Units are expensive; if they aren’t in use/needed, business is going to get annoyed. • Hard to track utilization, load, etc. from a really decentralized place. • We might also be able to save money / use fewer targets if we’re more intelligent about allocating them; i.e. allocate on demand. • Centralization also means we can start hitting nice-to-haves: • console access from the web browser for debugging • status/health check daemon reporting to the manager
  • 37. Centralized Resource Management • Pool available targets, expose REST API to acquire a target for use, release a target, check a target, etc. • Track target status, usage metrics, target requester statistics in backend database. • Set up a simple frontend to display statistics about usage, provide a manual form to acquire a target for manual/ad-hoc testing, etc. • Like a library; acquire target for duration, get grumpy emails if it’s not returned in time. • Can be easily expanded to provide additional services over time.
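The core of such a manager is a small leasing pool; the REST API, database, and frontend would wrap around it. The following in-memory sketch is an assumption about how the "library" model could look, not the system described in the talk: the class name `TargetPool`, its methods, and the lease fields are all hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, Tuple


@dataclass
class Lease:
    holder: str
    expires_at: float


class TargetPool:
    """In-memory pool of targets that are checked out for a fixed duration."""

    def __init__(self, targets: Iterable[str],
                 clock: Callable[[], float] = time.time) -> None:
        self._leases: Dict[str, Optional[Lease]] = {t: None for t in targets}
        self._clock = clock

    def acquire(self, holder: str, duration_s: float) -> Optional[str]:
        """Lease the first free target, or return None if all are in use."""
        for target, lease in self._leases.items():
            if lease is None:
                self._leases[target] = Lease(holder, self._clock() + duration_s)
                return target
        return None

    def release(self, target: str, holder: str) -> bool:
        """Return a target; only the current holder may release it."""
        lease = self._leases.get(target)
        if lease is None or lease.holder != holder:
            return False
        self._leases[target] = None
        return True

    def overdue(self) -> List[Tuple[str, str]]:
        """(target, holder) pairs whose lease has expired: the grumpy-email list."""
        now = self._clock()
        return [(t, l.holder) for t, l in self._leases.items()
                if l is not None and l.expires_at < now]
```

A job would call `acquire` before deploying and `release` when done, while a periodic task could walk `overdue()` to chase holders. Unlike the lock file, the state lives in one place, so utilization can be measured.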
  • 38. Lightning Quick Recap • Integrate continuously to keep software testable, increase quality, and build confidence. • Prioritize the delivery of working software. • Fail early, fail often. • Make your tools serve your needs. • Set yourself up for success -- plan ahead to cover scaling needs.
  • 39. That was fast... • There’s a lot more I’d love to talk about. • Please feel free to ask me questions during the break or afterwards. • Thanks for your time!