Continuous failure
Why do we make our lives hard?
Krisztian Papp
1
Dream project
Requirements are clear
Deadlines are far
Teams are performing well
The domain is exciting
Tests are green
2
Reality
Requirements are unclear and changing
Deadlines are near
Teams are delivering too slowly
The domain is boring
What tests?
3
Contributing factors
Project management
Funding
Architecture
Engineering leadership
4
About me
Based @ Hungary
Principal Software Engineer @ Diligent
<3 Neovim
Enemy of the mutable state
Private pilot
5
Let's talk about failure
Failure is inevitable
Is necessary to grow
As long as you can learn from it
My failure < Others'
6
Recipe for failure: Minestrone mishap
Start with good intentions: "I'll make delicious minestrone!"
Ignore the recipe: "I know how to cook, I don't need instructions"
Substitute ingredients randomly: "Chicken broth instead of vegetable stock - what could go wrong?"
Skip the prep work: "Who has time to properly chop vegetables?"
Add everything at once: "Efficiency over process!"
Result: A sad, watery chicken broth that tastes nothing like minestrone
7
From Theory to Reality
8
A Perfect Storm
The Setup:
Flagship product rewrite
40 engineers, 5 domain teams
Modern tech stack chosen
Leadership fully committed
9
How do you turn all these advantages into a disaster?
10
Spoiler: One decision at a time.
11
The Phoenix Project Come to Life
12
13
How can this be?
14
Problems
Decision-making failures
Trust and collaboration issues
Planning and process weaknesses
Technical and design challenges
Product and QA gaps
15
The decision journey: Early days
Month 1-2: Architectural foundations
Modular monolith - sounds reasonable
DynamoDB single table - modern and scalable!
Ephemeral cloud environments - consistency with prod!
Rely on integration tests - faster delivery!
16
The decision journey: Reality check
Month 6: Reality hits
Separate DBs per domain - enforce boundaries
PostgreSQL - DynamoDB too complex, switch back
Still no local env - developers frustrated
Integration tests slow - no unit tests to catch issues early
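One way to get feedback earlier than a slow integration suite allows is to keep domain rules in pure functions and cover them with unit tests that need no database and no cloud environment. A minimal sketch, assuming a hypothetical `Document` type and `can_approve` rule that are not taken from the project:

```python
import unittest
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical domain object, used only for illustration."""
    author: str
    status: str  # "draft" or "submitted"


def can_approve(doc: Document, reviewer: str) -> bool:
    """Pure rule: a submitted document can be approved by anyone but its author."""
    return doc.status == "submitted" and reviewer != doc.author


class CanApproveTest(unittest.TestCase):
    def test_author_cannot_approve_own_document(self):
        self.assertFalse(can_approve(Document("ana", "submitted"), "ana"))

    def test_other_reviewer_can_approve_submitted_document(self):
        self.assertTrue(can_approve(Document("ana", "submitted"), "bela"))

    def test_draft_cannot_be_approved(self):
        self.assertFalse(can_approve(Document("ana", "draft"), "bela"))
```

Tests like these run in milliseconds on a laptop, so the integration suite can be reserved for wiring and persistence rather than business rules.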
17
But technical decisions weren't the only problem...
18
Tasks without description
Unclear what should be done under a story
No breakdown
Developers making assumptions
"It should behave as the old one"
19
Few code owners
3 code owners for 40 engineers
Slow PR reviews
Huge PRs because no task breakdown
20
No coding convention document
No standards apart from linters
Discussions keep happening on PRs
Slowing down onboarding
21
Changing requirements
Standard in our industry
Leads to rework and delays
Makes planning and estimation difficult
22
No ADRs about the decisions
Unclear why things are the way they are
No time to do that
Lack of accountability for key choices
23
As release approached, more issues emerged…
24
No production deployment possible
2 weeks before the release
Should be available in multiple regions
Critical issues discovered too late
25
E2E tests constantly break
E2E tests were not part of the pipeline
Constantly changing UI
No feedback on quality
26
No performance testing
Discovered 1 week before release
Database couldn't handle the load
Assumptions about scalability were wrong
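Scalability assumptions can be checked long before release with even a crude load test. A minimal sketch, assuming a hypothetical `handle_request` stand-in; a real test would call the deployed system with production-sized data instead:

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(payload: int) -> int:
    """Stand-in for the real call (HTTP request, DB query); hypothetical."""
    time.sleep(0.001)  # simulate about 1 ms of work
    return payload * 2


def timed_call(payload: int) -> float:
    """Return the latency of one call in seconds."""
    start = time.perf_counter()
    handle_request(payload)
    return time.perf_counter() - start


def run_load(requests: int, concurrency: int) -> dict:
    """Fire `requests` calls with `concurrency` workers and summarize latency."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(timed_call, range(requests)))
    cuts = statistics.quantiles(latencies, n=20)  # 19 cut points; the last is p95
    return {
        "count": len(latencies),
        "median_s": statistics.median(latencies),
        "p95_s": cuts[-1],
    }


if __name__ == "__main__":
    print(run_load(requests=200, concurrency=20))
```

Running a script like this in the pipeline from the first months, with the concurrency raised step by step, would surface a database bottleneck while there is still time to act on it.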
27
What have we learned?
Failure is necessary for growth
Individual decisions can lead to systemic failures
Systemic problems require systemic solutions
28
Solutions
29
Prioritize your problems
Car with oil leak heading towards a cliff
Stop the car first
Change direction second
Fix the oil leak later
In software terms:
Fix critical production issues first
Address major architectural problems next
Improve developer experience after
30
Improve decision-making
Avoid premature optimization
Document decisions using ADRs
Defer decisions
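Writing an ADR takes minutes if the structure is fixed up front. A minimal sketch of a record with the Status, Context, Decision, and Consequences sections; the `Adr` class, its `filename` naming scheme, and the plain-text layout are assumptions for illustration, not a standard tool:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Adr:
    """One architecture decision record (hypothetical helper)."""
    number: int
    title: str
    status: str
    context: str
    decision: str
    consequences: list[str] = field(default_factory=list)

    def filename(self) -> str:
        """Build a sortable file name; the naming scheme is an assumption."""
        slug = "-".join(self.title.lower().split())
        return f"{self.number:04d}-{slug}.md"

    def render(self) -> str:
        """Render the record as plain text, one section per heading."""
        lines = [
            f"{self.number}. {self.title}",
            "",
            "Status",
            self.status,
            "",
            "Context",
            self.context,
            "",
            "Decision",
            self.decision,
            "",
            "Consequences",
        ]
        lines += [f"- {item}" for item in self.consequences]
        return "\n".join(lines)


if __name__ == "__main__":
    adr = Adr(
        number=1,
        title="Ephemeral cloud environments",
        status="Accepted",
        context="We want development environments consistent with production.",
        decision="Developers use ephemeral cloud environments managed via IaC.",
        consequences=["Slower deployments", "Higher costs", "Learning curve"],
    )
    print(adr.filename())
    print(adr.render())
```

Keeping the records next to the code, one file per decision, means the "why" is reviewed in the same pull request as the change it explains.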
31
Example ADR: Ephemeral Cloud Environments
Status
Accepted
Context
To ensure consistency with production, we will not use local development environments but ephemeral cloud environments.
Decision
Developers will use ephemeral cloud environments managed via Infrastructure-as-Code (IaC).
Consequences
Benefits: Consistent with production, centralized management
Trade-offs: Slower deployments, higher costs, learning curve
The real impact:
30-second local feedback → 15-minute cloud deployment
$1000/month pluggable env → $300/month per developer
Simple debugging → Complex cloud-based workflows
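The figures above become more tangible when scaled to the whole project. A rough sketch, assuming all 40 engineers from the setup each get their own environment, that the $1000/month figure is a total rather than a per-developer cost, and a hypothetical 10 deploy-and-check cycles per developer per day:

```python
ENGINEERS = 40               # from the project setup
BASELINE_COST = 1000         # USD per month, read here as a total (assumption)
PER_DEV_ENV_COST = 300       # USD per month per developer, from the slide
LOCAL_FEEDBACK_S = 30        # seconds, from the slide
CLOUD_FEEDBACK_S = 15 * 60   # seconds, from the slide
CYCLES_PER_DAY = 10          # assumption, not a measured value


def monthly_env_cost(engineers: int, per_dev_cost: int) -> int:
    """Total monthly cost when every engineer has a dedicated environment."""
    return engineers * per_dev_cost


def daily_wait_minutes(cycles: int, feedback_seconds: int) -> float:
    """Minutes one developer spends waiting for feedback per day."""
    return cycles * feedback_seconds / 60


if __name__ == "__main__":
    print("cloud envs, USD/month:", monthly_env_cost(ENGINEERS, PER_DEV_ENV_COST))
    print("baseline, USD/month:", BASELINE_COST)
    print("local wait, min/day:", daily_wait_minutes(CYCLES_PER_DAY, LOCAL_FEEDBACK_S))
    print("cloud wait, min/day:", daily_wait_minutes(CYCLES_PER_DAY, CLOUD_FEEDBACK_S))
```

Under these assumptions the environments cost $12,000 per month, and each developer waits 150 minutes a day instead of 5.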
32
Building Better Teams
33
Build trust and collaboration
Promote shared code ownership and open contributions
Involve team members in decision-making processes
Break down communication silos between teams with regular cross-team syncs
34
Process & Planning Excellence
35
Enhance planning and processes
Define clear goals, "definition of done," and prioritize tasks
Create detailed requirements to reduce ambiguity
Use short iterations, regular retrospectives, and incremental delivery
Empower developers to say "no" to unclear or incomplete tasks
36
Technical Foundation
37
Strengthen technical and architectural practices
Design flexible, maintainable architectures
Set up robust local development environments
Integrate reliable automated tests into the CI/CD pipeline
Establish and enforce coding conventions
38
Quality & Production Readiness
39
Align QA and development
Collaborate early with QA on testable requirements
Make E2E tests reliable
Stabilize UX designs before development starts
40
Establish production readiness checklists
Create comprehensive pre-deployment checklists
Include performance, security, monitoring, and scalability criteria
Validate production readiness early and continuously
Use frameworks like AWS Well-Architected for guidance
Make production readiness a team responsibility, not just ops
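A checklist is easier to validate continuously when it is data that the pipeline can evaluate. A minimal sketch, assuming hypothetical `Check` entries; the required areas follow the criteria listed above, while the individual check descriptions are illustrative only:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    """One readiness check (hypothetical structure)."""
    area: str         # e.g. "performance", "security"
    description: str
    passed: bool


REQUIRED_AREAS = {"performance", "security", "monitoring", "scalability"}


def failing_checks(checks: list[Check]) -> list[Check]:
    """Return every check that has not passed."""
    return [c for c in checks if not c.passed]


def is_ready(checks: list[Check], required_areas: set[str]) -> bool:
    """Ready only if each required area is covered and no check fails."""
    covered = {c.area for c in checks}
    return required_areas <= covered and not failing_checks(checks)


if __name__ == "__main__":
    checks = [
        Check("performance", "Load test run against production-sized data", False),
        Check("security", "Dependency scan is clean", True),
        Check("monitoring", "Alerts are routed to the on-call team", True),
        Check("scalability", "Deployment verified in every target region", True),
    ]
    print("ready:", is_ready(checks, REQUIRED_AREAS))
    for check in failing_checks(checks):
        print("blocking:", check.area, "-", check.description)
```

Evaluating this on every build, rather than two weeks before release, turns production readiness into a visible, shared status instead of a late surprise.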
41
The pattern of failure
Small decisions compound into big failures
Good intentions aren't enough without good process
Systemic problems require systemic solutions
Prevention is better than crisis management
42
Start somewhere, start today
Pick one practice from today's solutions
Document your next architectural decision
Improve one feedback loop in your team
Empower developers to push back on unclear tasks
Remember: Learning from others' failures < Your own
43
Further reading
The Phoenix Project by Gene Kim, Kevin Behr, and George Spafford
Accelerate: The Science of Lean Software and DevOps by Nicole Forsgren, Jez Humble, and Gene Kim
Clean Architecture by Robert C. Martin (Uncle Bob)
AWS Well-Architected Framework: https://aws.amazon.com/architecture/well-architected/
Architectural Decision Records: https://adr.github.io/
44
Thank you!
45
