Multi-Cloud is Multi-Headache
When I was younger, my family ran a small IT school. We bought some computers, named them "servers", installed Linux, and started offering services.
On weekends, we would literally turn them off.
Ridiculous now, right? But it shows how much things have changed: today's customer expectations are brutal. Every service is supposed to be always available, no excuses. Even the smallest platform is expected to perform like the internet giants.
TL;DR: the world went from that home-made QoS to the promise of multi-cloud.
The promise
On paper, it's brilliant:
Run workloads across multiple providers. Avoid lock-in. Gain leverage in negotiations. Survive outages by failing over seamlessly.
It's the executive dream: resilience, freedom, “best of breed”. And let's not forget the vendors, who love to fuel that dream.
But the reality of multi-cloud? It's not a dream. It's a nightmare.
The harsh truth
Cloud providers agree on nothing except charging you by the hour.
Do you want multi-cloud? Then:
Neither gives you the smooth portability that slides and webinars promise.
Are there initiatives challenging this status quo? Yes: efforts like SECA are actively working to create a standardised API schema for cloud resources. The idea is that you just swap the API endpoint from Provider A to Provider B and reuse your Terraform stack seamlessly.
Sounds promising. Needed. But today? It’s a work in progress. Meanwhile, teams are feeling the pain.
When Multi-Cloud Makes Sense
This doesn't mean multi-cloud is always wrong or, even worse, unfeasible. In fact, sometimes it's not optional.
Just think of Geo Compliance: European banking laws, US healthcare, data sovereignty in countries like Germany or France. Sometimes, your application must run in specific regions or providers. Multi-cloud is the only way to check that box.
Or does Disaster Recovery ring a bell? If your business cannot afford downtime (trading and finance, payments, healthcare, telco), then a cold or warm standby in another provider may be justified. This is insurance, and like all insurance, it's painful and expensive, until the day you really need it.
But these are narrow, specific use cases, not a universal strategy.
The Pareto principle trap
I had been applying this principle for decades without even noticing, yet I still remember the first time my mentor formally introduced it to me: essentially, it's an 80/20 rule where you spend 20% of the energy to get 80% of the outcome.
Sounds appealing, but it's like a UNO reverse card: when the Pareto principle kicks in, the remaining 20% of the outcome will require 80% of your energy.
The same principle applies to your SLA: every additional nine will cost you a fortune. Is it a price you can afford? It's a trade-off you need to evaluate wisely, since it could be far less expensive to accept an occasional outage than to try to build the ultimate resilient distributed system.
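To put numbers on those nines, here is a minimal sketch of the arithmetic. It assumes a 365-day year and treats "N nines" as an availability of 1 - 10^-N (two nines is 99%, three is 99.9%, and so on). The function name is hypothetical, and real provider SLAs usually measure per month and exclude things like planned maintenance, so read it as a back-of-the-envelope aid, not as a contract.

```python
# Back-of-the-envelope downtime budget per year for a given number of nines.
# Assumption: a 365-day year, with no exclusions for planned maintenance.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime (minutes/year) for an availability of N nines."""
    if nines < 1:
        raise ValueError("nines must be a positive integer")
    # N nines means an unavailability of 1 / 10^N of the total time.
    return MINUTES_PER_YEAR / 10 ** nines


for nines in range(2, 6):
    budget = downtime_minutes_per_year(nines)
    print(f"{nines} nines: {budget:.2f} minutes of downtime per year")
```

Each extra nine shrinks the yearly budget tenfold: roughly 87.6 hours at 99%, under 9 hours at 99.9%, under an hour at 99.99%, and about 5 minutes at 99.999%. The question from the paragraph above is whether the business really loses more in those hours than the extra nine costs to build and operate.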
As I often say, let's be more pragmatic. Besides, this overhead isn't just financial: it also costs human effort and energy.
All because someone in a boardroom thought “multi-cloud” sounded safer, and that someone is the one who gets to stay out of implementing it.
A better alternative: Multi-Region
If your real goal is resilience, there's a much easier path: multi-region within the same provider.
Although every use case should be evaluated carefully, this gives you a compelling set of advantages, like consistent IAM, unified APIs, and, in most cases, built-in data replication and DR.
You still get availability, compliance (sic), and recovery, without multiplying complexity. Besides the business, legal, and tech considerations, I'd like to focus more on the human aspect.
It's not just about technology
We tend to think technology will enhance the business and be the secret sauce that solves any problem. Reality is more complex and, like a maths formula, it's made up of several terms: a team is made of humans, and they still play a role in your organisation.
Kubernetes can help, but it doesn't solve the real problem. Multi-cloud is not just a technology challenge. It's an organisational burden.
Every extra provider you add is another mental tax, burdening your team:
At some point, the resilience you're buying is outweighed by the fragility you're introducing. Wanting resilience is legitimate, but technology has to scale along with the humans operating it.
Being wise about Multi-Cloud
Following my previous take on naming Cloud Native things, here's another one:
Multi-cloud is not evil. But chasing it blindly because of vendor FOMO, lock-in paranoia, or shiny marketing slides is dangerous.
Use it only when the business requirement demands it: geo-compliance, sovereignty, DR. Everywhere else, simplify.
Because in the end, platforms aren't just about resilience; they're about people. And people (your developers, your SREs, your customers) pay the price for unnecessary complexity.
Be wise. Build for outcomes, not buzzwords.
DevOps & Cloud Solutions Architect | Blockchain & AI Specialist | Microsoft Certified Trainer (MCT)
1mo: What if you have a client with a Microsoft/Azure infra and requirements, and you are mainly on AWS 🤔?
VP | IT Operations Lead at Bank ABC
2mo: From a pure technical standpoint I agree. Moreover train a team to manage multiple CSPs with the same proficiency, it’s quite impossible and severely limit the usage of prebuilt services that are crucial for the business agility everyone’s seeking. However, from a compliance and risk management perspective there’s no other way to tackle with the weakness of the IT supply chain. The road toward actual resiliency is tough, and expecting tools will help is a dream will never become true. If you want to be resilient and want to adopt multiple cloud providers - you have to - stop thinking there will be a technology such as K8s to help you out and start from something else
PPC | Google & LinkedIn Ads | Ad Copy & Strategy | Social Media Marketer & Manager
2mo: Seem like a piece of cake with a centralized control plane like HAProxy Fusion 👀 😁
IT Governance | Digital Transformation | Cyber Risk Mitigation
2mo: Fine points, Dario, including "When Multi-Cloud Makes Sense". I can add our case, that fits in this paragraph, but is not narrow and not specific. I designed a secure and scalable infrastructure that is multi-cloud by design due to the following specification: algorithms and data shall be moveable where the Customers wants. So we have a single node k3s edge that can be physical or can be installed as a virtual appliance on customers' premises, but also can run on Customers' cloud resource (again, if the Customer prefers). This edge contains and orchestrate all resources under our central control, but the provided resources are those of the Customers' chosen cloud technology. We just allocate them with IaC on their region/cloud, then the cluster gets connected to our global control plane. This is a sort of meta-use case, because it's decided during the technical analysis of a commercial sale. We just need to be ready with Terraform code for allocating the "cell" (this is how we call the minimal unit of resources, including all accessories needed by central orchestration, updates etc) where the Customers ultimately decides. Definitely not narrow and not specific. Just tough, aligned with our case of Small Supplier vs Big Customer
ICT solutions Architect | Kubernetes & Cloud Native Expert | Co-host Kubernetes Podcast
2mo: I have customers who do multi cloud. Not for their production environment. One of my customers has his development in the cloud, with scale up and scale to zero. And the production environment on a (private)cloud where the rules and regulations are just right for their application. So they are multi cloud and using the power where you need it.