Running Kubernetes within a single cloud provider is already a pretty tall order. Multi cloud Kubernetes setups only add to the complexity around configuration, management, and cost. Just imagine the pain of having a few federated or centrally managed control planes to take care of. 🤯
But as more and more teams run clusters across different cloud providers, lessons learned and best practices are emerging to take at least some of that pain away.
This article gives a bunch of tips for reducing the cost and complexity of multi cloud Kubernetes. If you’re interested in security, keep an eye out for another post we’re soon going to publish.
Why multi cloud Kubernetes anyway?
Why do teams end up struggling with multi cloud Kubernetes deployments in the first place? Here are a few good reasons:
- Business-related issues – a merger or acquisition can sometimes unexpectedly bring another provider in-house. Other reasons include the chance to avoid vendor lock-in, get access to specific features that drive innovation, or gain solid ground for contract negotiations.
- Delivering better service – running services close to end-users comes with the benefits of low latency, reliability, and an enhanced user experience.
- Greater availability – running applications in multi cloud means that the chances of them all going down at the same time are very slim. Also, you can shift traffic from a problematic platform to a healthy one.
- Compliance requirements – depending on your location, your product might fall under specific data localization regulations, meaning that you must store data in a data center located in a given geographic zone.
Multi cloud Kubernetes cost and complexity: 7 problems that need solving
1. Infrastructure management
It’s hard to provision and decommission cloud resources across multiple cloud providers in a predictable and repeatable way.
This is where Infrastructure-as-Code (IaC) tools can help. General tools like Terraform or cloud-specific ones like AWS CloudFormation work really well when you’re dealing with a single cloud provider. Adding another vendor to the mix quickly becomes tricky.
Alternatively, you can use a solution like Pulumi that lets you abstract infrastructure away from the specific features of individual cloud providers. No matter which path you choose, you’ll always have to implement and manage code for every cloud provider.
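Whatever tool you pick, the per-provider code tends to live behind a common interface. Here's a minimal sketch of that pattern in Python – the class and field names are hypothetical, and the provider methods are stubs where real Terraform, Pulumi, or cloud API calls would go:

```python
from abc import ABC, abstractmethod

class ClusterProvisioner(ABC):
    """Cloud-agnostic interface the rest of your tooling codes against."""

    @abstractmethod
    def create_cluster(self, name: str, node_count: int) -> dict:
        ...

class AwsProvisioner(ClusterProvisioner):
    def create_cluster(self, name, node_count):
        # A real implementation would drive Terraform/Pulumi or the EKS API here.
        return {"provider": "aws", "cluster": name, "nodes": node_count}

class AzureProvisioner(ClusterProvisioner):
    def create_cluster(self, name, node_count):
        # Likewise: ARM templates, Bicep, or the AKS API.
        return {"provider": "azure", "cluster": name, "nodes": node_count}

def provision_fleet(provisioners, name, node_count):
    """One declarative request, fanned out to every configured cloud."""
    return [p.create_cluster(name, node_count) for p in provisioners]
```

The point isn't the stubs themselves, but that each new provider only costs you one more adapter class – the rest of the pipeline stays unchanged.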
2. Networking
Let’s start with a quick dive into Kubernetes networking challenges.
Each managed Kubernetes offering – Amazon Elastic Kubernetes Service, Azure Kubernetes Service, or Google Kubernetes Engine – has its own implementation of Kubernetes networking and uses a different CNI (Container Network Interface) plugin to enable networking inside the cluster. Handling several CNIs takes expertise, and managing clusters that each run a different CNI can quickly become a challenge.
The challenges around multi cloud networking are both general and Kubernetes-specific. For example, the transient nature of containers means you need persistent storage for clusters spread across multiple cloud services.
With the availability of the Container Storage Interface, you get a few solutions that provide methods for spreading storage across cloud environments.
In a multi cloud scenario, Kubernetes resources need to have network connectivity across the cloud environments.
One method to extend networks beyond a single cloud provider is VPN tunneling – it entails creating tunnels between each cloud environment. Note that VPNs can cause congestion and raise security concerns for egress traffic on both sides of the connection (not to mention vulnerabilities such as prefix hijacking and route leakage).
3. Application deployment
Deployment may become difficult and require orchestration, depending on the number of applications and their dependencies. The approach you choose here will depend on your application packaging strategy and existing CI/CD procedures.
You can basically go with one of these two approaches:
- Distribute applications using your existing CI/CD pipelines – this allows for smooth integration into the current development process. The biggest disadvantages are scalability and the maintenance burden it places on engineers.
- Use a GitOps process – it can provide continuous application deployment to target clusters, controlled by source code and managed by software such as ArgoCD, FluxCD, or GitLab. The ability to control everything from a single source of truth while automatically keeping deployments consistent is the primary benefit here. The biggest con is that you end up managing yet another tool, its interactions with existing CI/CD procedures, and its integration with a variety of cloud providers.
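To make the GitOps option concrete: with Argo CD, "one source of truth, many clusters" usually means rendering the same Application definition once per target cluster. The sketch below builds such manifests in Python – the field names follow the Argo CD Application CRD, while the repo URL and cluster endpoints are placeholders:

```python
def argo_application(app_name, repo_url, cluster_api, cluster_label):
    """Render an Argo CD Application manifest pointing one cluster at Git."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": f"{app_name}-{cluster_label}"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": repo_url,
                "path": f"deploy/{app_name}",
                "targetRevision": "HEAD",
            },
            "destination": {"server": cluster_api, "namespace": app_name},
            # Automated sync keeps each cluster converged on what's in Git.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }

# One app definition in Git, rendered once per target cluster
# (placeholder endpoints):
clusters = {
    "aws": "https://eks.example.com",
    "azure": "https://aks.example.com",
}
manifests = [
    argo_application("billing", "https://git.example.com/platform.git", api, label)
    for label, api in clusters.items()
]
```

Adding a third cloud then means adding one entry to `clusters`, not another deployment pipeline.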
4. Compatibility and interoperability
It’s hard to ensure compatibility and interoperability between multiple cloud systems when you’re looking at differences in API syntax, Kubernetes compliance, and feature support.
To mitigate this, check each provider’s Kubernetes service and ensure compliance with your workload requirements. Extensive testing of your application’s behavior across multiple clouds is a good idea.
Also, use cloud provider-independent APIs and SDKs for working with cloud services. This opens the door to applications and automation scripts that work across multiple cloud providers, making them more portable and flexible because they don’t rely directly on cloud-specific APIs.
5. Operational complexity (observability and troubleshooting)
When working with several cloud platforms, managing operational activities like monitoring, logging, and troubleshooting becomes a minefield.
Each cloud provider may have its own set of tools and interfaces, forcing you to traverse and aggregate data from several sources.
At the same time, observability is a critical component of a multi cloud platform. Your observability stack must scale while consuming metrics, logs, events, and outage data from several platforms.
You can go with open-source solutions like Prometheus or Grafana. But you’ll still need to develop bespoke dashboards and alerting systems that satisfy your business and operational demands.
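One common pattern with Prometheus is to run one instance per cluster and fan a single PromQL query out to all of them, summing the results. A hedged sketch – the endpoints are placeholders, and the HTTP client is injected as `query_fn` (in production it would call each Prometheus server’s `/api/v1/query` endpoint):

```python
def total_cpu_usage(
    query_fn,
    cluster_endpoints,
    promql="sum(rate(container_cpu_usage_seconds_total[5m]))",
):
    """Run the same PromQL query against every cluster and sum the results.

    query_fn(endpoint, promql) -> float is injected so the HTTP layer
    stays swappable and testable.
    """
    return sum(query_fn(ep, promql) for ep in cluster_endpoints)
```

With a fake `query_fn` you can unit-test the aggregation; with a real one you get a single fleet-wide number to drive dashboards and alerts.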
6. Compliance and security
Managing compliance and security in single-cloud setups is no walk in the park. You need to implement authentication and authorization measures, enforce security standards, keep up with security vulnerabilities and fixes, and harden environments. Because of the huge variety of security and compliance implementations, multi cloud setups necessitate even more work.
Multi cloud configurations can introduce additional complications in these areas:
- Authentication and authorization – you’ll need to interface with the authentication mechanism of each cloud provider. For account, role, and policy development, it’s ideal to have a centralized solution that is independent of the cloud provider.
- System hardening and security policy – enforcing the limitation of unprotected ports or traffic, protecting APIs, and implementing least privilege are all examples of security policies and environment hardening. You need to define such controls centrally, then translate and disseminate them across multiple cloud platforms.
- Storage and networking – storage might require encrypting data at rest and establishing data loss prevention methods. Data encryption, innovative routing technologies, and network policy management are all necessary for secure multi cloud networking.
- Vulnerability patching – you also need to keep your infrastructure updated using each provider’s methods.
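One way to keep policy definition central is to express controls as standard Kubernetes resources, which apply identically on EKS, AKS, or GKE (assuming each cluster’s CNI enforces NetworkPolicy). A sketch that renders a default-deny ingress rule per namespace – the namespaces are hypothetical:

```python
def default_deny_ingress(namespace: str) -> dict:
    """Render a standard Kubernetes NetworkPolicy denying all ingress."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-ingress", "namespace": namespace},
        "spec": {
            # An empty podSelector matches every pod in the namespace;
            # listing Ingress with no rules denies all inbound traffic.
            "podSelector": {},
            "policyTypes": ["Ingress"],
        },
    }

# One central definition, stamped out for every namespace in every cluster:
policies = [default_deny_ingress(ns) for ns in ("payments", "reporting")]
```

Because the manifest is provider-neutral, the same generated policy can be shipped to every cluster through whatever deployment mechanism you already use.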
CAST AI includes a container security feature that identifies vulnerabilities and configuration issues – and then prioritizes them to help you run containers with confidence.
7. Cost management and optimization
This area is pretty broad, so let’s focus on two aspects:
1. Finding the right skilled people in multi cloud environments isn’t that easy
Multi cloud Kubernetes deployments call for professionals who are knowledgeable about both Kubernetes and the cloud providers in use.
A company may choose to invest in training and upskilling their workers to ensure they have the skills needed to properly manage and run multi cloud Kubernetes deployments. Or hire new experts.
Both approaches come with a price tag. And you bet it’s not a small one.
2. Managing costs in a multi cloud environment is a challenge
Each cloud service provider has its own pricing models, resource size choices, and invoicing systems. Getting enough visibility to make smart decisions is already tough; optimizing resource utilization and costs across several providers is even harder.
The easiest way to implement cost-cutting measures is via a third-party solution that can peer into each of your provider accounts and pull all the cost data into one location. Ideally, it should do it in real time.
Multi cloud Kubernetes case study: Delio
The UK-based fintech company Delio has always prioritized cost, management, and performance. The management burden of running Kubernetes clusters across AWS and Azure prompted the team to seek a solution.
“We were running on T-type EC2 instances and scaling them a lot during our operation. At the beginning, we used AWS autoscalers and then added Azure into the mix. Keeping an eye on it all became difficult.
Are we using the right instances for the right job? Do we have to create more node groups? Is something not working because we’ve run out of space? Answering all of these questions and applying fixes became a time-consuming issue for our team,” said Alex Le Peltier, Head of Technology Operations at Delio.
CAST AI was built to give teams a helping hand in running Kubernetes across the three major cloud providers, with a special focus on automated cost optimization. Book a demo to see how it could help you navigate the multi cloud Kubernetes universe.