How Akamai Achieved 40-70% Cloud Cost Savings & Reclaimed Engineers’ Time

→ 40-70% cost savings on Kubernetes workloads
→ Zero downtime or incidents
→ Massive time savings and enhanced engineer productivity

Company size

9,800+ employees

Industry

Technology

Headquarters

Cambridge, MA

Cloud services used

Azure Kubernetes Service (AKS)

Company

Akamai Technologies is one of the world’s largest and most trusted cloud delivery platforms, with over 25 years of experience helping customers deliver secure digital experiences. With the world’s most distributed compute platform, from cloud to edge, the company helps its customers design and run apps while keeping experiences closer to consumers and cybersecurity threats further away. 

Challenge

Akamai has a very large and complex cloud infrastructure on a major cloud provider, powering services delivered to the most demanding customers with strict SLAs. The company was looking for a Kubernetes automation platform that would optimize the costs of running its core infrastructure, scaling applications up and down in line with constantly and, at times, dramatically changing demand.

Solution

CAST AI offered a robust set of features that perfectly matched Akamai’s use cases and requirements: maximized resource utilization with bin packing, automatic selection of the most cost-efficient compute instances, Spot instance automation throughout the entire instance lifecycle, and in-depth Kubernetes cost analytics capabilities.
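To make "automatic selection of the most cost-efficient compute instances" concrete, here is a minimal sketch of the general idea: filter a catalog down to the instance types that meet a workload's requirements, then pick the cheapest. This is an illustration only, not CAST AI's implementation. The `InstanceType` class, the `cheapest_fit` function, the instance names, and the prices are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceType:
    name: str            # hypothetical label, not a real cloud SKU
    vcpus: int
    memory_gib: int
    hourly_price: float  # made-up price, for illustration only


def cheapest_fit(
    catalog: list[InstanceType], need_vcpus: int, need_memory_gib: int
) -> Optional[InstanceType]:
    """Return the lowest-priced instance type that satisfies both requirements."""
    candidates = [
        it for it in catalog
        if it.vcpus >= need_vcpus and it.memory_gib >= need_memory_gib
    ]
    # default=None covers the case where nothing in the catalog is big enough
    return min(candidates, key=lambda it: it.hourly_price, default=None)


catalog = [
    InstanceType("small", 2, 8, 0.10),
    InstanceType("medium", 4, 16, 0.19),
    InstanceType("memory-heavy", 4, 32, 0.26),
    InstanceType("large", 8, 32, 0.38),
]

choice = cheapest_fit(catalog, need_vcpus=3, need_memory_gib=20)
print(choice.name if choice else "no instance type fits")
```

A production system would also weigh availability, Spot interruption risk, and how well the pending pods pack onto each candidate. The sketch covers only the price comparison.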

Results

By implementing CAST AI, Akamai saw savings of 40-70%, depending on the workload. The platform’s automation features also generated massive time savings, enabling engineers to invest in impactful areas like developing new features for customers.

The core savings we got are just brilliant, falling between 40% and 70%, depending on the workload. But that’s not the full story.
Before implementing CAST AI, my team was constantly moving around knobs and switches to make sure that our production environments and customers were up to par with the service we needed to invest in. 
Now engineers have more bandwidth to focus on other areas they couldn’t invest in before, like releasing features faster for our customers.
Dekel Shavit
Senior Director of Engineering at Akamai

Key features used

Autoscaling

The graph below shows how the CAST AI autoscaler scales cloud resources up and down in line with real-time demand, leaving enough headroom to meet the requirements of the application.
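The idea of scaling with demand while keeping headroom can be sketched as a simple capacity calculation. The snippet below is a simplified illustration, not the CAST AI autoscaler's actual logic. The function name `desired_nodes`, the 20% default headroom, and the 8-core node size are assumptions chosen for the example.

```python
import math


def desired_nodes(
    demand_cpu: float, node_cpu: float, headroom: float = 0.2, min_nodes: int = 1
) -> int:
    """Nodes needed to serve `demand_cpu` cores while keeping `headroom` spare capacity."""
    if node_cpu <= 0:
        raise ValueError("node_cpu must be positive")
    # Provision for the demand plus a safety margin, rounded up to whole nodes.
    target = demand_cpu * (1 + headroom)
    return max(min_nodes, math.ceil(target / node_cpu))


# Demand rises sharply (for example, during an attack) and then falls again;
# the node count follows it in both directions.
for demand in [10, 45, 400, 25]:
    print(demand, "cores ->", desired_nodes(demand, node_cpu=8), "nodes")
```

Recomputing the target continuously means capacity follows a traffic spike upward and is released again afterwards, rather than being provisioned for the peak in advance.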

Bin packing

CAST AI compacted the cluster via pod bin packing, reducing wasted resources.
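Bin packing means fitting pods onto as few nodes as possible so that less capacity sits idle. A minimal sketch using the classic first-fit decreasing heuristic follows. This is a textbook technique used here for illustration, not a description of CAST AI's scheduler. The function name `pack_pods`, the pod sizes, and the node size are hypothetical, and only CPU (in millicores) is considered.

```python
def pack_pods(pod_cpu_millicores: list[int], node_cpu_millicores: int) -> list[list[int]]:
    """First-fit decreasing: largest pods first, each on the first node with room."""
    nodes: list[list[int]] = []  # pods placed on each node
    free: list[int] = []         # remaining CPU on each node
    for request in sorted(pod_cpu_millicores, reverse=True):
        if request > node_cpu_millicores:
            raise ValueError(f"pod requesting {request}m cannot fit on any node")
        for i, remaining in enumerate(free):
            if request <= remaining:
                nodes[i].append(request)
                free[i] -= request
                break
        else:
            # No existing node has room, so open a new one.
            nodes.append([request])
            free.append(node_cpu_millicores - request)
    return nodes


pods = [2000, 500, 1500, 1000, 3000, 500, 1500]  # hypothetical CPU requests
placement = pack_pods(pods, node_cpu_millicores=4000)
print(len(placement), "nodes:", placement)
```

A real scheduler packs on several dimensions at once (CPU, memory, affinity rules, disruption budgets), but the goal is the same: fewer, fuller nodes.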

The path to achieving 40-70% cloud cost savings

What are the most challenging aspects of managing the scale and complexity of Akamai’s cloud infrastructure?

Akamai has a very large and complex cloud infrastructure. For some of the services we consume, we’re probably the largest customers for our vendor. We have done tons of core engineering and reengineering with the vendor’s team in order to support our scale. 

Akamai serves customers of various sizes and industries, among them large financial institutions, credit card companies, and others among the most demanding customers on the planet. What we provide is directly related to their security posture. At the end of the day, we’re the ones processing security events and applying smart logic on top of them, exposing actionable insights to our customers.

Delay is not an option. If we’re not able to respond to a security attack in real time, we have failed. 

Adding cost to the mix brings that challenge to the next level. How does your organization approach that?

As with any other company, we need to balance complexity with cost. And this is directly connected to our ability to scale.

Nobody knows when our customers will be attacked. I can plan for X, but I also need to plan for 10X and 100X. And this is where leveraging Kubernetes as a scaling infrastructure really allows us to drive value for our customers. We also saw real-life attacks on our customers that would drive 100x or 1000x on specific components of our infrastructure. Scaling our cloud capacity by 1000x in advance just isn’t financially feasible. 

We thought about optimizing things on the code side. But at the end of the day, because of this inherent complexity of the business model, we need to think about a way to optimize the core infrastructure.

This is when Akamai started looking for the right cost optimization platform

Where did your search for the right optimization tooling start?

We were deeply involved in a POC with another provider. However, we experienced challenges related to our infrastructure’s scale and unique use cases. We started to look for another solution, and one of our providers, Develeap, introduced us to CAST AI.

Develeap is a CAST AI partner that supports development teams end to end: from selecting the right branching methodology to implementing efficient CI/CD pipelines, provisioning cost-effective infrastructure in the cloud, and putting mechanisms in place to monitor and control testing and production environments. Its team includes over 120 DevOps professionals.

Why did you decide to move forward with CAST AI?

Do you remember the first time you held an iPhone and thought: Wow, this is different? There was like an aha moment – an epiphany. It’s somewhat magical, but it’s tangible; it’s in your hands. I had an aha moment like that with CAST AI. 

Literally two minutes into the integration, we saw cost analytics and insights that I’d never had before and had tried to get for a very long time. And that was just the tip of the iceberg, right? This was literally two minutes into the integration. Like an iPhone, CAST AI is simple, but there’s tons of very smart thinking behind it.

To be more specific, we decided to go with CAST AI for three main reasons:

1. The rich array of features

This isn’t just about cost optimization, but also capabilities like cost modeling, which are super important for us and drive a lot of value.  

2. Feature maturity

All optimization solutions say the same thing in their marketing materials: they have automation for Spot instances, bin packing, rightsizing, etc. But the feature maturity on CAST AI matched our use case – and let’s not forget, we’re running very large and complex data processing workloads, leveraging Spark.

For our use case, CAST was not two times better or five times better. It was immeasurably better.
Dekel Shavit
Senior Director of Engineering at Akamai

3. Ease of doing business

Ease of doing business isn’t just about the business model. There’s a bigger story to the way we interact with the CAST AI team.

We’re a painful customer. We have a lot of security requirements, and the way that the CAST AI team delivered throughout the interaction with us was nothing short of amazing. That’s how I define the ease of doing business. The team never overpromised; instead, they over-delivered. 

Was it easy to trust an automation solution to manage your infrastructure?

Design to trust is a concept I talk about a lot. CAST AI’s technology is brilliant, but technology isn’t the whole story. When you have a partner who is able to not just deliver but also be transparent about their business model, the way they do things, and the expectations they set, you build trust with them.

We’re a very complex organization because we need to be up to par with our security standards, compliance objectives, and regulations. Working with the CAST team was really a design to trust process across the board because we trust you.

Results: 40-70% cost savings and time reclaimed for mission-critical tasks

How much did you save with CAST AI?

The core savings we got and are getting on an ongoing basis are actually brilliant. It’s anywhere between 40% and 70%, depending on the workload.

Our dream scenario was to drive costs down and, at the same time, make sure that the performance was either on par with what we had or better. We achieved that by leveraging CAST AI features like bin packing, instance rightsizing, Spot instance automation, and many others.


What was the impact of CAST AI on your workload and the workload of your team?

Cost savings aren’t the full story. The full story here is that all of this was achieved with very low investment from our side. 

The time to value here was also very short compared to other alternatives that we did a POC with. The saving factor by itself, if it’s 2x or 3x, that’s great. But it’s when you can turn on and forget about the platform that you get its full value.

Since the CAST AI engine basically automates a lot of the day-to-day work that our engineers were doing, we see fewer incidents in our production environment and have much less manual tuning to do.

We were able to reduce costs, improve our margins, and reinvest those dollars into our engineering team.
Dekel Shavit
Senior Director of Engineering at Akamai

Automating Spot instances for dramatic cost savings

Did you try using Spot instances without any automation in place?

We tried using Spot instances because they simply make business sense. But because of the complexity of our workloads – in particular, complex Apache Spark workloads – leveraging Spot instances turned out to be complex. We needed to either overengineer our workloads or put more working hands on the workload to leverage Spot instances to the point that it didn’t make sense financially. 

It was super clear that there was value on the table if we could find the right tool to leverage Spot instances. This was part of the reason why we moved forward with CAST AI.

How does CAST AI automate Spot instances for Akamai?

For us, the fact that we were able to leverage Spot instances on our Spark workload with zero investment from our engineering team or operations was the factor that drove the point home.

We didn’t change anything in our workload in order to leverage Spot instances. And at our scale, it’s super complex. Every time I have a hiccup in my system, that’s an issue for our customers. I literally cannot have it. 

Zero issue is part of the integration. That’s something that I actually didn’t expect. It really shows that the team that we work with understands our pain points. 
Dekel Shavit
Senior Director of Engineering at Akamai Technologies

A partnership based on trust

What was the support like throughout the process of implementing and running CAST AI?

We’re a very painful customer due to our security demands, auditing, etc., because we need to keep the standard that our customers expect. At our scale, that’s not easy.  

The CAST AI team showed a clear understanding of the complexity of doing business from our side and streamlined the process. 

After the implementation, I’m glad to say we didn’t need any support. There were no incidents. We’re talking zero downtime, zero issues.
Dekel Shavit
Senior Director of Engineering at Akamai Technologies

Automation is great, but it’s also somewhat scary. So, when you understand your customers and their pain points, I think you can fine-tune your product and optimize it, like CAST AI did, in a way that allows the customer to trust the platform.

I see CAST AI as a holistic platform that is just right for any company that has a Kubernetes workload. It doesn’t matter how complex or simple your workload is, how large it is, or where you’re running it. 
At the end of the day, it’s because of the holistic value of the CAST AI platform that you’ll be able to have a clear cost analysis of your workload; that’s a win. And then you’ll be able to improve your performance; that’s another win. Cost savings? Amazing! 
Dekel Shavit
Senior Director of Engineering at Akamai Technologies

