8 best practices to reduce your AWS bill for Kubernetes

Leon Kuperman

If your AWS Kubernetes bill went way over your budget this month, it’s not your fault. 

Typically, companies go over their cloud budgets by 23% (Flexera). 

Cloud providers aren’t exactly helping here. The bills are long, complicated, and hard to understand. AWS has 150+ types of EC2 instances available. How are you supposed to make sense of it all?

This article is packed with tips for reducing your AWS costs, no matter if you run K8s on your own or use EKS.


Note: AWS changed parts of its offering while this article was being written. We will update the article on a regular basis.

Quick guide to AWS pricing


On-demand instances

How it works: In this pay-as-you-go model, you only pay for the resources that you actually use. For example, AWS charges you for every hour of compute capacity used by your team. You face no long-term binding contracts or upfront payments, so you’re not overcommitting to any budgets. And you get to increase or reduce your usage just-in-time. 

What to watch out for:

  • It’s the most expensive option, so overrunning your budget is a real risk here.
  • The flexible on-demand instances work great for unpredictable workloads that experience fluctuating traffic spikes. Otherwise, you should look into alternatives.

Reserved instances 

How it works: In theory, everything sounds great. You buy capacity upfront in a given availability zone at a much lower price than on-demand instances. The larger your upfront payment, the bigger the discount. In AWS, you commit for 1 or 3 years.

What to watch out for:

  • In most scenarios, you’ll be committing to a specific instance or family – and changing it later isn’t an option. This could become a problem if your requirements change later. 
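
To see when such a commitment pays off, it helps to compute the break-even utilization. The sketch below is only an illustration: the function names and the hourly prices are made up for the example, and it assumes the reservation is expressed as an effective hourly rate that you pay whether or not the instance runs.

```python
def break_even_utilization(on_demand_hourly: float, reserved_hourly: float) -> float:
    """Fraction of the term an instance must run for the reservation to pay off."""
    if on_demand_hourly <= 0:
        raise ValueError("on_demand_hourly must be positive")
    return reserved_hourly / on_demand_hourly


def cheaper_option(on_demand_hourly: float, reserved_hourly: float,
                   expected_utilization: float) -> str:
    """Compare average hourly cost for an expected utilization between 0 and 1."""
    # On-demand cost scales with the hours actually used;
    # the reserved rate is paid for every hour of the term.
    on_demand_cost = on_demand_hourly * expected_utilization
    return "reserved" if reserved_hourly < on_demand_cost else "on-demand"


if __name__ == "__main__":
    # Hypothetical prices: $0.10/h on-demand, $0.06/h effective reserved rate.
    print(break_even_utilization(0.10, 0.06))
    print(cheaper_option(0.10, 0.06, expected_utilization=0.9))
    print(cheaper_option(0.10, 0.06, expected_utilization=0.4))
```

With these example numbers, the reservation only wins if the instance runs for more than roughly 60% of the term.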

Savings Plans 

How it works: This model is very similar to Reserved instances, but here you’re committing to using a given amount of compute power over 1 or 3 years (measured as usage per hour). So, you get on this plan and all usage up to your commitment will be covered. Anything extra is billed using the on-demand rate. 
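
The billing logic described above can be sketched in a few lines. This is a simplified model rather than the exact AWS calculation: the function name, the single flat discount, and the dollar amounts are all assumptions chosen for illustration.

```python
def hourly_bill(on_demand_usage: float, commitment: float, discount: float) -> float:
    """Total charge for one hour under a Savings Plan style commitment.

    on_demand_usage: what the hour's usage would cost at on-demand rates
    commitment:      committed spend per hour (paid whether it is used or not)
    discount:        plan discount versus on-demand, as a fraction (illustrative)
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    # The commitment buys usage at the discounted rate, so it covers
    # more on-demand-equivalent usage than its face value.
    covered = commitment / (1 - discount)
    # Anything above the covered amount is billed at the on-demand rate.
    overage = max(0.0, on_demand_usage - covered)
    return commitment + overage


if __name__ == "__main__":
    # Hypothetical: $10/h commitment at a 50% discount covers $20/h of usage.
    print(hourly_bill(on_demand_usage=30, commitment=10, discount=0.5))
    print(hourly_bill(on_demand_usage=12, commitment=10, discount=0.5))
```

The second call shows the downside: when usage drops below the commitment, you still pay the full committed amount.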

What to watch out for:

  • You’re committing to consistent usage rather than to specific instance types and configurations like in Reserved instances, but it’s still a commitment.

For both reserved instances and savings plans, you’re still running the risk of locking yourself in with the cloud vendor. Also, you’re committing to pay for resources that might make little sense for your company in 1 or 3 years. 

Consider this:

When your compute requirements change, you need to either commit even more or you’re stuck with unused capacity.

Reserved instances remove any flexibility of scaling, seasonality, or ability to easily configure multi-region/zone distribution.

Spot instances

How it works: This is a very cost-effective pricing model. You bid on resources that cloud service providers (CSPs) aren’t using at the moment and save up to 90% off the on-demand pricing. But it comes with limited availability. AWS can pull the plug with a 2-minute warning. That’s still better than Azure or Google Cloud, which give you only 30 seconds. 

What to watch out for:

  • The savings are amazing, so spot instances are definitely worth your attention.
  • Just make sure that you pick them for workloads that can handle interruptions – for example, stateless application components like microservices or replicable applications. 
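
A workload that handles interruptions well usually starts by reading the warning. On EC2, the interruption notice is exposed through the instance metadata path spot/instance-action as a small JSON document with an action and a time. The sketch below only parses such a payload; the function name and the sample payload are made up for the example, and fetching the document from the metadata service is left out.

```python
import json
from datetime import datetime, timezone


def seconds_until_interruption(payload: str, now: datetime) -> float:
    """Return how many seconds remain before the announced interruption."""
    notice = json.loads(payload)
    # The notice carries a UTC timestamp such as "2024-01-01T12:02:00Z".
    deadline = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    deadline = deadline.replace(tzinfo=timezone.utc)
    return (deadline - now).total_seconds()


if __name__ == "__main__":
    # Sample payload written for this example, not captured from a real instance.
    sample = '{"action": "terminate", "time": "2024-01-01T12:02:00Z"}'
    now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    print(seconds_until_interruption(sample, now))
```

A real handler would poll for the notice and use the remaining time to drain the node and reschedule its pods.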

Dedicated host 

How it works: This is a physical server with an instance capacity that is exclusively for you. You can use your own licenses for cost reductions but still get the resiliency and flexibility of AWS. A good choice for businesses that need to achieve compliance and avoid sharing hardware with other tenants for extra security.

What to watch out for:

  • When setting up a Dedicated host, you pick a configuration that identifies the type and number of instances you can run on it.
  • Note that you’re billed hourly for every active Dedicated host, not each instance running. Still, this is a pricey option. 

Note: Don’t forget about the extra charges

All cloud providers charge for extra things – think egress traffic, load balancing, block storage, IP addresses, or premium support. AWS is no exception. Consider these expenses when comparing different pricing options and setting your cloud budget.

Start by understanding your AWS Kubernetes bill

AWS bills are long, complex, and hard to understand. Just take a look at this bill here:

AWS bill

Each service in your bill comes with its own billing metric. Within Amazon Simple Storage Service (S3) alone, some items are charged per request, while others are charged per GB. 

This is just the tip of the iceberg. We covered more cloud bill issues here: Surprised by your cloud bill? 5 common issues & how to deal with them

To understand your usage and start analyzing costs, you need to look into various areas in the AWS console. Checking the AWS Billing and Cost Management Dashboard alone won’t be enough.

You need a more granular view, and for that, you need to visit the Cost Explorer. To make billing more transparent, it’s a good idea to group and report on costs by specific attributes (like region or service). 

As you can imagine, this is very time-consuming.

But you need to start somewhere. And the best starting point is the AWS suite of cost management tools.

AWS cost management tools – they’re worth exploring

  • AWS Cost Explorer – This is where you can visualize and manage your costs and usage over time. Start by creating some custom reports, analyzing your data from a bird’s eye view, and identifying any cost drivers or anomalies.
  • AWS Budgets – Use this handy tool to set custom budgets and track your costs/usage for different use cases. You can set an alert for when the actual or forecasted cost goes over your budget threshold. The tool can also alert you when your actual Reserved instances or Savings Plan utilization/coverage drops below the threshold you’ve set for it. 
  • AWS Cost & Usage Report – This report gives you a comprehensive overview of your costs and usage together with some extra metadata on AWS pricing, services, Reserved instances, and Savings Plans. You can see all that at different levels of granularity and organize everything further with Cost Allocation tags and Cost Categories.
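
If you prefer to pull the same numbers programmatically, Cost Explorer also has an API. The sketch below assumes the boto3 SDK and valid AWS credentials for the fetch step. The helper names are made up for this example, and the sample response at the bottom is invented data that only mimics the shape of a get_cost_and_usage result.

```python
from collections import defaultdict


def fetch_costs_by_service(start: str, end: str) -> dict:
    """Query Cost Explorer grouped by service (needs boto3 and credentials)."""
    import boto3  # imported lazily so the helper below works without the SDK

    client = boto3.client("ce")
    return client.get_cost_and_usage(
        TimePeriod={"Start": start, "End": end},  # dates as YYYY-MM-DD
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
    )


def total_by_group(response: dict, metric: str = "UnblendedCost") -> dict:
    """Sum the cost of every group across all returned time periods."""
    totals = defaultdict(float)
    for period in response["ResultsByTime"]:
        for group in period["Groups"]:
            key = group["Keys"][0]
            totals[key] += float(group["Metrics"][metric]["Amount"])
    return dict(totals)


if __name__ == "__main__":
    # Invented sample in the shape of a Cost Explorer response.
    sample = {"ResultsByTime": [
        {"Groups": [
            {"Keys": ["EC2"], "Metrics": {"UnblendedCost": {"Amount": "10.5", "Unit": "USD"}}},
            {"Keys": ["S3"], "Metrics": {"UnblendedCost": {"Amount": "2.0", "Unit": "USD"}}},
        ]},
        {"Groups": [
            {"Keys": ["EC2"], "Metrics": {"UnblendedCost": {"Amount": "4.5", "Unit": "USD"}}},
        ]},
    ]}
    print(total_by_group(sample))
```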

Multiple teams using one AWS account – how to deal with that?

If several teams or departments contribute to your AWS cloud bill, you need a method to make sense of it (or at least give a helping hand to the CFO).

Fortunately, AWS provides mechanisms for categorizing expenses by accounts, organizations, or projects. You can use them to keep teams from spending too much.  

  • Organizations – Use this feature to centrally manage and govern your environment while scaling AWS resources. Create a new AWS account, allocate resources, organize workflows by grouping accounts, and then apply budgets and policies to the accounts or groups. You can make billing simpler by using a single payment method.
  • Tagging resources – Tagging a resource doesn’t make the tag show up on your bill automatically. You need to break the data down by tags yourself. To do that, write a report in the Cost Explorer or download the data from S3 and work with it directly (don’t expect this to be an easy task). Entire companies exist just to build tools for presenting and explaining bills. Note that some services, components, and resources cannot be tagged.

Alternatively, you can use resource tags in the AWS Cost Explorer. The usual practice is to create tags for each team, environment, application, service and even feature. Then turn on cost grouping for those tags and you can create reports in the AWS billing console.
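
Tag-based reports are only as good as your tagging discipline, so it’s worth checking for gaps. Here is a minimal sketch of such a check. The resource records, tag keys, and function name are hypothetical; in practice you would feed it whatever inventory export you already have.

```python
def missing_tags(resources, required_tags):
    """Map each resource id to the required tag keys it lacks."""
    report = {}
    for resource in resources:
        present = set(resource.get("tags", {}))
        missing = sorted(set(required_tags) - present)
        if missing:
            report[resource["id"]] = missing
    return report


if __name__ == "__main__":
    # Hypothetical inventory; ids and tag values are placeholders.
    inventory = [
        {"id": "node-1", "tags": {"team": "payments", "environment": "prod"}},
        {"id": "node-2", "tags": {"team": "search"}},
        {"id": "volume-1"},
    ]
    print(missing_tags(inventory, ["team", "environment"]))
```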

How to forecast cloud costs? 3 techniques

Cloud bills fluctuate depending on usage, so forecasting them is painful. 

But it’s worth your time. Understanding your future resource requirements helps to keep costs in check. 

Here are three techniques for forecasting costs as part of AWS cost optimization.

  1. Analyze your usage reports – First things first, you need to achieve clear visibility of your cloud expenses. Monitor the resource usage reports regularly. Set up email and other alerts to keep your bill in check.
     
  2. Model your cloud expenses – How do you calculate the total cost of ownership of your cloud resources? Start by analyzing the AWS pricing models and plan capacity requirements over time to forecast costs. It’s smart to also measure application or workload-specific costs to develop an application-level cost plan. Aggregate all of this data in one place so that you can understand your expenses better.
  3. Detect peak resource usage scenarios – Use periodic analytics and run reports on your usage data to identify these scenarios. You can use other data sources – for example, seasonal customer demand patterns. If you spot that these patterns correlate with your peak resource usage, you can identify the peaks in advance (and prepare for them accordingly).
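
As a starting point for modeling, you can extrapolate a simple trend from past bills. The sketch below fits a least-squares straight line to monthly totals and projects the next month. It’s a deliberately naive model that ignores seasonality and peaks, and the monthly figures are invented for the example.

```python
def forecast_next(costs):
    """Project the next value from a least-squares straight line."""
    n = len(costs)
    if n < 2:
        raise ValueError("need at least two data points")
    # Months are numbered 0..n-1, so their mean is (n - 1) / 2.
    mean_x = (n - 1) / 2
    mean_y = sum(costs) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(costs))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    slope = numerator / denominator
    intercept = mean_y - slope * mean_x
    # The next month has index n.
    return intercept + slope * n


if __name__ == "__main__":
    monthly_bill = [100, 110, 120, 130]  # invented monthly totals in USD
    print(forecast_next(monthly_bill))
```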

8 best practices to reduce your AWS Kubernetes bill

Choosing the right VM instance type

1. Define your requirements

Order only what your workload needs across the compute dimensions: 

  • CPU count and architecture, 
  • memory, 
  • storage, 
  • network. 

An affordable instance might look tempting, but what happens if you start running a memory-intensive application and end up with performance issues affecting your brand and customers?

Pay attention when choosing between CPU and GPU dense instances. If you’re developing a machine learning application, you probably need a GPU dense instance type because training models on it is much faster. 
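
Once the requirements are written down, picking a candidate becomes a simple filter. The sketch below selects the cheapest instance type that satisfies a CPU and memory requirement. The catalog entries, names, and prices are placeholders, not real AWS instance types or rates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    memory_gib: float
    hourly_price: float


def cheapest_fit(catalog, min_vcpus, min_memory_gib) -> Optional[InstanceType]:
    """Return the cheapest type meeting both requirements, or None."""
    candidates = [
        item for item in catalog
        if item.vcpus >= min_vcpus and item.memory_gib >= min_memory_gib
    ]
    return min(candidates, key=lambda item: item.hourly_price, default=None)


# Placeholder catalog: names and prices are invented for illustration.
CATALOG = [
    InstanceType("small-general", vcpus=2, memory_gib=8, hourly_price=0.10),
    InstanceType("small-compute", vcpus=2, memory_gib=4, hourly_price=0.08),
    InstanceType("large-memory", vcpus=4, memory_gib=32, hourly_price=0.25),
]

if __name__ == "__main__":
    print(cheapest_fit(CATALOG, min_vcpus=2, min_memory_gib=6))
```

Here the cheapest entry is rejected because it lacks memory, which is exactly the trap described above.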

2. Choose the right instance type 

There’s no denying that compute resources are the biggest line item on your bill. Picking the right VM instance type can save you up to 50% of your bill (we actually tested it).

AWS offers many different instance types matching a wide range of use cases, with entirely different combinations of CPU, memory, storage, and networking capacity. Every type comes in one or more sizes, so you can scale your resources easily.

But know this: Cloud providers roll out different computers for their VMs and the chips in those computers have different performance characteristics. So you might pick an instance type with a strong performance that you don’t actually need (and you don’t even know it).

Understanding and calculating all of this is hard. AWS offers more than 150 different instance types. 

The best way to verify a VM’s capabilities is by benchmarking: dropping the same workload on every machine type and checking its performance. We actually did that. Take a look here.
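
Benchmark results are easiest to compare when you turn them into cost per unit of work. The sketch below does that for requests served. The function name, prices, and throughput figures are invented for the example and don’t come from our benchmark.

```python
def cost_per_million_requests(hourly_price: float, requests_per_second: float) -> float:
    """Convert an hourly price and measured throughput into cost per 1M requests."""
    if requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive")
    requests_per_hour = requests_per_second * 3600
    return hourly_price / requests_per_hour * 1_000_000


if __name__ == "__main__":
    # Invented results: machine B costs more per hour but serves twice as much.
    machine_a = cost_per_million_requests(hourly_price=0.36, requests_per_second=100)
    machine_b = cost_per_million_requests(hourly_price=0.50, requests_per_second=200)
    print(machine_a, machine_b)
```

In this made-up case the pricier machine is the cheaper one per request, which you would never see from the price list alone.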

3. Verify storage transfer limitations

Data storage is another cost optimization aspect worth your time.

Each application out there comes with its unique storage needs. Make sure that the VM you pick has the storage throughput your workloads need. 

Also, avoid expensive drive options like provisioned IOPS SSD volumes unless you’re planning to use them to the fullest.

Spot instances

4. Check if your workload is spot-ready

Spot instances are a great way to save on your AWS Kubernetes bill – up to 90% off the on-demand pricing! But before jumping on the spot bandwagon, you need to decide how aggressive you’re going to be about implementing them. 

Here are a few questions to get you a step closer:

  • How much time does your workload need to finish the job? 
  • Is it mission- and time-critical?
  • Can it handle interruptions? 
  • Is it tightly coupled between instance nodes? 
  • What tools are you going to use to move your workload when AWS pulls the plug? 

5. Cherry-pick your spot instances

When choosing spot instances, pick the slightly less popular instances because they’re less likely to get interrupted. 

Once you pick an instance, check its frequency of interruption – the rate at which AWS reclaimed capacity from this instance type during the trailing month. 

You can see it in the AWS Spot Instance Advisor in ranges of <5%, 5-10%, 10-15%, 15-20%, and >20%:

AWS spot instances frequency of interruption

Don’t shy away from using spot instances for more important workloads.

AWS offers a type of spot instance that gives you an uninterrupted time guarantee for up to 6 hours (in hourly increments). You pay just a little more for it and still achieve a discount of 30-50% off the on-demand pricing.

6. Bid your price on spot

Once you find the right instance, it’s time to set the maximum price you’re ready to pay for it. The instance will only run when the marketplace price is below or equal to your bid. 

Here’s a rule of thumb: Set the maximum price to one that equals the on-demand price. Otherwise, you risk getting your workloads interrupted once the price goes higher than what you set.
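
To see why a low maximum price hurts, you can replay a price history against it. The sketch below counts the hours in which an instance would have been allowed to run. The price series is invented, and the model only looks at price; in practice AWS can also reclaim capacity for other reasons.

```python
def hours_running(hourly_spot_prices, max_price):
    """Count the hours in which the spot price was at or below the maximum."""
    return sum(1 for price in hourly_spot_prices if price <= max_price)


if __name__ == "__main__":
    # Invented hourly spot prices in USD; on-demand price assumed to be 0.10.
    prices = [0.03, 0.04, 0.07, 0.05, 0.03]
    print(hours_running(prices, max_price=0.10))  # maximum set to on-demand
    print(hours_running(prices, max_price=0.04))  # aggressive low maximum
```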

To boost your chances of snatching spot instances, you can set up groups of spot instances (AWS Spot Fleets) and request multiple instance types at the same time.

You will be paying the maximum price per hour for the entire fleet rather than a specific spot pool (a group of instances with the same type, OS, availability zone, and network platform).

But to make it work, you’re facing a huge amount of configuration, setup, and maintenance tasks. 

Autoscaling

7. Use mixed instances

So far, I haven’t really covered anything related exclusively to Kubernetes, so here’s one good point.

A mixed-instance strategy gets you great availability and performance at a reasonable cost. You basically pick different instance types since some are cheaper and just good enough, but might not be suitable for high-throughput, low-latency workloads.

Depending on your workloads, you can often pick the cheapest machines and make it all work.

Or you could have a smaller number of machines with higher specs. This can help slash your AWS Kubernetes bill because each node requires Kubernetes to be installed on it and adds a little overhead.
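
The overhead argument is easy to check with back-of-the-envelope arithmetic. The sketch below assumes a flat per-node CPU overhead, which is a simplification: the real amount reserved for Kubernetes and system components depends on the node size and your configuration. The function name and all numbers are made up.

```python
def allocatable_cpu(node_count: int, vcpus_per_node: float, overhead_per_node: float) -> float:
    """CPU left for workloads after subtracting a flat per-node overhead."""
    return node_count * (vcpus_per_node - overhead_per_node)


if __name__ == "__main__":
    # Both clusters have 16 vCPUs in total; the overhead value is an assumption.
    many_small = allocatable_cpu(node_count=8, vcpus_per_node=2, overhead_per_node=0.25)
    few_large = allocatable_cpu(node_count=2, vcpus_per_node=8, overhead_per_node=0.25)
    print(many_small, few_large)
```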

But mixed instances present some scaling challenges.

In a mixed-instance scenario, each instance type provides a different amount of resources. While you can scale up the instances in your autoscaling groups using metrics like CPU and network utilization, those metrics will be inconsistent across instance types. 

Cluster Autoscaler helps here by allowing you to mix instance types in a node group. However, your instances must have the same capacity in terms of CPU and memory. 
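
Before mixing types in one node group, you can verify the same-capacity rule with a quick check. The helper below is a sketch with a made-up name, and the vCPU and memory figures in the example should be verified against the current AWS listings before you rely on them.

```python
def same_capacity(instance_specs: dict) -> bool:
    """True when every instance type has identical vCPU and memory figures."""
    shapes = {(spec["vcpus"], spec["memory_gib"]) for spec in instance_specs.values()}
    return len(shapes) <= 1


if __name__ == "__main__":
    # Specs entered by hand for illustration; double-check them before use.
    compatible = {
        "m5.large": {"vcpus": 2, "memory_gib": 8},
        "m5a.large": {"vcpus": 2, "memory_gib": 8},
        "m4.large": {"vcpus": 2, "memory_gib": 8},
    }
    mismatched = dict(compatible, **{"c5.large": {"vcpus": 2, "memory_gib": 4}})
    print(same_capacity(compatible), same_capacity(mismatched))
```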

8. Make multiple availability zones work for you

Instances spread across several availability zones increase your availability. To set this up, AWS recommends that you:

  • Configure multiple node groups, 
  • Scope each of them to a single Availability Zone,
  • And then enable the --balance-similar-node-groups feature. 

If you create only one node group, you can scope that node group to span across multiple Availability Zones.
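
For the multi-group setup, the Cluster Autoscaler arguments can be generated instead of typed by hand. The sketch below builds one --nodes entry per zone plus the balancing flag. The cluster name, the naming scheme for node groups, and the size limits are assumptions for the example, so adjust them to match your actual Auto Scaling group names.

```python
def autoscaler_args(cluster: str, zones, min_nodes: int, max_nodes: int):
    """Build Cluster Autoscaler arguments with one node group per zone."""
    args = ["--cloud-provider=aws", "--balance-similar-node-groups"]
    for zone in zones:
        # Hypothetical naming scheme: <cluster>-<zone> for each node group.
        group_name = f"{cluster}-{zone}"
        args.append(f"--nodes={min_nodes}:{max_nodes}:{group_name}")
    return args


if __name__ == "__main__":
    for arg in autoscaler_args("demo", ["us-east-1a", "us-east-1b"], 1, 5):
        print(arg)
```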

Optimize your AWS Kubernetes bill intelligently 

All of those tips apply regardless of your infra type – whether you handle clusters on your own or use AWS EKS. 

But to seriously reduce your AWS bill, you need an intelligent platform that can choose the right instance size from various instance types spanning on-demand, spot, and reserved – all the while helping you to manage infrastructure dependencies.  

This is what CAST AI is about.

We’re currently working on an EKS optimization tool that will be launched in a few weeks.

Be the first to find out: EKS cost optimization
Get on the waiting list to be the first to find out when CAST AI optimizer for AWS EKS rolls out and is open for you to analyze and optimize your clusters.