Out of the 500+ EC2 instance types that AWS offers, burstable instances may look like a winning option for workloads with low CPU utilization and occasional bursts. Look at the pricing data, and you’ll see that VMs in the T family come with a lower price tag than the corresponding non-burstable instance types.
Sure, a burstable instance might be a good pick if your workload sits at low CPU utilization most of the time and gets the occasional burst of traffic – think a WordPress blog or a small e-commerce site that serves a handful of users per day.
But what about Kubernetes? For K8s workloads, burstable EC2 instances perform worse than expected. Keep reading to learn why.
How burstable instances work and why they’re so tricky
To quote AWS, “The T instance family provides a baseline CPU performance with the ability to burst above the baseline at any time for as long as required.”
AWS also claims that T instances “can save you up to 15% in costs when compared to M instances, and can lead to even more cost savings with smaller, more economical instance sizes.”
In general, burstable instances in the T2, T3, T3a, and T4g families are designed for low-intensity, non-latency-sensitive applications, microservices, or small and medium databases.
But bursting can happen under one condition: you need to have enough credits.
Here’s where the T-ricky part starts
When using a T instance, you don’t get the exclusive right to fully use that CPU; you share CPU time with other AWS users.
It’s like Uber Pool where you share a car with a bunch of strangers. Your trip becomes less predictable and inevitably takes longer.
Every T instance accumulates CPU credits per hour of operation. The number of credits depends on the instance type, with larger instances collecting more credits per hour. Still, you can’t collect them infinitely – there’s always a cutoff to that number.
But as the EC2 instance works, it also burns CPU credits. And if you run out of them, you won’t get the whole CPU. A single CPU credit provides:
- 100% of a core for 1 minute,
- or 25% of a core for 4 minutes,
- or 10% of a core for 10 minutes, etc.
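The arithmetic is simple: one credit buys one full vCPU-minute, and at lower utilization the same credit stretches proportionally longer. A quick sketch (the helper function is ours, not an AWS API):

```python
def minutes_per_credit(core_fraction: float) -> float:
    """How long one CPU credit lasts at a given fraction of a core.

    One credit buys one full vCPU-minute, so at a lower utilization
    the same credit stretches proportionally longer.
    """
    return 1.0 / core_fraction

# One credit lasts 1 minute at 100%, 4 minutes at 25%, 10 minutes at 10%.
print(minutes_per_credit(1.0), minutes_per_credit(0.25), minutes_per_credit(0.10))
```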
Stopping an instance can also cost you credits:
- Stopping a T2 instance means you immediately lose all accrued credits.
- If you stop a T3 or T4g instance, the accrued credits remain available for seven days (and then you lose them).
The burstable price covers limited CPU utilization – for example, a 30% baseline for t3.large and 40% for t3.xlarge and t3.2xlarge. Sure, this base price sounds great. As AWS promised, it’s roughly 15% cheaper than the cost of the corresponding non-burstable instances.
But with burstable instances, you need to track two costs:
- the base price,
- the extra CPU credit cost.
What’s the extra CPU credit cost?
Take the t3.large instance as an example.
It comes with a baseline of 30% CPU utilization. If your utilization is lower than that, the instance starts accumulating CPU credits.
But what happens if your utilization goes beyond the baseline? The instance will start using the credits it has previously accumulated.
Even if your burstable instance sits idle for 24 hours accruing credits, applying sustained load can burn through that balance in roughly four hours.
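To get a feel for how quickly a balance drains, here’s a back-of-the-envelope sketch. The helper and the example numbers are ours, not AWS figures – an instance earns credits at its baseline rate and spends them at its actual utilization, so the net burn rate is the gap between the two. Check your instance type’s published credit rates before relying on this:

```python
def burst_drain_hours(credit_balance: float, vcpus: int,
                      utilization: float, baseline: float) -> float:
    """Hours until the credit balance hits zero under sustained load.

    One credit = one vCPU-minute, so the instance earns
    baseline * vcpus * 60 credits per hour and spends
    utilization * vcpus * 60 credits per hour.
    """
    net_burn = (utilization - baseline) * vcpus * 60
    if net_burn <= 0:
        return float("inf")  # at or below baseline, the balance never drains
    return credit_balance / net_burn

# Hypothetical: 2 vCPUs, 30% baseline, pinned at 100%, 360 credits banked.
print(round(burst_drain_hours(360, 2, 1.0, 0.30), 1))  # ~4.3 hours
```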
The worst part? If you don’t have enough credits, your instance either gets throttled to the baseline or – in Unlimited mode – you pay extra for additional credits to sustain high CPU utilization.
What kind of extra costs are we talking about?
In this benchmark study, the extra CPU credits racked up costs equal to 28% of the total non-burstable compute expenses. This made burstable compute significantly more expensive than the non-burstable alternative. The study found that for sustained CPU utilization above 42.5%, using m6i.large instances is cheaper than using t3.large instances.
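You can reproduce a break-even point like that with a simple cost model. The sketch below assumes us-east-1 Linux on-demand prices of $0.0832/h for t3.large and $0.096/h for m6i.large, a 30% baseline across 2 vCPUs, and $0.05 per surplus vCPU-hour; check current pricing before relying on these numbers:

```python
T3_LARGE_HOURLY = 0.0832        # assumed us-east-1 Linux on-demand price
M6I_LARGE_HOURLY = 0.096        # assumed us-east-1 Linux on-demand price
SURPLUS_PER_VCPU_HOUR = 0.05    # Unlimited-mode surplus credit charge
BASELINE_VCPU = 0.30 * 2        # 30% baseline across t3.large's 2 vCPUs

def t3_effective_hourly(utilization: float) -> float:
    """Base price plus surplus credit charges at a sustained utilization."""
    used_vcpu = utilization * 2
    surplus_vcpu = max(0.0, used_vcpu - BASELINE_VCPU)
    return T3_LARGE_HOURLY + surplus_vcpu * SURPLUS_PER_VCPU_HOUR

# Find the sustained utilization where t3.large stops being cheaper.
breakeven = next(u / 1000 for u in range(1001)
                 if t3_effective_hourly(u / 1000) >= M6I_LARGE_HOURLY)
print(f"break-even at ~{breakeven:.1%} sustained utilization")
```

With these assumed prices, the crossover lands just under 43% – in line with the 42.5% figure from the benchmark study.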
Adding extra credits isn’t cheap either
If you run out of credits, you can either switch to a larger instance type (which means a few minutes of downtime and an instance cost that is, well, larger) or buy extra credits. AWS provides this option with T2/T3 Unlimited mode.
This contrasts with the Standard mode, where instances are simply throttled to the baseline when credits run low – exposing you to potential business impact (timed-out operations, sluggish responses) as workloads get a fraction of the CPU they expect. Standard is the default for T2. In Unlimited mode, the instance receives surplus credits instead of being throttled, and those credits cost $0.05 per vCPU-hour (Linux).
An AWS cost comparison showed that if we turned the Unlimited mode on for a t3.large instance, that instance would end up costing 48% more than the corresponding non-burstable m5.large.
Burstable instances are a bad pick for Kubernetes, here’s why
While burstable instances are fine for rarely used workloads, you may be better off going with a serverless model like AWS Lambda instead, where you pay only for the short durations you actually need and don’t keep any EC2 instances at all. At sustained usage scale, however, Lambda becomes expensive quickly.
Well-managed Kubernetes capacity with horizontal pod scaling and well-sized pods is a more cost-effective solution.
If your Kubernetes cluster has workloads to run, it should run them with the resources they request so that service users get predictable, consistent performance. When usage drops, horizontal pod autoscalers like HPA or KEDA should automatically reduce the number of replicas, and barely utilized nodes should be removed from the cluster.
A workload’s CPU and memory utilization metrics should be monitored, and CPU and memory requests should be rightsized based on actual resource utilization in containers.
If the remaining Kubernetes nodes are bin-packed well, your nodes will achieve high CPU utilization – which will cause you to buy credits if you’re using burstable instances. That’s where T3 becomes much more expensive and delivers poor bang for your buck.
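Bin packing is worth a quick illustration. A first-fit-decreasing sketch (a classic heuristic, not CAST AI’s actual scheduler) shows how well-sized pods fill nodes tightly:

```python
def first_fit_decreasing(pod_cpus: list[float],
                         node_capacity: float) -> list[list[float]]:
    """Pack pod CPU requests onto as few nodes as the heuristic finds.

    Sorts pods largest-first and places each on the first node with room,
    opening a new node only when none fits.
    """
    nodes: list[list[float]] = []
    for cpu in sorted(pod_cpus, reverse=True):
        for node in nodes:
            if sum(node) + cpu <= node_capacity:
                node.append(cpu)
                break
        else:
            nodes.append([cpu])
    return nodes

# Four pods totalling 4 vCPU fit exactly onto two 2-vCPU nodes.
packed = first_fit_decreasing([1.5, 0.5, 1.0, 1.0], node_capacity=2.0)
print(len(packed), [sum(n) for n in packed])  # 2 [2.0, 2.0]
```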
In short, burstable instances are expensive, deliver poor and inconsistent performance, and have a confusing pricing model – lose, lose, lose.
Note that Karpenter on AWS will default to the T3 family if you don’t specify any instance preferences. Picking a good instance out of 500+ types yourself is no small feat, though.
What if there were some independent party that would provide a curated service and protect you from these gotchas?
A managed Kubernetes autoscaler wins here
You don’t need burstable instances to save on spiky applications if your autoscalers are configured well enough to handle the load. Kubernetes has three autoscaling mechanisms in place, but a managed K8s platform like CAST AI helps you make the most of them.
The platform ensures that the type of nodes in use matches the application’s requirements at all times, scaling them up and down automatically.
What happens when a workload suddenly requests more CPU or memory than is available on any of the worker nodes? The CAST AI autoscaler addresses this by adding Kubernetes capacity dynamically via its headroom policy.
To see how this works, dive into the docs or book a demo with one of our engineers to get a personalized walkthrough.
Learn how the CAST AI autoscaler works
Book a platform walkthrough to see automation in action.