AWS recommends running highly available EKS clusters with worker nodes in node pools (Auto Scaling groups) that are spread across multiple Availability Zones.
This choice certainly makes sense in terms of reliability. But does it help to optimize Kubernetes clusters for cost?
Our experience shows that node pools can lead to sub-optimal utilization, which generates significant cloud waste and drives up cloud expenses.
What is an Auto Scaling group anyway?
In our context, a node pool is essentially an Auto Scaling group. It consists of EC2 instances AWS treats as a logical grouping for automated scaling and management. It allows you to benefit from EC2 Auto Scaling features like health check replacements and scaling policies.
Auto Scaling groups help to dynamically increase or decrease the number of instances in the group to address changing conditions and provide your workloads with just enough resources to run smoothly.
If a scaling policy is enabled, the Auto Scaling group adjusts its desired capacity between the minimum and maximum capacity values you specify, launching or terminating instances as needed. You can also scale on a schedule.
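To make the minimum and maximum bounds concrete, here is a minimal sketch of how a requested instance count gets bounded. The function name `clamp_desired_capacity` and the numbers are hypothetical and only illustrate the idea; this is not the EC2 Auto Scaling API.

```python
def clamp_desired_capacity(requested: int, min_size: int, max_size: int) -> int:
    """Bound a requested instance count to the group's [min_size, max_size] range.

    Hypothetical helper for illustration; not part of any AWS SDK.
    """
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    return max(min_size, min(requested, max_size))


# A policy asking for 12 instances in a group capped at 10 gets 10.
print(clamp_desired_capacity(12, min_size=2, max_size=10))  # 10
# A policy asking for 1 instance in a group with a floor of 2 gets 2.
print(clamp_desired_capacity(1, min_size=2, max_size=10))   # 2
# A request inside the range passes through unchanged.
print(clamp_desired_capacity(5, min_size=2, max_size=10))   # 5
```

Note that the bounds apply per group, so every additional node pool carries its own minimum that has to be paid for even when demand is low.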
Sounds great, doesn’t it? Here’s the caveat.
Here’s the issue with node pools
Looking to adjust their resources to changing demand and optimize their usage, our customers often turned to this tactic and created node pools.
But here’s the issue: these node pools were only partially full. A team could easily end up with a collection of nodes containing more capacity than needed. The company would pay for resources that were effectively not being used.
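A rough way to see the problem is to compare the CPU that workloads request with the CPU each pool provides. The sketch below does that for three made-up pools. Pool names, node counts, and hourly prices are all assumptions chosen for illustration, not real customer data or AWS pricing.

```python
from dataclasses import dataclass


@dataclass
class NodePool:
    name: str
    nodes: int
    vcpu_per_node: int
    requested_vcpu: float         # sum of pod CPU requests scheduled in the pool
    hourly_price_per_node: float  # hypothetical price in USD


def utilization(pool: NodePool) -> float:
    """Share of the pool's provisioned vCPU that workloads actually request."""
    return pool.requested_vcpu / (pool.nodes * pool.vcpu_per_node)


def wasted_hourly_cost(pool: NodePool) -> float:
    """Hourly spend attributable to capacity nobody requested."""
    return (1 - utilization(pool)) * pool.nodes * pool.hourly_price_per_node


# Illustrative pools only; every number here is made up.
pools = [
    NodePool("frontend", nodes=4, vcpu_per_node=8, requested_vcpu=12, hourly_price_per_node=0.40),
    NodePool("backend", nodes=6, vcpu_per_node=8, requested_vcpu=30, hourly_price_per_node=0.40),
    NodePool("batch", nodes=3, vcpu_per_node=16, requested_vcpu=20, hourly_price_per_node=0.80),
]

for pool in pools:
    print(f"{pool.name}: {utilization(pool):.1%} requested, "
          f"${wasted_hourly_cost(pool):.2f}/h unused")

total_cost = sum(p.nodes * p.hourly_price_per_node for p in pools)
total_waste = sum(wasted_hourly_cost(p) for p in pools)
print(f"total: ${total_waste:.2f} of ${total_cost:.2f} per hour pays for idle capacity")
```

In this invented scenario, roughly half of the hourly bill goes to capacity that no workload has requested, even though each pool looks reasonable on its own.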
Cost-efficient alternative: single node pool with maximum utilization
Implementing a “no node pools” approach, meaning a single shared pool instead of many separate ones, is more beneficial because it allows teams to avoid the mounting costs of idle cloud resources. Instead of keeping several sets of partially full nodes, you get one node pool in which nodes are filled as completely as possible, leaving little room for cloud waste.
But keeping an eye on your node pool and making sure that utilized and requested capacity match requires a lot of engineering time.
That’s why this method is such a great use case for automation.
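The idea of filling nodes as completely as possible can be sketched as a bin-packing problem. The example below uses the classic first-fit decreasing heuristic on CPU requests alone. It is an illustration under simplifying assumptions (one resource dimension, identical nodes, made-up request sizes), not a description of how any particular scheduler or autoscaler works.

```python
from typing import List


def first_fit_decreasing(requests_millicores: List[int],
                         node_capacity_millicores: int) -> List[List[int]]:
    """Pack CPU requests onto as few identical nodes as the heuristic finds.

    Returns one list of placed requests per node. Real schedulers also weigh
    memory, affinity rules, and disruption budgets, which are ignored here.
    """
    nodes: List[List[int]] = []  # requests placed on each node
    free: List[int] = []         # remaining capacity of each node

    # Placing the largest requests first tends to leave fewer awkward gaps.
    for request in sorted(requests_millicores, reverse=True):
        if request > node_capacity_millicores:
            raise ValueError(f"request {request}m does not fit on any node")
        for i, remaining in enumerate(free):
            if request <= remaining:
                nodes[i].append(request)
                free[i] -= request
                break
        else:
            # No existing node has room, so open a new one.
            nodes.append([request])
            free.append(node_capacity_millicores - request)
    return nodes


# Hypothetical pod CPU requests in millicores, packed onto 8-vCPU nodes.
pod_requests = [2000, 500, 3500, 1000, 1500, 4000, 2500, 1000]
layout = first_fit_decreasing(pod_requests, node_capacity_millicores=8000)

for index, placed in enumerate(layout, start=1):
    print(f"node {index}: {placed} -> {sum(placed)}m of 8000m requested")
```

With these invented numbers, the 16 vCPU of requests fit onto two fully requested 8-vCPU nodes. Doing this by hand every time demand changes is exactly the kind of repetitive work that benefits from automation.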
Automating the no node pools approach
CAST AI creates a single node pool that is completely full, ensuring high resource utilization and eliminating any cloud waste. It constantly analyzes your application demands and adjusts the available capacity in real time by deleting nodes and moving workloads around to minimize waste.
Want to check what your node layout looks like vs. what it could look like for greater optimization?
Get a free savings report to check how much a “no node pools” approach would save you.