How do you solve the bin packing problem in the world of Kubernetes? By default, the Kubernetes scheduler favors spreading workloads fairly across nodes rather than maximizing node utilization. But to cut costs, teams need a way to bin pack nodes with efficiency in mind, for example, by moving workloads from one node to another and removing the empty node.
Evictor is an automated solution that comes as part of the CAST AI platform. It continuously compacts pods into fewer nodes, creating empty nodes that can be removed following the Node deletion policy (if you enable it). To avoid any downtime, Evictor will only consider applications with multiple replicas.
To show you what we mean by adding automation to pod eviction and scheduling, let’s take a closer look at how Evictor works.
What does automated Kubernetes bin packing look like?
On the CAST AI platform, you can configure the settings and policies that will be taken into account during autoscaling, both up and down.
Here’s a short overview of what you can find on this page:
- Unscheduled pods policy – it automatically adjusts the size of a Kubernetes cluster, so all pods have a place to run. Here you can turn on the spot instance policy and use spot fallback to ensure that workloads have a place to run when spot instances get interrupted.
- Node deletion policy – it automatically removes nodes from the cluster when they no longer have workloads scheduled to them, to maintain a minimal footprint and cut costs. This is where you can enable Evictor, which continuously compacts pods into fewer nodes and generates cost savings via bin packing.
- CPU limit policy – this policy keeps CPU resources within the defined minimum and maximum thresholds.
Let’s say that you enabled Evictor and set it to work.
This is how Evictor works:
- It identifies one node (in red) as a candidate for eviction.
- It automatically moves pods to other nodes – a mechanism called “bin-packing.”
- Once the node is empty, it gets deleted from the cluster.
- Finally, Evictor returns to the first step, constantly looking for nodes that are good candidates for eviction.
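The loop above can be sketched as a toy simulation. This is only an illustration under stated assumptions, not Evictor's actual implementation: the `Node` class, `try_evict`, and `compact` are hypothetical names, the "least utilized node first" candidate choice and the tightest-fit placement are assumptions, and in a real cluster the evicted pods are placed by the Kubernetes scheduler rather than by the code doing the eviction.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory model of a node; CPU is in millicores."""
    name: str
    capacity: int
    pods: dict[str, int] = field(default_factory=dict)  # pod name -> CPU request

    @property
    def used(self) -> int:
        return sum(self.pods.values())

    @property
    def free(self) -> int:
        return self.capacity - self.used


def try_evict(candidate: Node, others: list[Node]) -> bool:
    """Move every pod off `candidate` if all of them fit elsewhere.

    Nothing is changed unless the whole node can be emptied.
    """
    free = {n.name: n.free for n in others}
    plan: dict[str, str] = {}
    # Assumption: place the largest pods first, each into the tightest fit.
    for pod, cpu in sorted(candidate.pods.items(), key=lambda kv: -kv[1]):
        targets = [n for n in others if free[n.name] >= cpu]
        if not targets:
            return False
        target = min(targets, key=lambda n: free[n.name])
        free[target.name] -= cpu
        plan[pod] = target.name
    by_name = {n.name: n for n in others}
    for pod, target_name in plan.items():
        by_name[target_name].pods[pod] = candidate.pods.pop(pod)
    return True


def compact(nodes: list[Node]) -> list[Node]:
    """Repeat: pick a candidate, empty it, delete it, start over."""
    nodes = list(nodes)
    progress = True
    while progress:
        progress = False
        # Assumption: the least utilized node is the best eviction candidate.
        for candidate in sorted(nodes, key=lambda n: n.used):
            others = [n for n in nodes if n is not candidate]
            if others and try_evict(candidate, others):
                nodes.remove(candidate)  # the empty node is deleted
                progress = True
                break
    return nodes


cluster = [
    Node("node-a", 4000, {"web-1": 1000, "api-1": 500}),
    Node("node-b", 4000, {"web-2": 1000, "db-1": 1500}),
    Node("node-c", 4000, {"cache-1": 500}),
]
remaining = compact(cluster)
print([n.name for n in remaining])
```

In this made-up cluster, the single pod on `node-c` fits on another node, so `node-c` is emptied and removed; the remaining two nodes cannot absorb each other's pods, so the loop stops.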
As a result, one node is deleted:
Here are the Evictor logs:
time="2021-06-14T16:08:27Z" level=debug msg="will try to evict node \"ip-192-168-66-41.us-east-2.compute.internal\""
time="2021-06-14T16:08:27Z" level=debug msg="annotating (marking) node \"ip-192-168-66-41.us-east-2.compute.internal\" with \"evictor.cast.ai/evicting\"" node_name=ip-192-168-66-41.us-east-2.compute.internal
time="2021-06-14T16:08:27Z" level=debug msg="tainting node \"ip-192-168-66-41.us-east-2.compute.internal\" for eviction" node_name=ip-192-168-66-41.us-east-2.compute.internal
time="2021-06-14T16:08:27Z" level=debug msg="started evicting pods from a node" node_name=ip-192-168-66-41.us-east-2.compute.internal
time="2021-06-14T16:08:27Z" level=info msg="evicting 9 pods from node \"ip-192-168-66-41.us-east-2.compute.internal\"" node_name=ip-192-168-66-41.us-east-2.compute.internal
I0614 16:08:28.831083 1 request.go:655] Throttling request took 1.120968056s, request: GET:https://10.100.0.1:443/api/v1/namespaces/default/pods/shippingservice-7cd7c964-dl54q
time="2021-06-14T16:08:44Z" level=debug msg="finished node eviction" node_name=ip-192-168-66-41.us-east-2.compute.interna
Next, Evictor removes the second and third nodes. At this point, only three nodes remain:
After about 10 minutes, Evictor has deleted three nodes and left three running. CPU utilization is now at a much healthier 80%.
Now you see why automation can lift so much weight off your shoulders around node draining and pod eviction in Kubernetes.
Automated bin packing saves both time and costs
If you’d like to check how Evictor fits into the overall cost reduction flow of CAST AI, check out this article showing how we reduced the overall cluster cost by 66% thanks to smart pod scheduling and using spot instances.
Kubernetes containers have defined sizes, so the goal of any optimization effort is to identify the best method to pack a collection of pods to maximize node utilization.
In the classic bin packing problem, we are given a set of items and containers (bins) of the same capacity, and the goal is to fit all of the items into the fewest containers. In Kubernetes, the items are pods and the containers are nodes. The idea is to pack all the pods into the fewest nodes possible so that nodes achieve a high level of utilization and businesses avoid paying for cloud resources that aren’t fully utilized.
Bin packing opens the door to higher node utilization, which means that businesses make the most out of every single virtual machine they pay for. Instead of running three small workloads in three machines, teams can use bin packing to run all the workloads in one machine and pay just for that machine, significantly reducing their cloud spend.
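To put rough numbers on the three-machine example, here is a small back-of-the-envelope calculation. The machine size, the hourly price, and the workload sizes are all made-up values chosen for illustration, not real cloud prices.

```python
NODE_CPU = 4           # hypothetical: vCPUs per machine
NODE_PRICE = 0.20      # hypothetical: USD per machine-hour
workloads = [1, 1, 1]  # hypothetical: CPU requested by each small workload


def summarize(node_count: int) -> tuple[float, float]:
    """Return (CPU utilization, hourly cost) when the workloads run on `node_count` machines."""
    utilization = sum(workloads) / (node_count * NODE_CPU)
    cost = node_count * NODE_PRICE
    return utilization, cost


spread_util, spread_cost = summarize(3)  # one workload per machine
packed_util, packed_cost = summarize(1)  # all three workloads on one machine
savings = 1 - packed_cost / spread_cost

print(f"spread: {spread_util:.0%} utilized, ${spread_cost:.2f}/h")
print(f"packed: {packed_util:.0%} utilized, ${packed_cost:.2f}/h")
print(f"savings: {savings:.0%}")
```

Under these assumptions, utilization triples and the hourly bill drops by roughly two thirds, simply because two of the three machines are no longer needed.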
Bin packing algorithms attempt to find the best solution to the bin packing problem, where the goal is to pack a defined set of items into as few containers (in Kubernetes, nodes) as possible.
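One well-known heuristic for this problem is first-fit decreasing: sort the items from largest to smallest, then put each one into the first container that still has room, opening a new container only when none fits. The sketch below is a generic textbook version, not the algorithm used by Evictor or by the Kubernetes scheduler; the function name, the pod names, and the sizes (CPU in millicores) are hypothetical. Because bin packing is NP-hard, heuristics like this aim for a good packing rather than a guaranteed optimal one.

```python
def first_fit_decreasing(items: dict[str, int], capacity: int) -> list[dict[str, int]]:
    """Pack `items` (name -> size) into bins of equal `capacity`.

    Returns a list of bins, each a dict of the items placed in it.
    """
    bins: list[dict[str, int]] = []
    # Largest items first, so big items claim space before small ones fill the gaps.
    for name, size in sorted(items.items(), key=lambda kv: kv[1], reverse=True):
        if size > capacity:
            raise ValueError(f"{name} ({size}) does not fit in any bin of size {capacity}")
        for b in bins:
            if sum(b.values()) + size <= capacity:
                b[name] = size  # first bin with enough room wins
                break
        else:
            bins.append({name: size})  # no existing bin fits: open a new one
    return bins


# Hypothetical pods (CPU requests in millicores) and 4000m nodes.
pods = {"a": 2000, "b": 1500, "c": 1500, "d": 1000, "e": 1000, "f": 500, "g": 500}
packed = first_fit_decreasing(pods, capacity=4000)
for i, node in enumerate(packed, start=1):
    print(f"node {i}: {sorted(node)} uses {sum(node.values())}m")
```

For this input the requests add up to 8000m, so two 4000m nodes is the minimum possible, and the heuristic reaches it.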
There are a number of both open-source and commercial tools that modify the usual decision-making process of the Kubernetes scheduler and bin pack pods into as few nodes as possible. CAST AI is one of them, with Evictor acting as its bin packing mechanism.
The best indicator of high node utilization and successful bin packing is a lower cloud bill. By running the same number and size of workloads on fewer virtual machines than before, teams can save significantly on their cloud costs.