The cloud brought us many great products, but also a massive knowledge gap between technical and non-technical folks. Studies on the topic clearly point to this: teams struggle to understand the cost dynamics of the cloud and cloud-native technologies like Kubernetes.
For a cloud investment to work out, you need the finance leader to become your strategic partner. That’s why the first step is understanding what makes cloud financial management different from traditional IT cost management.
Moving beyond traditional IT financial management
Managing cloud spend is entirely different from managing traditional IT expenses like physical servers or on-premises software licenses.
In a traditional setup, it’s up to finance teams to set and approve budgets so that procurement teams can make purchases and manage vendors. Implementing and provisioning the new infrastructure falls to technology teams. Managing costs in such a crystal-clear process sounds like a piece of cake.
Enter the cloud. Engineers can acquire new resources independently, in a few minutes, and without consulting anyone about it. They aren’t used to managing cloud costs, simply because they never had to think twice about them.
Budgeting and planning have changed a lot in the cloud. In the past, scaling up meant investing in new hardware. This was planned months, if not years, in advance. In the cloud, users can scale resources up and down on an ongoing basis. This results in great variability in cloud expenses month over month, down to day over day.
Now you see why traditional approaches to financial management aren’t going to cut it in the cloud.
Even though cloud computing (together with its unique financial risks) has been around for a while, many organizations still struggle to figure out cost control.
Going over budget is common in the cloud: the latest report from Flexera indicated an average overrun of 13%, while other reports point to figures as high as 39%. At the same time, companies are wasting 32% of their cloud spend, meaning that engineers fail to extract value from roughly ⅓ of the cloud capacity they request.
Why are teams struggling to control their cloud bills?
4 biggest challenges of cloud financial management
1. Lack of visibility into cloud costs
Keeping an accurate inventory of cloud resources is hard. Providers offer tagging policies and other features to make cost allocation and traceability easier, but it’s up to each team whether those features actually get used.
Tracking spend grows harder as cloud adoption increases within the company. Ideally, finance teams can trace cloud costs back to specific initiatives, teams, or projects, which only works when resources are tagged consistently.
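As a rough illustration of tracing costs back to teams, the sketch below groups billing line items by a tag. The record layout, the `team` tag key, and the `allocate_by_tag` helper are assumptions made for this example and do not mirror any provider's billing export.

```python
from collections import defaultdict

def allocate_by_tag(line_items, tag_key="team"):
    """Sum cost per tag value; resources without the tag land in 'untagged'."""
    totals = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get(tag_key, "untagged")
        totals[owner] += item["cost"]
    return dict(totals)

# Hypothetical line items, not a real billing export
line_items = [
    {"resource": "vm-1", "cost": 120.0, "tags": {"team": "checkout"}},
    {"resource": "vm-2", "cost": 80.0, "tags": {"team": "search"}},
    {"resource": "disk-7", "cost": 40.0, "tags": {}},
    {"resource": "vm-3", "cost": 60.0, "tags": {"team": "checkout"}},
]

totals = allocate_by_tag(line_items)
# Share of spend nobody can be held accountable for
untagged_share = totals.get("untagged", 0.0) / sum(totals.values())
print(totals, f"untagged: {untagged_share:.0%}")
```

The untagged share is a useful number to watch on its own: the larger it gets, the less of the bill can be traced to an owner.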
2. Unexpected bills
Imagine the surprise of the Adobe employee who discovered an unplanned cloud bill of over $500k because of a workload left running unchecked.
Cloud computing makes compute, storage, and other resources available to everyone. Without guardrails to control and govern cloud spending, this autonomy can result in unexpected or unexplainable costs. Teams are still falling short of consolidating their billing and understanding their usage and spending habits.
3. Lack of cost oversight
Working across functions (technical, product, and business) without a centralized view of the actual spend makes understanding cost and usage trends very difficult. Without cost oversight, resources tend to be allocated purely on IT priorities such as performance, reliability, and security, with little regard for efficiency.
It’s not uncommon for engineers to buy a little more capacity than necessary just to sleep well at night. But before you know it, an extra expense here and there snowballs into that all-too-common third of cloud resources going to waste.
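To make the snowball effect concrete, here is a minimal sketch that compares requested capacity with what workloads actually use. The workload names and CPU figures are invented for illustration, and `wasted_share` is a hypothetical helper.

```python
def wasted_share(requested, used):
    """Fraction of requested capacity that sits idle (0.0 if fully used)."""
    if requested <= 0:
        raise ValueError("requested capacity must be positive")
    return max(0.0, requested - used) / requested

# Hypothetical workloads: (requested vCPUs, average vCPUs actually used)
workloads = {"api": (8.0, 6.0), "worker": (16.0, 10.0), "cron": (4.0, 3.0)}

total_requested = sum(req for req, _ in workloads.values())
total_used = sum(used for _, used in workloads.values())
overall = wasted_share(total_requested, total_used)
print(f"overall waste: {overall:.0%}")
```

No single workload here looks badly oversized, yet the padding adds up to about a third of the requested capacity.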
4. Inaccurate forecasting
Forecasting cloud expenses is a popular activity among companies looking to save on capacity with committed use discounts (Reserved Instances and Savings Plans in AWS).
But committed cloud spend is usually ~20% lower than the actual spend, and some companies exceed their initial estimates by 2x! Pinterest is one example: during one holiday season, the company’s cloud spend went beyond the initial estimates by $20 million.
This is another problem that results from traditional methods of budgeting and financial variance analysis not translating well to the cloud.
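A simple variance check shows how quickly cloud spend can drift from a forecast. The sketch below uses made-up monthly figures and an assumed 20% tolerance; `forecast_variance` is a hypothetical helper, not part of any finance tool.

```python
def forecast_variance(forecast, actual):
    """Return (absolute, relative) variance of actual spend vs. forecast."""
    if forecast <= 0:
        raise ValueError("forecast must be positive")
    diff = actual - forecast
    return diff, diff / forecast

# Hypothetical monthly figures in USD: (forecast, actual)
months = {
    "Oct": (100_000, 104_000),
    "Nov": (100_000, 125_000),
    "Dec": (110_000, 220_000),
}
THRESHOLD = 0.20  # assumed tolerance: flag overshoots above 20%

flagged = [
    month for month, (forecast, actual) in months.items()
    if forecast_variance(forecast, actual)[1] > THRESHOLD
]
print(flagged)
```

Running the same check daily rather than monthly fits the cloud better, since spend can change from one day to the next.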
4 tactics for managing your cloud costs
1. It all starts with cost visibility
A typical cloud bill is long and hard to understand, as a look at a bill from AWS or Azure quickly shows.
Every service has a billing metric defined for it. Understanding your usage just by looking at the bill is challenging, even when you’re doing that for just one team and one cloud provider. What if you have more than one team to analyze?
Here’s a typical analysis sequence managers go through:
- Analyzing usage reports to identify any spending patterns.
- Detecting peak resource usage scenarios with periodic analytics and historical usage data.
- Checking if the seasonal customer demand patterns correlate with peak resource usage (this helps in identifying them in advance).
- Monitoring resource usage reports and setting up alerts to keep cloud costs in check.
- Developing an application-level cost plan by measuring application- or workload-specific costs.
- Analyzing pricing models of the cloud provider and planning capacity requirements.
These tasks aren’t one-off jobs and need to be done on a regular basis. At this point, many companies turn to native cost reporting solutions from cloud providers or use third-party tools that bring deeper insights in near real time.
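The monitoring and alerting step in the list above can be sketched as a comparison of each day's cost against a trailing average. The 7-day window, the 1.5x factor, and the sample figures are assumptions chosen for this example; real tools use more robust anomaly detection.

```python
from statistics import mean

def spend_alerts(daily_costs, window=7, factor=1.5):
    """Return indexes of days whose cost exceeds factor x the trailing mean."""
    alerts = []
    for i in range(window, len(daily_costs)):
        baseline = mean(daily_costs[i - window:i])
        if daily_costs[i] > factor * baseline:
            alerts.append(i)
    return alerts

# Hypothetical daily costs in USD; day 8 contains a spike
daily_costs = [100, 102, 98, 101, 99, 100, 100, 103, 180, 105]
alerts = spend_alerts(daily_costs)
print(alerts)
```

A trailing baseline adapts to gradual growth, so only sudden jumps trigger an alert.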
Check out this guide to dive deeper into your cloud bill: Surprised by your cloud bill? 5 common issues & how to deal with them
2. Selecting the right cloud resources
Picking the right cloud resources is tricky when you’re facing such a broad spectrum of services. Compute is a good example – AWS offers almost 400 different instance types!
Here’s an example sequence for making the best choice:
- Define your minimum requirements – consider all compute dimensions, including CPU (architecture, count, choice of processor), memory, SSD storage, and network connectivity.
- Choose the right instance type – you can typically get various combinations of CPU, memory, storage, and networking capacities that come packaged in instance types optimized for compute, memory, and others.
- Set your instance size – the instance needs to have enough capacity to accommodate the workload’s requirements and include extra options like bursting if necessary.
- Analyze different pricing models – check out different rates for on-demand (pay-as-you-go), reserved capacity, spot instances, and dedicated hosts. Use this guide: How to choose the best VM type for the job and save on your cloud bill
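The selection sequence above can be sketched as a filter followed by a price comparison. The catalog entries, names, and hourly prices below are invented for illustration and do not correspond to any provider's offering; `cheapest_fit` is a hypothetical helper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    memory_gib: float
    hourly_usd: float

def cheapest_fit(catalog, min_vcpus, min_memory_gib):
    """Return the cheapest instance meeting the minimums, or None."""
    candidates = [
        i for i in catalog
        if i.vcpus >= min_vcpus and i.memory_gib >= min_memory_gib
    ]
    return min(candidates, key=lambda i: i.hourly_usd, default=None)

# Hypothetical catalog with made-up prices
catalog = [
    InstanceType("general-small", 2, 8, 0.10),
    InstanceType("general-large", 4, 16, 0.20),
    InstanceType("compute-large", 8, 16, 0.34),
    InstanceType("memory-large", 4, 32, 0.26),
]

choice = cheapest_fit(catalog, min_vcpus=4, min_memory_gib=24)
print(choice)
```

With hundreds of instance types and several pricing models, the same filter has to run over a much larger catalog, which is why this step is a good candidate for automation.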
3. Autoscaling resources to match demand
Autoscaling is one of the greatest optimization methods for the cloud and cloud-native technologies like Kubernetes. The tighter your scaling mechanisms are configured, the lower the waste in running the application (and the more valuable your cloud financial management efforts!).
Real-time autoscaling solutions use business metrics to calculate the optimal number of instances. They can scale up, down, or to zero if there’s no more work to be done. The mechanism ensures that resources in use always match the application’s requirements in real time.
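As a minimal sketch of scaling on a business metric, the function below derives a replica count from queue length. The metric, the per-replica capacity, and the `max_replicas` cap are assumptions for this example; it is not how any particular autoscaler is implemented.

```python
import math

def desired_replicas(queue_length, per_replica_capacity, max_replicas=20):
    """Replicas needed to drain the queue, scaling to zero when it is empty."""
    if per_replica_capacity <= 0:
        raise ValueError("per_replica_capacity must be positive")
    if queue_length <= 0:
        return 0  # nothing to do: scale to zero
    needed = math.ceil(queue_length / per_replica_capacity)
    return min(max_replicas, needed)

for queue_length in (0, 120, 5000):
    print(queue_length, desired_replicas(queue_length, per_replica_capacity=50))
```

The cap keeps a sudden burst from provisioning an unbounded number of instances, which is one of the guardrails that prevents surprise bills.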
Take a look at this graph showing the difference between resources requested vs. provisioned (CPU and memory). Notice how it shrank once the team used the CAST AI autoscaling feature (marked “Start of optimization”):
4. Using spot instances
Buying idle capacity from cloud providers is a smart move, as it can get you up to a 90% discount off on-demand pricing. But these resources can be reclaimed at any moment, leaving you only a small window of time to find another place for your application to run.
When using spot instances, managers typically follow this sequence:
- Checking if the workload is spot-ready – Can it tolerate interruptions? How long will it take to complete the job? Is this a mission-critical workload?
- Going through the available spot instances – look at less popular instance types, as they tend to have a lower frequency of interruption and can run for longer periods of time.
- Bidding on a spot instance – set the highest amount to pay for your chosen spot instance. Setting the maximum price at the level of on-demand pricing is the rule of thumb here.
- Managing spot instances in groups – this is how you can request numerous instance types at once, increasing your chances of landing a spot instance.
- Preparing for interruptions – have a plan B for your application when your spot instances are reclaimed.
- Spending a lot of time on configuration, setup, and maintenance tasks – or automating it!
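Part of the sequence above can be sketched as a filter over spot offers: keep the types with an acceptable interruption rate and a price below on-demand, then request several at once. The offer data, the 10% interruption threshold, and the `pick_spot_pool` helper are assumptions for this example, not a provider API.

```python
def pick_spot_pool(offers, max_interruption_rate=0.10, pool_size=3):
    """Pick the cheapest offers that are below on-demand price and
    whose interruption rate is within the accepted threshold."""
    eligible = [
        o for o in offers
        if o["interruption_rate"] <= max_interruption_rate
        and o["spot_usd"] < o["on_demand_usd"]
    ]
    return sorted(eligible, key=lambda o: o["spot_usd"])[:pool_size]

# Hypothetical offers with made-up prices and interruption rates
offers = [
    {"type": "a.large", "spot_usd": 0.03, "on_demand_usd": 0.10, "interruption_rate": 0.20},
    {"type": "b.large", "spot_usd": 0.04, "on_demand_usd": 0.11, "interruption_rate": 0.05},
    {"type": "c.large", "spot_usd": 0.05, "on_demand_usd": 0.12, "interruption_rate": 0.03},
    {"type": "d.large", "spot_usd": 0.06, "on_demand_usd": 0.12, "interruption_rate": 0.08},
    {"type": "e.large", "spot_usd": 0.07, "on_demand_usd": 0.13, "interruption_rate": 0.04},
]

pool = pick_spot_pool(offers)
pool_types = [o["type"] for o in pool]
print(pool_types)
```

Note that the cheapest offer is skipped because of its high interruption rate: the lowest price is not always the lowest total cost once interruptions are factored in.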
Cloud financial management calls for automation
As you can see, many cloud financial management and optimization tasks involve multiple steps that need to be completed over and over again. That’s why teams are turning to automation solutions that take care of tasks like resource selection or spot instance management.
This type of optimization doesn’t require any extra work from engineers and results in round-the-clock savings of 50%+, even for teams that are doing a great job optimizing manually.
A fully automated solution can react quickly to changes in resource demand or pricing, helping you to stay on top of the real-time nature of cloud financial management.
Book a demo and see how this type of automation works for yourself.