Any cost optimization you do begins with understanding your cloud bill. But how do you get started when a typical cloud bill looks like this?
This article explains the ins and outs of cloud bills to help you make sense of and control your expenses.
What you’ll find inside this article:
- The cloud may come with a high price tag, here’s why
- Why optimize cloud costs? or, The dangers of overprovisioning
- 5 cloud bill issues you’re bound to encounter, sooner or later
- Is automation possible?
The cloud may come with a high price tag, here’s why
You’re not a terrible DevOps engineer if your cloud bill usually exceeds your planned budget. On average, public cloud spend goes over budget by 13%.
This isn’t surprising at all. After all, you don’t know how much you’ll need to pay until you get your cloud bill at the end of the month. And forecasting these expenses is no walk in the park. This is what makes cloud cost optimization so challenging.
Why is it so hard to control cloud bills?
In the early days of IT, people were divided into those who handled costs and those who wrote code. The latter often had no direct impact on the costs of running that code. Software efficiency matters a lot, but let’s put that aside for a moment.
The traditional software delivery setup was clear on roles and responsibilities – it was the system administration group (later called DevOps or SRE) who took care of infrastructure deployment and ongoing operational costs.
In this setup, it was typical to release software once a month or every couple of months. The traditional way of delivering software to production wasn’t driving innovation. Enter the service team – a group of engineers who could write code, test it automatically, deploy it, and be held responsible for 24/7 operations.
Since software engineers were in the business of software innovation, they needed access to resources. Thanks to the cloud, they could order the infrastructure they needed through Infrastructure-as-a-Service platforms and then automate deployments through Infrastructure-as-Code.
This led to massively faster innovation cycles. In times of economic boom, this type of rapid innovation, infrastructure deployment, and potential associated waste were acceptable to many companies.
But the pendulum has swung back. We are moving into a time when CFOs, controllers, and financial departments need to know more about how much it costs for different departments to use cloud resources.
It’s a hard nut to crack.
Is the team ordering just enough, too much, or too little?
Are cloud costs optimized?
How can I forecast cloud costs based on projected business metrics and growth?
Finance departments need to answer these questions for every single team/group/department that uses cloud services. And in an ideal scenario, the engineering service team should be able to answer some of them as well.
Why optimize cloud costs? or, The dangers of over-provisioning
We all want to pay less for cloud services. But cloud cost optimization is about more than that. Here are three problems your team might face if you leave your cloud bill to fate.
If cloud cost management isn’t your forte, you might end up wasting resources. This can happen to teams that move an application to the cloud and make assumptions about the resources needed to support the software based on how it ran on-premises, leading to potential over-provisioning.
In the best case, planning cloud expenses without real visibility into utilization will leave you drowning in a sea of unused CPU.
Even worse, your application could be crashing due to insufficient resources if under-provisioned. There’s no real safety net to catch you when you make an error on either side.
It’s so easy to spin up an instance for a project and then forget to shut it down. Many teams deal with orphaned instances that have no owner but still generate costs. Shadow IT projects similarly produce poorly accounted-for resources in the cloud.
Some teams have a tendency to overprovision resources just to add some extra capacity and create a “just in case” buffer. The limited visibility or control of these resources might snowball into a huge problem at the end of the month.
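A first pass at spotting orphaned or idle instances can be quite small. The sketch below is only an illustration: the instance records, the `owner` field, the CPU samples, and the 5% idle threshold are all made-up assumptions, not data from any provider's API.

```python
from statistics import mean

# Hypothetical inventory; in practice this would come from your provider's
# inventory and monitoring data.
INSTANCES = [
    {"id": "i-a", "owner": "checkout", "cpu_samples": [55, 60, 70]},
    {"id": "i-b", "owner": None, "cpu_samples": [40, 45, 50]},
    {"id": "i-c", "owner": "search", "cpu_samples": [2, 3, 1]},
]


def review_candidates(instances, idle_cpu_percent=5.0):
    """Map instance id -> reasons it deserves a second look."""
    flagged = {}
    for inst in instances:
        reasons = []
        # No recorded owner: a likely orphan.
        if not inst["owner"]:
            reasons.append("no owner")
        # Average CPU below the (assumed) idle threshold.
        if inst["cpu_samples"] and mean(inst["cpu_samples"]) < idle_cpu_percent:
            reasons.append("idle")
        if reasons:
            flagged[inst["id"]] = reasons
    return flagged


if __name__ == "__main__":
    print(review_candidates(INSTANCES))
```

A flagged instance isn't necessarily waste. The point is to produce a short list that a human can review before the end of the month.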
Deciding on a path with limited information
Recently, we have seen many companies that are migrating from the cloud back to on-premises, citing better cost control as their primary reason. But migration generates costs as well – from planning and data egress to re-aligning your service continuity and disaster recovery plan.
A migration from the cloud to on-premises affects your ability to innovate and quickly respond to change and comes at a steep price.
Consider this scenario:
You’re paying way too much for the cloud. Managing and estimating these costs is hard. So, you move back to data centers and can finally make sense of your infrastructure expenses. You’re killing it! But you’re also killing innovation in the process.
This is a pain many engineers have felt first-hand. Once on-premises, teams can no longer be flexible and experiment because the infrastructure or DevOps team plans the capacity months ahead and blocks certain expansion scenarios.
5 cloud bill issues you’re bound to encounter, sooner or later
1. Cloud bills are hard to understand
They’re long, complicated, hard to unpack and comprehend. The example we shared above speaks volumes.
Every service in your bill has its own billing metric. For example, within Amazon Simple Storage Service (S3), some charges are based on the number of requests, while others are based on gigabytes. That’s why making sense of the cloud bill is such an overwhelming task.
To make sense of your usage and costs, you need to look into various areas in your cloud service provider (CSP) console.
Just take a look at the AWS Billing and Cost Management dashboard. To get a more granular view, you need to check Cost Explorer and then group and report on costs by certain attributes – for example, group resources by region or service.
This approach to cloud cost optimization is time-consuming and relies heavily on human intervention.
Now imagine doing that for more than one team or department using the same cloud service.
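To picture what grouping costs by an attribute means, here is a minimal sketch. It does not call any AWS API; the line items, field names (`service`, `region`, `cost_usd`), and amounts are invented for illustration, and a real billing export has far more columns.

```python
from collections import defaultdict

# Hypothetical line items standing in for rows of a billing export.
LINE_ITEMS = [
    {"service": "EC2", "region": "us-east-1", "cost_usd": 120.0},
    {"service": "S3", "region": "us-east-1", "cost_usd": 30.5},
    {"service": "EC2", "region": "eu-west-1", "cost_usd": 80.0},
    {"service": "S3", "region": "eu-west-1", "cost_usd": 9.5},
]


def group_costs(items, attribute):
    """Sum cost_usd for each distinct value of the given attribute."""
    totals = defaultdict(float)
    for item in items:
        totals[item[attribute]] += item["cost_usd"]
    return dict(totals)


if __name__ == "__main__":
    print(group_costs(LINE_ITEMS, "service"))
    print(group_costs(LINE_ITEMS, "region"))
```

The same function answers both questions (cost per service, cost per region), which is roughly what you do by hand each time you change the grouping in a console report.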
2. Billing for multiple clouds is even harder
Multiply the bill problems above by the number of clouds you use.
Let’s say that you’re using AWS. You’re now used to it; its costs are manageable. But what if another department unexpectedly starts using Microsoft Azure in a shadow IT project?
Just compare the cloud bill from AWS and Azure; they’re worlds apart.
3. Multiple teams working in one financial account
Step into the shoes of a CFO and their finance team.
You’re dealing with several departments that contribute to the cloud bill. How can one make sense of it? Who is using which resources?
Public cloud providers offer mechanisms that allow categorizing spending by accounts, organizations, or projects to make sure that a team or department keeps within its spending parameters.
AWS
- Organizations – This feature helps to centrally manage and govern your environment when scaling AWS resources. You can create new AWS accounts and allocate resources, organize your workflows by grouping accounts, apply budgets and policies to accounts or groups, and use a single payment method for simpler billing.
- Tagging resources – You can tag resources directly, but don’t expect them to show up in your bills. It’s your job to break down data by tags. You can do that by writing reports in the Cost Explorer or downloading the data from S3 and using it directly, which is definitely not a trivial task. There are entire companies busy developing tools for expressing and representing bills, like CloudHealth by VMware.
Microsoft Azure
- Resource groups – A resource group is a container that consists of resources that you want to manage as a group. Azure recommends bringing together resources that share the same lifecycle to deploy, update, and delete them as a group.
- Tagging is also available as an option for Azure customers.
Google Cloud Platform
- Projects – In GCP, a project includes a set of users, enabled APIs, billing settings, and authentication and monitoring settings for those APIs. You can create multiple projects and use them to organize your cloud resources into logical groups to help with understanding their cost.
- Google Cloud supports tags for billing as well – they are called labels. Some GCP resources haven’t implemented the ability to label yet, but that gap will likely close soon.
Each CSP approaches costs and billing differently. This makes it even harder to keep track of your resources, how you use them, and how much they cost you by hand.
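Whatever the provider calls them (tags or labels), breaking a bill down by tag boils down to the same operation. The sketch below assumes a simplified, hypothetical resource record with a `tags` dictionary and a `team` key; it also reports how much of the spend carries no tag at all, since that share is the part nobody can be asked about.

```python
from collections import defaultdict

# Hypothetical resources; "tags" stands in for AWS/Azure tags or GCP labels.
RESOURCES = [
    {"id": "vm-1", "cost_usd": 100.0, "tags": {"team": "checkout"}},
    {"id": "vm-2", "cost_usd": 60.0, "tags": {"team": "search"}},
    {"id": "db-1", "cost_usd": 40.0, "tags": {}},
]


def cost_by_tag(resources, tag_key, untagged_label="untagged"):
    """Sum cost per tag value; resources missing the tag go to one bucket."""
    totals = defaultdict(float)
    for resource in resources:
        owner = resource["tags"].get(tag_key, untagged_label)
        totals[owner] += resource["cost_usd"]
    return dict(totals)


def untagged_share(resources, tag_key):
    """Fraction of total cost coming from resources without the tag."""
    total = sum(r["cost_usd"] for r in resources)
    if total == 0:
        return 0.0
    untagged = sum(r["cost_usd"] for r in resources if tag_key not in r["tags"])
    return untagged / total


if __name__ == "__main__":
    print(cost_by_tag(RESOURCES, "team"))
    print(f"untagged share: {untagged_share(RESOURCES, 'team'):.0%}")
```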
Do you run your application on Kubernetes? Check this cost monitoring guide: Kubernetes Cost Monitoring: 3 Metrics You Need to Track ASAP
4. Budgeting for the cloud
Each cloud provider offers budgeting tools that help CFOs and their financial teams to restrict resources that can be used in a project in line with its budget for better cloud cost management.
But cloud budgets tend to overrun. Case in point? The Silicon Valley startup Milkie Way burned through $72k while testing Firebase + Cloud Run and almost went bankrupt.
The trouble in that case was that Google evaluates your budget at the end of the day, and the team blew through theirs within hours.
In a similar incident, a team of software developers at Adobe incurred $80k a day in unplanned cloud costs, with a final bill that surpassed half a million dollars. For an enterprise, this could easily become a multi-million dollar cloud bill.
What causes cloud budgets to overrun?
- Discovering costly requirements after formal discovery is finished.
- Not knowing your system requirements upfront and having wrong assumptions about how the system features will work and scale.
- Lack of autoscaling design for applications.
- Faulty provisioning logic in IaC that spins out of control.
- The use of serverless (functions) without thought to parallel scale.
- Lack of budget attention, aka nobody watching.
- Poorly configured notifications and alerts.
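Several of these causes come down to how often spend is compared against the budget. As a rough sketch, assuming you can get hourly spend figures from somewhere (the numbers and the 50/90/100% thresholds below are invented), a check that runs every hour rather than once a day could look like this:

```python
def crossed_thresholds(hourly_spend, budget, thresholds=(0.5, 0.9, 1.0)):
    """Return (hour_index, threshold) pairs for the hour in which each
    budget threshold is first reached by cumulative spend."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    alerts = []
    pending = sorted(thresholds)
    running_total = 0.0
    for hour, amount in enumerate(hourly_spend):
        running_total += amount
        # A single expensive hour can cross several thresholds at once.
        while pending and running_total >= pending[0] * budget:
            alerts.append((hour, pending.pop(0)))
    return alerts


if __name__ == "__main__":
    # Hypothetical runaway workload against a 1,000 USD budget.
    print(crossed_thresholds([100, 100, 400, 400, 200], budget=1000))
```

With a daily evaluation, all three alerts in this made-up example would arrive together, after the budget was already gone.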
Approach to consider: Correlating costs with business value at Netflix
Driving costs down isn’t something you do at the expense of supporting your key business goals. Netflix is a good example here. The company uses the total number of active streams (how many people are currently watching content) to measure its business value. By correlating this KPI with cloud costs, Netflix can ensure that spending growth doesn’t outpace growth in active streams.
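The general idea of tying spend to a business metric can be expressed as a unit cost. The sketch below is not Netflix's actual method, just one simple way to apply the same principle; the figures are invented.

```python
def unit_cost(cloud_cost, business_metric):
    """Cloud cost per unit of business value (for example, per active stream)."""
    if business_metric <= 0:
        raise ValueError("business metric must be positive")
    return cloud_cost / business_metric


def cost_outpaces_value(previous, current):
    """True if cost grew faster than the business metric between two periods.

    Each argument is a (cloud_cost, business_metric) tuple. With positive
    values, cost growing faster than the metric is the same as the unit
    cost going up.
    """
    return unit_cost(*current) > unit_cost(*previous)


if __name__ == "__main__":
    last_month = (10_000, 2_000)   # hypothetical: 5.00 per unit
    this_month = (13_000, 2_500)   # hypothetical: 5.20 per unit
    print(cost_outpaces_value(last_month, this_month))
```

A rising bill is fine under this view as long as the unit cost stays flat or falls.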
5. Cloud cost forecasting
Cloud bills are hard to forecast because they fluctuate depending on usage.
Forecasting should still be on top of your mind. Having a good understanding of your future resource requirements helps to keep a rein on costs.
You could also get a lower price for services, provided you’re willing to commit to a certain level of spending. The risk is a serious case of vendor lock-in that can last for years.
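Whether a commitment pays off depends on how much of it you actually use. Here is a deliberately simplified model: it assumes a flat percentage discount, ignores upfront payments and partial-hour billing, and uses made-up rates.

```python
def break_even_utilization(discount):
    """Fraction of the committed capacity you must actually use for the
    commitment to cost no more than paying on demand."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return 1 - discount


def commitment_savings(on_demand_hourly, discount, committed_hours, used_hours):
    """Savings versus on-demand (negative means the commitment cost more).

    Assumes used_hours <= committed_hours; the committed hours are paid
    for whether or not they are used.
    """
    committed_cost = on_demand_hourly * (1 - discount) * committed_hours
    on_demand_cost = on_demand_hourly * used_hours
    return on_demand_cost - committed_cost


if __name__ == "__main__":
    print(break_even_utilization(0.25))
    print(commitment_savings(1.0, 0.25, committed_hours=720, used_hours=720))
    print(commitment_savings(1.0, 0.25, committed_hours=720, used_hours=360))
```

With a hypothetical 25% discount, the commitment only wins if you use at least 75% of what you committed to, which is why forecasting matters before you sign.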
To help in forecasting, public cloud providers offer various tools:
AWS
- AWS Budgets – The tool allows setting custom budgets to track your cost and usage for various use cases. It includes alerts when the actual or forecasted cost exceeds your budget threshold and when your actual RI and Savings Plan utilization or coverage drops below the threshold you set for it.
- AWS Cost Explorer – The AWS Cost Explorer helps to visualize, understand, and manage costs and usage over time. You can create custom reports, analyze your data at a high level, and detect cost drivers or anomalies.
- AWS Cost & Usage Report – This report includes a comprehensive set of cost and usage data with additional metadata about AWS pricing, services, reserved instances, and savings plans at different levels of granularity. It itemizes usage at the account or organization level, and you can organize the costs further using Cost Allocation tags and Cost Categories.
Microsoft Azure
- Cloud cost management – This dashboard helps to track and manage costs across both Azure and AWS cloud services. It includes options for cost analysis and optimization. You can expand it with Microsoft Power BI connectors and Azure Cost Management and Billing APIs.
- Pricing calculator – A handy tool for checking the costs of different Azure configurations. You can estimate costs by products or using ready-made scenarios like Advanced analytics on Big Data or CI/CD for Containers.
Google Cloud Platform
- Google Cloud Billing – The cost forecast feature enables users to check how their costs are trending and how much they’re projected to spend in a given month. You can use it to forecast the end-of-month costs for a specific spending group, from the entire billing account down to one SKU in a single project. You can also export your entire billing data set to Google BigQuery and use tools like Google Data Studio for custom analysis.
Once you have the right data about your cloud spend, you can try different techniques of cloud cost forecasting – an essential cost optimization step.
Cloud cost forecasting techniques
- Analyze usage reports – Forecasting is impossible if you don’t have clear visibility of your spend. Monitor your resource usage reports on a regular basis and set up email and other alerts. Some CSPs offer tools that forecast how much you’re likely to spend over the next few months and recommend reserved instances or savings plans.
- Model your cloud costs – To calculate the total cost of ownership, analyze the pricing models of CSPs and plan capacity requirements over time to project costs. Measure application or workload-specific costs and create an application-level cost plan. Aggregate all of this data in one location to understand your costs and trends better.
- Identify peak resource usage scenarios – You can do this by running periodic analytics and reports over your usage data. Consider other sources of data, like seasonal customer demand patterns. Do these patterns correlate with your peak resource usage? If so, you can identify the peaks in advance.
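The simplest possible forecast is a run-rate projection: take the average daily cost so far and extend it to the end of the month. The sketch below does exactly that and nothing more. It ignores seasonality, peaks, and growth, so treat it as a baseline to compare better models against; the daily figures are invented.

```python
def forecast_month_end(daily_costs, days_in_month):
    """Project the month's total from the average daily cost so far."""
    if not daily_costs:
        raise ValueError("need at least one day of cost data")
    if len(daily_costs) > days_in_month:
        raise ValueError("more cost entries than days in the month")
    spent = sum(daily_costs)
    run_rate = spent / len(daily_costs)
    remaining_days = days_in_month - len(daily_costs)
    return spent + run_rate * remaining_days


if __name__ == "__main__":
    # Hypothetical first five days of a 30-day month.
    print(forecast_month_end([100, 120, 110, 130, 140], days_in_month=30))
```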
Can you automate any cloud cost management tasks?
You’re probably asking yourself these questions right now:
What about third-party solutions? Aren’t cost management and optimization tools here to help make sense of my cloud bill faster?
Third-party cost reporting solutions provide better visibility, but they won’t help you take automated action.
Some third-party tools can be helpful in getting a big picture view of your cloud costs. At worst, they report on your usage so that you know where the costs are coming from. At best, they give you static recommendations that need to be executed by human beings.
A simple checklist will help you do most of what a dedicated tool can accomplish at no additional cost.
To save and optimize costs in real time, use an automated platform that does the heavy lifting for you. CAST AI includes several automated cost optimization features – take a look at our use cases to check if it’s the right fit for you.
A cloud bill is a bill you get from your cloud service provider at the end of each billing period. AWS, Google Cloud, and Microsoft Azure bills look completely different, but they have one thing in common – complexity. They’re usually hard to decipher and analyze, preventing teams from learning what contributes to their cloud spend and how to optimize their cloud costs so the bill doesn’t grow larger with each month.
It’s difficult to keep track of cloud resources quickly enough to optimize them continually. This is one of the reasons why cloud cost management is frequently a point-in-time activity, meant to optimize at a higher level than at the level of individual resources. Companies need to find a way to monitor costs in real time, get notified about anomalies, and forecast these expenses more accurately. This is where an automated cloud optimization solution can help – by reacting to changes in demand in real time, it adjusts resources seamlessly and always looks for better instances to run your workloads.
You might find unexpected cost items on your cloud bill; you won’t be the only person out there struggling to decipher your bill. But there are a few things you can do to avoid that:
- Make sure to discover costly requirements during the formal discovery process.
- Know your system requirements upfront.
- Clarify your assumptions about how the system features will work and scale.
- Develop autoscaling for your applications.
- Monitor your budget to catch anomalies as soon as they happen (real-time alerting features are really helpful here).
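Catching anomalies doesn't have to start with anything sophisticated. One common baseline, sketched here with invented numbers, flags a day whose cost sits well above the recent average. The 7-day window and the factor of 3 standard deviations are assumptions you would tune for your own workloads.

```python
from statistics import mean, stdev


def find_anomalies(daily_costs, window=7, k=3.0):
    """Return indexes of days whose cost exceeds the trailing window's
    mean by more than k sample standard deviations."""
    if window < 2:
        raise ValueError("window must be at least 2")
    anomalies = []
    for i in range(window, len(daily_costs)):
        history = daily_costs[i - window:i]
        baseline = mean(history)
        spread = stdev(history)
        if daily_costs[i] > baseline + k * spread:
            anomalies.append(i)
    return anomalies


if __name__ == "__main__":
    costs = [100, 102, 98, 101, 99, 100, 100, 300, 101]
    print(find_anomalies(costs))
```

Note that a perfectly flat history has zero spread, so any increase at all gets flagged. In practice you would probably add a minimum absolute difference on top.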
Here are several bullet-proof approaches to keep your cloud bill in check and reduce your costs:
- Shut down unused instances, environments, and other cloud resources.
- Rightsize instances to avoid overprovisioning and achieve the best performance vs. price combination.
- Eliminate shadow IT projects to gain more control over your resources.
- Use autoscaling smartly – define how much scaling a given workload needs, examine the provider’s pricing model, and make sure that the workload is scaled back down when demand drops.
Leave a reply
Going back to on-prem does sound attractive, especially when you’re a new business that hasn’t been locked in with a single cloud vendor. But with more control you lose out on more tools and, like Apple would say, the “ecosystem”. Plus the migration process gives me a migraine, so beware folks and make up your mind fast!
Since my company uses AWS Cost Explorer and the Cost & Usage Report, I feel like we have a good grasp of what’s going on in the bills we are paying. But from time to time, usually after a traffic spike, we can’t figure out why our cloud bill changed so drastically (ofc it’s overprovisioning or bad automation most of the time), and the surprise we get when the bill arrives is not nice at all.
Yeah, the first thing I do on any project is set email budget alarms to avoid monster bills. Does CAST provide some sort of autoscaling functionality?
Good article on cloud bill issues…