TL;DR: By dogfooding the Network Cost report on CAST AI, we identified high inter-zone traffic between two services. After moving these services to the same zone, we cut our data transfer costs by around 70%.
Data egress is one of the sneakiest hidden cost items on a cloud bill, and it’s an easy trap to fall into. Unlike subscriptions, egress costs aren’t set in stone. They often grow when you introduce new components or features, make an acquisition, enter a new market, or become subject to restrictions that require you to move data.
Data egress expenses are extremely difficult to assess and estimate. Knowing where they come from is the first step to taking control and potentially minimizing them.
At CAST AI, we’re proud to manage our own clusters using the very platform we developed. Using our own solution – otherwise known as dogfooding – just makes sense for us.
This is a case study showing how we reduced our data egress charges by 70% using the Network Cost report in the CAST AI platform.
Step 1: Analyzing the daily cloud cost report
Even before introducing the Network Cost report, we had a clear idea of how much we paid for data transfer. After all, we could get this figure from the cloud provider’s billing reports.
In late August 2023, we looked at our daily cloud cost report and saw that data transfer costs were significant and appeared to be growing.
The fees are marked in orange in the chart below (spoiler alert: you can probably tell that things got much better by August 31st):
Step 2: Checking the costs at the cluster level
After introducing the Network Cost report, we could verify that the cloud bill reflected the traffic in our cluster:
Which services were responsible for generating the highest data transfer costs? Could we possibly move these services to the same zone to avoid cross-zone charges?
To answer these questions, we headed over to the Network Cost report.
Step 3: Checking the Network Cost report at the workload level
When ordering services by Total cost incurred, we quickly identified the cause of such high data transfer costs:
The distributor service was responsible for most of the network egress. Next, we had to find out where that data was going.
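To make the ranking step concrete, here is a minimal Python sketch of the kind of aggregation such a report performs. It is not CAST AI’s implementation. The flow records, zone names, the `receiver`, `cache`, and `api` workloads, and the flat per-GB price are all hypothetical, and the cost model assumes that only traffic leaving the sender’s zone is billed.

```python
from collections import defaultdict

# Hypothetical flow records:
# (source workload, source zone, destination workload, destination zone, GB sent)
FLOWS = [
    ("distributor", "zone-a", "receiver", "zone-b", 900.0),
    ("distributor", "zone-a", "cache", "zone-a", 300.0),
    ("api", "zone-b", "distributor", "zone-a", 50.0),
]

# Assumed flat price per GB of cross-zone traffic; real rates vary by provider.
CROSS_ZONE_PRICE_PER_GB = 0.01


def egress_cost_by_workload(flows, price_per_gb=CROSS_ZONE_PRICE_PER_GB):
    """Return (workload, cost) pairs sorted by cross-zone egress cost, highest first."""
    costs = defaultdict(float)
    for src, src_zone, _dst, dst_zone, gb in flows:
        # In this model only traffic that leaves the sender's zone is charged;
        # intra-zone traffic contributes zero cost.
        costs[src] += gb * price_per_gb if src_zone != dst_zone else 0.0
    return sorted(costs.items(), key=lambda item: item[1], reverse=True)


ranked = egress_cost_by_workload(FLOWS)
for workload, cost in ranked:
    print(f"{workload}: ${cost:.2f}")
```

With these made-up figures, the distributor workload ranks first because it is the only one sending a large volume across zones.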
Step 4: Identifying sources of ingress and egress traffic
By ordering services by Total traffic, we pinpointed another service that had a very high level of traffic but a relatively low total cost.
The explanation is that this service mostly receives traffic (ingress), which is free. Its ingress volume, however, was similar to the egress volume of the distributor service, which suggested that the two services were exchanging most of this traffic with each other.
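The matching logic described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the workload names other than distributor and all volumes are hypothetical. Similar volumes are a hint rather than proof, so a per-destination traffic breakdown should still confirm the pairing.

```python
def closest_receiver(sender_egress_gb, ingress_by_workload):
    """Return the workload whose ingress volume is closest to the sender's egress."""
    return min(
        ingress_by_workload,
        key=lambda name: abs(ingress_by_workload[name] - sender_egress_gb),
    )


# Hypothetical monthly volumes in GB.
distributor_egress_gb = 900.0
ingress_gb = {"receiver": 880.0, "api": 40.0, "cache": 300.0}

match = closest_receiver(distributor_egress_gb, ingress_gb)
print(match)
```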
Solution: Moving two services to the same AZ
To cut the egress charges, we placed these two services in the same availability zone. Cross-zone traffic between them dropped to zero, and because intra-zone traffic is free, the charges for that traffic disappeared.
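The post does not say which mechanism was used to co-locate the services. One common option in Kubernetes is pod affinity keyed on the zone topology label. The sketch below builds such a rule as a Python dictionary and prints it as JSON. The `app: distributor` label and the helper name are assumptions for illustration only.

```python
import json


def zone_affinity_patch(peer_label_key, peer_label_value):
    """Build a Deployment fragment that requires scheduling pods in the
    same zone as pods carrying the given label (hypothetical helper)."""
    return {
        "spec": {
            "template": {
                "spec": {
                    "affinity": {
                        "podAffinity": {
                            "requiredDuringSchedulingIgnoredDuringExecution": [
                                {
                                    "labelSelector": {
                                        "matchLabels": {peer_label_key: peer_label_value}
                                    },
                                    # Well-known node label that identifies the zone.
                                    "topologyKey": "topology.kubernetes.io/zone",
                                }
                            ]
                        }
                    }
                }
            }
        }
    }


# Assumed label on the peer service's pods.
patch = zone_affinity_patch("app", "distributor")
print(json.dumps(patch, indent=2))
```

The fragment would be merged into the Deployment of the service that should follow its peer. Keep in mind that pinning both services to a single zone trades some resilience to zone outages for the lower transfer cost.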
Results: Cost savings of 70% (c. $2,000 per month)
By moving the two services to the same availability zone, we cut our data transfer costs by around 70%, which corresponds to about $2,000 per month.
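As a quick sanity check, the two figures above imply a rough before-and-after spend. The values below are derived from the rounded numbers in this post, not from billing data, so treat them as estimates.

```python
MONTHLY_SAVINGS_USD = 2000.0  # approximate figure from this post
SAVINGS_FRACTION = 0.70       # approximate figure from this post

# If $2,000 is about 70% of the previous spend, the previous spend follows directly.
implied_before = MONTHLY_SAVINGS_USD / SAVINGS_FRACTION
implied_after = implied_before - MONTHLY_SAVINGS_USD

print(f"before: ~${implied_before:,.0f}/month, after: ~${implied_after:,.0f}/month")
# prints: before: ~$2,857/month, after: ~$857/month
```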
Connect your cluster to CAST AI and check your network traffic costs for yourself – you might discover a similar outlier and apply an easy fix to score a quick win.