Spot Instances vs On-Demand: Reduce Your Costs Using Automation

Spot Instances can cost up to 90% less than On-Demand pricing, but the trade-off is risk. This guide shows how automation helps you use both smartly, without disruption.

Giri Radhakrishnan

Imagine you’re at a beachfront resort. All the ‘reserved’ premium lounge chairs are expensive but guaranteed. You, however, see an open chair with no name tag. You settle in, enjoy the sun, sip a drink… until someone with a reservation shows up. Time to pack up.

That’s what running on Spot Instances is like. Cloud resources at a fraction of the price, available when no one else is using them. The catch? The cloud provider may reclaim them at any time.

But what if you could stay on that beach chair and know exactly when it’s time to move, or better yet, have another one waiting for you?

On-Demand Instances offer guaranteed availability and predictable pricing at a premium cost. Spot Instances, on the other hand, provide access to the same resources at massive discounts, with the trade-off of potential interruptions that may impact availability. 

While the savings with Spot Instances are compelling, many teams hesitate to rely on them for critical workloads due to concerns about stability and complexity. However, with the right automation strategies in place, you can harness the power of Spot Instances without compromising on availability or reliability. 

In this article, we’ll explore the key differences between On-Demand and Spot Instances and show how automation can help teams unlock cost savings while keeping services resilient and highly available.

What are On-Demand Instances?

On-Demand Instances are billed according to your usage. Under this “pay-as-you-go” model, you are charged by the hour or by the second. You can purchase as much compute capacity as you need, when you need it, with no long-term commitment. 

When you no longer require an instance, you remove your workloads from that instance and decommission it, adjusting your cloud capacity to meet your changing computing requirements.

Many teams use On-Demand Instances when they require both stability and flexibility. This type of pricing might work well for short-term projects where you won’t be able to forecast how much capacity you’ll need at any time.

On-Demand Instances: pros and cons

Upside of On-Demand Instances

  • Effortless scaling – A pay-as-you-go arrangement opens the door to scalability. You can increase or decrease the number of instances you use at any moment without being bound by a specific budget or number of instances.
  • Flexibility – On-Demand Instances work well in scenarios where you don’t know how much cloud capacity a project will require. Instead of being locked into paying for capacity that you may never use (as can happen with Reserved Instances), you can constantly re-evaluate your requirements and scale your instances up and down to fit. However, this flexibility comes at a higher cost.
  • Stability – You can use such instances as long as you want, and your cloud provider won’t deny you access to their servers.

Limitations of On-Demand Instances

  • High cost – On-Demand Instances are the most expensive type of cloud instance; that is the price you pay for flexibility and the ability to scale your capacity up and down as needed.
  • Cost control challenges – Estimating the cost of the instances you need can be a hassle because rates vary by instance type and region. By giving teams the freedom to use as many resources as they need, you open yourself to the risk of costs spiraling out of control.

What are Spot Instances?

Spot Instances represent surplus capacity that a cloud provider sells at a discount, with no guaranteed run time. If demand rises and your provider needs that capacity back, they will reclaim it on short notice: AWS gives a two-minute warning, while Azure and GCP give about 30 seconds.

Spot Instances are generally recommended for stateless, containerized applications that perform short-term tasks. If you have automation mechanisms in place for provisioning, scaling, and decommissioning Spot Instances, you can run even the most demanding production workloads on them.
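As a rough illustration of what such automation does when a reclaim warning arrives, the Python sketch below works out how much time is left and picks a reaction. It assumes a notice shaped like the JSON document AWS publishes through its instance metadata service (an `action` plus a UTC `time`); the function names, the 60-second drain budget, and the action labels are hypothetical, and fetching the notice from the provider is left out.

```python
import json
from datetime import datetime, timezone


def seconds_until_reclaim(notice_json: str, now: datetime) -> float:
    """Return how many seconds remain before the instance is reclaimed."""
    notice = json.loads(notice_json)
    # Assumed format: a UTC timestamp such as "2025-01-01T08:22:00Z".
    reclaim_at = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    reclaim_at = reclaim_at.replace(tzinfo=timezone.utc)
    return (reclaim_at - now).total_seconds()


def plan_for_notice(notice_json: str, now: datetime, drain_seconds: float = 60.0) -> str:
    """Pick a (hypothetical) action label based on the time left."""
    remaining = seconds_until_reclaim(notice_json, now)
    # Enough time left: drain workloads gracefully. Otherwise reschedule at once.
    return "graceful-drain" if remaining >= drain_seconds else "immediate-reschedule"


# Example notice and clock, both made up for illustration.
notice = '{"action": "terminate", "time": "2025-01-01T08:22:00Z"}'
now = datetime(2025, 1, 1, 8, 20, 0, tzinfo=timezone.utc)
print(seconds_until_reclaim(notice, now))
print(plan_for_notice(notice, now))
```

With a 30-second warning instead of two minutes, the same logic would skip the graceful drain, which is one reason replacement capacity is usually provisioned ahead of time rather than after the notice.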

For example, one of Cast AI’s customers, Yotpo, runs at least 80% of its workloads on Spot Instances and manages the entire Spot Instance lifecycle through automation. They use Cast AI to provision the most cost-effective instance type and size and to move workloads back and forth between Spot and On-Demand Instances when Spot availability changes.

Here’s how Achi Solomon, Director of DevOps at Yotpo, described the company’s usage of Spot Instances during the most challenging time of year: Black Friday.

Spot Instances: pros and cons

Benefits of Spot Instances 

  • Massive cost reductions – Spot Instances are up to 90% less expensive than On-Demand Instances because the cloud provider sells spare capacity without committing to keep it available. 
  • Flexibility – Once you’ve determined which types of instances you require, you may request Spot Instances and be up and running immediately. When you no longer need them, you can turn them off immediately. 
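To make the size of the discount concrete, here is a small Python sketch that compares the monthly cost of one instance under both pricing models. The hourly prices are hypothetical placeholders, not quotes from any provider, and the 730-hour month is an assumption.

```python
def spot_savings(on_demand_hourly: float, spot_hourly: float, hours: float) -> dict:
    """Compare the cost of running one instance On-Demand vs on Spot."""
    if on_demand_hourly <= 0:
        raise ValueError("on_demand_hourly must be positive")
    on_demand_cost = on_demand_hourly * hours
    spot_cost = spot_hourly * hours
    return {
        "on_demand_cost": round(on_demand_cost, 2),
        "spot_cost": round(spot_cost, 2),
        "savings": round(on_demand_cost - spot_cost, 2),
        "discount_pct": round(100 * (1 - spot_hourly / on_demand_hourly), 1),
    }


# Hypothetical prices: $0.20/h On-Demand, $0.06/h Spot, 730 hours (about a month).
result = spot_savings(0.20, 0.06, 730)
print(result)
```

Because the Spot price moves over time, a real comparison would be rerun whenever prices are refreshed rather than computed once.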

Challenges of Spot Instances

  • Interruptions – If your cloud provider needs the capacity back, they give you very little warning before reclaiming Spot Instances: from about 30 seconds to two minutes, depending on the provider. 
  • Unpredictable pricing – Because the hourly pricing for Spot Instances varies depending on demand, it can be difficult to forecast exactly how much you’ll spend on them if you use them on a regular basis. As a result, even at these low prices, expenses can rise abruptly.

How often do cloud providers change Spot Instance prices?

Source: 2025 Kubernetes Cost Benchmark Report

In general, Azure and GCP Spot prices tend to be more stable and predictable, changing only a couple of times a month. On the other hand, AWS continuously changes its Spot prices, which is true for both GPU and non-GPU instances. 

The average monthly number of distinct prices is 197 for AWS, 0.3 for GCP (a new price is set every three months), and 0.8 for Azure (a new price is set less than once a month).
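One way to read these figures is to convert them into an average interval between new prices. The Python sketch below does that arithmetic, assuming a 730-hour month and evenly spaced price changes (both simplifications); the function name is illustrative.

```python
HOURS_PER_MONTH = 730  # assumption: average length of a month in hours


def hours_between_new_prices(new_prices_per_month: float) -> float:
    """Average gap between new prices, assuming changes are evenly spaced."""
    if new_prices_per_month <= 0:
        raise ValueError("new_prices_per_month must be positive")
    return HOURS_PER_MONTH / new_prices_per_month


# Monthly figures quoted in the paragraph above.
for provider, rate in {"AWS": 197, "GCP": 0.3, "Azure": 0.8}.items():
    hours = hours_between_new_prices(rate)
    months = hours / HOURS_PER_MONTH
    print(f"{provider}: a new price about every {hours:.1f} hours ({months:.2f} months)")
```

Under these assumptions, AWS works out to a new price every few hours, while GCP and Azure change on a scale of months.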

How long can you expect your instance to run without interruption?

Source: 2025 Kubernetes Cost Benchmark Report


Azure stands out with the highest average node age, at 69.4 hours. This could indicate lower volatility or better management of Spot Instances.

GCP also has a relatively long node age compared to AWS, with instances lasting 13.8 hours on average versus 7.6 hours for AWS. AWS has the shortest node lifespan of the three cloud providers, indicating higher interruption rates or shorter availability for Spot Instances.

Interruptions within one hour are the most frequent, with an average of 34% occurring within this time frame across all providers. The second most common interruption timeframe is between 1 and 2 hours (11% on average).
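If you want to produce the same kind of breakdown for your own clusters, the calculation is a simple bucketing of node lifetimes. The Python sketch below shows one way to do it; the lifetimes are made-up sample values, not data from the report, and the bucket edges are an assumption.

```python
from bisect import bisect_right


def interruption_shares(lifetimes_hours, edges=(1, 2)):
    """Percent of interrupted nodes per lifetime bucket.

    With the default edges the buckets are: under 1h, 1h to under 2h, 2h or more.
    """
    if not lifetimes_hours:
        raise ValueError("need at least one lifetime")
    counts = [0] * (len(edges) + 1)
    for hours in lifetimes_hours:
        # bisect_right returns the index of the bucket this lifetime falls into.
        counts[bisect_right(edges, hours)] += 1
    total = len(lifetimes_hours)
    return [round(100 * count / total, 1) for count in counts]


# Hypothetical lifetimes (in hours) of ten interrupted Spot nodes.
sample = [0.2, 0.5, 0.7, 0.9, 1.5, 3.0, 7.6, 13.8, 30.0, 69.4]
print(interruption_shares(sample))
```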

Which cloud provider interrupts Spot Instances most often?

AWS exhibits the highest overall interruption rate across shorter timeframes, with 50%+ of interruptions occurring in the first hour of a node’s lifetime and 9%+ of Spot nodes suffering interruptions within a week. Azure demonstrates more stability, with much lower percentages of disruption across all intervals, especially within the first 12 hours. GCP falls in the middle.

Here’s a deeper dive into interruption data per time interval:

Source: 2025 Kubernetes Cost Benchmark Report

Why is automation essential for managing Spot Instances?

1. Selecting a Spot Instance isn’t a one-time exercise

The graph below shows the price evolution of three generations of Spot Instances from the same family: m5a, m6a, and m7a.

Source: 2025 Kubernetes Cost Benchmark Report

Notice the growing difference in prices? This demonstrates that choosing a Spot Instance should not be a single point-in-time decision. 

In February 2024, all three instances were priced at a similar level, with the latest generation m7a being the most expensive. But the difference grew significantly during the last quarter of the year.

A team that chose m7a in February, judging the slight price bump worth the performance, would have faced a much wider price gap by year’s end. Mixing generations would have been more cost-effective: for example, using m6a in a dev environment (where performance isn’t critical) while running production clusters on the more expensive m7a to ensure high performance.

This is why keeping track of price trends and being able to revisit earlier decisions is important. Teams tend to stick to familiar and common instances, missing the opportunity to explore alternatives. 
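Revisiting the decision can be as simple as re-running a selection over fresh prices. The Python sketch below picks the cheapest allowed family per environment; the prices and the per-environment allow-lists are hypothetical values chosen only to mirror the pattern described above, not figures from the report.

```python
def cheapest_family(prices: dict, allowed: set) -> str:
    """Return the cheapest instance family among those an environment allows."""
    candidates = {family: price for family, price in prices.items() if family in allowed}
    if not candidates:
        raise ValueError("no allowed family has a price")
    return min(candidates, key=candidates.get)


# Hypothetical Spot prices ($/h) at two points in time.
prices_february = {"m5a": 0.060, "m6a": 0.062, "m7a": 0.065}
prices_q4 = {"m5a": 0.058, "m6a": 0.055, "m7a": 0.095}

# Hypothetical policy: dev accepts any generation, production requires m7a.
allowed = {"dev": {"m5a", "m6a", "m7a"}, "prod": {"m7a"}}

for label, prices in (("February", prices_february), ("Q4", prices_q4)):
    for env, families in allowed.items():
        print(label, env, cheapest_family(prices, families))
```

The point of the design is that the policy (which families an environment tolerates) stays fixed while the choice is recomputed each time prices change.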

2. Consolidate clusters on a larger Spot Instance

Here’s an example of the AWS G5 instance types that offer varying GPU configurations depending on the size of the instance. 

Source: 2025 Kubernetes Cost Benchmark Report

Specifically, the g5.xlarge, g5.4xlarge, and g5.16xlarge (NVIDIA A10G Tensor Core GPU) instances are each equipped with one GPU. The g5.24xlarge has four GPUs, while the g5.48xlarge provides eight GPUs.

As you can see, different instance sizes carry different Spot Instance prices, potentially unlocking new savings if you decide to consolidate workloads and purchase a larger instance.

When running the g5.16xlarge instance, teams can use the single GPU while gaining access to significant additional compute resources (CPU). Workloads that don’t require the GPU can use these additional resources, leading to cost-effective computation. 

In terms of price per GPU, larger multi-GPU instances provide better value than several smaller, single-GPU instances. When you look at the prices per GPU, there isn’t much difference between the 4-GPU (g5.24xlarge) and 8-GPU (g5.48xlarge) instances. This means that larger instances are cheaper for tasks that use many GPUs.
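The per-GPU comparison is a one-line division once you have the prices. In the Python sketch below, the GPU counts come from the list above, while the Spot prices are hypothetical numbers picked to mirror the described pattern; substitute current prices before drawing any conclusion.

```python
# GPU counts per instance size, as listed above.
GPU_COUNT = {
    "g5.xlarge": 1,
    "g5.4xlarge": 1,
    "g5.16xlarge": 1,
    "g5.24xlarge": 4,
    "g5.48xlarge": 8,
}


def price_per_gpu(spot_prices: dict) -> dict:
    """Divide each instance's hourly Spot price by its number of GPUs."""
    return {
        name: round(price / GPU_COUNT[name], 4)
        for name, price in spot_prices.items()
    }


# Hypothetical hourly Spot prices in USD, for illustration only.
spot_prices = {
    "g5.xlarge": 0.50,
    "g5.16xlarge": 1.20,
    "g5.24xlarge": 1.60,
    "g5.48xlarge": 3.04,
}

per_gpu = price_per_gpu(spot_prices)
for name, price in sorted(per_gpu.items(), key=lambda item: item[1]):
    print(f"{name}: ${price:.2f} per GPU-hour")
```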

3. Choose the right instance family 

Not all Spot Instances are created equal. Different instance families offer varying levels of reliability, pricing volatility, and availability, and choosing the wrong one can lead to frequent interruptions or missed savings. 

Automation eliminates guesswork by continuously evaluating instance families based on real-time market signals and workload requirements. Instead of manually comparing specs and pricing, your system always runs on the most cost-effective and stable option available.

Automation unlocks more savings by moving Spot-friendly workloads that have already been deployed on Spot VMs to cheaper families while maintaining service uptime and application performance.
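The core decision such automation makes can be sketched in a few lines of Python: prefer the cheapest Spot offer that is currently available, and fall back to On-Demand when none is. This is a simplified illustration, not how any particular product works; the `Offer` type, its fields, and the prices are all hypothetical, and a real system would also weigh interruption rates and workload requirements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    instance_type: str
    lifecycle: str       # "spot" or "on-demand"
    hourly_price: float  # hypothetical price in USD
    available: bool      # whether capacity can be provisioned right now


def pick_offer(offers):
    """Prefer the cheapest available Spot offer; otherwise fall back to On-Demand."""
    for lifecycle in ("spot", "on-demand"):
        candidates = [o for o in offers if o.lifecycle == lifecycle and o.available]
        if candidates:
            return min(candidates, key=lambda o: o.hourly_price)
    raise RuntimeError("no capacity available")


offers = [
    Offer("m6a.xlarge", "spot", 0.055, available=False),
    Offer("m5a.xlarge", "spot", 0.058, available=True),
    Offer("m7a.xlarge", "spot", 0.095, available=True),
    Offer("m6a.xlarge", "on-demand", 0.173, available=True),
]

print(pick_offer(offers))
```

Running the same function again after a Spot interruption, with the reclaimed offer marked unavailable, yields the next-best placement.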

Yotpo’s cost breakdown for Spot Instances and On-Demand Instances

Conclusion

Spot Instances offer significant cost savings, but they can be challenging to manage without a solid strategy for handling potential interruptions. This is where automation solutions make a meaningful difference. They can automatically identify and provision the most cost-effective virtual machines and seamlessly transition workloads to On-Demand Instances when Spot capacity becomes unavailable. 

This capability ensures that teams benefit from the cost advantages of Spot Instances without the burden of manual management or the risk of unexpected disruptions, confident that their workloads will always have a place to run.
