Discover how a pharma leader saved 76% on Spot instances used in AI/ML experiments

A leader in the pharmaceutical industry approached us with the goal of reducing cloud costs for running scientific machine learning (ML) models at scale. The company used ML models to run patient simulations as part of its R&D activities, launching thousands of pods that scale to 10,000 CPUs to process millions of patient records per batch run.

Problem: Manually handled EC2 spot instances failed to deliver the expected cost savings

The customer’s system relied on engineers to manually select machines for model runs and jobs.

To optimize cloud expenses, the company used EC2 spot instances for selected jobs. However, the team had no information about future pricing trends. 

The ratio of EC2 spot instances to on-demand instances was chosen manually, often by trial and error. The goal here was to achieve the highest spot fulfillment rate possible at any given moment. Any failures to obtain EC2 spot instances would be handled manually.

Since the process of selecting and launching EC2 spot instances was manual, system users couldn’t schedule jobs to run in the future – for example, over the weekend. The team ran workloads “on demand” rather than having the flexibility to run them whenever the price was lowest. If there was no spot capacity left, a job couldn’t be restarted. All of this meant that jobs could only be run in the present moment, and errors could only be addressed as they occurred.

Solution: Automating EC2 spot instances at every lifecycle stage

The customer decided to start a Proof of Concept (POC) with Cast AI to check whether the solution could address these issues.

Identifying optimal pod configuration

Cast AI identifies the optimal pod configuration for a workload’s compute requirements and automatically picks virtual machines that meet the workload’s criteria while selecting the cheapest instance families on the market.

If the platform encounters an “Out of Capacity” error, the unavailable instance types are temporarily blacklisted and retried after a short interval. To balance load dynamically, Cast AI spreads workloads across multiple Availability Zones.
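To make the matching concrete, here is a minimal sketch of how a workload’s criteria might be expressed in a pod spec and submitted with the Kubernetes Python client. The resource values, image, pod name, and the spot node selector/toleration keys are illustrative assumptions, not the customer’s actual configuration.

```python
# Illustrative sketch only: a pod spec expressed as a Python dict, showing the
# kind of workload criteria (CPU/RAM requests, spot preference) an autoscaler
# can match against instance types. Label/toleration keys are assumptions based
# on common Kubernetes spot-scheduling conventions.
from kubernetes import client, config

pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "patient-simulation-worker"},  # hypothetical name
    "spec": {
        "containers": [{
            "name": "simulation",
            "image": "registry.example.com/patient-sim:latest",  # hypothetical image
            "resources": {
                "requests": {"cpu": "4", "memory": "16Gi"},
                "limits": {"cpu": "4", "memory": "16Gi"},
            },
        }],
        # Ask the scheduler to place this pod on spot-backed nodes.
        "nodeSelector": {"scheduling.cast.ai/spot": "true"},
        "tolerations": [{
            "key": "scheduling.cast.ai/spot",
            "operator": "Exists",
            "effect": "NoSchedule",
        }],
    },
}

def submit_pod(namespace: str = "default") -> None:
    """Create the pod via the official Kubernetes Python client."""
    config.load_kube_config()
    client.CoreV1Api().create_namespaced_pod(namespace=namespace, body=pod_manifest)
```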

Keeping workloads afloat even when no EC2 spot instances are available

If the platform cannot find any spot capacity to meet a workload’s demand, its Spot Fallback feature temporarily runs the workload on on-demand instances until spot capacity becomes available. Once it does, Cast AI seamlessly moves the workload back to EC2 spot instances, provided that users have requested this behavior.
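As a rough illustration of the idea (not Cast AI’s actual implementation), the fallback logic can be sketched as follows; all callables are hypothetical placeholders for cluster and cloud operations.

```python
# Sketch of the spot-fallback idea: prefer spot capacity, fall back to
# on-demand when spot is unavailable, and return to spot once capacity is back.
import time

def run_with_spot_fallback(start_on_spot, start_on_demand, spot_available,
                           move_back_to_spot: bool = True,
                           poll_seconds: int = 300):
    """Provision capacity for a workload, preferring spot instances.

    The three callables are placeholders for whatever provisioning API the
    cluster uses; they are not real Cast AI or AWS functions.
    """
    if spot_available():
        return start_on_spot()

    # No spot capacity: keep the workload afloat on on-demand instances.
    node = start_on_demand()
    if not move_back_to_spot:
        return node

    # Poll until spot capacity returns, then migrate the workload back.
    while not spot_available():
        time.sleep(poll_seconds)
    return start_on_spot()  # the on-demand node can then be drained and removed
```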

Pricing prediction for smarter workload execution planning

Cast AI’s pricing models, built on a state-of-the-art transformer architecture, predict seasonality and trends in spot prices, allowing the platform to select the best time to run a batch workload. This leads to significant cost savings if users don’t require workloads to execute immediately.

All of these built-in advantages allow Cast AI to schedule cloud resources more effectively, which ultimately leads to a much higher spot instance fulfillment rate and better savings.

This simplified chart shows the price of a single workload run on two different types of EC2 instances. The when-to-run model calculates a price and an optimal pod configuration from inputs such as CPU and RAM requests, predicted future prices, and other factors.

The X axis shows the date, and the Y axis the total price of a single workload run. If you ran the workload immediately, you would pay around $360 for the whole run. If you waited a couple of days, the same run would cost about 5% less, or roughly $342.
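To illustrate the when-to-run idea, here is a simplified sketch that picks the cheapest start time for a run of known length from a series of predicted hourly prices. The instance names and price curves are invented for the example and don’t reflect the customer’s data or Cast AI’s actual forecasting model.

```python
# Illustrative only: given predicted hourly prices for candidate instance types,
# pick the cheapest window to start a batch run of a known duration.
from typing import Dict, List, Tuple

def cheapest_start(predicted_prices: Dict[str, List[float]],
                   run_hours: int) -> Tuple[str, int, float]:
    """Return (instance_type, start_hour, total_cost) for the cheapest window."""
    best = None
    for instance_type, hourly in predicted_prices.items():
        for start in range(len(hourly) - run_hours + 1):
            cost = sum(hourly[start:start + run_hours])
            if best is None or cost < best[2]:
                best = (instance_type, start, cost)
    return best

if __name__ == "__main__":
    # 96 hours (4 days) of made-up price forecasts for two instance families.
    forecasts = {
        "c5.4xlarge-spot":  [7.50 + 0.80 * ((h // 24) % 2) for h in range(96)],
        "c6i.4xlarge-spot": [7.20 + 0.40 * ((h % 24) / 24) for h in range(96)],
    }
    itype, start, cost = cheapest_start(forecasts, run_hours=48)
    print(f"Run on {itype} starting at hour {start}: ~${cost:.0f} total")
```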

Result: Machine learning cost savings of 76%

The POC carried out by the Cast AI team showed that the platform can generate cloud cost savings of 76%, inclusive of platform fees. These results can be achieved with similar run times and consistently high-quality output compared to the customer’s existing production system.

Get results like this – book a demo with Cast AI now
Company Size: 10,000+
Industry: Pharmaceutical
Region: EMEA
Services: EKS
