How We’re Using AI to Optimize AI

Today CAST AI introduced a series of new features into its platform, designed to help optimize GPU-heavy AI models and applications in the cloud. This comes at a pivotal moment, as more companies experiment with generative AI. CAST AI will host a webinar to teach organizations how to lower AI costs without replacing their tech stack and cloud.

Tech leaders face a rush of demand for AI apps. Yet AI comes with high compute requirements and demands specialized hardware equipped with Graphics Processing Units (GPUs), an industry standard for training AI models. Provisioning these resources is expensive; analysts estimate that systems like ChatGPT cost as much as $700,000 a day to operate. As a result, while AI is primed to transform applications across virtually every industry, training and running models is largely limited to well-funded organizations that dedicate significant IT spend to cloud providers out of necessity.

Gartner’s senior director analyst Chirag Dekate has noted: “IT leaders responsible for AI are discovering ‘AI pilot paradox,’ where launching pilots is deceptively easy but deploying them into production is notoriously challenging.” And while Dekate finds that scaling infrastructure resources is pivotal to success, and that AI is one of the top cloud services, nearly half of businesses struggle with cloud costs.

“As AI models continue to grow in size and power, the cost of computation to train and run them also increases,” said Bobby Yazdani, Founder and Partner of Cota Capital. “By significantly reducing these costs, CAST AI can enable wider experimentation, deployment and adoption of these powerful new technologies.”

To support teams that are building, training, and running AI models, CAST AI has expanded its platform with the following features:

Automated provisioning, selection, and scaling of cost-effective GPU machines across AWS, Google Cloud, and Microsoft Azure.
A new real-time workload rightsizing feature that allocates GPU-heavy compute and high-speed memory resources.
Automated decommissioning of GPU instances and replacement with more cost-efficient alternatives once the process is completed.
Automated optimization of Amazon Inferentia machines used for executing AI models.
Use of high-performance Graviton processors to balance performance and cost.
Automated management of spot instances – CAST AI identifies the optimal pod configuration for the model’s computation requirements and automatically selects machines that meet these criteria cost-effectively.
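
To make the machine-selection idea in the list above more concrete, here is a minimal Python sketch of picking the cheapest machine that satisfies a pod's GPU and memory requirements. It is an illustration only, not CAST AI's actual algorithm or API: the InstanceOffer and PodRequest types, the cheapest_fit function, the instance names, and all prices are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class InstanceOffer:
    """A hypothetical machine offer; names and prices are made up."""
    name: str
    gpus: int
    memory_gib: int
    hourly_usd: float
    spot: bool


@dataclass(frozen=True)
class PodRequest:
    """Resources a pod asks for (hypothetical, simplified)."""
    gpus: int
    memory_gib: int


def cheapest_fit(
    pod: PodRequest,
    offers: Sequence[InstanceOffer],
    allow_spot: bool = True,
) -> Optional[InstanceOffer]:
    """Return the lowest-priced offer that meets the pod's needs, or None."""
    candidates = [
        offer
        for offer in offers
        if offer.gpus >= pod.gpus
        and offer.memory_gib >= pod.memory_gib
        and (allow_spot or not offer.spot)
    ]
    # min() with default=None handles the case where nothing fits.
    return min(candidates, key=lambda offer: offer.hourly_usd, default=None)


# Illustrative catalog only; these are not real instance types or prices.
OFFERS = [
    InstanceOffer("gpu-small-on-demand", gpus=1, memory_gib=64, hourly_usd=3.00, spot=False),
    InstanceOffer("gpu-small-spot", gpus=1, memory_gib=64, hourly_usd=1.10, spot=True),
    InstanceOffer("gpu-large-spot", gpus=4, memory_gib=256, hourly_usd=4.20, spot=True),
    InstanceOffer("gpu-large-on-demand", gpus=4, memory_gib=256, hourly_usd=12.00, spot=False),
]

if __name__ == "__main__":
    choice = cheapest_fit(PodRequest(gpus=1, memory_gib=32), OFFERS)
    print(choice.name if choice else "no machine fits")
```

A production system would also have to weigh factors this sketch ignores, such as spot interruption risk, regional availability, and packing several pods onto one machine.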

“Our new features are game-changing for businesses looking to reduce the cost of AI model training,” said Laurent Gil, co-founder and Chief Product Officer at CAST AI. “We have optimized a GPU-heavy training process and generated cloud cost savings of 76 percent for one of our customers. Thanks to these new features, teams can streamline their AI workflows, saving time and resources while driving better results.”

CAST AI will host a webinar in April to teach organizations how to drastically lower AI costs using their existing tech stack and cloud. The date, time, and registration details will be shared after Google NEXT, where CAST AI will unveil a new product that will enable customers to use AI more efficiently and cut their cloud costs.

About CAST AI:

CAST AI is the leading Kubernetes automation platform that cuts AWS, Azure, and GCP customers’ cloud costs by over 50%. CAST AI goes beyond monitoring clusters and making recommendations. The platform utilizes advanced machine learning algorithms to analyze and automatically optimize clusters in real time, reducing customers’ cloud costs, and improving performance and reliability to bolster DevOps and engineering productivity.

Learn more: https://cast.ai/