CAST AI’s workload autoscaler for intensive CPU and GPU workloads reduces the cost of running AI

CAST AI, the leading Kubernetes automation platform, and Hugging Face, the leading open-source platform for AI builders, today announced a partnership designed to dramatically reduce the cost of deploying Large Language Models (LLMs) in the cloud.

Adoption of generative AI and LLMs is growing rapidly. But running AI and LLMs at scale can be prohibitively expensive for many organizations, slowing adoption.

To overcome this issue, Hugging Face and CAST AI have partnered to run Hugging Face customer LLMs on Kubernetes clusters automatically optimized by CAST AI’s Kubernetes automation platform. CAST AI uses advanced machine learning algorithms to analyze and automatically optimize clusters in real time, reducing customers’ cloud costs, and improving performance and reliability.

Since its inception, CAST AI has developed a workload autoscaler purpose-built for intensive CPU and GPU workloads. It is generally available to all CAST AI customers using AWS, Azure, or Google Cloud. Combined with CAST AI’s real-time GPU availability map, the autoscaler delivers a turnkey solution that cuts the cost of running AI training and inference in production applications, regardless of the LLM type.

“At Hugging Face, we’re on a mission to make machine learning more accessible,” said Julien Chaumond, CTO of Hugging Face. “One key to democratization is reducing the cost of running LLMs. We’re excited to partner with CAST AI to leverage their expertise in using machine learning algorithms to automatically optimize clusters and reduce costs.”

“We’re thrilled to partner with Hugging Face to help the company optimize its AI workloads on AWS and Google,” said Laurent Gil, CAST AI Co-Founder and CPO. “And we’re even more excited to get our workload autoscaler into the hands of more companies to help them cost-effectively run CPU- and GPU-intensive large language models for training and inference.”

About Hugging Face:

Hugging Face is the open science and open-source collaborative platform for the machine learning community. The Hugging Face Hub works as a central place where anyone can share, explore, discover, and experiment with open-source ML. Hugging Face empowers the next generation of machine learning engineers, scientists, and end users to learn, collaborate and share their work to build an open and ethical AI future together.

About CAST AI:

CAST AI is the leading Kubernetes automation platform that cuts AWS, Azure, and GCP customers’ cloud costs by over 50%. CAST AI goes beyond monitoring clusters and making recommendations. The platform utilizes advanced machine learning algorithms to analyze and automatically optimize clusters in real time, reducing customers’ cloud costs, improving performance and security, and boosting DevOps and engineering productivity.

Learn more: https://cast.ai/

Media and Analyst Contact

Director of PR and Analyst Relations
[email protected]
