CAST AI’s workload autoscaler for intensive CPU and GPU workloads reduces the cost of running AI
CAST AI, the leading Kubernetes automation platform, and Hugging Face, the leading open-source platform for AI builders, today announced a partnership designed to dramatically reduce the cost of deploying Large Language Models (LLMs) in the cloud.
Generative AI and LLM adoption is growing rapidly, but running AI and LLMs at scale can be prohibitively expensive for many organizations, slowing adoption.
To address this challenge, Hugging Face and CAST AI have partnered to run Hugging Face customer LLMs on Kubernetes clusters automatically optimized by CAST AI’s Kubernetes automation platform. CAST AI uses advanced machine learning algorithms to analyze and automatically optimize clusters in real time, reducing customers’ cloud costs and improving performance and reliability.
CAST AI has developed a workload autoscaler purpose-built for intensive CPU and GPU workloads. It is generally available to all CAST AI customers using AWS, Azure, or Google Cloud. Combined with CAST AI’s real-time GPU availability map, the autoscaler delivers a turnkey solution that cuts the cost of running AI training and inference in production applications, regardless of the LLM type.
“At Hugging Face, we’re on a mission to make machine learning more accessible,” said Julien Chaumond, CTO of Hugging Face. “One key to democratization is reducing the cost of running LLMs. We’re excited to partner with CAST AI to leverage their expertise in using machine learning algorithms to automatically optimize clusters and reduce costs.”
“We’re thrilled to partner with Hugging Face to help the company optimize its AI workloads on AWS and Google,” said Laurent Gil, CAST AI Co-Founder and CPO. “And we’re even more excited to get our workload autoscaler into the hands of more companies to help them cost-effectively run CPU- and GPU-intensive large language models for training and inference.”
About Hugging Face:
Hugging Face is the open science and open-source collaborative platform for the machine learning community. The Hugging Face Hub works as a central place where anyone can share, explore, discover, and experiment with open-source ML. Hugging Face empowers the next generation of machine learning engineers, scientists, and end users to learn, collaborate, and share their work to build an open and ethical AI future together.
About CAST AI:
CAST AI is the leading Kubernetes automation platform that cuts AWS, Azure, and GCP customers’ cloud costs by over 50%. CAST AI goes beyond monitoring clusters and making recommendations. The platform utilizes advanced machine learning algorithms to analyze and automatically optimize clusters in real time, reducing customers’ cloud costs, improving performance and security, and boosting DevOps and engineering productivity.
Learn more: https://cast.ai/
Media and Analyst Contact
Erika Rosenstein
Director of PR and Analyst Relations
[email protected]