The Cast AI blog
Guides, tutorials, and tips on Kubernetes automation, from cost optimization to cloud security and everything in between.
TPUs vs GPUs: When to Choose What for AI/ML Workloads
TPU vs GPU for AI/ML workloads: silicon architecture, JAX vs PyTorch fit, H100 pricing, spot automation, and total cost of ownership. A practical decision framework for…

Karpenter Best Practices: 10 Tips for Production Clusters
Karpenter’s defaults aren’t production-ready. This guide covers 10 specific practices to prevent real cluster failures:…

Karpenter Cost Optimization: Consolidation Benchmark Results (7-Day Run)
Explore four approaches to Karpenter cost optimization in this benchmarking study showcasing the impact of…

Tokens Are the New Cloud Bill
At FinOps X 2026, a major shift became clear: teams are now spending massively on…

OpsPilot Now Writes Your Workload Scaling Policies. You Just Set the Intent.
OpsPilot, Cast AI’s AI agent for DevOps and SREs, can now automatically generate workload scaling…

What Is Tokenomics, And Why Your AI Infrastructure Is Now a FinOps Problem
I was in the room when tokenomics became official. Here is what it means for…

The Hackathon Fix That Cut Our Storage Costs by 93%
For the second year running, Cast AI hosted an internal Hackathon during our Vilnius team…

The Karpenter Enterprise Suite is GA: Bring Karpenter to the next level
The Karpenter Enterprise Suite is now generally available. It gives platform teams the visibility, optimization,…

2026 State of Kubernetes Resource Optimization: CPU at 8%, Memory at 20%, and Getting Worse
This is the third year we’ve published our report on the real CPU and memory…

GPU Sharing in Kubernetes: How to Cut Costs and Boost GPU Utilization with Cast AI
Running AI and ML workloads on Kubernetes often leads to underutilized, expensive GPUs. This blog…
