Fairgen saves 70% on the cloud while boosting stability for gen AI workloads

Company

Fairgen combines advanced generative AI with established market research methods to build sophisticated predictive models that deliver accurate, scalable, and trustworthy findings about niche audiences. This lets market research organizations broaden their coverage to hard-to-reach segments while significantly lowering time-to-insight and data-gathering costs.

Challenge

Fairgen needed to optimize resource utilization across both the standard SaaS infrastructure for its platform and its AI training infrastructure without sacrificing performance or the user experience. Provisioning cloud resources manually could not scale, especially with AI workloads that demanded more powerful machines, larger clusters, and dynamically adjustable capacity.

Solution

Fairgen evaluated several solutions, but Cast AI stood out because its automation-first approach matched the company’s own automation-first mindset.

Using Cast, Fairgen automated the management of its clusters, optimized resource allocation in real time, and ultimately cut operational costs by 70%. The cost benefits were achieved without sacrificing performance, flexibility, or the user experience of Fairgen’s customers. 

Since switching to Cast, the team has had a much better sense of control and stability over the infrastructure behind its AI workloads compared to the previous setup with GKE Autopilot. 

Results

  • 70% cost savings in the operational infrastructure
  • Massive savings unlocked for gen AI workloads
  • Increased control and stability from development to production

Cluster autoscaling and workload rightsizing

The graphs below show the impact of Cast AI’s automation on one of Fairgen’s clusters. 

Thanks to the combination of cluster autoscaler, workload autoscaling, and bin-packing, the cost per requested CPU dropped and CPU overprovisioning was significantly reduced.

When I first heard about Cast AI, the main selling point that caught my attention was workload rightsizing. And yes, that’s exactly what Cast does—and it does it well. But honestly, the real power of Cast AI depends on how you use it and how you integrate it into your workflows.

For me, Cast AI is much more than just a rightsizing or cloud cost savings tool. It has become a key part of my diagnostic system and the core of my infrastructure management.

In the end, we saved around 70% on operations while still delivering the same experience to our clients.

Mati Konen, VP of Engineering at Fairgen

Optimizing infrastructure for both AI and SaaS

What does Fairgen specialize in?

We provide synthetic respondents to enable market research companies to examine their data more deeply or gain a more granular view. These synthetic respondents help fill gaps in segments that are typically underrepresented—for example, Gen Z, people with two kids, or males aged 60–70 who are interested in stocks.

If I’m a company looking to do marketing, I know that I need to communicate differently for each niche, each segment of the field—highlight different points, present different value propositions that are relevant to that group.

When you try to break the data down to that granular level, you end up with groups that are too small. You can’t make sound decisions based on only 20 or 30 people. The variance is too high, and the risk of error is significant. You need at least 800–1,000 people for statistical reliability.
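The sample-size intuition here is standard survey statistics: the 95% margin of error for an estimated proportion shrinks with the square root of the sample size. A quick illustrative sketch (not Fairgen’s code) shows why 30 respondents is risky while roughly 1,000 is reliable:

```python
from math import sqrt

def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """95% margin of error for a proportion p estimated from n respondents."""
    return z * sqrt(p * (1 - p) / n)

# With 30 respondents the estimate can be off by roughly +/- 18 points;
# with 1,000 it tightens to about +/- 3 points.
for n in (30, 1000):
    print(f"n={n}: +/- {margin_of_error(n):.1%}")
```

Running this prints roughly ±17.9% for n=30 and ±3.1% for n=1,000, which is why tiny segments can’t support sound decisions.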

What we do is take the data you provide, train on it, and then create synthetic respondents based on your data—not something generic or imagined.

What requirements does your infrastructure need to meet for smooth business operations?

So we have two main requirements. One part of the setup is the standard SaaS infrastructure for our platform. But then there’s the more complex part: training and using the AI models. So we had to optimize not just the standard operational infrastructure but also the AI learning infrastructure.

In the beginning, we took a deep dive into this, looking at specific use cases to see how we could improve things. That’s when we discovered Cast AI—or rather, Cast AI found us. I took a closer look and realized it really fit our needs.

What was the core challenge of optimizing your infrastructure?

I was looking for something to solve the provisioning issue we had. At first, I thought, “Okay, I can just do it manually.” That’s the easiest and quickest way. I was using some tools, just doing simple things.

But then I started thinking: what happens if certain AI workloads need a more powerful machine? Or more capacity? Or a stronger cluster of nodes? What then? Do I stop everything, resize manually, and double resources? That’s not smart. I’d rather do it the smart way.

We’re a deep tech company. Our clients’ trust exists, but it also has to be constantly maintained. If I don’t give them the same performance or the same experience—the same feeling—then questions will come up. And no one wants that.

So the main question was always: how do we maintain consistent performance and user experience while also optimizing resource utilization and saving money? Because yes, we definitely want to save money.

You’ve tried using Google’s native tool GKE Autopilot. Did it help address your challenge?

We had some infrastructure issues we couldn’t diagnose because they didn’t come from a consistently running system.

Autopilot works like this: spin a node up, take it down, and spin it up again. So, we had multiple failed model deployments during development, and we didn’t realize until much later that the issue wasn’t with the model itself—it was with the Autopilot setup not behaving correctly. That led us to want more control over the steps involved in engineering and deploying our models.

With Cast AI, we now have deeper control at the infrastructure level while still keeping some of Autopilot’s benefits. So, we’re saving money, gaining better control, and, just as importantly, maintaining a consistent and predictable experience across the platform with no surprises.

When looking for a solution, were you ready to embrace full-scale automation?

The manual part of the optimization job never even started. I shut it down before it could begin. The moment someone suggested, “Hey, we can just do it manually,” I said: “No way. I’m not willing to take a call at 2 a.m. just because something didn’t scale properly. That’s not going to happen.”

So I started with automation. For me, managing the infrastructure manually was never an option—not because I can’t, but because I won’t. In today’s world, where AI is everywhere and advanced tools are readily available, there has to be a solution for this.

And literally the day after I started looking into it, a friend of mine said, “Hey, I’m working with Cast AI, you should check them out.” And I thought, well, that’s perfect timing.

We reviewed a couple of other tools, but Cast AI was the right fit from the start. I also loved that Cast shared the same automation-first mindset. I thought, “There has got to be a tool for this.” It’s not about saving time—it’s about principle. Come on, we landed on the moon. We can automate cloud infrastructure.

What was integrating Cast like?

We approached it layer by layer. First, we did the POC and testing. Then, we moved into operations, research, and the AI layer. We built it out in stages, and when you do that, you don’t need to be afraid; you’re doing it the right way.

We started the POC in January, wrapped it up in March, and then it was live across everything. At no point did anything feel off. Not once did we say, “This doesn’t make sense.” Now, the entire infrastructure—from research to operations to production—is running on Cast AI.

It came down to open communication, a strong customer success team, a great solutions architect, and solid execution. In the end, we did it right—we executed deeply and correctly, and everyone’s happy with the results.

What level of support did you receive during integration and beyond?

So yes, the experience overall was very good. Now, we’re doing a lot of post-sale work with our Technical Account Manager. That includes better understanding our needs, addressing technical hiccups that came up after going live, and refining provisioning and scaling—basically, everything that helps us get the most out of the platform.

And honestly, I’ve really appreciated working with your TAM. She’s extremely professional, but more importantly, she’s honest. When she doesn’t know something, she says so. “I’ll find out and get back to you.” And she actually does—on time, without needing reminders. It’s been a standout example of reliable, responsive support, and that’s rare. I really value that.

What are the cost savings you have achieved since fully integrating Cast?

The deeper we went, the more cost impact we realized. In our operations alone, we saved roughly 70% in costs. As for AI—while I can’t disclose exact numbers, I can say it was a big number. Big enough that it actually pushed us to change our entire business model. And in the world of AI, that kind of cost savings is no small thing—it’s a very big number.

How long did it take you from implementing Cast fully to realizing these cost savings?

Operational infrastructure

We started this process without any major expectations. Our mindset was: let’s do it right and see where it leads us. We experimented—larger machine templates, smaller ones, different configurations, different interactions—just played around with it.

It was essentially an ongoing POC, and it took about two months to find the right balance between savings and performance. Once we found that sweet spot, we felt confident enough to recommend moving forward.

In the end, we saved around 70% on operations while still delivering the same experience to our clients. That was the key. We could have pushed for even more savings, but we didn’t want to compromise on flexibility, responsiveness, or user experience.

AI infrastructure

Now, on the AI side, it was a whole different beast. Honestly, it was very time-consuming. We ran tests for months until we finally reached a point where we understood the system could work reliably. 

When we dove in deeper, we realized that most of the time, our AI isn’t running continuously. We train, generate, and then it’s idle. So the flexibility Cast AI gave us with node scheduling and workflow automation allowed us to maintain the same capabilities, with a minor performance trade-off. What used to take 50 minutes now takes one hour. But that extra 10 minutes doesn’t matter much in the real world. The client isn’t sitting there counting minutes—they’re fine with waiting an hour if needed.

For us and for our clients, that consistency is key. The client doesn’t care if something takes 50 minutes or an hour. What matters is predictability. If it takes two hours, it should always take two hours. Or if it’s five seconds, or two seconds, or 30 milliseconds, whatever it is, just keep it consistent.

That rhythm—the feeling of stability when working with the system—is critical. If the infrastructure isn’t stable and the setup isn’t smart, the whole thing breaks down.

Now, since switching to Cast AI, we have a much better sense of control and stability. We started in January, went into production in March, and we’re not looking back. Of course, there’s always room for improvements and more testing—but honestly, even if it stayed exactly as it is now, we’d be happy.

Why is a sense of control essential for your AI infrastructure?

By shifting control away from Autopilot and back into our hands with Cast, we eliminated the random failures. Taking back control gave us more reliability and more trust, both internally and from clients. That’s worth far more than any cost savings.

So, for me, the real ROI from integrating Cast wasn’t about money. It was about control, trust, and continuity. No more guessing. No more inconsistencies.

To be honest, that was the real trigger point for moving away from Autopilot, not the cost savings, which came later as a bonus. 

Which features were real game-changers for Fairgen?

Node templates and flexibility

For us, the ability to define and use different machine templates in the same environment for different stages of our AI pipeline has been a game-changer.

We prepare everything before running our major research workloads. Our system includes a preprocessor that determines, based on the specific step and complexity, which template should be used—small, medium, large, or even one we call the “master” (64 vCPUs, just in case). Interestingly, we rarely need the master, but when we do, it’s because we know it’s actually more cost-effective to run one large process than to scatter smaller tasks inefficiently.

Our AI processes are complex and multi-stage, with anywhere from 20 to 40 steps per job, depending on file size and the type of work. By assigning templates step-by-step, we’re able to improve utilization, visibility, and cost-efficiency. This flexibility allows us to say, for example, “This step only needs a lightweight machine, but that one needs a heavyweight.”
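The per-step template idea can be sketched as a small dispatcher. This is a hypothetical illustration of the concept, not Fairgen’s preprocessor: the template names, vCPU capacities, and the `node-template` label key are all assumptions about how such templates might be surfaced to pods.

```python
# Illustrative only: map each pipeline step's CPU need to the smallest
# node template that fits, then emit a pod-spec fragment pinning the
# step to nodes of that template. Names and sizes are hypothetical.

TEMPLATES = {        # template name -> vCPU capacity (illustrative)
    "small": 4,
    "medium": 16,
    "large": 32,
    "master": 64,    # the rarely used "just in case" template
}

def pick_template(required_vcpus: int) -> str:
    """Return the smallest template that can fit the step."""
    for name, capacity in TEMPLATES.items():
        if required_vcpus <= capacity:
            return name
    return "master"  # fall back to the biggest machine

def node_selector(required_vcpus: int) -> dict:
    """Pod spec fragment routing the step to the chosen template.
    The label key is an assumed convention, not a documented one."""
    return {"nodeSelector": {"node-template": pick_template(required_vcpus)}}

print(node_selector(12))  # a medium-sized step
```

With 20–40 such steps per job, each step only pays for the machine class it actually needs instead of inheriting the heaviest one.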

Tracking resource utilization per node template

And it’s not just provisioning—it’s diagnostics too. Because we can track utilization per template per step, we get insights that help us spot inefficiencies. Sometimes we notice a 40% CPU usage on a machine and ask ourselves: Are we overprovisioned? We reduce CPU, run tests again, and aim for 75–80% utilization. If something changed, we go back to the R&D team and ask if they’ve optimized something. We call this process: check, learn, change, repeat.

It’s half manual, half automated. And honestly, it’s fun! We’re constantly improving, tweaking, and learning. 

Over time, Cast has become not just an optimization solution but a diagnostic and trust-building tool. It gives us confidence and control. And with control comes trust, and with trust comes better decisions.

This approach has fundamentally changed the way we work. It’s not something we hear about often, but it makes perfect sense—AI workloads are always multi-stage. So why would you use a single template for the whole pipeline? It just doesn’t make sense.

I look at things not just from an infrastructure manager’s perspective but also with an R&D mindset. I have to understand the costs and the usage because I’m the one generating the reports. For example, if I see one cost showing as $8,000, but I know I’m actually paying $12,000—where’s that extra $4,000 going? I need to get better visibility and managerial insights. So for me, this has been very helpful.

The value you get from Cast AI really depends on your willingness to invest time and expertise to tailor it to your unique needs. You can use it just for rightsizing, and it will do a great job there. But if you dive deeper and treat it as an integral part of your infrastructure strategy, the added value becomes much greater.

What are the next steps in your optimization journey?

I currently have four or five open feature requests—mainly improvements that would help me better understand what I’m seeing and doing, and help me make smarter, more informed decisions.

There’s always something new—like the node scheduler or the way we started using Spot VMs more effectively. Before, we were running on-demand instances only, without any Spot. Whenever a new Cast feature comes out, I get a notification—“Hey, here’s something cool for you”—and I immediately think, “Okay, how can we implement this?”

Cast AI manages the infrastructure side of things for me. The less I have to actively manage, the better it is for me. That’s why I try to use every feature I can, as long as it works seamlessly for me. If it works, I’m in. 


Company size: 1–50 · Industry: Market research · Region: EMEA · Cloud: GKE
