Cast AI

Senior Software Engineer – LIVE

Bulgaria; Croatia; Cyprus; Czechia; Estonia; Greece; Hungary; Italy; Latvia; Lithuania; Malta; Poland; Portugal; Romania; Slovakia; Slovenia; Ukraine

Why Cast AI?

Cast AI is the leading Application Performance Automation (APA) platform, enabling customers to cut cloud costs, improve performance, and boost productivity – automatically.

Built originally for Kubernetes, Cast AI goes beyond cost and observability by delivering real-time, autonomous optimization across any cloud environment. The platform continuously analyzes workloads, rightsizes resources, and rebalances clusters without manual intervention, ensuring applications run faster, more reliably, and more efficiently.

Headquartered in Miami, Florida, Cast AI has employees in more than 32 countries worldwide and supports some of the world’s most innovative teams running their applications on all major cloud, hybrid, and on-premises environments. Over 2,100 companies already rely on Cast AI, from BMW and Akamai to Hugging Face and NielsenIQ.

What’s next? Backed by our $108M Series C, we’re doubling down on making APA the new standard for DevOps and MLOps, and everything in between.

About the role

We’re looking for a highly skilled senior software engineer with deep expertise in containers and systems programming to join the LIVE team. The LIVE team develops and maintains critical infrastructure for container live migrations, ensuring that our platform can seamlessly transfer workloads while maintaining application availability. The role combines software engineering, knowledge of distributed systems, expertise in containerization technology and networking, and a deep understanding of Kubernetes.

Requirements:

  • Strong software engineering skills with experience in distributed systems and backend development (ideally in Go, but a willingness to transition to it is sufficient).
  • Deep knowledge of low-level Linux details, at the process level and below, including networking and filesystems.
  • Experience with the network stack; very comfortable with the OSI layers and software-defined networking.
  • Proficient in debugging, optimization, and performance-tuning of applications.
  • Experience with IaaS building blocks in any of AWS, Azure, or GCP.
  • Experience with low-level networking (NAT, iptables, conntrack, etc.) is a huge plus.
  • Deep Kubernetes experience is a huge plus, as a big part of the role is hacking Kubernetes to do what it wasn’t intended to do.
  • Strong English language skills, both verbal and written.
  • A proactive problem-solving mindset, always aiming for a “yes, we can” attitude.
  • Located in any European country within the GMT+0 to GMT+3 time zones and comfortable working fully remotely.

Responsibilities:

  • Design, develop, and implement live migration features that ensure minimal downtime and seamless transitions for stateful workloads within Kubernetes.
  • Collaborate with cross-functional teams.
  • Troubleshoot and debug issues, ensuring high availability and reliability of the platform.
  • Document design decisions, processes, and system behaviors to facilitate knowledge sharing within the team.
  • Stay up-to-date with industry best practices, tools, and techniques.
  • Actively contribute to the continuous improvement of our systems and processes.
  • Participate in code reviews and provide constructive feedback to team members to enhance the overall quality of the codebase.

What’s in it for you?

  • Competitive salary (€6,500 – €9,000 gross, depending on the level of experience)
  • Enjoy a flexible, remote-first global environment.
  • Collaborate with a global team of cloud experts and innovators, passionate about pushing the boundaries of Kubernetes technology.
  • Equity options.
  • Get quick feedback with a fast-paced workflow. Most feature projects are completed in 1 to 4 weeks.
  • Spend 10% of your work time on personal projects or self-improvement. 
  • Learning budget for professional and personal development – including access to international conferences and courses that elevate your skills.
  • Annual hackathon to spark new ideas and strengthen team bonds.
  • Team-building budget and company events to connect with your colleagues.
  • Equipment budget to ensure you have everything you need.
  • Extra days off to help maintain a healthy work-life balance.


© 2025 CAST AI Group Inc.
