Kubernetes Pod Scheduling: Balancing Cost and Resilience

Learn to optimize Kubernetes pod scheduling for better cost efficiency and resilience. Explore proven strategies to fine-tune scheduling policies, reduce resource waste, and build fault-tolerant clusters that perform well under production workloads.

By Phil Andrews

Kubernetes pod scheduling plays a critical role in how your applications perform and how much you pay to run them. Every time the scheduler decides where a pod should run, it weighs factors such as cost efficiency, resource availability, fault tolerance, and workload priorities. For teams managing dynamic environments or production-grade clusters, configuring pod scheduling effectively is essential to maintaining resilience without overspending.

Check out the first part of this series for more insights into the three pod scheduling mechanisms and best practices.

In this part, we dive into resource optimization and resiliency best practices for Kubernetes clusters. Using real-world examples, we'll explore how to fine-tune scheduling policies to improve availability, reduce waste, and keep workloads running smoothly even during failures or scaling events.

Whether you’re trying to reduce cloud costs or build a more resilient platform, understanding the subtleties, implementation patterns, and trade-offs is critical for creating high-performance, robust, and cost-effective Kubernetes infrastructures.

Resource Optimization Considerations

Scheduling policies significantly impact resource utilization and costs. In many real-world examples, optimized pod distribution has improved CPU utilization by 35-47% and memory utilization by 28-39%.

Bin-Packing vs. Spreading

While spreading workloads improves resilience, excessive spreading can lead to resource fragmentation:

# May lead to excessive spreading and poor bin-packing
podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
  - labelSelector:
      matchExpressions:
      - key: app
        operator: In
        values:
        - app-name
    topologyKey: kubernetes.io/hostname
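Conversely, when consolidation matters more than isolation, a soft pod affinity nudges replicas toward nodes that already run the same app, improving bin-packing. This is a sketch that reuses the `app-name` label from the example above:

```yaml
# Encourages co-location for tighter bin-packing
podAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
  - weight: 70
    podAffinityTerm:
      labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - app-name
      topologyKey: kubernetes.io/hostname
```

Because this is a preference rather than a requirement, the scheduler can still place pods on fresh nodes when the preferred ones are full.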

Balancing Cost and Resilience

For optimal resource efficiency with appropriate resilience:

1. Use node-level soft anti-affinity for non-critical services:

podAntiAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
  - weight: 80
    podAffinityTerm:
      labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - service-name
      topologyKey: kubernetes.io/hostname

2. Reserve strict constraints for the zone/region level or critical services:

topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: DoNotSchedule
  labelSelector:
    matchLabels:
      app: critical-service
- maxSkew: 3  # More flexible at node level
  topologyKey: kubernetes.io/hostname
  whenUnsatisfiable: ScheduleAnyway
  labelSelector:
    matchLabels:
      app: critical-service

3. Group related non-critical services together:

affinity:
  podAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - related-service
        topologyKey: kubernetes.io/hostname

Resilience Engineering with Topology Controls

Proper distribution policies form the foundation of resilience engineering in Kubernetes. Here are a few best practices to help you boost the resilience of your Kubernetes clusters.

Multi-Level Resilience Strategy

For comprehensive resilience, implement constraints at multiple levels:

# Comprehensive resilience configuration
topologySpreadConstraints:
- maxSkew: 1  # Strict region balance
  topologyKey: topology.kubernetes.io/region
  whenUnsatisfiable: DoNotSchedule
  labelSelector:
    matchLabels:
      app: critical-service
- maxSkew: 1  # Strict zone balance within regions
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: DoNotSchedule
  labelSelector:
    matchLabels:
      app: critical-service
- maxSkew: 2  # More flexible node balance
  topologyKey: kubernetes.io/hostname
  whenUnsatisfiable: ScheduleAnyway
  labelSelector:
    matchLabels:
      app: critical-service

Cascading Constraint Patterns

Design constraints to cascade from strict to flexible:

  1. Hard constraints at broad topology levels (region, zone)
  2. Softer constraints at narrow levels (node, rack)
  3. Fallback provisions for scheduling when an ideal distribution isn’t possible

# Hard requirement at zone level
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: DoNotSchedule
  labelSelector:
    matchLabels:
      app: resilient-app

Combined with:

# Soft preference at node level
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 90
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - resilient-app
        topologyKey: kubernetes.io/hostname

Real-World Implementation Patterns

Different workload types require distinct distribution strategies:

Pattern 1: Global Service with Regional Presence

For services that need global distribution with a local presence:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: global-cache
spec:
  replicas: 12
  template:
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/region
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: global-cache
      - maxSkew: 2
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: global-cache

This ensures even distribution across all regions with reasonable zone distribution.

Pattern 2: Stateful Application with Cross-Zone Resilience

For database clusters and stateful applications:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: distributed-database
spec:
  replicas: 5
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - distributed-database
            topologyKey: topology.kubernetes.io/zone

This guarantees that no two database instances share an availability zone, maximizing resilience against zone failures. Note that this hard constraint requires at least as many zones as replicas: with five replicas, any pods beyond the number of available zones will remain Pending.

Pattern 3: Performance-Sensitive Microservices

For microservices that benefit from proximity but need basic resilience:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
spec:
  replicas: 6
  template:
    spec:
      affinity:
        podAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 80
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - cache-service
              topologyKey: kubernetes.io/hostname
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 50
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - api-service
              topologyKey: kubernetes.io/hostname

This balances the need for API services to be near their caches while maintaining some separation between API instances.

Pattern 4: Cost-Optimized Non-Critical Services

For cost-sensitive workloads where some resilience is desired but not critical:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-processor
spec:
  replicas: 10
  template:
    spec:
      topologySpreadConstraints:
      - maxSkew: 5  # Allow significant imbalance
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: batch-processor

This provides basic distribution while allowing for substantial bin-packing and resource efficiency.

Common Pitfalls and Misconfigurations

Even experienced Kubernetes engineers can fall into scheduling traps. Understanding common pitfalls can help avoid service disruptions and performance issues.

Pitfall 1: Overly Strict Anti-Affinity

Setting hard anti-affinity without considering cluster size can lead to scheduling failures:

# Problematic on small clusters
podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
  - labelSelector:
      matchExpressions:
      - key: app
        operator: In
        values:
        - app-name
    topologyKey: kubernetes.io/hostname

Solution: Use preferred anti-affinity or topology spread constraints with ScheduleAnyway for smaller clusters or non-critical services.
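As a sketch of that solution, the same spreading intent can be expressed with a soft topology spread constraint that degrades gracefully on small clusters, using the same `app-name` label as above:

```yaml
# Spreads pods across nodes when possible, still schedules when nodes are scarce
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: kubernetes.io/hostname
  whenUnsatisfiable: ScheduleAnyway
  labelSelector:
    matchLabels:
      app: app-name
```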

Pitfall 2: Conflicting Affinity Rules

Contradictory rules can create scheduling impossibilities:

# Conflicting rules
affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - service-a
      topologyKey: kubernetes.io/hostname
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - service-a
      topologyKey: kubernetes.io/hostname

Solution: Carefully review affinity rules for logical consistency before deployment.
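One consistent rewrite of the example above preserves both intents by moving them to different topology levels: co-locate with service-a within a zone while preferring separation across nodes. This is one possible resolution, not the only one:

```yaml
# Affinity and anti-affinity at different topology levels no longer conflict
affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - service-a
      topologyKey: topology.kubernetes.io/zone
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 50
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - service-a
        topologyKey: kubernetes.io/hostname
```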

Pitfall 3: Excessive Node Specialization

Over-using node selectors and taints alongside affinity rules can severely restrict scheduling options:

# Too restrictive
nodeSelector:
  disktype: ssd
  cpu: highperf
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - database
      topologyKey: kubernetes.io/hostname

Solution: Minimize node specialization and use soft preferences where possible.
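For example, the hard `nodeSelector` above can be softened into a preferred node affinity, so pods favor SSD-backed nodes but can still schedule elsewhere under pressure (the `disktype: ssd` label is illustrative):

```yaml
# Prefers SSD nodes instead of requiring them
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 60
      preference:
        matchExpressions:
        - key: disktype
          operator: In
          values:
          - ssd
```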

Pitfall 4: Ignoring Scaling Implications

Distribution policies that work for small deployments may fail during scale-up:

# Works for 3 replicas, fails at 10+
spec:
  replicas: 3  # Later scaled to 10+
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - web-app
            topologyKey: kubernetes.io/hostname

Solution: Design distribution policies with maximum potential scale in mind.
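A scale-tolerant sketch of the same deployment replaces the hard anti-affinity with a soft spread constraint, so scaling from 3 to 10+ replicas stacks pods on shared nodes instead of leaving them Pending:

```yaml
spec:
  replicas: 10
  template:
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: web-app
```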

Pitfall 5: Forgetting About Resource Constraints

Distribution policies can conflict with resource availability:

# May cause scheduling failures
resources:
  requests:
    memory: 16Gi
    cpu: 4
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: kubernetes.io/hostname
  whenUnsatisfiable: DoNotSchedule
  labelSelector:
    matchLabels:
      app: resource-heavy

Solution: Ensure your distribution strategy accounts for the resource profile of your workloads.
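One way to account for this, sketched below on the same example, is to relax the constraint to ScheduleAnyway so large pods can still land somewhere when a perfectly balanced node lacks 16Gi of free memory:

```yaml
# Large requests paired with a soft spread constraint
resources:
  requests:
    memory: 16Gi
    cpu: 4
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: kubernetes.io/hostname
  whenUnsatisfiable: ScheduleAnyway
  labelSelector:
    matchLabels:
      app: resource-heavy
```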

Conclusion

When properly implemented, the three scheduling mechanisms covered in this series lay the groundwork for high-performance applications that remain available despite infrastructure issues while optimizing resource use.

Here are the takeaways from our exploration:

Balance resilience and efficiency:

  • Implement stricter constraints at broader topology levels
  • Use more flexible constraints at narrow levels
  • Consider resource implications of distribution strategies

Apply context-appropriate patterns:

  • Consider service criticality, scale, and performance requirements
  • Adapt strategies to your specific cluster topology and size
  • Test distribution policies at target scale before production deployment

By following the best practices presented in this article, you can run Kubernetes infrastructure that is resilient, cost-effective, and ready for production scale.

