The Ultimate Karpenter Guide: Setup to Advanced Config

Karpenter became the default autoscaler on EKS because it provisions the right instance in seconds. This guide takes you from a working install through the production patterns that determine how Karpenter performs at scale, with working YAML in every section.

Kubernetes clusters are not static. Pods come and go. Traffic shifts. New deployments arrive. The provisioning layer needs to keep up, fast and accurately, without forcing teams to manage instance groups by hand.

Karpenter is the open-source autoscaler that solved this on EKS. It watches pending pods, evaluates NodePool constraints, and launches the right instance type in seconds. AWS ships it as the default scaler in EKS Auto Mode, and roughly 60% of new EKS clusters now provision through it.

Installing it is the easy part. The harder part is what comes after: which instance families to allow, how aggressively to consolidate, when to use Spot, how to keep disruption under control, and how to debug a cluster that scales fast but sometimes surprises you with the bill.

This guide takes you from a working Karpenter install on EKS through the advanced patterns that decide whether Karpenter holds up in production. Expect working YAML in every section.

What is Karpenter?

Karpenter is an open-source Kubernetes node provisioner originally built by AWS and later donated to the Kubernetes project (SIG Autoscaling) under the CNCF. It replaces Cluster Autoscaler with a different model. Instead of pre-defining node groups and asking the scheduler to fit pods into them, Karpenter watches unschedulable pods and provisions a well-fitting node directly from the cloud provider’s instance catalog.

The provisioning loop:

  1. A pod is created, and the kube-scheduler cannot place it on existing capacity.
  2. Karpenter sees the pending pod and reads its resource requests, taints, tolerations, and node selector terms.
  3. Karpenter evaluates the constraints in your NodePool definitions and the available EC2 instance types.
  4. Karpenter launches the most efficient instance that satisfies the pod’s needs and your NodePool rules.
  5. The pod is scheduled to the new node.

That entire loop typically takes 30 to 90 seconds on AWS, compared with several minutes when using Cluster Autoscaler with pre-warmed node groups.

The control surface is two custom resources. NodePool specifies which node types are allowed. EC2NodeClass (formerly AWSNodeTemplate) specifies how Karpenter should launch them on AWS.

Karpenter vs Cluster Autoscaler

The short answer: Karpenter does not require pre-defined Auto Scaling Groups. Cluster Autoscaler scales node groups up and down in response to pending pods, but only within the groups and instance types you defined ahead of time. Karpenter scales individual nodes and selects instance types at provisioning time, drawing from the full EC2 catalog.

Dimension | Cluster Autoscaler | Karpenter
Provisioning model | Scales pre-defined Auto Scaling Groups | Provisions individual nodes from the cloud API
Typical scale-up time | 2-5 minutes | 30-90 seconds
Instance flexibility | One or a few instance types per node group | Hundreds of instance types evaluated per request
Spot diversification | Coarse, via ASG mixed instance policies | Native, across many Spot pools simultaneously
Consolidation | Requires separate tooling | First-class, with budgets and policies
Drift detection | Not supported | Detects and replaces nodes that no longer match spec
Node expiration | Not supported | First-class, configurable per NodePool
Configuration unit | Node group (one per workload tier) | NodePool + EC2NodeClass (one set per workload tier)
Cloud support | All major clouds, mature | AWS most mature; Azure and Alibaba in active development

Migration considerations

If you are running Cluster Autoscaler today, the migration to Karpenter is not a flag flip. Existing node groups continue to run; you add Karpenter alongside, route a subset of workloads to it via NodePool requirements and pod-level node selectors, and decommission the node groups gradually. Most teams take 4-8 weeks to fully migrate a production cluster, with the first 2 weeks spent validating Karpenter’s behavior on non-critical workloads.
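One way to route a single workload to Karpenter during migration is a node selector on the label Karpenter applies to the nodes it launches. The sketch below is illustrative only: the Deployment name, image, and request sizes are hypothetical, and it assumes a NodePool named default exists.

```yaml
# Hypothetical Deployment pinned to nodes launched by the "default" NodePool.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout-worker              # hypothetical workload name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: checkout-worker
  template:
    metadata:
      labels:
        app: checkout-worker
    spec:
      nodeSelector:
        karpenter.sh/nodepool: default   # only nodes created by this NodePool match
      containers:
        - name: worker
          image: registry.k8s.io/pause:3.9   # placeholder image
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
```

Workloads without the selector keep landing on the existing node groups, so tiers can be moved one at a time.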

The migration is real work, but the operational dividends compound. Faster scale-up means tighter HPA loops. Better Spot diversification means higher Spot ratios in production. First-class consolidation means clusters stay right-sized without manual cleanup.

For most modern EKS clusters, Karpenter is the right choice. The exception is environments with strict instance-type requirements (specific compliance configurations, custom-licensed software tied to specific SKUs) where the flexibility Karpenter offers is more constraint than benefit.

Setting up Karpenter on EKS

The setup has three parts: prerequisites, install, and verification.

Prerequisites

You need an EKS cluster with an OIDC provider associated. Karpenter uses IRSA (IAM Roles for Service Accounts) to call the EC2 API. You also need an IAM role for the Karpenter controller and a separate IAM role that the launched nodes will assume.

The fastest path is the official getting-started script in the Karpenter docs, which provisions both roles, the SQS queue for Spot interruption notifications, and the EventBridge rules. If you prefer infrastructure-as-code, the same setup is available as a Terraform module: terraform-aws-modules/eks/aws//modules/karpenter.

Tag your subnets and security groups so Karpenter can discover them:

aws ec2 create-tags \
  --resources $SUBNET_IDS $SECURITY_GROUP_IDS \
  --tags Key=karpenter.sh/discovery,Value=$CLUSTER_NAME

Install via Helm

Karpenter ships as a Helm chart in a public ECR registry. Check the latest stable release before pinning a version:

# --password-stdin reads the token from stdin, so pipe it in rather than passing it as an argument
aws ecr-public get-login-password --region us-east-1 | \
  helm registry login --username AWS --password-stdin public.ecr.aws

helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --version "1.0.5" \
  --namespace kube-system \
  --create-namespace \
  --set "settings.clusterName=${CLUSTER_NAME}" \
  --set "settings.interruptionQueue=${CLUSTER_NAME}" \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --wait

Pin the version. Karpenter follows semver, but breaking changes between minor versions were routine before v1.0, and upgrades still deserve care. Track the release notes and bump deliberately.
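The same settings can also live in a values file under version control, which makes version bumps easier to review. This is a sketch that mirrors the --set flags from the install command; the file name values.yaml and the cluster name my-cluster are assumptions.

```yaml
# values.yaml (hypothetical file name): mirrors the --set flags used in the install command
settings:
  clusterName: my-cluster          # assumption: replace with your cluster name
  interruptionQueue: my-cluster    # SQS queue name; the install command reuses the cluster name
controller:
  resources:
    requests:
      cpu: 1
      memory: 1Gi
```

Pass it to the same helm upgrade --install command with -f values.yaml in place of the individual --set flags, keeping --version pinned.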

Verification

kubectl get pods -n kube-system | grep karpenter
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter

You should see two karpenter controller pods running and clean logs. If the controller is crash-looping, the most common cause is missing IAM permissions on the controller role. Compare against the recommended policy in the Karpenter getting-started docs.

Your first NodePool and EC2NodeClass

Karpenter does nothing until you define a NodePool and an EC2NodeClass. Start with the minimum that works, then add constraints.

EC2NodeClass

apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2023
  role: KarpenterNodeRole-${CLUSTER_NAME}
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: ${CLUSTER_NAME}
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: ${CLUSTER_NAME}
  amiSelectorTerms:
    - alias: al2023@latest

What it does: tells Karpenter to launch nodes with Amazon Linux 2023, into any subnet tagged for discovery, attached to any security group tagged for discovery, using whatever IAM role you set up for nodes.

NodePool

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["m5.large", "m5.xlarge", "m5.2xlarge"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      expireAfter: 720h
  limits:
    cpu: 1000
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 30s

Apply both, then deploy a workload whose resource requests exceed the cluster’s free capacity. Karpenter should provision a node within a minute or two. Confirm with kubectl get nodes -L karpenter.sh/nodepool and you should see the new node labeled with the NodePool name.
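A throwaway Deployment is enough to trigger the first scale-up. The sketch below assumes the cluster has less than 5 free vCPUs; the name inflate and the replica count are arbitrary choices for illustration.

```yaml
# Test workload: five pods that do nothing but hold a CPU and memory request.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate                      # arbitrary name for a test workload
spec:
  replicas: 5
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      containers:
        - name: inflate
          image: registry.k8s.io/pause:3.9   # pause container: consumes requests, does no work
          resources:
            requests:
              cpu: "1"
              memory: 1Gi
```

Deleting the Deployment afterwards is a quick way to watch consolidation remove the node again.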

This is the smallest production-safe configuration. It works. It is also restrictive in ways that limit Karpenter’s actual value, which is what the next section addresses.

Tuning NodePools for production

The minimal NodePool above limits Karpenter to three specific instance types. Real production clusters benefit from much more flexibility.

Open up instance flexibility

Replace the explicit instance-type list with category and generation filters:

requirements:
  - key: kubernetes.io/arch
    operator: In
    values: ["amd64"]
  - key: karpenter.sh/capacity-type
    operator: In
    values: ["spot", "on-demand"]
  - key: karpenter.k8s.aws/instance-category
    operator: In
    values: ["c", "m", "r"]
  - key: karpenter.k8s.aws/instance-generation
    operator: Gt
    values: ["4"]
  - key: karpenter.k8s.aws/instance-cpu
    operator: Lt
    values: ["33"]
  - key: topology.kubernetes.io/zone
    operator: In
    values: ["us-east-1a", "us-east-1b", "us-east-1c"]

What changed:

  • The capacity type now supports Spot first, with on-demand as a fallback when Spot is unavailable.
  • Instance category includes compute (c), general (m), and memory (r) families.
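  • Instance generation must be greater than 4, so Karpenter picks current-generation instances. With the architecture limited to amd64, that means Intel and AMD options; Graviton requires adding arm64 to kubernetes.io/arch.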
  • CPU count capped at 32 to prevent Karpenter from provisioning a giant instance for a tiny workload.
  • All three AZs allowed for resilience.

This single set of requirements gives Karpenter access to well over 100 instance types across multiple Spot pools.

Set realistic limits

limits:
  cpu: 1000
  memory: 1000Gi

Limits are the hard ceiling for the NodePool. Set them based on your cost or capacity envelope. A NodePool with no limits will scale until something else stops it.

Use multiple NodePools for different workload tiers

A single NodePool is rarely enough. Most production clusters use two to four NodePools, each with different rules:

  • A general-purpose pool for stateless workloads (allows Spot, broad instance flexibility).
  • A reserved pool for stateful or latency-sensitive workloads (on-demand only, narrower instance set, taints for isolation).
  • A GPU pool for ML or rendering workloads (GPU instance families, taints).
  • A burst pool for nightly batch jobs (Spot only, large instances, expires aggressively).

Karpenter selects a NodePool for a given pod by first filtering to the pools whose requirements and taints are compatible with the pod’s scheduling constraints, then preferring the compatible pool with the highest weight. We will get to weights in the advanced section.
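As an illustration of the second tier in the list above, here is a sketch of an on-demand pool isolated by a taint. The pool name, the workload-tier key, the CPU limit, and the instance filters are all hypothetical choices, and it assumes the default EC2NodeClass from earlier.

```yaml
# Hypothetical NodePool for stateful or latency-sensitive workloads.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: stateful
spec:
  template:
    metadata:
      labels:
        workload-tier: stateful          # hypothetical label for node selectors
    spec:
      taints:
        - key: workload-tier             # pods must tolerate this to land here
          value: stateful
          effect: NoSchedule
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]          # no Spot for this tier
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["m", "r"]             # narrower instance set than the general pool
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["5"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
  limits:
    cpu: 200
  disruption:
    consolidationPolicy: WhenEmpty       # conservative: never replace a node that still runs pods
    consolidateAfter: 5m
```

Pods opt in with a matching toleration plus a node selector on workload-tier: stateful.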

Disruption and consolidation

Disruption is what makes Karpenter feel alive, and it is also what makes platform teams nervous. The four disruption events to know:

Consolidation. Karpenter notices that a workload could fit on a smaller or cheaper node and replaces the existing one. This is where most ongoing optimization happens.

Expiration. Nodes get retired after a fixed duration via expireAfter. Useful for forcing AMI rotation and preventing long-lived nodes from drifting.

Drift. Karpenter detects when a node no longer matches its NodePool or EC2NodeClass spec (for example, after you change the AMI family) and replaces it.

Emptiness. Nodes with no workloads get terminated quickly.

Each can be controlled with disruption budgets:

disruption:
  consolidationPolicy: WhenEmptyOrUnderutilized
  consolidateAfter: 1m
  budgets:
    - nodes: "20%"
    - nodes: "0"
      schedule: "0 9 * * mon-fri"
      duration: 8h
      reasons:
        - Drifted
        - Underutilized

What this does:

  • Consolidation runs when nodes are empty or underutilized, with a 1 minute hold before any action.
  • At any given time, no more than 20% of the NodePool’s nodes can be disrupted simultaneously.
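  • During business hours (a window that opens at 09:00 Monday through Friday and lasts 8 hours), drift and consolidation are blocked entirely. Budget schedules are evaluated in UTC, so shift the cron expression to match your local working day. Empty node termination still applies.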

Disruption budgets are the most important production guardrail Karpenter offers. Configure them deliberately. If you do not set budgets, Karpenter defaults to disrupting up to 10% of nodes at a time, which is reasonable for general workloads but too aggressive for clusters with single-replica services or strict PodDisruptionBudgets.
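For the single-replica case, Karpenter also honors a pod-level annotation that blocks voluntary disruption of the node while the pod is running. A minimal sketch, with a hypothetical workload name and image:

```yaml
# Hypothetical single-replica service that opts out of voluntary disruption.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: license-server               # hypothetical workload name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: license-server
  template:
    metadata:
      labels:
        app: license-server
      annotations:
        karpenter.sh/do-not-disrupt: "true"   # Karpenter will not voluntarily disrupt this pod's node
    spec:
      containers:
        - name: server
          image: example.com/license-server:1.0   # hypothetical image
```

Use it sparingly: every annotated pod pins its node in place, which works against consolidation savings.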

Consolidation policies

Two consolidation policies are available:

  • WhenEmpty: Karpenter only terminates nodes that are completely empty. Safe and conservative.
  • WhenEmptyOrUnderutilized: Karpenter actively replaces underutilized nodes with smaller ones. Higher savings, more disruption.

Most teams start with WhenEmpty and graduate to WhenEmptyOrUnderutilized once they trust their PodDisruptionBudgets and have validated that critical workloads handle eviction gracefully.
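A PodDisruptionBudget is the standard Kubernetes object behind that trust, and Karpenter’s drains respect it. The sketch below assumes a hypothetical Deployment labeled app: api running at least three replicas.

```yaml
# Keep at least two "api" pods running during any voluntary eviction.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb                      # hypothetical name
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: api                       # must match the pod labels of the protected workload
```

A budget that can never be satisfied (for example, minAvailable equal to the replica count) blocks node drains entirely, so size it below the replica count.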

Spot instance strategy

Spot is the largest single cost lever Karpenter unlocks. It is also the largest source of operational risk if you handle it reactively.

Diversify across Spot pools

Karpenter natively diversifies across Spot pools when you allow multiple instance families and zones in a NodePool:

requirements:
  - key: karpenter.sh/capacity-type
    operator: In
    values: ["spot"]
  - key: karpenter.k8s.aws/instance-category
    operator: In
    values: ["c", "m", "r"]
  - key: karpenter.k8s.aws/instance-generation
    operator: Gt
    values: ["4"]
  - key: topology.kubernetes.io/zone
    operator: In
    values: ["us-east-1a", "us-east-1b", "us-east-1c"]

The more instance types and zones a NodePool can use, the more Spot pools Karpenter draws from, and the lower the chance of a correlated mass interruption.

Fall back to on-demand

A safer pattern is to allow both capacity types in the same NodePool:

- key: karpenter.sh/capacity-type
  operator: In
  values: ["spot", "on-demand"]

Karpenter prefers Spot when available and falls back to on-demand when Spot is constrained.
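Inside a mixed pool, individual workloads can still opt out of Spot by selecting on the capacity-type label. A sketch with a hypothetical pod name and image:

```yaml
# Hypothetical pod that must run on on-demand capacity even in a mixed NodePool.
apiVersion: v1
kind: Pod
metadata:
  name: payments-api                 # hypothetical name
spec:
  nodeSelector:
    karpenter.sh/capacity-type: on-demand   # excludes Spot nodes for this pod only
  containers:
    - name: api
      image: example.com/payments-api:1.0   # hypothetical image
      resources:
        requests:
          cpu: 500m
          memory: 512Mi
```

Everything else in the pool continues to prefer Spot, so the selector costs only what the pinned workloads consume.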

Handle interruption notices

Karpenter listens for AWS Spot interruption notices via the SQS queue you configure in the controller settings. When AWS sends a 2-minute interruption warning, Karpenter cordons the node, drains the workloads, and provisions a replacement.

That 2-minute window is short. Workloads that take longer than 2 minutes to terminate gracefully will be killed mid-shutdown. Mitigate by:

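  • Setting terminationGracePeriodSeconds so that graceful shutdown finishes comfortably inside the 2-minute window.
  • Using PodDisruptionBudgets and multiple replicas so that a single node drain does not take out every copy of a service.
  • Avoiding Spot for workloads that cannot tolerate interruption.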
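As a sketch of the first mitigation, the pod below caps its shutdown at 90 seconds, leaving headroom inside the 2-minute notice. The 90-second figure, the pod name, and the image are assumptions for illustration; the right value depends on how long the application actually needs to drain.

```yaml
# Hypothetical Spot-tolerant pod with a shutdown budget shorter than the interruption notice.
apiVersion: v1
kind: Pod
metadata:
  name: queue-consumer               # hypothetical name
spec:
  terminationGracePeriodSeconds: 90  # SIGTERM, then SIGKILL after 90s at the latest
  containers:
    - name: consumer
      image: example.com/queue-consumer:1.0   # hypothetical image; should handle SIGTERM
      resources:
        requests:
          cpu: 250m
          memory: 256Mi
```

The application still has to react to SIGTERM (stop taking work, flush state); the grace period only bounds how long it is given.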

This is also where Karpenter alone has a ceiling. Karpenter reacts to AWS’s 2-minute warning. It does not predict interruptions earlier.

Advanced patterns

A few configurations that show up in mature Karpenter deployments.

Weighted NodePools for tier prioritization

When multiple NodePools could host a pod, Karpenter picks the one with the highest weight. This is how teams express preference for reserved capacity over Spot, or for one instance family over another:

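# Pool A: prefer reserved capacity
# Note: the "reserved" capacity type is not accepted by every Karpenter release
# (including the 1.0.x chart pinned earlier); check the docs for your version.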
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: reserved
spec:
  weight: 100
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["reserved"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
---
# Pool B: fall back to general spot/on-demand
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general
spec:
  weight: 10
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default

Pods land on reserved first if it has capacity, then fall through to general.

GPU NodePools with taints

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: gpu
spec:
  template:
    spec:
      taints:
        - key: nvidia.com/gpu
          effect: NoSchedule
      requirements:
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values: ["g5", "g6"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: gpu-nodeclass
  limits:
    "nvidia.com/gpu": 100

The taint forces GPU workloads to explicitly tolerate the pool, so non-GPU workloads do not accidentally land on expensive GPU nodes.
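On the workload side, that means a toleration plus a GPU resource request. A sketch with a hypothetical pod name and image; it assumes the NVIDIA device plugin is running so that nodes advertise nvidia.com/gpu.

```yaml
# Hypothetical GPU pod that tolerates the taint on the gpu NodePool.
apiVersion: v1
kind: Pod
metadata:
  name: training-job                 # hypothetical name
spec:
  tolerations:
    - key: nvidia.com/gpu            # matches the taint key on the gpu NodePool
      operator: Exists
      effect: NoSchedule
  containers:
    - name: trainer
      image: example.com/trainer:1.0 # hypothetical image
      resources:
        limits:
          nvidia.com/gpu: 1          # extended resource; requests default to the limit
```

The GPU request is what steers Karpenter toward the gpu pool; the toleration only permits the placement.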

Custom AMIs

For organizations with hardened AMIs or custom user data, point EC2NodeClass at your AMI:

spec:
  amiFamily: Custom
  amiSelectorTerms:
    - tags:
        Name: my-org-eks-worker-ami
  userData: |
    #!/bin/bash
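    # bootstrap.sh is the AL2-style entrypoint; AL2023-based images are configured through nodeadm instead
    /etc/eks/bootstrap.sh ${CLUSTER_NAME}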
    # custom hardening here

Use the Custom AMI family when you need user-data control beyond what the managed AMI families allow.

Day-2 operations

What changes once Karpenter is provisioning real production traffic.

Observability

Karpenter exposes metrics on :8080/metrics in Prometheus format. Metric names have changed between releases, so confirm the exact names against your controller’s /metrics output. The metrics worth alerting on:

  • karpenter_nodes_terminated: a spike indicates aggressive consolidation or interruption events.
  • karpenter_pods_state{phase="pending"}: pods Karpenter could not provision for. A sustained count indicates NodePool constraints are too tight.
  • karpenter_disruption_evaluation_duration_seconds: how long disruption decisions take. High values indicate scaling problems.

Pair with logs. Karpenter logs every provisioning decision, every consolidation event, and every disruption with the workload that triggered it.

Troubleshooting

Two commands diagnose most issues:

kubectl describe pod <pending-pod>
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter --tail=200

The pod’s events tell you why scheduling failed. Karpenter’s logs tell you why provisioning failed. The most common production issues are NodePool requirement mismatches, exhausted limits, and IAM permission errors during instance launch.

AMI rollouts

When a new AL2023 AMI ships and you want to roll it out, update the EC2NodeClass alias and let drift detection do the work:

amiSelectorTerms:
  - alias: al2023@v20251201

Karpenter detects the drift and replaces existing nodes within the disruption budgets you have configured. For staged rollouts, use a separate EC2NodeClass and migrate one NodePool at a time.
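A staged rollout along those lines could use a second EC2NodeClass pinned to the new AMI version. The name al2023-next is hypothetical; the other fields mirror the default EC2NodeClass from earlier in this guide.

```yaml
# Hypothetical second EC2NodeClass carrying the new AMI version.
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: al2023-next                  # hypothetical name
spec:
  amiFamily: AL2023
  role: KarpenterNodeRole-${CLUSTER_NAME}
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: ${CLUSTER_NAME}
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: ${CLUSTER_NAME}
  amiSelectorTerms:
    - alias: al2023@v20251201        # pinned version instead of @latest
```

Point one NodePool’s nodeClassRef.name at al2023-next, let drift replace its nodes within the configured budgets, and repeat for the next pool once the first looks healthy.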

Karpenter on Azure and GCP

Karpenter started on AWS and is most mature there. The CNCF project includes a provider abstraction, and active providers exist for Azure (AKS) and Alibaba Cloud.

If you are running on AKS, the Karpenter provider for Azure is in active development. Expect feature parity with the AWS provider to lag by a release or two.

For GKE, GCP’s Node Auto Provisioning (NAP) covers similar ground with a different control plane.

The patterns in this guide apply broadly, but the specifics (provider configuration, IAM equivalents, AMI vs image references) need to be translated per cloud.

Where Karpenter ends, and the operational layer begins

Karpenter’s job is provisioning. It does that well. What it does not do, by design, is the operational layer above provisioning:

  • It does not show cost attribution by workload, namespace, or team.
  • It does not adjust pod resource requests as workloads evolve, so it provisions against whatever requests are set.
  • It does not predict Spot interruptions earlier than AWS’s 2-minute warning.
  • It does not surface why the cluster scaled at 2:14 AM in a single view.
  • It does not consolidate stateful workloads without restarts.

Most platform teams running Karpenter at scale eventually build that layer themselves: a Grafana stack for visibility, custom Prometheus queries for Spot ratio, a side process for resource-request tuning, runbooks for AMI rotations.

That layer can also be bought.

Cast AI for Karpenter runs alongside Karpenter and adds the parts Karpenter does not ship. Cost attribution by workload and team. Continuous workload rightsizing. Spot interruption prediction earlier than the standard 2-minute notice. Container Live Migration for moving stateful workloads between nodes without restart. Karpenter continues to provision nodes; Cast AI handles the operational layer above it.

If you have spent the time tuning NodePools, configuring disruption budgets, and building dashboards, the next step is the operational layer that turns Karpenter from a powerful provisioner into a fully operated platform.

Frequently Asked Questions About Karpenter

Does Karpenter work with EKS Auto Mode?

EKS Auto Mode is built on Karpenter. AWS runs the controller for you and ships a default NodePool configuration. If you are running Auto Mode, you are already running Karpenter, just without direct access to the controller pod or the NodePool definitions. Self-managed Karpenter on a standard EKS cluster gives you full control over both.

Can Karpenter run on AKS or other clouds?

Karpenter is a CNCF project with provider abstraction. The AWS provider is the most mature. The Azure provider for AKS is in active development with feature parity lagging by a release or two. Alibaba Cloud also has an active provider. GKE has its own auto-provisioning that is conceptually similar but is not Karpenter.

What is the difference between WhenEmpty and WhenEmptyOrUnderutilized?

WhenEmpty terminates only nodes with no running workloads. WhenEmptyOrUnderutilized actively replaces underutilized nodes with smaller ones to consolidate capacity. The first is conservative and safe by default. The second is more aggressive and saves more, but only after you have validated that workloads handle eviction gracefully and PodDisruptionBudgets are configured correctly.

Does Karpenter support Graviton (arm64) instances?

Yes. Set kubernetes.io/arch in the NodePool requirements to include arm64, and Karpenter will provision Graviton instances when they are the best fit. For mixed clusters, allow both amd64 and arm64. Workloads that require a specific architecture should be pinned via a node selector.
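A sketch of both halves: a NodePool that allows either architecture, and a pod pinned to arm64. The pool name, pod name, and image are hypothetical, and the pod’s image must actually be built for arm64.

```yaml
# Hypothetical mixed-architecture NodePool.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: mixed-arch
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64", "arm64"]     # allows both x86 and Graviton instances
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
---
# Hypothetical pod that only runs on arm64 nodes.
apiVersion: v1
kind: Pod
metadata:
  name: arm-only-service
spec:
  nodeSelector:
    kubernetes.io/arch: arm64            # pins the pod to Graviton nodes
  containers:
    - name: app
      image: example.com/arm-only-service:1.0   # hypothetical arm64 image
```

Pods with multi-arch images can omit the selector and let Karpenter choose whichever architecture fits best.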

Can I run Karpenter alongside Cluster Autoscaler?

Yes, during migration. Each manages its own node groups or NodePools, and they do not interfere as long as workloads route to one or the other via labels and selectors. This is the recommended pattern for production migration: stand up Karpenter alongside CA, move workloads tier by tier, and decommission node groups as they drain.

What permissions does Karpenter need?

The Karpenter controller needs permission to launch and terminate EC2 instances, describe instance types, read SSM parameters for AMI lookup, and read the SQS queue for Spot interruption notices. The launched nodes need a separate IAM role with the standard worker-node permissions (ECR pull, CNI, EBS volume management). The official getting-started guide ships both policies.

How does Karpenter handle pod priority?

Karpenter respects Kubernetes pod priority and preemption. Higher-priority pods can preempt lower-priority pods on existing nodes. For provisioning, Karpenter does not currently prioritize provisioning for higher-priority pending pods over lower-priority ones in the same scheduling pass; both are evaluated together.
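Priority itself is plain Kubernetes configuration rather than anything Karpenter-specific. A sketch, with a hypothetical class name, value, and pod:

```yaml
# Hypothetical PriorityClass for workloads allowed to preempt lower-priority pods.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: business-critical            # hypothetical name
value: 100000                        # higher values win during preemption
globalDefault: false
preemptionPolicy: PreemptLowerPriority
description: "Workloads that may preempt lower-priority pods on existing nodes."
---
# Hypothetical pod that uses the class.
apiVersion: v1
kind: Pod
metadata:
  name: checkout-api
spec:
  priorityClassName: business-critical
  containers:
    - name: api
      image: example.com/checkout-api:1.0   # hypothetical image
```

Preemption is carried out by the kube-scheduler on existing nodes; Karpenter only sees the pods that remain pending afterwards.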

Does Karpenter replace Cluster Autoscaler or run alongside it?

Either. Most production clusters running Karpenter at scale eventually decommission the Cluster Autoscaler entirely because Karpenter handles the same scale-up, consolidation, drift, and node expiration. During migration, the two run side by side; in the long term, Karpenter is enough.
