What does OOMKilled (exit code 137) mean?
OOMKilled is a Linux kernel feature, not a Kubernetes feature. The cgroup memory controller enforces limits at the kernel level. Kubernetes reads the container exit code and sets reason: OOMKilled in pod status — it’s a reporter, not the enforcer.
- Exit code 137 = 128 + SIGKILL (signal 9). The Linux kernel OOM killer fired against your container.
- Root cause: the container crossed its cgroup memory limit — or the node ran out of physical memory.
- Kubernetes surfaces it as reason: OOMKilled in kubectl describe pod.
- No graceful shutdown: SIGKILL cannot be caught or ignored.
- The fix is correct per-container memory sizing — not globally raising limits or adding nodes.
- QoS class determines kill priority: BestEffort pods die first, Guaranteed pods last.
- Three distinct scenarios produce OOMKilled. Each has a different fix.
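The 128 + signal arithmetic from the list above can be checked mechanically. This is a minimal sketch, assuming a Linux or macOS host where Python's signal module knows SIGKILL; decode_exit_code is a hypothetical helper name, not part of any Kubernetes tooling.

```python
import signal


def decode_exit_code(exit_code: int) -> str:
    """Describe a container exit code using the 128 + N signal convention."""
    if exit_code > 128:
        signum = exit_code - 128
        try:
            # Map the signal number to its symbolic name, e.g. 9 -> SIGKILL.
            name = signal.Signals(signum).name
        except ValueError:
            # Unknown signal number on this platform: fall back to the raw value.
            name = f"signal {signum}"
        return f"killed by {name} (128 + {signum})"
    return f"exited with status {exit_code}"


print(decode_exit_code(137))
```

Note that the code only tells you the signal. It cannot tell you who sent it, which is why the Reason field in pod status still matters.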
Why Kubernetes kills a pod with exit code 137
Three distinct scenarios produce OOMKilled. Conflating them leads to the wrong fix.
Scenario 1: Container memory limit reached
The most common case. When a container’s memory consumption crosses its cgroup limit, the kernel kills that container only — the rest of the pod keeps running. Kubelet restarts the container per the pod’s restartPolicy.
On Kubernetes 1.28+ with cgroup v2, kubelet sets memory.oom.group=1, so the kernel kills all processes in the container together rather than just the offending process. This is the correct behavior for multi-process applications; previously, the kernel could kill a single child process that pushed the cgroup over its limit and leave the main process running in a broken state.
Init container OOMKills: If an init container gets OOMKilled, the pod never transitions to Running — it stays in Init or Init:OOMKilled state and restarts. This is distinct from a running container OOMKill, where the pod is already running and the container restarts within it. Check kubectl describe pod for init container terminated state.
Sidecars as hidden culprits: Istio proxies, Datadog agents, and other sidecars run in the same pod and draw from the same node memory. An app container correctly sized at 256Mi can still push the pod toward OOM if an Istio sidecar is consuming another 100–150Mi that was never accounted for in the limit. Always inspect every container in the pod, not just the application container.
QoS class determines which containers the kernel targets first under node memory pressure:
| QoS Class | Condition | oom_score_adj | Kill Priority |
|---|---|---|---|
| BestEffort | No requests or limits set | 1000 | First |
| Burstable | Requests < Limits (or only one set) | 2–999 | Middle |
| Guaranteed | Requests == Limits (non-zero, all containers) | -997 | Last |
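The Burstable range in the table is not arbitrary: the score scales with how much of the node's memory the container requests. The sketch below assumes the formula given in the Kubernetes documentation, min(max(2, 1000 - 1000 × request / capacity), 999); the kubelet's actual implementation may differ in small details, and the function and constant names here are illustrative.

```python
# Fixed scores for the two QoS classes that do not depend on request size.
FIXED_OOM_SCORE_ADJ = {"Guaranteed": -997, "BestEffort": 1000}


def burstable_oom_score_adj(memory_request_bytes: int, node_capacity_bytes: int) -> int:
    """Approximate oom_score_adj for a Burstable container.

    Assumes the documented formula; a larger request relative to node
    capacity yields a lower score, so the container is killed later.
    """
    raw = 1000 - (1000 * memory_request_bytes) // node_capacity_bytes
    # Clamp so Burstable always sits between Guaranteed and BestEffort.
    return min(max(2, raw), 999)


GIB = 1024 ** 3
# A container requesting 4 GiB on a 16 GiB node.
print(burstable_oom_score_adj(4 * GIB, 16 * GIB))
```

The practical takeaway: among Burstable pods, the ones with small memory requests relative to node size are the first to go under node pressure.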
Scenario 2: Node memory pressure and kubelet eviction
Kubelet monitors node-level memory via eviction thresholds (default: memory.available < 100Mi). When the node approaches exhaustion, kubelet proactively evicts pods — BestEffort first, then Burstable. This is distinct from an OOMKill:
| | OOMKilled | Evicted |
|---|---|---|
| Trigger | Container exceeded its memory limit | Kubelet memory pressure threshold |
| Exit Code | 137 | N/A (pod phase: Failed) |
| kubectl reason | OOMKilled | Evicted |
| Who kills | Linux kernel | kubelet |
| Pod status after | Restarted (per restartPolicy) | Stays Failed until deleted |
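The distinction in the table can be read straight from pod status. This is a rough sketch, assuming the input is the JSON from kubectl get pod <pod-name> -o json loaded into a dict; classify_memory_failure is a hypothetical helper, and it only looks at the fields named in the table.

```python
def classify_memory_failure(pod: dict) -> str:
    """Return 'Evicted', 'OOMKilled: <container>', or 'neither' for a pod dict."""
    status = pod.get("status") or {}

    # Eviction is recorded at pod level: phase Failed with reason Evicted.
    if status.get("phase") == "Failed" and status.get("reason") == "Evicted":
        return "Evicted"

    # OOM kills are recorded per container, in the current or previous state.
    statuses = (status.get("initContainerStatuses") or []) + (
        status.get("containerStatuses") or []
    )
    for cs in statuses:
        for key in ("state", "lastState"):
            terminated = (cs.get(key) or {}).get("terminated")
            if terminated and terminated.get("reason") == "OOMKilled":
                return f"OOMKilled: {cs['name']}"
    return "neither"


example = {
    "status": {
        "phase": "Running",
        "containerStatuses": [
            {
                "name": "app",
                "state": {"running": {}},
                "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
            }
        ],
    }
}
print(classify_memory_failure(example))
```

Because init containers and sidecars are checked too, this also surfaces the "hidden culprit" case where the application container is not the one that died.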
Scenario 3: Node-level kernel OOM (container under its own limit)
If kubelet eviction doesn’t shed pods fast enough, the kernel OOM killer fires at the node level. A container can be OOMKilled here even if it never exceeded its own cgroup limit — it was simply the lowest-scored candidate in the kernel’s OOM priority calculation at that moment.
This appears as OOMKilled in pod status, identical to Scenario 1. Distinguish via the node syslog:

    grep -i "out of memory: killed process" /var/log/syslog

A line that begins "Memory cgroup out of memory" means a container hit its cgroup limit. A plain "Out of memory" line with no "Memory cgroup" prefix is a node-level OOM, not a container cgroup limit breach. The fix is different: ensure sufficient node memory headroom using kube-reserved memory configuration, not by changing container limits that weren’t actually exceeded.
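If you have many nodes to check, the same distinction can be scripted. This sketch assumes the wording recent Linux kernels use for OOM kills ("Memory cgroup out of memory: Killed process ..." for a cgroup limit, "Out of memory: Killed process ..." for a global OOM); older kernels phrase this differently, so treat the pattern as an assumption to verify against your own logs.

```python
import re

# Assumed kernel log wording; check against your kernel version.
OOM_LINE = re.compile(
    r"(Memory cgroup out of memory|Out of memory): Killed process (\d+) \(([^)]*)\)"
)


def classify_oom_line(line: str):
    """Return (scope, pid, process_name) for an OOM kill line, else None."""
    match = OOM_LINE.search(line)
    if match is None:
        return None
    scope = "cgroup-limit" if match.group(1).startswith("Memory cgroup") else "node-level"
    return scope, int(match.group(2)), match.group(3)


sample = "kernel: Out of memory: Killed process 99 (node) total-vm:2048kB"
print(classify_oom_line(sample))
```

A "node-level" result for a container that shows OOMKilled in pod status is the signature of Scenario 3.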
How to diagnose OOMKilled (kubectl describe, events, logs)
Start with kubectl describe. Look for the terminated state in the container status block:

    kubectl describe pod <pod-name> -n <namespace>

You’re looking for:

    Last State: Terminated
      Reason: OOMKilled
      Exit Code: 137

Pull logs from the previous (killed) container instance:

    kubectl logs --previous <pod-name> -n <namespace>

Check recent events:

    kubectl get events --sort-by=.lastTimestamp -n <namespace>

For cluster-wide visibility in Prometheus:

    kube_pod_container_status_last_terminated_reason{reason="OOMKilled"}

To understand the memory trajectory that led to the kill, query actual working set memory. Use container_memory_working_set_bytes, not container_memory_usage_bytes, which includes page cache the kernel will reclaim before triggering OOM:

    quantile_over_time(0.95, container_memory_working_set_bytes{container!=""}[7d:5m])

This p95 value over 7 days is your baseline for setting memory requests. If your current limit sits below this number, OOMKilled will keep happening.

For PSI (Pressure Stall Information), a leading indicator of memory starvation before the OOM killer fires:

    node_pressure_memory_stalled_seconds_total

Once you know the right values, apply them immediately:

    kubectl set resources deployment/my-app -c my-container --requests=memory=256Mi --limits=memory=512Mi

How to fix OOMKilled
- Raise the limit 50% right now. This stops the bleeding without permanently inflating your baseline:

      kubectl patch deployment my-app --type=json -p='[{"op":"replace","path":"/spec/template/spec/containers/0/resources/limits/memory","value":"384Mi"}]'

- Deploy and watch for 24 hours. Confirm no new OOMKilled events:

      kube_pod_container_status_last_terminated_reason{reason="OOMKilled"}

- Get a data-driven recommendation before setting a permanent value. Deploy VPA in updateMode: Off (see below) and let it observe actual usage for several days. Don’t guess twice.
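The 50% bump in the first step is simple arithmetic, but hand-editing JSON patches is error-prone. The sketch below builds the same kind of patch body programmatically. It assumes the current limit is known in MiB and that the target container sits at the given index in the pod template; raise_limit_patch and its parameters are hypothetical names for illustration.

```python
import json
import math


def raise_limit_patch(current_mib: int, container_index: int = 0, factor: float = 1.5) -> str:
    """Build a JSON patch string that raises a container's memory limit."""
    new_mib = math.ceil(current_mib * factor)
    patch = [
        {
            "op": "replace",
            "path": f"/spec/template/spec/containers/{container_index}/resources/limits/memory",
            "value": f"{new_mib}Mi",
        }
    ]
    return json.dumps(patch)


# A 256Mi limit raised by 50%.
print(raise_limit_patch(256))
```

The resulting string can be passed to kubectl patch with --type=json -p. Double-check the container index first, since a sidecar injected ahead of your app would shift it.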
Set memory requests and limits correctly
- Memory request: set at p95 of container_memory_working_set_bytes over 7 days.
- Memory limit: set at p99 or higher; 2x the request is a reasonable starting point without tail data.
- Guaranteed QoS (requests == limits, non-zero): grants oom_score_adj=-997, the lowest kill priority, last to be targeted under node pressure. Use this for latency-sensitive production workloads where any restart is unacceptable. The scheduler treats the full request as reserved capacity, so cost is higher.
- Burstable QoS (requests < limits): the right choice for most workloads. Set requests at p95 actual usage, limits at p99+ or 1.5–2x requests. Pods can absorb traffic spikes without exhausting node capacity.
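The sizing rules above can be turned into a small calculation once you have exported working set samples. This is a sketch under stated assumptions: the samples are raw byte values of container_memory_working_set_bytes, the percentile uses the nearest-rank method (Prometheus interpolates, so values may differ slightly), and min_samples_for_tail is an arbitrary illustrative cutoff for "enough tail data".

```python
MIB = 1024 * 1024


def percentile(samples, pct: int) -> int:
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct% of n
    return ordered[rank - 1]


def to_mib(n_bytes: int) -> int:
    """Round a byte count up to whole MiB."""
    return -(-n_bytes // MIB)


def recommend_memory(samples_bytes, min_samples_for_tail: int = 100) -> dict:
    """Request at p95; limit at p99 with enough data, else 2x the request."""
    request = percentile(samples_bytes, 95)
    if len(samples_bytes) >= min_samples_for_tail:
        limit = percentile(samples_bytes, 99)
    else:
        limit = 2 * request
    return {"request": f"{to_mib(request)}Mi", "limit": f"{to_mib(limit)}Mi"}


# Synthetic series: 200 samples rising from 1 MiB to 200 MiB.
print(recommend_memory([i * MIB for i in range(1, 201)]))
```

On real data you would likely add headroom above p99 for the limit, since a limit equal to an observed percentile will by definition be crossed occasionally.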
If you’re using VPA for recommendations without enforcement:

    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: my-app-vpa
    spec:
      targetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: my-app
      updatePolicy:
        updateMode: "Off" # recommendations only, no enforcement

Check .status.recommendation.containerRecommendations for the suggested values. Valid updateMode values: Off, Initial, Recreate, Auto.
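Once VPA has observed the workload for a few days, its recommendations can be pulled out of the object's status. The sketch below assumes the VPA object has been fetched as JSON (for example with kubectl get vpa my-app-vpa -o json) and loaded into a dict; memory_recommendations is a hypothetical helper that reads only the memory values.

```python
def memory_recommendations(vpa: dict) -> dict:
    """Map container name -> memory lowerBound/target/upperBound from a VPA dict."""
    recommendation = (vpa.get("status") or {}).get("recommendation") or {}
    result = {}
    for rec in recommendation.get("containerRecommendations") or []:
        result[rec["containerName"]] = {
            bound: rec[bound]["memory"]
            for bound in ("lowerBound", "target", "upperBound")
            if "memory" in (rec.get(bound) or {})
        }
    return result


# Hand-written sample in the shape of a VPA status block.
vpa_object = {
    "status": {
        "recommendation": {
            "containerRecommendations": [
                {
                    "containerName": "my-container",
                    "lowerBound": {"cpu": "25m", "memory": "200Mi"},
                    "target": {"cpu": "50m", "memory": "300Mi"},
                    "upperBound": {"cpu": "1", "memory": "600Mi"},
                }
            ]
        }
    }
}
print(memory_recommendations(vpa_object))
```

An empty result usually just means VPA has not collected enough history yet.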
Find and fix memory leaks
If working set grows unbounded between restarts, you have a leak — raising the limit only extends the time between kills. JVM applications deserve specific attention: setting -Xmx equal to the container limit ignores off-heap allocation (metaspace, code cache, threads). Rule of thumb: -Xmx = container_limit × 0.75. The remaining 25% covers non-heap JVM overhead.
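The 75% rule of thumb is easy to apply consistently if you derive the flag from the limit instead of hard-coding both. This is a minimal sketch: parse_quantity handles only the binary suffixes Ki, Mi, and Gi (real Kubernetes quantities allow more forms), and xmx_flag is a hypothetical helper name.

```python
_BINARY_UNITS = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def parse_quantity(quantity: str) -> int:
    """Convert a quantity such as '512Mi' to bytes (binary suffixes only)."""
    for suffix, multiplier in _BINARY_UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * multiplier
    return int(quantity)  # plain byte count


def xmx_flag(container_limit: str, heap_fraction: float = 0.75) -> str:
    """Return a -Xmx flag sized to a fraction of the container memory limit."""
    heap_bytes = int(parse_quantity(container_limit) * heap_fraction)
    heap_mib = heap_bytes // (1024 ** 2)  # round down to stay under the budget
    return f"-Xmx{heap_mib}m"


print(xmx_flag("512Mi"))
```

If the non-heap side of your JVM is large (many threads, heavy direct buffers), a fraction lower than 0.75 may be the safer starting point.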
Why adding more memory is not the fix
Blanket limit increases across your fleet don’t prevent OOM kills — they raise the average everywhere while the specific containers that actually need more memory keep dying. According to the Cast AI 2026 Kubernetes Optimization Report, average memory utilization across production clusters is 20% with 79% overprovisioning. Clusters with generous memory padding were still averaging 40–50 OOM kills per monitoring interval. The memory was provisioned — it was provisioned to the wrong workloads.
How to prevent OOM kills at scale (automated rightsizing)
Manual sizing works for a handful of services. At 50+ deployments with variable traffic, it breaks down. Memory profiles change with code releases, seasonal load, and dataset growth. A limit that was correct three months ago may be wrong today.
Cast AI Workload Optimization tracks container_memory_working_set_bytes continuously, identifies containers approaching their limits, and raises those limits proactively — before an OOM kill, not after. For containers that are significantly overprovisioned, it brings limits down, recovering headroom for workloads that actually need it.
On Kubernetes 1.33+, In-Place Pod Resizing adjusts memory limits without restarting pods. For stateful workloads, this matters: a traditional VPA-style resize requires a pod restart that may itself cause a brief outage. In-place resizing eliminates that constraint.
Cast AI uses PSI (Pressure Stall Information) metrics as leading indicators. PSI measures time processes spend stalled waiting for memory — it signals starvation before the OOM killer fires. Acting on PSI lets the system raise a limit before the container dies.
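To see what a PSI signal looks like in practice, you can read it directly from the kernel. The sketch below parses the text format of /proc/pressure/memory (available on Linux kernels built with PSI support) and flags sustained stalls. The 10% threshold is an arbitrary illustrative value, and this is not a description of how Cast AI consumes PSI internally.

```python
def parse_psi(text: str) -> dict:
    """Parse PSI text into {'some': {...}, 'full': {...}} with float values."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        kind, *fields = line.split()
        # Each field looks like avg10=0.00 or total=12345.
        result[kind] = {key: float(value) for key, value in (f.split("=", 1) for f in fields)}
    return result


def memory_pressure_high(psi: dict, threshold_pct: float = 10.0) -> bool:
    """True if tasks were stalled on memory more than threshold_pct of the last 10s."""
    return psi["some"]["avg10"] > threshold_pct


# Sample text in the /proc/pressure/memory format.
sample_psi = (
    "some avg10=12.50 avg60=3.10 avg300=0.80 total=123456\n"
    "full avg10=4.00 avg60=1.00 avg300=0.20 total=45678\n"
)
print(memory_pressure_high(parse_psi(sample_psi)))
```

On a node, replace the sample with the contents of /proc/pressure/memory. The "some" line covers any stalled task, while "full" means all non-idle tasks were stalled at once.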
After enabling automated rightsizing, OOM kills dropped to near zero across clusters that were previously averaging 40–50 kills per interval (Cast AI 2026 Kubernetes Optimization Report). The rightsizing engine analyzes all resource dimensions simultaneously — when it corrects memory limits, it also surfaces CPU overprovisioning in the same pass. Provisioned CPUs dropped by roughly half in the same clusters. Memory and CPU overprovisioning tend to coexist; fixing one without the other leaves the job half done.
For cluster-by-cluster breakdowns of where overprovisioning concentrates and which workload types are most OOM-prone, see the Cast AI 2026 Kubernetes Optimization Report.
FAQ
What does exit code 137 mean in Kubernetes?
Exit code 137 means the container was killed by SIGKILL (signal 9): 128 + 9 = 137. In Kubernetes, this almost always means the Linux kernel OOM killer fired because the container exceeded its cgroup memory limit. Kubernetes reports this as reason: OOMKilled in pod status.
How do I find which container was OOMKilled?
Run kubectl describe pod <name> -n <namespace> and look for Reason: OOMKilled, Exit Code: 137 in the Last State section. For cluster-wide visibility: kube_pod_container_status_last_terminated_reason{reason="OOMKilled"} in Prometheus.
What is the difference between OOMKilled and Evicted in Kubernetes?
OOMKilled means the Linux kernel OOM killer terminated a container for crossing its cgroup memory limit: the exit code is 137, the kernel does the killing, and the container is restarted per restartPolicy. Evicted means kubelet proactively removed a pod because node-level memory.available dropped below the eviction threshold: there is no exit code (pod phase: Failed), kubelet does the killing, and the pod stays Failed until deleted. Eviction is preventive; OOMKilled is the kernel acting after limits are breached.
Does OOMKilled always mean exit code 137?
Yes — OOMKilled is caused by SIGKILL (signal 9), and 128 + 9 = 137. However, exit code 137 can also occur if a container is killed externally by SIGKILL for other reasons (for example, kubectl delete pod --force). Always confirm by checking kubectl describe pod for Reason: OOMKilled.
What QoS class prevents OOMKilled?
Guaranteed QoS (requests == limits for all containers in the pod) gives oom_score_adj=-997, making it the last candidate for node-level OOM killing. It does not prevent a container from being killed if it exceeds its own limit — it only reduces kill priority under node pressure. For latency-sensitive workloads where any restart is unacceptable, Guaranteed QoS is the right choice. For most workloads, Burstable QoS (requests at p95, limits at 1.5–2x requests) balances protection with scheduler efficiency.
Will raising the memory limit fix OOMKilled?
Only if the container was genuinely under-limited. If the root cause is a memory leak, raising the limit only extends the time between kills. Globally padding limits doesn’t solve the problem — per the Cast AI 2026 Kubernetes Optimization Report, clusters with 79% overprovisioning were still averaging 40–50 OOM kills per interval. Accurate per-container sizing based on actual usage data outperforms blanket limit increases.
How do I set Kubernetes memory limits correctly?
Set memory requests at p95 of container_memory_working_set_bytes over 7 days. Set limits at p99 or higher — 2x the request is a safe starting point. Use container_memory_working_set_bytes, not container_memory_usage_bytes (which includes reclaimable page cache). For JVM containers, set -Xmx to no more than 75% of the container memory limit to leave room for off-heap allocation. Use kubectl set resources to apply changes: kubectl set resources deployment/my-app -c my-container --requests=memory=256Mi --limits=memory=512Mi.
What is OOMKilled in Kubernetes and how do I prevent it at scale?
OOMKilled in Kubernetes is when a container is terminated by the Linux kernel OOM killer for exceeding its cgroup memory limit. At scale, preventing it requires continuous automated rightsizing — tracking actual memory usage per container and adjusting limits proactively. Manual sizing doesn’t keep up with changing workload behavior. Tools like Cast AI Workload Optimization monitor PSI metrics and working set trends to raise limits before OOM kills happen and lower limits for overprovisioned containers.



