Exit code 137 is Docker's way of telling you the Linux kernel just killed your container, most often because it tried to eat more memory than you allocated. The number comes from 128 + 9, where 9 is the signal number for SIGKILL - the nuclear option that can't be caught or ignored.
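You can reproduce the mechanics in a throwaway container - a minimal sketch, assuming you can pull python:3.12-slim (any image that can allocate a chunk of memory works), with an arbitrary 256MB allocation against a 64MB limit:
# Give the container 64MB and no swap headroom, then allocate ~256MB inside it
docker run --rm --memory=64m --memory-swap=64m python:3.12-slim \
  python -c "x = b'a' * (256 * 1024 * 1024)"
# The kernel sends SIGKILL; docker run reports the container's exit code
echo $?
The echo prints 137: 128 plus signal 9.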
The Real-World Scenario
Picture this: You're running a Node.js app in production with --memory=512m because that seemed reasonable. Everything works fine for weeks. Then at 3:17 AM on a Tuesday, your monitoring starts screaming. Your container died with exit code 137.
What happened? Your app hit a traffic spike, loaded more data into memory, and suddenly needed 600MB. The Linux kernel's OOM killer said "nope" and killed it instantly. No graceful shutdown, no cleanup, just dead. This is a common production scenario that catches teams off guard.
How to Confirm It's Actually an OOM Kill
Don't guess. Check if Docker flagged it as an OOM kill:
# Check if container was OOM killed
docker inspect --format '{{.State.OOMKilled}}' container_name
# See the actual exit code
docker inspect --format '{{.State.ExitCode}}' container_name
# Get container logs to see what happened before death
docker logs --tail=50 container_name
If OOMKilled is true and exit code is 137, you found your culprit. Be aware that the OOMKilled flag can be misleading or stay false even when memory was the cause, especially on Windows containers or when a child process gets killed instead of the main process.
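When the flag can't be trusted, the host's kernel log is the source of truth - assuming you have shell access to the Docker host (this won't work from inside the container):
# Look for OOM killer activity in the kernel log on the host
sudo dmesg -T | grep -i "out of memory"
# Or, on systemd hosts, search recent kernel messages
sudo journalctl -k --since "1 hour ago" | grep -i -E "oom|killed process"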
Memory Usage vs Memory Limits: The Gotcha
Here's what trips up most people: Docker's memory reporting can include cache and buffers, while the OOM killer cares about memory the kernel can't reclaim - primarily RSS (resident set size), the memory your process actually claims. This accounting difference causes plenty of confusion during debugging.
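While the container is still alive, you can read the kernel's own accounting for its cgroup - a sketch assuming a cgroup v2 host (on cgroup v1 the file is /sys/fs/cgroup/memory/memory.stat and the fields are named rss and cache):
# Anonymous memory (roughly RSS) vs reclaimable page cache for this container's cgroup
docker exec container_name cat /sys/fs/cgroup/memory.stat | grep -E "^(anon|file) "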
Use docker stats to see real-time usage:
# Live memory monitoring
docker stats container_name
# One-time snapshot
docker stats --no-stream
The memory column shows current usage vs limit. If you're consistently hitting 80%+ of your limit, you're playing Russian roulette with the OOM killer. For more detailed monitoring, consider using cAdvisor or other monitoring tools.
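For a quick check of how close each container sits to its limit, docker stats can print just those columns:
# Name, usage vs limit, and percentage of the limit for every running container
docker stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}"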
The JVM Memory Trap
Java applications are notorious for this. The JVM allocates heap space based on available system memory, not container limits. If your container has 512MB but the host has 32GB, the JVM might try to allocate 8GB of heap and instantly die. This is a well-documented issue in containerized environments.
Fix it by setting JVM flags to respect container limits:
# Container detection is on by default in modern JVMs (Java 10+ / 8u191+); this flag makes it explicit
-XX:+UseContainerSupport
# Or cap the heap explicitly
-Xmx400m # Leave room for non-heap memory (metaspace, threads, native buffers)
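One way to wire this in is through the environment at container start - a sketch with a placeholder image name; JAVA_TOOL_OPTIONS is read automatically by the JVM, and MaxRAMPercentage sizes the heap relative to the container limit instead of host RAM:
# Heap capped at ~75% of the 512MB limit, leaving room for metaspace, threads, and native buffers
docker run --memory=512m \
  -e JAVA_TOOL_OPTIONS="-XX:MaxRAMPercentage=75.0" \
  my-java-app   # placeholder image name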
Same problem exists with other runtimes. Node.js with --max-old-space-size, Python with memory pools, Go with garbage collection - they all need to know about your container's memory constraints. .NET applications have similar considerations.
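A couple of hedged equivalents for other runtimes (image and file names are placeholders; GOMEMLIMIT needs Go 1.19+):
# Node.js: cap V8's old-space heap below the container limit
docker run --memory=512m node:20-slim node --max-old-space-size=384 app.js
# Go: set a soft memory limit so the GC works harder before the kernel steps in
docker run --memory=512m -e GOMEMLIMIT=400MiB my-go-app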
Kubernetes Makes It More Complicated
In Kubernetes, you set both requests and limits. The OOM killer respects limits, but Kubernetes scheduling uses requests. This creates a dangerous gap that leads to unpredictable OOM kills.
If you set requests: 256Mi and limits: 512Mi, Kubernetes might schedule your pod on a node assuming it needs 256Mi. But if it actually uses 512Mi and the node is overcommitted, multiple pods can hit OOM simultaneously. This is explained in detail in the official Kubernetes documentation.
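To see whether a restarted pod actually died this way, the reason is recorded on the container status - a sketch assuming a pod named my-pod with a single container:
# Prints OOMKilled if the previous container instance was killed by the OOM killer
kubectl get pod my-pod -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
# Or eyeball it in the human-readable output
kubectl describe pod my-pod | grep -i oomkilled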
Current best practice is setting requests = limits for memory to avoid this surprise. The Kubernetes community increasingly recommends this approach for production workloads.
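One way to apply that without hand-editing YAML is kubectl set resources - a sketch assuming a Deployment named my-app:
# Pin memory requests and limits to the same value for every container in the Deployment
kubectl set resources deployment my-app --requests=memory=512Mi --limits=memory=512Mi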
Memory Leaks vs Memory Spikes
Exit code 137 from a memory leak looks different from exit code 137 after a traffic spike. Leaks show gradual memory growth in monitoring until sudden death. Spikes show stable usage followed by an immediate jump. Proper monitoring helps distinguish between these patterns.
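If you don't have metrics wired up yet, even a crude sampling loop makes the pattern obvious - a throwaway sketch, assuming a container named container_name:
# Log usage once a minute: a leak climbs steadily, a spike jumps from a flat baseline
while true; do
  echo "$(date +%s) $(docker stats --no-stream --format '{{.MemUsage}}' container_name)" >> mem.log
  sleep 60
done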
Real GitHub issue: Astro builds were failing with exit code 137 because too many concurrent image optimizations pushed memory usage over limits. Not a leak - just bad resource planning. Similar issues appear across different platforms and applications.
The fix was either increasing memory limits or throttling concurrent operations. Sometimes the answer isn't "give it more memory" but "make it use memory more efficiently." Production debugging techniques help identify the root cause.
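For the "give it more memory" route the change is just a bigger limit on the build container; throttling concurrency is tool-specific, so that part lives in your build tool's own settings. A sketch with placeholder image and command names, not the actual Astro fix:
# Raise the limit for the memory-hungry build step
docker run --rm --memory=4g my-build-image npm run build
# Optionally cap the Node heap below the limit so the build fails with a stack trace instead of a silent SIGKILL
docker run --rm --memory=4g -e NODE_OPTIONS=--max-old-space-size=3584 my-build-image npm run build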