What Docker Daemon Actually Does (And When It Breaks)

The Docker daemon (dockerd) is the server process that does the actual work when you run Docker commands. It's part of Docker Engine - and if you're stuck on Docker 20.10.x like half the enterprise world, you get to deal with that BuildKit memory leak. Even the newer 24.x versions can still leave you with corrupted overlay2 storage when you least expect it.

How It Actually Works (Until It Doesn't)

When you run docker run hello-world, your Docker client talks to dockerd through a socket at /var/run/docker.sock. The daemon then does a bunch of stuff: pulls the image, creates a container, sets up networking, and starts the process. This works great until the daemon hangs, which is basically every other Tuesday.
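
To watch that conversation without the CLI in the way, you can hit the Engine API over the socket yourself. This is a minimal sketch, assuming curl is installed and the socket sits at the default path; docker_ping is a made-up helper name, while /_ping is the API's liveness endpoint.

```shell
# docker_ping is a hypothetical helper, not something Docker ships.
docker_ping() {
  sock="${1:-/var/run/docker.sock}"
  if [ ! -S "$sock" ]; then
    echo "no socket at $sock"
    return 1
  fi
  # A healthy daemon answers /_ping with "OK".
  curl --silent --max-time 5 --unix-socket "$sock" http://localhost/_ping
}

docker_ping || echo "daemon not reachable"
```

The --max-time matters: a hung daemon accepts the connection and then never answers, so without a deadline the check hangs right along with it.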

I've seen dockerd eat like 3.7GB of RAM on production servers just because someone left too many dangling images around. Lost a weekend to this exact issue. The daemon handles everything from image layers to container networking, and when any of these subsystems gets stuck, your entire Docker setup becomes useless.

The Socket Permission Dance

Here's something they don't tell you in the tutorials: Docker daemon runs as root and owns that socket. Try running docker ps as a non-root user and you'll get permission denied connecting to /var/run/docker.sock. The "solution" is adding users to the docker group, which effectively gives them root privileges - great for security, right?
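
A quick way to tell which side of that trade-off your account is on is to look at your groups. A small sketch: has_docker_group is a hypothetical helper, and it assumes the group is actually named docker, which is the default but not guaranteed.

```shell
# has_docker_group is a hypothetical helper: succeeds if the
# space-separated group list in $1 contains "docker".
has_docker_group() {
  case " $1 " in
    *" docker "*) return 0 ;;
    *) return 1 ;;
  esac
}

if has_docker_group "$(id -nG)"; then
  echo "in the docker group (which means effectively root)"
else
  echo "not in the docker group; expect permission denied on the socket"
fi
```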

Memory Usage Reality Check

Docker daemon eats RAM like there's no tomorrow. Sure, it starts around 800MB, but I've debugged production issues where dockerd hit 6GB+ because of image layer accumulation and container metadata bloat. The daemon uses Linux namespaces and cgroups for isolation, but good luck debugging when it starts swapping your host to death.
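
Before blaming the daemon, measure it. The sketch below reads the resident memory of dockerd from ps; kib_to_mib is a hypothetical helper, and the whole thing assumes a Linux host with procps (pgrep and ps) available.

```shell
# kib_to_mib is a hypothetical helper; ps reports RSS in KiB.
kib_to_mib() {
  echo $(( $1 / 1024 ))
}

pid=$(pgrep -x dockerd 2>/dev/null | head -n 1 || true)
if [ -n "$pid" ]; then
  rss_kib=$(ps -o rss= -p "$pid" | tr -d ' ')
  echo "dockerd (pid $pid) resident memory: $(kib_to_mib "$rss_kib") MiB"
else
  echo "dockerd is not running on this host"
fi
```

This only covers the daemon process itself. Memory used by containers, containerd, and the shims shows up under their own PIDs.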

Container startup times are supposedly 1-2 seconds in benchmarks, but try spinning up a Spring Boot app on a busy production box and watch that number become 45+ seconds while the daemon contemplates its life choices. Docker 20.10.x has that lovely memory leak in BuildKit that makes things even worse.

When Docker Daemon Dies

Here's the fun part: when dockerd crashes (and it will), your running containers keep going but you lose all management. No docker ps, no docker stop, nothing. You end up with zombie containers that you can only kill with kill -9. The live restore feature is supposed to reconnect, but half the time it just creates more confusion.
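
Live restore is off unless you turn it on with the live-restore key in daemon.json. The sketch below writes a candidate config to a scratch file rather than touching /etc/docker directly; the scratch-file workflow and the python3 validation step are my assumptions, not something Docker prescribes.

```shell
# Build the candidate config somewhere harmless first.
candidate="$(mktemp)"
cat > "$candidate" <<'EOF'
{
  "live-restore": true
}
EOF

# Catch JSON typos before the daemon does.
if command -v python3 >/dev/null 2>&1; then
  python3 -m json.tool "$candidate" >/dev/null && echo "valid JSON in $candidate"
fi

# After merging this into /etc/docker/daemon.json and reloading or
# restarting the daemon, confirm it took effect with:
#   docker info --format '{{.LiveRestoreEnabled}}'
```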

The API Everyone Hates to Love

The Docker API is how everything talks to the daemon - Docker Compose, Kubernetes, Portainer, whatever. It's a REST API that works over Unix sockets or TCP (if you're brave enough to expose it). The problem? When the daemon locks up, the API hangs too, and every tool trying to talk to Docker becomes useless.
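
Because a locked-up daemon hangs its callers, anything that probes it should carry a deadline. A minimal sketch using coreutils timeout; probe is a hypothetical helper, and the 5 second budget in the usage line is an arbitrary choice.

```shell
# probe is a hypothetical helper: run a command under a deadline
# and classify the outcome.
probe() {
  deadline="$1"
  shift
  rc=0
  timeout "$deadline" "$@" >/dev/null 2>&1 || rc=$?
  case "$rc" in
    0)   echo "responsive" ;;
    124) echo "hung" ;;   # timeout(1) exits 124 when the deadline expires
    *)   echo "error" ;;
  esac
}

# Usage against a real daemon:
#   probe 5 docker version
```

Telling "hung" apart from "error" is the point: an error usually means the daemon is down or you lack permission, while a hang means it is up and stuck.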

The containerd integration was supposed to make things better by offloading work, but now you have two things that can break instead of one. Spent my entire Saturday in December 2023 debugging this exact scenario when containerd 1.7.0 had that fun bug where it would deadlock on image pulls.

Docker Daemon vs Container Runtime Alternatives

| Feature | Docker Daemon (dockerd) | Containerd | Podman | CRI-O |
|---|---|---|---|---|
| Architecture | Client-server daemon (single point of failure) | Lightweight runtime | Daemonless (fork-exec model) | K8s-focused runtime |
| Memory Usage | 800MB-6GB (usually starts at 900MB, but I've seen it spike to 8GB+) | usually around 250MB, but I've seen it spike to 400MB+ | 15-20% less than Docker | hovers around 180MB in my testing |
| Root Privileges | Requires root (security nightmare) | Requires root | Actually works rootless | Requires root |
| API Compatibility | Full Docker API | Partial compatibility | "Mostly" compatible (until it isn't) | Kubernetes CRI only |
| Container Startup | 1.2-30+ seconds (image dependent) | 0.8 seconds (simpler operations) | 1.0 seconds (no daemon startup cost) | 0.9 seconds |
| When It Breaks | Everything stops working | Less surface area for failure | Per-command failures only | K8s handles restarts |
| Networking Pain | Complex bridge/overlay setup | Basic (you'll need plugins) | Works but different from Docker | CNI plugins required |
| Docker Compose | Works perfectly | Limited support | Works with quirks | Doesn't work |
| Learning Curve | Easy until production | Need to learn new tooling | "Drop-in replacement" (lies) | Kubernetes-only mindset |
| Production Reality | Battle-tested (and battle-scarred) | Getting there | Still finding edge cases | Solid for K8s only |

Docker Daemon Internals (And Where Things Go Wrong)

The Modular Mess

Docker daemon has all these internal components that are supposed to work together but mostly just fight over resources: API server, container manager, image manager, network controller. Sounds organized, right? In reality, when any one of them shits the bed, everything stops working.

The API server listens on /var/run/docker.sock (or on TCP if you hate security: by convention port 2375 unencrypted, 2376 with TLS) and tries to handle requests. "Tries" being the key word - I've seen it lock up completely when the image manager gets stuck downloading layers. The whole thing shares state through some internal database that can get corrupted if the daemon doesn't shut down cleanly.

Container State Hell

The daemon tracks containers through a "sophisticated state machine" - fancy words for a bunch of if/else statements that decide whether your container is running, stopped, or in some undefined zombie state. The statuses docker ps actually reports are created, running, paused, restarting, removing, exited, and dead, and the daemon has to keep track of which one each container is supposed to be in. The metadata gets stored under /var/lib/docker/containers/, and when that gets corrupted (which happens), your containers become ghost containers that show up in docker ps but can't be killed.

Container restart policies are supposed to work automatically, but they often fail silently when the daemon is under load. I've debugged production issues where containers that should have restarted just... didn't.
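
When a container "just didn't" restart, the first thing to check is whether a policy was set at all and whether it ever fired. A small sketch around docker inspect; restart_summary is a hypothetical helper, and the container name web in the usage line is a placeholder.

```shell
# restart_summary is a hypothetical helper around docker inspect.
# It prints the container name, its restart policy, and how many
# times the daemon has restarted it.
restart_summary() {
  docker inspect \
    --format '{{.Name}} policy={{.HostConfig.RestartPolicy.Name}} restarts={{.RestartCount}}' \
    "$1"
}

# Usage, assuming a container named "web" exists:
#   restart_summary web
```

A policy of "no" with zero restarts means nothing was ever configured. A real policy with zero restarts on a container that has clearly died is the silent failure worth digging into.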

The Root Problem (Literally)

Docker's security model boils down to: everything runs as root. It's like giving your toddler admin access to your server. The daemon needs root because it has to manage namespaces, cgroups, and network interfaces.

User namespace remapping is supposed to help, but it breaks half the existing images because they assume they're running as root. Rootless mode exists but still has limitations that make it unusable for most real workloads.

Network Isolation Theater

Docker creates network namespaces for each container, which sounds secure until you realize the daemon is managing all of this from a single root process. Container-to-container communication works through bridge networks that the daemon configures, but I've seen network conflicts take down entire Docker environments when daemon restarts don't clean up properly.

Port mappings are another source of fun - the daemon binds to host ports and forwards traffic, but if it crashes while ports are bound, you get port already in use errors that require manual cleanup.

Configuration Hell

Docker daemon configuration is scattered across command-line flags, the /etc/docker/daemon.json file, environment variables, and systemd service files. Good luck figuring out which one is actually being used when things break.

I've spent hours debugging why a logging driver wasn't working, only to find it was overridden by a command-line flag buried in /lib/systemd/system/docker.service that some previous admin added in 2019. The daemon configuration docs don't mention that some settings require a full restart, while others can be reloaded - learned that one at 2:30am when a config reload took down prod.
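
To find out which flags the running daemon really got, read its command line instead of guessing. The sketch below splits a dockerd command line into its flags; list_flags is a hypothetical helper and the sample string is an invented example, not output from a real host.

```shell
# list_flags is a hypothetical helper: print each flag from a
# command line on its own line.
list_flags() {
  printf '%s\n' "$1" | tr ' ' '\n' | grep '^-' || true
}

# Invented sample of what a unit file's ExecStart might contain.
sample='/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --log-driver=syslog'
list_flags "$sample"

# On a real host, feed it the live command line, and read the unit
# together with any drop-in overrides:
#   list_flags "$(ps -o args= -C dockerd)"
#   systemctl cat docker.service
```

Anything that shows up here and also in /etc/docker/daemon.json is a conflict waiting to happen, so compare the two lists before touching either.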

Logging and Monitoring Lies

The daemon has "built-in" monitoring capabilities - meaning you can export metrics to Prometheus if you configure it right. Half the time the metrics endpoint doesn't work because of permission issues or the daemon is too busy to respond.
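
"Configure it right" mostly means setting metrics-addr in daemon.json. A hedged sketch: the address 127.0.0.1:9323 is my assumption, and older engine versions also wanted experimental mode enabled, so check the docs for your release.

```shell
metrics_addr="127.0.0.1:9323"   # assumed address; pick your own

# Candidate config in a scratch file, to be merged by hand.
candidate="$(mktemp)"
cat > "$candidate" <<EOF
{
  "metrics-addr": "$metrics_addr"
}
EOF
cat "$candidate"

# After merging and restarting the daemon, check that it answers at all.
if command -v curl >/dev/null 2>&1; then
  if curl --silent --fail --max-time 5 "http://$metrics_addr/metrics" >/dev/null; then
    echo "metrics endpoint is up"
  else
    echo "no response from $metrics_addr"
  fi
fi
```

Binding to loopback keeps the endpoint off the network. If Prometheus runs on another host, you need a different address and something in front of it.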

Container logs get stored in /var/lib/docker/containers/ as JSON files that grow without bounds unless you configure log rotation. I've seen these log files fill up entire disks, bringing production systems to their knees.
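
Log rotation for the default json-file driver is two options in daemon.json. A minimal sketch with the config written to a scratch file; the 10m and 3 values are arbitrary starting points, not recommendations.

```shell
# Candidate config in a scratch file, to be merged by hand.
candidate="$(mktemp)"
cat > "$candidate" <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
EOF

# Catch JSON typos before the daemon does.
if command -v python3 >/dev/null 2>&1; then
  python3 -m json.tool "$candidate" >/dev/null && echo "valid JSON in $candidate"
fi
```

These defaults only apply to containers created after the daemon picks them up. Existing containers keep their old logging settings until they are recreated.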

Error Handling That Doesn't

The daemon's "sophisticated error handling" mostly involves logging cryptic messages and giving up. When a container fails to start, you get helpful errors like Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "/bin/bash": stat /bin/bash: no such file or directory - which is Docker's way of saying the image doesn't have bash, but it took a full line of cryptic garbage to tell you.

Health checks look like they should restart failing containers automatically, but on a standalone engine they only mark the container as unhealthy and do nothing else about it; acting on that status is left to Swarm or to whatever orchestrator or script you bolt on. The containerd integration was supposed to make error recovery better, but now you have to debug two different systems when things go wrong.
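
If you want an unhealthy status to lead to a restart outside of Swarm, you have to wire it up yourself. A minimal sketch of a loop you could run from cron; restart_unhealthy is a hypothetical helper, and blindly restarting is an assumption about what you want, since it can hide the real problem.

```shell
# restart_unhealthy is a hypothetical helper: restart every container
# whose health check currently reports "unhealthy".
restart_unhealthy() {
  for id in $(docker ps -q --filter health=unhealthy); do
    echo "restarting $id"
    docker restart "$id" >/dev/null
  done
}

# Usage, for example from a cron job:
#   restart_unhealthy
```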

Actually Useful Docker Daemon Troubleshooting

Q

Why does `docker ps` say "Cannot connect to the Docker daemon socket"?

A

This is the most common Docker error you'll encounter. The daemon either isn't running, or you don't have permission to access the socket at /var/run/docker.sock.

Quick fixes:

  • Check if dockerd is running: sudo systemctl status docker
  • Start it: sudo systemctl start docker
  • Add yourself to the docker group: sudo usermod -aG docker $USER (then log out/in)

On WSL2, this usually means Docker Desktop isn't running or the integration is broken.

Q

Docker daemon is eating all my RAM. What gives?

A

Docker daemon starts around 800MB but balloons faster than my AWS bill after someone forgot to stop a GPU instance. I've seen it hit some ridiculous number like 5.8GB on busy systems. Use docker stats to see per-container memory and docker system df to see how much image, container, and volume data the daemon is tracking.

The nuclear option: docker system prune -a && sudo systemctl restart docker

For ongoing relief:

  • Set up log rotation
  • Use docker system prune regularly
  • Monitor with docker system df

Q

The daemon crashed and now containers won't stop!

A

When dockerd crashes, running containers keep going but become orphaned. docker ps can't reach the daemon, but ps aux | grep container-name shows the processes still running.

The fix:

  1. Restart Docker: sudo systemctl restart docker
  2. If that doesn't work: docker kill $(docker ps -q) to kill everything
  3. Nuclear option: sudo kill -9 $(pidof containerd-shim-runc-v2)

Live restore is supposed to reconnect, but it's flaky and often leaves you with ghost containers. This broke at 2am on a Sunday because of course it did.

Q

"Error response from daemon: OCI runtime create failed" - what now?

A

This is Docker's way of saying "something went wrong but I won't tell you what." Usually it's one of these:

Permission issues:

docker run --user $(id -u):$(id -g) your-image

Out of disk space:

docker system prune -a
df -h /var/lib/docker

SELinux being SELinux:

sudo setsebool -P container_manage_cgroup true

Port already in use:

sudo netstat -tulpn | grep :your-port

Q

Docker daemon won't start after system reboot

A

Check the daemon logs first:

sudo journalctl -u docker.service --no-pager

Common issues:

  • Corrupted daemon state: sudo rm -rf /var/lib/docker/tmp/*
  • Disk full: Clean up /var/lib/docker (but back up first)
  • Config file errors: Validate /etc/docker/daemon.json with json_pp
  • Port conflicts: Something else grabbed port 2375/2376

Q

Why do my containers stop when I log out?

A

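If you start a container in the foreground, the docker client stays attached to your terminal. When the session ends the client gets a hangup, and because it proxies signals to the container by default, the container typically goes down with it. Make sure you're using: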

docker run -d your-image  # Detached mode

Not:

docker run your-image &   # Background process tied to your session

Q

Docker networking randomly stops working

A

This usually happens after VPN connections, system hibernation, or daemon restarts. The bridge network gets confused and containers can't reach each other or the internet.

Quick fix:

sudo systemctl restart docker
docker network prune

Nuclear fix:

sudo systemctl stop docker
sudo ip link delete docker0
sudo systemctl start docker

Q

Can I run Docker without the daemon (like Podman)?

A

Not with Docker itself - the daemon is fundamental to Docker's architecture.

If you want daemonless containers, switch to Podman which provides a mostly-compatible CLI. But be prepared for subtle differences that will break your existing scripts.

Q

"ENOSPC: no space left on device" but I have plenty of disk space

A

Docker has its own storage drivers and the issue is usually in /var/lib/docker. Check:

docker system df          # Docker's disk usage
du -sh /var/lib/docker/*  # Find the space hogs
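
One common cause when the disk looks half empty is inodes, not blocks: lots of small layer files can exhaust them and the kernel returns the same ENOSPC. A small sketch, assuming the default data root of /var/lib/docker.

```shell
target="/var/lib/docker"
# Fall back to / so the sketch still runs on a box without Docker.
[ -d "$target" ] || target="/"

df -h "$target"   # block usage: what "plenty of disk space" usually refers to
df -i "$target"   # inode usage: IUse% at 100% also returns ENOSPC
```

If the inode column is the one at 100%, deleting a few big files won't help. You need to remove many files, which in practice means pruning images and build cache.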

The nuclear option:

docker system prune -a --volumes  # Removes EVERYTHING unused

More targeted fixes:

  • docker container prune - removes stopped containers
  • docker volume prune - removes unused volumes
  • docker image prune -a - removes unused images
  • docker builder prune - removes build cache

Q

The daemon hangs and won't respond to signals

A

Sometimes dockerd gets stuck in an uninterruptible state. You can't even kill it with kill -9. This usually happens when it's waiting on storage I/O or stuck in kernel space.

The brutal fix:

sudo systemctl stop docker.socket
sudo systemctl stop docker.service
sudo systemctl kill docker.service
sudo systemctl start docker.service

If that doesn't work, you're looking at a reboot. The CEO asked why the site was down and I had to explain Docker sockets.
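
Before committing to that reboot, it helps to confirm the daemon really is in uninterruptible sleep. A minimal sketch using the STAT column from ps; is_uninterruptible is a hypothetical helper, and it only looks at the main dockerd process, so a single stuck thread can still slip past it.

```shell
# is_uninterruptible is a hypothetical helper: succeeds if a ps STAT
# value starts with D (uninterruptible sleep, usually waiting on I/O).
is_uninterruptible() {
  case "$1" in
    D*) return 0 ;;
    *)  return 1 ;;
  esac
}

pid=$(pgrep -x dockerd 2>/dev/null | head -n 1 || true)
if [ -n "$pid" ]; then
  stat=$(ps -o stat= -p "$pid" | tr -d ' ')
  if is_uninterruptible "$stat"; then
    echo "dockerd is in D state ($stat): signals will not land until the I/O returns"
  else
    echo "dockerd state is $stat: signals should still be delivered"
  fi
else
  echo "dockerd is not running on this host"
fi
```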

Q

The 3AM Debugging Checklist

A

When Docker daemon decides to shit the bed during your on-call shift, copy this:

## Quick daemon status check
sudo systemctl status docker
sudo journalctl -u docker --no-pager -n 50

## Nuclear option - restart everything
docker system prune -a && sudo systemctl restart docker

## If still broken, check what's actually running
ps aux | grep docker
ps aux | grep containerd
sudo netstat -tulpn | grep 237

## Last resort before calling senior engineers
sudo systemctl stop docker.socket docker.service
sudo kill -9 $(pidof dockerd containerd containerd-shim-runc-v2)
sudo systemctl start docker.service

Time estimates: 5 minutes if lucky, 2 hours if not. Welcome to production Docker.
