The Docker Problems Nobody Talks About
Been running FastMCP servers for over a year, and container deployment has some nasty surprises. Here's what actually works and what'll break in ways you didn't expect.
Multi-stage builds are mandatory - learned this when Docker images got huge and deployments crawled to a halt. FastMCP pulls in tons of dependencies - ML libraries, database drivers, auth modules. Your CI will timeout, storage costs explode, and everyone gets mad.
Single-stage builds are career suicide. The Docker multi-stage build docs are fine for theory, but here's what actually keeps production running.
Dockerfile That Won't Make You Cry
This Dockerfile survived three different companies and their production nightmares. It follows Docker's official best practices, but more importantly, it won't break at 2am:
```dockerfile
# Build stage - install dependencies
FROM python:3.11-slim AS builder
WORKDIR /app

# Still on Python 3.11 because 3.12 broke our asyncio code with an SSL context error
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Install build dependencies (build-essential provides the gcc some FastMCP deps need)
# Clean the apt caches in the same layer or your image will be huge
RUN apt-get update && apt-get install -y \
        build-essential \
        curl \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Install Python dependencies first (better layer caching)
COPY requirements.txt .
RUN pip install --no-cache-dir --upgrade pip \
    && pip install --no-cache-dir -r requirements.txt

# Runtime stage - minimal image
FROM python:3.11-slim
WORKDIR /app

# Copy the virtual environment from the builder
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Create a non-root user (security audits will fail without this)
RUN groupadd -r mcpuser && useradd -r -g mcpuser mcpuser \
    && chown -R mcpuser:mcpuser /app
USER mcpuser

# Copy application code
COPY --chown=mcpuser:mcpuser . .

# Health check that actually works
# Note: curl is not installed in the slim image - use Python instead
HEALTHCHECK --interval=30s --timeout=10s --start-period=15s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')" || exit 1

# CRITICAL: bind to 0.0.0.0 or container networking won't work
EXPOSE 8000
CMD ["python", "server.py", "--transport", "http", "--port", "8000", "--host", "0.0.0.0"]
```
Why Each Line Matters (Learned the Hard Way):
- Multi-stage build: Images used to be massive, now they're much smaller. Build tools in production containers are security holes.
- Non-root user: Security scans will fail without this. DevSecOps will reject your deployment and you'll look like an amateur.
- Health checks: Without proper health checks, Kubernetes thinks dying containers are healthy. Use Python instead of curl - slim images don't include curl by default.
- HTTP transport: STDIO is useless in containers. HTTP is the only transport that works reliably.
- Bind to 0.0.0.0: If you bind to 127.0.0.1, external traffic can't reach your container. This will break staging in frustrating ways.
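Since the slim image has no curl, the HEALTHCHECK shells out to Python. The same check is worth keeping as a tiny helper you can run by hand. Here's a sketch against a throwaway stdlib server standing in for the real `/health` endpoint (`probe` and `HealthHandler` are my names, not FastMCP API):

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Stand-in for the server's /health endpoint."""

    def do_GET(self):
        if self.path == "/health":
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        # keep request logging quiet
        pass


def probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers with HTTP 200."""
    try:
        return urllib.request.urlopen(url, timeout=timeout).status == 200
    except OSError:
        return False


# Throwaway server on a random free port
server = HTTPServer(("127.0.0.1", 0), HealthHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

healthy = probe(f"http://127.0.0.1:{port}/health")
server.shutdown()
```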
Environment Configuration
FastMCP servers need different configurations between development and production. Here's how I actually configure this in production (learned after weeks of wrestling with config files):
```python
import os

from fastmcp import FastMCP

# Configuration that won't break at 3am
mcp = FastMCP(
    name=os.getenv("SERVER_NAME", "production-server"),
    version=os.getenv("SERVER_VERSION", "1.0.0"),
)

# Configure based on environment
if os.getenv("ENVIRONMENT") == "production":
    # Production logging: structured JSON via structlog
    import structlog

    structlog.configure(
        processors=[
            structlog.stdlib.filter_by_level,
            structlog.stdlib.add_logger_name,
            structlog.stdlib.add_log_level,
            structlog.stdlib.PositionalArgumentsFormatter(),
            structlog.processors.TimeStamper(fmt="iso"),
            structlog.processors.StackInfoRenderer(),
            structlog.processors.format_exc_info,
            structlog.processors.JSONRenderer(),
        ],
        logger_factory=structlog.stdlib.LoggerFactory(),
        wrapper_class=structlog.stdlib.BoundLogger,
        cache_logger_on_first_use=True,
    )
```
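One pattern that helps here: gather every `os.getenv` call into a single loader that validates at startup, so a typo'd variable fails the deploy instead of surfacing at 3am. A minimal sketch (`ServerConfig` and `load_config` are my names, not FastMCP API):

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ServerConfig:
    name: str
    version: str
    environment: str
    port: int


def load_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Read settings once at startup; fail fast on bad values."""
    port = int(env.get("PORT", "8000"))
    if not 0 < port < 65536:
        raise ValueError(f"PORT out of range: {port}")
    return ServerConfig(
        name=env.get("SERVER_NAME", "production-server"),
        version=env.get("SERVER_VERSION", "1.0.0"),
        environment=env.get("ENVIRONMENT", "development"),
        port=port,
    )


config = load_config({"ENVIRONMENT": "production", "PORT": "9000"})
```

Passing a plain dict makes the loader trivially testable without touching the real environment.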
Memory Management: The Silent Killer
FastMCP has memory leak issues. Containers slowly consume more memory and eventually get OOMKilled during heavy traffic. Long-running tool calls seem to hold onto memory - it's a garbage collection issue but the exact cause isn't clear.
Memory configuration that actually works:
```dockerfile
# These settings help with memory management
ENV PYTHONMALLOC=malloc
ENV MALLOC_TRIM_THRESHOLD_=100000
ENV PYTHONFAULTHANDLER=1

# Enable this if you want to debug memory leaks (adds overhead)
# ENV PYTHONMALLOC=debug
```

And give containers enough headroom at runtime:

```shell
# Never set memory limits too low - containers will die randomly
# 512MB is barely enough for anything useful
docker run -m 1g --oom-kill-disable=false your-fastmcp-server
```
What These Actually Do:
- `PYTHONMALLOC=malloc`: Python's default allocator (pymalloc) is garbage for long-running processes; this bypasses it
- `MALLOC_TRIM_THRESHOLD_`: Forces glibc to return freed memory to the OS instead of hoarding it
- Container memory limits: Without them, one runaway container can kill your entire host (happened to us twice)
Pro tip: Restart your containers every 24 hours with a CronJob. It's hacky but it works. Memory leaks are a fact of life with current FastMCP versions.
Transport Selection: What Actually Works
| Transport | Container Viability | Performance | Debugging | Reality Check |
|---|---|---|---|---|
| STDIO | ❌ Completely broken | N/A | N/A | Don't waste your time |
| HTTP | ✅ Only sane choice | Good enough | Easy to debug | Use this or suffer |
| SSE | ⚠️ Timeout hell | Inconsistent | Painful | Avoid unless forced |
HTTP transport that won't break:
```python
if __name__ == "__main__":
    import argparse
    import sys

    parser = argparse.ArgumentParser()
    parser.add_argument("--transport", default="http")
    parser.add_argument("--port", type=int, default=8000)
    # NEVER default to 127.0.0.1 in containers - learned this the hard way
    parser.add_argument("--host", default="0.0.0.0")
    args = parser.parse_args()

    if args.transport == "http":
        # Basic error handling, because startup WILL fail sometimes
        try:
            mcp.run(transport="http", host=args.host, port=args.port)
        except Exception as e:
            print(f"Failed to start HTTP server: {e}")
            sys.exit(1)
    else:
        # Don't even bother with other transports in containers
        print("Use HTTP transport in containers or you'll have a bad time")
        sys.exit(1)
```
Listen carefully: if you bind to `127.0.0.1` in a container, external traffic can't reach it. This seems obvious, but I've seen senior engineers make this mistake.
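The difference is easy to demonstrate with plain sockets: a listener bound to `0.0.0.0` answers on every interface (which, in a container, includes the bridge network), while a `127.0.0.1` listener only answers on loopback. A minimal sketch:

```python
import socket

# Bind to all interfaces, the way a containerized server must
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("0.0.0.0", 0))  # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

# Reachable over loopback - and, in a container, over the bridge too
client = socket.create_connection(("127.0.0.1", port), timeout=2)
reachable = client.getpeername()[1] == port
client.close()
server.close()
```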
Security Hardening (Or: How Not to Get Pwned)
Production FastMCP containers will get attacked. Google's distroless images reduce attack surface, but Container Registry shuts down March 18, 2025 so plan accordingly:
```dockerfile
# Distroless is great until you need to debug something at 3am
FROM gcr.io/distroless/python3-debian11

# Or keep debugging tools (recommended for sanity)
FROM python:3.11-slim

# Remove anything attackers can use
RUN apt-get remove -y apt curl wget && \
    apt-get autoremove -y && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* /var/cache/apt/*

# A read-only filesystem prevents most attacks; keep /tmp writable
VOLUME ["/tmp"]

# nobody:nobody - security scans love this
USER 65534:65534
```
Runtime security (DevSecOps will check this):
```shell
# Don't mount your entire filesystem as a volume (yes, people do this)
docker run \
  --read-only \
  --tmpfs /tmp \
  --security-opt=no-new-privileges \
  --cap-drop=ALL \
  --user 65534:65534 \
  --volume /app/data:/app/data:ro \
  your-fastmcp-server
```
Real talk: most security breaches happen because someone mounted `/` as a volume or ran as root. Don't be that person.
Image Optimization: Lessons from Our Storage Bill Horror Stories
I learned image optimization when Docker registry costs got out of hand from building oversized images. Here's what actually works:
Layer optimization lessons:
- Group `RUN` commands or every line creates a new layer (learned this when builds got painfully slow)
- Use `.dockerignore` - accidentally including large files will bloat every image
- Multi-stage builds saved our ass - build tools stayed in the build stage, the runtime stayed clean
- Pin package versions or Docker will re-download everything on every build
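A `.dockerignore` that has served me well as a starting point for FastMCP repos (the patterns are examples; adjust to your layout — note that `.dockerignore` comments must sit on their own lines):

```
# .dockerignore - keep junk out of the build context
.git
__pycache__/
*.pyc
.venv/
venv/
tests/
docs/
*.md
# never bake secrets into an image
.env
# large local datasets
data/
```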
Size optimization experience:
- Base Python image: Around 1GB (eats storage budget)
- With FastMCP + dependencies: Images get huge, 1.5GB+ easily
- After multi-stage build: Much smaller and deployments are faster
- With distroless base: Smaller still but debugging becomes difficult
Reality check: Smaller images deploy faster and cost less to store. But if you can't debug in production, you'll spend hours trying to figure out why things break. Pick your poison.
Container Resource Requirements
From production experience:
- Simple tools (file operations, basic APIs): Start with 256-512MB memory, adjust CPU as needed
- Database work: 512MB-1GB memory works well, CPU matters for complex queries
- ML/AI tools: Memory intensive - 1-2GB+ required, CPU depends on model complexity
- High-traffic APIs: Scale resources generously - 2GB+ memory, multiple CPU cores
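In Kubernetes terms, the "database work" tier above translates to something like the following requests/limits block (the memory numbers come from the list; the CPU values are my guess at a sane default):

```yaml
resources:
  requests:
    memory: "512Mi"
    cpu: "250m"
  limits:
    memory: "1Gi"
    cpu: "1"
```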
Memory leak mitigation:
```dockerfile
# Debug-oriented allocator settings - useful for hunting leaks, but they add
# overhead and override the PYTHONMALLOC=malloc recommendation above
ENV PYTHONMALLOC=pymalloc_debug
ENV PYTHONFAULTHANDLER=1
ENV PYTHONUNBUFFERED=1

# Dev mode enables extra runtime checks and warnings
ENV PYTHONDEVMODE=1
```
Development vs Production Image Patterns
Separate images for development and production environments:
Development Dockerfile:
```dockerfile
FROM python:3.11-slim
RUN apt-get update && apt-get install -y curl vim htop
COPY requirements-dev.txt .
RUN pip install -r requirements-dev.txt
# Include development tools, debuggers, etc.
```
Production Dockerfile:
```dockerfile
FROM python:3.11-slim AS production
# Minimal dependencies only
# No development tools
# Security hardening
# Performance optimizations
```
Monitoring Container Health
FastMCP containers need comprehensive health monitoring. Follow Kubernetes health check best practices with proper liveness and readiness probes:
```python
import time

from fastapi import FastAPI
from fastmcp import FastMCP

# The FastMCP instance from the configuration section
mcp = FastMCP(name="production-server")

# Wrap it in FastAPI to get health endpoints
app = FastAPI()


@app.get("/health")
async def health_check():
    """Basic health check"""
    return {"status": "healthy", "timestamp": time.time()}


@app.get("/health/ready")
async def readiness_check():
    """Kubernetes readiness probe"""
    # Check database connections and external dependencies here
    return {"status": "ready"}


@app.get("/health/live")
async def liveness_check():
    """Kubernetes liveness probe"""
    # Check that the server is responsive
    return {"status": "alive"}


# Mount FastMCP on a subpath (newer FastMCP versions expose this as mcp.http_app())
app.mount("/mcp", mcp.app)
```
For comprehensive health check implementation, consider using fastapi-healthchecks for structured dependency checking.
Container Security Considerations
Recent security research has identified critical vulnerabilities in MCP integrations where attackers can hijack AI agents through prompt injection. Production deployments must implement proper input validation and sandboxing.
This containerization foundation enables the Kubernetes orchestration patterns covered in the next section. Without proper Docker practices, Kubernetes deployments fail unpredictably in production.