How many workers should I use?

Honestly? No idea. Start with `(2 × cores) + 1` and see what breaks. I've seen 20 workers on a 4-core box because the app spent most of its time waiting on the database. I've also seen 2 workers handle more load than 10 workers because the app wasn't garbage.

**Reality:** Just try different numbers and measure what happens. There's no formula that works for everyone.
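If you want the starting point as code, here's a minimal sketch. The function name `suggested_workers` is made up for illustration; the only thing it encodes is the `(2 × cores) + 1` rule of thumb, which is a first guess and not a tuned value.

```python
import multiprocessing


def suggested_workers(cores: int) -> int:
    """Return the (2 x cores) + 1 starting point for a given core count."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return 2 * cores + 1


if __name__ == "__main__":
    # cpu_count() reports logical cores, which may differ from what a
    # container is actually allowed to use.
    cores = multiprocessing.cpu_count()
    print(f"{cores} cores -> start with {suggested_workers(cores)} workers")
```

Treat the number as the first thing to try, then load test and adjust.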

Why does my Gunicorn keep running out of memory?

Your app probably has a memory leak. Fix that first, then blame Gunicorn. Seriously though, each worker loads your entire app into memory. If your Django app is 200MB and you have 10 workers, that's 2GB just sitting there. Add `--max-requests 5000` to restart workers periodically and buy yourself some time.

Why do I need nginx with Gunicorn?

Because Gunicorn serving static files is like using a Ferrari to deliver pizza - technically possible but stupid expensive. Nginx handles static files in ~microseconds. Gunicorn takes milliseconds and ties up a worker process. With enough static requests, you'll run out of workers and your site will crawl. Plus nginx does SSL termination, gzip compression, and protects you from slow clients. Gunicorn does none of that well.

My app is slow, should I add more workers?

Probably not, but I've been wrong before.

**Things to check before you blame Gunicorn:**

- Database is slow (it's always the database)
- You're making 50 API calls per request
- Your code sucks

Adding workers to fix a slow database is like buying more cars to fix traffic jams. Spoiler: doesn't work.

Should I use async workers?

Only if your app is genuinely async-friendly. Most Django apps aren't. If you're doing database queries with the Django ORM, you're blocking anyway. Async workers won't help. FastAPI with proper async database calls? Yeah, async workers make sense.

**When async workers help:** Lots of HTTP API calls, long-polling, streaming responses

**When they don't:** Traditional Django/Flask apps that block on database calls

How do I debug why Gunicorn is dying?

Check your logs first: `journalctl -u your-service` or wherever you're logging. Common culprits:

- **Worker timeout:** Your code takes too long (increase `--timeout`)
- **OOM killer:** You're using too much memory (reduce workers or fix memory leaks)
- **File descriptor limit:** Too many connections (check `ulimit -n`)

Add `--log-level debug` for more verbose output, but prepare for log spam.
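For the file descriptor case, you can check the limit from inside Python as well as with `ulimit -n`. This is a rough sketch using the Unix-only `resource` module; the helper names and the 64-descriptor safety margin are assumptions picked for illustration, not anything Gunicorn defines.

```python
import resource  # Unix-only standard library module
from typing import Tuple


def fd_limits() -> Tuple[int, int]:
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def fd_headroom_ok(soft_limit: int, expected_connections: int, margin: int = 64) -> bool:
    """True if the soft limit covers the expected connections plus a margin.

    The margin (an assumed value) leaves room for log files, sockets to the
    database, and other descriptors the process holds besides client sockets.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= expected_connections + margin


if __name__ == "__main__":
    soft, hard = fd_limits()
    print(f"soft={soft} hard={hard} ok_for_2000={fd_headroom_ok(soft, 2000)}")
```

Run it as the same user (and under the same service manager) as Gunicorn, since limits can differ between your shell and the service.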

Can I run Gunicorn on Windows?

No. It uses Unix-specific features like `fork()`. Use Waitress if you're stuck on Windows. This comes up more than you'd think. The answer is always "switch to Linux or use a different server."

My Gunicorn workers keep timing out

Default timeout is 30 seconds. If your views take longer, workers get killed. Either:

1. **Fix your slow code** (preferred solution)
2. **Increase timeout:** `--timeout 120`
3. **Use async processing** for long tasks (Celery, RQ, etc.)

I've seen people set 300-second timeouts to "fix" slow database queries. Don't be that person.

Why is my memory usage growing over time?

Memory leaks in your app code. Gunicorn workers accumulate memory and never release it.

**Memory leak pattern looks like this:**

```
Hour 1: 200MB per worker
Hour 2: 250MB per worker
Hour 3: 320MB per worker
Hour 4: 400MB per worker
... eventually: OOM killer strikes
```

**Quick fix:** Restart workers periodically with `--max-requests 1000`

**Real fix:** Profile your app and find the leak

**Pro tip:** Python's garbage collector isn't perfect. Some third-party libraries leak memory like a sieve.
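If you're already collecting per-worker RSS samples, a crude check for that pattern could look like the sketch below. The function name and the 50MB growth threshold are assumptions for illustration; it's a heuristic for "never shrinks and keeps growing", not a real leak detector.

```python
from typing import Sequence


def looks_like_leak(rss_mb: Sequence[float], min_growth_mb: float = 50.0) -> bool:
    """Heuristic: memory never drops between samples and total growth is large.

    rss_mb is a time-ordered series of RSS readings (in MB) for one worker.
    """
    if len(rss_mb) < 3:
        return False  # too few samples to call it a trend
    never_shrinks = all(
        later >= earlier for earlier, later in zip(rss_mb, rss_mb[1:])
    )
    return never_shrinks and (rss_mb[-1] - rss_mb[0]) >= min_growth_mb


if __name__ == "__main__":
    print(looks_like_leak([200, 250, 320, 400]))  # the hourly pattern above
    print(looks_like_leak([200, 210, 205, 208]))  # normal fluctuation
```

A worker that warms up caches will also grow for a while, so look at samples taken well after startup.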

How do I gracefully restart Gunicorn?

Send a `HUP` signal to the master process: `kill -HUP $(cat gunicorn.pid)`. This spawns new workers and kills old ones after they finish current requests. Usually works, but occasionally a worker gets stuck and you need `kill -9`.
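The same thing from a deploy script, as a minimal sketch. It assumes Gunicorn was started with `--pid gunicorn.pid` so the master's PID is on disk; the helper names are hypothetical.

```python
import os
import signal
from pathlib import Path


def read_master_pid(pidfile: str) -> int:
    """Read the Gunicorn master PID from a pidfile."""
    return int(Path(pidfile).read_text().strip())


def reload_gunicorn(pidfile: str = "gunicorn.pid") -> int:
    """Send SIGHUP to the Gunicorn master and return its PID.

    Raises ProcessLookupError if the PID in the file no longer exists.
    """
    pid = read_master_pid(pidfile)
    os.kill(pid, signal.SIGHUP)
    return pid
```

Calling `reload_gunicorn()` is equivalent to the `kill -HUP` one-liner; it doesn't wait for old workers to finish, so check the logs afterwards.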

Why does Gunicorn work fine locally but die on my $5 VPS?

Because your $5 VPS has 512MB of RAM and you're trying to run 8 workers. Do the math. I spent 4 hours debugging this once before realizing I was trying to run a 200MB Django app with 10 workers on a machine with 1GB RAM. The OOM killer was having a field day.
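Here's "the math" as a sketch. The function name and the 256MB reserved for the OS and everything else on the box are assumptions; plug in your own measurements.

```python
def max_workers_for_memory(
    total_ram_mb: int, per_worker_mb: int, reserved_mb: int = 256
) -> int:
    """How many workers fit in RAM after setting aside reserved_mb.

    reserved_mb is an assumed allowance for the OS, nginx, and other
    processes; it is not something Gunicorn reports.
    """
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    usable_mb = total_ram_mb - reserved_mb
    return max(usable_mb // per_worker_mb, 0)


if __name__ == "__main__":
    # 200MB app on a 1GB machine vs. a 512MB machine
    print(max_workers_for_memory(1024, 200))
    print(max_workers_for_memory(512, 200))
```

With these assumptions a 1GB box fits 3 workers of 200MB each, nowhere near 10.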

Should I use --preload?

Depends. `--preload` loads your app once then forks workers. Saves memory but makes reloading code harder.

**Use --preload if:** You're memory-constrained and restart the whole process for code changes

**Skip it if:** You want to reload code with `HUP` signal

What about Docker containers?

Use fewer workers in containers. A 4-core host with 10 containers each running 8 workers = 80 workers fighting for 4 cores. Do the math. I usually run 2-4 workers per container and scale horizontally instead.
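A quick way to see how oversubscribed a host is, as a sketch. The function name is made up; the ratio is simply total workers divided by host cores.

```python
def oversubscription(containers: int, workers_per_container: int, host_cores: int) -> float:
    """Return total workers per host core across all containers."""
    if host_cores < 1:
        raise ValueError("host_cores must be at least 1")
    return containers * workers_per_container / host_cores


if __name__ == "__main__":
    print(oversubscription(10, 8, 4))  # 80 workers on 4 cores
    print(oversubscription(10, 2, 4))  # 20 workers on 4 cores
```

Even 2 workers per container is 5 workers per core in that example, which is why adding hosts matters more than adding workers.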

Why does Gunicorn randomly exit code 143?

Exit code 143 is 128 + 15: the process was terminated by SIGTERM, which is what Docker sends when it stops a container. Docker then waits 10 seconds by default and sends SIGKILL (exit code 137) if the process is still running. Either handle shutdown signals properly in your app or increase Docker's grace period to 30 seconds. I learned this when our "graceful" deployments kept showing as crashed in monitoring even though they worked fine.
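Exit codes above 128 follow the "128 + signal number" convention, so you can decode them instead of memorizing them. A small sketch (the function name and message wording are made up):

```python
import signal


def describe_exit_code(code: int) -> str:
    """Translate a process exit code into a short description."""
    if code == 0:
        return "clean exit"
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"  # not a signal this platform knows
        return f"killed by {name}"
    return f"application error (exit status {code})"


if __name__ == "__main__":
    for code in (0, 1, 137, 143):
        print(code, "->", describe_exit_code(code))
```

Handy in monitoring rules: 143 after a deploy is usually expected, 137 means the grace period ran out or the OOM killer stepped in.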

Gunicorn Python WSGI Server: AI-Optimized Technical Reference

Configuration

Production-Ready Worker Settings

Worker Formula: (2 × cores) + 1 as starting point
Reality Check: Formula fails for database-heavy apps (Django apps may need 20 workers on 4-core systems)
Memory Management: Use --max-requests 5000 for memory leak mitigation
Restart Jitter: Add --max-requests-jitter 1000 to prevent thundering herd during restarts

Critical Production Configuration

gunicorn myapp.wsgi:application \
    --bind 0.0.0.0:8000 \
    --workers 9 \
    --worker-class sync \
    --timeout 120 \
    --max-requests 5000 \
    --max-requests-jitter 1000 \
    --preload \
    --access-logfile - \
    --error-logfile -
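
The same settings can be kept in a Python config file instead of command-line flags. This is a sketch that mirrors the command above one-to-one; the file name `gunicorn.conf.py` and the `myapp.wsgi:application` module path are carried over as assumptions about the project layout.

```python
# gunicorn.conf.py
# Run with: gunicorn -c gunicorn.conf.py myapp.wsgi:application

bind = "0.0.0.0:8000"       # --bind
workers = 9                 # --workers, (2 x 4 cores) + 1
worker_class = "sync"       # --worker-class
timeout = 120               # --timeout, in seconds
max_requests = 5000         # --max-requests, recycle workers to contain leaks
max_requests_jitter = 1000  # --max-requests-jitter, stagger the restarts
preload_app = True          # --preload
accesslog = "-"             # --access-logfile, "-" means stdout
errorlog = "-"              # --error-logfile, "-" means stderr
```

Keeping the file in version control makes changes to worker counts and timeouts reviewable.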

Infrastructure Requirements

Mandatory: Nginx frontend (never serve static files with Gunicorn)
Database Connections: Each worker requires dedicated DB connection
File Descriptors: Default Ubuntu 20.04 soft limit (1024) is insufficient under load; raise to 65536
Container Platform: Avoid Alpine Linux (signal handling issues), use debian-slim

Resource Requirements

Memory Calculations

Base App Size: Each worker loads complete application
Example: 200MB Django app × 10 workers = 2GB baseline memory
Preload Benefit: 30% memory reduction for typical Django apps
Memory Leak Pattern: Linear growth over time requiring worker restarts

Performance Benchmarks

Server	Performance (RPS)	Configuration Complexity	Reliability
Gunicorn	~2,000	Low (2-3 SO tabs)	High
uWSGI	~4,000	Extreme (15+ SO tabs)	Medium
Uvicorn	~6,000+	Low (async only)	Medium
Waitress	~1,500	Minimal (Windows)	High

Time Investment

Gunicorn Setup: 1-2 hours for production deployment
uWSGI Alternative: 3+ days for equivalent reliability
Learning Curve: Minimal documentation requirements

Critical Warnings

Production Failure Modes

Worker Timeout: Default 30 seconds kills long-running requests
- Impact: Complete request failure
- Frequency: Common with slow database queries
- Solution: Increase --timeout or fix slow code
Memory Exhaustion: Workers accumulate memory, never release
- Pattern: 200MB → 250MB → 320MB → 400MB → OOM kill
- Frequency: Inevitable with memory leaks
- Mitigation: --max-requests cycling
File Descriptor Exhaustion: Silent connection failures at 1024 limit
- Impact: New connections rejected without error logs
- Detection: Difficult (fails silently)
- Solution: Increase ulimit to 65536
Docker Signal Handling: SIGTERM ignored in Alpine containers
- Impact: Graceful restarts become hard kills
- Frequency: Consistent with Alpine base images
- Solution: Switch to debian-slim base

Database Connection Scaling

Formula: Workers × Services = Total Connections Required
PostgreSQL Default: 100 max connections
Breaking Point: 20 workers × 5 services = 100 connections (at limit)
Scaling Constraint: Database connection pool becomes bottleneck
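
A sketch of the connection budget above. The function names are hypothetical, and the `reserved` default of 3 reflects PostgreSQL's default `superuser_reserved_connections`; treat both the one-connection-per-worker model and the defaults as assumptions to verify against the actual deployment.

```python
def total_connections(workers_per_service: int, services: int) -> int:
    """Connections needed if every worker holds one database connection."""
    return workers_per_service * services


def fits_connection_limit(
    workers_per_service: int,
    services: int,
    max_connections: int = 100,
    reserved: int = 3,
) -> bool:
    """True if the worker fleet fits within the usable connection slots.

    reserved models slots the database keeps for superusers (assumed to be
    PostgreSQL's default of 3).
    """
    usable = max_connections - reserved
    return total_connections(workers_per_service, services) <= usable


if __name__ == "__main__":
    print(total_connections(20, 5))      # 100 connections required
    print(fits_connection_limit(20, 5))  # at the limit, so no headroom
    print(fits_connection_limit(10, 5))
```

With reserved slots taken into account, 20 workers × 5 services exceeds the usable limit rather than just reaching it.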

Framework Compatibility Reality

Django: Blocking ORM calls make async workers ineffective
Flask: Synchronous by design, async workers provide no benefit
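FastAPI: ASGI framework; under Gunicorn it requires an ASGI worker class (for example uvicorn.workers.UvicornWorker), and performance gains depend on non-blocking code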
Performance Trap: Async workers with blocking code reduce performance

Decision Criteria

When to Choose Gunicorn

Reliability Priority: Consistent uptime over peak performance
Team Expertise: Limited DevOps experience
Framework: Django, Flask, traditional WSGI applications
Scale Requirements: Under 10,000 concurrent users

When to Avoid Gunicorn

WebSockets: No native support, use Uvicorn/Daphne
Windows Deployment: Unix-only (fork() dependency)
Massive File Uploads: Worker model inefficient for large files
Container Constraints: High worker count in memory-limited containers

Alternative Selection Matrix

Higher Performance + Complexity: uWSGI (4,000 RPS, complex config)
Async Applications: Uvicorn (6,000+ RPS, async-native)
Windows Requirement: Waitress (1,500 RPS, cross-platform)
Corporate Apache: mod_wsgi (variable performance, policy compliance)

Implementation Reality

Deployment Architecture

Internet → Nginx (Port 80/443) → Gunicorn (Port 8000) → Python App

Nginx Responsibilities:

Static file serving (microsecond response)
SSL termination
Request buffering
Gzip compression

Gunicorn Responsibilities:

Python application execution only

Common Misconceptions

Static File Serving: "Gunicorn should serve static files" (it can, but it is inefficient and ties up workers)
Performance Scaling: "More workers always improve performance" (false beyond memory limits)
Async Benefits: "Async workers improve all Python apps" (false with blocking operations)
Configuration Complexity: "Simple server means simple scaling" (worker tuning requires iteration)

Breaking Changes and Migration

Version 21.x → 22.x: Stricter HTTP parsing breaks some mobile clients
Impact: Requests that previously worked become rejected
Detection: Check logs post-upgrade for parsing errors
Frequency: Affects applications with non-standard client implementations

Container-Specific Considerations

Worker Density: Reduce workers per container in multi-container deployments
Example Problem: 10 containers × 8 workers = 80 workers competing for 4 cores
Solution: 2-4 workers per container, scale horizontally
Signal Handling: Docker SIGTERM timeout (10 seconds) insufficient for graceful shutdown

Operational Intelligence

Debugging Production Issues

Log Analysis Priority: journalctl -u service-name before worker tuning
Memory Leak Detection: Monitor worker RSS growth over time
Performance Bottlenecks: Database queries before worker count
Connection Issues: File descriptor limits before network configuration

Monitoring Indicators

Worker Restart Frequency: High frequency indicates memory leaks or crashes
Response Time Distribution: Timeouts indicate worker exhaustion or slow code
Memory Growth Rate: Linear growth confirms leak presence
Connection Refused Errors: File descriptor or worker exhaustion

Cost-Benefit Analysis

Development Time: Minimal compared to alternatives
Operational Overhead: Low maintenance requirements
Performance Trade-off: 20-50% slower than optimized alternatives
Reliability Benefit: Predictable behavior under load
Team Productivity: Reduced troubleshooting time versus complex alternatives

Version and Security Considerations

Current Stable: 23.x series (as of 2025)
Minimum Python: 3.7+ (3.7 itself reached end-of-life in June 2023)
Security Updates: Maintainers responsive to vulnerability reports
Upgrade Cadence: Annual major versions, quarterly patch releases
Breaking Change Frequency: Minimal, configuration-compatible across minor versions

Useful Links for Further Investigation

Resources That Actually Help

Link	Description
Official Gunicorn Docs	Surprisingly doesn't suck, which is rare for Python documentation. Has actual nginx configs you can steal.
GitHub Repository	Where you go when stuff breaks. Maintainers actually respond instead of ghosting you.
Digital Ocean Django Tutorial	I've followed this tutorial at least 15 times. Actually works instead of being complete garbage.
Real Python Gunicorn Article	Explains the "why" instead of just copy-paste commands. Revolutionary concept, apparently.
Stack Overflow Gunicorn Questions	Where the real answers live. Sort by votes, ignore anything older than 2020.
Docker Official Python Images	Use these as base images. They have sensible defaults.
Kubernetes Deployment Examples	Not Gunicorn-specific, but shows you how to deploy Python apps properly.
Gunicorn releases	Check this page for security updates. The maintainers are pretty good about patching stuff quickly.
GitHub security tab	Use this tab for reporting security issues instead of email, as it is a more reliable method for communication.
