Django memory leaks don't announce themselves - they slowly consume RAM until your server runs out and the kernel starts killing processes. Here's how to catch them before they kill your app: memory profiling with tracemalloc, per-request production monitoring, garbage collection tuning, and fixing the QuerySet patterns that cause most leaks in the first place.
The Django Memory Leak You'll Actually Encounter
Forget the textbook examples. Here's the memory leak that brought down our production API serving 2 million requests per day:
```python
# views.py - This innocent code consumed 8GB RAM in 6 hours
import csv

from django.http import HttpResponse

# User is your project's user model (assumed to have email and created_at)

def export_users(request):
    """Export all users to CSV - looks harmless, right?"""
    users = User.objects.all()  # Don't fucking do this
    response = HttpResponse(content_type='text/csv')
    writer = csv.writer(response)
    for user in users:  # Memory leak: entire queryset loaded into RAM
        writer.writerow([user.email, user.created_at])
    return response
```
What happened: User.objects.all() loads every user into memory at once. With 500,000 users, that's 4-6GB of RAM. The server had 8GB total, so this single request killed everything else running.
The fix that actually works:
```python
# views.py - Memory-efficient version that scales
import csv

from django.http import HttpResponse

def export_users(request):
    """Export users without killing the server"""
    response = HttpResponse(content_type='text/csv')
    writer = csv.writer(response)
    # Iterator doesn't load everything into memory
    users = User.objects.all().iterator(chunk_size=1000)
    for user in users:
        writer.writerow([user.email, user.created_at])
    return response
```
The iterator() method with chunk_size=1000 processes 1000 records at a time, keeping memory usage roughly constant regardless of dataset size.
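One caveat: HttpResponse still accumulates the entire CSV in memory before sending it, so a very large export can still hurt. Here's a minimal sketch of a fully streaming variant using Django's StreamingHttpResponse and the pseudo-buffer pattern from the Django docs (export_users_streaming and Echo are illustrative names, not code from the original view):

```python
# Sketch: stream the CSV row by row so neither the queryset nor the
# generated file ever sits in RAM as a whole.
import csv

from django.http import StreamingHttpResponse

class Echo:
    """A writer-like object that just returns the value it is asked to write."""
    def write(self, value):
        return value

def export_users_streaming(request):
    writer = csv.writer(Echo())
    rows = (
        writer.writerow([user.email, user.created_at])
        for user in User.objects.all().iterator(chunk_size=1000)
    )
    response = StreamingHttpResponse(rows, content_type='text/csv')
    response['Content-Disposition'] = 'attachment; filename="users.csv"'
    return response
```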
QuerySet Memory Traps That Kill Production
Django's ORM makes it easy to write queries that consume ridiculous amounts of memory. Here are the patterns that will fuck you over:
The Prefetch That Ate Everything
```python
# BAD - Loads 10,000 comments into memory per article
articles = Article.objects.prefetch_related('comments').all()
for article in articles:
    print(f"{article.title}: {article.comments.count()} comments")
```
If you have 100 articles with 10,000 comments each, you just loaded 1 million comment objects into RAM. The solution is Prefetch objects with limits:
```python
# GOOD - Limits memory usage with selective prefetching
from django.db.models import Prefetch

articles = Article.objects.prefetch_related(
    Prefetch(
        'comments',
        # Slicing a Prefetch queryset requires Django 4.2+
        queryset=Comment.objects.order_by('-created_at')[:10],
        to_attr='recent_comments'
    )
).all()

for article in articles:
    print(f"{article.title}: {len(article.recent_comments)} recent comments")
```
The Innocent Aggregate That Wasn't
```python
# This looks safe but it's not
from django.db.models import Count, Sum

user_stats = User.objects.annotate(
    total_orders=Count('orders'),
    total_spent=Sum('orders__total')
).all()

# Memory explodes when you iterate
for user in user_stats:  # Loads entire result set
    print(f"{user.email}: ${user.total_spent}")
```
The annotation itself isn't the problem - iterating the queryset is, because Django caches the full result set in memory. Use iterator() for large annotated queries:
```python
# Safe version that won't kill your server
from django.db.models import Count, Sum

user_stats = User.objects.annotate(
    total_orders=Count('orders'),
    total_spent=Sum('orders__total')
).iterator(chunk_size=500)

for user in user_stats:
    print(f"{user.email}: ${user.total_spent}")
```
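If you only need a couple of fields, you can shave memory further by skipping model instantiation entirely - a small sketch using values_list() on the same assumed email/orders schema:

```python
# Sketch: values_list() + iterator() yields plain tuples instead of full
# User instances, which cuts per-row memory overhead (same assumed schema).
from django.db.models import Count, Sum

user_stats = (
    User.objects
    .annotate(total_orders=Count('orders'), total_spent=Sum('orders__total'))
    .values_list('email', 'total_spent')
    .iterator(chunk_size=500)
)

for email, total_spent in user_stats:
    print(f"{email}: ${total_spent}")
```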
Production Memory Monitoring That Actually Helps
Don't wait for your app to crash. Here's monitoring that catches memory issues early:
```python
# Memory monitoring middleware for production
import logging
import tracemalloc

import psutil
from django.core.cache import cache

class MemoryMonitoringMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response
        tracemalloc.start()  # Enable memory tracking
        self.logger = logging.getLogger(__name__)

    def __call__(self, request):
        # Snapshot memory before request
        snapshot_before = tracemalloc.take_snapshot()
        process = psutil.Process()
        memory_before = process.memory_info().rss

        response = self.get_response(request)

        # Check memory after request
        snapshot_after = tracemalloc.take_snapshot()
        memory_after = process.memory_info().rss
        memory_delta = (memory_after - memory_before) / 1024 / 1024  # MB

        # Alert on suspicious memory usage
        if memory_delta > 100:  # More than 100MB per request
            top_stats = snapshot_after.compare_to(snapshot_before, 'lineno')
            leak_info = []
            for stat in top_stats[:5]:
                size_mb = stat.size_diff / 1024 / 1024
                leak_info.append(f"  {size_mb:.1f}MB: {stat.traceback.format()[-1]}")
            self.logger.error(
                f"Memory leak detected: {request.path} used {memory_delta:.1f}MB\n"
                "Top allocations:\n" + "\n".join(leak_info)
            )

        # Track memory trends
        if hasattr(request, 'user') and request.user.is_authenticated:
            cache_key = f"memory_usage_{request.user.id}_{request.path}"
            # Store memory usage for trend analysis
            cache.set(cache_key, memory_delta, 300)  # 5 minutes

        return response
```
This catches memory leaks in real time and logs the exact code causing the problem. Install it in MIDDLEWARE and check your logs for leak alerts.
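Wiring it up is just a MIDDLEWARE entry; the dotted path below assumes the class lives in a hypothetical myproject/middleware.py - adjust it to wherever you put the class:

```python
# settings.py - the module path is hypothetical; use your own
MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    # ... the rest of your middleware ...
    "myproject.middleware.MemoryMonitoringMiddleware",
]
```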
Garbage Collection Tuning for Django
Python's garbage collector can cause performance issues with large Django applications. Here's tuning that actually improves performance:
```python
# settings.py - GC tuning for production Django
import gc
import os

# Python's default thresholds are (700, 10, 10). Raising the generation-0
# threshold reduces GC frequency, which helps with high request volumes.
# For memory-intensive apps, tune more aggressively in production only:
if os.environ.get('DJANGO_ENV') == 'production':
    gc.set_threshold(1500, 15, 15)  # Less frequent GC, better performance
```
Monitor GC performance in production:
```python
# Add to your monitoring
import gc
import logging

logger = logging.getLogger(__name__)

def log_gc_stats():
    """Log garbage collection statistics"""
    stats = gc.get_stats()
    logger.info(f"GC Stats: {stats}")
    # Check for excessive garbage collection
    if stats[0]['collections'] > 10000:  # Adjust threshold based on your app
        logger.warning("High GC activity detected - potential memory leak")
```
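How you schedule the call is up to you; one minimal option (a sketch assuming nothing beyond the function above, with GCStatsMiddleware as an illustrative name) is to log every thousand requests from a tiny middleware:

```python
# Sketch: log GC stats every 1000 requests; the counter is per worker process
import itertools

class GCStatsMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response
        self._requests = itertools.count(1)

    def __call__(self, request):
        if next(self._requests) % 1000 == 0:
            log_gc_stats()
        return self.get_response(request)
```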
The Nuclear Option: Memory Leak Detection in Production
When everything else fails, here's the code that saved our production environment. It identified a Celery task that was leaking 200MB per execution:
```python
# Advanced memory leak detector - use only when desperate
import logging
import tracemalloc

import objgraph

logger = logging.getLogger(__name__)

class MemoryLeakHunter:
    def __init__(self):
        self.baseline = None
        self.request_count = 0

    def take_baseline(self):
        """Call this after app warmup"""
        tracemalloc.start()
        self.baseline = tracemalloc.take_snapshot()
        logger.info("Memory baseline established")

    def hunt_leaks(self, request):
        """Call this periodically to hunt for leaks"""
        self.request_count += 1
        if self.request_count % 100 == 0:  # Check every 100 requests
            current = tracemalloc.take_snapshot()
            if self.baseline:
                top_stats = current.compare_to(self.baseline, 'lineno')

                # Log top memory growth
                logger.info("Top memory growth since baseline:")
                for stat in top_stats[:10]:
                    size_mb = stat.size_diff / 1024 / 1024
                    logger.info(f"  +{size_mb:.1f}MB: {stat.traceback.format()[-1]}")

                # Check for concerning growth patterns
                total_growth = sum(stat.size_diff for stat in top_stats) / 1024 / 1024
                if total_growth > 500:  # More than 500MB growth
                    logger.error(f"MEMORY LEAK DETECTED: {total_growth:.1f}MB growth")
                    # Generate object growth report
                    objgraph.show_growth(limit=10)
```
This saved us when a third-party library was leaking Django model instances. The logs showed exactly which code was causing the growth, making it easy to isolate and fix.
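For context, we drove it from middleware. The snippet below is a rough sketch of that wiring (leak_hunter and LeakHuntingMiddleware are illustrative names, not the exact production code): one hunter per worker process, a baseline taken after a short warmup, then a check on every request.

```python
# Sketch: one MemoryLeakHunter per worker, driven by middleware
leak_hunter = MemoryLeakHunter()

class LeakHuntingMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response
        self._seen = 0

    def __call__(self, request):
        response = self.get_response(request)
        self._seen += 1
        if self._seen == 50:           # take the baseline after warmup
            leak_hunter.take_baseline()
        elif self._seen > 50:
            leak_hunter.hunt_leaks(request)
        return response
```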
Memory leaks in Django aren't mysterious - they're usually QuerySets loading too much data, circular references in model relationships, or third-party libraries not cleaning up properly. The key is monitoring memory usage per request and investigating anything that grows over time.