Memory fragmentation is that sneaky bastard that kills your app while you're sleeping. You provision a 16GB Redis instance, store maybe 6GB of data, and somehow you're still getting OOM kills. What the fuck?
Black Friday 2022 - that's when I learned this lesson. We had 16GB allocated, around 4GB of actual data (I think it was 4.2GB but who's counting), and Redis kept getting murdered by the OOM killer. Spent the entire night debugging this shit. Turns out we had a fragmentation ratio of 3.4 - basically paying for 16GB to store less than 5GB of data. Math is fun when it's costing you money.
Redis uses jemalloc for memory allocation. It's supposed to be better than the standard malloc, right? Well, when your app dumps 500KB user profiles right next to tiny 50-byte session tokens, jemalloc creates a fucking mess. Memory looks like Swiss cheese - holes everywhere that can't be reused.
Understanding the Fragmentation Ratio
The fragmentation ratio is calculated as used_memory_rss / used_memory, where:
- used_memory_rss is the actual RAM the OS has allocated to the Redis process
- used_memory is Redis's own view of the memory used by stored data
## Check your current fragmentation ratio
redis-cli INFO memory | grep fragmentation
mem_fragmentation_ratio:2.34
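mem_fragmentation_ratio is the pre-computed version of that division. If you're scripting your own checks you can pull the raw fields and do the math yourself - the field names come straight from INFO memory, the shell glue is mine:
## Compute the ratio from the raw fields
redis-cli INFO memory | tr -d '\r' \
  | grep -E '^(used_memory|used_memory_rss):' \
  | cut -d: -f2 | paste - - \
  | awk '{ printf "ratio: %.2f\n", $2 / $1 }'
## INFO prints used_memory before used_memory_rss, so $2/$1 is rss/used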
What These Numbers Actually Mean:
- Ratio 1.0-1.3: Healthy memory usage, minimal fragmentation
- Ratio 1.3-1.5: Moderate fragmentation, monitor closely
- Ratio 1.5-2.0: Serious fragmentation, performance impact likely
- Ratio >2.0: Critical fragmentation, immediate action required
A ratio over 2.0 means more than half the RAM Redis is holding isn't your data. That 3.4 ratio I mentioned? 3.4 bytes of RSS for every byte of actual data - 240% overhead to store basic user sessions and product data. Math is fun when it's costing you money.
The Real Causes of Memory Fragmentation
Contrary to what the Redis documentation suggests, fragmentation isn't just about "allocating and freeing objects of different sizes." Here's what actually fragments Redis memory in production:
Variable-Size Key Expiration Patterns
When you have mixed workloads with different key sizes and TTL patterns, memory gets fragmented as smaller keys expire between larger ones:
## This is exactly what killed us in production
SET large_user_profile:12345 "{ massive JSON object 500KB }"
SET session:abc "small session token"
SET large_user_profile:67890 "{ another massive JSON 500KB }"
EXPIRE session:abc 300 # Small key expires, leaves gap
## After expiration, you have: [500KB][gap][500KB]
## New 1MB allocation can't fit in the gap - fragments memory
Hash Resizing Under Load
Redis hashes automatically resize when they grow, but rehashing temporarily doubles memory usage and can leave fragmented blocks:
## Monitor hash resizing causing fragmentation
redis-cli --latency-history -i 1
## Look for latency spikes during high write volume
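You can also check where a given hash sits relative to the conversion threshold - once it flips from the compact listpack encoding to a real hashtable, every growth spurt doubles the bucket array and rehashes incrementally. A sketch (the key name is made up; on Redis 7.x the setting is hash-max-listpack-entries, older versions call it hash-max-ziplist-entries):
## Threshold where small hashes convert to a real hashtable
redis-cli CONFIG GET hash-max-listpack-entries
## Current encoding of one of your hashes (key name is just an example)
redis-cli OBJECT ENCODING cart:12345
## "listpack" = one compact allocation, "hashtable" = bucket array that
## doubles and rehashes as it grows - that's where the fragmentation comes from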
List and Stream Operations
Redis Lists and Streams are particularly fragmentation-prone because they allocate memory in chunks. When you trim lists or expire stream entries, the freed chunks often can't be reused efficiently.
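One thing that actually helps here is approximate trimming - the `~` modifier on XADD/XTRIM tells Redis to only drop whole macro-nodes instead of slicing them up, which is much friendlier to the allocator. A sketch with made-up key names and sizes:
## Exact trim - can split internal nodes and leave awkward leftover chunks
XTRIM events:clicks MAXLEN 100000
## Approximate trim - only drops whole macro-nodes, friendlier to the allocator
XADD events:clicks MAXLEN ~ 100000 * user_id 42 action click
XTRIM events:clicks MAXLEN ~ 100000
## Same idea for capped lists: push, then trim to a fixed window
LPUSH recent:logins 42
LTRIM recent:logins 0 9999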
Advanced Fragmentation Diagnosis
Memory fragmentation visualization: allocated blocks scattered across the address space over time, with unusable gaps between them that the allocator can't reuse.
The basic INFO memory command doesn't tell the whole story. Use these Redis commands for detailed analysis:
## Get comprehensive memory statistics
redis-cli MEMORY STATS
## Sample output showing fragmentation sources:
total.allocated: 8589934592 # 8GB allocated
dataset.bytes: 6442450944 # 6GB actual data
dataset.percentage: 75.0 # 75% efficiency
fragmentation.bytes: 2147483648 # 2GB fragmented
fragmentation.ratio: 1.33
## Analyze specific key memory usage
redis-cli MEMORY USAGE user:profile:12345
(integer) 524288 # This key uses 512KB
## Find memory-hungry keys (samples keys with MEMORY USAGE)
redis-cli --memkeys --memkeys-samples 10000
The MEMORY DOCTOR command provides automated analysis, but it often misses cluster-specific fragmentation patterns.
Jemalloc vs. System Allocators
Redis uses jemalloc by default, which is generally better at handling fragmentation than glibc malloc, but it's not magic. Jemalloc's effectiveness depends on your allocation patterns:
## Check which allocator Redis is using
redis-cli INFO memory | grep allocator
mem_allocator:jemalloc-5.3.0
## Force different allocators during compilation
make MALLOC=libc # Use system malloc (worse fragmentation)
make MALLOC=jemalloc # Use jemalloc (default, better)
In production, jemalloc beats system malloc for fragmentation handling, but when you're constantly pushing 64KB-1MB objects, both allocators will fragment your memory to hell. Tested this on Redis 7.0.8 last month - same shit, different version number.
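If you're on Redis 4+ with jemalloc, INFO memory also splits the overhead up, which tells you whether the problem is fragmentation inside the allocator or just pages jemalloc hasn't handed back to the OS yet:
## Break "fragmentation" down into its actual components
redis-cli INFO memory | grep -E 'allocator_frag|allocator_rss|rss_overhead'
## allocator_frag_ratio - waste inside jemalloc (the real fragmentation)
## allocator_rss_ratio  - pages jemalloc is holding but not using
## rss_overhead_ratio   - everything else the OS charges to the process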
Memory Fragmentation in Redis Clusters
Cluster deployments fragment memory differently than standalone instances. Slot migration operations temporarily double memory usage for migrated keys, and failed migrations leave orphaned allocations:
## Check cluster-specific fragmentation
for node in redis-node-{1..6}; do
echo "=== $node ==="
redis-cli -h $node INFO memory | grep -E "(fragmentation|used_memory)"
done
## Look for nodes with significantly different fragmentation ratios
## This indicates uneven slot distribution or failed migrations
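To catch migrations that never finished, CLUSTER NODES flags slots that are still mid-move with importing/migrating brackets on the node you query. A rough check, reusing the same made-up node names as above:
## Slots stuck in importing/migrating state show up as bracketed entries
for node in redis-node-{1..6}; do
  echo "=== $node ==="
  redis-cli -h $node CLUSTER NODES | grep -o '\[[^]]*\]'
done
## Any output at all means a slot migration is (or was left) in flight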
The Active Defragmentation Trap
Redis 4+ includes active defragmentation, which sounds like a silver bullet but can actually make problems worse:
## Active defrag configuration (be careful!)
CONFIG SET activedefrag yes
CONFIG SET active-defrag-ignore-bytes 100mb
CONFIG SET active-defrag-threshold-lower 10
CONFIG SET active-defrag-cycle-min 1
CONFIG SET active-defrag-cycle-max 25
Why Active Defrag Often Backfires:
- Blocks the main thread: Defragmentation pauses command processing
- Triggers timeouts in clusters: Other nodes think the defragging node has failed
- CPU intensive: Can cause thermal throttling on cloud instances
- Temporary fragmentation increase: Moving memory around fragments it more initially
Active defragmentation during high traffic is like doing surgery on a running engine while it's on fire. We enabled it during a 2am production incident thinking it would help - made everything ten times worse. Latency spiked, timeouts everywhere, the whole cluster went to shit. Took us offline for another 30 minutes.
The Redis docs don't bother mentioning that defrag blocks the main thread. Thanks, guys.
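If you do turn it on anyway, at least watch whether it's making progress instead of just burning CPU. These counters are standard INFO fields; what counts as "progress" is your call:
## Is a defrag cycle running right now?
redis-cli INFO memory | grep active_defrag_running
## Is it actually reclaiming anything?
redis-cli INFO stats | grep -E 'active_defrag_(hits|misses|key_hits|key_misses)'
## Hits climbing while the fragmentation ratio drops = it's working
## Misses climbing with a flat ratio = you're paying CPU for nothing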
Memory Fragmentation Monitoring Alerts
Set up monitoring on these fragmentation indicators before problems occur:
## Critical alerting thresholds
mem_fragmentation_ratio > 1.5 # Memory efficiency dropping
used_memory_rss > 80% of available RAM # Running out of physical RAM
mem_fragmentation_bytes > 1GB # Absolute waste is significant
## Warning thresholds
mem_fragmentation_ratio > 1.3 # Early fragmentation warning
latest_fork_usec > 10000 # Fork operations becoming slow (indicates memory pressure)
Wire these thresholds into whatever metric pipeline you already run - the point is to catch the trend before the OOM killer does.
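No metrics pipeline? Even a dumb cron-able check beats finding out from the OOM killer. A minimal sketch using the 1.5 threshold from above - adjust to taste:
#!/usr/bin/env bash
## Crude fragmentation check - exits non-zero so cron/your pager can pick it up
THRESHOLD=1.5
ratio=$(redis-cli INFO memory | tr -d '\r' \
  | awk -F: '/^mem_fragmentation_ratio:/ { print $2 }')
if awk -v r="$ratio" -v t="$THRESHOLD" 'BEGIN { exit !(r + 0 > t + 0) }'; then
  echo "WARNING: mem_fragmentation_ratio=$ratio exceeds $THRESHOLD"
  exit 1
fi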
When to Restart vs. Fix Fragmentation
Sometimes restarting Redis is the fastest solution, but it's not always feasible:
Restart When:
- Fragmentation ratio >2.5 and climbing
- Active defragmentation doesn't help after 24 hours
- Memory efficiency drops below 50%
- You can afford the downtime (seconds for small datasets, minutes for large)
Try to Fix When:
- Fragmentation ratio 1.5-2.5 and stable
- Production system can't afford restart
- You have replica nodes that can take over temporarily
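If you're in the "try to fix" bucket, there are a couple of levers short of a full restart. MEMORY PURGE asks jemalloc to hand unused dirty pages back to the OS - it only helps with allocator-held pages, not true fragmentation, and it does nothing on non-jemalloc builds. The host names and the "mymaster" name below are placeholders:
## 1. Ask jemalloc to release unused pages back to the OS (cheap, worth trying)
redis-cli MEMORY PURGE
redis-cli INFO memory | grep mem_fragmentation_ratio
## 2. If the ratio barely moves, promote a replica (its memory is freshly
##    laid out) and restart the old primary during quiet hours
## Cluster: run on the replica you want promoted
redis-cli -h replica-node CLUSTER FAILOVER
## Sentinel: ask a sentinel to fail over the monitored master
redis-cli -h sentinel-node -p 26379 SENTINEL FAILOVER mymaster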
Nobody tells you this: once you have severe fragmentation, fixing it without a restart is like trying to unscramble eggs. I've wasted entire weekends trying to fix fragmentation on running instances. Don't be me.
RedisInsight helps visualize the fragmentation, but by the time you're pulling up pretty charts, you're already fucked.
Anyway, diagnosing fragmentation is just step one. Next up - stopping the OOM killer from murdering your Redis instance at 3am.