PostgreSQL pg_basebackup: AI-Optimized Technical Reference
Core Function
pg_basebackup creates physical backups of PostgreSQL clusters by copying actual data files (not SQL dumps) while the database is running. Uses PostgreSQL's replication protocol to create consistent backups without downtime.
Critical Performance Specifications
Backup Time Comparison (500GB Database)
- pg_basebackup: 2-4 hours with rate limiting
- pg_dump: 8-12 hours (SQL dump approach)
- pgBackRest: 45 minutes (enterprise tool)
- Barman: 2 hours (backup manager)
Production Impact During Backup
- Without rate limiting: Destroys production performance (100% CPU, API timeouts)
- With 50MB/s limit: 30% CPU usage, no user impact, 2.5 hours backup time
- With compression: 60% space savings, single CPU core maxed out, 3.5 hours total
Storage Requirements
- Uncompressed: 1x database size
- With gzip: ~0.6x database size (built-in compression is single-threaded)
- Need 2x database size free space for safe backup operations
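A quick sanity check of the 2x rule before kicking off a backup (minimal sketch; connection details and the /backup path are placeholders):
# total on-disk size of every database in the cluster
psql -Atc "SELECT pg_size_pretty(sum(pg_database_size(datname))) FROM pg_database;"
# free space on the backup target -- want at least 2x the number above
df -h /backup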
Configuration Requirements
Essential PostgreSQL Settings
# CRITICAL: too few WAL senders and the backup cannot connect
max_wal_senders = 5 # must cover existing replicas + 2 (with -X stream the backup itself uses two connections)
# MUST be replica or higher (minimal disables replication connections, so pg_basebackup cannot run)
wal_level = replica
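Worth confirming what the running server actually has before the first backup attempt (assumes local superuser psql access):
psql -Atc "SHOW wal_level;"        # must report replica or logical
psql -Atc "SHOW max_wal_senders;"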
Authentication Setup
-- Backup user requires REPLICATION privilege
CREATE USER backup_user WITH REPLICATION LOGIN PASSWORD 'secure_password';
# pg_hba.conf entry required (prefer scram-sha-256 over md5 on PostgreSQL 14+, where it is the default)
host replication backup_user 10.0.0.0/8 md5
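After editing pg_hba.conf, a reload applies the new rule without a restart. On PostgreSQL 10+ the pg_hba_file_rules view shows whether the file parsed cleanly (sketch, run as a superuser on the server):
psql -c "SELECT pg_reload_conf();"
# list replication entries and any parse errors in pg_hba.conf
psql -c "SELECT line_number, type, user_name, address, auth_method, error FROM pg_hba_file_rules WHERE 'replication' = ANY(database) OR error IS NOT NULL;"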
Production-Safe Commands
Basic Backup (Rate Limited)
pg_basebackup -h db-server -D /backup/postgres -U backup_user \
--max-rate=50M -P -v -X stream
Large Database Backup (>500GB)
pg_basebackup -h db-server -D /backup/postgres -U backup_user \
--max-rate=100M -Ft -z -X stream -P -v
Emergency Backup (No Rate Limiting)
pg_basebackup -h db-server -D /emergency-backup -U backup_user \
-P -v -X stream --no-sync
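Two notes on the commands above: --no-sync skips fsync of the written files, so sync the backup target before trusting the emergency copy; and on PostgreSQL 13+ the server exposes backup progress, which is the sane way to track a multi-hour run (query below is a sketch):
# run on the server while the backup is in flight (PostgreSQL 13+)
psql -c "SELECT phase, pg_size_pretty(backup_streamed) AS streamed, round(100.0 * backup_streamed / nullif(backup_total, 0), 1) AS pct_done FROM pg_stat_progress_basebackup;"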
Critical Failure Modes
"Number of requested standby connections exceeds max_wal_senders"
- Cause: max_wal_senders too low for existing replicas + backup process
- Impact: Backup fails immediately
- Solution: Increase max_wal_senders or stop non-essential replicas
- Check: SELECT * FROM pg_stat_replication;
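A quick way to see how many sender slots are actually free before starting (remember a backup with -X stream needs two):
psql -Atc "SELECT current_setting('max_wal_senders')::int - count(*) AS free_wal_sender_slots FROM pg_stat_replication;"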
Backup Completes but is Corrupted
- Cause: WAL files get out of sync during backup
- Impact: Backup appears successful but is unusable
- Solution: Always use -X stream (streams WAL during the backup)
- Avoid: -X fetch (fetches WAL after the backup - race condition prone)
90% Completion Failures
- Primary causes: Network timeouts, disk space exhaustion
- Impact: Hours of backup time wasted
- Mitigation: Monitor disk space with watch -n 5 df -h /backup
- Network: Raise wal_sender_timeout for slow WAN links (--checkpoint=spread only softens the checkpoint I/O spike at backup start; it does not protect against connection drops)
Rate Limiting Bypassed
- Issue: WAL streaming ignores rate limits
- Impact: Full bandwidth saturation despite rate limiting
- High-transaction DBs: Can generate hundreds of MB to several GB of WAL per hour, all of it outside the rate limit
- Monitor: SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), '0/0')/1024/1024 AS mb_generated;
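That query is a cumulative total; to judge whether WAL streaming will swamp the link you need a rate. Rough sampling sketch over a 60-second window (psql connection settings assumed):
A=$(psql -Atc "SELECT pg_current_wal_lsn();")
sleep 60
B=$(psql -Atc "SELECT pg_current_wal_lsn();")
psql -Atc "SELECT round(pg_wal_lsn_diff('$B', '$A') / 1024 / 1024, 1) AS wal_mb_per_minute;"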
Version-Specific Issues
PostgreSQL 13
- Change: Backup manifests (backup_manifest) are generated by default
- Impact: Backup scripts that don't expect the extra file, or that parse backup contents, can break
- Workaround: Update scripts or use --no-manifest
PostgreSQL 15
- New feature: Built-in compression options (client- and server-side gzip, lz4, zstd via --compress)
- Reality: gzip compression is single-threaded and often slower than piping tar output through an external compressor
- Recommendation: Use external compression tools, or zstd with its workers option on 15+
PostgreSQL 17
- New feature: Incremental backups (--incremental, requires summarize_wal = on; restored with pg_combinebackup)
- Critical issues: First-release implementation with reported bugs, including corrupt incremental manifests
- Block tracking overhead: WAL summarization adds continuous background overhead
- Recommendation: Use pgBackRest for reliable incrementals instead
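For reference, the PostgreSQL 17 incremental flow looks roughly like this - a sketch to evaluate, not an endorsement, and the paths are placeholders:
# requires summarize_wal = on in postgresql.conf before the full backup is taken
pg_basebackup -h db-server -U backup_user -D /backup/full -X stream -P
# incremental run referencing the previous backup's manifest
pg_basebackup -h db-server -U backup_user -D /backup/incr1 -X stream -P \
 --incremental=/backup/full/backup_manifest
# reconstruct a restorable data directory from the chain (oldest first)
pg_combinebackup /backup/full /backup/incr1 -o /backup/restored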
Resource Requirements
Time Investment
- Initial setup: 30 minutes (assuming no configuration issues)
- First production backup: 2-4 hours for debugging and tuning
- Monthly restore testing: 2 hours (mandatory for validation)
Expertise Requirements
- Basic usage: Junior DBA level
- Production troubleshooting: Senior DBA level (networking, PostgreSQL internals)
- Performance optimization: Expert level (I/O tuning, replication protocol knowledge)
Infrastructure Requirements
- Network: Dedicated backup network recommended for large databases
- Storage: 2x database size for backup target
- CPU: Minimal impact with proper rate limiting
- Monitoring: Essential for large databases (backup can take hours)
Operational Intelligence
When pg_basebackup is Wrong Tool
- Cross-version restores: Use pg_dump instead
- Logical replication: pg_dump provides more flexibility
- Cloud managed services: AWS RDS, Google Cloud SQL, Azure - use their backup systems
- Need real incrementals: pgBackRest or Barman required
Hidden Costs
- Network bandwidth: Can saturate links without rate limiting
- I/O impact: Even rate-limited backups stress storage subsystem
- Backup validation: Requires separate infrastructure for restore testing
- Monitoring complexity: Large backups need progress tracking
Production Gotchas
- Silent permission failures: Exit code 0 even with failed backups
- Manifest corruption: PostgreSQL 13+ manifests can corrupt, invalidating entire backup
- Timeout scaling: As databases grow, wal_sender_timeout must increase
- Compression tradeoffs: Built-in compression is CPU-bound and slow
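For the wal_sender_timeout gotcha above, the timeout can be raised without a restart before a long run (the value is illustrative):
psql -c "ALTER SYSTEM SET wal_sender_timeout = '5min';"
psql -c "SELECT pg_reload_conf();"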
Tool Comparison Reality Check
| Tool | Setup Time | Reliability | Performance | Feature Completeness |
|---|---|---|---|---|
| pg_basebackup | 30 min | Good with caveats | Medium | Basic |
| pgBackRest | 2-4 hours | Excellent | High | Enterprise |
| Barman | 1-2 hours | Good | Medium | Good |
| pg_dump | 5 min | Excellent | Low (large DBs) | Limited scope |
Backup Validation Strategy
- File integrity: pg_verifybackup backup_directory (PostgreSQL 13+)
- Logical consistency: Only verifiable through actual restore
- Mandatory testing: Monthly restore to throwaway instance
- Automation: Use Testcontainers for CI/CD backup testing
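A minimal version of that monthly restore test, assuming a plain-format backup taken with -X stream; run as the postgres OS user, paths and port are placeholders:
pg_verifybackup /backup/postgres                     # checksums vs backup_manifest (PG 13+)
cp -a /backup/postgres /tmp/restore-test
chmod 700 /tmp/restore-test
pg_ctl -D /tmp/restore-test -o "-p 5499" -l /tmp/restore-test.log start
psql -p 5499 -d postgres -Atc "SELECT pg_is_in_recovery(), now();"   # spot-check it came up
pg_ctl -D /tmp/restore-test stop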
Critical Warnings
What Documentation Doesn't Tell You
- Rate limiting only applies to data files, not WAL streaming
- Successful backup status doesn't guarantee usable backup
- Version compatibility is strict - no cross-major-version restores
- Managed cloud services block pg_basebackup access entirely
Breaking Points
- Database size >1TB: Backup duration may exceed timeout windows
- High transaction rate: WAL generation can overwhelm network
- Network instability: Backup process is sensitive to connection drops
- Storage I/O limits: Can bottleneck entire backup process
Migration Considerations
- From pg_dump: Massive speed improvement but less flexibility
- To pgBackRest: Complex setup but superior features and reliability
- Cloud migration: pg_basebackup becomes unavailable, plan backup strategy change
Decision Criteria
Use pg_basebackup When:
- Database size >100GB (where pg_dump becomes impractical)
- Same major version backup/restore required
- Simple backup requirements
- Limited expertise/time for complex tool setup
Avoid pg_basebackup When:
- Need cross-version compatibility
- Require reliable incremental backups
- Using managed cloud PostgreSQL services
- Need enterprise backup features (encryption, deduplication)
- Database has high transaction rate (WAL streaming issues)
Worth the Complexity Despite:
- Confusing error messages that waste troubleshooting time
- Version-specific quirks and breaking changes
- Network bandwidth consumption
- Need for separate restore testing infrastructure
The tool is reliable for basic physical backups when properly configured, but lacks enterprise features and has sharp edges that require operational expertise to navigate safely.
Useful Links for Further Investigation
Resources That Actually Help (And Which Ones Suck)
Link | Description |
---|---|
pg_basebackup Official Documentation | The actual command reference. Dry as hell but complete. Start here for flag definitions. |
Continuous Archiving and PITR | How to set up point-in-time recovery. Essential reading if you want to sleep at night. |
Streaming Replication Protocol | Technical details. Only read this if you're debugging connection issues or you enjoy pain. |
PostgreSQL Wiki: Backup Ecosystem | Community-maintained and updated by people who actually use this stuff. Much more practical than official docs. |
Pythian PITR Guide | One of the few guides that shows you how to recover from real failure scenarios. |
pgBackRest | The backup tool you'll eventually switch to. Documentation is confusing but the tool is solid. Parallel backups, real incrementals, proper compression. |
Barman | 2ndQuadrant's backup manager. Has a web interface which is nice if you're not a command line person. Slower than pgBackRest but easier to set up. |
WAL-G | If you're backing up to cloud storage and want something faster than the built-in tools. Written in Go, actually maintained. |
AWS RDS PostgreSQL Backups | You can't use pg_basebackup with RDS. Use their snapshot system instead. |
Google Cloud SQL PostgreSQL | Same deal - no pg_basebackup access. Their backup system works fine though. |
Azure Database for PostgreSQL | Another managed service where pg_basebackup doesn't work. Use their built-in backups. |
pg_stat_progress_basebackup | Query this view to see how much of your backup is actually done. Essential for large databases. |
PostgreSQL Performance Blog | Cybertec's blog has actual performance testing of backup tools. Read their pgBackRest vs pg_basebackup comparisons. |
PostgreSQL GitHub Issues | Search for "pg_basebackup" to find actual bugs and workarounds |
pgBackRest Issues | Better maintained than you'd expect. Developers actually respond. |
PostgreSQL Docker Official Images | Good for testing backup/restore procedures without affecting production. The Alpine images start faster. |
Testcontainers | If you want to automate backup testing in CI/CD. Worth the complexity for critical systems. |
Related Tools & Recommendations
pg_dumpall - Back up entire PostgreSQL clusters
The nuclear option for PostgreSQL backups - gets everything or nothing
Set Up PostgreSQL Streaming Replication Without Losing Your Sanity
PostgreSQL WAL Tuning - Stop Getting Paged at 3AM
The WAL configuration guide for engineers who've been burned by shitty defaults