
What pg_dumpall Actually Does (And Why It's Painful)


pg_dumpall is PostgreSQL's cluster-wide backup tool that creates logical dumps of everything in your PostgreSQL instance. While pg_dump handles single databases, pg_dumpall grabs the whole damn thing: databases, users, roles, tablespaces, and all the global objects that pg_dump can't touch.

Here's the brutal reality: pg_dumpall is slow. Painfully slow. A 100GB cluster can take 2-3 hours, and there's no progress indicator. You run it and pray nothing crashes halfway through. Unlike pg_dump which supports parallel processing with -j, pg_dumpall is single-threaded all the way.

The Password Prompt Hell: pg_dumpall connects to every database in your cluster separately. Without a .pgpass file, you'll be typing passwords every few minutes like some kind of authentication slave. Set up your password file or lose your sanity.

When You Actually Need This Tool: Despite the pain, pg_dumpall is your only option for:

- Dumping global objects (roles, users, tablespace definitions) that pg_dump can't touch
- Migrating or cloning an entire cluster, databases plus the users and permissions that go with them
- Recreating a cluster's full setup on a new server in one shot

Production Reality Check: Most teams end up using pg_dump for individual databases and only reach for pg_dumpall when they need to replicate the entire cluster setup. The restore process is also all-or-nothing - if something fails midway, you're starting over from scratch.
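If you go the hybrid route, a minimal sketch looks like this. The database name "appdb" is a placeholder, and the flags shown are standard pg_dumpall/pg_dump options:

```bash
# Globals (roles, tablespace definitions) via pg_dumpall, actual data via pg_dump
pg_dumpall -U postgres --globals-only > globals.sql

# One database in directory format with 4 parallel workers
# ("appdb" is a placeholder; directory format -Fd is required for -j)
pg_dump -U postgres -Fd -j 4 -f appdb.dump appdb
```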


The tool requires superuser access because it needs to read everything, including system catalogs and role information. Try running this as a regular user and you'll get permission errors that'll make you question your life choices.
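A quick way to check before wasting a run (assuming you can already connect with psql):

```bash
# Shows whether the role you connect as actually has superuser rights
psql -U postgres -Atc "SELECT current_user, rolsuper FROM pg_roles WHERE rolname = current_user;"
```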

pg_dumpall vs. Other PostgreSQL Backup Methods

| Feature | pg_dumpall | pg_dump | pg_basebackup |
|---|---|---|---|
| Scope | Entire cluster | Single database | Entire cluster |
| Speed | Slow (no parallel) | Fast (with -j flag) | Fastest |
| Output Format | SQL script | SQL/custom/tar | Binary files |
| Global Objects | ✅ Users, roles, tablespaces | ❌ Database-only | ✅ Everything |
| Progress Indicator | ❌ Blind faith | ✅ Verbose mode | ✅ Progress shown |
| Parallel Processing | ❌ Single-threaded | ✅ -j workers | ✅ Inherently parallel |
| Cross-Version Restore | ✅ Works across versions | ✅ Works across versions | ❌ Same major version only |
| Partial Restore | ❌ All or nothing | ✅ Selective restore | ❌ All or nothing |
| Compression | Manual (gzip) | Built-in options | Manual (gzip) |
| Real-World Use | Cluster migration/cloning | Daily database backups | Hot standby setup |

Using pg_dumpall in Production (What Actually Works)


Basic Commands That Don't Suck

Simple cluster backup:

pg_dumpall > cluster_backup.sql

That basic command looks simple, but you'll get password prompts for every database. Set up .pgpass or you'll be typing passwords for 20 minutes.

Better approach with compression:

pg_dumpall | gzip > cluster_backup.sql.gz

The gzip trick is nice until your backup process dies and you're stuck with a corrupted .gz file. Always test your compressed backups.
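A quick sanity check worth bolting onto the backup step. The `set -o pipefail` part is what most scripts forget: without it, a dead pg_dumpall still leaves gzip exiting 0 and the failure goes unnoticed:

```bash
# pipefail makes the pipeline fail if pg_dumpall dies, not just if gzip dies
set -o pipefail
pg_dumpall | gzip > cluster_backup.sql.gz || { echo "dump failed" >&2; exit 1; }

# Verify the archive is a complete, readable gzip stream
gzip -t cluster_backup.sql.gz && echo "archive looks intact"
```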

Production-ready command:

pg_dumpall -h localhost -U postgres --no-password | gzip > "cluster_$(date +%Y%m%d_%H%M%S).sql.gz"
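Wrapped into a cron-friendly script, it might look like the sketch below. The backup directory and 7-day retention are assumptions, not gospel:

```bash
#!/usr/bin/env bash
set -euo pipefail

BACKUP_DIR=/var/backups/postgres            # assumed location, adjust to taste
STAMP=$(date +%Y%m%d_%H%M%S)
OUT="$BACKUP_DIR/cluster_${STAMP}.sql.gz"

mkdir -p "$BACKUP_DIR"
pg_dumpall -h localhost -U postgres --no-password | gzip > "$OUT"
gzip -t "$OUT"                               # fail loudly on a truncated archive

# Assumed 7-day retention; tune to your own policy
find "$BACKUP_DIR" -name 'cluster_*.sql.gz' -mtime +7 -delete
```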

What Goes Wrong (And How to Fix It)

Memory Issues: Large clusters can exhaust memory during the dump. PostgreSQL 17's pg_dumpall doesn't support the memory-limiting options that pg_dump has. If you're hitting OOM errors, you're basically fucked - split the job or get more RAM.
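One way to split the job, sketched under the assumption that you dump globals and each database separately, then glue them back together at restore time:

```bash
set -euo pipefail

# Globals first (roles, tablespaces) - small and rarely the memory problem
pg_dumpall -U postgres --globals-only | gzip > globals.sql.gz

# Then each database on its own, so an OOM or crash only costs you one database
# (-Fc output is already compressed, so no extra gzip needed)
for db in $(psql -U postgres -Atc "SELECT datname FROM pg_database WHERE NOT datistemplate"); do
    pg_dump -U postgres -Fc -f "${db}.dump" "$db"
done
```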

Tablespace Headaches: If your cluster uses custom tablespaces, the restore will fail on a different server because the paths won't exist. The backup includes CREATE TABLESPACE commands with hardcoded paths like /data/pg_tablespace. You'll need to either pre-create those directories on the target server or edit the dump to remap the paths (a sketch of both follows below).
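A minimal sketch of both options, assuming a dump that references /data/pg_tablespace. The replacement path /srv/pg_tablespace is made up; use whatever actually exists on the new box:

```bash
# Option 1: recreate the expected path on the target before restoring
sudo mkdir -p /data/pg_tablespace
sudo chown postgres:postgres /data/pg_tablespace

# Option 2: remap the hardcoded path inside the (uncompressed) dump
sed "s|LOCATION '/data/pg_tablespace'|LOCATION '/srv/pg_tablespace'|g" \
    cluster_backup.sql > cluster_backup_remapped.sql
```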

The Password Prompt Nightmare: Without proper authentication setup, pg_dumpall becomes interactive hell:


Password for user postgres: 
Password for user postgres: 
Password for user postgres: 
[...continues for every database...]

Fix this with a .pgpass file or `PGPASSWORD` environment variable.
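A minimal .pgpass setup; the host, port, and password here are obviously placeholders:

```bash
# Format is hostname:port:database:username:password; * wildcards the database field
echo 'localhost:5432:*:postgres:your_password_here' >> ~/.pgpass
chmod 600 ~/.pgpass    # libpq ignores the file if permissions are any looser
```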

Version Compatibility: Unlike binary backups, pg_dumpall dumps work across PostgreSQL major versions. You can dump from PostgreSQL 13 and restore to 17. But here's the catch - always use the **newer** version's pg_dumpall. Dump from 13 using pg_dumpall from 17, not the other way around.
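In practice that means calling the newer binary explicitly and pointing it at the old cluster. The paths and hostname below assume a Debian/Ubuntu-style layout and are only illustrative:

```bash
# Run PostgreSQL 17's pg_dumpall against the old PostgreSQL 13 cluster
/usr/lib/postgresql/17/bin/pg_dumpall -h old-pg13-host -U postgres \
    | gzip > pg13_cluster.sql.gz
```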

Restore Reality Check

Restoring is the same nightmare in reverse:

psql -U postgres -f cluster_backup.sql

If anything fails midway through (corrupted user, conflicting tablespace, whatever), you're starting over. No incremental restore, no "skip the broken part" - it's all or nothing.
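You can at least make failures obvious and keep a log. ON_ERROR_STOP makes psql bail on the first error instead of plowing on and leaving you with a half-restored mess:

```bash
# Stop at the first error instead of plowing on, and keep a log for the post-mortem
psql -U postgres -d postgres -v ON_ERROR_STOP=1 -f cluster_backup.sql 2>&1 | tee restore.log
```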

Pro tip: Always test restores on a throwaway server first. The number of times I've seen teams discover their backups are trash during an actual emergency is depressing.
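One cheap way to do that is a disposable container; the image tag, password, and port mapping below are assumptions:

```bash
# Disposable PostgreSQL 17 instance just for restore testing
docker run -d --name pg-restore-test -e POSTGRES_PASSWORD=test -p 5499:5432 postgres:17
sleep 10   # crude, but give initdb a moment to finish

# Feed the backup in, watch for errors, then throw the whole thing away
gunzip -c cluster_backup.sql.gz | docker exec -i pg-restore-test psql -U postgres -d postgres
docker rm -f pg-restore-test
```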


Common pg_dumpall Questions (And Honest Answers)

Q: Why is pg_dumpall so damn slow?

A: pg_dumpall is single-threaded and has no parallel processing. It dumps databases one by one, sequentially. A 100GB cluster takes 2-3 hours because there's no way to speed it up. Unlike pg_dump which can use multiple workers with -j, pg_dumpall just chugs along at whatever speed your disk I/O allows.

Q: Can I get a progress indicator?

A: Nope. pg_dumpall runs in complete silence until it's done or crashes. No progress bar, no "X% complete", nothing. You just run it and hope. Use --verbose to get some output, but it's mostly useless start/stop timestamps.

Q: Why does it keep asking for passwords?

A: Because pg_dumpall connects to every database separately, and PostgreSQL asks for authentication each time. Without a .pgpass file, you'll type the same password 20 times. Set up password file authentication or use the PGPASSWORD environment variable.

Q: Can I exclude specific databases?

A: Yes, use --exclude-database=pattern. But here's the frustrating part: you can't exclude system databases like template0 or template1, so if those have issues, your entire dump fails.
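Usage looks like this; the pattern is a placeholder:

```bash
# Skip anything matching the pattern (shell-quoted so the * survives)
pg_dumpall --exclude-database='staging_*' | gzip > prod_only.sql.gz
```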
Q: What happens if the dump fails halfway through?

A: You're fucked. pg_dumpall doesn't support resuming from where it left off. If your network drops, disk fills up, or PostgreSQL crashes, you start over from the beginning. This is why smart teams test their backups and have monitoring on backup processes.

Q: Can I restore individual databases from a pg_dumpall backup?

A: Not easily. The output is one giant SQL script with all databases mixed together. You'd need to manually edit the SQL file to extract just the database you want. It's a nightmare; use pg_dump for individual database backups instead.
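If you're desperate, you can carve one database out of the dump by slicing between its \connect line and the next one. This is a fragile sketch: it assumes the dump uses plain unquoted \connect lines and that the database is named appdb (both placeholders):

```bash
# Globals live before the first \connect line; grab those first
awk '/^\\connect /{exit} {print}' cluster_backup.sql > restore_appdb.sql

# Append only appdb's section (from its \connect line up to the next \connect)
# Note: the CREATE DATABASE statement sits just above the \connect line, so you
# may need to create the database by hand before restoring this file
awk '/^\\connect /{in_db = /^\\connect appdb/} in_db' cluster_backup.sql >> restore_appdb.sql
```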
Q: Does pg_dumpall work across PostgreSQL versions?

A: Yes, but use the newer version's pg_dumpall. If you're migrating from PostgreSQL 13 to 17, use PostgreSQL 17's pg_dumpall to create the backup from the 13 cluster. The reverse often breaks with syntax changes and new features.

Q: How much disk space do I need?

A: Plan for 2-3x your database size. The uncompressed SQL dump is usually larger than your actual data due to INSERT statements and metadata. Add compression (| gzip) and you'll get it down to maybe 50-70% of original size, depending on your data types.
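To sanity-check that estimate against your own cluster, the total on-disk size is one query away:

```bash
# Total on-disk size of every database - roughly what the dump has to serialize
psql -U postgres -Atc "SELECT pg_size_pretty(sum(pg_database_size(datname))) FROM pg_database;"
```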

Q: Why can't I run pg_dumpall as a regular user?

A: Because it needs to read system catalogs, role information, and global objects that only superusers can access. Try running as a regular user and you'll get permission denied errors on half the databases. You need superuser privileges or a role that can access everything.
