What Actually Is Logical Replication and Why Should You Care?

[Diagram: PostgreSQL logical replication and WAL architecture]

Look, logical replication sounds fancy but it's basically this: instead of copying the entire database like streaming replication does, you pick specific tables and only sync those. PostgreSQL reads the WAL records, decodes them into actual SQL operations (INSERT, UPDATE, DELETE), and sends those to your subscriber database.

The catch? Tables need a primary key or unique index for UPDATEs and DELETEs to replicate properly. No primary key or usable unique index means you're stuck with REPLICA IDENTITY FULL, which sends the entire old row for every update and will destroy your WAL volume on busy tables - and with no replica identity at all, UPDATEs and DELETEs on the published table just error out.
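If you're not sure which identity a table is using, set it explicitly. A minimal sketch - the table and index names here are made up:

-- Point the replica identity at an existing unique index on NOT NULL columns
ALTER TABLE events REPLICA IDENTITY USING INDEX events_device_id_ts_key;

-- Last resort: ship the whole old row with every UPDATE/DELETE (expensive on busy tables)
ALTER TABLE events REPLICA IDENTITY FULL;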

How It Actually Works (Without the Bullshit)

You create a publication on your source database (publisher) listing which tables to replicate. Then you create a subscription on the destination database (subscriber) that connects to the publisher.

The publisher starts a walsender process that uses the built-in `pgoutput` plugin to decode WAL records into logical replication messages. These get streamed to the subscriber where apply workers execute the SQL operations.
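You can watch the decoding half of this yourself. A throwaway sketch using the test_decoding plugin, whose output is human-readable text (real subscriptions use pgoutput, which is binary); it assumes wal_level = logical is already set, you have REPLICATION privilege, and the users table is hypothetical:

-- Create a throwaway slot, change something, then peek at the decoded stream
SELECT pg_create_logical_replication_slot('demo_slot', 'test_decoding');
INSERT INTO users (name) VALUES ('test');
SELECT data FROM pg_logical_slot_peek_changes('demo_slot', NULL, NULL);
SELECT pg_drop_replication_slot('demo_slot');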


PostgreSQL 17 allegedly fixed some of the biggest pain points with failover slot synchronization and the new `pg_createsubscriber` utility that converts standby servers to logical subscribers. I'm still testing this shit - too many times PostgreSQL "fixes" have introduced new problems that are worse than the original bugs. But so far, it looks promising.

Why You'd Actually Want to Use This Thing

Different PostgreSQL versions? No problem. Built-in logical replication works across major versions from PostgreSQL 10 onward (the underlying logical decoding goes back to 9.4 via extensions like pglogical). I've used it to migrate from PostgreSQL 11 to 15 with zero downtime. Streaming replication would've required matching versions and a full outage.

Only need specific tables? Perfect. You can replicate individual tables, specific columns, or even filter rows with WHERE clauses in your publication. Beats copying a 2TB database when you only need the user activity tables.
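Here's roughly what that looks like on PostgreSQL 15+. Table and column names are hypothetical; note that a row filter on a publication that also publishes UPDATE/DELETE may only reference replica identity columns, which is why this one is INSERT-only:

-- Publish only four columns, only login events, and only inserts
CREATE PUBLICATION activity_pub
    FOR TABLE user_activity (id, user_id, event_type, created_at)
    WHERE (event_type = 'login')
    WITH (publish = 'insert');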

Want to write to the replica? Logical replication lets you write to subscriber databases. The subscriber can have different indexes, triggers, even completely different schemas. Try doing that with streaming replication (hint: you can't).

Multiple data sources? A subscriber can pull from multiple publishers. I've used this to aggregate customer data from regional databases into a central analytics warehouse.

PostgreSQL 17 Finally Fixed the Big Issues

PostgreSQL 17 dropped on September 26, 2024 and honestly, it fixed the stuff that's been driving everyone crazy about logical replication:

Failover slot synchronization: Remember how logical replication slots would just disappear after failover? Yeah, that's fixed. Slots now sync between primary and standby, so your replication doesn't die when the primary goes down. This was a massive operational pain point.

pg_createsubscriber utility: You can now convert existing physical standbys to logical subscribers without rebuilding everything from scratch. I've rebuilt way too many logical replication setups because of this missing piece.

pg_upgrade preservation: Your logical replication setup actually survives major version upgrades now. Before this, upgrading PostgreSQL meant tearing down and rebuilding all your replication relationships.

What Actually Breaks in Production

Tables without primary keys are a nightmare. You'll need either a unique index or REPLICA IDENTITY FULL. The latter sends the entire row for every update and will absolutely murder your WAL volume. I learned this the hard way on a busy events table - WAL usage went completely insane, like 5x or 6x normal volume. The monitoring graphs looked like a hockey stick.
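One way to catch these before they bite you - a rough query, run on the publisher, that lists published tables still on the default replica identity with no primary key behind it:

-- Tables like this will error out on UPDATE/DELETE once published
SELECT p.pubname, p.schemaname, p.tablename
FROM pg_publication_tables p
JOIN pg_class c ON c.relname = p.tablename
JOIN pg_namespace n ON n.oid = c.relnamespace AND n.nspname = p.schemaname
WHERE c.relreplident = 'd'
  AND NOT EXISTS (
      SELECT 1 FROM pg_index i
      WHERE i.indrelid = c.oid AND i.indisprimary
  );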

WAL retention will fill up your disk. Logical replication slots prevent WAL cleanup until all subscribers catch up. If a subscriber goes down or gets stuck, your primary server runs out of disk space. I've seen this kill production servers when monitoring wasn't set up properly. Set up alerts on pg_replication_slots.restart_lsn lag or you'll get paged at 3am when the server dies.
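On PostgreSQL 13+ there's a cheap sanity check for this - pg_replication_slots tells you how close each slot is to being invalidated. Pair it with max_slot_wal_keep_size so a stuck slot caps WAL retention instead of eating the whole disk:

-- 'reserved' is fine; 'extended'/'unreserved' means WAL is piling up; 'lost' means the slot is dead
SELECT slot_name, active, wal_status, pg_size_pretty(safe_wal_size) AS headroom
FROM pg_replication_slots
WHERE slot_type = 'logical';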

Network connectivity is critical. Each subscription needs its own connection to the publisher. Unlike streaming replication with one connection, logical replication multiplies your connection requirements. Network hiccups cause apply lag that compounds quickly on busy systems.

Sequences drift like crazy. Logical replication doesn't sync sequences, so your subscriber's sequence values drift from the publisher. If you fail over to a subscriber, you'll get primary key conflicts when new inserts use duplicate sequence values.
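If you do fail over, bump the sequences on the new primary before taking writes. A minimal sketch for one sequence (names are hypothetical; repeat for every sequence, or script it from pg_sequences):

-- Push the sequence past the highest replicated id so new inserts don't collide
SELECT setval('orders_id_seq', (SELECT COALESCE(MAX(id), 1) FROM orders));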

Performance and What Doesn't Work

Logical replication uses less I/O than streaming replication since it only processes actual data changes, but the WAL decoding uses CPU cycles. On a busy system, you'll see higher CPU usage on the publisher from the walsender processes.

Schema changes aren't replicated. You have to apply DDL changes manually on both publisher and subscriber. This includes adding columns, changing types, creating indexes - everything structural needs to be coordinated manually.

TRUNCATE wasn't supported until PostgreSQL 11. If you're still on 10 or earlier, TRUNCATE operations just don't replicate. Subscribers keep their data while the publisher table gets emptied.

Large objects are fucked. Anything stored via pg_largeobject doesn't replicate at all, and neither do sequences, materialized view contents, or foreign tables. Check the restrictions in the docs before committing to logical replication.

No conflict resolution. If you write to both publisher and subscriber, conflicts just error out and you're stuck fixing it manually. There's no automatic resolution like "last write wins" or anything useful. I once had a conflict that took me 3 hours to figure out because the error message was something useless like "duplicate key value violates unique constraint" with zero fucking context about which operation or row caused it. You need external tools like Postgres-XL, EnterpriseDB's BDR, or pglogical for multi-master scenarios, but honestly those add their own complexity and new ways to break.
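When it does happen, PostgreSQL 15+ at least gives you a blunt instrument: fix the conflicting row on the subscriber, or skip the whole failed transaction. A sketch - the subscription name and LSN are placeholders, and the failing transaction's finish LSN shows up in the subscriber's log:

-- Either delete/fix the conflicting row on the subscriber, or skip the failed transaction entirely
ALTER SUBSCRIPTION my_app_sub SKIP (lsn = '0/14C0378');

-- PG15+ can also stop the crash-retry loop by disabling the subscription on error
ALTER SUBSCRIPTION my_app_sub SET (disable_on_error = true);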

When You Actually Need This

Cross-version upgrades: I've used logical replication to upgrade from PostgreSQL 11 to 15 with zero downtime. Well, mostly zero - had a brief hiccup when sequences got out of sync, but that's another story. Set up logical replication to the new version, let it sync, then switch over. Way less risky than pg_upgrade on a production system.

Selective data sync: Need to sync just user tables to a reporting database? Logical replication lets you pick specific tables and even filter rows with WHERE clauses. Beats copying a 2TB database when you only need 100GB of actual data.

Multi-region compliance: GDPR requires EU customer data stays in EU? Set up logical replication with row filters based on customer location. Each region gets only the data it's legally allowed to have.

Real-time ETL: Stream changes to your data warehouse as they happen. The subscriber can have completely different schemas, indexes, even extra computed columns. Try doing that with streaming replication.

Just remember: logical replication is complex as hell compared to streaming replication. Only use it when you actually need the features it provides. For basic high availability, streaming replication is usually what you want. Check out comprehensive tutorials if you're just getting started.

PostgreSQL Logical Replication FAQ (The Real Questions)

Q

Why is logical replication so much more complicated than streaming replication?

A

Because it's doing way more work. Streaming replication just copies WAL blocks. Logical replication has to decode those blocks into SQL operations, filter them by table/row, and stream the changes. More moving parts means more ways to break.

Q

My table has no primary key - am I screwed?

A

Pretty much. You need either a primary key, a unique index, or REPLICA IDENTITY FULL. The latter sends the entire row for every UPDATE and will absolutely destroy your WAL volume. I've seen it increase WAL from 10GB/day to 200GB/day on a busy table - might've been worse than that, I just remember the monitoring graphs going completely batshit insane. Add a primary key or suffer the consequences.
Q

Can I replicate between PostgreSQL 13 and 16?

A

Yeah - built-in logical replication works between any major versions from PostgreSQL 10 up, so 13 to 16 is fine. I've used it to migrate from 13 to 16 with zero downtime. Just watch out for new features in the destination version that don't exist in the source.

Q

Why is this eating all my disk space and how do I fix it before the server dies?

A

Logical replication slots hold onto WAL files until subscribers catch up. If a subscriber goes down, WAL just keeps piling up. Check pg_replication_slots.restart_lsn and compare it to the current WAL position - a big gap means trouble. Quick fix: drop dead slots with SELECT pg_drop_replication_slot('slot_name'). Nuclear option: restart the entire replication from scratch. Monitor slot lag or you'll run out of disk at 3am.

Q

Schema changes just broke everything - now what?

A

DDL doesn't replicate. You have to apply schema changes manually to both sides. I always do subscriber first, then publisher, so new columns have somewhere to go. If you mess up the order, replication errors out and you're stuck fixing it manually.
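A trivial sketch of the order, with a hypothetical column:

-- 1. On the subscriber first, so incoming rows have somewhere to land
ALTER TABLE users ADD COLUMN last_login timestamptz;

-- 2. Then on the publisher
ALTER TABLE users ADD COLUMN last_login timestamptz;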

Q

Can I have multiple subscribers pulling from one publisher?

A

Sure. Each subscriber gets its own replication slot, so they don't interfere with each other. Just watch your connection limits - each subscription uses a connection.
Q

Can the subscriber have different table structures?

A

Yeah, within limits. The subscriber can have extra columns, different indexes, constraints, whatever. But the replicated columns need matching names and compatible types. Column order doesn't matter as long as names match. I've used this to add computed columns and better indexes on read replicas.

Q

PostgreSQL 17 fixed failover - does it actually work?

A

So far, yes. Before 17, logical replication slots would disappear during failover and you'd have to rebuild everything. PostgreSQL 17 added slot synchronization between primary and standby. Set sync_replication_slots = on and slots should survive failover. I've tested this in dev but haven't had a production failover yet to really confirm it works as advertised. Fingers crossed.

Q

Changes aren't showing up on the subscriber - what's the dumb thing I missed?

A

Check these in order:

  1. Is the table in the publication?
  2. Does the publication filter exclude the rows?
  3. Does the table have a replication identity?
  4. Check pg_stat_subscription for errors.

90% of the time it's #1 - you forgot to add the table to the publication. The error message will be something unhelpful like "logical replication worker for subscription has exited with exit code 1" which tells you absolutely nothing useful.
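A couple of queries that cover most of that checklist. The publication name is whatever yours is called; the first runs on the publisher, the second on the subscriber and needs PostgreSQL 15+:

-- Publisher: is the table actually in the publication, and what filters apply?
SELECT * FROM pg_publication_tables WHERE pubname = 'my_app_pub';

-- Subscriber: are apply or initial-sync workers erroring out?
SELECT subname, apply_error_count, sync_error_count FROM pg_stat_subscription_stats;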

Q

How do I know when this is about to break?

A

Watch apply lag - compare `pg_stat_subscription.latest_end_lsn` on the subscriber against the publisher's current WAL position, and keep an eye on `pg_replication_slots.restart_lsn` lag on the publisher. If either keeps growing, you're falling behind. Set up alerts because when slots get stuck, they prevent WAL cleanup and your disk fills up fast.

Also monitor `pg_stat_subscription_stats.apply_error_count` - if it keeps climbing, apply workers are crashing and restarting, which usually means conflicts or data type issues.
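Roughly what that looks like from the subscriber side, using columns from the pg_stat_subscription view (the actual lag math against the publisher's WAL position has to happen on the publisher):

-- Subscriber: how far has each subscription received/applied, and when did it last hear from the publisher?
SELECT subname, received_lsn, latest_end_lsn, last_msg_receipt_time
FROM pg_stat_subscription;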

Q

Can I replicate only some rows from a table?

A

Yeah, PostgreSQL 15 added row filters with WHERE clauses. You can do CREATE PUBLICATION pub_name FOR TABLE customers WHERE (region = 'US'). PostgreSQL 15 also added column lists, so you can publish only specific columns instead of hacking around it with views on the subscriber.

Q

Will this destroy my performance?

A

Logical replication uses CPU for WAL decoding on the publisher and network bandwidth for streaming changes. It's usually lighter on disk I/O than streaming replication since it only processes actual data changes. But on a busy system with lots of updates, the CPU overhead from decoding can be noticeable.

Q

Why does my replication randomly break every Tuesday?

A

This is weirdly specific, but I've seen this. Usually it's some batch job or maintenance script that runs weekly and does something that breaks replication. Check your cron jobs for anything that truncates tables, drops and recreates objects, or does massive bulk updates. Also check if someone's running pg_dump or other maintenance that might be interfering with WAL.

PostgreSQL Replication: Which One Won't Screw You Over?

| Feature | Logical Replication | Physical Streaming Replication | WAL Shipping |
|---|---|---|---|
| Replication Level | Table/row level, selective | Entire cluster, block-level | Entire cluster, file-level |
| Cross-Version Support | Yes (PostgreSQL 10+) | Same major version only | Same major version only |
| Subscriber Writability | Fully writable | Read-only standby | Read-only standby |
| Schema Flexibility | Different schemas supported | Identical schema required | Identical schema required |
| DDL Replication | Manual DDL application | Automatic DDL replication | Automatic DDL replication |
| Setup Complexity | Moderate (publication/subscription) | Low (streaming configuration) | Low (archive/restore setup) |
| Network Requirements | Persistent connection per subscription | Single persistent connection | Periodic file transfer |
| Failover Time | Depends on apply lag | Near-instant (seconds) | Minutes to hours |
| Resource Overhead | CPU for WAL decoding | Minimal CPU, network bandwidth | Disk I/O for archiving |
| Conflict Resolution | None (manual intervention) | Not applicable | Not applicable |
| Large Object Support | No | Yes | Yes |
| Sequence Synchronization | No | Yes | Yes |
| TRUNCATE Support | PostgreSQL 11+ | Yes | Yes |
| Pain in the Ass Factor | High (lots of moving parts) | Medium (mostly works) | Low (set it and forget it) |
| What Breaks First | Apply workers crash | Network hiccup kills stream | Archive destination fills up |

Setting Up Logical Replication Without Losing Your Mind


Logical replication setup is more involved than streaming replication, but the flexibility is worth it if you actually need the features. Here's how to do it without shooting yourself in the foot.

Configuration That Will Actually Work

Set wal_level = logical in `postgresql.conf`. This increases WAL volume by 10-30% depending on your workload, so monitor disk space. Restart required - plan your downtime window accordingly.

Bump connection limits: Set `max_replication_slots` and `max_wal_senders` to at least 2x your planned subscriptions. Maybe 3x to be safe? Each subscription eats one slot and one sender process. I've been caught by this limit way too many times - nothing like having replication fail because you hit some arbitrary connection limit.

Fix pg_hba.conf: Add access for your replication user - logical replication connections match the actual database name in pg_hba.conf, not the replication keyword. Don't use superuser for replication - create a dedicated role with the REPLICATION attribute and SELECT on the published tables. On the subscriber side, creating the subscription needs superuser (or, on PostgreSQL 16+, the pg_create_subscription role plus CREATE on the database).
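Roughly what all of that adds up to - values, names, and the CIDR range are illustrative, so size the limits to your own subscription count:

# postgresql.conf on the publisher (restart required for wal_level)
wal_level = logical
max_replication_slots = 10
max_wal_senders = 10

-- Dedicated replication role, no superuser
CREATE ROLE repl_user WITH LOGIN REPLICATION PASSWORD 'change-me';
GRANT SELECT ON users, orders, products TO repl_user;

# pg_hba.conf - logical replication connections match the real database name, not "replication"
hostssl  source_db  repl_user  10.0.0.0/24  scram-sha-256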

The Basic Setup (Publisher Side)

Create a publication specifying exactly which tables to replicate. Don't use FOR ALL TABLES unless you actually want everything:

-- Be specific about what you're replicating
CREATE PUBLICATION my_app_pub FOR TABLE users, orders, products;

-- Limit which operations get published (publish defaults to insert, update, delete, truncate)
CREATE PUBLICATION audit_pub FOR TABLE logs WITH (publish = 'insert');

Subscriber Setup (The Easy Part)

Create the table structures first - DDL doesn't replicate automatically. Then create the subscription:

CREATE SUBSCRIPTION my_app_sub 
CONNECTION 'host=publisher port=5432 dbname=source_db user=repl_user sslmode=require' 
PUBLICATION my_app_pub;


Initial sync happens automatically but can take hours on large tables. Or days if you're unlucky. The publisher's replication slot holds WAL during this time, so monitor disk space or you'll run out and get paged at 2am like I did once. Fun times.
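You can watch the initial copy from the subscriber side - a quick per-table check ('r' means ready, anything else means it's still syncing):

-- Subscriber: per-table sync state for all subscriptions
SELECT srrelid::regclass AS table_name, srsubstate
FROM pg_subscription_rel;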

PostgreSQL 17's New Stuff That Actually Helps

Failover slot sync: Set `sync_replication_slots = on` on your standby. Logical replication slots finally survive failover without breaking everything.
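The moving parts, roughly (PostgreSQL 17+, names reused from the examples above): the subscription's slot has to be created as a failover slot, and the standby needs a few settings before the sync worker will pick it up.

-- On the subscriber: mark the subscription's slot as a failover slot
CREATE SUBSCRIPTION my_app_sub
    CONNECTION 'host=primary dbname=source_db user=repl_user sslmode=require'
    PUBLICATION my_app_pub
    WITH (failover = true);

# On the physical standby's postgresql.conf
sync_replication_slots = on
hot_standby_feedback = on
# primary_conninfo must include dbname=..., and primary_slot_name must be set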

pg_createsubscriber: Converts physical standbys to logical subscribers without rebuilding from scratch. Saves hours of downtime:

# Run on the stopped standby, pointing at its data directory
pg_createsubscriber -D /var/lib/postgresql/17/main \
    --publisher-server="host=primary port=5432 dbname=source_db user=repl_user" \
    --database=source_db --publication=my_app_pub --subscription=my_app_sub

Monitoring (Or How to Not Get Paged at 3am)


Watch `pg_replication_slots` for slot lag. When restart_lsn falls behind, WAL piles up and your disk fills:

-- Check slot lag (set up alerts for this)
SELECT slot_name, active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) as behind
FROM pg_replication_slots WHERE slot_type = 'logical';

Monitor `pg_stat_subscription` for apply worker failures and `pg_stat_subscription_stats` for error counts. When a worker keeps crashing on a conflict, fix the offending row on the subscriber (or skip the failed transaction with ALTER SUBSCRIPTION ... SKIP) and it restarts on its own; ALTER SUBSCRIPTION sub_name REFRESH PUBLICATION is for picking up tables added to the publication, not a crash fix. There are specific queries for debugging replication issues.

If a slot gets stuck, drop it before it kills your disk:

SELECT pg_drop_replication_slot('dead_slot_name');

What Will Bite You in Production

Tables without primary keys: Use REPLICA IDENTITY FULL and your WAL volume explodes. Add primary keys or accept the pain.

Sequences drift: Logical replication doesn't sync sequences, so your subscriber sequences get out of sync with the publisher. If you fail over to the subscriber, new inserts will conflict with existing data.

Large transactions: Massive batch updates cause apply lag and can crash apply workers. Break big transactions into smaller chunks.
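If you can't break them up, newer versions can at least stream them instead of spooling the whole thing on the publisher - a hedged option worth testing (PostgreSQL 14+ for `on`, 16+ for `parallel`):

-- Stream large in-progress transactions to the subscriber instead of waiting for commit
ALTER SUBSCRIPTION my_app_sub SET (streaming = on);   -- or 'parallel' on PG16+ to apply them with parallel workers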

SSL is not optional: Use sslmode=require in connection strings. Don't send replication traffic unencrypted across networks.

Unique constraints cause conflicts: If data arrives out of order, unique constraints on the subscriber can cause replication to fail. Consider DEFERRABLE INITIALLY DEFERRED constraints.

Stuck slots are a nightmare: WAL retention from stuck slots can fill your disk quickly. Monitor slot advancement and have cleanup procedures ready.

Test everything in development first. Logical replication has more moving parts than streaming replication, so there are more ways for it to break. The PostgreSQL 17 improvements help, but it's still complex as hell compared to streaming replication.

You'll want to bookmark the resources section below - you're going to need those links when things start breaking at 3am on a weekend. And they will break, probably when you least expect it.

Essential PostgreSQL Logical Replication Resources
