Mastering PostgreSQL Replication: Types and Setup Explained
Learn how PostgreSQL streaming and logical replication work, when to use each one, and what to check before production failover.
Mastering PostgreSQL Replication: Types and Setup Explained
PostgreSQL replication keeps a second server close enough to the primary that you can survive hardware failure, shift read traffic, or run a controlled migration. If your database is a production dependency, you need to know which PostgreSQL replication model fits your risk tolerance before a node fails.
PostgreSQL gives you two common choices: streaming replication and logical replication. Streaming replication copies WAL at the physical cluster level. Logical replication sends row-level changes from selected tables through publications and subscriptions.
Why PostgreSQL Replication Matters
Replication helps with four everyday operations problems:
- High availability: If the primary fails, you can promote a standby and point applications at it.
- Disaster recovery: A standby in another location can protect you from a site-level outage.
- Read scaling: Read-only queries can run against hot standbys instead of the write primary.
- Migration support: Logical replication can help move selected tables between PostgreSQL versions or database layouts.
Replication is not a backup replacement. A bug, bad migration, or accidental DELETE can replicate quickly. Keep tested backups and point-in-time recovery alongside replication.
Streaming Replication (Physical Replication)
Streaming replication is the most common and fundamental form of replication in PostgreSQL. It works by sending Write-Ahead Log (WAL) records from the primary server to one or more replicas. These WAL records represent every change made to the database. The replicas then apply these WAL records to their own data files, ensuring they remain consistent with the primary.
Synchronous vs. Asynchronous Streaming
Synchronous replication makes the primary wait for one or more synchronous standbys before it reports a commit to the client. The exact safety level depends on synchronous_commit; for example, waiting for WAL to be written is different from waiting for it to be replayed. You get stronger protection against losing acknowledged commits, but every commit now depends on replica and network latency.
Asynchronous replication lets the primary commit locally and send WAL to replicas afterward. It is faster for writes, but a primary crash can lose recent transactions that had not reached a standby yet.
Setting Up Streaming Replication (Asynchronous Example)
Setting up streaming replication involves configuring both the primary and replica servers. Here’s a simplified guide:
1. Configure the Primary Server (postgresql.conf and pg_hba.conf)
On the primary server, you need to enable WAL archiving and replication connections.
postgresql.confmodifications:wal_level = replica # or logical for logical replication max_wal_senders = 5 # Number of concurrent replication connections wal_keep_size = 512MB # Or wal_keep_segments for older versions # For synchronous replication, add: # synchronous_standby_names = 'replica1,replica2' # Or for specific server name/priority: # synchronous_standby_names = '1 (replica1), 2 (replica2)' archive_mode = on archive_command = 'test ! -f /path/to/wal_archive/%f && cp %p /path/to/wal_archive/%f'wal_level: Must be at leastreplicafor streaming replication.max_wal_senders: Specifies how many standby servers can connect simultaneously.wal_keep_size: Prevents WAL files from being deleted before replicas can fetch them (a simpler alternative toarchive_commandfor basic setups, but archiving is recommended for robustness).archive_modeandarchive_command: Useful for point-in-time recovery (PITR) and for replicas that need old WAL after falling behind. In production, use a real archive target or a backup tool instead of a local copy command.
pg_hba.confmodifications:Allow the replica to connect for replication. Replace
replica_ip_addresswith the actual IP of your replica.# TYPE DATABASE USER ADDRESS METHOD host replication replication_user replica_ip_address/32 md5You'll also need to create a replication user:
-- On the primary server: CREATE ROLE replication_user WITH REPLICATION LOGIN PASSWORD 'your_password';After modifying these files, reload the PostgreSQL configuration:
pg_ctl reload # Or restart PostgreSQL if needed
2. Prepare the Replica Server
Before starting the replica, it must have a data directory that is a copy of the primary's data directory at a specific point in time. The easiest way is to use pg_basebackup.
Stop PostgreSQL on the replica (if running).
Take a base backup:
# Ensure PGDATA is empty or removed first pg_basebackup -h primary_host_ip -p 5432 -U replication_user -D /var/lib/postgresql/data/ -Fp -Xs -P -R-h,-p,-U: Specify the primary server's connection details.-D: The data directory for the replica.-Fp: Format is plain.-Xs: Stream WAL during the backup.-P: Show progress.-R: Write standby connection settings and createstandby.signalfor PostgreSQL 12 and newer.- You will be prompted for the
replication_user's password.
3. Configure the Replica Server
postgresql.confmodifications (for PG12+):hot_standby = on # Allows read-only queries on the replica primary_conninfo = 'host=primary_host_ip port=5432 user=replication_user password=your_password'hot_standby: Enables read-only queries on the standby.primary_conninfo: Connection string to the primary server.
Older PostgreSQL versions:
PostgreSQL 12 removed
recovery.conf. If you maintain an older server, createrecovery.confin the replica data directory:standby_mode = 'on' primary_conninfo = 'host=primary_host_ip port=5432 user=replication_user password=your_password' # If using archive recovery instead of streaming, you'd specify restore_command # restore_command = 'cp /path/to/wal_archive/%f %p' # recovery_target_timeline = 'latest'On PostgreSQL 12 and newer, standby mode is controlled by
standby.signal, andprimary_conninfousually lives inpostgresql.auto.confwhen created bypg_basebackup -R.
4. Start the Replica Server
Start the PostgreSQL service on the replica. It will connect to the primary, receive WAL records, and begin to synchronize. You can check the logs for confirmation.
Tip: For robust HA, consider using tools like Patroni or repmgr, which automate failover and management.
Logical Replication
Logical replication is a more flexible and granular form of replication introduced in PostgreSQL 10. Instead of replicating entire blocks of data or WAL records, it replicates data changes based on their logical meaning (e.g., INSERT, UPDATE, DELETE statements) at the row level. This is achieved by decoding the WAL records into a stream of logical changes.
Key Features and Use Cases:
- Selective replication: You can choose which tables to replicate. Recent PostgreSQL versions also support column lists in publications, but check your server version before relying on that feature.
- Cross-Version Replication: Logical replication can replicate data between different major versions of PostgreSQL.
- Schema control: Logical replication does not automatically replicate DDL. Create matching tables and apply schema migrations on the subscriber.
- Data Transformation: While not built-in, logical replication provides a foundation for more complex ETL (Extract, Transform, Load) processes.
- Replication from a Primary to a Replica that is not a full clone: The target database does not need to be a complete physical copy of the source.
How it Works:
- Publisher: The source database (primary) where data changes occur. It needs
wal_level = logical. Changes are decoded from WAL into a logical stream. - Publication: A named set of tables on the publisher whose changes will be replicated.
- Subscriber: The target database (replica) that receives the changes.
- Subscription: A connection on the subscriber that connects to the publisher and applies the changes from a specific publication.
Setting Up Logical Replication
1. Configure the Publisher (Primary Server)
postgresql.confmodifications:wal_level = logical max_replication_slots = 10 # For logical replication slots max_wal_senders = 10 # Should be at least max_replication_slotsCreate a Publication:
-- On the publisher database: CREATE PUBLICATION my_publication FOR TABLE table1, table2 WITH (publish = 'insert,update,delete'); -- Or for all tables: -- CREATE PUBLICATION all_tables_pub FOR ALL TABLES;Reload the configuration on the publisher.
2. Configure the Subscriber (Replica Server)
Ensure the target tables exist: The subscriber database must have the target tables with the same schema as the publisher. You can create them manually or use
pg_dumpto extract schema.Create a Subscription:
-- On the subscriber database: CREATE SUBSCRIPTION my_subscription CONNECTION 'host=publisher_host_ip port=5432 user=replication_user password=your_password dbname=publisher_db' PUBLICATION my_publication;The
replication_userneeds appropriate permissions on the publisher.PostgreSQL will automatically create a replication slot on the publisher and start applying changes. You can monitor the subscription status using
pg_stat_subscriptionon the subscriber.
Tip: Logical replication uses PostgreSQL's built-in logical decoding infrastructure. It does not require a separate extension for basic publications and subscriptions.
Choosing the Right Replication Method
- Streaming Replication: Ideal for high availability and disaster recovery, where you need an exact, byte-for-byte copy of the primary. It's simpler to set up for full database replication and provides the best read-scalability for read-only replicas.
- Logical Replication: Best suited for selective data distribution, migrations, cross-version upgrades, or when you need to replicate only a subset of data. It allows for more complex scenarios like replicating to different schemas or performing data transformations.
Takeaway
Use streaming replication when you need a full standby for failover, disaster recovery, or read-only traffic. Use logical replication when you need selected tables, cross-version migration, or controlled data movement between different databases.
Before you trust either setup, run a failover drill, check application connection handling, monitor replication lag, and verify that backups still restore cleanly. Replication keeps another server current; it does not replace operational testing.