Understanding and Executing PostgreSQL Failover vs. Switchover Scenarios
In modern database architecture, ensuring continuous operation through High Availability (HA) is paramount. PostgreSQL, a powerful open-source relational database, relies on streaming replication to maintain data consistency across multiple nodes. However, when the primary server encounters an issue or requires maintenance, database administrators must execute precise procedures to restore service. This article clearly differentiates between two critical HA operations—Failover and Switchover—and details the steps and considerations for safely promoting a standby replica to become the new primary.
Understanding the distinction between these events is vital. A switchover is a planned, controlled transition, while a failover is an emergency response to an unexpected outage. Successfully navigating either scenario depends heavily on proper configuration, robust monitoring, and familiarity with the tools used to manage replication.
Replication Fundamentals: The Foundation of HA
PostgreSQL High Availability is built upon streaming replication, where one server acts as the Primary (or Master) and one or more servers act as Standbys (or Replicas). The Primary streams write-ahead log (WAL) records to the Standbys to keep them in sync.
To manage these roles effectively, specific configuration settings are necessary on both primary and replica nodes:
Critical Configuration Settings
These settings govern how replication operates and how nodes identify each other:
- `wal_level`: Must be set to `replica` or higher (`logical` if downstream tooling requires logical decoding) on the Primary.
- `max_wal_senders`: Defines the maximum number of concurrent WAL sender connections from standbys and backup tools. The default (10 in recent releases) must be at least the number of standbys you plan to attach.
- `hot_standby`: Must be set to `on` in the standby server's `postgresql.conf` to allow read-only queries while the standby replays WAL.
- `synchronous_commit`: Controls when transactions are acknowledged. For zero data loss during switchovers, combine a synchronous standby list (`synchronous_standby_names`) with `synchronous_commit = on`, or `remote_write` if slightly weaker durability is acceptable in exchange for lower commit latency.
- `primary_conninfo`: Set on the standby, detailing the connection information (host, port, user, password) used to reach the current Primary.
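As a quick sanity check, the snippet below is a minimal sketch of how these values can be inspected on a running node, assuming a local superuser `psql` connection; the hostnames and credentials in the sample `primary_conninfo` line are placeholders.

```bash
# Inspect the replication-related settings on the current node
psql -X -c "SELECT name, setting
              FROM pg_settings
             WHERE name IN ('wal_level', 'max_wal_senders',
                            'hot_standby', 'synchronous_commit');"

# Example primary_conninfo line for the standby's postgresql.conf
# (host, user, and password below are placeholders)
# primary_conninfo = 'host=primary.example.com port=5432 user=replicator password=secret application_name=standby1'
```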
Best Practice: For robust HA, place a connection pooling or proxy layer (such as PgBouncer or HAProxy, often driven by a cluster manager like Patroni or repmgr) in front of the cluster. This abstracts the physical server addresses, making failovers and switchovers seamless for applications.
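As a rough illustration of that abstraction (the file path, hostname, and database name below are placeholders), applications connect to a pooler alias and only the pooler's definition is repointed when the primary moves:

```bash
# Illustrative pgbouncer.ini database entry: clients keep connecting to "appdb"
# on the pooler, and only this host= value changes during a failover or switchover.
cat <<'EOF' >> /etc/pgbouncer/pgbouncer.ini
[databases]
appdb = host=current-primary.example.com port=5432 dbname=appdb
EOF

# If PgBouncer runs under systemd, a reload applies the change without a restart
systemctl reload pgbouncer
```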
Switchover: The Planned Transition
A Switchover is a controlled, graceful process where the active Primary node is intentionally decommissioned, and a designated Standby is promoted to take its place. This procedure is typically used for planned maintenance, version upgrades, or hardware replacements.
Steps for a Controlled Switchover
The goal of a switchover is to ensure zero data loss by waiting for all in-flight transactions to be replicated before promotion.
- Stop Writes on the Current Primary: Prevent any new transactions from being committed on the current Primary, for example by setting `default_transaction_read_only = on` or by temporarily shutting down client connections.
- Wait for Replication Catch-up: Ensure the designated Standby has received and applied all remaining WAL records from the Primary. You can check the replication lag using `pg_stat_replication` on the Primary or by examining the standby's recovery status (see the sketch after this list).
- Initiate Standby Promotion: Execute the command to promote the chosen Standby server to the Primary role. The specific command depends on the management tool used (e.g., `pg_ctl promote` or a cluster manager command).
- Reconfigure the Old Primary: Once the Standby is successfully promoted, the old Primary must be reconfigured to follow the new Primary as a Standby. This involves updating its `primary_conninfo`.
- Redirect Applications: Update the load balancer or connection pooler to direct traffic to the new Primary server.
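The following is a minimal sketch of the catch-up check and promotion, assuming direct `psql` access on the Primary and shell access on the Standby; the data directory path is illustrative.

```bash
# On the current Primary: confirm the designated standby has replayed all WAL
psql -X -c "SELECT application_name, state,
                   pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
              FROM pg_stat_replication;"

# Once replay_lag_bytes reaches 0 for the target standby, promote it
# (run on the standby host; -D points at its data directory)
pg_ctl promote -D /var/lib/postgresql/data
```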
Failover: The Emergency Response
Failover is an immediate, reactive procedure triggered when the current Primary server fails unexpectedly (e.g., hardware crash, network partition, software error) and cannot be brought back online quickly.
Failover inherently carries a higher risk of data loss because there is no guarantee that the last few committed transactions had time to stream to the Standbys before the failure occurred.
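One way to reduce that risk up front is synchronous replication. The sketch below uses placeholder standby names (`standby1` and `standby2` stand in for the standbys' `application_name` values); once applied, commits are not acknowledged until at least one listed standby has confirmed them.

```bash
# On the Primary: require confirmation from at least one named standby before
# acknowledging commits, then reload the configuration.
psql -X -c "ALTER SYSTEM SET synchronous_standby_names = 'FIRST 1 (standby1, standby2)';"
psql -X -c "SELECT pg_reload_conf();"
```

The trade-off is commit latency: every write now waits for at least one standby acknowledgment.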
Executing an Emergency Failover
Failover procedures are designed for speed and recovery, often utilizing specialized tooling to automate the promotion.
- Determine the Health of the Old Primary: Verify that the original Primary is truly unavailable and not just experiencing a transient network issue (this prevents dangerous 'split-brain' scenarios).
- Select the Best Standby: Choose the Standby with the least replication lag, i.e., the one that has received the most WAL from the failed Primary (a query for comparing candidates follows this list).
- Promote the Standby: Immediately promote the selected Standby using the promotion command (`pg_ctl promote`).
- Handle Data Loss (If Necessary): If the cluster uses asynchronous replication, transactions committed on the failed Primary but never streamed may need to be manually reconciled or simply accepted, depending on the application's tolerance.
- Reconfigure Former Primary: Once the original Primary is recovered, it must be cleaned, reinitialized (often requiring a base backup from the new Primary), and configured to follow the new Primary.
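Below is a rough sketch of the standby-selection check and the later rebuild of the failed Primary; the hostname, replication user, and data directory are placeholders.

```bash
# On each surviving standby: compare WAL positions; the node with the highest
# received LSN is usually the best promotion candidate.
psql -X -c "SELECT pg_last_wal_receive_lsn() AS received_lsn,
                   pg_last_wal_replay_lsn()  AS replayed_lsn,
                   pg_is_in_recovery()       AS in_recovery;"

# Later, rebuild the recovered old Primary as a standby of the new Primary
# (-R writes the standby configuration, including primary_conninfo)
pg_basebackup -h new-primary.example.com -U replicator -D /var/lib/postgresql/data -R -P
```

In many cases `pg_rewind` can resynchronize the old Primary faster than a full base backup, provided `wal_log_hints` or data checksums were enabled beforehand.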
Tools for Safe Promotion: Repmgr vs. Patroni
While manual promotion using pg_ctl is possible, robust HA environments rely on dedicated tools to manage the complex choreography required for failover and switchover, automatically handling configuration changes and cluster state management.
Repmgr (Replication Manager)
repmgr is a powerful, lightweight tool that monitors replication and facilitates manual or automatic failover. Key commands include:
- Switchover: `repmgr standby switchover` performs the controlled role swap, demoting the current primary and promoting the standby it is run on; the `--siblings-follow` option also repoints the remaining standbys to the new primary.
- Failover: `repmgr standby promote` promotes a standby manually once the old primary is confirmed down, while the `repmgrd` daemon can detect the failure and perform the promotion automatically. A sketch of the typical command sequence follows.
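The sequence below is only a sketch: the configuration file path is illustrative, and the options should be verified against your repmgr version.

```bash
# Controlled switchover: run on the standby that should become the new primary.
# --dry-run reports what would happen; --siblings-follow repoints the other standbys.
repmgr -f /etc/repmgr.conf standby switchover --siblings-follow --dry-run
repmgr -f /etc/repmgr.conf standby switchover --siblings-follow

# Manual failover after the primary has already died: promote this standby,
# then verify the cluster's view of the topology.
repmgr -f /etc/repmgr.conf standby promote
repmgr -f /etc/repmgr.conf cluster show
```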
Patroni
Patroni utilizes Distributed Consensus Stores (like etcd, ZooKeeper, or Consul) to manage cluster state, automatically electing a new Primary upon failure detection. Patroni largely automates both switchovers and failovers through API calls or Kubernetes operators, drastically reducing manual intervention.
Example using Patroni (Conceptual Promotion Command):
```bash
# Triggering a switchover via Patroni's REST API (node names are placeholders; port 8008 is Patroni's default)
curl -X POST http://patroni-api-endpoint:8008/switchover -H "Content-Type: application/json" \
     -d '{"leader": "current_primary_name", "candidate": "standby_node_name"}'
```
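The same operation can also be run interactively with `patronictl -c /etc/patroni.yml switchover`, which prompts for the current leader and the candidate node (the configuration path here is a placeholder).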
Warning on Split-Brain: The greatest danger during automated failover is the 'split-brain' scenario, where two nodes mistakenly believe they are the Primary due to network partitioning. Tools like Patroni mitigate this using quorum mechanisms, while manual setups require strict fencing mechanisms (like power controls) to ensure only one Primary exists.
Summary of Differences
| Feature | Switchover (Planned) | Failover (Emergency) |
|---|---|---|
| Trigger | Maintenance, upgrade, administrative choice | Primary failure (crash, outage) |
| Data Loss Risk | Near Zero (if properly timed) | Medium to High (depends on replication mode) |
| Downtime Expectation | Short, controlled maintenance window | Unplanned downtime until a Standby is promoted |
| Preparation | Requires prior coordination and WAL sync confirmation | Requires immediate action and reliance on Standby health |
Executing PostgreSQL failovers and switchovers requires a deep understanding of the cluster state and the tools managing it. While switchovers aim for zero data loss through coordination, failovers prioritize rapid service restoration, often at the expense of the very latest transactions. Proper setup of replication parameters and the utilization of robust cluster management tools are the cornerstones of reliable PostgreSQL High Availability.