Step-by-Step Guide to Deploying a RabbitMQ Active-Passive Cluster
Build a RabbitMQ active-passive setup with clustering, matching Erlang cookies, quorum queues, and a tested failover path.
RabbitMQ high availability needs more than two servers that can see each other. You need clustering for shared metadata, replicated queues for message availability, and a clear failover path for clients.
This guide shows the RabbitMQ side of an active-passive style deployment. Client failover usually comes from a load balancer, DNS change, service discovery, or a virtual IP managed outside RabbitMQ.
Prerequisites for an Active-Passive Cluster
Before beginning the configuration, ensure the following prerequisites are met across all intended cluster nodes (Node A - Active, Node B - Passive):
- Compatible Software Versions: Keep RabbitMQ Server and Erlang/OTP versions aligned across nodes. In practice, run the same RabbitMQ version on every node unless you are following RabbitMQ's documented rolling upgrade path.
- Network Accessibility: Nodes must reach each other on the AMQP port used by clients (5672 by default, 5671 with TLS), the epmd port (4369), the inter-node distribution port (25672 by default), and any management port you enable (15672 by default).
- Host Resolution: Configure the /etc/hosts file (or DNS) on all nodes so that each node can reliably resolve the hostname of every other node.
- Cookie Consistency: The Erlang 'magic cookie' must be identical on all nodes. This is crucial for the nodes to trust each other for clustering.
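The resolution and port prerequisites above can be checked before any clustering command is run. The following is a minimal sketch, not a definitive tool: the function names are hypothetical, the port list assumes RabbitMQ's default ports, and it assumes `getent` and `nc` are installed on the node.

```shell
#!/bin/sh
# Hypothetical preflight helpers; the port list assumes default RabbitMQ ports.
#   4369  epmd
#   5672  AMQP clients
#   25672 inter-node distribution
CLUSTER_PORTS="4369 5672 25672"

# Succeeds if the name resolves through /etc/hosts or DNS.
can_resolve() {
    getent hosts "$1" >/dev/null 2>&1
}

# Succeeds if a TCP connection to host:port opens within 3 seconds.
port_open() {
    nc -z -w 3 "$1" "$2" >/dev/null 2>&1
}

# Checks one peer: name resolution first, then each port in CLUSTER_PORTS.
preflight() {
    peer="$1"
    rc=0
    if ! can_resolve "$peer"; then
        echo "FAIL resolve $peer"
        return 1
    fi
    for port in $CLUSTER_PORTS; do
        if port_open "$peer" "$port"; then
            echo "ok   $peer:$port"
        else
            echo "FAIL $peer:$port"
            rc=1
        fi
    done
    return $rc
}

# Example, run on Node A:
#   preflight rabbitmq-node-b
```

Running `preflight` from each node against the other catches one-way firewall rules, which are easy to miss when only one direction is tested.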
Establishing Cookie Consistency
The Erlang cookie is a shared secret that nodes and CLI tools use to authenticate to each other; nodes with different cookies cannot form a cluster. Copy it from the first initialized node to all others.
On Node A (The first node):
Locate the cookie file (usually /var/lib/rabbitmq/.erlang.cookie or ~/.erlang.cookie depending on the installation method) and copy its contents.
On Node B (and subsequent nodes):
- Stop the RabbitMQ service:
sudo systemctl stop rabbitmq-server
- Replace the existing cookie file with the content copied from Node A, ensuring correct permissions (usually 400).
# Example using echo (replace content as needed)
echo "YOUR_LONG_COOKIE_STRING" | sudo tee /var/lib/rabbitmq/.erlang.cookie
sudo chmod 400 /var/lib/rabbitmq/.erlang.cookie
- Start the service on Node B:
sudo systemctl start rabbitmq-server
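To confirm the cookies match without printing the secret, one option is to compare a hash of the file on each node. This is a sketch under stated assumptions: the function names are hypothetical, `sha256sum` is assumed to be available, and line endings are stripped before hashing on the assumption that a trailing newline (such as the one `echo` adds) is not significant to the Erlang runtime.

```shell
#!/bin/sh
# Hypothetical helpers for comparing Erlang cookies by fingerprint.

# Prints a SHA-256 fingerprint of the cookie file, ignoring CR/LF characters.
cookie_fingerprint() {
    tr -d '\r\n' < "$1" | sha256sum | cut -d' ' -f1
}

# Succeeds when both files are readable and hold the same cookie.
# Returns 2 if either file cannot be read.
same_cookie() {
    [ -r "$1" ] && [ -r "$2" ] || return 2
    [ "$(cookie_fingerprint "$1")" = "$(cookie_fingerprint "$2")" ]
}

# Example, run as root on each node and compare the printed values:
#   cookie_fingerprint /var/lib/rabbitmq/.erlang.cookie
```

Comparing fingerprints instead of raw contents keeps the cookie out of terminal scrollback and chat logs.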
Step 1: Configuring Hostnames and Networking
Ensure that /etc/hosts on both Node A and Node B maps each node's hostname to its IP address.
Example /etc/hosts (on both servers):
192.168.1.10 rabbitmq-node-a
192.168.1.11 rabbitmq-node-b
Step 2: Initializing the First Cluster Node (Active)
Node A will be the initial primary node, where the cluster is first established.
- Start the service on Node A (if not already running):
sudo systemctl start rabbitmq-server
- Verify Status: Ensure the node is running correctly.
rabbitmqctl status
Step 3: Joining the Second Node (Passive) to the Cluster
Now, we instruct Node B to join the cluster led by Node A.
- Stop the RabbitMQ application on Node B while keeping the Erlang node available:
sudo rabbitmqctl stop_app
- Reset Node B's local state if it has already been initialized as a standalone node:
sudo rabbitmqctl reset
- Join Command: Execute the join command on Node B, specifying the hostname of Node A as the peer.
sudo rabbitmqctl join_cluster rabbit@rabbitmq-node-a
Tip: Use the hostname defined in /etc/hosts.
- Start the RabbitMQ application on Node B:
sudo rabbitmqctl start_app
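The four commands in this step can be wrapped in a small function so that a failure stops the sequence instead of continuing with a half-joined node. This is only a sketch: `join_cluster` and the `RABBITMQCTL` variable are hypothetical names, and the reset is always run here, which wipes Node B's local broker state.

```shell
#!/bin/sh
# Hypothetical wrapper around the join sequence for Node B.
# Override RABBITMQCTL for a dry run, e.g. RABBITMQCTL="echo rabbitmqctl".
RABBITMQCTL="${RABBITMQCTL:-sudo rabbitmqctl}"

join_cluster() {
    peer="$1"   # e.g. rabbit@rabbitmq-node-a
    case "$peer" in
        ?*@?*) ;;
        *) echo "peer must look like name@hostname" >&2; return 2 ;;
    esac
    # Each command runs only if the previous one succeeded.
    # WARNING: reset discards this node's local data.
    $RABBITMQCTL stop_app &&
    $RABBITMQCTL reset &&
    $RABBITMQCTL join_cluster "$peer" &&
    $RABBITMQCTL start_app
}

# Example, run on Node B:
#   join_cluster rabbit@rabbitmq-node-a
```

A dry run with `RABBITMQCTL="echo rabbitmqctl"` prints the commands in order without touching the broker, which is a cheap way to review the sequence first.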
Step 4: Verifying Cluster Formation
Log in to Node A and verify that both nodes recognize each other.
rabbitmqctl cluster_status
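Expected Output Snippet:
You should see both rabbitmq-node-a and rabbitmq-node-b listed as running nodes. Older RabbitMQ releases print Erlang terms like the snippet below; RabbitMQ 3.8 and later print headed sections such as Running Nodes instead.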
Cluster status of node rabbit@rabbitmq-node-a ...
[{nodes,[{disc,[rabbit@rabbitmq-node-a,rabbit@rabbitmq-node-b]}]},
{running_nodes,[rabbit@rabbitmq-node-a,rabbit@rabbitmq-node-b]},
...
]
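For scripted checks, the running-node list can be pulled out of that text. The sketch below assumes the Erlang-term layout shown in the snippet above, with the whole running_nodes entry on one line; it will not match the sectioned output of newer releases. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical parsers for the Erlang-term cluster_status layout.

# Reads cluster_status text on stdin and prints one running node per line.
running_nodes() {
    sed -n 's/.*{running_nodes,\[\([^]]*\)]}.*/\1/p' | tr ',' '\n'
}

# Reads cluster_status text on stdin; succeeds only if every node name
# passed as an argument appears in the running list.
all_running() {
    listed="$(running_nodes)"
    for node in "$@"; do
        printf '%s\n' "$listed" | grep -qxF "$node" || return 1
    done
}

# Example:
#   rabbitmqctl cluster_status | all_running rabbit@rabbitmq-node-a rabbit@rabbitmq-node-b
```

Matching whole lines with `grep -x` avoids a false positive when one node name is a prefix of another.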
Step 5: Configuring High Availability for Queues
Standard RabbitMQ clustering shares metadata such as users, exchanges, bindings, and policies. Queue contents need a replicated queue type if you want messages to survive node failure.
For modern RabbitMQ deployments, use quorum queues for replicated durable queues. Classic mirrored queues used ha-mode policies in older RabbitMQ releases, but that feature was deprecated and has been removed as of RabbitMQ 4.0.
Declare a Quorum Queue
You can declare quorum queues from your application or with rabbitmqadmin. This example creates a durable quorum queue:
rabbitmqadmin declare queue name=orders durable=true arguments='{"x-queue-type":"quorum"}'
For two-node labs, a quorum queue can run, but it cannot tolerate the loss of one node and still keep a majority. For production, use at least three RabbitMQ nodes for quorum queues so one node can fail while the queue still has a majority.
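The majority arithmetic behind that recommendation is short enough to work through. The sketch below uses hypothetical function names and simply applies the usual majority rule (more than half of the members) to a few cluster sizes.

```shell
#!/bin/sh
# Smallest number of members that forms a majority of N.
majority() {
    echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while a majority remains.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 2 3 5; do
    echo "nodes=$n majority=$(majority "$n") tolerated_failures=$(tolerated_failures "$n")"
done
# nodes=2 majority=2 tolerated_failures=0
# nodes=3 majority=2 tolerated_failures=1
# nodes=5 majority=3 tolerated_failures=2
```

Two nodes need both members for a majority, so adding the second node buys replication but no failure tolerance; the third node is the first one that does.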
Step 6: Test Failover
Before calling the cluster ready, test the path your clients will use:
- Publish a few persistent test messages to a quorum queue.
- Stop the active node's RabbitMQ application with sudo rabbitmqctl stop_app.
- Confirm clients reconnect through your load balancer, DNS target, or service discovery setup.
- Consume the test messages from the surviving node. This step needs at least three nodes: in a two-node cluster the quorum queue loses its majority and stays unavailable until the stopped node returns.
- Start the stopped application again with sudo rabbitmqctl start_app and check rabbitmqctl cluster_status.
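When scripting the test steps above, the reconnect check usually needs a retry loop, because the load balancer or DNS target takes some time to switch over. This is a minimal sketch with a hypothetical `retry` helper; the hostname in the example is a placeholder for whatever endpoint your clients actually use.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        # Sleep between attempts, but not after the final one.
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Example: wait up to about 60 seconds for the client-facing endpoint
# (placeholder hostname) to accept AMQP connections again.
#   retry 30 2 nc -z -w 3 rabbitmq.example.internal 5672
```

Pointing the check at the client-facing endpoint instead of a node hostname tests the same connection path the applications use.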
Final Takeaway
RabbitMQ clustering gives you shared broker metadata, but queue availability depends on the queue type and client failover design. Use quorum queues for replicated durable queues, keep at least three nodes for real fault tolerance, and test failover with the same connection path your applications use.