Troubleshooting Common Elasticsearch Cluster Split-Brain Scenarios
Learn to diagnose and resolve critical Elasticsearch split-brain issues. This guide covers common causes like network partitions and incorrect quorum configurations. Discover practical diagnostic steps, including network checks and log analysis, and follow a clear, step-by-step resolution process to restore cluster stability. Implement prevention strategies to safeguard your Elasticsearch deployment against future split-brain incidents.
Split brain is the Elasticsearch failure people talk about because it sounds dramatic, but the useful question is more practical: can more than one part of the cluster make master-level decisions at the same time? Modern Elasticsearch versions are designed to prevent that through majority-based cluster coordination. Older clusters, especially pre-7.0 clusters with a bad discovery.zen.minimum_master_nodes setting, were easier to misconfigure.
So this article separates two situations that often get mixed together. A true split brain means independent partitions can each elect or keep a master. A master election outage means the cluster cannot elect a master because it does not have a majority. The first risks conflicting cluster state and data inconsistency. The second is painful, but it is usually the safer failure mode.
What split brain looks like
In a healthy cluster, one elected master manages cluster state: index creation, shard allocation decisions, mappings, node membership, and similar metadata. Data nodes can handle reads and writes, but the cluster still depends on a single master view of the world.
A split-brain scenario happens when network partitions or bad discovery settings let two groups of nodes behave as if each group is the real cluster. One side might accept writes to an index while the other side accepts different writes. When connectivity returns, Elasticsearch cannot simply merge the two conflicting histories the way you might merge two versions of a text file.
In modern Elasticsearch, if a partition does not have a majority of master-eligible nodes, it should not elect a master. That means some nodes may become unavailable instead of forming a competing cluster. That is the behavior you want.
Version matters
For Elasticsearch 6.x and older, the key setting was:
discovery.zen.minimum_master_nodes: 2
The rule was a majority of master-eligible nodes: floor(N / 2) + 1, meaning divide N by two, round down, then add one. With three master-eligible nodes, set it to 2. With five, set it to 3. Setting it to 1 in a three-node cluster made split brain possible.
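As a quick check of that arithmetic, the following Python sketch computes the majority for a given number of master-eligible nodes. The function name quorum is illustrative and is not part of any Elasticsearch API.

```python
def quorum(master_eligible: int) -> int:
    """Smallest number of master-eligible nodes that forms a majority."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    # Integer division rounds down, then one is added.
    return master_eligible // 2 + 1


for n in (1, 2, 3, 4, 5):
    print(f"{n} master-eligible -> majority of {quorum(n)}")
```

Note that four nodes need three votes, so an even count tolerates no more failures than the odd count just below it.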
For Elasticsearch 7.x and later, discovery.zen.minimum_master_nodes no longer controls elections: 7.x ignores it and 8.0 removes it. Cluster coordination changed, and Elasticsearch now manages the voting configuration itself. New clusters still need correct bootstrapping with cluster.initial_master_nodes, but that setting is only for the first cluster formation. After the cluster forms, remove it from configuration.
Do not “fix” a modern cluster by adding old discovery.zen settings. They are not the control plane anymore.
Common causes
The most common trigger is a network partition between master-eligible nodes. In cloud terms, that might be a security group change, a bad route table, a network ACL, a zone-level problem, or a firewall rule that blocks transport port 9300. In bare metal environments, it might be a switch, VLAN, DNS, MTU, or packet-loss problem.
Another cause is running too few master-eligible nodes. Two master-eligible nodes are awkward because there is no clean majority after one fails. A production cluster normally uses three dedicated master-eligible nodes, or three mixed-role nodes in a small deployment.
A third cause is stale or reused data directories. If you clone VMs or reuse disks from old clusters, nodes may carry cluster metadata you did not intend. That can lead to confusing join failures and, in the worst cases, accidental formation of a separate cluster.
Finally, manual recovery under pressure can make the problem worse. Rebooting random nodes, wiping data paths, or forcing unsafe allocation before you know which partition is authoritative can turn a recoverable incident into data loss.
First checks during an incident
Start by asking every reachable node what it thinks:
curl -s "http://NODE:9200/_cat/master?v"
curl -s "http://NODE:9200/_cat/nodes?v&h=ip,name,node.role,master"
curl -s "http://NODE:9200/_cluster/health?pretty"
Run those against more than one node if possible. If different nodes report different masters or different node membership, you may be looking at a partitioned cluster or isolated nodes.
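The comparison can be scripted. The sketch below is one possible approach in Python, assuming each node answers unauthenticated HTTP on port 9200 and that _cat/master?format=json returns a one-element list with a node field. The host names and function names are hypothetical.

```python
import json
import urllib.request
from typing import Dict, Optional


def fetch_master(host: str, timeout: float = 5.0) -> Optional[str]:
    """Ask one node who it thinks the master is; None means no usable answer."""
    url = f"http://{host}:9200/_cat/master?format=json"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            rows = json.load(resp)
        return rows[0]["node"] if rows else None
    except (OSError, ValueError, KeyError, IndexError, TypeError):
        # Unreachable node, HTTP error status, or an unexpected reply shape.
        return None


def summarize_views(views: Dict[str, Optional[str]]) -> str:
    """Classify what a set of per-node master reports implies."""
    masters = {m for m in views.values() if m is not None}
    if len(masters) > 1:
        return "conflict"    # nodes name different masters
    if not masters:
        return "no-master"   # nobody reports a master
    if None in views.values():
        return "partial"     # one master, but some nodes cannot see it
    return "consistent"


# Hard-coded example views; in practice build the dict with
# {host: fetch_master(host) for host in hosts}.
example = {"node-1": "node-1", "node-2": "node-3", "node-3": "node-3"}
print(summarize_views(example))  # prints "conflict"
```

A "conflict" result is the one that deserves immediate attention. A "partial" result more often points at isolated nodes than at two competing masters.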
Check the logs on master-eligible nodes for messages about elections, joins, disconnects, and publication failures. Useful search terms include master not discovered, elected-as-master, node-left, node-join, publication, ConnectTransportException, and handshake.
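If you want a quick tally instead of reading raw grep output, a small helper along these lines may be enough. It is a sketch only: the term list mirrors the search terms above, matching is a plain case-insensitive substring test, and the function name is made up for this example.

```python
from collections import Counter
from typing import Iterable

SEARCH_TERMS = (
    "master not discovered",
    "elected-as-master",
    "node-left",
    "node-join",
    "publication",
    "connect_transport_exception",
    "ConnectTransportException",  # Java class name as it appears in stack traces
    "handshake",
)


def count_terms(lines: Iterable[str]) -> Counter:
    """Count how many lines mention each search term (case-insensitive)."""
    counts: Counter = Counter()
    for line in lines:
        lowered = line.lower()
        for term in SEARCH_TERMS:
            if term.lower() in lowered:
                counts[term] += 1
    return counts
```

To use it, pass an open log file, for example count_terms(open(path, encoding="utf-8", errors="replace")), where path points at the node's log. Comparing the tallies across master-eligible nodes can show which node saw disconnects first.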
Then test transport connectivity, not just HTTP:
nc -vz node-1.example.internal 9300
nc -vz node-2.example.internal 9300
nc -vz node-3.example.internal 9300
Run those tests from node to node. A load balancer or bastion reaching HTTP port 9200 tells you very little about whether Elasticsearch nodes can form a cluster over 9300.
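Where nc is not installed, the same check can be approximated with Python's standard library. This is a minimal sketch: it only proves that a TCP connection opens, not that the Elasticsearch transport handshake (TLS, cluster name, version) succeeds. The function names are hypothetical.

```python
import socket
from typing import Dict, Iterable


def can_connect(host: str, port: int = 9300, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_peers(peers: Iterable[str], port: int = 9300) -> Dict[str, bool]:
    """Map each peer host to whether its transport port is reachable."""
    return {peer: can_connect(peer, port) for peer in peers}
```

Run check_peers on every master-eligible node with the other nodes' addresses as input. An asymmetric result, where A reaches B but B cannot reach A, is worth noting in the incident timeline.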
Check discovery and bootstrap configuration
On Elasticsearch 7.x and later, inspect these settings:
cluster.name: my-cluster
discovery.seed_hosts:
- node-1:9300
- node-2:9300
- node-3:9300
For a brand-new cluster only:
cluster.initial_master_nodes:
- node-1
- node-2
- node-3
The bootstrap names must match node.name. After the cluster forms, remove cluster.initial_master_nodes from all nodes.
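Because the match is exact, a pre-flight comparison can catch mismatches before the first start. The helper below is an illustrative sketch; the node names are hypothetical, and how you collect the real node.name values (configuration management, inventory, or reading each elasticsearch.yml) is left open.

```python
from typing import Iterable, List


def bootstrap_mismatches(initial_master_nodes: Iterable[str],
                         node_names: Iterable[str]) -> List[str]:
    """Return bootstrap entries that do not exactly match any node.name."""
    known = set(node_names)
    return [name for name in initial_master_nodes if name not in known]


configured = ["node-1", "node-2", "node-3"]
actual = ["node-1", "node-2.example.internal", "node-3"]  # hypothetical node.name values
print(bootstrap_mismatches(configured, actual))  # ['node-2']
```

An empty list means every bootstrap entry has a matching node name. Anything else should be fixed before the cluster is started for the first time.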
On Elasticsearch 6.x and older, check:
discovery.zen.minimum_master_nodes: 2
That value suits a cluster with three master-eligible nodes. Also confirm that all master-eligible nodes have consistent discovery hosts and cluster names.
Recovery principles
If you suspect true split brain, stop making changes through the cluster API until you know which side is authoritative. The safest recovery usually follows this order:
- Preserve evidence: collect logs, node lists, master views, and index health from each partition.
- Restore network connectivity or intentionally isolate the bad side.
- Choose the authoritative partition based on majority, latest valid data, and business impact.
- Stop Elasticsearch on nodes that should not continue as an independent partition.
- Bring nodes back one at a time and verify they join the authoritative cluster.
- Restore missing data from snapshots if any primary shard history is lost or inconsistent.
Do not delete translog directories as a routine split-brain fix. That is dangerous advice. Translogs are part of Elasticsearch recovery. Removing files under the data path manually can cause data loss and should only be done with version-specific guidance from Elastic support or after you have accepted the loss and have a rebuild plan.
If two partitions accepted writes independently, there may be no perfect automatic merge. You may need to restore from snapshot, reindex from source systems, replay application logs, or choose one side’s data as authoritative.
A realistic example
Imagine a three-node cluster across three private subnets. A firewall change accidentally blocks transport traffic between node 1 and nodes 2 and 3. Nodes 2 and 3 still see each other, so they keep or elect a master. Node 1 cannot see a majority. In a modern, correctly configured cluster, node 1 should not form a competing master by itself. Clients using node 1 may fail, but the cluster avoids conflicting masters.
Now imagine an old 6.x cluster with three master-eligible nodes and discovery.zen.minimum_master_nodes: 1. Node 1 can elect itself, while nodes 2 and 3 elect another master. That is the classic split-brain risk. The fix is not just reconnecting the network. You also need to correct quorum configuration and decide how to handle any writes accepted on the wrong side.
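Both scenarios come down to a size check per partition. The following Python sketch is a toy model of that arithmetic only, not of the real election protocol, and all names in it are invented for illustration.

```python
from typing import Callable, List


def majority_rule(voting_nodes: int) -> Callable[[int], bool]:
    """Modern behaviour in this toy model: strictly more than half the votes."""
    return lambda partition_size: 2 * partition_size > voting_nodes


def zen_rule(minimum_master_nodes: int) -> Callable[[int], bool]:
    """Pre-7.0 behaviour: at least minimum_master_nodes master-eligible nodes."""
    return lambda partition_size: partition_size >= minimum_master_nodes


def partitions_with_master(sizes: List[int],
                           rule: Callable[[int], bool]) -> List[int]:
    """Sizes of the partitions that are allowed to elect a master."""
    return [size for size in sizes if rule(size)]


split = [1, 2]  # node 1 alone, nodes 2 and 3 together
print(partitions_with_master(split, majority_rule(3)))  # [2]
print(partitions_with_master(split, zen_rule(1)))       # [1, 2]: two masters
print(partitions_with_master(split, zen_rule(2)))       # [2]
```

Only the misconfigured case returns more than one partition, which is the split-brain condition described above.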
Prevention
Use three master-eligible nodes for small and medium clusters. For larger clusters, make them dedicated master nodes so search and indexing load does not interfere with cluster coordination.
Keep master-eligible nodes on reliable networks with low packet loss. Spreading nodes across zones can help availability, but only if the network between zones is dependable and the quorum design still makes sense.
Monitor master changes. A master election during planned maintenance is normal. Frequent elections during normal traffic are a warning sign.
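One way to turn that into an alert is to poll the elected master periodically and count changes inside a time window. The sketch below assumes you already collect (timestamp, master name) observations, for example from _cat/master. The window and threshold are arbitrary placeholders to tune for your environment.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Observation = Tuple[datetime, Optional[str]]  # (poll time, reported master or None)
Change = Tuple[datetime, Optional[str], Optional[str]]


def master_changes(observations: List[Observation]) -> List[Change]:
    """Return (time, old_master, new_master) for every change between polls."""
    changes = []
    for (_, old), (ts, new) in zip(observations, observations[1:]):
        if new != old:
            changes.append((ts, old, new))
    return changes


def is_flapping(observations: List[Observation],
                window: timedelta, threshold: int) -> bool:
    """True if at least `threshold` changes fall within `window` of the last poll."""
    if not observations:
        return False
    latest = observations[-1][0]
    recent = [c for c in master_changes(observations) if latest - c[0] <= window]
    return len(recent) >= threshold
```

A transition to or from None (no master reported) counts as a change here, which is deliberate: losing the master entirely is at least as interesting as switching to a new one.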
Monitor transport connectivity and not only HTTP uptime. A node can answer 9200 and still fail to participate correctly in the cluster if transport traffic is blocked.
Snapshot regularly and test restores. Replicas do not protect you from a bad delete, corrupted data, or conflicting writes during a serious incident.
Be careful with bootstrap settings. On modern clusters, cluster.initial_master_nodes is not an everyday discovery setting. Use it to create a new cluster, then remove it.
The best split-brain recovery is the one you never need: majority-based master eligibility, correct version-specific discovery settings, boring network rules, and a tested snapshot plan.
How to tell split brain from a normal election
A master election is not automatically split brain. During a rolling restart, network flap, or master node failure, Elasticsearch may elect a new master. If the cluster keeps one authoritative master and the old master steps down, that is normal distributed-system behavior.
Warning signs are different views from different nodes. If node A reports itself as master while node B reports node C as master, stop and investigate. If two groups of nodes both accept cluster-state changes while disconnected, you have a much more serious situation than a brief election.
Also watch client behavior. Clients pinned to an isolated node may see failures even while the majority side is healthy. That does not mean the majority cluster is broken. It may mean your client connection strategy or load balancer is still sending traffic to a node that cannot participate.
Load balancers and client routing
Elasticsearch transport discovery is not the same as client HTTP routing. Do not put master discovery behind a generic HTTP load balancer and expect it to solve cluster membership. Nodes need transport connectivity to each other.
For clients, use multiple HTTP endpoints or a load balancer that removes unhealthy nodes quickly. A node that has lost its master can keep listening on port 9200, but it is not a good target for writes. Health checks should be more meaningful than “port 9200 is open.”
A practical HTTP health check might query cluster health locally and reject nodes that do not have a discovered master. The exact approach depends on your client and infrastructure, but the principle is simple: do not keep sending writes to isolated nodes.
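As a rough illustration of that principle, the Python sketch below probes _cluster/health with a short master_timeout and treats anything other than a well-formed 200 reply as unhealthy. It assumes unauthenticated HTTP on port 9200. The timeout values and function names are placeholders, not recommendations.

```python
import json
import urllib.request


def accepts_writes(status_code: int, body: dict) -> bool:
    """Decide from a cluster-health reply whether a node should get traffic."""
    if status_code != 200:
        # Any error reply fails, including one returned because
        # no master was discovered in time.
        return False
    # The colour is deliberately not judged here, so a red index
    # elsewhere does not drain every node from the pool.
    return body.get("status") in ("green", "yellow", "red")


def probe(host: str, timeout: float = 3.0) -> bool:
    """Query one node's view of cluster health; False on any failure."""
    url = f"http://{host}:9200/_cluster/health?master_timeout=2s"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return accepts_writes(resp.status, json.load(resp))
    except (OSError, ValueError):
        # Connection failure, timeout, HTTP error status, or unparsable body.
        return False
```

Keeping the decision in accepts_writes separate from the network call makes the policy easy to test and to adapt, for example if you later decide that a red status should also take a node out of rotation.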
Post-incident cleanup
After the cluster is stable, compare index health, document counts, and application-level source-of-truth counts. If there was any chance of writes landing on different partitions, Elasticsearch health alone cannot prove the data is semantically correct.
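The count comparison is simple to automate once both sets of numbers are in hand. This sketch assumes you have already gathered per-index document counts from Elasticsearch (for instance with the _count API) and from your source of truth. The index names in the example are hypothetical.

```python
from typing import Dict, Optional, Tuple


def count_mismatches(
    es_counts: Dict[str, int], source_counts: Dict[str, int]
) -> Dict[str, Tuple[Optional[int], Optional[int]]]:
    """Return {index: (es_count, source_count)} wherever the two sides differ.

    An index missing on one side is reported with None for that side.
    """
    result = {}
    for index in sorted(set(es_counts) | set(source_counts)):
        es = es_counts.get(index)
        src = source_counts.get(index)
        if es != src:
            result[index] = (es, src)
    return result


es = {"orders": 1000, "users": 250}
source = {"orders": 1004, "users": 250, "audit": 12}
print(count_mismatches(es, source))
# {'audit': (None, 12), 'orders': (1000, 1004)}
```

Matching counts do not prove the documents are identical, so treat a clean result as a first filter before spot-checking content.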
Review the timeline. Which node lost connectivity first? Which node was master before the event? Did clients continue writing? Were snapshots current? Did alerts fire before users noticed? These details determine whether you need only a network fix or a data reconciliation plan.
For older clusters, schedule the version and discovery-setting work. If you are still running a version that depends on discovery.zen.minimum_master_nodes, make sure it is correct today, then plan an upgrade path. Split-brain prevention is not a one-time runbook step; it is part of cluster lifecycle management.
Configuration changes to avoid during panic
Do not change cluster.name to make nodes join. That creates a different cluster identity problem.
Do not wipe data paths unless you are intentionally discarding the node’s local shard copies and have confirmed the cluster has valid copies elsewhere or snapshots available.
Do not add cluster.initial_master_nodes back to an existing modern cluster as a general restart fix. That setting is for initial bootstrap, not routine discovery.
Do not lower quorum-style protections on old clusters to restore availability. Making a minority partition writable may feel like progress, but it is exactly how conflicting masters become possible.
Designing for awkward failure domains
Three master-eligible nodes work best when no single infrastructure event can remove two of them. In a three-zone cloud region, one master-eligible node per zone is a common pattern. In a two-zone environment, placement is harder because one zone may contain two votes. If that larger zone fails, the remaining single vote cannot safely elect a master. That is not Elasticsearch being fragile; that is majority math.
Do not solve this by adding an even number of voting nodes without thinking through failure modes. Four master-eligible nodes still require a majority, and a two-two partition cannot safely choose a side. Dedicated voting-only nodes can help in some designs, but the principle stays the same: the cluster needs a reliable majority to make cluster-state decisions.
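The majority math for a given placement can be checked before anything is deployed. The sketch below takes the number of voting nodes per failure domain and reports whether the cluster keeps a majority when each domain is lost in turn. Zone names and the function name are hypothetical.

```python
from typing import Dict


def survives_zone_loss(votes_per_zone: Dict[str, int]) -> Dict[str, bool]:
    """For each zone, report whether the remaining votes still form a majority."""
    total = sum(votes_per_zone.values())
    return {
        zone: 2 * (total - votes) > total
        for zone, votes in votes_per_zone.items()
    }


print(survives_zone_loss({"zone-a": 1, "zone-b": 1, "zone-c": 1}))
# {'zone-a': True, 'zone-b': True, 'zone-c': True}
print(survives_zone_loss({"zone-a": 2, "zone-b": 1}))
# {'zone-a': False, 'zone-b': True}
print(survives_zone_loss({"zone-a": 2, "zone-b": 2}))
# {'zone-a': False, 'zone-b': False}
```

Any False in the result marks a failure domain whose loss stops master elections, which is the case to document in the architecture notes.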
Write this down in the architecture notes. During an outage, people often ask why the surviving node or surviving zone cannot just keep serving writes. The answer should be clear before the incident: accepting writes without a majority risks conflicting history.