Understanding and configuring Windows Server Failover Cluster Quorum for SQL Server Always On


What is quorum?

SQL Server availability solutions such as Always On Availability Groups (AGs) and Failover Cluster Instances (FCIs) depend on Windows Server Failover Cluster (WSFC) for high availability and failover operations. WSFC failover is driven by quorum. Generally, quorum is the minimum number of members that must be present for a group to function. In a Windows cluster, quorum determines the minimum number of nodes required for the cluster to be up and running.

In a Windows Server cluster, more than half of the voting members must be online for the cluster to be up and running. For example, in a 4-node cluster you need (4 / 2) + 1 = 3 nodes for the cluster to work. This is called quorum in WSFC.
Quorum and Roles:

Windows Server is designed so that the quorum mechanism knows which node is active and which nodes are on standby. At any given point in time, there cannot be more than one active node for a resource group in a Windows Server Failover Cluster. The quorum is aware of the active node and, in case of a failover, it decides the next active node and the standby nodes. In AG and FCI terms, "active" means the node that owns the resource group at a given point in time.

Quorum and Partitioned Clusters:

In a Windows Server Failover Cluster, each node communicates with the other nodes over a dedicated network connection. When there is a communication failure between nodes, the cluster splits into partitions, and each partition assumes the other nodes are down and tries to host the resource group (become active) to keep the system up and running. Because more than one node cannot be active at a given point in time in a WSFC, this conflict is called a split-brain scenario.

Quorum is designed to be aware of this communication, and if the cluster is partitioned due to a network failure or some other issue, it intervenes to prevent a split-brain scenario. The partition holding the quorum majority owns the resource group, while the cluster service is force-stopped on the nodes of the other partitions and those nodes are removed from the WSFC.

WSFC is designed so that once communication from the removed nodes is re-established, either manually or automatically, and they can reach the current cluster nodes again, they automatically rejoin the cluster and start their cluster service.

How Windows Server Failover Cluster Quorum works:

Let's assume the cluster is partitioned due to a network failure; each partition will try to own the resource group. To achieve this, each subset of nodes has to prove quorum majority.

For example, in a 5-node cluster that gets partitioned, each subset will try to prove quorum majority, and the subset with 3 nodes will own the resources. This is how cluster quorum avoids the split-brain situation.

This works well when you have an odd number of nodes, but what happens when you have an even number of nodes? For example, with a 4-node cluster, a network failure can partition the cluster into 2 subsets of 2 nodes each. That is a 50/50 split: each subset considers itself the majority, neither can actually prove majority, and hence the cluster will go down.

How does the cluster manage quorum when you have an even number of nodes? There are 2 options:

Add a cluster witness to increase the vote count by one, or set a node's vote to zero, so that the total number of votes becomes an odd number.

Starting with Windows Server 2012, quorum votes are adjusted automatically using Dynamic Quorum. However, a cluster witness must still be added to the cluster manually.
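A rough PowerShell sketch of these two options (the FailoverClusters module is assumed; the cluster, node and share names are placeholders):

Import-Module FailoverClusters

# Option 1: add a file share witness so the total vote count becomes odd.
Set-ClusterQuorum -Cluster "SQLCLUSTER01" -NodeAndFileShareMajority "\\FILESERVER\ClusterWitness"

# Option 2: remove the vote of one node so the remaining votes add up to an odd number.
(Get-ClusterNode -Cluster "SQLCLUSTER01" -Name "Node4").NodeWeight = 0

# Check the resulting votes per node.
Get-ClusterNode -Cluster "SQLCLUSTER01" | Format-Table Name, State, NodeWeight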

Dynamic Quorum:

After quorum majority is verified, the new majority definition is updated among the remaining cluster nodes. This concept of recalculating the total number of votes after each successive failure is known as Dynamic Quorum. It allows the cluster to lose one node, then another, then another, down to the last standing member, with the majority definition updated dynamically on each loss.

For example, in a 3-node cluster, if one node fails, the subset of 2 nodes survives and the failed node is removed from the cluster. The quorum needed for the cluster to stay up is more than half of the votes; with only 2 nodes left, any further split would be 50/50. The cluster therefore automatically zeroes the vote of one of the remaining nodes, so the other node holds the majority with 1 vote out of 1. This is called dynamic quorum.

Dynamic Witness:

Starting with Windows Server 2012 R2, the vote of the cluster witness is calculated dynamically: when there is an odd number of voting nodes, the witness does not get a vote, and when there is an even number of nodes, the witness gets a vote so that the total number of votes stays odd. This process of dynamically deciding the witness vote is called Dynamic Witness.

For example, consider a 5-node cluster with a cluster witness. By default, each node and the witness would have 1 vote each, but because the 5 node votes already add up to an odd number, the witness vote is set to 0. Now assume one node fails and is removed from the cluster: the node votes drop to 4, so the cluster automatically gives the witness a vote of 1 to make the total number of votes odd again. This process is called dynamic witness.
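To see dynamic quorum and dynamic witness in action, you can inspect the current vote assignments with the cluster cmdlets. A small sketch, assuming Windows Server 2012 R2 or later, where the DynamicWeight and WitnessDynamicWeight properties are exposed:

Import-Module FailoverClusters

# NodeWeight is the configured vote; DynamicWeight is the vote currently
# assigned by dynamic quorum (0 or 1).
Get-ClusterNode | Format-Table Name, State, NodeWeight, DynamicWeight

# Witness vote as currently decided by dynamic witness: 1 = voting, 0 = not voting.
(Get-Cluster).WitnessDynamicWeight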

Quorum Witness Types:

The Windows Server Failover Cluster (WSFC) supports the following 3 types of quorum witnesses:

Disk Witness:

A small clustered storage disk (typically around 1 GB) attached to the cluster. The quorum disk must be accessible from all nodes in the cluster and highly available. The disk witness keeps a copy of the cluster database.

File Share Witness:

A shared folder on an external server that is accessible from all the nodes in the cluster, usually a Windows file share or a folder on a domain controller. It should be reliable, and the DBA should be aware of any access-related or maintenance-related changes to the file share. The file share witness maintains clustering information in a witness.log file but does not store a copy of the cluster database.
  
Cloud Witness:

Introduced in Windows Server 2016, the cloud witness is an Azure Blob Storage account that is accessible from all the nodes in the cluster. Like the file share witness, it keeps only a small amount of clustering information and does not store a copy of the cluster database.
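Each witness type can be configured with the Set-ClusterQuorum cmdlet. A minimal sketch, assuming a recent version of the FailoverClusters module; the disk resource name, share path, storage account name and access key below are placeholders:

Import-Module FailoverClusters

# Disk witness: a small clustered disk, referenced by its cluster resource name.
Set-ClusterQuorum -DiskWitness "Cluster Disk 2"

# File share witness: a share that every node can reach.
Set-ClusterQuorum -FileShareWitness "\\FILESERVER\ClusterWitness"

# Cloud witness (Windows Server 2016 and later): an Azure storage account.
Set-ClusterQuorum -CloudWitness -AccountName "mystorageaccount" -AccessKey "<storage-account-key>"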

Quorum Models:

Choosing the correct quorum model is a critical decision for your availability solution. As discussed earlier, Windows Server provides several quorum model combinations. Let's look at what works best for your availability solution.
  • Node Majority
  • Node & Disk Majority
  • Node & File Share Majority
  • Node & Cloud Storage Majority
  • Disk-only - No Majority.
Let's look at each model in detail.

Node Majority:

Quorum majority is calculated based on the number of active nodes in the cluster. By default, each node has 1 vote, and no witness is configured. This is the recommended quorum mode for Availability Groups (AGs) and Failover Cluster Instances (FCIs) when there is an odd number of nodes.

Node & Disk Majority:

Quorum majority is calculated from the number of active nodes in the cluster plus a shared disk cluster resource. Connectivity from any node to the disk resource counts as an affirmative vote for the disk. The disk should be at least 512 MB in size, excluded from anti-virus scanning, and able to fail over as a stand-alone resource. This model is recommended for Failover Cluster Instances (FCIs) but not for Availability Groups (AGs).

Node & File Share Majority:

In this model, voting is based on the number of active nodes in the cluster plus a file share resource. By default, each node has a vote and the file share has 1 vote. As a best practice, the file share should not be physically located on any node of the cluster, because losing that node would then mean losing 2 votes. This model is recommended for Availability Groups and Failover Cluster Instances when there is an even number of nodes in the cluster.

Node & Cloud Storage Majority:

In this model, voting is calculated from the number of active nodes in the cluster plus an Azure Blob Storage cloud witness. By default, each node has 1 vote and the cloud witness has 1 vote. The cloud witness is available from Windows Server 2016 onwards; earlier versions can continue to use the other models. The cloud witness effectively acts as a vote from an independent site and provides reliable voting. This model is recommended for Always On Availability Groups and Failover Cluster Instances when there is an even number of nodes.
  
Disk-Only - No majority:

In this model, no node majority is calculated; quorum is determined solely by the shared disk cluster resource. The nodes have no votes, and only the single vote of the disk resource needs to be online. A node must have connectivity to the disk to participate in the cluster. This model is not recommended for Always On Availability Groups or Failover Cluster Instances and exists only for backward compatibility.

Example Scenarios:

Scenario 1: Two nodes without witness:

The cluster decides the quorum dynamically: the vote of one of the two nodes is zeroed, so the other node holds 1 vote out of 1, has quorum majority, and owns the resources.

If the node holding the vote fails unexpectedly, the cluster goes down; it is a single point of failure. If it is a graceful shutdown, however, the vote is shifted to the other node and the cluster stays up. So there is roughly a fifty percent chance the cluster survives one failure, which is the reason we need a cluster witness.

Scenario 2: Two nodes with witness:

In this case, the quorum has a total of 3 votes. If either a node or the witness goes down, you still have a node plus the witness, or 2 nodes, holding quorum majority, and the cluster stays up and running. This cluster can survive at most one failure at a time.

Scenario 3: Three nodes without witness:

The total vote is 3. If any one of the nodes fails, the vote becomes 2 out of 3 and the cluster survives; at this point, dynamic quorum is updated and the cluster effectively becomes scenario 1. This cluster can survive one failure, with a fifty percent chance of surviving the next one.

Scenario 4: Three nodes with witness:

In this case, the witness does not have a vote (dynamic witness), so the total vote is an odd number, 3. If any one of the nodes fails, the cluster becomes scenario 2. This cluster can survive 2 subsequent node failures.

Scenario 5: Four nodes without witness:

In this scenario, the cluster automatically zeroes one node's vote to make the total number of votes odd. If a node fails, the cluster becomes scenario 3. This cluster can survive 2 subsequent node failures, with a fifty percent chance of surviving the next one.

Scenario 6: Four nodes with witness:

All four nodes and the witness have a vote, making the total number of votes odd. In case of a failure, the cluster becomes scenario 4. This cluster can survive 3 subsequent node failures.

Configuring the failover cluster quorum:

There are two ways to configure the cluster quorum: Failover Cluster Manager and the Failover Clustering Windows PowerShell cmdlets. In this article, we are going to configure node and file share majority for a cluster. The Failover Cluster Manager wizard is self-explanatory, and with an understanding of the quorum concepts the other quorum models can be configured in a very similar way.


Go to Failover Cluster Manager and connect to the cluster. Right-click the cluster name, go to More Actions and choose Configure Cluster Quorum Settings.

Failover Cluster Manager - Quorum Configuration

The Configure Cluster Quorum Wizard opens; under Select Quorum Configuration Option, choose Select the quorum witness.

Configure Cluster Quorum Wizard
Click Next to move to the Select Quorum Witness page, select the Configure a file share witness option and click Next.

Quorum Configuration - Select Quorum Witness

On the Configure File Share Witness page, enter the File Share Path and click Next. The File Share Path is simply a shared folder path, for example: \\SERVER\Folder\Cluster.

Quorum Configuration - File Share Witness Configuration

On the Confirmation page, review the settings and click Next.

Quorum Configuration - Confirmation Wizard

Summary – You have successfully configured the quorum settings for the cluster. Click on the Finish button to close the wizard.

Quorum Configuration - Summary Wizard
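If you prefer PowerShell over the wizard, the same node and file share majority configuration can be applied with a single cmdlet. A minimal sketch, with \\SERVER\Folder\Cluster standing in for your real share:

Import-Module FailoverClusters

# Equivalent of the wizard steps above: node majority plus a file share witness.
Set-ClusterQuorum -NodeAndFileShareMajority "\\SERVER\Folder\Cluster"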

Viewing Cluster Quorum information:

PowerShell: in a PowerShell window, run Get-ClusterQuorum to view the quorum details:

View Cluster Quorum Information - Get-ClusterQuorum PowerShell cmdlet
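In addition to Get-ClusterQuorum, the per-node votes can be listed with Get-ClusterNode. A short sketch (DynamicWeight is assumed to be available on Windows Server 2012 R2 and later):

# Quorum configuration: cluster name, quorum resource and quorum type.
Get-ClusterQuorum | Format-List *

# Vote distribution across the nodes.
Get-ClusterNode | Format-Table Name, State, NodeWeight, DynamicWeight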

SSMS: Right click on Availability Groups -> Click on Show Dashboard -> In Dashboard – click on Availability Group -> Click on View Cluster Quorum Information at the right side of the dashboard.
View Cluster Quorum Information - Availability Group - Monitoring Dashboard - SSMS


T-SQL: connect to a SQL Server instance in the cluster and run the below query:

SELECT  member_name, member_state_desc, number_of_quorum_votes
FROM   sys.dm_hadr_cluster_members;

Managing and configuring Node Weight and Cluster Voting:

To manage node weights, go to the Configure Cluster Quorum Wizard (mentioned above), choose Advanced quorum configuration and click Next.

Configure Cluster Quorum Wizard - Advanced quorum configuration

On the Select Voting Configuration page, click Select Nodes, uncheck the nodes that should not have a vote, and click Next.

Cluster Quorum Configuration - Managing Quorum voting and configuration


Configure the remaining pages as mentioned above and close the wizard. Now go to the Viewing Cluster Quorum information section to verify the changes.
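Node weights can also be changed without the wizard. A quick sketch, using a placeholder node name:

Import-Module FailoverClusters

# Remove the quorum vote from a node (for example, a DR node you do not want voting).
(Get-ClusterNode -Name "Node4").NodeWeight = 0

# Give the vote back later if needed.
(Get-ClusterNode -Name "Node4").NodeWeight = 1

# Verify the change.
Get-ClusterNode | Format-Table Name, NodeWeight, DynamicWeight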

Best practices for configuring quorum vote:

When Windows Server Failover Cluster nodes are spread across multiple data centres, there is a possibility of network latency or failure, and even small disruptions can partition the cluster. To reduce this risk, assign votes to nodes that can communicate with each other without issues under normal circumstances.

Assign votes only to the nodes hosting SQL Server instances; you can skip the other nodes, which minimizes the chance of quorum-related failures.

Assign votes to the primary node and to the standby nodes that are most likely to become primary in case of a failover.

Try to keep the total number of votes odd; with an even number of nodes, add a witness and leverage the dynamic quorum and dynamic witness features.

Summary

Great! We have covered what quorum is and why it matters, the quorum features introduced in Windows Server 2012 and later, the witness types and their configuration including the cloud witness, the quorum models, and finally managing and configuring Windows failover cluster quorum, voting and best practices. Thanks for reading the article till the end. I hope you enjoyed this post; start building your own servers and test the different quorum models and scenarios. Please write your questions and views in the comment section and I will reply as quickly as possible. Thank you!

How to configure Read-Only routing on SQL Server 2016 Always On Availability Group (AG)?

The great advantage of Always On Availability Groups over other availability solutions is their ability to scale out read operations (SELECT queries). Read-only routing is an Always On Availability Group feature that redirects read-intent connection requests from applications to a readable secondary. In a SQL Server 2016 Always On Availability Group, you can configure up to 8 readable secondary replicas.

The connections are redirected based on the routing rules. Always On Availability Group provides the following options to define the rules:
  • Read-Only Routing URL
  • Read-Only Routing List
Before defining the routing rules, we must understand the following conditions:

The application must connect to the Virtual Network Name (VNN) of the listener and not directly to a secondary replica. The VNN is defined when the listener is configured.

The application connection string must contain the read-only intent parameter, ApplicationIntent=ReadOnly;

There must be at least one readable secondary replica in the AG. Let's configure the routing rules:

Read-only Routing URL:

The read-only routing URL is the endpoint a replica advertises for read-only connections; it is used when an application explicitly connects with read-only intent and is redirected to a readable secondary. The URL contains the hostname and port number.

Format: TCP://server.domain.com:1433
Example: TCP://node1.ms.com:1433 (note: this is configured on node1.ms.com)
This URL takes effect only when the node is acting as a secondary replica (if it is the primary, it will obviously accept both read and write connections).

Configure read-only routing URL using T-SQL:


ALTER AVAILABILITY GROUP [TestAG]
MODIFY REPLICA ON
N'node1'
WITH
(SECONDARY_ROLE (READ_ONLY_ROUTING_URL = N'TCP://node1.ms.com:1433'));
GO


Read-Only Routing List:

The read-only routing list contains the readable secondaries in priority order. For example, if node1.ms.com is the primary replica and an application connects to the AG with explicit read-only intent, the primary replica redirects the read-only connection to an available secondary replica as defined in the routing list.

Format: 'replica1','replica2'
Example: 'node2','node3' (note: this is configured on node1.ms.com)

Configure read-only routing list using T-SQL:

ALTER AVAILABILITY GROUP [TestAG]
MODIFY REPLICA ON
N'node1'
WITH (PRIMARY_ROLE (READ_ONLY_ROUTING_LIST = (N'node2', N'node3')));
GO

Note the role: this setting takes effect when node1.ms.com is in the primary role. Now that you have configured the routing URL and routing list, follow the steps below to verify that it is working as expected.
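Before testing from an application, the routing configuration itself can be checked from the availability group DMVs. A minimal sketch, assuming the SqlServer PowerShell module is installed and node1.ms.com is currently the primary:

# Read-only routing URL defined for each replica.
Invoke-Sqlcmd -ServerInstance "node1.ms.com" -Query @"
SELECT replica_server_name, read_only_routing_url
FROM sys.availability_replicas;
"@

# Routing list (priority order) defined for each replica's primary role.
Invoke-Sqlcmd -ServerInstance "node1.ms.com" -Query @"
SELECT pr.replica_server_name AS primary_replica,
       rl.routing_priority,
       sr.replica_server_name AS routed_to_replica
FROM sys.availability_read_only_routing_lists rl
JOIN sys.availability_replicas pr ON rl.replica_id = pr.replica_id
JOIN sys.availability_replicas sr ON rl.read_only_replica_id = sr.replica_id;
"@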

Steps to verify Read-Only routing:

In the SSMS login dialog, enter the Always On AG listener name as the server name, click Options and go to the Additional Connection Parameters tab.

Enter the below connection parameters and click Connect.

ApplicationIntent=ReadOnly; Initial Catalog=databasename;

Open a new query window and check the server name to identify which server the connection was routed to.

SELECT @@SERVERNAME;

The output should be the name of a read-only secondary replica.
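The same check can be scripted outside SSMS using the .NET SqlClient from PowerShell. A minimal sketch; the listener name AGLISTENER and database TestDB are placeholders for your environment:

# Connect through the listener with read-only intent; read-only routing should
# send this connection to a readable secondary.
$connectionString = "Server=tcp:AGLISTENER,1433;Database=TestDB;Integrated Security=SSPI;ApplicationIntent=ReadOnly"
$connection = New-Object System.Data.SqlClient.SqlConnection($connectionString)
$connection.Open()

$command = $connection.CreateCommand()
$command.CommandText = "SELECT @@SERVERNAME;"

# Should print the name of a readable secondary replica, not the primary.
$command.ExecuteScalar()

$connection.Close()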

On the read-only secondary replica, if you try to update tables, you will get an error message stating that DML operations cannot be performed on a secondary replica.

If you are not being routed to a read-only secondary replica, mention it in the comment section; I will be glad to help and will get back to you as soon as possible.

Load-balancing across Read-Only Secondary Replicas:

Starting with SQL Server 2016, you can configure load balancing across the read-only replicas. Load balancing can be configured as below:

ALTER AVAILABILITY GROUP [TestAG]
MODIFY REPLICA ON
N'node1'
WITH
(
    PRIMARY_ROLE
    (
        READ_ONLY_ROUTING_LIST = ((N'node1', N'node2', N'node3'), N'node4', N'node5')
    )
);
GO

Load balancing is performed using a round-robin algorithm: connection requests are load-balanced across node1, node2 and node3. If none of them is available, the next priority is node4 and the last priority is node5.

I hope this article helps you configure read-only routing. If you have questions or doubts, please mention them in the comment section.