SQL Server Always On Availability Group (AG) Listener: step by step guide


Availability databases hosted on SQL Server Always On Availability Group (AG) can be connected using a unique Virtual Network Name (VNN) called Availability Group Listener. When Availability Group is enabled, clients can connect to databases in both primary and secondary replicas without explicitly specifying the SQL Server instance name. You don’t even need to know the instance name to connect to Availability Group (AG).

When you have configured read-only routing for secondary replicas and the application or client connection contains the application intent as read-only, the listener will redirect the connection to secondary readable replicas, Otherwise, the listener redirects both read-write and read-only intent connections to primary replica. If there is a fail-over, the listener will redirect connections to new primary and secondary connections based on the application intent.
Availability group listener consist of following objects:
  • Domain Name Systems (DNS)
  • Listener Port
  • One or more IP addresses (VIP)
The Listener is always owned by the SQL Server instance where the primary replica resides. At the time of failover, the new primary replica will own the listener.

Example:
  • DNS: salesag.mscorp.com
  • Port: 1433
  • IP: static or DHCP

Creating Availability Group Listener:

Availability Group Listener can be created while creating Availability Group or it can be created post creating the AG. The Listener creation options are available on Specify Replicas window and under the Listener tab.

If you want to skip the Listener creation while creating Availability Group, you can leave the default option Do not create an availability group listener now check the below image for reference:

New Availability Group - Specify Replicas
New Availability Group - Specify Replicas


Post creating the Availability Group, you can configure listener on Availability Groups -> Group Name -> Add Listener.

Either on create an availability group listener or New Availability Group Listener wizard, specify the Listener DNS Name, Port, Network Mode values. The following figure shows the example availability group listener from my lab.

Create availability group listener
Create availability group listener

You can also, configure Network Mode as DHCP (Dynamic Host Configuration Protocol). DHCP is limited to single subnet and it is not recommended for production environments. Choose the Static IP option from the drop down, if you want the multi-subnet availability group.

To Add IP address, click on Add button and provide the IP address at IPv4 Address text box.
You can also, configure Network Mode as DHCP (Dynamic Host Configuration Protocol). DHCP is limited to single subnet and it is not recommended for production environments. Choose the Static IP option from the drop down, if you want the multi-subnet availability group.  To Add IP address, click on Add button and provide the IP address at IPv4 Address text box.
Create availability group listener - Add IP Address wizard

Additionally, you can script out the changes to query windows and see T-SQL script. Codes from my demo cluster:

USE [master]
GOALTER AVAILABILITY GROUP [MSCORPAG]
ADD LISTENER N'mscorpag' (WITH IP((N'192.168.0.7', N'255.255.252.0')), PORT=1433);GO

You can configure port 1433, default SQL Server instance port for availability group listener. In that case, you don’t have to specify the port number explicitly on connection string or client connection. If you specify custom port, you have to explicitly specify it. If you have more than one SQL Server instances on the machine, I recommend you to use different port, because, the listener may be configured to different instance which is listening on different port.

Connecting Availability Group Listener:

To connect availability databases through availability group listener, specify the listener name on server name box as below:

Connecting to availability group listener in SSMS
Connecting to availability group listener in SSMS

I am using default port number, if you are using different port number, specify explicitly on the connection string. Example connection string with different port number provided below:

connUrl = "jdbc:sqlserver://mscorpag.mscorp.com:3306;databaseName=msdb;user=rathish;password=pwd";

Listener and user permissions:

When you create an availability group listener, the cluster will create a computer object on domain controller and assign its computer name as virtual network name automatically, to perform this, the cluster should have Create Computer Object permission on your Active Directory.

Additionally, I have created availability group listener with the help of Domain Administrator by following below method.

Create a computer object on Active Directory and assign listener name as computer name to that object. The cluster (name of the cluster which own the availability group) will have the full control on the newly created computer object. Add an entry at Domain Name Systems (DNS) against that computer name with a static IP address. Now create listener on availability group. In troubleshooting section of this article, we look into the common error scenarios with listener creation permission.

On SQL Server, you need sysadmin privilege to create the availability group listener and at least ALTER AVAILABILITY GROUP permission to modify the listener.

Read-Only Routing and availability group listener:

In SQL Server Always On Availability Group (AG), read-only routing features provides the scalability by redirecting read-only connections (SELECT queries) to readable secondary replicas. Routing of connection to secondary replicas works, only when the applications or clients connecting availability databases through availability group listener. If you are connecting directly to SQL Server instance name, at the time of failover, connections will not automatically failover to new primary server.

As stated earlier, listener will redirect connections to both primary and secondary replicas. If you have configured any of the secondary replicas as readable and specify the application intent as read only, the listener will redirect connections secondary readable replicas based on the read-only routing configuration. Refer my previous article How toconfigure Read-Only routing on SQL Server Always On Availability Group for detailed explanation. Example connection string with Application Intent and default database given below:

connUrl = "jdbc:sqlserver://mscorpag.mscorp.com:3306;databaseName=mscorp;user=rathish;password=pwd;ApplicationIntent=ReadOnly";
You must specify the ApplicationIntent and Default database on connection string for read-only routing to work, otherwise the connection will be redirected to primary replica only, later in this article, we will see the troubleshooting steps for this scenario.

Availability Group Listener and multi-subnet failover:

You should set the MultiSubnetFailover option as True, enabling this option provides faster failover when your availability group spans over multiple subnets. It is recommended to set this option as True, even, if the availability group only spans a single subnet. This provides additional optimisation, even for single subnets ate the time of failover.

Example connection string with multi-subnet failover option:

connUrl = "jdbc:sqlserver://mscorpag.mscorp.com:3306;databaseName=mscorp;user=rathish;password=pwd;ApplicationIntent=ReadOnly;MultiSubnetFailover=True"; 

Monitoring Availability Group Listener:

SQL Server provides following catalog views to monitor availability group listener properties:

sys.availability_group_listener_ip_addresses: returns the virtual IP address of availability group listeners
sys.availability_group_listeners: returns the network name of availability group listeners
sys.dm_tcp_listener_states: returns the status of TCP IP address and port of availability group listener. 

Removing Availability Group Listener:

You can remove availability group listener, either using SSMS or using T-SQL. In SSMS, right-click on listener name and select the Delete options to remove the availability group listener.

Using T-SQL:

ALTER AVAILABILITY GROUP MSCORPAG REMOVE LISTENER 'mscorpag.mscorp.net';

Troubleshooting availability group listener issues:

Scenario 1: Unable to create availability group listener:

Error: Microsoft SQL Server, Error: 19457

Message:

The specified IP Address '' is not valid in the cluster-allowed IP range. Check with the network administrator to select values that are appropriate for the cluster-allowed IP range. (Microsoft SQL Server, Error: 19457)
Msg 19471, Level 16, State 0, Line 2The WSFC cluster could not bring the Network Name resource with DNS name '' online. The DNS name may have been taken or have a conflict with existing name services, or the WSFC cluster service may not be running or may be inaccessible. Use a different DNS name to resolve name conflicts, or check the WSFC cluster log for more information.
Msg 19476, Level 16, State 4, Line 2The attempt to create the network name and IP address for the listener failed. The WSFC service may not be running or may be inaccessible in its current state, or the values provided for the network name and IP address may be incorrect. Check the state of the WSFC cluster and validate the network name and IP address with the network administrator.
Possible Causes:

Cause 1: The cluster name account does not have the Create Computer Objects in Active Directory Organizational Unit.

Solution: Grant Create Computer Object to Cluster account name.

 
How to grant Create Computer Objects to Cluster name account:

Select the Advanced Feature option by View -> Advanced Features:
Troubleshooting Availability Group Listener - View Advanced Feature on AD
Troubleshooting Availability Group Listener - View Advanced Feature on AD

Go to Advanced Security Settings for Organisational Unit (OU): Follow the highlighted part as below image:

Advanced Security Settings of Organizational Unit
Advanced Security Settings of Organizational Unit

Select the Create Computer Objects from permission tab:
Create Computer Objects permission to cluster name account
Create Computer Objects permission to cluster name account

Cause 2: The cluster user account, which has Account Operator permission on Active Directory by default can create up to 10 computer objects, and it is exceeds its limit now.

Solution: If your organisation polity allowed, grant Create Computer Object permission to cluster account, or prestage the required objects on the Active Directory, Organizational Unit (OU).

How to prestage computer objects for availability group listener:

When you create an availability group listener, a virtual computer object will be created at the particular organizational unit of active directory automatically.

You can prestage the virtual computer objects as follows:

Create a computer objects under the organizational unit where you hosted the cluster, check with your domain administrator, if you don’t have access to create objects on Active Directory Domain Controller

Add virtual computer objects on Active Directory for Availability Group listener
Add virtual computer objects on Active Directory for Availability Group listener

Grant Full control permission to cluster name account on newly created computer object. Note that, this name should be same as the listener name, you will be creating on availability group.

Full control permission on virtual computer object to cluster name account
Full control permission on virtual computer object to cluster name account

Scenario 2: Read-Only routing not working when connecting to listener

Error: ApplicationIntent=ReadOnly specified on application connection string and read-only routing configured on the availability group, but when connecting to listener, it is not redirecting to secondary readable replicas.

Possible Cause: You have not mentioned the default database on the connection string. For more information on this behaviour can be found here.

Solution: Add Initial Catalog value in connection string.

Example SSMS connection string:
Read-Only Routing Connection parameters on SSMS
Read-Only Routing Connection parameters on SSMS

Scenario 3: Availability Group Listener - login time-out error

Error: You are unable to connect to availability group listener in a multi-subnet environment. This error usually occurs at the time of failover.

Possible Causes:

Cause 1: Your application uses legacy data provider that does not support the multi-subnet failover features.

Solution: Use the newer version of SQLClient drivers that supports multi-subnet features.

Cause 2: MultiSubnetFailover parameters are not used on the connection string.

Solution: Include MultiSubnetFailover=True parameter on connection string to fix this issue.

Example:

connUrl = "jdbc:sqlserver://mscorpag.mscorp.com:3306;databaseName=mscorp;user=rathish;password=pwd;ApplicationIntent=ReadOnly;MultiSubnetFailover=True";

Scenario 3: Availability Group Listener name not resolving to IP address

Error: After failover in a multi-subnet environment, ping command from client not resolving to new IP address of the listener and DNS entry of the listener name shows IPs of both subnets.

Possible Causes:

This error generally occurs, when listener is created using Failover Cluster Manager, rather than from SSMS.

Solution: Set the value of RegisterAllProvidersIP value as 0. This require restart of listener network name resource.

Example:

PowerShell:

Import-Module FailoverClusters
Get-ClusterResource dbcluster |Set-ClusterParameter
RegisterAllProvidersIP 0
Cluster.exe:

cluster /cluster: dbcluster res mscorpag /priv
RegisterAllProvidersIP=0
If still ping to listener returning wrong IP address, from application/client system, open command prompt as administrator and run the ipconfig /flushdns command.

Scenario 4:  Server cannot host the availability group listener IP address:

Error: Error 19456, Severity 16:

Message:

Error 19456, Severity 16: None of the IP addresses configured for the availability group listener can be hosted by the server '%.*ls'. Either configure a public cluster network on which one of the specified IP addresses can be hosted, or add another listener IP address which can be hosted.
Solution: Add a new IP address manually to existing listener with different subnet.

Scenario 5: Active Directory Policy

Error: Error 8557

Message:

Error 8557 (Your computer could not be joined to the domain. You have exceeded the maximum number of computer accounts you are allowed to create in this domain. Contact your system administrator to have this limit reset or increased.)
A computer object is created automatically, whenever you create an availability group listener. Even if you drop the listener, the computer object remains there in Active Directory. By default, authenticated domain users can create up to 10 computer objects, when this limit exceeds, it will through an error at the time, availability group listener creation.

Solution: You may contact domain admin to clean up this computer objects or ask them to increase the default limit (I think it is a global variable).

Best practices:

  • Use static IP address for reliable connections and multi subnet failover
  • Use meaningful and unique listener name for each availability groups
  • Test it on staging server, before implementing it on production environment
  • Additionally, to avoid the accidental deletion on cluster objects, enable the Protect objects from accidental deletion option for Organizational Unit (OU).
Protect object from accidental deletion
Protect object from accidental deletion

I hope this article is helpful to you, if you have any queries or want to share your thoughts on availability group listener, please write it in comment section. Thank you!

Step by step guide to add SSISDB to SQL Server 2016 Always On Availability Group (AG)

Starting from SQL Server 2016, you can add SSISDB as Availability Database in Always On Availability Group (AG). This enables high availability and automated failover for SSISDB. This article provides step by step guide to add SSISDB to Availability Group.

Make sure, you have installed SQL Server Integration Services on both primary and secondary replicas.

Step 1: Create SSIDB on primary replica:

Right-click on the Integration Services Catalogs and click on Create Catalog Wizard option

Creating Integration Service (SSIS) Catalog
Creating Integration Service (SSIS) Catalog

Step 2: Configure Integration Service Catalog:

On Catalog Creation Wizard select the Enable CLR integration and Enable automatic execution of Integration Services stored procedure at SQL Server startup. Then enter the password on Password and Retype Password text boxes and click OK button.

Configuring Integration Service Catalog
Configuring Integration Service Catalog

Step 3: Adding SSISDB to Availability Group:

On primary replica, add SSISDB to an Availability Group (AG). Expand the Always On High Availability and then Availability Groups. Right-click on Availability Group and click on Add Database option.

Adding SSISDB to Availability Group
Adding SSISDB to Availability Group (AG)

Step 4: Select database and decryption password:

On Add Database to Availability Group wizard, click Next button. All the eligible databases will be displayed here. Against SSISDB you will see Password required. Enter the password on Password column and click on Refresh button.

Always On - SSISDB prerequisites
Always On - SSISDB prerequisites

Now you will see the Meets prerequisites on status column. Click on the Next button.

SSISDB Selection for AG and decryption password
SSISDB Selection for AG and decryption password

Step 5: Selecting Secondary Replicas:

On Connect to Existing Secondary Replicas page, you can either connect to selective secondary replicas by clicking on Connect button against each replica name or you can connect to all secondary replicas by clicking on Connect All button.

You will be prompted to enter username and password, now connected to secondary replica as mentioned user. Click on the Next button.

Choosing secondary replicas for SSIS Availability Database
Choosing secondary replicas for SSIS Availability Database

Step 6: Initial Data Synchronization:

On Select Initial Data Synchronization page, Select your data synchronization preference. Since, my SSISDB size is very small and my cluster nodes are in same data centre, I am selecting Automatic seeding. You may use Full database and log backup option or use other options as your requirement and click Next.
SSISDB - Always On - Initial Data Synchronization
SSISDB - Always On - Initial Data Synchronization

Step 7: Validation:

Validate and fix the failed requirements and click on Next button.
Adding SSISDB to Availability Group (AG) - Validation
Adding SSISDB to Availability Group (AG) - Validation

Step 8: Summary:

Verify and confirm the choices made in the adding SSISDB to Availability Group wizard and click on Finish button to continue.

Adding SSISDB to Availability Group (AG) - Summary
Adding SSISDB to Availability Group (AG) - Summary

Step 9: Results:

The Results page provide the status of adding SSISDB to Availability Group. Click on Close to exit the wizard.
Adding SSISDB to Availability Group (AG) - Results
Adding SSISDB to Availability Group (AG) - Results

Step 10: Enabling Always On Support:

Right-click on the Integration Service Catalogs and click on Enable Always On Support option.    

Enabling Always On Support for Integration Service Catalog
Enabling Always On Support for Integration Service Catalog

Step 11: Connecting to secondary replicas:

On Enable Support For AlwaysOn wizard, click Connect All to connect to all secondary replicas and click OK button to close the wizard.

Enable Support for Always On - Connecting secondary replicas
Enable Support for Always On - Connecting secondary replicas

Step 12: Verifying Always On Support:

Post completing the previous step, you can verify the Always On support by checking the SQL Server Agent jobs. The following 2 jobs are created for Failover Monitor and Server Maintenance on both primary and secondary replicas.

Verifying Always On Support for Integration Service Catalog
Verifying Always On Support for Integration Service Catalog

Step 13: Monitoring SSISDB Availability Database Health:

Open the Availability Group Dashboard and check the status of the SSISDB and Availability Group Database Synchronization status.

Monitoring SSIDB Availability Database Health
Monitoring SSIDB Availability Database Health

Great! You have successfully added SSISDB as Availability Database on an Availability Group (AG).
Note: When you are upgrading / patching, remove the SSISDB from Availability Group, apply the patches and add it back to Availability Group as mentioned here. SSISDB in Always On Availability Group rule checks the SSISDB upgradation or patching status. I hope this article helps you. Please share your comments below and if you have difficulty in following any of the above steps, write in comment section, I will get back to you as soon as possible.

Understanding and configuring Windows Server Failover Cluster Quorum for SQL Server Always On


What is quorum?

SQL Server availability solutions such as Always On Availability Group (AG) and Failover Cluster Instance (FCI) depends on Windows Server Failover Cluster (WSFC) for high availability and failover operations. WSFC failover works based on the quorum. Generally, quorum is minimum requirement of something to be up and running. In Windows Cluster, quorum decides, minimum number of nodes required for cluster to be up and running.

In Windows Server cluster, more than half of the nodes required for cluster to be up and run. For example, if you have 4 node cluster, you need (4 / 2) + 1 nodes for cluster to work.  This is called quorum in WSFC.
Quorum and Roles:

Windows Server designed in a way, quorum understands the node which is active and nodes which are in stand-by. Also, at a given point of time, there cannot be more than active nodes in a Windows Server Failover Cluster. The quorum aware of this active node and in case of fail-over, it will decide the next active node and stand-by nodes. In AG and FCI, active means, nodes which owns the Resource Group at a given point of time.

Quorum and Partitioned Clusters:

In Windows Server Failover Cluster, each node communicates to other nodes in the cluster through a dedicated network connection. When there is communication failure between nodes, there will be partition of nodes and each node will think, other nodes are down and trying to host Resource Group (active) for keeping the system up and running. In Windows Server Failover Cluster (WSFC), more than one node cannot be active at a given point of time. This conflict is called as split-brain scenario.

Quorum designed in a way, it is aware of this communication, and if there are partitions in cluster, due to network failure or some other issues, it will intervene to prevent the split-brain scenarios. The partition with majority quorum will own the resource group and cluster will force stop the cluster service from nodes of other partitions and its removed from the WSFC.

WSFC designed as when the communication between removed nodes established either manually or automatically, nodes can communicate to current cluster nodes, they will automatically join the cluster and start their cluster service.

How Windows Server Failover Cluster Quorum works:

Let’s assume, the cluster is partitioned due to network failure, each partitioned nodes will try to own the resource group. In order to achieve this, each subset of nodes has to prove the quorum majority.

For example, you have 5 node cluster, post partition each subset will try to prove the quorum majority and subset with 3 nodes will own the resources. Here, cluster quorum used to avoid the split-brain situation.

This works well, when you have odd number of nodes, what happens when you have even number of nodes? For example, you have a cluster with 4 nodes, and if network failure, you have partitioned cluster of 2 subsets with 2 nodes each. It is 50/50 quorum, and again both subset will consider itself as majority and try to own the resources, and fail to prove the majority hence cluster will be down.

How the cluster manages quorum, when you have even number of nodes? There are 2 options:

Add a cluster witness and increase the vote count by one or decrease the vote of node as zero and make total number of votes as odd number.

Starting from Windows Server 2012 quorum is configured automatically using Dynamic Quorum. However cluster witness must be added manually to cluster.

Dynamic Quorum:

Post verifying the quorum majority, the new majority definition will be updated among new cluster nodes. The concept of the total number of votes adopted after a successive failure is known as Dynamic Quorum. This allows cluster to lose one node, then another, then another until the last standing member and majority definition is updated on each loss of nodes dynamically.

For example, you have a 3 node cluster and if there is a failure, the subset of 2 nodes will survive and other node will be removed from cluster. The quorum needed for cluster to be up and running is more than half, in our case, we have left with only 2 nodes, the quorum is 50/50. Now cluster will automatically zero down vote of either one of the nodes and other node will be assigned as majority node with total vote of 1 out 1. This is called dynamic quorum.

Dynamic Witness:

Starting from Windows Server 2012 R2, the vote of cluster witness is calculated dynamically, when you have odd number nodes, the witness does not have vote and, if there are even number of nodes, the witness will have votes make total number of vote as odd. The process of dynamically deciding the witness vote is called Dynamic Witness.

For example, you have a 5 node cluster with a cluster witness. By default, each node and witness will have 1 vote each. The total vote for quorum of this cluster is 5, in this case, the value of witness vote is 0, since it is already an odd number. Now assume, there is a failure and one node removed from cluster, the total vote value will be 4, now the cluster will automatically assign the voting of witness by 1 to make the total value of vote as odd number. This process is called, dynamic witness.

Quorum Witness Types:

The Windows Server Failover Cluster (WSFC) supports the following 3 types of quorum witnesses:

Disk Witness:

A small size of storage disk (typically 1 GB) attached with the cluster. The quorum disk is accessible from all nodes in the cluster and is highly available. Disk witness contains the copy of the cluster database.

File Share Witness:

A shared folder from an external server, which is accessible from all the nodes in the cluster. Usually from Windows File Share or a folder from Domain Controller. It should be reliable and DBA should aware of any changes to the file share, either access related or maintenance related. File Share witness maintains the clustering information in a witness.log file, but does not store copy of the cluster database.
  
Cloud Witness:

Introduced on Windows Server 2016, BLOB storage account in Azure, which is accessible to all the nodes in the cluster. Cloud witness maintains the clustering information in a witness.log file, but does not store copy of the cluster database.

Quorum Models:

Choosing the correct quorum model is very critical decision for your availability solution to work. As we discussed earlier, Windows Server provides multiple combination of quorum models. Let’s look at what works best for your availability solution.
  • Node Majority
  • Node & Disk Majority
  • Node & File Share Majority
  • Node & Cloud Storage Majority
  • Disk-only - No Majority.
Let’s look at each models in detail.

Node Majority:

Quorum majority is calculated based on the active number of nodes in the cluster. By default, each node will have 1 vote each. There is not witness configured here. Recommended quorum mode for Availability Groups (AGs) and Failover Cluster Instances (FCIs) when there is an odd number of nodes.

Node & Disk Majority:

Quorum majority vote is calculated as number of active nodes in the cluster and shared disk cluster resource. Connectivity by any node to disk resource count as an affirmative vote for the disk. Size of the disk should be 512MB minimum and it should be excluded from Anti-Virus scanning.  The disk resource should be able to failover as a stand-alone instance. This model is recommended for Failover Cluster Instance but not recommended for Availability Group (AGs).

Node & File Share Majority:

In this model, voting is based on the active number of nodes in the cluster and a file share resource. By default, each node will have a vote and file share will have 1 vote. As a best practice, the file share resource should not be physically located on any node of the cluster, in-case of loss of that node will result in loss of 2 votes. This model is recommended for Availability Groups and Failover Cluster Instances, when there are even number of nodes in the cluster.

Node & Cloud Storage Majority:

In this model, voting is calculated based on the active number of nodes in the cluster and Azure Blob Storage. By default, each node will have 1 vote and Azure Blob Storage will have a vote. The Cloud witness is available from Windows Server 2016 and previous versions can continue to use other models. Cloud Storage witness act as data centre and will provide reliable voting. This is recommended for Always On Availability Group and Failover Cluster Instance, when there is an even number of nodes.
  
Disk-Only - No majority:

In this model, there is no quorum majority calculated, active node is determined by the shared disk cluster resource. There are no votes for the nodes in the cluster, only single-vote of the disk-resource is required to be online as primary. All nodes in the cluster must have connectivity to disk to gain an affirmative vote and be online. This model is not recommended for Always On Availability Group and Failover Cluster Instances. This model exists only for backward compatibility.

Example Scenarios:

Scenario 1: Two nodes without witness:

Cluster will dynamically decide the quorum, vote of either one of two nodes will be zeroed down and another node will be assigned as total votes 1 out 1 and it has quorum majority and own the resources.

If the current active node fails unexpectedly, then the cluster will go down, it is a single point of failure, but if is a graceful shutdown, the voting will be shifted to the another node and the cluster will be still up. There is a fifty percent chance, the cluster will survive one failure. This is the reason we need a cluster witness.

Scenario 2: Two nodes with witness:

In this case, the quorum has total 3 votes. If any of a node or witness goes down, you will still have a node and witness or 2 nodes with quorum majority and cluster will be up and running. This cluster will survive maximum one failure.

Scenario 3: Three nodes without witness:

Total vote is 3, if any one of the nodes is, the vote will be 2/3 and the cluster can survive, at this point, dynamic quorum updated and cluster will be scenario 1. This cluster can survive one failure and there is a fifty percent chance of next failure.

Scenario 4: Three nodes with witness:

In this case, the witness does not have vote (dynamic witness), so the total vote is an odd number 3. If any one of the nodes fails, the cluster will be scenario 2. This cluster can survive 2 subsequent failures of node.

Scenario 5: Four nodes without witness:

In this scenario, the cluster automatically zero downs one vote from a node and make quorum majority as odd number. If a node fails, the cluster becomes scenario 3. This cluster can survive 2 subsequent failures of nodes and there is a fifty percent chance of next subsequent failure of nodes.

Scenario 6: Four nodes with witness:

All four nodes and witness will have a vote and make the quorum majority an odd number. In case of failure, the cluster will become scenario 4. The cluster will survive 3 node failures.

Configuring the failover cluster quorum:

There are two ways you can configure the cluster quorum, Failover Cluster Manager and Failover Cluster Windows PowerShell cmdlets. In this article, we are going to configure node and file share majority for a cluster. The Failover Cluster Manager wizard is self-explanatory and having understanding of the quorum concept will help you in configuring other quorum models and it is very similar.


Go to Failover Cluster Manager and connect to the cluster. Right click on the cluster name and go to More Actions and choose the Configure Cluster Quorum Settings.

Failover Cluster Manager - Quorum Configuration
Failover Cluster Manager - Quorum Configuration

Configure Quorum Cluster Wizard will be opened, under Select Quorum Configuration Option choose the Select the quorum witness Option.

Configure Cluster Quorum Wizard
Configure Cluster Quorum Wizard
Click Next and move to the Select Quorum Witness page and select the Configure a file share witness option and click on the Next button.

Quorum Configuration - Select Quorum Witness
Quorum Configuration - Select Quorum Witness

Under Configure File Share Witness page enter the File Share Path and click on Next button. The File Share Path is nothing but a folder path, example: \\SERVER\Folder\Cluster.

Quorum Configuration - File Share Witness Configuration
Quorum Configuration - File Share Witness Configuration

On Confirmation page, review the settings and click on Next button.

Quorum Configuration - Confirmation Wizard
Quorum Configuration - Confirmation Wizard

Summary – You have successfully configured the quorum settings for the cluster. Click on the Finish button to close the wizard.

Quorum Configuration - Summary Wizard
Quorum Configuration - Summary Wizard

Viewing Cluster Quorum information:

PowerShell: In command window, type Get-ClusterQuorum to view the cluster details:

View Cluster Quorum Information - Get-ClusterQuorum PowerShell cmdlet
View Cluster Quorum Information - Get-ClusterQuorum PowerShell cmdlet

SSMS: Right click on Availability Groups -> Click on Show Dashboard -> In Dashboard – click on Availability Group -> Click on View Cluster Quorum Information at the right side of the dashboard.
View Cluster Quorum Information - Availability Group - Monitoring Dashboard - SSMS
View Cluster Quorum Information - Availability Group - Monitoring Dashboard - SSMS


TSQL: Connect to cluster server and run the below query:

SELECT  member_name, member_state_desc, number_of_quorum_votes
FROM   sys.dm_hadr_cluster_members;

Managing and configuring Node Weight and Cluster Voting:

To manage, node weight, go to Configure Cluster Quorum Wizard (mentioned above) and click on the Advanced quorum configuration and click Next.

Configure Cluster Quorum Wizard - Advanced quorum configuration
Configure Cluster Quorum Wizard - Advanced quorum configuration

Under Select Voting Configuration page, click on Select Nodes and uncheck the nodes, which you don’t want add weight and click on Next button.

Cluster Quorum Configuration - Managing Quorum voting and configuration
Cluster Quorum Configuration - Managing Quorum voting and configuration


Configure the remaining pages as mentioned above and close the wizard. Now go to view cluster quorum information section to verify the changes.

Best practices for configuring quorum vote:

When Windows Server Failover Cluster nodes spread across multiple data centres, there is a possibility of network latency or failure, and cluster may partition, if small disruptions. To prevent this, configure votes to nodes, which can communicate without any issues under normal circumstances.

Configure votes to only nodes, which are hosting SQL Server instances, you can skip other nodes, this will minimize the failure chances.

Configure votes to primary node and most eligible stand-by nodes, which are going to be primary, in case of failover.

Try to maintain the total number of votes as odd number, in-case of even nodes, add witness and leverage the benefits of dynamic quorum and dynamic witness features.

Summary

Great! we have covered what is quorum and its importance, new quorum features introduced in Windows Server 2012, witness’s types and configurations and cloud witness features, quorum models and finally managing and configuring windows failover cluster quorum, voting and best practices. Thanks for reading the article till the end. I hope you enjoyed this post, start building your own servers and test different quorum models and scenarios. Please write your questions and views on the comment section. I will reply to you as quick as possible. Thank you!