The Setup
The DBAs were having a discussion this week about how we should set up quorum on clusters that have different sets of nodes. We looked at our current setups and noticed some inconsistencies. Some have a File Share Witness and some do not; sometimes the nodes in the BDC (DR) have votes and sometimes they do not. We would like to have a standard configuration for all clusters. A standard makes clusters easier to set up and makes it easier to spot when something is configured incorrectly.
In our environment we have three basic types of SQL clusters.
- Two node cluster - One node in the PDC and one node in the BDC
- Three node cluster - Two nodes in the PDC and one node in the BDC
- Four node cluster - Two nodes in the PDC and two nodes in the BDC
· https://technet.microsoft.com/en-us/library/dn265972(v=ws.11).aspx
· http://sqlha.com/2013/07/02/wsfcs-dynamic-witness-in-windows-server-2012-r2/
The Testing
Two Node Cluster with File Share Witness
Initial Setup
I set up a cluster named ASTCL-DUO with two members, Batman and Robin. I added a File Share Witness and gave both nodes a vote.
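For anyone who wants to reproduce this setup from PowerShell rather than the GUI, the following is a minimal sketch. The share path `\\DC01\ClusterWitness` is a placeholder I made up for illustration, and the sketch assumes the FailoverClusters module is installed on the machine you run it from.

```powershell
# Sketch only: \\DC01\ClusterWitness is a hypothetical share path.
Import-Module FailoverClusters

# Configure node majority plus a File Share Witness for the ASTCL-DUO cluster.
Set-ClusterQuorum -Cluster "ASTCL-DUO" -NodeAndFileShareMajority "\\DC01\ClusterWitness"

# Give both nodes a vote (NodeWeight 1 is the default; shown for completeness).
foreach ($name in "Batman", "Robin") {
    (Get-ClusterNode -Cluster "ASTCL-DUO" -Name $name).NodeWeight = 1
}
```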
Then I checked the cluster configuration using PowerShell.
You can see that the cluster dynamically gave the witness 1 vote (WitnessDynamicWeight) so that we have an odd number of votes.
Remove One Node
Next I took one of the nodes out of the cluster to see what would happen to the assigned votes and the witness. I did this by shutting down the cluster service on Robin.
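If you want to repeat this step from PowerShell, something like the sketch below should work. It assumes you run it with administrative rights on a cluster node; stopping the service through the Services console is an equally valid route.

```powershell
# Stop the cluster service on Robin.
Stop-ClusterNode -Name "Robin"

# Check node state and vote assignment afterwards.
Get-ClusterNode | Format-Table NodeName, State, DynamicWeight, NodeWeight -AutoSize
```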
Not only did the File Share Witness keep its vote, but the NodeWeight for the cluster node that was shut down stayed at 1. It still has a vote assigned. Dynamic quorum did nothing. I thought about this and it makes sense: we had three votes and only need two for a quorum, so there is no need to change anything. My only concern is what will happen if the File Share Witness goes down as well. I will test that later and add the results below.
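The vote arithmetic behind that reasoning can be sketched in a few lines. Both functions are hypothetical helpers written for this illustration, not cluster cmdlets, and they model a plain static majority only; they ignore any adjustments dynamic quorum might make.

```powershell
# Hypothetical helper (not a cluster cmdlet): votes needed for a majority.
function Get-MajorityThreshold {
    param([int]$TotalVotes)
    # Strict majority: more than half of all assigned votes.
    return [int][math]::Floor($TotalVotes / 2) + 1
}

# Hypothetical helper: do the reachable voters still form a quorum?
function Test-HasQuorum {
    param([int]$TotalVotes, [int]$AvailableVotes)
    return $AvailableVotes -ge (Get-MajorityThreshold -TotalVotes $TotalVotes)
}

Test-HasQuorum -TotalVotes 3 -AvailableVotes 2   # Batman + witness: True
Test-HasQuorum -TotalVotes 3 -AvailableVotes 1   # Batman alone: False
```

With three assigned votes the threshold is two, so losing Robin alone leaves quorum intact, while losing a second voter does not.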
Remove One Node and File Share Witness
Now we are going to leave the Robin node down and take the File Share Witness down as well. Since I am using my DC as the file share witness, I am just going to turn off sharing for the folder that backs the witness.
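A scripted way to simulate the same failure might look like the sketch below. The share name `ClusterWitness` is a placeholder, and the first command is meant to run on the server hosting the share. Note that `Remove-SmbShare` deletes the share definition (not the folder), so you would need to recreate the share to bring the witness back.

```powershell
# On the file server: remove the share (hypothetical name) without prompting.
Remove-SmbShare -Name "ClusterWitness" -Force

# On a cluster node: check the state of the File Share Witness resource.
Get-ClusterResource |
    Where-Object { $_.ResourceType.Name -eq "File Share Witness" } |
    Format-Table Name, State, OwnerNode -AutoSize
```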
Our node status doesn't change, but now we can see that the File Share Witness has failed.
The kicker is that, since there are three voters in the quorum but only one of them is available, the cluster shuts down.
I would have thought that dynamic quorum would have removed the votes from the secondary node and the witness when the secondary node went down, to prevent this from happening.
Two Node Cluster with File Share Witness and One Voting Node
Initial Setup
For this series of tests I started the cluster service back up on Robin and removed its vote to see what would happen to the cluster when we started removing things.
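Bringing the node back and removing its vote can be done from PowerShell roughly as follows. This is a sketch that assumes it is run with administrative rights on a cluster node.

```powershell
# Start the cluster service on Robin again.
Start-ClusterNode -Name "Robin"

# Remove Robin's vote by setting its configured weight to 0.
(Get-ClusterNode -Name "Robin").NodeWeight = 0

# Confirm the change.
Get-ClusterNode | Format-Table NodeName, State, DynamicWeight, NodeWeight -AutoSize
```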
I ran the cluster quorum configuration script.
Since we started out with an odd number of votes (1), the cluster decided it did not need the File Share Witness, so it set the WitnessDynamicWeight to 0.
Remove One Node
I shut down the cluster service on Robin again.
The quorum configuration stays the same, since the node didn't have a vote.
Remove One Node and File Share Witness
Leaving the Robin node down, I next turned off sharing for the File Share Witness folder.
The interesting thing here is that the cluster does not go down like it did in the first setup. I let it sit in this state for around 5 minutes to be sure.
Conclusions
If you read through any cluster documentation found on the web regarding using a File Share Witness (see my two links above), it seems to be best practice to use a witness even if you have an even number of nodes. If you try to configure your cluster Quorum without one you even get an error message about it during setup.
Since we currently are not running an active/active setup and DR failover is a manual event, we also do not need to have votes on the nodes in the BDC.
Going forward (and possibly backward, since we might need to fix some current clusters), it seems like the best setup is to:
1. Always have a File Share Witness
2. Always remove the votes from DR (and AG read servers)
This will prevent the cluster from coming down as long as the primary node is available: it can survive the loss of both the secondary node and the File Share Witness.
This might change if we move to an active/active setup or the FSW moves to a third-party site, but I think we should build it into our current architecture.
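Since part of the goal is making clusters easy to validate, a check for these two rules could look something like the sketch below. The `$DrNodes` list is a hypothetical stand-in for your own BDC node names, and the script only reports problems; it does not change anything.

```powershell
# Sketch: check the local cluster against the two rules above.
# $DrNodes is a hypothetical list; replace it with your own BDC node names.
$DrNodes = @("Robin")

# Rule 1: a File Share Witness resource should exist.
$witness = Get-ClusterResource |
    Where-Object { $_.ResourceType.Name -eq "File Share Witness" }
if (-not $witness) {
    Write-Warning "No File Share Witness is configured."
}

# Rule 2: DR nodes should not carry a vote.
Get-ClusterNode |
    Where-Object { $DrNodes -contains $_.Name -and $_.NodeWeight -ne 0 } |
    ForEach-Object { Write-Warning "DR node $($_.Name) still has a vote." }
```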
Code used in this Blog
So you don't have to retype from my images
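# Cluster-level quorum settings, including the witness's dynamic vote
Get-Cluster | Format-Table Name, DynamicQuorum, WitnessDynamicWeight -AutoSize
# Per-node votes: DynamicWeight is the current vote, NodeWeight is the configured vote
Get-ClusterNode -Name * | Format-Table NodeName, DynamicWeight, NodeWeight -AutoSize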