Hi all,
I have a 4-node SQL 2012 cluster which has 16 SQL cluster roles, and each role has its own SQL instance. The cluster was set up about 3 years ago with only 2 nodes, as the hardware for the 3rd and 4th nodes was being used for another purpose at the time. Node 3 was added about 18 months ago and node 4 was added about 3 months ago. All hardware is identical: same make, model, processor, NICs, hard disks and external storage.
When I added node 4, I added each SQL cluster instance to it the same way we added SQL to node 2 and node 3, by running setup.exe /Action=AddNode /UpdateSource=<Path to Update Folder>. All the instances installed with no issue. Once this was completed I tested failing over some of the roles to the new 4th node, and that is when I started getting problems moving some of the roles to node 4 and node 3.
For example, say I have role "SQLServer1 (InstanceA)" running on node 1 and I perform a live migration to node 4. The live migration fails when SQL Server tries to start on node 4. When it fails, the cluster tries a different node, normally node 1 or node 2, and the role starts there with no issue.
I checked the application event log, but it only gave a general error, so I looked at the SQL Server instance log, which did not help either. Next I generated the cluster log for node 4 and looked through it for any issues. What I found was that SQL Server was failing to start because it was unable to open the registry key "HKLM\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL11.<InstanceName>\Replication".
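Since the cluster log is large, the search for the failing key can be narrowed with a small script. This is only a minimal sketch in Python: the function name lines_mentioning_key is made up for illustration, it does a plain case-insensitive substring match, and it assumes nothing about the cluster log's line format or encoding.

```python
def lines_mentioning_key(log_lines, key_fragment="\\Replication"):
    """Return (line_number, text) pairs for log lines that mention key_fragment.

    log_lines is any iterable of strings, for example an open text file.
    The match is a simple case-insensitive substring test, not a parser.
    """
    needle = key_fragment.lower()
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(log_lines, start=1)
        if needle in line.lower()
    ]
```

To use it, open the generated cluster log as text (with whatever encoding it was written in) and pass the file object in; each hit comes back with its line number so the surrounding entries can be read in context.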
So I opened regedit and navigated to "HKLM\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL11.<InstanceName>", and there was no "Replication" key. I checked node 1 and node 2 and the keys are there, just not on node 4. Some were also missing on node 3.
To confirm that this was the cause, I manually added the "Replication" key and all the values under it, using node 1 as the reference server. Once that was done I tried again to move the role that had failed, and this time it failed over to node 4 with no issue.
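The manual copy could also be done with reg.exe: export the key on the reference node, then import the file on the node where it is missing. The Python sketch below only builds the two command lines and does not run anything. The function name reg_copy_commands and the export file name are hypothetical, and it assumes the values under the key can be copied unchanged between nodes, as the manual fix did.

```python
def reg_copy_commands(instance_name, export_file, version_prefix="MSSQL11"):
    """Build reg.exe argument lists to copy the Replication key between nodes.

    Returns (export_cmd, import_cmd). Run export_cmd on the reference node
    and import_cmd on the node where the key is missing.
    """
    key = "\\".join([
        "HKLM",
        "SOFTWARE",
        "Microsoft",
        "Microsoft SQL Server",
        f"{version_prefix}.{instance_name}",
        "Replication",
    ])
    # /y overwrites an existing export file without prompting.
    export_cmd = ["reg", "export", key, export_file, "/y"]
    import_cmd = ["reg", "import", export_file]
    return export_cmd, import_cmd
```

Returning argument lists rather than one string keeps the quoting of the key path (which contains spaces) out of the picture; the lists can be passed to subprocess.run on Windows or printed and pasted into an elevated prompt.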
So my problem is: why is this key not being created on some of the nodes when I add an instance to that node? It is not a hard job to check for the key once the instance has been installed and add it if it is missing. But of the 16 cluster roles I have installed, only node 1 can run all of them because of this problem. Node 2 needs this key added for 2 instances, node 3 needs it added for 8 instances, and node 4 was the same as node 3, but there I have manually added the registry keys to make sure the roles can all fail over.
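The post-install check mentioned above could look something like the following. It is a minimal Python sketch to run locally on each node: the function names are invented for illustration, the "MSSQL11" prefix matches the key paths shown earlier, and the instance names have to be supplied by hand. It only reports missing keys; it does not create them.

```python
BASE = "SOFTWARE\\Microsoft\\Microsoft SQL Server"


def replication_key_path(instance_name, version_prefix="MSSQL11"):
    """Return the HKLM-relative path of an instance's Replication key."""
    return "\\".join([BASE, f"{version_prefix}.{instance_name}", "Replication"])


def key_exists(subkey):
    """Return True if HKLM\\<subkey> can be opened on this machine."""
    import winreg  # Windows only, so imported here rather than at module level

    try:
        access = winreg.KEY_READ | winreg.KEY_WOW64_64KEY  # 64-bit registry view
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey, 0, access):
            return True
    except FileNotFoundError:
        return False


def missing_replication_keys(instance_names, exists=key_exists):
    """Return the instance names whose Replication key is absent on this node."""
    return [
        name
        for name in instance_names
        if not exists(replication_key_path(name))
    ]
```

Calling missing_replication_keys with the list of instance names on a node returns the ones that still need the key. The exists parameter is there only so the logic can be exercised without touching a real registry.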
Does anyone know why this happens?
I have found an article which describes a similar issue: https://sqlcan.wordpress.com/2014/10/27/missing-registery-settings-in-cluster-nodes-for-sql-server/. But I don't think it is the solution for me.
Richard Moth