I'd like to get a better understanding of how failover clustering works for my SQL Server instances. I've got a number of SQL Server 2012 (SP2) failover cluster instances (FCIs) installed on a 3-node Windows Server 2012 R2 cluster. When I initiate a failover, the storage moves across in the blink of an eye and the SQL instance takes a couple of seconds to come up. That's all fine.
Before the SQL service starts, though, the IP address and virtual network name (VNN) resources need to come online. Those two take about 10 seconds between them, so the whole failover takes somewhere in the region of 12 - 15 seconds. I don't know if I'm being unrealistic in my expectations, but I would expect the IP address and VNN to come up pretty quickly. After all, it's just a call to DNS, isn't it?
What I want to understand is what's happening behind the scenes during the failover, so I can do a bit more digging and see whether I can bring the failover time down from the current 15 seconds. Can anyone advise on what's happening internally during the failover process, or at least point me in an appropriate direction?
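In case it helps anyone reproduce the timings, here is a rough sketch of how the outage window could be measured from a client machine. It is only an illustration: the function names are made up for this post, the host name in the usage comment is hypothetical, and port 1433 assumes a default instance listening on the standard port. It times the gap between the first failed TCP connection and the next successful one, so it covers the IP address, VNN and SQL service together rather than each resource separately.

```python
import socket
import time


def tcp_probe(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def measure_outage(probe, interval=0.25, max_wait=120.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll probe() until it fails and then succeeds again.

    Returns the seconds between the start of the first failed probe and
    the first successful probe after it, or None if no complete outage
    was observed within max_wait seconds.
    """
    start = clock()
    down_since = None
    while clock() - start < max_wait:
        attempt_started = clock()
        if probe():
            if down_since is not None:
                # Service is reachable again: report the outage length.
                return clock() - down_since
        elif down_since is None:
            # First failure seen: remember when this attempt began.
            down_since = attempt_started
        sleep(interval)
    return None


# Example usage (hypothetical VNN; start this, then trigger the failover):
#   outage = measure_outage(lambda: tcp_probe("sqlvnn.example.local", 1433))
#   print("outage seconds:", outage)
```

The result is only as precise as the polling interval plus the connection timeout, and connecting by name means client-side DNS caching can affect what is measured.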
Many thanks