I have a SharePoint 2013 farm that uses a two-server SQL Server 2014 configuration with Always On Availability Groups. Up until a couple of months ago, this worked perfectly.
Recently, our F5 load balancer started logging short outages (3-4 minutes) during which it determined that the SharePoint web servers were unavailable.
Troubleshooting determined that there were no external factors affecting the load balancer (no scans or backup processes, etc. causing the outages). Also, the times at which the outages occurred did not correspond to any SharePoint timer jobs or SQL maintenance plans.
This does not appear to be a SharePoint problem as far as I can tell.
The SQL Server logs do show occasional timeouts, during which the secondary replica in the availability group for the SharePoint databases cannot connect to the primary, resulting in a condition that the load balancer interprets as an outage.
I am a relative newbie to SQL Server, so I could use some suggestions on this.
- The farm in question has a relatively light load, so resources should not be a problem, and there is no obvious reason for a timeout.
- The "outages" occur 1-3 times per day.
- The outages occur at seemingly random times during the day, not associated with any scheduled events or processes.
- The SQL Servers have been patched to the latest version (2014 SP3 CU4).
Thoughts or suggestions?
David