Greetings.
I have a 2-node AG on SQL Server 2012 SP3 CU10.
Synchronous commit, manual failover only.
I get this message several times a day in the Event Viewer:
"AlwaysOn Availability Groups connection with primary database terminated for secondary database 'foo' on the availability replica with Replica ID:"
It is then followed by this message:
"A connection for availability group 'myAG' from availability replica 'myPrimary' with id [C940E91B-4D84-4006-8829-F7084DAB29C6] to 'mySecondary' with id [ADBC3978-17C5-4A98-A75C-0BA8FC2B2C34] has been successfully established"
Sometimes the disconnect and reconnect occur within the same second. The only time they typically cause issues is when a backup is running on the Secondary: the job fails and we get paged.
More fun facts:
- The Event ID for reconnections is 35202; for disconnects it is 35267.
- Disconnects definitely occur even when CPU usage is very low.
- There’s nothing useful in the Cluster Log for this.
- This AG supports a large Data Warehouse environment (~20 TB). Mostly large batch jobs, no OLTP. Yes, I realize an AG isn't ideal here, but it's what I've got.
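In case it helps narrow things down, here is a minimal sketch of how the two event IDs above could be paired up to measure how long each drop lasts and whether it falls inside a backup window. It assumes the events have already been exported from the Event Viewer as (timestamp, event ID) pairs; the function names, the sample timestamps, and the backup window are all made up for illustration and are not from my environment.

```python
from datetime import datetime, timedelta

DISCONNECT_ID = 35267  # "connection with primary database terminated"
RECONNECT_ID = 35202   # "connection ... has been successfully established"


def pair_outages(events):
    """Pair each disconnect with the next reconnect.

    events: iterable of (timestamp, event_id) tuples.
    Returns a list of (start, end, gap) tuples.
    """
    outages = []
    pending = None  # timestamp of the earliest unmatched disconnect
    # Sort by time; within the same second, put disconnects first so a
    # same-second disconnect/reconnect still pairs up correctly.
    ordered = sorted(events, key=lambda e: (e[0], e[1] != DISCONNECT_ID))
    for ts, event_id in ordered:
        if event_id == DISCONNECT_ID and pending is None:
            pending = ts
        elif event_id == RECONNECT_ID and pending is not None:
            outages.append((pending, ts, ts - pending))
            pending = None
    return outages


def overlaps_window(outage, window_start, window_end):
    """True if the outage touches the given time window (e.g. a backup job)."""
    start, end, _gap = outage
    return start <= window_end and end >= window_start


# Hypothetical sample data, for illustration only.
sample_events = [
    (datetime(2024, 1, 1, 2, 15, 7), DISCONNECT_ID),
    (datetime(2024, 1, 1, 2, 15, 7), RECONNECT_ID),
    (datetime(2024, 1, 1, 14, 30, 0), DISCONNECT_ID),
    (datetime(2024, 1, 1, 14, 30, 4), RECONNECT_ID),
]
# Hypothetical backup window on the Secondary.
backup_start = datetime(2024, 1, 1, 2, 0, 0)
backup_end = datetime(2024, 1, 1, 3, 0, 0)

outages = pair_outages(sample_events)
during_backup = [o for o in outages if overlaps_window(o, backup_start, backup_end)]
```

Since 35267 is logged per database, several disconnects can precede a single reconnect; the sketch keeps only the earliest unmatched disconnect as the start of each drop.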
Any ideas?
Thanks in advance! ChrisRDBA