Channel: SQL Server High Availability and Disaster Recovery forum

SQL Server 2012 did not fail over successfully. What happened?

Hi experts,

We run our system on SQL Server 2012 AlwaysOn Availability Groups. After a failure, I could not connect to the standby SQL Server 2012 replica; its database showed Not Synchronizing / Recovery Pending. Here is what I found in the logs. What happened? Please help.

---

1. SQL Server error log

2014-02-24 18:15:54.58 spid48s     A connection timeout has occurred on a previously established connection to availability replica 'DL980-4' with id [AA216224-A495-4821-B121-F01FEF5132B8].  Either a networking or a firewall issue exists or the availability replica has transitioned to the resolving role.

2014-02-24 18:15:54.60 spid38s     AlwaysOn Availability Groups connection with secondary database terminated for primary database 'TCP' on the availability replica with Replica ID: {aa216224-a495-4821-b121-f01fef5132b8}. This is an informational message only. No user action is required.

2014-02-24 18:16:04.60 spid38s     A connection for availability group 'AGTCP' from availability replica 'DL980-3' with id  [8BA51030-C95F-4944-A8EE-43C44241EC08] to 'DL980-4' with id [AA216224-A495-4821-B121-F01FEF5132B8] has been successfully established.  This is an informational message only. No user action is required.

2014-02-24 18:16:04.60 spid41s     AlwaysOn Availability Groups connection with secondary database established for primary database 'TCP' on the availability replica with Replica ID: {aa216224-a495-4821-b121-f01fef5132b8}. This is an informational message only. No user action is required.

2014-02-24 18:16:29.93 spid38s     A connection timeout has occurred on a previously established connection to availability replica 'DL980-4' with id [AA216224-A495-4821-B121-F01FEF5132B8].  Either a networking or a firewall issue exists or the availability replica has transitioned to the resolving role.

2014-02-24 18:16:29.93 spid38s     AlwaysOn Availability Groups connection with secondary database terminated for primary database 'TCP' on the availability replica with Replica ID: {aa216224-a495-4821-b121-f01fef5132b8}. This is an informational message only. No user action is required.

2014-02-24 18:16:35.82 spid17s     SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [L:\tcpdata4\TCPDATA39.ndf] in database [TCP] (5).  The OS file handle is 0x000000000000A300.  The offset of the latest long I/O is: 0x0000173cfd0000

2014-02-24 18:16:35.82 spid17s     SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [l:\tempdb4\tempdb8.ndf] in database [tempdb] (2).  The OS file handle is 0x0000000000001930.  The offset of the latest long I/O is: 0x000004994e0000

2014-02-24 18:16:35.82 spid17s     SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [l:\tempdb4\tempdb7.ndf] in database [tempdb] (2).  The OS file handle is 0x0000000000001964.  The offset of the latest long I/O is: 0x000004993e0000

2014-02-24 18:16:35.82 spid17s     SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [L:\tcpdata4\TCPDATA43.ndf] in database [TCP] (5).  The OS file handle is 0x000000000000AEAC.  The offset of the latest long I/O is: 0x000019efbf4000

2014-02-24 18:16:35.82 spid17s     SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [L:\tcpdata4\TCPDATA41.ndf] in database [TCP] (5).  The OS file handle is 0x0000000000008914.  The offset of the latest long I/O is: 0x0000270953c000

2014-02-24 18:16:35.82 spid17s     SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [L:\tcpdata4\TCPDATA48.ndf] in database [TCP] (5).  The OS file handle is 0x0000000000002888.  The offset of the latest long I/O is: 0x00000ec75a0000

2014-02-24 18:16:35.83 spid17s     SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [L:\tcpdata4\TCPDATA44.ndf] in database [TCP] (5).  The OS file handle is 0x00000000000022A4.  The offset of the latest long I/O is: 0x000002c5e60000

2014-02-24 18:16:35.83 spid17s     SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [L:\tcpdata4\TCPDATA40.ndf] in database [TCP] (5).  The OS file handle is 0x0000000000000880.  The offset of the latest long I/O is: 0x000022e1642000

2014-02-24 18:16:35.83 spid17s     SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [L:\tcpdata4\TCPDATA47.ndf] in database [TCP] (5).  The OS file handle is 0x00000000000028F4.  The offset of the latest long I/O is: 0x000027d8946000

2014-02-24 18:16:35.83 spid17s     SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [L:\tcpdata4\TCPDATA45.ndf] in database [TCP] (5).  The OS file handle is 0x000000000000AC80.  The offset of the latest long I/O is: 0x000022ea45c000

2014-02-24 18:16:35.83 spid17s     SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [L:\tcpdata4\TCPDATA42.ndf] in database [TCP] (5).  The OS file handle is 0x0000000000001FA8.  The offset of the latest long I/O is: 0x00002829b14000

2014-02-24 18:16:35.83 spid17s     SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [L:\tcpdata4\TCPDATA38.ndf] in database [TCP] (5).  The OS file handle is 0x0000000000002874.  The offset of the latest long I/O is: 0x0000207d4ba000

2014-02-24 18:16:35.83 spid17s     SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [L:\tcpdata4\TCPDATA37.ndf] in database [TCP] (5).  The OS file handle is 0x0000000000001D5C.  The offset of the latest long I/O is: 0x000025b1f1a000
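To make the long-I/O warnings above easier to read at a glance, here is a minimal Python sketch that tallies them per database and per directory. The regular expression is written only against the message format shown in this excerpt, and the function name `summarize_long_io` is my own; treat it as an illustration, not a general error log parser. In the excerpt above, all 13 warnings are on the L: drive (11 for TCP data files, 2 for tempdb files).

```python
import re
from collections import Counter

# Matches the "I/O requests taking longer than 15 seconds" warning as it
# appears in the excerpt above. Other builds may word the message differently.
LONG_IO = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{2})\s+\S+\s+"
    r"SQL Server has encountered (?P<count>\d+) occurrence\(s\) of I/O requests "
    r"taking longer than 15 seconds to complete on file \[(?P<file>[^\]]+)\] "
    r"in database \[(?P<db>[^\]]+)\]"
)

def summarize_long_io(lines):
    """Return (per-database, per-directory) totals of long-I/O occurrences."""
    per_db = Counter()
    per_dir = Counter()
    for line in lines:
        m = LONG_IO.match(line.strip())
        if not m:
            continue  # not a long-I/O warning
        n = int(m.group("count"))
        per_db[m.group("db")] += n
        # Lower-case the directory because the log mixes "L:\" and "l:\".
        directory = m.group("file").rsplit("\\", 1)[0].lower()
        per_dir[directory] += n
    return per_db, per_dir

# Two lines copied from the error log above.
sample = [
    r"2014-02-24 18:16:35.82 spid17s     SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [L:\tcpdata4\TCPDATA39.ndf] in database [TCP] (5).  The OS file handle is 0x000000000000A300.  The offset of the latest long I/O is: 0x0000173cfd0000",
    r"2014-02-24 18:16:35.82 spid17s     SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [l:\tempdb4\tempdb8.ndf] in database [tempdb] (2).  The OS file handle is 0x0000000000001930.  The offset of the latest long I/O is: 0x000004994e0000",
]
per_db, per_dir = summarize_long_io(sample)
print(dict(per_db))
print(dict(per_dir))
```

Feeding the full error log through the same function would show whether the stalls are confined to one volume or spread across several.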

---

2. AlwaysOn Extended Event Log - error_report @ 2014-02-24 18:15:54

A connection timeout has occurred on a previously established connection to availability replica 'DL980-4' with id [AA216224-A495-4821-B121-F01FEF5132B8].  Either a networking or a firewall issue exists or the availability replica has transitioned to the resolving role.

---

3. Cluster log

00002424.00002bb4::2014/02/24-10:16:58.565 INFO  [RES] Network Name: Agent: Sending request Netname/RecheckConfig to NN:6e26fe17-09c2-4f54-8aae-52678b351486:Netbios

00002424.00003284::2014/02/24-10:16:58.565 INFO  [RES] Network Name <AGTCP_tccdb4>: Netbios: Slow Operation, FinishWithReply: 0

00002424.00003284::2014/02/24-10:16:58.565 INFO  [RES] Network Name:  [NN] got sync reply: 0

00002424.00003284::2014/02/24-10:16:58.565 INFO  [RES] Network Name <AGTCP_tccdb4>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle

00002424.00003284::2014/02/24-10:17:03.566 INFO  [RES] Network Name: Agent: Sending request Netname/RecheckConfig to NN:6e26fe17-09c2-4f54-8aae-52678b351486:Netbios

00002424.00002bb4::2014/02/24-10:17:03.566 INFO  [RES] Network Name <AGTCP_tccdb4>: Netbios: Slow Operation, FinishWithReply: 0

00002424.00002bb4::2014/02/24-10:17:03.566 INFO  [RES] Network Name:  [NN] got sync reply: 0

00002424.00002bb4::2014/02/24-10:17:03.566 INFO  [RES] Network Name <AGTCP_tccdb4>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle

00002428.00000fc0::2014/02/24-10:17:06.046 ERR   [RES] SQL Server Availability Group: [hadrag] Failure detected, diagnostics heartbeat is lost

00002428.00000fc0::2014/02/24-10:17:06.046 ERR   [RES] SQL Server Availability Group <AGTCP>: [hadrag] Availability Group is not healthy with given HealthCheckTimeout and FailureConditionLevel

00002428.00000fc0::2014/02/24-10:17:06.046 ERR   [RES] SQL Server Availability Group <AGTCP>: [hadrag] Resource Alive result 0.

00002428.00000fc0::2014/02/24-10:17:06.046 ERR   [RES] SQL Server Availability Group: [hadrag] Failure detected, diagnostics heartbeat is lost

00002428.00000fc0::2014/02/24-10:17:06.046 ERR   [RES] SQL Server Availability Group <AGTCP>: [hadrag] Availability Group is not healthy with given HealthCheckTimeout and FailureConditionLevel

00002428.00000fc0::2014/02/24-10:17:06.046 ERR   [RES] SQL Server Availability Group <AGTCP>: [hadrag] Resource Alive result 0.

00002428.00000fc0::2014/02/24-10:17:06.046 WARN  [RHS] Resource AGTCP IsAlive has indicated failure.

000016d4.00004e48::2014/02/24-10:17:06.046 INFO  [RCM] HandleMonitorReply: FAILURENOTIFICATION for 'AGTCP', gen(2) result 1/0.

000016d4.00004e48::2014/02/24-10:17:06.046 INFO  [RCM] Res AGTCP: Online -> ProcessingFailure( StateUnknown )

000016d4.00004e48::2014/02/24-10:17:06.046 INFO  [RCM] TransitionToState(AGTCP) Online-->ProcessingFailure.

000016d4.00003640::2014/02/24-10:17:06.046 INFO  [GEM] Sending 1 messages as a batched GEM message

000016d4.00004e48::2014/02/24-10:17:06.046 INFO  [RCM] rcm::RcmGroup::UpdateStateIfChanged: (AGTCP, Online --> Pending)

000016d4.00004e48::2014/02/24-10:17:06.046 ERR   [RCM] rcm::RcmResource::HandleFailure: (AGTCP)

000016d4.00004e48::2014/02/24-10:17:06.046 INFO  [RCM] resource AGTCP: failure count: 2, restartAction: 2 persistentState: 1.

000016d4.00004e48::2014/02/24-10:17:06.046 INFO  [RCM] numDependents is zero, auto-returning true

000016d4.00004e48::2014/02/24-10:17:06.046 INFO  [RCM] Greater than restartPeriod time has elapsed since first failure of AGTCP, resetting failureTime and failureCount.

000016d4.00004e48::2014/02/24-10:17:06.046 INFO  [RCM] Will queue immediate restart (500 milliseconds) of AGTCP after terminate is complete.

000016d4.00004e48::2014/02/24-10:17:06.046 INFO  [RCM] Res AGTCP: ProcessingFailure -> WaitingToTerminate( DelayRestartingResource )

000016d4.00004e48::2014/02/24-10:17:06.046 INFO  [RCM] TransitionToState(AGTCP) ProcessingFailure-->[WaitingToTerminate to DelayRestartingResource].

000016d4.00004e48::2014/02/24-10:17:06.047 INFO  [RCM] Res AGTCP: [WaitingToTerminate to DelayRestartingResource] -> Terminating( DelayRestartingResource )

000016d4.00004e48::2014/02/24-10:17:06.047 INFO  [RCM] TransitionToState(AGTCP) [WaitingToTerminate to DelayRestartingResource]-->[Terminating to DelayRestartingResource].

00002428.0000111c::2014/02/24-10:17:06.047 INFO  [RES] SQL Server Availability Group: [hadrag] Stopping Health Worker Thread

00002428.000027e4::2014/02/24-10:17:06.047 INFO  [RES] SQL Server Availability Group: [hadrag] Health worker was asked to terminate

000016d4.00002318::2014/02/24-10:17:06.047 INFO  [GEM] Sending 1 messages as a batched GEM message

00002424.00003284::2014/02/24-10:17:06.050 INFO  [RES] Network Name <AGTCP_tccdb4>: Getting Read/Write private properties

00002424.00002bb4::2014/02/24-10:17:06.052 INFO  [RES] Network Name <AGTCP_tccdb4>: Getting Read/Write private properties

000016d4.00000dc0::2014/02/24-10:17:06.063 INFO  [NM] Received request from client address DL980-3.

000016d4.00002ae8::2014/02/24-10:17:06.064 INFO  [NM] Received request from client address DL980-3.

000016d4.0000212c::2014/02/24-10:17:06.067 INFO  [RCM] ignored non-local state Pending for group AGTCP

00002424.00002bb4::2014/02/24-10:17:06.079 INFO  [RES] Network Name <AGTCP_tccdb4>: Getting Read/Write private properties

00002424.00003284::2014/02/24-10:17:06.081 INFO  [RES] Network Name <AGTCP_tccdb4>: Getting Read/Write private properties

00002424.00002bb4::2014/02/24-10:17:08.567 INFO  [RES] Network Name: Agent: Sending request Netname/RecheckConfig to NN:6e26fe17-09c2-4f54-8aae-52678b351486:Netbios

00002424.00003284::2014/02/24-10:17:08.567 INFO  [RES] Network Name <AGTCP_tccdb4>: Netbios: Slow Operation, FinishWithReply: 0

00002424.00003284::2014/02/24-10:17:08.567 INFO  [RES] Network Name:  [NN] got sync reply: 0

00002424.00003284::2014/02/24-10:17:08.567 INFO  [RES] Network Name <AGTCP_tccdb4>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle
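One thing to watch when lining up these logs: the cluster log timestamps (10:17) are eight hours behind the SQL Server error log timestamps (18:16). The Python sketch below puts both on one clock. It assumes the cluster log is written in UTC and the error log in local time at UTC+8; that offset is inferred from the two excerpts, not confirmed, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta

# ASSUMPTION: cluster log is UTC, SQL Server error log is local time (UTC+8).
LOCAL_OFFSET = timedelta(hours=8)

def cluster_ts_to_local(ts):
    """Parse a cluster log timestamp such as '2014/02/24-10:17:06.046'."""
    return datetime.strptime(ts, "%Y/%m/%d-%H:%M:%S.%f") + LOCAL_OFFSET

def errorlog_ts(ts):
    """Parse a SQL Server error log timestamp such as '2014-02-24 18:16:35.82'."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f")

# First long-I/O warning in the error log vs. the "diagnostics heartbeat
# is lost" entry in the cluster log.
long_io = errorlog_ts("2014-02-24 18:16:35.82")
heartbeat_lost = cluster_ts_to_local("2014/02/24-10:17:06.046")

gap_seconds = (heartbeat_lost - long_io).total_seconds()
print(heartbeat_lost.isoformat(), round(gap_seconds, 3))
```

Under that assumption, the heartbeat-lost entry falls about 30 seconds after the long-I/O warnings, which places the I/O stall before the health check failure in the sequence of events.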

