Channel: SQL Server High Availability and Disaster Recovery forum

Cluster Fail-over - Failed to bring secondary node online ??


Hi

Server: Windows Server 2008

DB Server: SQL Server 2008 (SP1)

Here is the series of events that occurred.

1.) Event ID: 1135

Cluster node 'XYZ' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

2.) Event ID: 1049

     Cluster IP address resource 'SQL IP Address 1 (XYZ)' cannot be brought online because a duplicate IP address '10.9.8.113' was detected on the network.  Please ensure all IP addresses are unique.

3.) Event ID: 1069

     Cluster resource 'SQL IP Address 1 (XYZ)' in clustered service or application 'SQL Server (MSSQLSERVER)' failed.

4.) Event ID: 1049

     Cluster IP address resource 'Cluster IP Address' cannot be brought online because a duplicate IP address '10.9.8.112' was detected on the network.  Please ensure all IP addresses are unique.

5.) Event ID: 1069

    Cluster resource 'Cluster IP Address' in clustered service or application 'Cluster Group' failed.

6.) Event ID: 1066

Cluster disk resource 'Cluster Disk 25' indicates corruption for volume '\\?\Volume{88552e6f-aea2-11df-9790-0026b92fffa7}'. Chkdsk is being run to repair problems. The disk will be unavailable until Chkdsk completes. Chkdsk output will be logged to file 'C:\Windows\Cluster\Reports\ChkDsk_ResCluster Disk 25_Disk16Part1.log'. Chkdsk may also write information to the Application Event Log.

7.) Event ID: 1066

Cluster disk resource 'Cluster Disk 26' indicates corruption for volume '\\?\Volume{88552e05-aea2-11df-9790-0026b92fffa7}'. Chkdsk is being run to repair problems. The disk will be unavailable until Chkdsk completes. Chkdsk output will be logged to file 'C:\Windows\Cluster\Reports\ChkDsk_ResCluster Disk 26_Disk4Part1.log'. Chkdsk may also write information to the Application Event Log.

8.) Event ID: 1049

 (Same message as point 2)

9.) Event ID: 1069

     (Same message as point 3)

10.) Event ID: 1049

(same message as point 4)

11.) Event ID: 1069

       (same message as point 5)

12.) Event ID: 1205

    The Cluster service failed to bring clustered service or application 'Cluster Group' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application.

13.) Event ID: 1069

      Cluster resource 'Cluster Disk 17' in clustered service or application 'SQL Server (MSSQLSERVER)' failed.

14.) Event ID: 1049

      (same message as point 2)

15.) Event ID: 1069

Cluster resource 'SQL IP Address 1 (XYZ)' in clustered service or application 'SQL Server (MSSQLSERVER)' failed.

16.) Event ID: 1205

 The Cluster service failed to bring clustered service or application 'SQL Server (MSSQLSERVER)' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application.
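To make the pattern in the sequence above easier to see, here is a minimal Python sketch that tallies the 16 events by event ID. The list is transcribed by hand from points 1 to 16 above; the names EVENT_IDS and tally are only illustrative and are not part of any cluster tooling.

```python
from collections import Counter

# Event IDs transcribed in order from points 1-16 above (hand-copied, not read from a log).
EVENT_IDS = [1135, 1049, 1069, 1049, 1069, 1066, 1066, 1049,
             1069, 1049, 1069, 1205, 1069, 1049, 1069, 1205]


def tally(event_ids):
    """Return (event_id, count) pairs, most frequent first."""
    return Counter(event_ids).most_common()


if __name__ == "__main__":
    for event_id, count in tally(EVENT_IDS):
        print(f"Event ID {event_id}: {count} occurrence(s)")
```

The tally shows that the resource failures (1069) and the duplicate IP address errors (1049) make up most of the sequence, while the node removal (1135) appears only once, at the start.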

 

First of all, I went through all the logs and could not find the reason the failover was initiated. Shouldn't something be logged explaining why the failover happened? Secondly, after the failover the service would not come online because a duplicate IP address was detected. Later, when we tried to bring the service online manually from cluster management, it came online successfully. I don't understand how the duplicate IP address conflict would be resolved just by starting the service manually.

Lastly, we see a few errors related to physical disk resources between the failover retries. Could these be related to the failover error? Please help me troubleshoot these errors; I am not very experienced with clustering. Thanks in advance for your help. :)

Thanks

Mushtaq
