We have a two-node SQL Server cluster that has been running for quite some time, with four SQL instances on it. One day one of the instances stopped working: the SQL Server resource is in the Failed state and the SQL Server Agent resource is Offline. The name, Analysis Services, and storage resources are all Online. We are unable to get the SQL Server role to start. However, if we go into Services on either node and start the services manually, they run fine on the server (they run locally but still show up as failed in the cluster). We have tried rebooting both nodes to reset the cluster, checked logins, and a few other things. The logs aren't very helpful. Any ideas?
SQLAGENT.OUT
2015-07-24 07:59:07 - ? [100] Microsoft SQLServerAgent version 11.0.5343.0 (X64 unicode retail build) : Process ID 14296
2015-07-24 07:59:07 - ? [495] The SQL Server Agent startup service account is DOMAIN\sqlagent_sqlcloud.
2015-07-24 07:59:08 - ? [393] Waiting for SQL Server to recover database 'msdb'...
2015-07-24 07:59:10 - ? [000] Configuration option 'show advanced options' changed from 0 to 1. Run the RECONFIGURE statement to install. [SQLSTATE 01000] (Message 15457) Configuration option 'Agent XPs' changed from 0 to 1. Run the RECONFIGURE statement to install. [SQLSTATE 01000] (Message 15457) Configuration option 'show advanced options' changed from 1 to 0. Run the RECONFIGURE statement to install. [SQLSTATE 01000] (Message 15457)
2015-07-24 07:59:10 - ? [101] SQL Server SQLCLOUD version 11.00.5343 (0 connection limit)
2015-07-24 07:59:10 - ? [102] SQL Server ODBC driver version 11.00.5058
2015-07-24 07:59:10 - ? [103] NetLib being used by driver is DBNETLIB; Local host server is
2015-07-24 07:59:10 - ? [310] 24 processor(s) and 131038 MB RAM detected
2015-07-24 07:59:10 - ? [339] Local computer is SQLCLOUD running Windows NT 6.2 (9200)
2015-07-24 07:59:11 - ? [432] There are 12 subsystems in the subsystems cache
2015-07-24 07:59:11 - ! [364] The Messenger service has not been started - NetSend notifications will not be sent
2015-07-24 07:59:11 - ? [129] SQLSERVERAGENT starting under Windows NT service control
2015-07-24 07:59:11 - + [475] Database Mail is not enabled for agent notifications.
2015-07-24 07:59:11 - + [396] An idle CPU condition has not been defined - OnIdle job schedules will have no effect
2015-07-24 07:59:11 - + [408] SQL Server MSSQLSERVER is clustered - AutoRestart has been disabled
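To make the log above easier to scan, here is a minimal Python sketch that keeps only the non-informational entries. It assumes the usual SQLAGENT.OUT convention that `?` marks information, `+` a warning, and `!` an error; the function names `parse_agent_log` and `non_info` are made up for illustration, and the sample text is just three lines copied from the log above.

```python
import re

# Assumed meaning of the marker after the timestamp in SQLAGENT.OUT.
SEVERITY = {"?": "info", "+": "warning", "!": "error"}

# Matches lines such as:
# 2015-07-24 07:59:11 - ! [364] The Messenger service has not been started ...
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) - "
    r"(?P<sev>[?+!]) \[(?P<code>\d+)\] (?P<msg>.*)$"
)


def parse_agent_log(text):
    """Split SQLAGENT.OUT text into a list of entry dicts (hypothetical helper)."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            entries.append({
                "timestamp": match.group("ts"),
                "severity": SEVERITY[match.group("sev")],
                "code": match.group("code"),  # kept as text so '000' survives
                "message": match.group("msg"),
            })
        elif entries and line.strip():
            # A wrapped line belongs to the previous entry.
            entries[-1]["message"] += " " + line.strip()
    return entries


def non_info(entries):
    """Return only warnings and errors (hypothetical helper)."""
    return [e for e in entries if e["severity"] != "info"]


SAMPLE = """2015-07-24 07:59:11 - ? [432] There are 12 subsystems in the subsystems cache
2015-07-24 07:59:11 - ! [364] The Messenger service has not been started - NetSend notifications will not be sent
2015-07-24 07:59:11 - + [408] SQL Server MSSQLSERVER is clustered - AutoRestart has been disabled"""

if __name__ == "__main__":
    for entry in non_info(parse_agent_log(SAMPLE)):
        print(entry["severity"], entry["code"], entry["message"])
```

Run against the full log above, this leaves only the Messenger service error and the three warnings, none of which explains why the cluster resource fails.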
Cluster Events:
Event ID 1069: Cluster resource 'SQL Server' of type 'SQL Server' in clustered role 'SQL Server (SQLCLOUD)' failed. Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.
Event ID 1254: Clustered role 'SQL Server (SQLCLOUD)' has exceeded its failover threshold. It has exhausted the configured number of failover attempts within the failover period of time allotted to it and will be left in a failed state. No additional attempts will be made to bring the role online or fail it over to another node in the cluster. Please check the events associated with the failure. After the issues causing the failure are resolved the role can be brought online manually or the cluster may attempt to bring it online again after the restart delay period.
Event ID 1205: The Cluster service failed to bring clustered service or application 'SQL Server (SQLCLOUD)' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application.