Channel: SQL Server High Availability and Disaster Recovery forum

AlwaysON Availability group setting on SQL Server 2016 in AZURE VM

We are trying to set up a failover cluster on Windows Server 2016 with SQL Server 2016. On the SQL Server side, we are able to set up the availability group with an active listener. However, when we stop the primary SQL Server, we get the following error in the failover cluster:

Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it.  Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

Error 1069


Microsoft always on availability cluster failing occasionally.

I have three 3-node clusters in my environment. There are two synchronous nodes at one data center and one asynchronous node at another data center. Each node has its own disks, so there's no shared storage. Occasionally, maybe once a month, my test cluster becomes unavailable: it won't ping and clients can't connect to the SQL cluster listener. It happened again this morning and the cluster was unavailable for about half an hour; then, without anything being changed, it came back to life. In the past, I was able to initiate a manual failover/failback and it would come back right away, but I'd like to find out why this is happening in the first place. This does not happen with the other clusters.

There are no cluster errors registered in event viewer.  No nodes fail, I can connect to each one individually.  The cluster name does not ping and neither does the SQL listener, so that leads me to believe it is a Microsoft clustering problem and probably not a SQL issue.

I ran Get-ClusterLog and have been poring over the output. I do see DBG messages like:


[Verbose] 000009ac.00001678::2019/02/12-09:46:49.619 DBG   [GEM] Node 1: GEM Id 3565 has been ack'ed by every node. Unacknowledged Message Count = 5
[Verbose] 000009ac.00001678::2019/02/12-09:46:49.619 DBG   [GEM] Node 1: GEM Id 3566 has been ack'ed by every node. Unacknowledged Message Count = 4
[Verbose] 000009ac.00001678::2019/02/12-09:46:49.621 DBG   [GEM] Node 1: GEM Id 3567 has been ack'ed by every node. Unacknowledged Message Count = 3
[Verbose] 000009ac.00001678::2019/02/12-09:46:49.621 DBG   [GEM] Node 1: GEM Id 3568 has been ack'ed by every node. Unacknowledged Message Count = 2
[Verbose] 000009ac.00001678::2019/02/12-09:46:49.621 DBG   [GEM] Node 1: GEM Id 3569 has been ack'ed by every node. Unacknowledged Message Count = 1
[Verbose] 000009ac.00001678::2019/02/12-09:46:49.621 DBG   [GEM] Node 1: GEM Id 3570 has been ack'ed by every node. Unacknowledged Message Count = 0

These appear around the time it came back up.

I also often see entries like the following; the "Possible owners list size is 0" message is interesting.


[Verbose] 00001a40.00002920::2019/02/12-09:48:17.162 INFO  [RES] Distributed Network Name <CAUSACSQvu8>: Netname received Refresh clones message
[Verbose] 00001a40.00002920::2019/02/12-09:48:17.162 INFO  [RES] Distributed Network Name <CAUSACSQvu8>:Possible owners list size is 0
[Verbose] 00001a40.000007d0::2019/02/12-09:48:17.165 INFO  [RES] Network Name: Agent: InitializeModule, Trying to initialize Module(ad4aa780-67db-41df-977e-35ef8ac4be5f,Client) when there is one already in Initialized/Idle state
[Verbose] 00001a40.00002920::2019/02/12-09:48:17.165 INFO  [RES] Distributed Network Name <CAUSACSQvu8>: StartupClone - Client module already exists.
[Verbose] 00001a40.000007d0::2019/02/12-09:48:17.165 INFO  [RES] Distributed Network Name <CAUSACSQvu8>: Client: Synching with slow operation

I don't see a cause and effect relationship when this fails so I can't trigger it.  It's like chasing a ghost.  If anyone has a place to start looking, I'd appreciate it.
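One way to narrow down a log like this is to count entries by level and component inside the outage window, so that anything unusual stands out from the routine DBG/INFO traffic. The following is a minimal Python sketch, assuming the line layout seen in the excerpts above (process.thread IDs, timestamp, level, bracketed component). The function names are hypothetical, and the timestamps are compared as written in the log (Get-ClusterLog normally writes UTC unless local time is requested, which is worth confirming before lining them up with the outage).

```python
import re
from collections import Counter
from datetime import datetime

# Layout assumed from the excerpts in this post:
# [Verbose] <pid>.<tid>::<yyyy/mm/dd-hh:mm:ss.fff> <LEVEL> [<COMPONENT>] <message>
LINE_RE = re.compile(
    r"^(?:\[\w+\]\s+)?"
    r"(?P<pid>[0-9a-fA-F]{8})\.(?P<tid>[0-9a-fA-F]{8})::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<component>[^\]]+)\]\s*(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict for one cluster log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry.pop("ts"), "%Y/%m/%d-%H:%M:%S.%f")
    return entry


def summarize(lines, start, end):
    """Count (level, component) pairs for entries with start <= time <= end."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is not None and start <= entry["time"] <= end:
            counts[(entry["level"], entry["component"])] += 1
    return counts


# Two lines copied from the excerpts above.
sample = [
    "[Verbose] 000009ac.00001678::2019/02/12-09:46:49.619 DBG   [GEM] Node 1: "
    "GEM Id 3565 has been ack'ed by every node. Unacknowledged Message Count = 5",
    "[Verbose] 00001a40.00002920::2019/02/12-09:48:17.162 INFO  [RES] Distributed "
    "Network Name <CAUSACSQvu8>:Possible owners list size is 0",
]

window = (datetime(2019, 2, 12, 9, 46, 0), datetime(2019, 2, 12, 9, 49, 0))
for key, count in summarize(sample, *window).items():
    print(key, count)
```

Running the same summary over a window from a healthy period and comparing the two counts may show which components behave differently during the outage.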

Extend On-Prem AlwaysOn to Azure - Questions

Hello,

I want to extend an on-prem AlwaysOn AG to Azure. We already have an AlwaysOn AG set up and running, and we want to create additional replica(s) in Azure for HA. Here's my question: in addition to creating the Azure replica(s), do we need to create an Azure domain controller for this setup? Is this an absolute requirement? I have not been able to find a definitive answer.

Thank you

SPN registration for Alwayson

Which name should be registered for the SPN in an AlwaysOn deployment?

1. The AG listener name and port, or

2. The replica node name (cluster node name)
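For context, clients that connect through the listener with Kerberos request a ticket for the listener name, so an SPN for the listener name and port is generally registered on the SQL Server service account in addition to the usual SPNs for each replica instance. The Python sketch below only builds the `MSSQLSvc/<fqdn>:<port>` strings and the matching `setspn -S` command lines for review. The host names, the account, and the helper names are hypothetical, and it assumes every replica and the listener use the same port and service account.

```python
def mssql_spn(host_fqdn, port):
    """Build a SQL Server SPN of the form MSSQLSvc/<fqdn>:<port>."""
    return f"MSSQLSvc/{host_fqdn}:{port}"


def setspn_commands(service_account, listener_fqdn, replica_fqdns, port=1433):
    """Return setspn command lines for the listener and each replica.

    Assumption: one service account and one port for all names.
    The commands are only printed here, never executed.
    """
    names = [listener_fqdn, *replica_fqdns]
    return [f"setspn -S {mssql_spn(name, port)} {service_account}" for name in names]


# Hypothetical names for illustration only.
commands = setspn_commands(
    service_account="CONTOSO\\sqlsvc",
    listener_fqdn="aglistener.contoso.example",
    replica_fqdns=["sqlnode1.contoso.example", "sqlnode2.contoso.example"],
)
for command in commands:
    print(command)
```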

AlwaysOn Issue

Hi,

I received an alert, checked immediately, and found only the error below; there were no other errors:

Error: 35250, Severity: 16, State: 13.
The connection to the primary replica is not active.  The command cannot be processed.
Error: 35250, Severity: 16, State: 13.
The connection to the primary replica is not active.  The command cannot be processed.

I logged into the server immediately, and when I checked the dashboard, everything looked good.

I didn't find any errors on either the primary or the secondary.

Ms-SQL differential backups using VSS writer

We have some questions about differential backup and restore of MSSQL using its VSS writer.

During backup, my application first takes a full backup, which backs up both the database and transaction log files (i.e. the .mdf and .ldf files). During the differential backup, it backs up only the changed blocks of the database (.mdf file) provided by the MSSQL VSS writer. There are no issues with the backup.

Our application does a VSS-based restore. It first restores the full backup data, i.e. both the database (.mdf) and transaction log (.ldf) files. Then it writes the differential/changed chunks to the corresponding database (.mdf) file. The VSS writer does not report any error during the restore itself, but the databases are not accessible afterwards; they are in a corrupted state. The SQL Server service stops after the restore and cannot be started. The application logs contain errors related to a transaction log number mismatch, from which we conclude that the database restore has actually failed.

After restoring the VSS partial chunks, I also tried restoring the SQL data with SQL Server's restore command, i.e. "RESTORE Database [Database name]". That restore failed as well.

So I have a few queries here:

- Can we restore the database using only the partial/differential data chunks provided by VSS?
- If so, why does the restore fail while restoring the transaction logs? Am I missing a step?
- Is some kind of recovery command needed after restoring the full backup plus the partial chunk backup to bring the databases to a consistent state?

TCP/IP Issue in SQL Server

We got this alert from the application team:

  

Error Occurred at: 2/18/2019 11:17:01 AM
Subject: IncentiveFetch.aspx ERROR! - Some v5 Logix
Body:
Error in: IncentiveFetch.aspx

The following internal error has occurred:

Error Description: A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.)
Error Number: 121  

When I checked the SQL Server error log, I didn't find anything. We are using SQL Server 2016 Enterprise.

This secondary replica is not connected to the primary replica. The connected state is DISCONNECTED

Hi Guys

I am having an issue with an SQL Availability Group and am wondering if you can lend a hand troubleshooting:

I have two SQL Server 2017 instances running on Windows Server 2012 R2.

I have set them up in an Always On availability group with the availability mode set to Synchronous Commit and Readable Secondary set to No.

I thought this was all working fine, but while doing something unrelated in SQL Server Management Studio I noticed that the Availability Replicas status for the secondary had a red cross next to it. I then checked the availability group dashboard and can now see the following under the secondary:

Synchronization status: Not Synchronizing

Failover Readiness: Data Loss

This secondary replica is not connected to the primary replica. The connected state is DISCONNECTED.

At least one availability database on this availability replica has an unhealthy data synchronization state. If this is an asynchronous-commit availability replica, all availability databases should be in the SYNCHRONIZING state. If this is a synchronous-commit availability replica, all availability databases should be in the SYNCHRONIZED state.

The data synchronization state of this availability database is unhealthy. On an asynchronous-commit availability replica, every availability database should be in the SYNCHRONIZING state. On a synchronous-commit replica, every availability database should be in the SYNCHRONIZED state.

I have checked the firewall, and port 5022 is open and not blocked. I have also checked netstat to confirm that sqlservr.exe is listening on 5022, which it is. I am now starting to think this could be permission related.
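A quick way to double-check the network side from each replica is to open a plain TCP connection to the other replica's mirroring endpoint port. The Python sketch below does only that; the helper name and the host name in the comment are hypothetical. A successful connection shows that the port is reachable from that machine, but it says nothing about whether the endpoint accepts the service account (CONNECT permission on the endpoint), so a pass here would be consistent with the permission theory rather than ruling it out.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example usage, run on the primary against the secondary (hypothetical name):
#   can_connect("sqlnode2.contoso.example", 5022)
# and then on the secondary against the primary.
```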

Can anyone help point me in the right direction on where to go from here?

Many thanks!


SQL Quorum

Hi Team,

Here is the scenario:

Environment: SQL Server 2016 Enterprise with AlwaysOn

1 primary + 2 secondaries (1 at the DR site)

Quorum --> file share witness.

We don't need the DR server. What happens if we remove it? Will the remaining 2 servers keep working, or do we need to change the quorum settings?
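The vote arithmetic behind this question can be sketched in a few lines of Python. This is a simplified static model, assuming one vote per node plus one for the file share witness and a strict majority requirement; it deliberately ignores dynamic quorum and dynamic witness, which adjust votes automatically on recent Windows Server versions. The function names are hypothetical.

```python
def majority(total_votes):
    """Smallest number of votes that is more than half of total_votes."""
    return total_votes // 2 + 1


def has_quorum(node_votes, has_witness, failed_nodes=0, witness_up=True):
    """Static check: do the surviving votes still form a majority?"""
    total = node_votes + (1 if has_witness else 0)
    remaining = (node_votes - failed_nodes) + (1 if has_witness and witness_up else 0)
    return remaining >= majority(total)


# Two nodes plus the file share witness (the DR node removed): 3 votes, majority 2.
print(majority(3))                                           # 2
print(has_quorum(2, has_witness=True, failed_nodes=1))       # True
# The same two nodes without a witness cannot lose either node.
print(has_quorum(2, has_witness=False, failed_nodes=1))      # False
```

Under this model, two nodes plus the existing witness still tolerate the loss of any single vote, which is why a witness is normally kept for a two-node cluster.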

Best way to migrate existing database under Always on (2012) to new AG (2016) with minimal downtime (under 5 mins)

Hi Experts,

We are looking to migrate our existing DB server, configured for Always On (SQL 2012), to a new availability group on SQL 2016 with minimal downtime (under 5 minutes).

Current Architecture(Windows Server 2012 and SQL Server 2012):

Node1 and Node2 (ABC data center with synchronous mode) and Node3  (XYZ data center with Asynchronous mode)

New Architecture (Windows Server 2016 and SQL Server 2016):

Node1 and Node2  (ABC data center with synchronous mode) and Node3  (XYZ data center with Asynchronous mode)




Multi Subnet Always On Availability group ( Both Primary and Secondary Nodes down, and DR Node is up) case

Dear MSSQL Experts,

We have set up a multi-subnet Always On configuration with 2 nodes at the primary (PR) site (synchronous) and one node at the disaster recovery (DR) site (asynchronous), with a disk witness at the PR site. I need to know what happens if both nodes at the PR site go down. What will happen when the nodes come back up after some time, say 10-30 minutes, and what can be done in such situations? What precautions should one take for a planned activity or an unplanned failure at the PR site?

I have read that we can force a failover to the DR site, but can anybody please share how I can then return it to the way it was previously configured, i.e. from DR back to PR?

My main concern is what happens to the PR site when both nodes are down, and whether we need to do anything for synchronization when it comes back up. Please correct me if I am mistaken.

Thanks,

Devendra


Devendra Yadav

SQLServer 2016 AlwaysOn with SQL 2017 Secondary replica

Hi,

We have SQL Server 2016 with a couple of databases configured in AlwaysOn with 2 secondary replicas. We would like to add another replica on SQL 2017 and test with it. Once everything looks good, we will start using SQL 2017 and decommission SQL 2016. In case of issues, we will go back to SQL 2016.

Is this supported, i.e. a SQL 2016 AlwaysOn group with SQL 2017 as a secondary replica?

I remember seeing a couple of places say it is possible, while others say it is not and that the SQL version has to be the same, so I got confused.

  

Thanks


Unable to give a user Control permission on availability group in SQL 2012 Enterprise SP4-OD

Hi, 

I am trying to give a login CONTROL permission on an availability group so that he can fail it over, but I am getting the error below. I have tried giving the user SA permission, but I still get the same error. I have tried restarting the servers and failing over, but it is not working. In SQL 2016 it works right away, so I am not sure whether it is a problem with the SQL version. Can anybody advise on this, please?

SQL version: SQL 2012 Enterprise SP4-OD

Error:

Cannot continue the execution because the session is in the kill state.

A severe error occurred on the current command.  The results, if any, should be discarded.

Removing second Ip from Listener

Hi,

I am planning to remove the secondary IP from the AlwaysOn listener (this is a multi-subnet setup), but the ADD/REMOVE buttons are grayed out.

Trouble with DPM 2016 installation

I am trying to install DPM 2016 on a Windows Server 2016 machine where I have installed SQL Server 2016 SP1. The server is joined to the domain, but the following issue occurred while installing DPM:

DPM Setup is unable to connect to MSSQLSERVER instance of SQL Server SQLSVR-02. (ID: 4307)

Verify that the specified computer and the instance of SQL Server meets the following requirements:

1)The computer is accessible over the network.

2)A firewall is not blocking requests from the DPM computer. For steps to configure the firewall on the SQL Server, follow the steps described here: http://go.microsoft.com/fwlink/?LinkId=94001.

3)The specified user belongs to the Administrator group on the computer running the SQL Server instance and the sysadmin role on the SQL Server instance.

4)The SQL Browser service is running on the SQL server.

5)TCP/IP protocol is enabled for the specified instance of the SQL server.

I have met all of the above requirements, but the same error persists.


Datawarehouse is good on AlwaysOn AG group?

We will have a two-node AAG, and the database has a nightly ETL job that runs for 20 minutes at around 30,000 batches/second.

Do you think that is OK on an AAG?

Thanks


Thank you Skiiiiii

AlwaysOn Backups

Hi,

I have configured full and log backups on the secondary replicas (SQL Server 2016) using Ola Hallengren's scripts, and everything is working perfectly.

However, per a business requirement, we need to configure differential backups.

As we know, differential backups are not supported on secondary replicas. Due to the criticality of the DB, we can't perform full backups on the primary.

So please help us: what is the workaround?

If I take a normal differential backup on the primary replica, is that going to cause any problem?

how to change the IP of listener

Hi,

I have set up a 2-node Always On cluster, and now I want to change the IP address of the listener. Please guide me on how to change it. Secondly, can I use a virtual IP for the listener?

Thanks


iffi

Windows patching of always on cluster nodes

Hi,

Here is my situation

1. I have a 2 node always on cluster with a fileshare as a witness

2. I install windows patch on the secondary node and reboot it

3. Then I failover my availability group/s to the secondary

4. Install the Windows patch on the original primary (right now it's a secondary due to the previous step)

5. Reboot the original primary and then failback the availability group/s

As I understand, during the failover and failback (steps 3 and 5), users will briefly lose connection to the sql databases.

I can make this bit less painful by not failing back (skip the step 5). 

However, I am curious whether there is a way to also fail over active connections and queries along with the availability groups in step 3.

I am open to any ideas, suggestions, third party tools etc. that can help achieve this goal.

Thank you.

Guru

Full backup of SQL

We are having a debate in my office because our main DB's transaction log is growing rapidly.

The debate is as follows:

We take a full backup of the SQL DB daily, and we think the transaction log should shrink after that full backup. But after the full backup, the size of the transaction log is still the same.

The vendor is asking us to shrink the DB manually every day, and we believe we don't need this because we take a full backup daily.

Can someone answer this for me: should a full backup reduce the log size for that DB, or not?
