Channel: SQL Server High Availability and Disaster Recovery forum
Viewing all 4689 articles

3-node SQL multisite cluster

All,

I am testing a 3-node multisite SQL cluster in a lab: 2 nodes on one site and 1 node on another site.

As this is a lab, we haven't got replicated SANs.

What we have done is:

2 nodes on the 1st site - LUNs presented from the SAN

1 node on the second site - manually created the disks (local, not SAN), replicating exactly the same structure / drive letters etc. as in site 1

To simulate testing, we have manually copied all the SQL data from the volumes in site A to site B, i.e. the 3rd node.

Do you think this testing would work?

We may have some other things missing, but the basic question is: will / should the above work as a test?

Would be grateful for any help. Regards


AlwaysOn: Log_Send_Queue_Size is increasing but the log_send_rate keeps decreasing on primary replica?

Hi Folks,

I have configured AlwaysOn for a 1.5 TB database in SQL 2014 (SP1 + CU3) across data centers in asynchronous mode.

The database has heavy transaction activity. In the AlwaysOn Dashboard, I found that Log_Send_Queue_Size is increasing but the log_send_rate keeps decreasing. Estimated Data Loss (Time) is showing around 2 hrs.

Does anyone have an idea about this?

How can we fix this issue?
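
The dashboard figures mentioned above can also be read directly from the DMVs. The query below is only a sketch, assuming it is run on the primary replica; the drain estimate is a naive division that ignores new log being generated in the meantime, so treat it as a rough indicator rather than a measurement.

```sql
-- Sketch: per-database send/redo queues for the remote replicas, as seen from the primary.
-- log_send_queue_size and redo_queue_size are reported in KB; the rates are in KB/sec.
SELECT
    ar.replica_server_name,
    DB_NAME(drs.database_id)        AS database_name,
    drs.synchronization_state_desc,
    drs.log_send_queue_size,
    drs.log_send_rate,
    drs.redo_queue_size,
    drs.redo_rate,
    -- Naive estimate only: assumes no further log is generated while the queue drains.
    CASE WHEN drs.log_send_rate > 0
         THEN drs.log_send_queue_size / drs.log_send_rate
    END                             AS est_seconds_to_drain_send_queue
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
    ON ar.replica_id = drs.replica_id
WHERE drs.is_local = 0              -- rows describing the secondaries
ORDER BY drs.log_send_queue_size DESC;
```

Sampling this every few minutes would show whether the queue is growing steadily or only during specific workloads.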


Mark As Answer If My Reply Is Helpful
Thanks
Neeraj Bhandari (MCTS - Sql Server 2008)

AlwaysOn: AG or FCI?

As I understand it, SQL 2012/2014 supports two basic AlwaysOn architectures:

1) AlwaysOn Availability Group

2) AlwaysOn Failover Cluster Instance

Around my place ALL production DBs are expected to be restored ASAP after a disaster, so I don't see how option 1), where the unit of failover is a DB or a set of DBs, is that useful. We would need to create jobs, logins, etc. (i.e. things outside of the database), all under the watchful eye of the IS Managers and a clock. During that time production would be down. It seems that option 2) is the most likely way for us to "quickly" recover a full, working copy of production as it was before the failure. Do you agree?

TIA,

edm2

P.S. In fact, option 1 seems like a nuisance to me, as it possibly requires administrative overhead whenever a new DB is created. (Hope I don't forget to include a DB in an availability group that I wanted to be in there.)

read-only routing with single application connection

Good day colleagues,

My application supports a single database connection, and in the app console I can produce reports. If I include the app database in an AlwaysOn availability group with a read-intent replica, will SQL automatically route the “selects” to that second instance, thus offloading my application’s reporting activities, or do I need a separate db connection (maybe from a reporting app or CLI) with a connection string specifying read-only intent?
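
For context, read-only routing is configured per replica and is decided per connection, not per statement: to my knowledge only connections that go through the listener and specify ApplicationIntent=ReadOnly are candidates for routing. A minimal configuration sketch follows; the group name AG1, the replica names SQLNODE1/SQLNODE2, and the URL are hypothetical placeholders, not taken from the post.

```sql
-- Sketch only: AG1, SQLNODE1, SQLNODE2 and the routing URL are hypothetical names.

-- 1. Allow read-only connections on the secondary.
ALTER AVAILABILITY GROUP [AG1]
MODIFY REPLICA ON N'SQLNODE2'
WITH (SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));

-- 2. Tell the group where read-intent connections to that replica should be sent.
ALTER AVAILABILITY GROUP [AG1]
MODIFY REPLICA ON N'SQLNODE2'
WITH (SECONDARY_ROLE (READ_ONLY_ROUTING_URL = N'TCP://sqlnode2.example.com:1433'));

-- 3. Define the routing order used while SQLNODE1 is the primary.
ALTER AVAILABILITY GROUP [AG1]
MODIFY REPLICA ON N'SQLNODE1'
WITH (PRIMARY_ROLE (READ_ONLY_ROUTING_LIST = (N'SQLNODE2', N'SQLNODE1')));
```

With a setup like this, a single read-write application connection would stay on the primary; a second connection with read-only intent would be needed for the reporting workload.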

Many thanks,

Archie

Availability of Azure AlwaysOn HA-DR "image"?

In portal.azure.com, under Compute, I found the "SQL Server 2014 AlwaysOn" offering, which appears to support the HA portion of HA-DR. Is there a full HA-DR offering using AlwaysOn FCI?

TIA,

edm2

SQL server 2008 R2 Cluster issue

Hi All,

This morning we faced an issue in our cluster environment: we are getting the error message below in the error log continuously, with different SPIDs. Because of this, application connectivity was lost for 18 minutes. After we rebooted the application server, it came back to normal.

RAM size is 264 GB.

We restricted SQL max and min memory:

Min memory: 0, Max memory: 130 GB

Service pack: SQL Server 2008 R2 SP1

Error: The client was unable to reuse a session with SPID 1871, which had been reset for connection pooling. The failure ID is 29. This error may have been caused by an earlier operation failing. Check the error logs for failed operations immediately before this error message.

Microsoft SQL Server 2008 R2 (SP1) - 10.50.2500.0 (X64)   Jun 17 2011 00:54:03   Copyright (c) Microsoft Corporation  Enterprise Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1) 

SQL Server 2008 DB mirroring Active-Active

I was wondering if there are any issues with having two SQL Server instances running on two servers using database mirroring in an active-active setup?

For instance -

Server1               Server2
Instance - SQL1       Instance - SQL2
Database1             Database3
Database2             Database4
Instance - SQL2       Instance - SQL1
Database3 - DR        Database1 - DR
Database4 - DR        Database2 - DR

Does the above make sense? And is it possible?

Thanks Mark


Replicate SQL server database(PUBLISHER) to Always ON Availability Groups (Listener)(SUBSCRIBER)

https://msdn.microsoft.com/en-us/library/hh710046.aspx

I have configured replication from an AlwaysOn Availability Group (listener) as PUBLISHER, with a remote distributor, to XYZ as SUBSCRIBER, following the link above.

Now I want to know how to replicate data from XYZ SERVER as PUBLISHER to an AlwaysOn Availability Group (listener) as SUBSCRIBER, with the distributor database being on XYZ.

Example:

XYZ SQL SERVER as PUBLISHER and DISTRIBUTOR

to

AlwaysOn Availability Group (listener) as SUBSCRIBER

Please provide some guidelines.

Thanks,


Repeated error related to Windows Multisite Clustering

Hi,

I have successfully set up a multisite cluster using Windows Server 2012 R2. On top of this I have installed a clustered SQL Server 2012.

Everything looks fine except that I am seeing a lot of event ID 1195 in the event viewer of the node where cluster service is running currently. The details of the error are:

"Cluster network name resource 'Cluster Name' failed registration of one or more associated DNS name(s). The error code was '-1'. Ensure that the network adapters associated with dependent IP address resources are configured with access to at least one DNS server."

The event seems to be occurring every 15 minutes.

I have checked connectivity from the nodes to all DNS servers (we have three in total) and found everything fine. I also checked the DNS entries for the Cluster Name and found that each DNS server has two entries, one for each subnet. Please note that these entries were created manually by the DNS admin when we set up the environment.

Does anyone have any idea why this is happening?

Thanks in advance.

Regards

What is clustering, AlwaysOn and Mirroring

Please explain each one (clustering, AlwaysOn and Mirroring). What are the differences?

Service Account options for Availability Groups

Hi,

Planning/preparing a 5 node cluster hosting two AGs and hope someone can shed some light on specific account usage.

Environment is SQL 2014.  3 nodes in local datacenter, 2 nodes in remote datacenter.  All nodes will be members of the same Windows cluster, however we are not using SQL FCI, just AGs.

Configuration is that the prod AG will be hosted on 4 nodes.   Synchronous primary and secondary replicas in the local DC for HA, with a witness.  Then 2 asynchronous secondary replicas (non-voting) at the remote DC for DR and planned manual failovers.

The test AG has a sort of reverse configuration, with the synchronous primary/secondary replicas being at the remote DC and the async DR failover replica being the 3rd local datacenter node, which also hosts the dev db (non-AG).

We have a restricted A/D environment managed by our service provider.  We also have password policy restrictions, whereby all domain service accounts must have their passwords changed semi-annually.

So my question is, what is the best way to go about configuring the SQL service accounts, considering future password resets and avoiding downtime? Ignore the authentication type (Kerberos/NTLM) for the time being, as the highest concern is availability.

Ideally we would use gMSAs, but they are not supported. Alternatively, we could use a single domain service account across all nodes in the Windows cluster; however, I am unable to find official information on how a password change would affect the availability of the system. Local/virtual accounts seem to be a less-than-ideal option, and would require us to set up certificates and/or encryption, plus duplicate the logins across all SQL replicas and possibly run partially contained databases.

It's going to take me a couple weeks to sort out a MS support call through our provider, so any thoughts would be much appreciated.

Thanks,

Ryan

SQL Server 2012 R2 failover cluster on virtual machines - hyper V with iSCSI LUN

Hi Team,

I am looking for an article, video, or document on setting up a SQL 2012 R2 two-node failover cluster with iSCSI storage running on Hyper-V Server. I am looking for the number of networks and IP addresses required, and the minimum number of LUNs (shared storage) required, for setting up the SQL cluster. Any pointers will be appreciated. Thanks in advance.

Regards,

Understanding Always on Availability Synching (Log Shipping vs .bak files)

Hello everyone:  I hope this is the appropriate forum for this question.

I am a long running programmer who has recently started assisting our dba group due to reduced staff.

We recently implemented AOAG for a SQL Server of substantial size with the help of a contractor; however, there is a conflicting understanding in the group of how it should be maintained from this point.

It is a simple 2 node (primary and secondary) arrangement, with synchronous syncing, with a 3rd server that acts as the "witness" (quorum).

  1. Another contractor in the group believes that the two nodes communicate by means of reading and writing from .bak (backup) files he wants to keep on the witness (quorum) server; he has recently requested more space to house .bak files there for every database on it, and is convinced that without these files the two nodes are not actually syncing.

  2. The implementing contractor, on the other hand, has explained that the .bak files were only necessary for the initial synch and can now be removed if desired, as the primary and secondary nodes sync by a process called "log shipping"; the witness server merely watches (listens if you will?) to see if fail-over is necessary.

I would like to confirm explanation #2. As for #1, I have not been able to confirm it, and it would create more complications given the amount of space we have available for other maintenance tasks.
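
One way to check which explanation matches the running system, without relying on anyone's description, is to look at what the replicas report about themselves. As far as I know, availability replicas exchange log records over the database mirroring endpoint rather than through backup files; the sketch below lists that endpoint and then each database's synchronization progress. It assumes it is run on the primary replica by a login with permission to view server state.

```sql
-- Sketch: the endpoint the replicas use to talk to each other.
SELECT e.name, e.state_desc, e.role_desc, t.port
FROM sys.database_mirroring_endpoints AS e
JOIN sys.tcp_endpoints AS t
    ON t.endpoint_id = e.endpoint_id;

-- Sketch: per-database progress on each replica.
-- If last_hardened_lsn keeps advancing on the secondary, log is arriving there.
SELECT
    ar.replica_server_name,
    DB_NAME(drs.database_id) AS database_name,
    drs.synchronization_state_desc,
    drs.last_hardened_lsn,
    drs.last_commit_time
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
    ON ar.replica_id = drs.replica_id
ORDER BY database_name, ar.replica_server_name;
```

If the secondary rows show SYNCHRONIZED and advancing LSNs while no new .bak files are being written, that would support explanation #2.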

Thank you in advance!

P.S. I've tried reading the (rather abstract) Microsoft documents on the subject but I'm still learning.


Testing AO configuration

Hello All,

One of the SQL support teams in my organization has configured a 2-node SQL AlwaysOn setup. Is there any way I can verify that the AO configuration is correct?

Looking for some kind of smoke tests :)
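
As a starting point for such smoke tests, the catalog views and DMVs can give a quick overview of replica roles, modes, and health. This is only a sketch of a read-only check, assuming it is run on the primary replica; it does not replace an actual planned failover test.

```sql
-- Sketch: one row per replica with its configured modes and current health.
SELECT
    ag.name                          AS availability_group,
    ar.replica_server_name,
    ar.availability_mode_desc,       -- SYNCHRONOUS_COMMIT / ASYNCHRONOUS_COMMIT
    ar.failover_mode_desc,           -- AUTOMATIC / MANUAL
    ars.role_desc,
    ars.connected_state_desc,
    ars.synchronization_health_desc
FROM sys.availability_groups AS ag
JOIN sys.availability_replicas AS ar
    ON ar.group_id = ag.group_id
LEFT JOIN sys.dm_hadr_availability_replica_states AS ars
    ON ars.replica_id = ar.replica_id
ORDER BY ag.name, ar.replica_server_name;

-- Sketch: listener definition, if one was created.
SELECT ag.name AS availability_group, l.dns_name, l.port
FROM sys.availability_group_listeners AS l
JOIN sys.availability_groups AS ag
    ON ag.group_id = l.group_id;
```

A fuller test would usually also include connecting through the listener and performing a planned manual failover in a maintenance window.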

Thanks in advance !!!


Eswar

SQL Alert 1480 Not firing for a Particular database

All,

I set up an Alert for Message ID 19406 to be alerted whenever the Failover/Failback on our AAG happens. I also set up an Alert for Message ID 1480 when a particular database's role changes. I need this Alert to work since I have to enable TRUSTWORTHY on some of the databases that are on the nodes (Secondary when the Failover happens and Primary when the Failback happens). I created this Alert by selecting the Database Name while configuring the Alert.

The Alert for MessageID 19406 fires as expected and we get a total of 12 alerts (per Failover & Failback). Unfortunately, the alert for Message ID 1480 does not seem to happen for the particular database. Instead I get one email for every database in the AAG.

I'm not sure if this is how it is supposed to work. I cannot even specify the database name in the Message Text area since I have more than one database for which I need to change the TRUSTWORTHY setting. I was hoping to run a Job via the Alert to enable the TRUSTWORTHY setting.

Please kindly share your experiences!!

Thanks.

rgn

USE [msdb]
GO
EXEC msdb.dbo.sp_update_alert @name=N'AvailabilityGroupRoleChange-1480',
		@message_id=1480,
		@severity=0,
		@enabled=0,
		@delay_between_responses=0,
		@include_event_description_in=1,
		@database_name=N'NorthWind',
		@notification_message=N'Database Role Changed',
		@event_description_keyword=N'',
		@performance_condition=N'',
		@wmi_namespace=N'',
		@wmi_query=N'',
		@job_id=N'00000000-0000-0000-0000-000000000000'
GO
EXEC msdb.dbo.sp_update_notification @alert_name=N'AvailabilityGroupRoleChange-1480', @operator_name=N'DBAsGroup', @notification_method = 1
GO

EXEC msdb.dbo.sp_update_alert @name=N'AAG-StateChange-19406',
		@message_id=19406,
		@severity=0,
		@enabled=1,
		@delay_between_responses=0,
		@include_event_description_in=1,
		@database_name=N'',
		@notification_message=N'State of the AAG Changed. Failover/Failback is currently in progress',
		@event_description_keyword=N'',
		@performance_condition=N'',
		@wmi_namespace=N'',
		@wmi_query=N'',
		@job_id=N'00000000-0000-0000-0000-000000000000'
GO
EXEC msdb.dbo.sp_update_notification @alert_name=N'AAG-StateChange-19406', @operator_name=N'DBAsGroup', @notification_method = 1
GO
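
The job mentioned above does not necessarily need the alert to identify the database. One possible sketch is a job step that checks a fixed list of databases itself and enables TRUSTWORTHY only where the local replica is currently primary. The second database name is a hypothetical placeholder, and the whole block is an assumption about how such a job could look, not a tested solution.

```sql
-- Sketch of a job step: enable TRUSTWORTHY on listed databases that are PRIMARY on this instance.
-- N'SomeOtherDb' is a hypothetical placeholder; replace the list with the real database names.
DECLARE @name sysname, @sql nvarchar(max);

DECLARE db_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT d.name
    FROM sys.databases AS d
    JOIN sys.dm_hadr_database_replica_states AS drs
        ON drs.database_id = d.database_id
       AND drs.is_local = 1
    JOIN sys.dm_hadr_availability_replica_states AS ars
        ON ars.replica_id = drs.replica_id
       AND ars.group_id = drs.group_id
    WHERE d.name IN (N'NorthWind', N'SomeOtherDb')
      AND ars.role_desc = N'PRIMARY'
      AND d.is_trustworthy_on = 0;

OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @name;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- QUOTENAME guards against unusual characters in database names.
    SET @sql = N'ALTER DATABASE ' + QUOTENAME(@name) + N' SET TRUSTWORTHY ON;';
    EXEC sys.sp_executesql @sql;
    FETCH NEXT FROM db_cursor INTO @name;
END
CLOSE db_cursor;
DEALLOCATE db_cursor;
```

Because the step is idempotent, it could be attached to the 19406 alert or run on a schedule, so a single firing per failover would be enough.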


SQL2012SP1, simple recovery, transaction log full, can't add another, won't grow

SUMMARY:

I have many databases with full transaction logs. They are already in SIMPLE recovery model,
and I cannot add a new log or truncate the existing one, because the transaction log is full due to 'CHECKPOINT'


DETAILS:

(Examining just one of the problematic databases)

I have a SQL Server 2012 SP1 database with a full transaction log.

This database has one data file (750MB) and one log file (850MB)

Compatibility level = SQL Server 2012 (110)

Recovery Model is Simple
Under Recovery Options, Page Verify=CHECKSUM and Target Recovery Time=0

I had 5G of free disk (which should have been enough), and I added more, so now I have 25G free.

My log file growth is set to: "By 10 percent, Limited to 2097152 MB "
(I can't change it - if I try I get an error because the transaction log is full)

I can't do a backup - get an error because the transaction log is full due to 'CHECKPOINT'

I can't add another log file - get an error because the transaction log is full due to 'CHECKPOINT'

Verifying that file#2 is my log file

SELECT file_id, name FROM sys.database_files;

I can't do a truncate only shrink successfully:
DBCC SHRINKFILE (2, TRUNCATEONLY);

- get an error because my log is out of space - with a cascade error that the transaction log is full due to 'CHECKPOINT'

DBCC CHECKDB mostly just gives errors because the transaction log is full due to 'CHECKPOINT'

I can't change the recovery model to full - get an error because the transaction log is full due to 'CHECKPOINT'

I can't change the recovery model to bulk-logged - get an error because the transaction log is full due to 'CHECKPOINT'

From sys.databases,

is_read_only=0, state=0, state_desc=ONLINE, is_in_standby=0
is_cleanly_shutdown=0, is_supplemental_logging_enabled=0
is_read_committed_snapshot_on=0
recovery_model=3, recovery_model_desc=SIMPLE
is_fulltext_enabled=1
is_published=0, is_subscribed=0, is_merge_published=0, is_distributed=0, is_sync_with_backup=0
is_broker_enabled=1, log_reuse_wait=1, log_reuse_wait_desc=CHECKPOINT
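
Since many databases show the same symptom, the values above can be pulled for the whole instance in one pass. This is a small sketch using only sys.databases and DBCC SQLPERF; it reports state and makes no changes.

```sql
-- Sketch: every database whose log cannot currently be reused, with the reason.
SELECT name, recovery_model_desc, log_reuse_wait_desc, state_desc
FROM sys.databases
WHERE log_reuse_wait_desc <> N'NOTHING'
ORDER BY name;

-- Log size and percentage used for every database on the instance.
DBCC SQLPERF(LOGSPACE);
```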

I tried doing a manual CHECKPOINT (using the CHECKPOINT cmd in MgmtStudio) and got errors:


(1 row(s) affected)
Msg 5901, Level 16, State 1, Line 3
One or more recovery units belonging to database 'MYDATABASENAME' failed to generate a checkpoint. This is typically caused by lack of system resources such as disk or memory, or in some cases due to database corruption. Examine previous entries in the error log for more detailed information on this failure.
Msg 9002, Level 17, State 1, Line 3
The transaction log for database 'MYDATABASENAME' is full due to 'CHECKPOINT'.

SELECT COUNT(*) FROM fn_dblog(NULL, NULL);

4,521,289

That seems like a lot !?

My ERRORLOGs are full of errors about full transaction logs. I have many databases with this same issue - I've just focused on one, for testing.


DBCC OPENTRAN
gives me no open transactions


However,
select * from sys.dm_tran_database_transactions where database_id = MYDBID

gives me 6 rows, of which 4 have null begin times



I tried using
select request_session_id, * from sys.dm_tran_locks where resource_database_id = MYDBID and resource_subtype ='BULKOP_BACKUP_LOG'

and killing those sessions, but they always come back immediately.

Note: I have seen that SQL2012SP1CU2 has a fix for migrated databases with stuck-open transactions, but this database was created in SQL2012 - it was not upgraded from a prior version.

Thanks for any help or suggestions!

EventID 1069 and 1205 explanation

Hi everyone

I was doing some tests on our SQL cluster, and I've noticed a problem which causes the cluster log to report event IDs 1069 and 1205.

My cluster configuration is as follows:

3 HP DL380G7 (cluster-01, cluster-02, cluster-03)

Windows 2008 R2 x64 on each of these servers

I created a failover cluster of SQL Server

instance 1: IST01, preferred owner node cluster-01. failover on node cluster-02, cluster-03

instance 2: IST02, preferred owner node cluster-02, failover on node cluster-03, cluster-01

instance 3: IST03, preferred owner node cluster-03, failover on node cluster-01, cluster-02

Each server has 72GB memory, and each SQL server instance has set the maximum memory limit to 24GB (so, in the worst case, I can have all three instances on a single node, 24+24+24=72GB)

Every server uses iSCSI lun on our existing SAN.

As I said, I was doing some tests, so I tried to move IST01 from node 1 to node 2 to simulate a failover. Everything was OK.

I tried different combinations (IST03 from node 3 to node 1, IST01 from node 1 to node 2, etc.).

The problem arises when I try to move instance IST03 from node cluster-03 to cluster-02. I get 2 events in the cluster event log (event IDs 1069 and 1205), the instance goes down for a couple of seconds, and then it resumes on node cluster-03.

EventID 1069 reports: "Cluster resource 'SQL Server (IST03)' in clustered service or application 'SQL Server (IST03)' failed."

EventID 1205 reports: "The Cluster service failed to bring clustered service or application 'SQL Server (IST03)' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application."

I tried to search online, but I still haven't found a good explanation of these errors.

My cluster passes the cluster validation process with 100% success, and every update (both Windows and SQL Server) is installed.

Every server has the same software installed (I double-checked the BIOS and driver revisions of every peripheral).

Does anyone have a good explanation of these 2 event IDs, and possibly an idea of where I can start looking to solve this problem?

Thanks for any help

Verify SQL Database Health without direct access to the SQL application

Hi All,

My company provides disaster recovery solutions, and we are trying to come up with an automated process to verify SQL database health during DR tests without direct access to SQL. Our clients do not want to give out their passwords or grant any of our accounts access to their SQL databases. So, without running queries or PowerShell commands against SQL, are there any other ways to determine that the SQL databases have mounted and are healthy? I've been trying to identify event IDs that might suggest the databases are healthy, but haven't found anything really concrete yet. I've also tried parsing the SQL error logs to see if they give a distinct "yes, databases mounted without error" or "no, databases did not mount" response.

Our clients are mostly running SQL 2008 on Windows 2008 R2 operating systems.

Thank you

NLB to achieve HA in SQL Server

Hi 

I have a scenario where I have two workgroup machines (not joined to a domain) with SQL Server installed on them.

Now, to achieve HA for my web application, I want to design a solution so that whenever one machine goes down, the other automatically takes over.

Since the machines aren't part of an AD domain, I can't use the AlwaysOn feature on top of a WSFC.

So I thought NLB might be of a little help, but with NLB I can use the database only for reads and not for read-write. Even if read-write is possible, I need to know how, and how I can ensure that both databases stay in sync during any write operation.

Kindly advise ASAP.

Regards,

Eager 2 Learn

Adding replica to Availability Group

When adding a secondary replica to an availability group and selecting the full data synchronization option, is there any outage associated with tasks to bring the secondary online?