
The impact of CDC to an async node

As recently as last week, I ran into a baffling situation where an async node in an MsSQL HA solution was out of sync for a few hours. Even though the high watermark for the synchronization between the primary and the async node kept increasing, the latency kept growing significantly. By the way, the HA solution is in AWS and the nodes are in multiple regions.

The impact of having a node out of sync:
- The transaction log on the primary will grow uncontrollably
- CDC (change data capture) will not work

The impact on the transaction log is a well-known fact, but what caught me off guard was the impact on CDC. It only occurred to me later that CDC relies on the log reader to scan the transaction log, and the log reader does not process log records until they are hardened on all the HA nodes in the availability group. This led to more concerns about what needs to be done in a catastrophic situation, i.e. when the primary node in an HA configuration is compromised and the sync secondary takes over the primary role. More details on such a case are available here.
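To make the dependency between CDC and the slowest node concrete, here is a minimal toy model in Python. It is only a sketch of the behaviour described above, not how SQL Server implements it: LSNs are simplified to plain integers, and `Replica`, `cdc_readable_upto` and `cdc_backlog` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    # Hypothetical, simplified view of an availability group node
    name: str
    hardened_lsn: int  # last log record this node has written to disk


def cdc_readable_upto(replicas):
    """Highest LSN the log reader may process: the slowest node sets the limit."""
    return min(r.hardened_lsn for r in replicas)


def cdc_backlog(primary_lsn, replicas):
    """Log records already written on the primary that CDC cannot capture yet."""
    return primary_lsn - cdc_readable_upto(replicas)


replicas = [
    Replica("sync-secondary", hardened_lsn=1000),
    Replica("async-secondary", hardened_lsn=400),  # lagging across regions
]
primary_lsn = 1000

print(cdc_readable_upto(replicas))         # 400
print(cdc_backlog(primary_lsn, replicas))  # 600
```

In this model the sync secondary is fully caught up, yet CDC is still held back at the position of the async node. The same backlog is also the portion of the log that the primary has to keep, which is why the transaction log grows at the same time.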

FYI, the latency between the primary and the async node was largely caused by a combination of mishaps: a known workload kept running, index optimization was in progress, and on top of that there was bandwidth degradation between the two nodes.
Hope this helps someone out in the brave world of HA in MsSQL.
