

RDS for SQL Server - SQL native backups

You soon realize that the AWS platform takes over some of the maintenance work we used to do as DBAs. But you will find there are situations where you require some level of control. In late 2016 AWS provided the functionality for SQL Server to perform native backups of its RDS instances, extending its snapshot functionality.
This post will attempt to extend what Jeff Barr has blogged by providing a few extra screenshots and some additional steps.
Steps to follow
1.) Enable SQL native backups by creating an option group and adding the SQLSERVER_BACKUP_RESTORE option

2.) Associate the option group created in step 1 to the RDS instance

 Once you have created the option group, it is mandatory that you associate the newly created option group with the RDS instance on which you intend to run the SQL native backup. Faili…
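With the option group associated, the backup itself is kicked off through stored procedures that RDS exposes in msdb. A minimal sketch, assuming a hypothetical database `mydb` and an S3 bucket `my-backup-bucket` that the option's IAM role can write to:

```sql
-- Kick off a native backup of a database to S3.
-- 'mydb' and 'my-backup-bucket' are placeholder names.
EXEC msdb.dbo.rds_backup_database
     @source_db_name      = 'mydb',
     @s3_arn_to_backup_to = 'arn:aws:s3:::my-backup-bucket/mydb.bak';

-- Check on the progress of the backup task.
EXEC msdb.dbo.rds_task_status @db_name = 'mydb';
```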
Recent posts

Simple Storage Service-S3

AWS is revolutionizing many aspects of the livelihood of many folks in the tech industry. Industries have aligned their processes around the many AWS services and depend on the availability and consistency of those services. As a data person, I am super delighted and totally on board the AWS journey to a brave new world in the cloud.

For those of you who are getting on board with AWS, Simple Storage Service (S3) will be one place to start the journey. I have outlined a few points on S3 that can help you on your way.

- Read-after-write consistency for PUTs of new objects
- Known as object-based storage
- Default S3 buckets per account = 100

Storage tiers
- S3 — durable, immediately available, frequently accessed
- S3-IA — durable, immediately available, infrequently accessed
- S3 Reduced Redundancy Storage — for data that is easily reproducible, such as thumbnails, watermarks, etc.; 99.99% availability
- Glacier — cheaper than S3 but more costly and time-consuming to retrieve data from. This should be used …

The impact of CDC on an async node

As recently as late last week, I ran into a baffling situation where an async node in a SQL Server HA solution was out of sync for a few hours. Even though the high watermark kept increasing for the synchronization between the primary and the async node, the latency kept increasing significantly. By the way, the HA solution is in AWS and the nodes are in multiple regions.

The impact of having a node not in sync
- The transaction log on the primary will grow uncontrollably
- CDC (Change Data Capture) will not work

The impact to the transaction log is a well-known fact, but what caught me off guard was the impact to CDC. It occurred to me later that CDC consumes the transaction log through the log reader agent, and the log reader does not process the log records until they are hardened at all the HA nodes in the availability group. This led to more concerns about what needs to be done in a catastrophic situation, i.e. when the primary node in an HA configuration is compromised and the sync secondary ta…
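A quick way to see how far behind each replica is falling is to look at the send and redo queues per database replica. A minimal sketch using the availability-group DMVs (illustrative, not the exact diagnostic I ran at the time):

```sql
-- Log send / redo queue sizes (KB) per database replica:
-- a rough gauge of how far behind each node in the AG is.
SELECT  ar.replica_server_name,
        drs.database_id,
        drs.synchronization_state_desc,
        drs.log_send_queue_size,   -- KB not yet sent to this replica
        drs.redo_queue_size        -- KB received but not yet redone
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
      ON ar.replica_id = drs.replica_id;
```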


PageIOLatch

PAGEIOLATCH_*: this wait type occurs while data is being transferred from disk to memory. It is caused by pages being evicted from the buffer cache.
The PAGEIOLATCH wait type started appearing as the longest-standing wait type on one of the servers. It wasn't immediately obvious what caused the wait type to be such a huge problem.

 Figure 1

Figure 1 depicts the insert and update operations and the corresponding wait types during each operation.
The wait type was appearing against the insert query, which was executed 150k times per hour. The insert query was the victim in this case, and the update query that was executed a few seconds later was the cause. The update statement looked pretty solid: it was updating the table based on a single parameter, which was the clustered key of the table. :(
The impact
- Page life expectancy would start dropping when this wait type appeared.
- Log flush rate per sec will be very high, causing logs to be flushed…
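To confirm which waits dominate on a server, the wait-stats DMV is the usual starting point. A minimal sketch (not the exact diagnostic used in the post):

```sql
-- PAGEIOLATCH waits on the instance since the stats were last cleared;
-- these near the top of the list point at buffer-pool / disk pressure.
SELECT TOP (10)
       wait_type,
       waiting_tasks_count,
       wait_time_ms,
       wait_time_ms / NULLIF(waiting_tasks_count, 0) AS avg_wait_ms
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE 'PAGEIOLATCH%'
ORDER BY wait_time_ms DESC;
```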

For XML and LEAD

Today I got an ad hoc request from the marketing team to get some details for a few customers,
i.e. the frequency at which a customer files their GST returns.
Just so that I can relate to things a lot more easily, I have used the AdventureWorks database to explain the query.
Before SQL 2012 this would have been quite a bit of an effort. With the LEAD function being available from SQL 2012, things are a lot easier.
FOR XML was used to loop through the data set and gather the results into a single row, while LEAD was used to get the next/previous record for the difference.

    Select CustomerID, OrderFrequency
    from Sales.Customer SH
    cross apply (
        select STUFF((
            SELECT ', ' + CAST(isnull(
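Since the snippet above is cut off, here is a minimal sketch of how the two techniques combine. The table and column names follow AdventureWorks, but this is an illustration of the pattern, not the post's original query:

```sql
-- LEAD computes the gap (in days) between consecutive orders for a
-- customer; FOR XML PATH('') collapses the gaps into one comma-separated
-- row per customer; STUFF strips the leading separator.
SELECT  c.CustomerID,
        OrderFrequency = STUFF((
            SELECT ', ' + CAST(DATEDIFF(DAY, OrderDate,
                       LEAD(OrderDate) OVER (ORDER BY OrderDate))
                       AS varchar(10))
            FROM Sales.SalesOrderHeader oh
            WHERE oh.CustomerID = c.CustomerID
            FOR XML PATH('')
        ), 1, 2, '')
FROM Sales.Customer c;
```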

When to Help the Cardinality Estimator

So the usual story: the query works well for small data sets but does not work optimally for large ones. What are my options?

- Recompile the view/function/procedure
- Update stats
- Add an OPTIMIZE FOR (@Variable UNKNOWN) hint
- Look for missing indexes
- Rewrite the query
The low-hanging fruit
Generally, the top two items will get you over the hump, but without an OPTIMIZE FOR hint you can't get past the parameter-sniffing problem, and without the index you would keep running into the same problem over and over again.
Long term
Rewrite the query to avoid the index, and if you get really lucky it may be possible to avoid the OPTIMIZE FOR hint as well.
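The hint mentioned above attaches to the statement itself. A minimal sketch with a hypothetical @CustomerID parameter:

```sql
-- Ask the optimizer to plan for an "average" parameter value
-- instead of sniffing the first value it happens to see.
SELECT *
FROM Sales.SalesOrderHeader
WHERE CustomerID = @CustomerID
OPTION (OPTIMIZE FOR (@CustomerID UNKNOWN));
```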
How I went about solving the problem