Skip to main content

Posts

High Watermarks For Incremental Models in dbt

The last few months it’s all been dbt. Dbt is a transform and load tool which is provided by fishtown analytics. For those that have created incremental models in dbt would have found the simplicity and easiness of how it drives the workload. Depending on the target datastore, the incremental model workload implementation changes. But all that said, the question is, should the incremental model use high-watermark as part of the implementation. How incremental models work behind the scenes is the best place to start this investigation. And when it’s not obvious, the next best place is to investigate the log after an test incremental model execution and find the implementation. Following are the internal steps followed for a datastore that does not support the merge statements. This was observed in the dbt log. - As the first step, It will copy all the data to a temp table generated from the incremental execution. - It will then delete all the data from the base table th...
Recent posts

View Partitioning Child table settings - Postgres

I have been an agent of change in my pod and the staff I have been waving is postgres :). We have been working out a new project and the OLTP data store is postgres. Out of the many things I encountered this week two things really topped the list.  Most OLTP workload are random IO. Postgres heep vs cluster key that sorts data  Declarative partitioning is the way forward to keeping data sets manageble and assists with maintenance tasks. The code snippet for bellow provides details of the partition child table properties.    SELECT    nmsp_parent.nspname AS parent_schema,    parent.relname      AS parent,    nmsp_child.nspname  AS child_schema,    child.relname       AS child ,    pg_get_expr(child.relpartbound, child.oid, true) as child_expression,    child.reloptions   FROM pg_inher...

Default privileges Don't work when owner changes

One of the first things you learn in  postgres  is the importance of getting the default privileges configured. Coming from the  SQLserver  background, I found having to assign default privileges a little precarious but once I got over that hump it has become more obvious for its place in PG.  The other day I discovered that default privileges don't get inherited of the new owner when the ownership changes of an object. 1.) login as  postgres   create table tab(id int)  2.) login as d_owner  create table  taba (id int) 3.) Grant permission for  readonly  role access data from the public schema owned by d_owner grant select on all tables in schema public to  readrole ; alter default privileges for user  d_owner   in schema public grant select on tables to feedsapi_readonly_role 4.) Change the owner of  tab  from  postgres  to d_owner     ...

How to List The Capacity attributes for Dynamodb table with powershell

I was muddling around attempting to find  away to display the ReadCapacity and WriteCapacity for the dyanmodb table with powersheel. Wanted to share this as it took me quite bit of trial error before getting this right. $PrimaryRegion = 'ap-southeast-2' $ddbtablelist = Get-DDBTableList -Region $PrimaryRegion foreach ($ddbtablelist_item in $ddbtablelist) { Get-DDBTable -Region $PrimaryRegion -TableName $ddbtablelist_item |Select- Object TableName , ItemCount, @{Name= "ProvisionedThroughput" ;Expression={ $_.ProvisionedThroughput | Select- Object ReadCapacityUnits ,WriteCapacityUnits}}|Select- Object -Property * -ExcludeProperty ProvisionedThroughput -ExpandProperty ProvisionedThroughput}

Delete Manual RDS Snapshots with Powershell

As you may have observed already , when a RDS instance is deleted all the automated snapshots disappear along with it giving us little or no option to recover. However, manual snapshots can be retained indefinitely for a additional cost. You will then realize that AWS has imposed a softlimit of 100 manual snapshots per region and account . This can be extended by speaking to AWS but this will cause the cost to increase over a period of time. There is also the other downside, were  snapshots can't be moved to a S3 bucket. After much discussuion we decided to delete the manual snapshots older than 10 days. Attached is the sample script. $20Days = (Get-Date).Adddays(-20) $ManulSonpshot = Get-RDSDBSnapshot -Region us-east-1 | Sort-Object SnapshotCreateTime | select -First 1000 | Where-Object {$_.SnapshotType -eq "manual" -and $_.SnapshotCreateTime -le $20Days } | Select-Object DBInstanceIdentifier , Engine , SnapshotCreateTime ,dbsnapshotidentifier foreach ($Ma...

How to Backup postgres globals without sysadmin permission in RDS

For those of my Friends who are on postgres RDS and require to backup the globas for a rds instance you would soon find it can't perform a backup with pg_dumpall --globals-only.  The default user that gets created along with the RDS postgres instance does not have adequate permission to get through the backup processes. Bellow is the message that you would receive. pg_dumpall.exe : pg_dumpall: query failed: ERROR:  permission denied for relation pg_authid At line:1 char:1 + & 'C:\setup_msi\postgres\bin\pg_dumpall.exe'  -h XXXXXX.cXXXX... + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~    + CategoryInfo          : NotSpecified: (pg_dumpall: que...ation pg_authid:String) [], RemoteException    + FullyQualifiedErrorId : NativeCommandError pg_dumpall: query was: SELECT oid, rolname, rolsuper, rolinherit, rolcreaterole, rolcreatedb, rolcanlogin,...

How To Install a msi/exe with powershell

Today I had to programitacally install a .msi and .exe at the start up of an EC2 instance. By default all them code that needs to be executed during a start up is added to the user data of the EC2 instance. Bellow was the code used to install the msi and exe start-process c:\setup\<>.msi /gn -wait However executing the exe required some effort as it needed special switchers start-process c:\setup\psqlodbc-setup.exe /install /quiet /norestart  -wait -passthorugh found the extra switcher doing this