Update SCD Type 2 dimension in one single transaction using only T-SQL

Recently I got a request inside my organization to make sure that a dimension would keep track of changes, due to requirements from the business.

This needed to be done in a single transaction in pure T-SQL code.

So – what to do and how to do it. Here’s one way.

The source table, dbo.[Case], looks like this:

[Image: screenshot of the source table]

The request was to keep track of changes in the ManagerId according to CaseId.
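As a stand-in for the source table, here is a minimal sketch of a table that works with the statements in this post. Only the name dbo.[Case] and the columns CaseId and ManagerId come from the merge statements further down; the data types, the primary key and the sample values are assumptions.

```sql
-- Hypothetical stand-in for the source table. Only the table name and the
-- two column names are taken from the merge statements in this post;
-- data types, key and sample values are assumptions for illustration.
CREATE TABLE [dbo].[Case](
	[CaseId] [int] NOT NULL PRIMARY KEY,
	[ManagerId] [int] NULL
);

INSERT INTO [dbo].[Case] (CaseId, ManagerId)
VALUES (2005, 10), (2013, 10), (2015, 12);
```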

I’ve created a SCD2 table like this:

CREATE TABLE [dbo].[CaseProjectManagerHistory](
	[dwid] [bigint] IDENTITY(1,1) NOT NULL,
	[CaseId] [int] NULL,
	[ManagerId] [int] NULL,
	[dwDateFrom] [date] NULL,
	[dwDateTo] [date] NULL,
	[dwIsCurrent] [bit] NULL,
	[dwChangeDate] [date] NULL
)

The fields are as follows:
dwid: Identifier for the table
CaseId: The CaseId for the row
ManagerId: The ManagerId for the row
dwDateFrom: The date from which the row is valid
dwDateTo: The date until which the row is valid
dwIsCurrent: Boolean that tells if the row is the current one or not
dwChangeDate: The date of the change (if the row has changed since the first write)
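
To make the columns concrete, here is a made-up example of how one case could look in the table after its manager has changed once. The values are invented for illustration; only the column names and the 2199-01-01 open end date come from the code in this post.

```sql
-- Invented sample: case 2005 moved from manager 10 to manager 12 on 2016-03-15.
SELECT v.CaseId, v.ManagerId, v.dwDateFrom, v.dwDateTo, v.dwIsCurrent, v.dwChangeDate
FROM (VALUES
	-- closed row: valid until the day before the change
	(2005, 10, CAST('2016-01-01' AS date), CAST('2016-03-14' AS date), CAST(0 AS bit), CAST('2016-03-15' AS date)),
	-- current row: open-ended and flagged as current
	(2005, 12, CAST('2016-03-15' AS date), CAST('2199-01-01' AS date), CAST(1 AS bit), CAST('2016-03-15' AS date))
) AS v (CaseId, ManagerId, dwDateFrom, dwDateTo, dwIsCurrent, dwChangeDate);
```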

If you need to catch up on the history types in a dimension – then take a look at Kennie’s blog post here.

First of all I started out with a merge statement that would insert all the new values not in the table and update the ones that needed update.

Something like this:

merge dbo.CaseProjectManagerHistory as target
	using (
		select
			CaseId
			,ManagerId
			,cast(getdate() as date) as startDate
			,datefromparts(2199,1,1) as endDate
			,1 as [current]
			,cast(getdate() as date) as changeDate
		from dbo.[Case]) as source
	on target.CaseId = source.CaseId
	when not matched by target -- insert new rows
		then
			insert (CaseId, ManagerId, dwDateFrom, dwDateTo, dwIsCurrent, dwChangeDate)
			values (source.CaseId, source.ManagerId, source.startDate, source.endDate, source.[current], source.changeDate)
	when matched -- close the current row when the manager has changed
		and target.dwIsCurrent = 1
		and exists (select source.CaseId, source.ManagerId
					except
					select target.CaseId, target.ManagerId)
		and target.dwChangeDate <= source.changeDate
		and source.changeDate < target.dwDateTo
		then
			update set target.dwIsCurrent = 0, target.dwChangeDate = source.changeDate, target.dwDateTo = dateadd(d,-1,source.startDate); -- a merge statement must end with a semicolon

If you haven’t worked with a merge statement before, you can get the 101 from Books Online (BOL) here.

But this merge statement only inserts new rows and closes existing rows. To fully comply with the SCD 2 rules, every case whose current row was closed also needs a new current row in the table.

This can be done by using the ‘output’ clause of the merge statement and then inserting the output rows into the same table.

It will look like this:

insert into dbo.CaseProjectManagerHistory (CaseId, ManagerId, dwDateFrom, dwDateTo, dwIsCurrent, dwChangeDate)
select CaseId, ManagerId, startDate, endDate, [current], changeDate
from (
	merge dbo.CaseProjectManagerHistory as target
	using (
		select
			CaseId
			,ManagerId
			,cast(getdate() as date) as startDate
			,datefromparts(2199,1,1) as endDate
			,1 as [current]
			,cast(getdate() as date) as changeDate
		from
			dbo.[Case]
		where 1=1
			and CaseId in (2005,2013,2015,2016,2019,2021,2023,2025,2027,2028) -- demo filter on a handful of cases
		) as source
	on target.CaseId = source.CaseId
	when not matched by target -- insert new rows
		then
			insert (CaseId, ManagerId, dwDateFrom, dwDateTo, dwIsCurrent, dwChangeDate)
			values (source.CaseId, source.ManagerId, source.startDate, source.endDate, source.[current], source.changeDate)
	when matched -- update existing rows
		and target.dwIsCurrent = 1
		and exists (select source.CaseId, source.ManagerId -- only rows that differ from what is already in target
					except
					select target.CaseId, target.ManagerId)
		and target.dwChangeDate <= source.changeDate
		and source.changeDate < target.dwDateTo
		then
			update set target.dwIsCurrent = 0, target.dwChangeDate = source.changeDate, target.dwDateTo = dateadd(d,-1,source.startDate)
	-- return every affected row together with the action that was taken
	output $action ActionOut, source.CaseId, source.ManagerId, source.startDate, source.endDate, source.changeDate, source.[current]) as mergeOutput
where mergeOutput.ActionOut = 'UPDATE'; -- only the closed rows need a new current row

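The ‘output’ clause of the merge statement returns the source values for every row the merge touched, together with the action taken. The outer insert keeps only the ‘UPDATE’ rows and writes them to the history table once more, this time as the new current row: the new ManagerId, today as the start date and the open ‘end date’ of 2199-01-01.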
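
A quick way to check the result after a load is to look for rows that break the SCD 2 pattern. The queries below are a sketch that only uses the history table and columns defined earlier in this post; what counts as a problem (more than one current row per case, or a row whose end date lies before its start date) is my assumption about what you would want to verify.

```sql
-- Cases with more than one row flagged as current.
SELECT CaseId, COUNT(*) AS CurrentRows
FROM dbo.CaseProjectManagerHistory
WHERE dwIsCurrent = 1
GROUP BY CaseId
HAVING COUNT(*) > 1;

-- Rows whose validity period ends before it starts, e.g. a row that was
-- inserted by one load and closed by another load on the same day
-- (the update sets dwDateTo to the day before the load date).
SELECT dwid, CaseId, ManagerId, dwDateFrom, dwDateTo
FROM dbo.CaseProjectManagerHistory
WHERE dwDateTo < dwDateFrom;
```

Both queries should come back empty if the table is loaded at most once a day.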

Happy coding!

Note: I gave a short presentation on this at my workplace a few weeks ago, and there Kennie (l, b, t) told me that there is a bug in the merge statement that needs to be taken into account. Read more about that here.
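
If you would rather sidestep the merge statement altogether, the same load can be written as a plain update followed by an insert inside one explicit transaction. This is a sketch of that alternative, not the approach used above: it reuses the table and column names from this post, while the variables, the XACT_ABORT setting and the NOT EXISTS check are my own choices, and it leaves out the two date guards from the merge statement.

```sql
SET XACT_ABORT ON;  -- roll back the whole transaction on a run-time error

DECLARE @today date = CAST(GETDATE() AS date);
DECLARE @openEnd date = DATEFROMPARTS(2199, 1, 1);

BEGIN TRANSACTION;

-- Step 1: close the current row for every case whose manager has changed.
UPDATE h
SET h.dwIsCurrent = 0,
	h.dwChangeDate = @today,
	h.dwDateTo = DATEADD(DAY, -1, @today)
FROM dbo.CaseProjectManagerHistory AS h
INNER JOIN dbo.[Case] AS c
	ON c.CaseId = h.CaseId
WHERE h.dwIsCurrent = 1
	-- NULL-safe comparison, same trick as in the merge statement
	AND EXISTS (SELECT c.ManagerId EXCEPT SELECT h.ManagerId);

-- Step 2: insert a current row for every case that has none, which covers
-- both brand-new cases and the cases closed in step 1.
INSERT INTO dbo.CaseProjectManagerHistory
	(CaseId, ManagerId, dwDateFrom, dwDateTo, dwIsCurrent, dwChangeDate)
SELECT c.CaseId, c.ManagerId, @today, @openEnd, 1, @today
FROM dbo.[Case] AS c
WHERE NOT EXISTS (
	SELECT 1
	FROM dbo.CaseProjectManagerHistory AS h
	WHERE h.CaseId = c.CaseId
		AND h.dwIsCurrent = 1
);

COMMIT TRANSACTION;
```

The trade-off is two statements instead of one, but both still succeed or fail together because they share the transaction.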

Query store – next generation tool for every DBA

Along with the release of SQL Server 2016 CTP 3 comes the preview of a brand new feature for on-premises databases – the Query Store. This feature enables performance monitoring and troubleshooting through a log of executed queries.

This blogpost will cover the aspects of this new feature including:

  • Introduction
  • How to activate it
  • Configuration options
  • What information is found in the Query Store
  • How to use the feature
  • What’s in it for me

Continue reading →

Many-to-many in SSAS Tabular

With the release of SQL Server 2016 CTP 3.0 also comes the ability to test the many-to-many functionality within SSAS Tabular.

This blogpost will cover the aspects of the many-to-many feature from SQL Server 2016 – including:

  • Prerequisites
  • The old way
  • The new way

This post is based on data from the AdventureWorksDW2012 database.

Continue reading →

Behold the new live query stats in SQL Server 2016

With the release of SQL Server 2016 also comes a great new feature to get a live view of the current execution plan for an active query.

This blogpost will cover the aspects of this new feature including:

  • Introduction
  • How to activate
  • How to use and read the output
  • Downsides – if any

Continue reading →

Row level security in SQL Server 2016

With the release of SQL Server 2016 come many great new features. One of these is the implementation of row level security in the database engine.

This blogpost will cover the aspects of this new feature – including:

  • Setup
  • Best practice
  • Performance
  • Possible security leaks

Introduction

The row level security feature was released earlier this year to Azure – following Microsoft’s cloud-first release concept. Continue reading →

The DBAs guide to stretch database

One of the new features in SQL Server 2016 – and there are a lot – is the ability to stretch on-premises databases to an Azure environment.

This blogpost will cover some of the aspects of this – including:

  • Initial setup – how to get started
  • Monitoring state of databases that are in ‘stretch mode’
  • Daily work with stretch databases
  • Backup – what’s good to know

With the release of SQL Server 2016, the new feature called stretch database is also released. Continue reading →

Enlarge AdventureWorksDW2012

Just recently I needed a big data warehouse solution to test some performance optimization using BIML.

I could use the AdventureWorks2012 database, but I needed the clean data warehouse tables in order to keep data maintenance to a minimum when testing the BIML scripts.

I could not find one, and figured it was faster to make my own.

So heavily inspired by this post from Jonathan Kehayias (blog), I’ve made a script that can be used to enlarge the dbo.FactInternetSales table.

The script creates a new table called dbo.FactInternetSalesEnlarged and copies data from dbo.FactInternetSales into it with a randomizer. This explodes the data into a table 100 times bigger – an estimated 6 million rows.

Get the script here:

EnlargeAdventureWorksDW2012

Happy coding 🙂

 

SSIS expressions I often use

Whether you build your SSIS packages by hand or use the BIML framework, you’ve come across expressions and the expression builder.

This is a helper list of the SSIS expressions I use most often and always seem to have forgotten when I need them.

Continue reading →

I just got 15,000 new colleagues

Guess what. I just got 15,000 new colleagues.

Rehfeld Partners is to be acquired by IMS Health. IMS Health is a leading global information and technology services company, with more than 60 years of experience, providing clients in the healthcare industry with comprehensive solutions to measure and improve their performance.

Continue reading →