But what about the partitions on the tabular cube? Do we really need to tell and learn the DBA how to handle that on a periodic plan?
The answer is simple: No!
Thanks to the XMLA language, the DMV’s for SSAS instances (both tabular og multidimensionel) and SSIS we can do the partitioning dynamic based on the current data in the datawarehouse.
In below examble I’ve made partitions for every month, but as always you need to take the current architecture, dataflow, data deliveries etc into account when creating your sollution.
Here we go:
We need a table in order to keep track on the partitions and their metadata. Also a table to hold data from existing partitions in the Tabular cube:
-- Schema to hold tables for partitioning CREATE SCHEMA [Partition] GO -- PartitionLog table CREATE TABLE [Partition].[PartitionLog]( [ID] [int] IDENTITY(1,1) NOT NULL, [PartitionName] [varchar](100) NOT NULL, [FromDateKey] [int] NOT NULL, [ToDateKey] [int] NOT NULL, [CreatedDate] [datetime] NULL, [LastProcessDate] [datetime] NULL, [ProcessedCount] [smallint] NULL, CONSTRAINT [PK_PartitionLog] PRIMARY KEY CLUSTERED ( [PartitionName] ASC, [FromDateKey] ASC, [ToDateKey] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO SET ANSI_PADDING OFF GO ALTER TABLE [Partition].[PartitionLog] ADD CONSTRAINT [DF__Partition__Proce__38996AB5] DEFAULT ((0)) FOR [ProcessedCount] GO -- Table to hold list of existing partitions. -- The table is updated with data from the SSIS package CREATE TABLE
In SSIS I’ve created a dataflow from the SSAS Tabular instance with the list of partitions on the table I want to partition to the table ExitingPartitionsList.
We can do this thanks to the DMV’s for SSAS which also works for Tabular instances – see this link for further information.
The SQL-statement to get list of partitions (reference to msdn):
select distinct PARTITION_NAME from $SYSTEM.DISCOVER_STORAGE_TABLE_COLUMN_SEGMENTS where TABLE_ID='Måleraflæsninger_51b67466-b3e0-425a-9111-072e9d814c0b' order by PARTITION_NAME
The tableID is found in the tables properties (right-click and choose ‘Properties’):
Now we need to take a look at the naming convention of the partitions. There is a need for the partition-name to have a suffix that tells the span of the partition. I’ve choosen [Tablename]-[fromdate]-[todate]. And in order to get usable data from the previous dataflow, I’ve made a view as below:
create view [Partition].[ExistingPartitions] as select [PARTITION_NAME] ,substring(partition_name,CHARINDEX('-',Partition_name,1)+1,CHARINDEX('-',Partition_name,CHARINDEX('-',Partition_name,1)+1)-CHARINDEX('-',Partition_name,1)-1) as FromDateKey ,substring(partition_name,CHARINDEX('-',Partition_name,CHARINDEX('-',Partition_name,1)+1)+1,len(partition_name)) as ToDateKey from [Partition].[ExistingPartitionsList]
Giving this output:
My facttable (Måleraflæsninger) has a field that is used to filter the partition. In this case it is ‘datekey’. In order to get all possible partitions that are needed for the project I’ve made this view:
create view [Partition].[PossiblePartitions] as select distinct cast('Måleraflæsninger-' + cast(int_datekey as varchar(8)) + '-' + convert(varchar(8), dateadd(day,-1,dateadd(month, 1, datekey)), 112) as nvarchar(255)) as Name ,int_datekey as FromdateKey ,cast(convert(varchar(8), eomonth(datekey), 112) as int) as ToDateKey from Fact.Måleraflæsninger where day(datekey) = 1
Take notice that I’ve made the dates end at last day of the month. This is going to be used to generate the view for the partitions later on.
Based on these two views we can now generate the list of partitions that needs to be created in the Tabular project:
create view [Partition].[MissingPartitions] as select pp.Name ,pp.FromdateKey ,pp.ToDateKey from Partition.PossiblePartitions pp where pp.Name not in ( select PARTITION_NAME from Partition.ExistingPartitions)
Now I need to generate a set of XMLA’s. One to create a partition and one to process a partition. The easiest way to get these is to script them from the GUI in the SSAS Tabular instance.
1: Right-click the table that I’m working with and selecting ‘Partitions…’
2: Click ‘New partition’
3: Here specify the name – remember the convention – and the SQL statement. Here it is important to remember the filter criteria for the partitions. In this case it is one partition for every month.
4. Finally click the Script buttom and ‘Script Action to New Query Window’
Result – the areas I’ve highlighted are the ones that we need to parameterize in SSIS – more to come on that part in a bit.
The same way I’ve made a XMLA script to process the partition – the highlighted area is, again, to be parameterized later:
Now we need to go to SSIS and make the logic and steps to accuire the dynamic partitioning.
First of all we need to make sure that all partitions that should be in the model also exists, and if they do not exists, we’øll create them.
The SSIS project now needs some variables as listed below:
The scope varies from project to project.
After all the variables has been defined, we need to make the two XMLA-variables as Expressions.
In the Expression builder add below codes to the respective variable:
CreatePartitionXMLA – replace the meta-data with the one that matches your sollution:
"<Alter AllowCreate="true" ObjectExpansion="ExpandFull" xmlns="http://schemas.microsoft.com/analysisservices/2003/engine"> <Object> <DatabaseID>TabularKube</DatabaseID> <CubeID>Model</CubeID> <MeasureGroupID>Måleraflæsninger_51b67466-b3e0-425a-9111-072e9d814c0b</MeasureGroupID> <PartitionID>" + @[User::Partitionname] + "</PartitionID> </Object> <ObjectDefinition> <Partition xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200" xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300" xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300" xmlns:ddl400="http://schemas.microsoft.com/analysisservices/2012/engine/400" xmlns:ddl400_400="http://schemas.microsoft.com/analysisservices/2012/engine/400/400"> <ID>" + @[User::Partitionname] + "</ID> <Name>" + @[User::Partitionname] + "</Name> <Source xsi:type="QueryBinding"> <DataSourceID>f6bb153e-310c-41b4-9a15-9ed45e9dc8b6</DataSourceID> <QueryDefinition>SELECT [vFact].[Måleraflæsninger].* FROM [vFact].[Måleraflæsninger] WHERE [int_datekey] between " + (DT_WSTR, 8) @[User::FromDatekey] + " and " + (DT_WSTR, 8) @[User::ToDatekey] + "</QueryDefinition> </Source> <StorageMode valuens="ddl200_200">InMemory</StorageMode> <ProcessingMode>Regular</ProcessingMode> <ddl300_300:DirectQueryUsage>InMemoryOnly</ddl300_300:DirectQueryUsage> </Partition> </ObjectDefinition> </Alter>"
ProcessPartitionXMLA – replace the meta-data with the one that matches your sollution:
"<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine"> <Type>ProcessFull</Type> <Object> <DatabaseID>TabularKube</DatabaseID> <CubeID>Model</CubeID> <MeasureGroupID>Måleraflæsninger_51b67466-b3e0-425a-9111-072e9d814c0b</MeasureGroupID> <PartitionID>" + @[User::Partitionname] + "</PartitionID> </Object> </Process>"
Now we are all set and just need to build the package.
The focus is now on a container like this:
The steps are to get all the missing partitions from our defined view and loop the result with both Create partition, Process partition and Insert data to PartitionLog.
Define a SQL task with the following statement:
select mp.Name as PartitionName ,mp.FromdateKey ,mp.ToDateKey from Partition.MissingPartitions mp
Define the output to ‘Full resultset’ and map the result to the variable MissingPartitions.
Add a ForeahLoopContainer and define the container to reference the MissingPartitions object as a ADO source and map the data to the variables as below:
Inside this container add two Analysis Services Execetute DDL Tasks and define a Analysis Services Connectione that matches the environment.
The first DDL (Create Partition) is defined like this:
The second DDL (Process Partition) is defined like this:
The last step is to add a record in the PartitionLog table – add a SQL task with this statement:
select mp.Name as PartitionName ,mp.FromdateKey ,mp.ToDateKey from Partition.MissingPartitions mp
And map the parameters like this:
Now, when the package runs, it will get a dataset of missing partitions, loop the dataset and create and process the partitions dynamically. At the end it creates a record in the partitionlog to keep track of this.
The last thing we need to do is to add a container to process the latest partition every time the package executes.
We need to build this:
Again, add a SQL task and define it to get data from the view with current partition we created earlier:
insert into Partition.PartitionLog ( [PartitionName] ,[FromDateKey] ,[ToDateKey] ,[CreatedDate] ,[LastProcessDate] ,[ProcessedCount] ) values (?,?,?,GETDATE(),GETDATE(),1)
Map the ‘Full resulset’ to the variable PartitionName.
The foreach loop container must be defined to use this variable and the data mapped like this:
Add a Anaysis Services Exectute DDL task and define it to use the variable ProcesPartionXMLA. We can reuse the expression as it is defined as expression and uses the same logic in the expression.
Finally add a SQL task with below code:
select PartitionName from Partition.CurrentPartition
Map the parameters like this:
And there it is. All done.
Now every time the package is executed it will see to that missing partitions is created (in this case for every month start) and processed.
And it will make sure that the latest partition is updated with the latest data.
The processing time now takes very short time to do, as the only data that is processed is the latest one. Of course the first time the package is run it will create all the partitions and process them.
The whole code from above can be downloaded here: