Automated approach for creating ServiceNow tables in Azure database
We manage and maintain a data warehouse (DW) on an Azure SQL Managed Instance (we are currently migrating the DW to Azure Synapse Analytics). We pull data from various sources into the Azure DW. One of those sources is ServiceNow: we maintain copies of several ServiceNow tables in the Azure database and run a nightly Azure Data Factory (ADF) pipeline that copies the delta changes from the ServiceNow tables into the corresponding Azure tables.
We create new ServiceNow tables in the Azure database whenever our DW customers need them. ServiceNow offers no option to generate the DDL directly for the table we want, so we have to prepare the CREATE TABLE statement by hand. This becomes tedious when we have to create many tables.
I designed an ADF pipeline that automates the creation of ServiceNow tables in Azure.
The names of the ServiceNow tables that have to be created in Azure are provided as an input parameter to a ForEach loop, which executes the following activities for each table:
1. UpdatePipelineStartRun — I maintain a table with the details of all ServiceNow tables present in the Azure database. This activity updates the column that records the pipeline start time.
2. CopyToParquet — This activity copies the data from the ServiceNow table into a Parquet file in an ADLS storage container.
3. CopyToStagingTable — This activity creates a staging table in the Azure database using the "Auto create table" option in the Sink. The Parquet file created in the previous step is provided as the Source for this activity.
4. MoveParquetFile — This activity moves the Parquet file (created in the CopyToParquet activity) to another ADLS storage container for retention purposes.
5. CreateNewTable — The schema name, staging table name and main table name are provided as inputs to the stored procedure executed in this activity. ServiceNow has a system table called sys_dictionary, which maintains the names, types and lengths of the columns of every ServiceNow table. I used those details to generate the DDL for the user table I want.
CREATE PROCEDURE [dbo].[usp_create_table] @sch NVARCHAR(MAX), @maintab NVARCHAR(MAX), @stgtab NVARCHAR(MAX)
AS
BEGIN
DECLARE @ddl NVARCHAR(MAX);
WITH stag AS
(select tb.name tab_name, col.name col_name, typ.name col_data_type
from sys.tables tb with (nolock)
inner join sys.columns col with (nolock) on tb.object_id=col.object_id
inner join sys.types typ with (nolock) on col.user_type_id=typ.user_type_id
where tb.name=@stgtab),
dict AS
(select name_value, element_value, internal_type_value, max_length_value from sys_dictionary with (nolock)
where name_value=@maintab and sys_name_value<>@maintab),
cols AS
(select dict.name_value tbname, stag.col_name colname,
case
when stag.col_data_type like '%char%' and dict.max_length_value<4000 then concat('[',stag.col_data_type,']','(',dict.max_length_value+(dict.max_length_value*75)/100,')')
when stag.col_data_type like '%char%' and dict.max_length_value>=4000 then concat(stag.col_data_type,' (MAX)')
else stag.col_data_type
end coltype
from stag inner join dict on stag.tab_name=concat('staging_',dict.name_value) and stag.col_name=concat(dict.element_value,'_value'))
select @ddl = concat(STRING_AGG('['+cast(colname as varchar(max))+'] '+coltype+' NULL',','),') ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]')
from cols;
DECLARE @tsql nvarchar(max) = 'create table ['+@sch+'].['+@maintab+'] ('+@ddl+';';
EXEC sp_executesql @tsql;
END
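The column-type mapping buried in the CASE expression above is the heart of the procedure, so here is the same rule as a small Python sketch (the function name and signature are mine, for illustration only): character columns get a 75% buffer over the length declared in sys_dictionary, and anything at or above 4000 characters falls back to (MAX).

```python
def map_col_type(col_data_type: str, max_length: int) -> str:
    """Mirror the stored procedure's type-mapping rule.

    Character columns are widened by 75% of the ServiceNow-declared
    length; lengths of 4000 or more become (MAX); other types pass
    through unchanged. (Hypothetical helper, not part of the pipeline.)
    """
    if "char" in col_data_type:
        if max_length < 4000:
            # Same integer arithmetic as the T-SQL expression:
            # max_length + (max_length * 75) / 100
            padded = max_length + (max_length * 75) // 100
            return f"[{col_data_type}]({padded})"
        return f"{col_data_type} (MAX)"
    return col_data_type

# A 100-character ServiceNow column becomes nvarchar(175) in Azure.
print(map_col_type("nvarchar", 100))   # [nvarchar](175)
print(map_col_type("nvarchar", 4000))  # nvarchar (MAX)
print(map_col_type("int", 4))          # int
```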
6. InsertDataIntoNewTable — This activity inserts the data from the staging table into the main table created in the previous step.
7. LookupNewWatermarkValue — This activity gets the highest watermark value from the newly created table, which will be used for subsequent delta data pulls.
8. UpdateWatermarkValue — This activity updates the watermark column in the table that tracks all ServiceNow tables present in the Azure database.
9. DropStagingTable — This activity drops the staging table created by the CopyToStagingTable activity.
10. UpdatePipelineSuccessRun — This activity updates the column that records the pipeline end time in the table that tracks all ServiceNow tables present in the Azure database.
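To summarize the control flow, the steps above can be sketched as a loop with stub activities. This is only an illustration of the ordering: the real implementation is the ADF ForEach loop itself, and the function names and example table names here are my own assumptions.

```python
from datetime import datetime, timezone

def run_for_table(table, steps, tracking):
    """Sketch of the per-table activity sequence described above.

    `tracking` stands in for the table that records pipeline start/end
    times; `steps` stand in for the ADF activities between them.
    """
    tracking[table] = {"start": datetime.now(timezone.utc)}  # UpdatePipelineStartRun
    for _name, step in steps:
        step(table)  # CopyToParquet ... DropStagingTable, in order
    tracking[table]["end"] = datetime.now(timezone.utc)      # UpdatePipelineSuccessRun

# Stub activities that just record their invocation order.
log = []
steps = [(name, lambda t, n=name: log.append((n, t)))
         for name in ("CopyToParquet", "CopyToStagingTable", "MoveParquetFile",
                      "CreateNewTable", "InsertDataIntoNewTable",
                      "LookupNewWatermarkValue", "UpdateWatermarkValue",
                      "DropStagingTable")]

tracking = {}
for table in ("incident", "change_request"):  # the ForEach input parameter
    run_for_table(table, steps, tracking)
```

Because each iteration is independent, the ForEach loop can run the tables sequentially or in parallel; the per-table tracking rows keep the runs from interfering with each other.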
Disclaimer: The posts here represent my personal views and not those of my employer or any specific vendor. Any technical advice or instructions are based on my own personal knowledge and experience.