Using sensitive parameters (SSIS Series)

Introduction

This time I want to say something about the use of sensitive parameters: passwords, connection strings, anything that is .. uhm .. sensitive.

Problem

SSIS Package Parameters and Environment variables can hold sensitive values, like passwords, connection strings and so on. You do not want these to be seen by people who are not authorized to have access to this information.

Solution

First I want to give a *short* intro on how (not) to save sensitive parameters in SSIS.
There are several options, which are explained well enough on other internet sites:

  • DontSaveSensitive
  • EncryptSensitiveWithUserKey
  • EncryptSensitiveWithPassword
  • EncryptAllWithPassword
  • EncryptAllWithUserKey

My opinion on those options:

  • When you use a *WithUserKey value, you might get into trouble when someone else from the development team wants to work on an SSIS package.
  • When you use a *WithPassword value, you have to type (or paste) the password every time you open the package in Visual Studio. I personally really dislike that.
  • But more importantly, why would you care about saving the sensitive data at all? Thinking about professional SSIS package development, you would not want sensitive data like passwords and connection strings hard coded in the package anyway, when you can simply configure your packages from environment variables as explained in two of my earlier blog posts. So, as you might guess, DontSaveSensitive is my favorite option.

So the rest of this article is based on using the DontSaveSensitive protection level.
You have to set it in the SSIS package, but also in the project properties:
ssis-e11-436 Setting the package property “ProtectionLevel”
ssis-e11-437 Setting the Project Protection Level

I have set up a very simple demo. It consists of the following steps:

Set up a folder and environment in SSISDB if not done already

See the script for this in my previous blog post.
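
If you do not want to look that up, here is a minimal sketch using the SSISDB catalog stored procedures. It assumes the DEV folder and environment used in this demo do not exist yet (the script in my previous post is rerunnable and more complete):

USE [SSISDB]
GO
DECLARE @folder_id BIGINT;
-- Create the folder and an environment with the same name (DEV), as used in this demo.
EXEC [catalog].[create_folder] @folder_name = N'DEV', @folder_id = @folder_id OUTPUT;
EXEC [catalog].[create_environment] @folder_name = N'DEV', @environment_name = N'DEV',
  @environment_description = N'My development environment';
GO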

Create a sensitive environment variable “MyDBConnectionString”.

In the demo I use my existing DEV folder with the DEV environment inside that folder.
In my case I use Windows Integrated Security, so the connection string by itself is not sensitive. I just use this example because a connection string could also contain a username and password.

ssis-e11-458 The environment variable “MyDBConnectionString”. Check the checkbox to make it sensitive
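
If you prefer T-SQL over the SSMS dialog, this is a sketch of the same action (the connection string shown is just an example value, not the one from my environment):

EXEC [SSISDB].[catalog].[create_environment_variable]
  @folder_name = N'DEV',
  @environment_name = N'DEV',
  @variable_name = N'MyDBConnectionString',
  @data_type = N'String',
  @sensitive = 1, -- this is what the checkbox in the screenshot does
  @value = N'Data Source=.;Initial Catalog=tempdb;Provider=SQLNCLI11.1;Integrated Security=SSPI;',
  @description = N'Connection string for my database';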

Create an SSIS package with a sensitive parameter “MyDBConnectionString”

ssis-e11-454
Do I need to say more?

In the SSIS Package create a (non-sensitive) variable “MyDBConnectionString”

Not so exciting either, except that you should set Raise Change Event to True for the variable, so that any properties set by an expression using the variable will certainly be updated when the variable changes.
Before you can do that, you must make sure that the column is visible. The printscreens below explain how.
ssis-e11-439 Clicking on this icon shows the Variable Grid Options
ssis-e11-440 On the Variable Grid Options check “Raise event when variable value changes”
ssis-e11-441 For the variable set “Raise Change Event” to True

ssis-e11-455 Here you can see that the Expression @[User::MyDBConnectionString] is used to set the ConnectionString property of the OLEDB Connection Manager OLE_SRC_MyDatabase.

Copy the parameter value to the variable value using a Script Task

The “problem” is that you cannot use the sensitive parameter in an expression that sets the value of a non-sensitive property, in my case the ConnectionString property of an OLEDB Connection Manager.
This error is shown:
ssis-e11-435

This is why you have to copy the value of the sensitive parameter to a non-sensitive variable first using a Script Task.
I personally think it is a pity that Microsoft did not create a GETSENSITIVEVALUE(« SensitiveParameterOrVariable ») expression, so you could use an Expression Task, but they didn’t, so a script task is the way to go.
Add a Script Task to the control flow of your SSIS Package and adjust public void Main() as below:

ScriptMain.cs
public void Main()
{
  try
  {
    // Copy sensitive value to normal (non-sensitive) variable, so that it can be used in an expression.
    string value = Dts.Variables["$Package::MyDBConnectionString"].GetSensitiveValue().ToString();
    if (!string.IsNullOrEmpty(value))
    {
      // Only copy, if not empty. Else preserve design time value, for instance for debugging.
      Dts.Variables["User::MyDBConnectionString"].Value = value;
    }
    Dts.TaskResult = (int)ScriptResults.Success;
  }
  catch (Exception e)
  {
    // Log the error, mark the task as failed and rethrow, preserving the original stack trace.
    Dts.Log(e.Message, 0, null);
    Dts.TaskResult = (int)ScriptResults.Failure;
    throw;
  }
}

The control flow of the demo package now looks as follows:
ssis-e11-460
The final control flow of the demo package

To demonstrate how it works, the “design time” connection string that is stored in the variable MyDBConnectionString is different from the sensitive environment variable MyDBConnectionString.
I have added an Execute SQL Task SQL_FooBar that uses the OLEDB Connection Manager OLE_SRC_MyDatabase.
The connection string in the variable connects to a database named FF (which is Dutch SMS talk for “Just a moment”/temporary).
The connection string in the sensitive environment variable connects to tempdb.
To prove that the connection string from the sensitive environment variable (which passes it to the package parameter) is used during package execution, I take the FF database offline after the OLEDB Connection Manager has been configured.
ssis-e11-443 Taking database FF offline ..

So if the connection manager were not configured from the sensitive parameter at run time, the Execute SQL Task would certainly fail during execution.

Now it’s time to deploy my project with the test package to my DEV folder in SSISDB. I do not have screenshots of that, because I assume that you will know how to do this. If not, check my previous blog posts.

ssis-e11-456
ssis-e11-450 This time I do the configuration manually: in the project I add a reference to the DEV environment (not in the picture), then I connect the environment variable to the package parameter.
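
The same manual configuration can also be done with T-SQL. A sketch, in which the project and package names are assumptions for this demo:

DECLARE @reference_id BIGINT;
-- Add a reference from the project to the DEV environment.
EXEC [SSISDB].[catalog].[create_environment_reference]
  @folder_name = N'DEV', @project_name = N'SSIS-E11-Sensitive', @environment_name = N'DEV',
  @reference_type = 'R', @reference_id = @reference_id OUTPUT;
-- Map the package parameter to the environment variable with the same name.
-- @value_type = 'R' means "referenced": the parameter gets its value from the environment variable.
EXEC [SSISDB].[catalog].[set_object_parameter_value]
  @object_type = 30, @folder_name = N'DEV', @project_name = N'SSIS-E11-Sensitive',
  @parameter_name = N'MyDBConnectionString', @parameter_value = N'MyDBConnectionString',
  @object_name = N'MyTestPackage.dtsx', @value_type = 'R';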

After it is configured, I execute the package.

ssis-e11-451
ssis-e11-452

Shortly after, it has succeeded: proof for me that it was using the configured connection string with database tempdb, and not the design time value with database FF.
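
By the way, you can also start such an execution from T-SQL instead of the SSMS dialog. Again a sketch, with the project and package names being assumptions for this demo:

-- Look up the environment reference so that the sensitive parameter gets its value at run time.
DECLARE @reference_id BIGINT =
  (SELECT er.reference_id
   FROM [SSISDB].[catalog].[environment_references] er
   JOIN [SSISDB].[catalog].[projects] p ON p.project_id = er.project_id
   WHERE p.name = N'SSIS-E11-Sensitive' AND er.environment_name = N'DEV');
DECLARE @execution_id BIGINT;
EXEC [SSISDB].[catalog].[create_execution]
  @folder_name = N'DEV', @project_name = N'SSIS-E11-Sensitive', @package_name = N'MyTestPackage.dtsx',
  @use32bitruntime = 0, @reference_id = @reference_id, @execution_id = @execution_id OUTPUT;
EXEC [SSISDB].[catalog].[start_execution] @execution_id;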

Considerations for debugging

If you want to debug the SSIS package in Visual Studio you have a problem: due to DontSaveSensitive the package parameter will have no value, and you cannot debug the package.
My workaround for this is to put a debug value in the (non-sensitive) SSIS variable. In the Script Task the variable only gets the value from the parameter when the parameter is not empty, so at debug time the variable value will be preserved.
If your value is too sensitive for this, either use a less sensitive value referring to a test database/server etc., or remove the value from the variable after debugging the package. In the latter case, also delete copies of the package in the bin folder.

Will my sensitive value be safe in an environment variable?

Short answer: yes, I think it is pretty safe.
Why? If a DBA unticks the “Sensitive” checkbox, the value is wiped out and cannot be read.
Of course a DBA or hacker with sysadmin access to the SQL instance and enough knowledge could deploy an SSIS package just to pull out the sensitive value. But then again, there is little a DBA cannot do..
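
Related to that: the public catalog view does not expose sensitive values either, which you can check yourself with a query like this sketch:

SELECT e.name AS environment_name, ev.name AS variable_name, ev.sensitive, ev.value
FROM [SSISDB].[catalog].[environment_variables] ev
JOIN [SSISDB].[catalog].[environments] e ON e.environment_id = ev.environment_id
WHERE ev.name = N'MyDBConnectionString';
-- For the sensitive variable, the [value] column comes back as NULL here.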

Conclusion / Wrap up

In this article you could read some practical insights on using the ProtectionLevel DontSaveSensitive for an SSIS package containing a sensitive parameter that is configured from a sensitive environment variable during execution.
Steps to take are:

  1. Create a folder in SSISDB. In that folder, create an environment. In my example both the folder and environment were called DEV.
  2. Create a sensitive environment variable.
  3. Create an SSIS package with a sensitive parameter and a (non-sensitive) variable with the same name.
  4. In the SSIS package create a Script Task to set the value of the variable to the value of the parameter.
  5. In the SSIS package use an expression to use the variable value where needed, for example in an OLEDB Connection Manager or an FTP Task.

(c) 2016 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.

How to automate your SSIS package deployment and configuration (SSIS Series)

Introduction

Hi there, today I want to share my scripts on package deployment. As manual deployment can be cumbersome and prone to errors, I have automated this boring repetitive task.
This article is based on SQL Server 2016, but I have included scripts to be used for SQL Server 2012 and 2014.

Problem

How to automate SSIS Package deployment.

Solution

  1. Setting up folder and environment
  2. Deploy your SSIS Packages
  3. Setup package configuration (or: connect package- and project parameters to environment variables)
  4. Gluing everything together using PowerShell

Setting up folder and environment

As you could read in one of my previous posts, Using the project deployment model with environments, I usually create a one-to-one relation between folders and environments.
So a folder = an environment, and within the folder several SSIS projects can exist.
What I have seen others do is create a “Generic” environment. That is also a possibility; it is a matter of personal choice. I prefer to give the environment an explicit name identical to the folder name (e.g. DEV), just to decrease the risk that changes are made to the wrong Generic environment variable values.

Our main goal for this step is to have a way to automate setting up a folder and environment, including all (scripted) environment variables. This script should be rerunnable, so that environment variables can be added or updated later when needed.
One thing you should know is that the database SSISDB has a number of stored procedures in the catalog schema that I use to achieve this.
Well, here’s the script. Don’t bother copying and pasting, as you can download all scripts here.

pre_deployment.sql

--\ -------------------------------------------------------------------
---) Script to create folder, environment and settings.
--/ -------------------------------------------------------------------
/*
(c) Copyright 2016 - hansmichiels.com
 
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
 
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
 
You should have received a copy of the GNU General Public License
along with this program. If not, see http://www.gnu.org/licenses/.
*/

USE [SSISDB]
GO

DECLARE @environment_variables_table TABLE(
  [id] int identity(1, 1),
  --) Just an autonumber to be able to go through the table without a cursor.
  [variable_name] nvarchar(128),
  --) Name of the variable
  [data_type] nvarchar(128),
  --) Variable datatype e.g. [ Boolean | Byte | DateTime | Decimal | Double | Int16 | Int32 | Int64 | SByte | Single | String | UInt32 | UInt64 ]
  --) Check all possible values using this query: SELECT DISTINCT [ssis_data_type] FROM [SSISDB].[internal].[data_type_mapping]
  [sensitive] bit,
  --) Indication whether the environment variable is sensitive (e.g. a password).
  [value] nvarchar(4000),
  --) The variable value.
  [description] nvarchar(1024),
  --) Extra description for the variable.
  [create_only] bit
  --) Indication that a variable should only be created when it does not exist yet, but should not be replaced when it already exists.
  --) (so the value in the script does not overwrite the existing value).
  --) In this way you can prevent later configuration changes in an environment being undone by the values in the script.
  )

DECLARE @id int
DECLARE @max_id int
DECLARE @folder_id int
DECLARE @folder_name sysname = N'unknown'
DECLARE @environment_name nvarchar(128) = N'unknown'
DECLARE @environment_description nvarchar(128) = N'unknown'
DECLARE @variable_name nvarchar(128)
DECLARE @data_type nvarchar(128)
DECLARE @sensitive bit
DECLARE @value sql_variant
DECLARE @description nvarchar(1024)
DECLARE @create_only bit --
DECLARE @exists bit
DECLARE @nsql nvarchar(max)
DECLARE @message nvarchar(255)

DECLARE @false BIT = 0, @true BIT = 1

--\
---) Environment settings.
---) This is a generic script for all environments, using the @@SERVERNAME (SQL Server Instance name)
---) to set values right for each environment.
--/

IF @@SERVERNAME LIKE 'VWS63-SQL1%1'
BEGIN
    SELECT
      @folder_name = N'DEV',
      @environment_name = N'DEV',
      @environment_description = N'My development environment'
END
IF @@SERVERNAME = '<SQL INSTANCE NAME>'
BEGIN
    SELECT
      @folder_name = N'TST',
      @environment_name = N'TST',
      @environment_description = N'My test environment'
END
IF @@SERVERNAME = '<SQL INSTANCE NAME>'
BEGIN
    SELECT
      @folder_name = N'PRD',
      @environment_name = N'PRD',
      @environment_description = N'My production environment'
END

IF @environment_name = N'unknown'
BEGIN
    RAISERROR('Unknown environment', 16, 1)
    -- Make sure that the script does not continue when the environment settings could not be set due to an unknown @@SERVERNAME.
    GOTO Finally
END

----\
-----) Create folder, when needed.
----/
IF NOT EXISTS (
    SELECT 1
    FROM [SSISDB].[internal].[folders] f
    WHERE f.name = @folder_name
    )
BEGIN
    SET @message = 'Folder “' + @folder_name + '” is being created.'
    RAISERROR(@message , 0, 1) WITH NOWAIT;
    EXEC [SSISDB].[catalog].[create_folder] @folder_name, @folder_id OUTPUT;
END

--\
---) Create environment, when needed.
--/
IF NOT EXISTS (
    SELECT 1
    FROM [SSISDB].[internal].[environments] e
    JOIN [SSISDB].[internal].[folders] f
        ON e.folder_id = f.folder_id
    WHERE e.[environment_name] = @environment_name
        AND f.name = @folder_name
    )
BEGIN
    SET @message = 'Environment “' + @environment_name + '” is being created.'
    RAISERROR(@message , 0, 1) WITH NOWAIT;
    EXEC [SSISDB].[catalog].[create_environment] @folder_name, @environment_name, @environment_description;
END

IF @environment_name = N'DEV'
BEGIN
  INSERT INTO @environment_variables_table
      ( [create_only], [variable_name], [data_type], [sensitive], [value], [description] )
  VALUES
  -- Just a few examples of environment variables, add your own here or generate this part from existing environments or deployed packages, see scripts below.
      ( @true, N'Parameter1', N'Int32', @false, N'1', N'Example parameter 1 (DEV).' )
    , ( @true, N'Parameter2', N'String', @false, N'Hello world from script (DEV).', N'Example parameter 2 (DEV).' )
    , ( @true, N'Parameter3', N'Decimal', @false, N'9.95', N'Example parameter 3 (DEV).' )
    , ( @true, N'ProjectParameter1', N'String', @false, N'Hello world from script (DEV).', N'Example projectparameter 1 (DEV).' )
    , ( @true, N'ProjectParameter2', N'Int32', @false, N'4', N'Example projectparameter 2 (DEV).' )
    
END

IF @environment_name = N'TST'
BEGIN
  INSERT INTO @environment_variables_table
      ( [create_only], [variable_name], [data_type], [sensitive], [value], [description] )
  VALUES
  -- Just a few examples of environment variables, add your own here or generate this part from existing environments or deployed packages, see scripts below.
      ( @true, N'Parameter1', N'Int32', @false, N'1', N'Example parameter 1 (TST).' )
    , ( @true, N'Parameter2', N'String', @false, N'Hello world from script (TST).', N'Example parameter 2 (TST).' )
    , ( @true, N'Parameter3', N'Decimal', @false, N'9.95', N'Example parameter 3 (TST).' )
    , ( @true, N'ProjectParameter1', N'String', @false, N'Hello world from script (TST).', N'Example projectparameter 1 (TST).' )
    , ( @true, N'ProjectParameter2', N'Int32', @false, N'4', N'Example projectparameter 2 (TST).' )
    
END

IF @environment_name = N'PRD'
BEGIN
  INSERT INTO @environment_variables_table
      ( [create_only], [variable_name], [data_type], [sensitive], [value], [description] )
  VALUES
  -- Just a few examples of environment variables, add your own here or generate this part from existing environments or deployed packages, see scripts below.
      ( @true, N'Parameter1', N'Int32', @false, N'1', N'Example parameter 1 (PRD).' )
    , ( @true, N'Parameter2', N'String', @false, N'Hello world from script (PRD).', N'Example parameter 2 (PRD).' )
    , ( @true, N'Parameter3', N'Decimal', @false, N'19.95', N'Example parameter 3 (PRD).' )
    , ( @true, N'ProjectParameter1', N'String', @false, N'Hello world from script (PRD).', N'Example projectparameter 1 (PRD).' )
    , ( @true, N'ProjectParameter2', N'Int32', @false, N'4', N'Example projectparameter 2 (PRD).' )
    
END

SELECT @id = 1, @max_id = MAX([id]) FROM @environment_variables_table
WHILE @id <= @max_id
BEGIN
    SELECT
      @variable_name = v.variable_name,
      @data_type = v.data_type,
      @sensitive = v.sensitive,
      @value = v.value,
      @description = v.[description],
      @create_only = v.[create_only],
      @exists = 0
    FROM @environment_variables_table v
    WHERE [id] = @id;

    IF EXISTS (
        SELECT 1
        FROM
            [SSISDB].[internal].[environment_variables] v
        JOIN
            [SSISDB].[internal].[environments] e ON e.environment_id = v.environment_id
        JOIN
            [SSISDB].[internal].[folders] f
            ON e.folder_id = f.folder_id
        WHERE
            v.[name] = @variable_name
            AND e.environment_name = @environment_name
            AND f.name = @folder_name
        )
    BEGIN
        IF @create_only = 1
        BEGIN
            SET @message = @variable_name + ' already exists and is not replaced.'
            RAISERROR(@message , 0, 1) WITH NOWAIT;
            SET @exists = 1;
        END ELSE BEGIN
            SET @nsql = N'EXECUTE [catalog].[delete_environment_variable] '
              + N'@folder_name = N'''+ @folder_name + ''', @environment_name = N''' + @environment_name + ''', '
              + N'@variable_name = N''' + @variable_name + ''''
            PRINT @nsql;
            EXEC sp_executesql @nsql;
        END
    END

    IF @exists = 0
    BEGIN
        SET @message = 'Creating variable “' + @variable_name + '”.';
        RAISERROR(@message , 0, 1) WITH NOWAIT;

        SET @nsql = N'EXECUTE [catalog].[create_environment_variable] '
          + N'@folder_name = N'''+ @folder_name + ''', @environment_name = N''' + @environment_name + ''', '
          + N'@variable_name = N'''+ @variable_name + ''', @data_type = N''' + @data_type + ''', '
          + N'@sensitive = ' + CONVERT(NVARCHAR, @sensitive) + ', @description = N''' + @description + ''', '
          + CHAR(13) + CHAR(10) + N'@value = ' +
          CASE UPPER(@data_type)
          WHEN 'STRING' THEN 'N''' + CONVERT(NVARCHAR(1000), @value) + ''' '
          ELSE CONVERT(NVARCHAR(1000), @value)
          END + '; '
        -- PRINT @nsql;
        EXEC sp_executesql @nsql;
    END
    SET @id = @id + 1
END

RAISERROR('Creating folder, environment and settings has completed ..', 0, 1) WITH NOWAIT;

Finally:

GO
--\
---) ON a server where the VARIABLES already exist, you can use this query to generate the 'VALUES' part of the INSERT command used above.
--/
/*
DECLARE @environment_name nvarchar(128) = N'DEV'

SELECT
    m.*,
    v.name,
    N', ( N''' + v.name + ''', N''' + m.[ssis_data_type] + ''', '
      + CONVERT(NCHAR(1), v.sensitive) + ', N'''
      + CASE v.sensitive
        WHEN 0 THEN CONVERT(NVARCHAR(4000), v.value)
        ELSE '< type sensitive value here TODO >' END
      + ''', N''' + v.description + ''' )' AS [sql_values_part]
FROM
    [SSISDB].[internal].[environment_variables] v
JOIN
    [SSISDB].[internal].[environments] e
   ON e.environment_id = v.environment_id
JOIN
    [SSISDB].[internal].[folders] f
    ON e.folder_id = f.folder_id
JOIN
    [SSISDB].[internal].[data_type_mapping] m
    ON m.sql_data_type = v.base_data_type
    AND ( m.sql_data_type != 'tinyint' OR m.ssis_data_type = 'Byte')
WHERE
    e.environment_name = @environment_name
*/

--\
---) ON a server where the PACKAGES are ALREADY DEPLOYED, you can use this query to generate the 'VALUES' part of the INSERT command used above.
---) Manually check the output before running the sql script!
--/
/*
SELECT
  N', ( N''' + q.[variable_name] + ''', N''' + q.[data_type] + ''', '
    + CONVERT(NCHAR(1), q.[sensitive]) + ', N'''
    + CASE q.sensitive
        WHEN 0 THEN CONVERT(NVARCHAR(4000), q.value)
        ELSE '< type sensitive value here TODO >' END
    + ''', N''' + q.[description] + ''' )' AS [sql_values_part]
FROM
  (
  SELECT TOP 2147483647
      prm.[parameter_name] AS [variable_name],
      prm.[parameter_data_type] AS [data_type],
      prm.[sensitive],
      COALESCE(prm.[default_value], prm.[design_default_value]) AS [value],
      prm.[description],
      ROW_NUMBER() OVER (PARTITION BY prm.[parameter_name] ORDER BY prm.[parameter_name]) AS [Occurrence]
  FROM
      [SSISDB].[internal].[object_parameters] prm
  -- [variable_name], [data_type], [sensitive], [value], [description]
  WHERE
      prm.[value_type] = 'V'
      AND prm.[parameter_name] NOT LIKE 'CM.%'
      AND LEFT(prm.[parameter_name], 1) != '_'
      AND NOT ( prm.[object_type] = 30 AND LEFT(prm.[object_name], 1) = '_') -- Naming convention for internally used SSIS Packages: start with _
  ORDER BY
      prm.[parameter_name]
  ) q
WHERE
  q.Occurrence = 1
*/

GO

ssis-e10-428
Example output of this script, when executed using Management Studio.

ssis-e10-429
And the second time, when a parameter is added and “create_only” is true for the existing parameters.

Deploy your SSIS Packages

This is about how to deploy an .ispac file in an automated way.
An .ispac file is a zip file that contains all packages in an SSIS project and can be deployed all at once. You create it when you choose Build or Rebuild in Visual Studio while you have the SSIS project open.
By the way, if you rename it and give it a .zip extension, you can simply look at what is inside.

Amongst the files you can download here there are demo SSIS projects that I used for this on SQL Server 2012, 2014 and 2016.

ssis-e10-423
The demo SSIS project ..
ssis-e10-424
Menu-option to rebuild the SSIS solution ..
ssis-e10-425
ssis-e10-426
After rebuilding the solution from the menu you will find the ispac files in the bin\<configuration name> folder of each project. In the screenprints above you can see how the bin subfolder is related to the chosen configuration in Visual Studio.

To install an ispac file, I use the command line options of the Integration Services Deployment Wizard.

For a default installation you can find the Integration Services deployment wizard (ISDeploymentWizard.exe) here:

  • SQL Server 2012: C:\Program Files\Microsoft SQL Server\110\DTS\Binn
  • SQL Server 2014: C:\Program Files\Microsoft SQL Server\120\DTS\Binn
  • SQL Server 2016: C:\Program Files\Microsoft SQL Server\130\DTS\Binn

You will find the exact command line in one of the following paragraphs, where all steps are glued together.
On 64-bit machines there is also a 32-bit version of this tool, but to be honest I do not see a reason why you would want to use the 32-bit version instead of the 64-bit, therefore I did not list the paths here.

ssis-e10-427
After installation you can find the folder, project and environment here ..

Setup package configuration

After the packages/SSIS projects are deployed, the next thing to do is adding an environment reference to the project(s) and connecting environment variables to package parameters and project parameters, based on the name.
I use this script for that:

post_deployment.sql

--\ -------------------------------------------------------------------
---) Script to add environment reference to project(s) and to connect
---) environment variables to package parameters, based on the name
--/ -------------------------------------------------------------------
/*
(c) Copyright 2016 - hansmichiels.com
 
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
 
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
 
You should have received a copy of the GNU General Public License
along with this program. If not, see http://www.gnu.org/licenses/.
*/

USE [SSISDB]
GO

DECLARE @object_parameter_value_table TABLE(
  [id] int identity(1, 1),
  [object_type] smallint,
  [object_name] nvarchar(260),
  [parameter_name] nvarchar(128),
  [project_name] NVARCHAR(128)
  )

DECLARE @project_names_table TABLE(
  [id] int identity(1, 1),
  [project_name] NVARCHAR(128)
  )

DECLARE @message nvarchar(255)

DECLARE @id int
DECLARE @max_id int
DECLARE @folder_name sysname = N'unknown'
DECLARE @environment_name nvarchar(128) = N'unknown'
DECLARE @environment_description nvarchar(128) = N'unknown'
DECLARE @project_name NVARCHAR(128)

DECLARE @object_type SMALLINT
DECLARE @object_name NVARCHAR(260)
DECLARE @value_type CHAR(1)
DECLARE @parameter_name NVARCHAR(128)
DECLARE @parameter_value SQL_VARIANT

DECLARE @reference_id BIGINT

--\
---) Environment settings
--/

IF @@SERVERNAME LIKE 'VWS63-SQL1%1'
BEGIN
    SELECT
      @folder_name = N'DEV',
      @environment_name = N'DEV',
      @environment_description = N'My development environment'
END
IF @@SERVERNAME = '<SQL INSTANCE NAME>'
BEGIN
    SELECT
      @folder_name = N'TST',
      @environment_name = N'TST',
      @environment_description = N'My test environment'
END
IF @@SERVERNAME = '<SQL INSTANCE NAME>'
BEGIN
    SELECT
      @folder_name = N'PRD',
      @environment_name = N'PRD',
      @environment_description = N'My production environment'
END

IF @folder_name = N'unknown'
BEGIN
    RAISERROR('Unknown environment', 16, 1)
    -- Make sure that the script does not continue when the environment settings could not be set due to an unknown @@SERVERNAME.
    GOTO Finally
END

INSERT INTO @project_names_table([project_name])
  SELECT
      p.name
  FROM
      [SSISDB].[internal].[folders] f
  JOIN
      [SSISDB].[internal].[environments] e
      ON e.folder_id = f.folder_id
  JOIN
      [SSISDB].[internal].[projects] p
      ON p.folder_id = f.folder_id
  WHERE
      f.name = @folder_name
      AND e.environment_name = @environment_name
      -- Optional project_name filter here:
      -- AND p.name IN ('Project_1', 'Project_2')

--\
---) Add environment reference to project(s).
--/
SELECT @id = 1, @max_id = MAX([id]) FROM @project_names_table
WHILE @id <= @max_id
BEGIN
    SELECT
      @project_name = v.[project_name]
    FROM @project_names_table v
    WHERE [id] = @id;

    IF NOT EXISTS( SELECT 1
        FROM
            [SSISDB].[internal].[folders] f
        JOIN
            [SSISDB].[internal].[environments] e
            ON e.folder_id = f.folder_id
        JOIN
            [SSISDB].[internal].[projects] p
            ON p.folder_id = f.folder_id
        JOIN
            [SSISDB].[internal].[environment_references] r
            ON r.environment_name = e.environment_name
            AND p.project_id = r.project_id
        WHERE
            f.name = @folder_name
            AND e.environment_name = @environment_name
            AND p.name = @project_name
        )
    BEGIN
        SET @message = 'An environment reference for project “' + @project_name + '” is being created.'
        RAISERROR(@message , 0, 1) WITH NOWAIT;

        EXEC [SSISDB].[catalog].[create_environment_reference]
            @environment_name=@environment_name, @reference_id=@reference_id OUTPUT,
            @project_name=@project_name, @folder_name=@folder_name, @reference_type='R';
        --Select @reference_id
    END

    SET @id = @id + 1
END

--\
---) Connect environment variables to package parameters, based on the name
--/
INSERT INTO @object_parameter_value_table (
    [object_type],
    [object_name],
    [parameter_name],
    [project_name]
    )
SELECT
    prm.[object_type],
    prm.[object_name],
    prm.[parameter_name],
    prj.name
FROM
    [SSISDB].[internal].[folders] f
JOIN
    [SSISDB].[internal].[environments] e
    ON e.folder_id = f.folder_id
JOIN
    [SSISDB].[internal].[environment_variables] ev
    ON e.environment_id = ev.environment_id
JOIN
    [SSISDB].[internal].[projects] prj
    ON prj.folder_id = f.folder_id
JOIN
    @project_names_table prjsel
    ON prjsel.project_name = prj.name
JOIN
    [SSISDB].[internal].[object_parameters] prm
    ON prj.project_id = prm.project_id
    AND prm.parameter_name = ev.name
WHERE
    prm.[value_type] != 'R'
    AND prm.value_set = 0
    AND prm.[parameter_name] NOT LIKE 'CM.%'
    AND LEFT(prm.[parameter_name], 1) != '_' -- Naming convention for internally used parameters: start with _
    AND NOT ( prm.[object_type] = 30 AND LEFT(prm.[object_name], 1) = '_') -- Naming convention for internally used SSIS Packages: start with _
    AND f.name = @folder_name
    AND e.environment_name = @environment_name
ORDER BY
    prm.object_name, prm.parameter_name

SELECT @id = 1, @max_id = MAX([id]) FROM @object_parameter_value_table
WHILE @id <= @max_id
BEGIN
    SELECT
        @object_type = v.[object_type],
        @object_name = v.[object_name],
        @parameter_name = v.[parameter_name],
        @project_name = v.[project_name]
    FROM
        @object_parameter_value_table v
    WHERE
        [id] = @id;

    SELECT @value_type = 'R', @parameter_value = @parameter_name;
    
    SET @message = 'Parameter “' + @parameter_name + '” (of object “' + @object_name + '”) is mapped to environment variable.'
    RAISERROR(@message , 0, 1) WITH NOWAIT;

    EXEC [SSISDB].[catalog].[set_object_parameter_value]
      @object_type, @folder_name, @project_name, @parameter_name,
      @parameter_value, @object_name, @value_type

    SET @id = @id + 1
END

RAISERROR('Adding environment reference to project(s) and mapping environment variables to package parameters has completed ..', 0, 1) WITH NOWAIT;

Finally:

GO

Also important is that I use a naming convention to decide whether or not to create a link between a package parameter and an environment variable.
For this I borrowed a naming convention from C# coding, where a variable starting with an underscore is a private variable within a class.
So what I have done is the following:

  • To exclude one package parameter from being mapped to an environment variable, start its name with _ (underscore). I know you could just use parameter names that do not exist as environment variables, but this naming convention makes your design clearer. It also makes it possible to generate the script to create environment variables after you have initially deployed your packages; you can then just skip the package parameter names that start with _.
  • To exclude all package parameters of an SSIS package, start the package name with _ (e.g. when the package is always executed from other packages and does not need environment mappings).

For the demo I use a solution with two projects, one of them with three packages, and the second one with just one package. This is done to demonstrate how connecting package- and project parameters to environment variables (step 3) works.

Project SSIS-E10-P1

  • ExampleMaster.dtsx: contains Parameter1 and Parameter2. Both parameters should be mapped to environment variables with the same name.
  • Export_Persons.dtsx: contains Parameter1, Parameter2 and _Parameter3. Only the parameters not starting with the underscore should be mapped to the environment variables with the same name.
  • _LoadDimPerson.dtsx: contains Parameter1, Parameter2 and Parameter4. Not a single parameter should be mapped to an environment variable, because the package name starts with _
  • The project contains ProjectParameter1 which should be mapped to an environment variable with the same name.

Project SSIS-E10-P2

  • AnyPackage.dtsx: contains Parameter1 and Parameter5. Only Parameter1 will be mapped, because no environment variable with name Parameter5 exists.
  • The project contains ProjectParameter1 and ProjectParameter2. Both parameters should be mapped to environment variables with the same name.
    ssis-e10-431
    Example output windows when this script is run after having deployed the ispac files.

    ssis-e10-430
    And the result: the environment is “selectable” and the parameters show that they are mapped to an environment variable by having an underlined environment variable name in the “Value” column.

    Gluing everything together using PowerShell

    Alright, what do we have so far?

    1. A sql script for setting up the folder and environment with variables.
    2. One or more .ispac files that can silently be deployed using ISDeploymentWizard.exe and its command line options.
    3. A sql script for connecting package- and project parameters to environment variables.

    So our PowerShell script simply needs to execute a sql script, then deploy the packages, and then execute another sql script. That shouldn’t be too difficult.
    This is the script that does exactly those steps. If I have some time left I want to split the configuration and the actual script over different files/scripts.

    deployment_x64.ps1
    ######################################################################
    # Script:
    # ssis_deployment_script.ps1
    # Author:
    # Hans Michiels
    # Description:
    # This scripts consists of three parts:
    # 1. Executes pre_deployment.sql, which sets up an environment
    # with its variables.
    # 2. Installs one or more SSIS Deployment packages (ispac files)
    # that are in the same folder as this script.
    # 3. Executes post_deployment.sql, which configures the installed
    # packages to use the environment variables.
    #
    # (c) Copyright 2016 - hansmichiels.com
    #
    # This program is free software: you can redistribute it and/or modify
    # it under the terms of the GNU General Public License as published by
    # the Free Software Foundation, either version 3 of the License, or
    # (at your option) any later version.
    #
    # This program is distributed in the hope that it will be useful,
    # but WITHOUT ANY WARRANTY; without even the implied warranty of
    # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    # GNU General Public License for more details.
    #
    # You should have received a copy of the GNU General Public License
    # along with this program. If not, see http://www.gnu.org/licenses/.
    #
    ######################################################################

    cls
    # Get the folder where this script is located
    $ScriptPath = split-path -parent $MyInvocation.MyCommand.Definition
    # Write-Output $ScriptPath;

    ######################################################################
    # CONFIGURATION
    ######################################################################
    # Configure SQL instance name and (SSIS) foldername here.
    $SQLInstance = "VWS63-SQL161"
    $FolderName = "DEV"
    # SQL Server 2016 default location:
    $ISDeploymentWizard = "C:\Program Files\Microsoft SQL Server\130\DTS\Binn\ISDeploymentWizard.exe"
    # My location:
    # $ISDeploymentWizard = "C:\MSSQL\Shared features\130\DTS\Binn\ISDeploymentWizard.exe"

    ######################################################################
    # MAIN SCRIPT
    ######################################################################
    # \
    # 1. Calls pre_deployment sql script
    # /
    $File = $ScriptPath + "\pre_deployment.sql"
    sqlcmd -S $SQLInstance -i $File

    # \
    # 2. Installs each ispac file found in the folder where the powershell script is located.
    # /
    Get-ChildItem $ScriptPath -Filter *.ispac |
    Foreach-Object {
        $ProjectName = $_.BaseName
        Write-Host "Deploying project $ProjectName ..";
        & $ISDeploymentWizard /Silent /ModelType:Project /SourcePath:"$ScriptPath\$ProjectName.ispac" /DestinationServer:"$SQLInstance" /DestinationPath:"/SSISDB/$FolderName/$ProjectName" | Out-Null
        Write-Host "Deployment of project $ProjectName completed ..";
    }
    # \
    # 3. Calls post_deployment sql script
    # /
    $File = $ScriptPath + "\post_deployment.sql"
    sqlcmd -S $SQLInstance -i $File

    And this is the result of the execution of the PowerShell script. From now on, no more manual deployments!
    ssis-e10-435 PowerShell output window ..

    Download all scripts and SSIS Solutions for SQL Server 2012, 2014 and 2016 here.

    Conclusion / Wrap up

    The goal of this blog post was to explain to you how to automate the deployment of SSIS packages including the configuration and environment setup.
    I have explained how:

    • to create a folder, an environment and environment variables in SSISDB using a sql script;
    • to deploy your SSIS packages / ispac files using PowerShell and the command line options of the Integration Services Deployment Wizard;
    • to setup your package configuration after deployment (or: connect package- and project parameters to environment variables);
    • to execute all of the above using PowerShell.

    You could also read that I use a naming convention to be able to exclude individual package parameters or all parameters of a package from being connected to environment variables.

    (c) 2016 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.

    How to upgrade your SSIS Packages from SQL Server 2012 to 2016 (SSIS Series)

    Errata (27 November 2016)

    Important, please read!
    Hi, I noticed something weird when upgrading my packages as described below, and thought it would be fair to share this with you.
    After upgrading, my builds started to take a lot of time (up to 4 minutes; I suspect an on-the-fly upgrade was being done). While investigating this, I found that in the project file all packages are also listed, each with its own PackageFormatVersion, which was still 6!
    I am not sure if this is a Visual Studio bug, but the PackageFormatVersion of the dtsx file said 8, while the dtproj file said 6 for the same package.
    Anyway, after I modified the procedure slightly, my builds were fast again (20 seconds instead of 4 minutes).
    To achieve this, I opened the SSIS 2012 project using Visual Studio 2015, then chose Upgrade All Packages from the context menu.
    ssis-e09-422
    After that I changed the TargetServerVersion as described below.
    Sorry for any inconvenience caused by my initial article, and I hope this update will help.

    Introduction

    If you are intending to upgrade your SQL Server installations to 2016, you’ll probably want to do the same with your SSIS packages.
    And I can tell you, coming from SQL Server 2012 or 2014 this is fairly easy to do.
    But with my experience I want to help you avoid doing it the wrong way.

    Problem

    I made the mistake of using the SSIS Package Upgrade Wizard while the project version was still SQL Server 2012.
    The funny thing is that the packages seemed to upgrade; however, I got all kinds of errors and they just did not work.

    Solution

    And the solution turned out to be so easy.
    For this article I prepared an SSIS project made with SQL Server 2012.
    In the picture below you can see how it looks when opened with SQL Server Data Tools 2015 on a server with SQL Server 2016 installed.

    ssis-e09-413 After the project name, between brackets, you can read that this is an SSIS 2012 project.

    ssis-e09-414 Also, if you open a package in Notepad (be careful not to change it) you can see that the PackageFormatVersion is 6. This means SQL Server 2012.

    The simplest way to upgrade all packages in the project is to change the version of the SSIS project! Please be sure to back up your entire project and all SSIS packages first.
    To upgrade all packages, right click on the project node (in the picture on SSIS-E09-Packages (SQL Server 2012)) and select Properties.
    Then select Configuration Properties and change the TargetServerVersion to SQL Server 2016.

    ssis-e09-415 Changing the TargetServerVersion ..

    Then read the message, follow the instructions and click Yes.

    ssis-e09-416 Upgrade instructions and warning

    The good news is, that’s basically it! When you open a package with Notepad again, you can see that it is upgraded to PackageFormatVersion 8. This means SQL Server 2016.

    ssis-e09-418 Package which is upgraded to PackageFormatVersion 8 (SQL Server 2016).

    In my experience with more complex packages than used for this demo, they all upgraded with no problems. I used all the standard SSIS components. Script Tasks and Script Components did not have to be recompiled.

    Still a warning about the Upgrade All Packages menu item, available from the SSIS Packages node. You don’t need it if you simply change the TargetServerVersion of the project as just described. If you use Upgrade All Packages with the wrong TargetServerVersion it seems to work, but the output packages do not work.

    ssis-e09-419 The Upgrade All Packages menu-item

    Conclusion / Wrap up

    When you do it the right way, upgrading SSIS packages from 2012 to 2016 is easy to do. Although I did not test this with 2014, I am quite sure it will work in the exact same way.

    As you might have noticed I have changed the frequency of my blog posts, due to several circumstances. I still try to do 2 or 3 a month.

    Last updated: 27 November, 2016.

    (c) 2016 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.

    It’s patch & update time! (Updates on previous posts)

    Introduction

    As you might have noticed, I have had a lack of time over the past weeks, which is why my posts are a bit irregular. Sorry for that.
    Today a listing of some updates that I did lately.

    The updates

    How to copy a database table structure (T-SQL Scripting Series)

    This script was last updated on 14 July, 2016. If you use it, make sure it is the most recent version.

    Using alternate (non-time) triggers to start a SQL Agent job (SQL Server Series)

    This article and open source code on Github was updated a few days ago. Please check it if you are interested.

    How to protect your stored procedures against multiple concurrent executions (SQL Server Series)

    This article has just been updated. Thanks to the very useful comments some of you gave me on the initial article (thank you for that), I have added examples of using sp_getapplock and sp_releaseapplock.

    Conclusion / Wrap up

    Nobody is perfect, and that is why updates exist.

    (c) 2016 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.

    How to protect your stored procedures against multiple concurrent executions (SQL Server Series)

    Last update: 5 October, 2016: added examples of using sp_getapplock and sp_releaseapplock.

    Introduction

    I ran into this problem lately. I had a stored procedure that did some global stuff for a database; it would do the exact same thing every time you executed it. It had no parameters.
    In the best case it would be useless and use unnecessary server resources when executed multiple times concurrently, and in the worst case it would cause deadlocks in that situation.
    And this is exactly what this post is about: a stored procedure that is not supposed to be executed multiple times concurrently.
    Examples are stored procedures that are purging a database log, updating derived tables or columns, etc.

    Problem

    You can start a stored procedure and, before it is finished, start it again from a different connection.

    Solutions

    There are (at least) two possible solutions for this problem.

    1. Use sp_getapplock and sp_releaseapplock (the preferred solution). My credits go to Andy Novick who has published about this before on MSSQLTips. I have added examples of using ‘Session’ as lock owner, and therefore I hope that my article will still have added value.
    2. My initial solution using dynamic management views is still in this article for reference.

    Using sp_getapplock and sp_releaseapplock

    How it works

    For your convenience the links to MSDN for help on those two system stored procedures:
    MSDN page about sp_getapplock
    MSDN page about sp_releaseapplock

    Important to understand is that as @LockOwner you can either use a transaction or a session.
    I have included an example for each of the two types of lock owner.

    Using ‘Transaction’ as lock owner

    In this stored procedure example (of which you can download the script) sp_getapplock is used with @LockOwner = ‘Transaction’.
    When you use this example, it is really important to use the TRY .. CATCH code so that the lock is always released before the stored procedure execution ends.

    sp_StoredProcUsingAppLock.sql

    -------------- [sp].[StoredProcUsingAppLock] --------------
    IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sp].[StoredProcUsingAppLock]') AND type in (N'P', N'PC'))
      DROP PROCEDURE [sp].[StoredProcUsingAppLock]
    GO
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    /*
    ==========================================================================================
    Author: Hans Michiels
    Create date: 5-oct-2016
    Description: Example of stored procedure that is using sp_getapplock by using a transaction.
    ==========================================================================================
    */

    CREATE PROCEDURE [sp].[StoredProcUsingAppLock]
    (
      @RaiseError BIT = 0
    )
    AS
    BEGIN
      SET NOCOUNT ON;
      DECLARE @RC INT
      DECLARE @message VARCHAR(500)

      BEGIN TRY

        --\
        ---) Protection Against Concurrent Executions
        --/
        BEGIN TRAN

        SELECT @message = CONVERT(VARCHAR(30), GETDATE(), 121) + ': Try to obtain a lock ..'
        RAISERROR(@message, 0, 1) WITH NOWAIT;

        EXEC @RC = sp_getapplock
            @Resource = 'StoredProcUsingAppLock',
            @LockMode = 'Exclusive',
            @LockOwner = 'Transaction',
            @LockTimeout = 60000 -- 1 minute

        IF @RC < 0
        BEGIN
            IF @@TRANCOUNT > 0 ROLLBACK TRAN;
            SELECT @message = CONVERT(VARCHAR(30), GETDATE(), 121) + ': Sorry, could not obtain a lock within the timeout period, return code was ' + CONVERT(VARCHAR(30), @RC) + '.'
            RAISERROR(@message, 0, 1) WITH NOWAIT;
            RETURN @RC
        END ELSE BEGIN
            SELECT @message = CONVERT(VARCHAR(30), GETDATE(), 121) + ': AppLock obtained ..'
            RAISERROR(@message, 0, 1) WITH NOWAIT;
        END
      
        --\
        ---) Stored procedure body
        --/
        -- Wait so that stored procedure has some considerable execution time.
        SELECT @message = CONVERT(VARCHAR(30), GETDATE(), 121) + ': Hello World!'
        RAISERROR(@message, 0, 1) WITH NOWAIT;

        DECLARE @i INT = 0

        WHILE @i < 6
        BEGIN
            WAITFOR DELAY '00:00:01';
            EXEC [sp].[SubStoredProc];
            SET @i = @i + 1
        END

        IF @RaiseError = 1
        BEGIN
            RAISERROR('An error on demand', 16, 1);
        END

        SELECT @message = CONVERT(VARCHAR(30), GETDATE(), 121) + ': Goodbye ..'
        RAISERROR(@message, 0, 1) WITH NOWAIT;

        COMMIT TRAN
        SELECT @message = CONVERT(VARCHAR(30), GETDATE(), 121) + ': Transaction committed, appLock released ..'
        RAISERROR(@message, 0, 1) WITH NOWAIT;

      END TRY
      BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK TRAN;
        SELECT @message = CONVERT(VARCHAR(30), GETDATE(), 121) + ': Transaction rolled back, appLock released ..'
        RAISERROR(@message, 0, 1) WITH NOWAIT;
        THROW;
      END CATCH
    END
    GO

    When executed ..

    I would not be me if I did not include a test and a proof that it works.
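
    If you want to reproduce the test yourself, the sketch below is all it takes: run it in two (or more) query windows at roughly the same time and compare the printed timestamps.

    EXEC [sp].[StoredProcUsingAppLock] @RaiseError = 0;
    -- The second session waits on sp_getapplock until the first one commits and releases the lock.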

    As you can see in the printscreens below, all executions ran nicely after each other ..
    mssql-e05-354
    mssql-e05-356
    mssql-e05-355
    mssql-e05-357

    As you can see in the printscreens below, this even works well when an error occurs, thanks to the TRY .. CATCH block ..
    mssql-e05-361
    mssql-e05-358
    mssql-e05-360
    mssql-e05-359

    Using ‘Session’ as lock owner

    In this stored procedure example (of which you can download the script) sp_getapplock is used with @LockOwner = ‘Session’.
    An “advantage” of using the session might be that you can also use it when you do not want to use a transaction within your stored procedure.
    As with the previous example, it is really important to use the TRY .. CATCH code so that the lock is always released before the stored procedure execution ends.

    demo_setup_sp_StoredProcUsingAppLockNoTran.sql
    -------------- [sp].[StoredProcUsingAppLockNoTran] --------------
    IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sp].[StoredProcUsingAppLockNoTran]') AND type in (N'P', N'PC'))
      DROP PROCEDURE [sp].[StoredProcUsingAppLockNoTran]
    GO
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    /*
    ==========================================================================================
    Author: Hans Michiels
    Create date: 5-oct-2016
    Description: Example of stored procedure that is using sp_getapplock without using a transaction.
    ==========================================================================================
    */

    CREATE PROCEDURE [sp].[StoredProcUsingAppLockNoTran]
    (
      @RaiseError BIT = 0
    )
    AS
    BEGIN
      SET NOCOUNT ON;
      DECLARE @RC INT
      DECLARE @message VARCHAR(500)

      BEGIN TRY

        --\
        ---) Protection Against Concurrent Executions
        --/
        SELECT @message = CONVERT(VARCHAR(30), GETDATE(), 121) + ': Try to obtain a lock ..'
        RAISERROR(@message, 0, 1) WITH NOWAIT;

        EXEC @RC = sp_getapplock
            @Resource = 'StoredProcUsingAppLockNoTran',
            @LockMode = 'Exclusive',
            @LockOwner = 'Session',
            @LockTimeout = 60000 -- 1 minute

        IF @RC < 0
        BEGIN
            SELECT @message = CONVERT(VARCHAR(30), GETDATE(), 121) + ': Sorry, could not obtain a lock within the timeout period, return code was ' + CONVERT(VARCHAR(30), @RC) + '.'
            RAISERROR(@message, 0, 1) WITH NOWAIT;
            RETURN @RC
        END ELSE BEGIN
            SELECT @message = CONVERT(VARCHAR(30), GETDATE(), 121) + ': AppLock obtained ..'
            RAISERROR(@message, 0, 1) WITH NOWAIT;
        END
      
        --\
        ---) Stored procedure body
        --/
        -- Wait so that stored procedure has some considerable execution time.
        SELECT @message = CONVERT(VARCHAR(30), GETDATE(), 121) + ': Hello World!'
        RAISERROR(@message, 0, 1) WITH NOWAIT;

        DECLARE @i INT = 0

        WHILE @i < 6
        BEGIN
            WAITFOR DELAY '00:00:01';
            EXEC [sp].[SubStoredProc];
            SET @i = @i + 1
        END

        IF @RaiseError = 1
        BEGIN
            RAISERROR('An error on demand', 16, 1);
        END

        SELECT @message = CONVERT(VARCHAR(30), GETDATE(), 121) + ': Goodbye ..'
        RAISERROR(@message, 0, 1) WITH NOWAIT;

        EXEC @RC = sp_releaseapplock @Resource='StoredProcUsingAppLockNoTran', @LockOwner='Session';
        SELECT @message = CONVERT(VARCHAR(30), GETDATE(), 121) + ': AppLock released ..'
        RAISERROR(@message, 0, 1) WITH NOWAIT;

      END TRY
      BEGIN CATCH
        EXEC @RC = sp_releaseapplock @Resource='StoredProcUsingAppLockNoTran', @LockOwner='Session';
        SELECT @message = CONVERT(VARCHAR(30), GETDATE(), 121) + ': AppLock released after error ..'
        RAISERROR(@message, 0, 1) WITH NOWAIT;
        THROW;
      END CATCH
    END
    GO

    When executed ..

    As you can see in the printscreens below, all executions ran nicely after each other ..
    mssql-e05-363
    mssql-e05-362
    mssql-e05-365
    mssql-e05-364

    As you can see in the printscreens below, this even works well when an error occurs, thanks to the TRY .. CATCH block ..
    mssql-e05-367
    mssql-e05-366
    mssql-e05-368
    mssql-e05-369

    Other considerations

    As you might have noticed, this only worked because the timeout was 60 seconds, and all the executions could be completed within those 60 seconds. If this were not the case, the stored procedure execution that was waiting to obtain the applock would time out and bail out due to a RETURN statement.
    So you might play a bit with the timeout value to influence this behavior.
    If you do not care that the non-first executions bail out, you could set the timeout to a smaller value, e.g. 5000 (milliseconds).
    If you absolutely want all executions to continue after waiting, you could set the timeout to a higher value, above the maximum execution time of the stored procedure.
    In this way you can choose to let non-first executions wait or not, as could be achieved by using the @WaitIfExecuted parameter in my initial solution.
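
    While experimenting with the timeout it can be handy to see which session is currently holding the applock. A sketch (run it from a separate query window):

    -- List application locks currently held or requested on the instance.
    SELECT resource_type, resource_description, request_mode, request_status, request_session_id
    FROM sys.dm_tran_locks
    WHERE resource_type = 'APPLICATION';
    -- The session holding the applock shows request_status = 'GRANT'; waiting sessions show 'WAIT'.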

    My initial solution using dynamic management views

    So if you want to create a stored procedure that is Protected Against Concurrent Executions, you have to do it yourself.
    Making a PACE stored procedure is not very difficult. There is however one big pitfall. The examples I found on the internet all have a nasty bug in them, at least when your stored procedure executes other stored procedures: when this happens, the “main” stored procedure goes off the radar when the dynamic management objects sys.dm_exec_requests and sys.dm_exec_sql_text() are used. That means you cannot detect that it is running as long as the “sub” stored procedure is being executed.
    To make the challenge even bigger, the object_id of the “main” stored procedure is nowhere to be found, so with the available dynamic management objects there is no way to solve this.
    Therefore an extra “user” dynamic management table is needed; I called it [udm_storedproc_executions]. It is automatically cleaned up in the stored procedure [sp].[GetPace].
    This stored procedure is used to implement Protection Against Concurrent Executions.
    You execute it at the beginning of stored procedures that you want to protect against concurrent executions.

    How it works

    • You create the table [dbo].[udm_storedproc_executions] (or name it differently) in your database.
    • You create the stored procedure [sp].[GetPace] (or name it differently) in your database.
    • You add the code snippet below to stored procedures that are not supposed to run multiple times concurrently.

    Create the table

    udm_storedproc_executions.sql
    /*
    (c) Copyright 2016 - hansmichiels.com
     
    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.
     
    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    GNU General Public License for more details.
     
    You should have received a copy of the GNU General Public License
    along with this program. If not, see http://www.gnu.org/licenses/.
    */

    IF OBJECT_ID('[dbo].[udm_storedproc_executions]', 'U') IS NOT NULL
       DROP TABLE [dbo].[udm_storedproc_executions];
    GO
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    SET ANSI_PADDING OFF
    GO

    CREATE TABLE [dbo].[udm_storedproc_executions](
     [start_time] [datetime] NOT NULL,
     [session_id] [smallint] NOT NULL,
     [connection_id] [uniqueidentifier] NOT NULL,
     [task_address] [varbinary](8) NOT NULL,
     [object_id] [int] NOT NULL,
      CONSTRAINT [PK_dbo_udm_storedproc_executions] PRIMARY KEY CLUSTERED
    ( [start_time] ASC,
      [session_id] ASC,
      [connection_id] ASC,
      [task_address] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO

    SET ANSI_PADDING OFF
    GO

    Create the stored procedure

    sp_GetPace.sql
    /*
    (c) Copyright 2016 - hansmichiels.com
     
    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.
     
    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    GNU General Public License for more details.
     
    You should have received a copy of the GNU General Public License
    along with this program. If not, see http://www.gnu.org/licenses/.
    */

    -------------- [sp].[GetPace] --------------
    IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sp].[GetPace]') AND type in (N'P', N'PC'))
      DROP PROCEDURE [sp].[GetPace]
    GO
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    /*
    ==========================================================================================
    Author: Hans Michiels
    Create date: 17-sep-2016
    Description: Generic stored procedure that can give a different stored procedure
                 PACE (Protection Against Concurrent Execution).
    ==========================================================================================
    */

    CREATE PROCEDURE [sp].[GetPace]
    (
      @StoredProcedureName NVARCHAR(256), -- Quoted name including schema, e.g. [dbo].[SomeStoredProcedure]
      @WaitIfExecuted BIT = 0, -- Indication whether to wait when the stored procedure is already executing (and execute it afterwards).
      @PaceAction VARCHAR(8) OUTPUT, -- Domain values [ CONTINUE | WAIT | RETURN ]
      @InsertDmRow BIT = 0, -- Indication if a row must be inserted to table [dbo].[udm_storedproc_executions]
      @EnableDebugMessages BIT = 0 -- Speaks for itself
    )
    AS
    BEGIN
      SET NOCOUNT ON;

      --\
      ---) Declarations.
      --/
      DECLARE @SpidThatWins INT
      DECLARE @WaitTime CHAR(8) = '00:00:03'
      DECLARE @message NVARCHAR(500)

      --\
      ---) Insert new row into table [dbo].[udm_storedproc_executions].
      --/
      IF @InsertDmRow = 1
      BEGIN
          INSERT INTO [dbo].[udm_storedproc_executions]
            SELECT
                der.[start_time],
                der.[session_id],
                der.[connection_id],
                der.[task_address],
                OBJECT_ID(@StoredProcedureName, 'P') AS [object_id]
            FROM
                sys.dm_exec_requests der
            CROSS APPLY
                sys.dm_exec_sql_text(der.sql_handle) xst
            WHERE
                der.session_id = @@SPID
                AND der.[sql_handle] is not null
                AND xst.objectid = OBJECT_ID('[sp].[GetPace]', 'P');
      END

      -- We need to prevent that this stored procedure would be executed multiple times
      -- simultaneously.
      SELECT @SpidThatWins = -- The one that first started. If started at exact the same time, the lowest SPID wins.
        (
        SELECT TOP 1 der.[session_id]
        FROM
            sys.dm_exec_requests der
        JOIN
            dbo.udm_storedproc_executions spx
            ON spx.[session_id] = der.[session_id]
            AND spx.[start_time] = der.[start_time]
            AND spx.[connection_id] = der.[connection_id]
            AND spx.[task_address] = der.[task_address]
        WHERE
            der.[sql_handle] is not null
            AND spx.[object_id] = OBJECT_ID(@StoredProcedureName, 'P')
        ORDER BY
            der.[start_time] ASC,
            der.[session_id] ASC
        )

      SELECT @PaceAction =
      CASE
        WHEN @@SPID = @SpidThatWins THEN 'CONTINUE'
        WHEN @WaitIfExecuted = 0 THEN 'RETURN'
        WHEN @WaitIfExecuted = 1 THEN 'WAIT'
      END

      IF @PaceAction = 'WAIT'
      BEGIN
          -- Another execution is still running, so wait before checking again.
          IF @EnableDebugMessages = 1
          BEGIN
              SET @message = @StoredProcedureName + N' is waiting for other execution to finish .. ';
              RAISERROR(@message, 0, 1) WITH NOWAIT;
          END
          WAITFOR DELAY @WaitTime
      END
      IF @PaceAction = 'RETURN' AND @EnableDebugMessages = 1
      BEGIN
          -- Another execution is still running and @WaitIfExecuted = 0, so this execution is cancelled.
          SET @message = N'Cancelling execution of stored proc ' + @StoredProcedureName;
          RAISERROR(@message, 0, 1) WITH NOWAIT;
      END

      --\
      ---) Cleanup old rows from [udm_storedproc_executions].
      --/
      DELETE FROM spx
        FROM
            [dbo].[udm_storedproc_executions] spx
        LEFT JOIN
            sys.dm_exec_requests der
            ON spx.[session_id] = der.[session_id]
            AND spx.[start_time] = der.[start_time]
            AND spx.[connection_id] = der.[connection_id]
            AND spx.[task_address] = der.[task_address]
        WHERE
            der.session_id IS NULL;
    END
    GO

    Add the code snippet

    code_snippet_sp_start.sql
    CREATE PROCEDURE [sp].[StoredProcGettingPace]
    AS
    BEGIN
      SET NOCOUNT ON;
      --\
      ---) PACE (Protection Against Concurrent Executions) code
      --/
      DECLARE @PaceAction VARCHAR(8), @InsertDmRow BIT = 1
      WHILE 1 = 1
      BEGIN
          EXECUTE [sp].[GetPace]
            @StoredProcedureName = '[sp].[StoredProcGettingPace]',
            @WaitIfExecuted = 1,
            @PaceAction = @PaceAction OUTPUT,
            @InsertDmRow = @InsertDmRow,
            @EnableDebugMessages = 1;

          IF @PaceAction = 'RETURN' RETURN 0;
          IF @PaceAction = 'CONTINUE' BREAK;
          SET @InsertDmRow = 0;
      END

      --\
      ---) Stored procedure body ..
      --/
    END
    GO

    Proof that it works

    For the demo I use the following stored procedures:

    • [sp].[SubStoredProc]
    • [sp].[StoredProcGettingPace]
    • [sp].[StoredProcGettingPaceNoWait]

    By the way, all the scripts can be downloaded here.

    [sp].[SubStoredProc] is being executed by both other stored procedures.

    [sp].[StoredProcGettingPace]

    First I will demonstrate the test results with [sp].[StoredProcGettingPace].
    It is executed from 4 query windows at the same time. Debug messages show us what happened.

    mssql-e05-341The query that predicts the order in which the session_ids will be processed: the order is 60, 59, 57, 52.

    mssql-e05-342As expected session 60 is processed first.

    mssql-e05-343As expected session 59 is processed after 60. The “Hello world” time (the beginning of the actual stored procedure body) is after the “Goodbye” time of session 60.

    mssql-e05-344Then session 57. Again the “Hello world” time is after the “Goodbye” time of session 59.

    mssql-e05-345And finally session 52. It goes without saying that the “Hello world” time is after the “Goodbye” time of the previous session.

    [sp].[StoredProcGettingPaceNoWait]

    It is important to understand parameter @WaitIfExecuted of stored procedure [sp].[GetPace].
    In the previous example we used the value 1 (true), which means that every execution of a stored procedure protected with [sp].[GetPace] is eventually executed, but later.
    But if we use the value 0 (false) for @WaitIfExecuted, we actually cancel the execution of a stored procedure if it is already running.
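
    For reference, the PACE block at the top of [sp].[StoredProcGettingPaceNoWait] could then look like the sketch below. It only differs from the earlier snippet in the value of @WaitIfExecuted; because 'WAIT' can never be returned now, the WHILE loop is not needed anymore. The actual demo script in the download may differ slightly.

      DECLARE @PaceAction VARCHAR(8)
      EXECUTE [sp].[GetPace]
        @StoredProcedureName = '[sp].[StoredProcGettingPaceNoWait]',
        @WaitIfExecuted = 0,
        @PaceAction = @PaceAction OUTPUT,
        @InsertDmRow = 1,
        @EnableDebugMessages = 1;

      -- Another execution is already running: cancel this one.
      IF @PaceAction = 'RETURN' RETURN 0;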

    Here are the test results of executing [sp].[StoredProcGettingPaceNoWait].
    Again it is executed from 4 query windows at the same time and debug messages show us what happened.

    mssql-e05-346As expected the stored procedure is executed only once. The other executions are cancelled (see below).

    mssql-e05-347

    mssql-e05-348

    mssql-e05-349

    Download all the scripts here.

    Conclusion / Wrap up

    In this blog post you could read about two different solutions that can prevent a stored procedure from being executed multiple times simultaneously.
    This can be convenient in certain situations, when the stored procedure does some “global” work in your database, and it would be unnecessary and potentially “dangerous” when it is executing multiple times concurrently.

    (c) 2016 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.

    Using alternate (non-time) triggers to start a SQL Agent job (SQL Server Series)

    Last update: 29 September, 2016: major update: (1) extended to use an unlimited number of jobtrigger conditions; (2) added SQLAgentJobResult trigger conditions; (3) configuration via config file no longer supported.

    Introduction

    Today some SQL Agent stuff. I thought the SQL Agent worked just great, until I got the question: can you start this job when a file is dropped in folder X, but only if another process has already done something in the database? Huh?

    Problem

    With the SQL Agent I can only schedule a job using time-based triggers, while I need different kinds of triggers, like the
    existence of one or more files, a certain database state and/or other SQL Agent Job(s) having succeeded, failed or just completed.

    Solution

    So I needed something that checked if a file exists in a folder, and if something was done in the database, and if both are true, a SQL Agent job must be started.
    I decided to create a C# Console Application named SQLAgentJobStarter that could do this job for me. But I wanted it to be a bit more flexible, using one or more trigger conditions of different types, like a file trigger or a sql trigger, or both (as in the requirement above).
    Later on I extended this with jobtrigger conditions, and I changed the definitions a bit. A trigger can now have one or more conditions of different types:

    • file jobtrigger condition
    • sql jobtrigger condition
    • SQLAgentJobResult jobtrigger condition

    The different condition types can be combined in one jobtrigger, e.g. SQLAgentJob “Job_A” must have succeeded and file test.txt must be in folder C:\temp.

    In this article I want to concentrate on instructing you how to use this tool, instead of going into depth from a coding point of view.
    I have made this tool an open source project on GitHub, so you still can look in the code (SQLAgentJobStarter_VisualStudio2015_SourceCode.zip) if you want, but that’s up to you.

    DISCLAIMER: please use this tool at your own risk. It is in a beta state; you should not use it in production environments until you have convinced yourself that it does what it should do.

    Installation and demonstration of the tool

    One important comment first. This tool is not a Windows Service. It is “just” a Console Application that needs to be scheduled using a SQL Agent job or Windows Task Scheduler.
    Having said that, what are we going to do?

    • We are going to create two tables in the msdb database: [dbo].[sysjobtriggers] and [dbo].[sysjobtrigger_conditions]. All other SQL Agent job data is also in this database, so this seems a fair place. Still, I want to emphasize the importance of backing up this database on a regular basis, preferably daily. But if you have a DBA and a maintenance plan set up, this might be needless to say.
    • We are going to create a few stored procedures in the msdb database to maintain those tables.
    • We are going to install the executable HansMichiels.com.Tools.SQLAgentJobStarter.exe and corresponding config file in the C:\Program Files\SQLAgentJobStarter\MSSQL13.MSSQLSERVER folder. MSSQL13.MSSQLSERVER represents the SQL Version number and Instance name, in the example a default instance of SQL Server 2016. Change it to the name of your SQL Server version and instance. In this way you can install the tool side by side for different SQL Instances on the same machine.
    • Before you start using the tool, check the SQLConnectionString in the HansMichiels.com.Tools.SQLAgentJobStarter.exe.config file. By default, this is configured as follows: key="SQLConnectionString" value="Data Source=(local);Initial Catalog=msdb;Integrated Security=SSPI;". This is okay for a local default SQL Server instance, but not usable when you are using named instances; in that case change the Data Source to ServerName\InstanceName.
    • We are going to schedule the tool using a SQL Agent job.
    • Now you are ready to rock! All that needs to be done is adding jobtriggers and jobtrigger_conditions to the msdb database. But to help you with this, I will show you a demo now.

    Installation – technical steps

    • Download all the needed files here.
    • Unzip the files in SQLAgentJobStarter_ProgramFiles.zip, and copy them to a folder on your server, I used C:\Program Files\SQLAgentJobStarter\MSSQL13.MSSQLSERVER.
    • Review the script msdb_install.sql and if needed adjust the foldername here: @command=N'"C:\Program Files\SQLAgentJobStarter\MSSQL13.MSSQLSERVER\HansMichiels.com.Tools.SQLAgentJobStarter.exe"',
    • After reviewing, execute the script msdb_install.sql
    • Review the SQL Agent job ##SQLAgentJobStarter##, especially the schedule. If you agree with a polling interval of 3 minutes, enable the job, otherwise change the interval first.
      mssql-s01e04-330Enabling the job ..
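
    If you prefer T-SQL over the GUI, enabling the job can also be done like this (assuming you kept the job name from the install script):

    USE [msdb]
    GO
    EXEC dbo.sp_update_job @job_name = N'##SQLAgentJobStarter##', @enabled = 1;
    GO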

    So far, so good. The tool is installed.

    Installation – preparation for the demo.

    If you want to join the demo, also do the following:

    • execute script demo_setup_sqlagentjobs.sql: this script will set up dummy SQL Agent jobs that are used for the demo.
    • execute script demo_setup_storedprocedures.sql: this script will create a database [MSSQL_S01E04] with some stored procedures that are used for sql triggers.
    • create a folder C:\Temp, if it does not exist yet, and create a file with a .txt extension in it.
    • execute script demo_setup_sysjobtriggers.sql: this script will add demo rows to the tables [dbo].[sysjobtriggers] and [dbo].[sysjobtrigger_conditions]

    mssql-s01e04-341The jobs that are created so far

    mssql-s01e04-342
    The tables in msdb that hold the trigger information ..

    mssql-s01e04-355The stored procedures in msdb used to maintain the jobtriggers ..

    mssql-s01e04-343These tables have 7 jobtriggers and 9 jobtrigger conditions used for the demo ..

    mssql-s01e04-334The [MSSQL_S01E04] database with stored procedures, used for the demo.

    mssql-s01e04-335The C:\Temp folder, that should contain one or more .txt files ..

    Viewing the demo results.

    Now just wait for about 10 minutes.
    In the meantime I can tell you some expected test results:

    • MSSQL_S01E04_Job0 will never be executed because the jobtrigger uses a sql trigger using stored procedure [MSSQL_S01E04].[sp].[SqlTriggerFalse], that always returns 0 (false).
    • MSSQL_S01E04_Job3 will only be executed every 6 minutes, because this job runs for a little more than 3 minutes, and if the job already runs it will not be started again by the tool.
    • MSSQL_S01E04_Job1 and MSSQL_S01E04_Job2 will be executed every 3 minutes, because their triggers (C:\Temp\*.txt and [MSSQL_S01E04].[sp].[SqlTriggerExample]) both return True.
    • MSSQL_S01E04_Job4 will only be executed after both triggering jobs have succeeded.
    • MSSQL_S01E04_Job5 will only be executed after the triggering job has failed.
    • MSSQL_S01E04_Job6 will only be executed after the triggering job has either succeeded or failed.

    Let’s see if it worked out!

    mssql-s01e04-346Viewing the job history of job 0 to 3..
    In the job history I can see that:

    • MSSQL_S01E04_Job0 was never executed: it is checked on the left, but does not occur on the right.
    • MSSQL_S01E04_Job3 was only executed every 6 minutes, see the red arrows on the right side.
    • MSSQL_S01E04_Job1 and MSSQL_S01E04_Job2 were executed every 3 minutes.
    • ##SQLAgentJobStarter##, the job that caused all this, was also executed every 3 minutes, as scheduled.

    mssql-s01e04-351Viewing the job history of job 4..

    • MSSQL_S01E04_Job4 was only executed after both triggering jobs had succeeded.

    mssql-s01e04-352Viewing the job history of job 5..

    • MSSQL_S01E04_Job5 was only being executed after the triggering job had failed.

    mssql-s01e04-353Viewing the job history of job 6..

    • MSSQL_S01E04_Job6 was being executed after the triggering job had either succeeded or failed.

    But there is more. The tool will write to the Windows Eventlog when an error occurs.
    mssql-s01e04-337To view the windows Eventlog, press <Windows Key>+R, then type eventvwr

    mssql-s01e04-338Example of errors that are logged.

    Make sure you clean up your server! Use the script demo_cleanup.sql for that. Otherwise the tool will keep on executing the demo jobs.

    Understanding the stored procedures of the solution

    Stored procedure [dbo].[usp_upsert_jobtrigger]

    With this stored procedure you can insert or update a jobtrigger.

    You can set the following parameters:

    • @trigger_id: If provided as input, it will be used for checking if an update or insert needs to be done. If left null or smaller than or equal to 0, it will be filled in the stored procedure.
    • @enabled: Only enabled jobtriggers are processed. Values 0 or 1 (required).
    • @name: Name for the jobtrigger (unique + required). Is used for checking if an update or insert needs to be done, when @trigger_id is null or smaller than or equal to 0
    • @sqlagent_jobname: The name of the SQL Agent job that must be started when all jobtrigger conditions are true (required).
    • @processing_order: Indicates the order in which jobtriggers must be processed (required). Duplicate processing order values are allowed.
    • @description: An optional description.

    Stored procedure [dbo].[usp_upsert_jobtrigger_condition]

    With this stored procedure you can insert or update a jobtrigger condition of any type.

    You can set the following parameters:

    • @condition_id: If provided as input, it will be used for checking if an update or insert needs to be done. If left null or smaller than or equal to 0, it will be filled in the stored procedure.
    • @trigger_id: The trigger_id of the jobtrigger to which the jobtrigger condition belongs.
    • @enabled: Only enabled jobtrigger conditions are processed. Values 0 or 1 (required).
    • @processing_order: Indicates the order in which jobtrigger conditions must be processed (required). Duplicate processing order values are allowed.
    • @description: An optional description.
    • Note regarding the parameters underneath: You must provide values for the one @sql* parameter OR the three @file* parameters OR the three @job* parameters.
    • @sql_stored_procedure: The fully qualified name (including database name) of a stored procedure that returns a single row with a single column named [SqlTriggerResult] of the BIT datatype.
    • @file_watch_folder: The folder in which to search for files.
    • @file_recurse_subfolders: Indicates whether subfolders of the watch folder must also be checked.
    • @file_name_pattern: A filename or filename pattern with wildcards of the file that must exist for the triggercondition to be true.
    • @job_name: The name of a SQLAgentjob that must be completed with the desired value for @job_result for the triggercondition to be true.
    • @job_result: Domain values [ S | F | C ], meaning Succeeded, Failed or Completed (= Succeeded or Failed).
    • @job_result_maxage: The maximum age, in seconds, of the job's completion; the job must have completed no longer than this many seconds ago. Use -1 for unlimited. For instance, suppose you want a SQL Agent job to start when 4 other jobs, that are scheduled daily and partially run simultaneously, have all succeeded.
      The time difference between the first and the last estimated completion time is 60 minutes.
      The time difference between the first scheduled start time and the last estimated completion time is 120 minutes.
      Any value (converted to seconds) between 60 minutes (plus a safety margin) and 22 hours (24 hours minus 120 minutes) would then be fine.

    Stored procedure [dbo].[usp_upsert_sql_jobtrigger_condition]

    With this stored procedure you can insert or update a jobtrigger condition of type ‘sql’ (using a stored procedure).

    You can set the following parameters:
    @condition_id, @trigger_id, @enabled, @processing_order, @description, @sql_stored_procedure: as described above.

    Stored procedure [dbo].[usp_upsert_file_jobtrigger_condition]

    With this stored procedure you can insert or update a jobtrigger condition of type ‘file’.

    You can set the following parameters:
    @condition_id, @trigger_id, @enabled, @processing_order, @description, @file_watch_folder, @file_recurse_subfolders, @file_name_pattern: as described above.

    Stored procedure [dbo].[usp_upsert_jobresult_jobtrigger_condition]

    With this stored procedure you can insert or update a jobtrigger condition of the SQLAgentJobResult type.

    You can set the following parameters:
    @condition_id, @trigger_id, @enabled, @processing_order, @description, @job_name, @job_result, @job_result_maxage: as described above.

    Stored procedure [dbo].[usp_delete_jobtrigger]

    With this stored procedure you can delete a jobtrigger by trigger_id or name.

    Stored procedure [dbo].[usp_delete_jobtrigger_condition]

    With this stored procedure you can delete a jobtrigger condition by using the condition_id.

    Stored procedure [dbo].[usp_delete_jobtrigger_conditions_by_trigger_id]

    With this stored procedure you can delete all jobtrigger conditions for a given trigger_id.

    Examples of usage of the stored procedures

    Among the files you can download there is the file demo_setup_sysjobtriggers.sql, that uses the upsert stored procedures for different use cases.
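
    To give you an idea without opening that file: a jobtrigger with a single file condition could be set up roughly like the sketch below (run in the msdb database). The parameter names are the ones described above; the values and the hard-coded ids are made up for illustration, so the demo script remains the authoritative example.

    USE [msdb]
    GO
    -- A jobtrigger that starts MSSQL_S01E04_Job1 ..
    EXEC [dbo].[usp_upsert_jobtrigger]
      @trigger_id = 1,
      @enabled = 1,
      @name = N'Start Job1 when a txt file arrives',
      @sqlagent_jobname = N'MSSQL_S01E04_Job1',
      @processing_order = 10,
      @description = N'Demo jobtrigger';

    -- .. with one file condition: at least one .txt file must exist in C:\Temp (no subfolders).
    EXEC [dbo].[usp_upsert_file_jobtrigger_condition]
      @condition_id = 1,
      @trigger_id = 1,
      @enabled = 1,
      @processing_order = 10,
      @description = N'A .txt file exists in C:\Temp',
      @file_watch_folder = N'C:\Temp',
      @file_recurse_subfolders = 0,
      @file_name_pattern = N'*.txt';
    GO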

    Conclusion / Wrap up

    I have tried to explain how the SQLAgentJobStarter tool can be used for starting SQL Agent jobs triggered by the existence of one or more files, by the result of a stored procedure execution or by the execution result of one or more other SQL Agent Jobs.
    A few important side notes I still have to make:

    1. In real life scenarios, make sure the files you are polling for are moved after processing, so that the tool does not try to start the job again.
    2. To create stored procedures that are used as a sql trigger you can use the storedprocedure_template.sql file as a template.
    3. To create your own jobtriggers and triggerconditions please check demo_setup_sysjobtriggers.sql in which some of the stored procedures are used.
    4. I am aware that the source code of the tool might need some refactoring. It does the job, however.
    5. Anyone who can help me with creating a graphical user interface to manage the jobtriggers and triggerconditions using C# and WinForms is more than welcome. If you are interested, please contact me. I do not have enough time to do this myself.

    (c) 2016 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.

    Using the project deployment model with environments (SSIS Series)

    Introduction

    Hi, as announced in my previous post of July 14, I took a summer break. It was really useful and I enjoyed it.
    From now on I will try to write some interesting posts weekly again.
    Today I want to explain the project deployment model of SSIS, introduced back in SSIS 2012.
    You will also find a (hopefully useful) script below that creates folders, environments and environment variables, and connects package parameters to environment variables.
    But that’s for later, first a little introduction.

    Folders and environments

    What I found very confusing when I started using the project deployment model, already some years ago, is that an environment in this context is not what I expected it to be.
    From a software development perspective an environment is an installed and working version of your software, optionally using one or more databases and configuration settings exclusively for that environment. For instance a Development or a Test environment.
    But the project deployment model of SSIS does not agree with that definition.
    For the project deployment model an environment is only a set of values for all environment variables, and nothing more.
    Can you think of what that means? Suppose you have a server, both used for Development and Test, and you have databases and cubes with a “_DEV” or a “_TST” suffix to their names.
    But as far as SSIS is concerned, you can only install one version of an SSIS package! Yeah, you can execute it with a different set of parameter values stored in SSIS environment variables, but that’s it.
    You cannot have two different versions of the same package installed on the same instance of SQL Server – at least not using only SSIS Environments.
    I found it not so intuitive that you need to use folders inside the Integration Services Catalog to achieve this.
    So far the theory, let us try this stuff out together.

    Let’s try it

    I’ll just click an example together, you do not need to follow me, my goal is that you understand the concepts so that you can apply it to your own projects.
    By the way, I am using SQL Server 2016, but it should also work in SQL Server 2012 and 2014.

    My test setup is very basic:

    • I have made two databases, [SSIS_S01E08_DEV] and [SSIS_S01E08_TST], as development and test environment, respectively.
    • Both databases have a table [tbl].[Fruit], but the table in the DEV database has one extra column, [FruitDescription], to simulate a change that is under development but not deployed to test yet.
    • I have made one SSIS Package AddFruit.dtsx, that inserts a row into the table [tbl].[Fruit] passing the value for [FruitName] only. During the demo I will change the package so that it also provides the [FruitDescription] column value.

    Playing around with the example above, I can demonstrate the use of folders and environments when the package is deployed using the project deployment model.

    The initial SSIS Package

    The initial SSIS Package contains just one Execute SQL Task, one parameter named FruitName and one OLEDB Connection manager named MyDatabase.connectionmanager.
    The connection manager is configured by an expression so that the property ConnectionString gets the value of a project parameter also named ConnectionString as demonstrated in the images below.
    SSIS-S01E08-280The Execute SQL Task

    SSIS-S01E08-281The package parameter

    SSIS-S01E08-282The project parameter

    SSIS-S01E08-283The OLEDB connection manager configured with an expression
    SSIS-S01E08-284Then rebuild the Solution
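
    To give an impression of what the Execute SQL Task does: the SQLStatement could be something like the sketch below, assuming the package parameter FruitName is mapped to the ? placeholder in the task's Parameter Mapping (your own statement may of course differ).

    INSERT INTO [tbl].[Fruit] ([FruitName])
    VALUES (?);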

    Deploying the SSIS Package

    Once the solution has been rebuilt, you can deploy the package with the Integration Services Deployment Wizard. You can also call this utility from the command line, but explaining that is out of scope for this blog post.
    To execute this wizard, you have to locate the .ispac file; it is in a subfolder of your project’s bin folder.
    If you have created additional solution configurations in your Visual Studio solution, this subfolder might be different from the image below (where the subfolder is Development).
    SSIS-S01E08-285Locating the .ispac file, that was created when the solution was rebuilt.

    SSIS-S01E08-286First step of the wizard

    SSIS-S01E08-287The default deployment model is Project Deployment, the one you need. The other option is Package Deployment.

    SSIS-S01E08-288Select a server name and a path [1]

    SSIS-S01E08-289Select a server name and a path [2]: create a new folder

    SSIS-S01E08-290Select a server name and a path [3]: name the folder after your environment

    SSIS-S01E08-292Select a server name and a path [4]: then click Next

    SSIS-S01E08-293Review and deploy.

    SSIS-S01E08-295Deployment succeeded.

    SSIS-S01E08-296In the Integration Services Catalog you can find your project and package.

    SSIS-S01E08-297You could configure it now, but there are no SSIS environments yet (note: these are a set of environment variable values).

    SSIS-S01E08-298Don’t configure just yet. We will do this later.

    SSIS-S01E08-299Here you could create an environment .. don’t do it now !

    SSIS-S01E08-TST-environment_setup.sql

    --\
    ---) A script to create a folder and environment in an SSIS Catalog.
    ---) Author: Hans Michiels
    --/
    /*
    (c) Copyright 2016 - hansmichiels.com
     
    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.
     
    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    GNU General Public License for more details.
     
    You should have received a copy of the GNU General Public License
    along with this program. If not, see http://www.gnu.org/licenses/.
    */

    USE [SSISDB]
    GO

    DECLARE @environment_variables_table TABLE(
      [id] int identity(1, 1),
      [variable_name] nvarchar(128),
      [data_type] nvarchar(128),
      [sensitive] bit,
      [value] nvarchar(4000),
      [description] nvarchar(1024)
      )

    DECLARE @object_parameter_value_table TABLE(
      [id] int identity(1, 1),
      [object_type] smallint,
      [object_name] nvarchar(260),
      [parameter_name] nvarchar(128)
      )

    DECLARE @id INT
    DECLARE @max_id INT
    DECLARE @folder_name sysname
    DECLARE @environment_name nvarchar(128)
    DECLARE @environment_description nvarchar(128)
    DECLARE @project_name nvarchar(128)
    DECLARE @variable_name nvarchar(128)
    DECLARE @data_type nvarchar(128)
    DECLARE @sensitive bit
    DECLARE @value sql_variant
    DECLARE @value_tinyint tinyint
    DECLARE @description nvarchar(1024)
    DECLARE @nsql nvarchar(max)

    DECLARE @object_type smallint
    DECLARE @object_name nvarchar(260)
    DECLARE @value_type CHAR(1)
    DECLARE @parameter_name nvarchar(128)
    DECLARE @parameter_value sql_variant

    --\
    ---) Environment settings
    --/
    SELECT
      @folder_name = N'TST',
      @environment_name = N'TST',
      @environment_description = N'My demo Test environment',
      @project_name=N'SSIS_S01E08'

    --\
    ---) Script to create environment and settings.
    --/
    IF NOT EXISTS (
        SELECT 1
        FROM [SSISDB].[internal].[environments]
        WHERE [environment_name] = @environment_name
        )
    BEGIN
        EXEC [SSISDB].[catalog].[create_environment] @folder_name, @environment_name, @environment_description;
    END

    INSERT INTO @environment_variables_table
      ( [variable_name], [data_type], [sensitive], [value], [description] )
      VALUES
          ( N'ConnectionString', N'String', 0, N'Data Source=.;Initial Catalog=SSIS_S01E08_TST;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;', N'Connection string for the database.' )
        , ( N'FruitName', N'String', 0, N'Tamarillo', N'' )
        --, ( N'Int32Example', N'Int32', 0, N'1', N'' )
        --, ( N'BooleanExample', N'Boolean', 0, N'1', N'' )
        --, ( N'ByteExample', N'Byte', 0, N'5', N'' )

    SELECT @id = 1, @max_id = MAX([id]) FROM @environment_variables_table
    WHILE @id <= @max_id
    BEGIN
        SELECT
          @variable_name = v.variable_name,
          @data_type = v.data_type,
          @sensitive = v.sensitive,
          @value = v.value,
          @description = v.[description]
        FROM @environment_variables_table v
        WHERE [id] = @id;

        IF EXISTS (
            SELECT 1
            FROM [SSISDB].[internal].[environment_variables] v
            JOIN [SSISDB].[internal].[environments] e ON e.environment_id = v.environment_id
            WHERE v.[name] = @variable_name
            AND e.environment_name = @environment_name
            )
        BEGIN
            SET @nsql = N'EXECUTE [catalog].[delete_environment_variable] '
              + N'@folder_name = N'''+ @folder_name + ''', @environment_name = N''' + @environment_name + ''', '
              + N'@variable_name = N''' + @variable_name + ''''
            PRINT @nsql;
            EXEC sp_executesql @nsql;
        END

        PRINT '/*'
        PRINT @variable_name
        PRINT CONVERT(nvarchar(128), SQL_VARIANT_PROPERTY(@value, 'BaseType'));
        PRINT '*/'

        SET @nsql = N'EXECUTE [catalog].[create_environment_variable] '
          + N'@folder_name = N'''+ @folder_name + ''', @environment_name = N''' + @environment_name + ''', '
          + N'@variable_name = N'''+ @variable_name + ''', @data_type = N''' + @data_type + ''', '
          + N'@sensitive = ' + CONVERT(NVARCHAR, @sensitive) + ', @description = N''' + @description + ''', '
          + CHAR(13) + CHAR(10) + N'@value = ' +
          CASE UPPER(@data_type)
          WHEN 'String' THEN 'N''' + CONVERT(NVARCHAR(1000), @value) + ''' '
          ELSE CONVERT(NVARCHAR(1000), @value)
          END + '; '
        PRINT @nsql;
        EXEC sp_executesql @nsql;

        SET @id = @id + 1
    END

    --\
    ---) Add environment reference to project.
    --/
    Declare @reference_id bigint
    IF NOT EXISTS(SELECT 1
      FROM [SSISDB].[internal].[environment_references] r
      JOIN [SSISDB].[internal].[projects] p
        ON p.project_id = r.project_id
      WHERE p.name = @project_name
        AND r.environment_name = @environment_name
      )
    BEGIN
        EXEC [SSISDB].[catalog].[create_environment_reference]
            @environment_name=@environment_name, @reference_id=@reference_id OUTPUT,
            @project_name=@project_name, @folder_name=@folder_name, @reference_type='R';
        --Select @reference_id
    END

    --\
    ---) Connect environment variables to package parameters, based on the name
    --/
    INSERT INTO @object_parameter_value_table (
        [object_type],
        [object_name],
        [parameter_name]
        )
    SELECT
        prm.[object_type],
        prm.[object_name],
        prm.[parameter_name]
    FROM
        [SSISDB].[internal].[object_parameters] prm
    JOIN
        [SSISDB].[internal].[environment_variables] ev
        ON ev.name = prm.parameter_name
    JOIN
        [SSISDB].[internal].[projects] prj
        ON prj.project_id = prm.project_id
    WHERE
        prm.[value_type] != 'R'
    AND
        prm.value_set = 0
    AND
        prj.name = @project_name
    ORDER BY
        prm.object_name, prm.parameter_name

    SELECT @id = 1, @max_id = MAX([id]) FROM @object_parameter_value_table
    WHILE @id <= @max_id
    BEGIN
        SELECT
          @object_type = v.[object_type],
          @object_name = v.[object_name],
          @parameter_name = v.[parameter_name]
        FROM @object_parameter_value_table v
        WHERE [id] = @id;

        SELECT @value_type = 'R', @parameter_value = @parameter_name;
        
        EXEC [SSISDB].[catalog].[set_object_parameter_value]
          @object_type, @folder_name, @project_name, @parameter_name,
          @parameter_value, @object_name, @value_type

        SET @id = @id + 1
    END

    GO

    My script that makes it easy to set up multiple environments quickly (download all the used scripts here).

    In the demo it would not be a problem to do all the configuration by hand, but if you have a lot of SSIS Packages with a lot of parameters, you really do not want to do this, especially not if you have different environments, like Development, Test and Production.
    This is why I made a script that automates this work for me:

    • It creates an environment, if it doesn’t exist yet.
    • It (re)creates environment variables, defined in the script.
    • It adds an environment reference to the project, if it doesn’t exist yet.
    • It connects environment variables to package parameters, based on the name.
      For this to work, it is important that package parameters with the same name used in different packages mean the same thing!
      For instance if you have a parameter SourceFolder used in different packages, it should be the same folder! If not, rename the parameters so that they have a unique name and – when mapped to environment variables – can both be assigned a different value, e.g. CrmSourceFolder and ErpSourceFolder.

    We run the script for the TST environment.

    SSIS-S01E08-300The script was executed successfully.

    SSIS-S01E08-301After the script has run, a TST folder with an TST environment inside exists. You can view or edit its properties.

    SSIS-S01E08-302When the Variables pane is selected on the left, the environment variables, created by the script, are shown.

    SSIS-S01E08-303View package configuration.

    SSIS-S01E08-304The package parameter FruitName is now connected to environment variable FruitName.
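
    If you want to verify these mappings with a query instead of clicking through the GUI, a sketch like this (using the public catalog views instead of the internal tables used in the script) could help:

    USE [SSISDB]
    GO
    SELECT
        op.[object_name],
        op.[parameter_name],
        op.[value_type],                -- 'R' means the parameter references an environment variable
        op.[referenced_variable_name]
    FROM
        [catalog].[object_parameters] op
    JOIN
        [catalog].[projects] prj
        ON prj.project_id = op.project_id
    WHERE
        prj.[name] = N'SSIS_S01E08'
    AND op.[value_type] = 'R';
    GO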

    SSIS-S01E08-305When I choose Execute ..

    SSIS-S01E08-306… all I have to do is check Environment, it will automatically show .\TST

    SSIS-S01E08-307After execution of the package, a row exists in the table (and one from a previous test)

    SSIS-S01E08-308We can also execute the package with a SQL Agent Job ..

    SSIS-S01E08-310Execute the package with a SQL Agent Job: all I have to do is check Environment, it will automatically show .\TST ..

    SSIS-S01E08-311.. and all values are connected properly ..

    SSIS-S01E08-312And yes, it works!

    Extending the package with a FruitDescription

    I must admit, I cheated a little. I skipped the part of deploying the first version of the SSIS Package to the DEV environment.
    So now I am going to deploy to a new DEV Folder and Environment for the first time, while it will already be the second version of the package. This is not a recommended practice in real projects.
    Anyway, the package gets an extra parameter FruitDescription.
    SSIS-S01E08-313Added FruitDescription

    SSIS-S01E08-314Modify the Execute SQL Task to insert also the value for @FruitDescription [1]

    SSIS-S01E08-315Modify the Execute SQL Task to insert also the value for @FruitDescription [2]

    SSIS-S01E08-316Then Rebuild

    SSIS-S01E08-317Then run deployment wizard again, but now create a DEV folder

    SSIS-S01E08-318Script modifications [1]: use a DEV folder and a DEV environment name.

    SSIS-S01E08-319Add FruitDescription to the list of environment variables and change the values: the connection string must use the DEV database, and FruitName and FruitDescription must get different values than in the TST environment.

    Then run the script.
    SSIS-S01E08-320The package was deployed using the deployment wizard.

    SSIS-S01E08-321SSIS Environment was created using the script

    SSIS-S01E08-322Environment variables were added using the script

    SSIS-S01E08-323Environment variables were connected to package parameters using the script

    SSIS-S01E08-324Package execution test: note the new parameter FruitDescription

    SSIS-S01E08-325Test result

    SSIS-S01E08-326Final result: different versions of the same SSIS Package installed side by side on the same machine. But in addition to SSIS Environments with variable values, folders inside the Integration Services Catalog are used for that!

    Download the scripts here.

    Conclusion / Wrap up

    With the project deployment model of SSIS, introduced in SQL Server 2012, deployment and configuration of SSIS Packages has become easier.
    However it is important to understand how folders and environments in the Integration Services Catalog work. I have tried to explain this, and provided a script to set up multiple environments quickly.

    Why did you use those weird fruit examples?

    D=Development
    T=Test
    😉
    Read more here:
    Durian
    Tamarillo

    (c) 2016 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.

    Implement logging in a Script Task (SSIS Series)

    Introduction

    This post is a follow-up on two older posts and will be the last one before my summer break.

    Today I want to enhance the SSIS package made in post Fixing corrupt csv files in a SSIS Package (SSIS Series) by using the Plug and Play Logging Solution.

    Problem

    As long as a script runs well, you might not need logging inside. BUT .. when something GOES wrong, having logging in the script can reduce the time needed to troubleshoot the issue.
    In the example for fixing csv files, anything can be wrong with a file, causing an error in the script. In that case it is for instance useful to know which file was being processed when the error occurred.

    Prerequisite

    For this to work, you have to install my Plug and Play Logging Solution, which can be downloaded from this post. It is just one SQL Script you have to run, that creates a [logdb] database with objects inside. You can also add those objects to a different database that already exists.
    It is however not mandatory to rebuild the Fixing corrupt csv files in a SSIS Package example, as you can also add logging to your own Script Task.

    Solution

    This post is an enhancement of the Fixing corrupt csv files in a SSIS Package post, so you might need to read that post first, if you have not done so already.

    Therefore I will dive directly into the changes that are needed to add logging in the Script Task.
    This will be done with lots of screenprints with some comment in between.

    SSIS-S01E07-169Adding a package parameter “LoggingConnectionString”
    First you need to add a package parameter LoggingConnectionString that can be used in an expression of the OLE DB Connection and as input variable for the Script Task.

    SSIS-S01E07-170OLE DB Connection configuration ..
    Then add an OLE DB Connection for the logging database.

    SSIS-S01E07-171OLE DB Connection properties ..
    Use an expression to let the ConnectionString of the OLE DB Connection be equal to the value of the package parameter LoggingConnectionString.

    SSIS-S01E07-173OLE DB Connection – Expression configuration [1]

    SSIS-S01E07-172OLE DB Connection – Expression configuration [2]

    SSIS-S01E07-174Final result
    By the fx icon you can see that the connection manager uses an expression.

    SSIS-S01E07-176Enable logging
    Now enable logging for the package.

    SSIS-S01E07-177Enable logging [2]
    Add a log provider for SQL Server and let it use the logdb.oledbConnection by selecting it under the Configuration column header.

    SSIS-S01E07-178Enable logging [3]
    Then select all events. Filtering on what is actually logged is done by the logging solution (by the value of @MaxMessageClass, see this blog post for more details).

    SSIS-S01E07-179Select the Script Task
    Select the Script Task and add the following Variables to ReadOnlyVariables:

    • System::ExecutionInstanceGUID
    • System::PackageID
    • $Package::LoggingConnectionString


    SSIS-S01E07-180The added ReadOnlyVariables in the red rectangles

      Below you will find a number of screenprints of the script task to talk you through the changes.
      You can download the C# script here.

      SSIS-S01E07-181Overview
      First make sure the Namespaces region is as shown.
      Then fold the namespace with the guid in the name, and paste the entire namespace HansMichielsCom.PlugAndPlaySSISLoggingSolution underneath it.
      This code is put in a separate namespace, so that it could also be placed in a .NET assembly that is added to the GAC (Global Assembly Cache). When you would do this, you do not have to add the code to every Script Task.
      For the example of today, we just put this namespace inside the Script Task to make things not too complicated for now.

      SSIS-S01E07-182Using the HansMichielsCom.PlugAndPlaySSISLoggingSolution namespace
      As a result, you have to tell the first guid-like namespace, that you want to call code inside the second namespace. Therefore add the using statement as shown above.

      SSIS-S01E07-183Constant used for logging
      Below you will see some printscreens with changed parts in the script.

      SSIS-S01E07-184Method GetSsisLogWriter to instantiate a SsisLogWriter object

      SSIS-S01E07-187Method Main is extended with logging.

      SSIS-S01E07-188Pass the logWriter as parameter to other methods ..

      SSIS-S01E07-189IMPORTANT: Bugfix in CheckAndReturnHeader!

      SSIS-S01E07-190IMPORTANT: Bugfix in CheckAndReturnHeader!
      (header == null) is added to cope with empty files.

      Testing, testing, one, two ..



      SSIS-S01E07-191Test preparations [1]

      SSIS-S01E07-193Test preparations [2]

      SSIS-S01E07-194Test execution

      SSIS-S01E07-195Test result: logging rows done inside the script are in the log table.

      Conclusion / Wrap up

      In this post I have demonstrated how to implement logging in SSIS Script Tasks using my Plug and Play Logging Solution.
      This type of logging gives more control on what to log and how to log it than when you implement logging using SSIS events.
      The examples given are very basic. You can use your imagination to implement logging of errors using a try .. catch block, or use all available parameters of logWriter.AddLogEntry to change the Retention Class, Message Class, and so on.

      In the summer I will take some time for study, reflection, holiday, and still .. work.
      My next post will be early September at the latest, maybe earlier.

      Download the C# script here.

      (c) 2016 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.

    Zeros, bloody zeros! (Data Vault Series)

    Introduction

    I must admit I have a weakness for British humour.
    When I had to cope with leading zeros in business keys some time ago, I spontaneously came up with the title of this post, not knowing that it would serve as such.
    For those who do not know, “Meetings, bloody meetings” is a British comedy training film in which John Cleese plays a main role. It was made in 1976, and a remake was made in 2012.
    It tells in a funny way what can go wrong at meetings and how you can do better, check it out if you can.

    DV-S01E05-meetingsMr John Cleese

    But, obviously, this post is not about meetings but about zeros.

    Problem

    I can be short about that: leading zeros in business key values.
    For instance a customer number is delivered to the data warehouse as 0001806 (instead of 1806).
    This would not be a problem if it were always delivered exactly like that. But to be honest, you cannot and will not know that upfront. Even if this is the case now, it might not be in the future.
    When other tools are used, leading zeros could suddenly disappear (for instance when a csv file is modified using Excel), or (more rarely) the number of leading zeros could change (01806, 00001806). When this happens you have a problem, because for the data warehouse 01806, 0001806, 00001806 and 1806 are all different business keys! Even if you have only two variants, it is already a problem.
    Because every business key gets a different row in the hub, and this customer now exists multiple times!

    DV-S01E05-zeros(No acting here, this is how I look sometimes)

    Solution

    If you are familiar with Data Vault, you might already think of same-as-links to solve this.
    But I think the solution should be implemented earlier, to avoid having multiple hub rows.
    Simply always remove leading zeros when the source column is (part of) a business key (either primary or foreign key) and looks like a number or ID but is delivered as a string/varchar. In this way 1806 will always be 1806! And I think it is pretty impossible that 001806 and 1806 would refer to two different customers.
    Unless, of course, they would come from different source systems. But in that situation, depending on leading zeros would be a bad thing to do anyway, because when the leading zeros are dropped, satellite rows of different customers (from different source systems) could end up connected to the same hub row! In that situation, in a non-integrated Raw Vault, it would be better to prefix the business key with the source system code and remove the leading zeros, for instance CRM.1806 and ERP.1806.
    In all cases, you can still store the original value (with leading zeros) as an ordinary attribute in a satellite for auditing reasons.

    How to implement the solution

    There are many ways to remove leading zeros. When I was searching for this I had two requirements:

    • No casting from and to an integer may take place, otherwise all business keys need to be numeric, so this would make the solution less reliable.
    • No function, routine or assembly may be called, this could negatively impact performance. I was looking for an “inline” conversion.

    After some research I found an expression that was the same for SQL and SSIS and quite okay (T-SQL version by Robin Hames, my credits for his work), but appeared to change a string with only one or more zeros to an empty string. And because a 0 can have a meaning – and is certainly different from an empty string – this is undesired behavior, IMHO.
    So I had to add some logic to it: a SELECT CASE in T-SQL and an inline condition (format {condition} ? {true part} : {false part} ) to the SSIS expression.
    Furthermore I came across a different method for T-SQL as well, using the PATINDEX function, which is more compact than the other solution.
    For SSIS I still use the ‘Robin Hames’ method, because the PATINDEX function is not available in SSIS Expressions.
    So .. this is what it has become:

    T-SQL

    Remove_leading_zeros.sql

    SELECT
        example.[id_with_leading_zeros],
       CASE
          WHEN LTRIM(example.[id_with_leading_zeros]) = '' THEN ''
          WHEN PATINDEX( '%[^0 ]%', example.[id_with_leading_zeros]) = 0 THEN '0'
          ELSE SUBSTRING(example.[id_with_leading_zeros], PATINDEX('%[^0 ]%', example.[id_with_leading_zeros]), LEN(example.[id_with_leading_zeros]))
       END AS [id_without_zeros_method1],

       CASE
          WHEN LTRIM(example.[id_with_leading_zeros]) = '' THEN ''
          WHEN PATINDEX( '%[^0 ]%', example.[id_with_leading_zeros]) = 0 THEN '0'
          ELSE REPLACE(REPLACE(LTRIM(REPLACE(-- Robin Hames' method
                REPLACE(LTRIM(example.[id_with_leading_zeros]), ' ', '!#!') -- replace existing spaces with a string that does not occur in the column value, I have chosen '!#!'
                , '0', ' ') -- replace '0' with ' '
                ) -- end of LTRIM to remove leading '0's that have been changed to ' 's
                , ' ', '0') -- change ' ' back to '0'
                , '!#!', ' ') -- change '!#!' back to ' '
       END AS [id_without_zeros_method2]
    FROM
        (
        SELECT
            TOP 1000 RIGHT('00000000000' + CONVERT(NVARCHAR(12), object_id), 14) AS [id_with_leading_zeros]
        FROM
            master.sys.objects
        UNION
        SELECT N' 00000 '
        UNION
        SELECT N'00'
        UNION
        SELECT N' '
        UNION
        SELECT ' 0099990 A '
        UNION
        SELECT '-5550'
        ) example

    SSIS Expression (can be used in Derived Column)

    (LTRIM(REPLACE(id_with_leading_zeros,"0", "")) == "" && LTRIM(id_with_leading_zeros) != "") ? "0" : REPLACE(REPLACE(LTRIM(REPLACE(REPLACE(LTRIM(id_with_leading_zeros)," ","!#!"),"0"," "))," ","0"),"!#!"," ")

    DV-S01E05-151In a Derived Column Transformation this looks for instance like this

    Conclusion / Wrap up

    In this post I have motivated why I think you should remove leading zeros from business keys when data is loaded from source systems to a data warehouse.
    This post also contains different ways to remove leading zeros, two for T-SQL and one for a SSIS expression.

    (c) 2016 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.

    SQL Server 2016 tempdb configurations benchmark (SQL Server Series)

    Introduction

    When you have installed SQL Server 2016 you will have noticed that the configuration for tempdb has changed.
    Microsoft made a proper tempdb configuration easy to do via the GUI during installation.

    Today I want to go through the settings offered and will measure performance of different setups.

    What are the options?

    During installation you can set the following values:
    For TempDB datafiles:

    Setting | Default value
    Number of files | 4
    Initial size (MB) | 8
    Autogrowth (MB) | 64
    Data directories | Default data directory, in my case C:\MSSQL\Instance root\MSSQL13.MSSQLSERVER\MSSQL\Data

    For TempDB logfile:

    Setting | Default value
    Initial size (MB) | 8
    Autogrowth (MB) | 64
    Log directory | Default data directory, in my case C:\MSSQL\Instance root\MSSQL13.MSSQLSERVER\MSSQL\Data

    MSSQL-S01E03-tempdb-setup
    Configuration screen during SQL Server 2016 installation (I changed the directories already).

    So what is the best setup?

    This subject is not new for SQL Server 2016, but the installation procedure now just gives a better opportunity to set it up right the first time.
    There are roughly five differentiators in the setup:

    • Initial size (MB)
    • Autogrowth (MB)
    • Number of data files
    • Location of data files
    • Read/write performance of the different storage media

    What are the considerations when setting these values?

    Initial size (MB)

    A larger initial size will accomplish two things:

    1. It will take longer before the file will have the need to grow.
    2. The file can be stored less fragmented on disk.

    If you have enough disk space: the larger the better; there is no real downside to making the initial size quite large. More about this in my conclusion.

    Autogrowth (MB)

    With autogrowth the choice is a bit more delicate, as both a small and a large value have pros and cons, as pointed out in the table below.

    Autogrowth value | Pros | Cons
    Small | More frequent, smaller growth actions cause less noticeable hiccups. | TempDB data files will become more fragmented.
    Large | TempDB data files will become less fragmented. | Less frequent, larger growth actions might cause noticeable hiccups.

    Number of data files

    In general, the recommendation is one data file per (logical) processor. You can configure more files than the number of (logical) processors, but do this only if you notice that this gives a better performance.
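
    If you want to look up how many logical processors SQL Server sees, you can use this query:

    SELECT cpu_count AS [logical_processors]
    FROM sys.dm_os_sys_info;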

    Location of data files

    Locating the data files on different drives can have two effects:

    1. The queue length per disk drive in Windows is reduced (in all scenarios, also with virtual disks residing on the same physical drive).
    2. Physical writes to the disk drives can go in parallel (only when the files are on different physical drives).

    The tests

    The test cases

    Creating combinations of all differentiators would just create *a lot* of test scenarios.
    So I will be a bit pragmatic.
    The test cases are chosen so that each can be compared with at least one other test case that differs in only one differentiator.

    Test # | # Data files | Init size (MB) | Autogrowth (MB) | # Folders | Scenario
    0 | 4 | 8 (x4) | 64 | 1 | Default installation (data files on OS disk)
    1 | 4 | 8 (x4) | 64 | 1 | Data files on dedicated drive
    2 | 4 | 8 (x4) | 64 | 4 | Data files on more drives
    3 | 4 | 2048 (x4) | 64 | 4 | Data files on more drives + Larger initial size
    4 | 4 | 8 (x4) | 256 | 4 | Data files on more drives + Larger autogrowth
    5 | 4 | 2048 (x4) | 64 | 1 | Data files on dedicated drive + Larger initial size
    6 | 4 | 2048 (x4) | 64 | 1 | Data files on dedicated drive + Larger initial size + TempDB on different SSD
    7 | 8 | 2048 (x8) | 64 | 1 | Data files on dedicated drive + Larger initial size + More data files
    8 | 4 | 8 (x4) | 64 | 1 | Default installation (data files on OS disk) + Larger initial size

    Test results

    I will present two graphs as well as the raw data. I think especially the first graph gives some useful insights, as it compares all test cases with test case 0 (default installation), by using an index of 100 for test case 0. If a different test case performs better, its index will be less than 100. Also, in this way, the average result of both test sets (A and B) can be shown.
    I left test case 6 out of the graph; later I will tell why.
    MSSQL-S01E03-testresult-indexIndex-based comparison of test results

    MSSQL-S01E03-testresult-secondsExecution times of the test cases for test sets A and B

    Test results figures

    Test # | Scenario | Test result, test set A (ms) | Test result, test set B (ms)
    0 | Default installation | 30359 | 232795
    1 | Data files on dedicated drive | 28344 | 261467
    2 | Data files on more drives | 22514 | 248484
    3 | Data files on more drives + Larger initial size | 28040 | 210223
    4 | Data files on more drives + Larger autogrowth | 24173 | 200632
    5 | Data files on dedicated drive + Larger initial size | 19608 | 192587
    6 | Data files on dedicated drive + Larger initial size + TempDB on different SSD | 123289 | 387646
    7 | Data files on dedicated drive + Larger initial size + More data files | 19692 | 211501
    8 | Default installation + Larger initial size | 20149 | 361844

    Conclusion

    Data files on dedicated drive + Larger initial size (test 5) gives the best performance for both test sets. Putting the tempdb files on multiple disks cannot beat this, although the differences are relatively small.

    Surprisingly, performance decreases dramatically when the virtual disk on which tempdb is stored is moved to a different (faster) SSD (test 6)! I have no explanation for this. Therefore I leave test 6 out of the analysis; this way the physical storage is the same for all test cases and differences in performance are caused only by the other differentiators.

    My most important conclusion is that the default values offered during installation are quite okay, except for two things:

    The Initial size (MB)

    You should really set the initial size to a large number (gigabytes if possible). Note that if you leave the number of files at 4 (which I recommend), the space used is four times the size you enter for Initial size.
    If you have a dedicated disk for tempdb (which is recommended), you might want to set the initial size of the data file(s) to fill up the disk; you should then disable Autogrowth.
    Also, on a production server, check the size of the tempdb files after a week of uptime (provided the SQL Server instance has not been restarted in that period). Say the tempdb size of all files together (this can also be one file) is 22 GB. You could then set the initial size to 28 GB in total, so 7 GB per file if you configured 4 files. This means that during normal operation tempdb never has to grow.
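
    To stay with the example above (4 files of 7 GB each, 28 GB in total), the initial size can be set with ALTER DATABASE statements like the sketch below. The logical file names tempdev, temp2, temp3 and temp4 are an assumption based on a default SQL Server 2016 installation; check your own instance first.

    -- Set the initial size of each of the four tempdb data files to 7 GB (28 GB in total).
    ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 7GB);
    ALTER DATABASE tempdb MODIFY FILE (NAME = temp2,   SIZE = 7GB);
    ALTER DATABASE tempdb MODIFY FILE (NAME = temp3,   SIZE = 7GB);
    ALTER DATABASE tempdb MODIFY FILE (NAME = temp4,   SIZE = 7GB);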

    Dedicated disk

    It is a common best practice to put tempdb on a separate disk, so that the Windows disk queue does not have to compete with other read/write actions on that disk.

    Other settings are somewhat less important but … think for yourself (my disclaimer)

    In my opinion, neither putting the tempdb files on multiple different disks nor changing the Autogrowth value offers much performance benefit.
    The Autogrowth value is even irrelevant if the initial size is so large that tempdb never has to grow during normal operation.
    But every server is different! Test and think for yourself.
    My test results are presented “AS IS” in the hope that they will be useful as a starting point for your own analysis. I am not responsible for any disadvantage you might have from using my test results for your tempdb setup.

    What if SQL Server is already installed and/or it is an older version?

    No worries, you can still change the tempdb configuration.
    You can move and add files using SQL scripts.
    Initial size and Autogrowth can also be set with Management Studio.
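
    As a sketch, moving and adding tempdb files could look like the statements below. The drive letter T:\ and the logical file names are just examples (assumptions), and moving an existing file only takes effect after the SQL Server instance has been restarted.

    -- Move an existing tempdb data file to another drive (effective after a restart of the instance).
    ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'T:\TempDB\tempdb.mdf');

    -- Add an extra tempdb data file.
    ALTER DATABASE tempdb ADD FILE (NAME = temp5, FILENAME = 'T:\TempDB\temp5.ndf', SIZE = 2048MB, FILEGROWTH = 64MB);
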
    MSSQL-S01E03-166Go to properties of tempdb

    MSSQL-S01E03-167Here you can change Initial size and Autogrowth

    By the way, I noticed that SQL Server does not delete moved tempdb files. So if you move tempdb files, you should delete the files at the old location, so that they do not waste disk space and do not cause confusion for administrators.

    Test justification

    • All tests were done on a Hyper-V virtual machine running Windows Server 2012 R2 and SQL Server 2016 Developer Edition.
    • The change of physical storage type was done by moving vhdx files to different physical disks, and then changing the file location in the settings of the Hyper-V virtual machine (while it was not running, of course).
    • Each test was done on a fresh copy of the virtual machine, so the starting situation was identical for every test (e.g. no tempdb growth caused by previous tests).
    • Tests were done with two sets of scripts (set A and B) that inserted, updated and deleted rows in temporary tables in tempdb. Set A consists of three SQL scripts, and set B of five.
    • All scripts were developed so that all tempdb inserts started at the same time (with at most milliseconds difference in start time). This was done to ensure that every test run was executed in the same way.
    • All scripts were executed in SQL Server Management Studio.
    • The first file of tempdb is stored on its own (virtual) disk, except for test case 0, where the tempdb files are on the C-drive where Windows is installed.
    • When spreading files over multiple virtual disks, the disks created for database files, log files and backup files were used. This was done for pragmatic reasons during the test and was defensible because no other activity on databases and log files was going on during the test. It is however not recommended to spread tempdb like that in a production environment, unless every disk is dedicated to tempdb only.
    • The host machine had the same applications open during every test: Windows Explorer and Microsoft Excel. No other applications that used CPU and could influence the performance of the virtual machines were running.
    • The tempdb log file was placed on a separate disk (the same one) for each test, so it was not a subject of the performance investigation; including it would have at least doubled the number of test cases.

    Wrap up

    Microsoft did a good job in improving the setup options for tempdb in the installation procedure.
    It is a bit of a pity that they did not go the last mile by giving the tempdb data files a more realistic initial size.
    Luckily this is easy to change during installation, and it is a quick win for anyone installing SQL Server 2016.

    Download all the used scripts here.

    (c) 2016 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.