Tag Archives: SSIS

Executing your own .NET console application from SSIS (SSIS Series)

Introduction

The SSIS Script Task is a very powerful component to use in a SSIS package, and most of the time you can achieve with it what you want.
Still I have faced a few situations where a Script Task was not the best solution. In those cases I made a .NET Console application and executed it from within a SSIS package.

The full article is posted on SQLShack.com, so you can read further there ..

How a SSIS Lookup works different from a T-SQL JOIN (SSIS Series)

Introduction

With a Lookup in a dataflow in a SSIS Package you can achieve a similar result as with a T-SQL Join clause in a SELECT statement: you can check if a row exists in a different dataset, based on one or more (business / natural / surrogate) key columns. However there are a number of differences. One of these differences took me several hours of investigating some time ago, when I had this lookup in SSIS that did not find any row. I did not understand why, because I had a similar T-SQL statement with a JOIN that found all rows. What could be different (it appeared to be a CHAR column connected to a VARCHAR)?
So today I am going to compare the two.

Subjects of comparison:

  1. Perform a lookup when the datatypes are different
  2. Perform a lookup on a different SQL Server Instance
  3. Perform a lookup when the join is not a equi-join
  4. When the source is not SQL Server
  5. When the joined columns differ in Case (Uppercase / Lowercase / Mixed case)

Simple setup for two lookup transformations using the Adventureworks database

Perform a lookup when the datatypes are different

A Transact-SQL query with a join is more forgiving than a SSIS Lookup when different datatypes are joined.

Connect a nvarchar to a varchar column

In a SSIS Lookup this does not work because an error is shown during design:

In T-SQL, this join can be made and will work:

different_datatypes.sql
SELECT
    *
FROM
    [Production].[STG_Product] stg
JOIN
    [Production].[ProductCategory] pc
    ON pc.name = stg.VARCHAR_ProductCategoryName

Connect a fixed length CHAR column to a variable length VARCHAR column (or NCHAR to NVARCHAR)

In SSIS the lookup can be developed, but will not work in practice, when the CHAR column value is not the maximum length (and thus has trailing spaces).
A “join” of a nvarchar to a nchar column ..
.. will not work !

In T-SQL, this join can be made and will work:

fixed_length_column.sql
SELECT
    *
FROM
    [Production].[STG_Product] stg
JOIN
    [Production].[ProductCategory] pc
    ON pc.name = stg.NCHAR_ProductCategoryName

Perform a lookup on a different SQL Server Instance

In a SSIS Lookup you can use different connection managers for source and Lookup transformation, so this is easy to do.

In T-SQL this would be more complex to implement, for instance you could use linked servers if your security policy allows this.

Perform a lookup when the join is not a equi-join

An equi-join is a join where the column values that are joined simply must be equal to each other.

By default a SSIS Lookup will do an equi-join. In the visual editor you connect one or more columns from the Available Input Columns to the Available Lookup Columns. Designed in this way it will be an equi-join.
There are two ways I know of to implement a different kind of join:

  1. in the Advanced Tab of the Lookup Transformation Editor you can modify the SQL statement.
  2. use a OLE DB Command in the Data Flow, but this can have a negative impact on performance.

In T-SQL you can do more complex joins, for instance with a BETWEEN clause for an EffectiveStartDate and EffectiveEndDate.

When the source is not SQL Server

In a SSIS Lookup you can use different connection managers for source and Lookup transformation, so this is easy to do.

In T-SQL this would be practicly impossible, or at least much more complex to accomplish (e.g. with OPENROWSET).

When the joined columns differ in Case

A SSIS Lookup is case-sensitive when Full cache is used, so differences in casing cause that lookup values are not found.
However when you switch to No Cache or Partial Cache , the lookup can become case-insensitive! Keith Mescha brought this to my attention (see below). Thank you, Keith.
So the safest thing to do is convert the column values for columns to connect from Available Input Columns to Available Lookup Columns to either upper- or lowercase on both sides. In T-SQL you can use the LOWER() and UPPER() functions for this, in SSIS you can add a Derived column to your dataflow, where you also can use the LOWER() and UPPER() functions. Depending on your needs, you can replace the existing column or add the upper or lower value as new column.

Example of a derived column that converts the value of a column of the Available Input Columns to uppercase.

In T-SQL the result of this join will depend on collation settings of the SQL Server Instance or column(s) affected in the join.
By default, SQL Server is installed using a case-insensitive collation (e.g. Latin1_General_CI_AS, where CI stands for Case Insensitive), and when the collation is not overruled on column-level, the join will work case-insensitively.

Performance considerations

From my own experience I know SSIS Lookups can be faster when developed correctly. This means:

  • Use full cache whenever the dataset allows this (so memory usage is acceptable given the amount of RAM on the server)
  • In the query of the lookup transformation, only include the columns that are needed in the Lookup.

Conclusion / Wrap up

In this blog post I have compared the SSIS Lookup transformation with the T-SQL Join.
In general T-SQL is more flexible and more forgiving, but when you need to go outside the SQL Server Instance for source or lookup it is easier to use a SSIS Lookup, but beware of the peculiarities of the SSIS Lookup, as described in this article.
Performance-wise, a SSIS Lookup can perform better than a T-SQL join, but this of course depends on a lot of design factors in both your Lookup as well as SQL database.

(c) 2017 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.

Free download: Blog Yearbook 2016

I have made a compilation of all my blog posts of the past year, and it was more work than you would think.
Especially because it also contains a non-published new article.

After filling in the form you can download the e-book for free.

    I promise you the following:

    • I will not give or sell your email address to any third party.
    • I will not spam you.
    • If I start with a newsletter, you can easily unsubscribe or change the frequency.

    (c) 2016 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.

    Using sensitive parameters (SSIS Series)

    Updates

    Change log

    Date Changes
    27 August, 2017 Alan brought to my attention that the sensitive value might get exposed in the log. And yes, he was right. I found out that the Raise Change Event (value True) caused this, so I put it to False. Then everything still worked and the sensitive value was not exposed via the log. So the changed advice is to leave Raise Change Event on False for the variable containing the plain value of the sensitive parameter.

    Introduction

    This time I want to tell something about the use of sensitive parameters. Passwords, connection strings, anything that is .. uhm .. sensitive.

    Problem

    SSIS Package Parameters and Environment variables can hold sensitive values, like passwords, connection strings and so on. You do not want these to be seen by people who are not authorized to have access to this information.

    Solution

    First I want to give a *short* intro on how (not) to save sensitive parameters in SSIS.
    There are several options which are explained enough on other internet sites:

    • DontSaveSensitive
    • EncryptSensitiveWithUserKey
    • EncryptSensitiveWithPassword
    • EncryptAllWithPassword
    • EncryptAllWithUserKey

    My opinion on those options:

    • When you use a *WithUserKey value, you might get into trouble when someone else from the development team wants to work on a SSIS Package.
    • When you use a *WithPassword value, you have to type (or paste) the password every time you open the package in Visual Studio. I personally really disklike that.
    • But more important, why would you care about saving the sensitive data at all? Thinking about professional SSIS package development, you would not want sensitive data like passwords and connection strings hard coded in the package anyway, while you can simply configure your packages from environment variables as explained in two of my earlier blog posts. So, as you might guess, DontSaveSensitive is my favorite option.

    So the rest of this article is based on using the DontSaveSensitive property.
    You have to set it in the SSIS package, but also in the project properties:
    ssis-e11-436Setting the package property “ProtectionLevel”
    ssis-e11-437Setting the Project Protection Level

    I have set up a very simple demo. It consists of the following steps:

    Set up a folder and environment in SSISDB if not done already

    See the script for this in my previous blog post.

    Create a sensitive environment variable “MyDBConnectionString”.

    In the demo I use my existing DEV folder with the DEV environment inside that folder.
    In my case I use Windows Integrated Security, therefore the connection string on itself is not sensitive. I’ll just use this example, because a connection string could also contain a username and password.

    ssis-e11-458The environment variable “MyDBConnectionString”. Check the checkbox to make it sensitive

    Create a SSIS Package with a sensitive parameter “MyDBConnectionString”

    ssis-e11-454
    Do I need to say more?

    In the SSIS Package create a (non-sensitive) variable “MyDBConnectionString”

    Not so exiting either, except that you should set Raise Change Event to True for the variable, so that any other properties set by an expression using the variable will certainly change when the variable changes.
    Before you can do that, you must make sure that the column is visible. The printscreens below explain how.
    ssis-e11-439Clicking on this icon shows the Variable Grid Options
    ssis-e11-440On the Variable Grid Options check “Raise event when variable value changes”
    ssis-e11-441For the variable set “Raise Change Event” to True

    Don’t set “Raise Change Event” to True!
    See change log at the top of the article for details.

    ssis-e11-455Here you can see that the Expression @[User::MyDBConnectionString] is used to set the ConnectionString property of the OLEDB Connection Manager OLE_SRC_MyDatabase.

    Copy the parameter value to the variable value using a Script Task

    The “problem” is that you cannot use the sensitive parameter in an expression that uses the value for a non-sensitive property, in my case the ConnectionString property of an OLEDB Connection Manager.
    This error is shown:
    ssis-e11-435

    This is why you have to copy the value of the sensitive parameter to a non-sensitive variable first using a Script Task.
    I personally think it is a pity that Microsoft did not create a GETSENSITIVEVALUE(« SensitiveParameterOrVariable ») expression, so you could use an Expression Task, but they didn’t, so a script task is the way to go.
    Add a Script Task to the control flow of your SSIS Package and adjust public void Main() as below:

    ScriptMain.cs
    public void Main()
    {
      try
      {
        // Copy sensitive value to normal (non-sensitive) variable, so that it can be used in an expression.
        string value = Dts.Variables["$Package::MyDBConnectionString"].GetSensitiveValue().ToString();
        if (!string.IsNullOrEmpty(value))
        {
          // Only copy, if not empty. Else preserve design time value, for instance for debugging.
          Dts.Variables["User::MyDBConnectionString"].Value = value;
        }
        Dts.TaskResult = (int)ScriptResults.Success;
      }
      catch (Exception e)
      {
        Dts.Log(e.Message, 0, null);
        Dts.TaskResult = (int)ScriptResults.Failure;
        throw (e);
      }
      Dts.TaskResult = (int)ScriptResults.Success;
    }

    The control flow of the demo package now looks as follows:
    ssis-e11-460
    The final control flow of the demo package

    To demonstrate how it works, the “design time” connection string, that is stored in the variable MyDBConnectionString, is different from the sensitive environment variable MyDBConnectionString.
    I have added an Execute SQL Task SQL_FooBar that uses the OLEDB Connection Manager OLE_SRC_MyDatabase.
    The connection string in the variable connects to a database named FF (which is Dutch SMS talk for “Just a moment”/temporary).
    The connection string in the sensitive environment variable connects to tempdb.
    To proof that the connection string from the sensitive environment variable (which passes it to the package parameter) is used during package execution, I take the FF database offline, after the OLEDB Connection Manager is configured.
    ssis-e11-443Taking database FF offline ..

    So when the connection manager would not be configured from the sensitive parameter during run time, the Execute SQL Task would certainly fail during execution.

    Now it’s time to deploy my project with the testpackage to my DEV folder in SSISDB. I do not have screenshots of that, because I assume that you will know how to do this. If not, check my previous blog posts.

    ssis-e11-456
    ssis-e11-450This time I do the configuration manually: in the project I add a reference to the DEV environment (not in the picture), then I connect the environment variable to the package parameter.

    After it is configured, I execute the package.

    ssis-e11-451
    ssis-e11-452

    Shortly after it has succeeded, proof for me that it was using the configured connectionstring with database tempdb, and not the design time value with database FF.

    Considerations for debugging

    If you want to debug the SSIS package in Visual Studio you have a problem: due to DontSaveSensitive the package parameter will have no value, and you cannot debug the package.
    My workaround for this is to put the debug value for the sensitive value in the SSIS variable. In the Script Task the variable will only get the value from the parameter when it is not empty. So during debug time, the variable value will be preserved.
    If your value is too sensitive for this, either use a less sensitive value referring to a test database/server etc., or remove the value from the variable after debugging the package. In the latter case, also delete copies of the package in the bin folder.

    Will my sensitive value be safe in an environment variable?

    Short answer: yes, I think pretty safe.
    Why? If a DBA unticks the “Sensitive” checkbox, the value is wiped out and cannot be read.
    Of course a DBA or hacker with sysadmin access to the SQL Instance and knowledge could deploy a SSIS Package, just to pull out the sensitive value. But there are little things a DBA cannot do..

    Conclusion / Wrap up

    In this article you could read some practical insights on using the ProtectionLevel DontSaveSensitive for a SSIS package containing a sensitive parameter configured from a sensitive environment variable during execution.
    Steps to take are:

    1. Create a folder in SSISDB. In that folder, create an environment. In my example both the folder and environment were called DEV.
    2. Create a sensitive environment variable.
    3. Create a SSIS package with a sensitive parameter and a (non-sensitive) variable with the same name.
    4. In the SSIS package create a script task to set the value of the variable to the value of the parameter.
    5. In the SSIS package use an expression to use the variable value where needed, for example in a OLEDB Connection Manager or a FTP Task.

    (c) 2016/2017 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.

    How to automate your SSIS package deployment and configuration (SSIS Series)




    Last update: 20 May, 2017: made newer scripts available for download after giving a presentation for PASS Slowakia.

    Enhancements in the newer scripts (may 2017)

    • can handle multiple folders/environments on the same SQL Server Instance.
    • can install ispac files that are located in subfolders.
    • can install individual SSIS packages (dtsx files) from Powershell (SQL Server 2016 only feature)

    Introduction

    Hi there, today I want to share my scripts on package deployment. As manual deployment can be cumbersome and prone to errors, I have automated this boring repetitive task.
    This article is based on SQL Server 2016, but I have included scripts to be used for SQL Server 2012 and 2014.

    Problem

    How to automate SSIS Package deployment.

    Solution

    1. Setting up folder and environment
    2. Deploy your SSIS Packages
    3. Setup package configuration (or: connect package- and project parameters to environment variables)
    4. Glueing everything together using Powershell

    Setting up folder and environment

    As you could read in one of my previous posts Using the project deployment model with environments I usually create a one-on-one relation between folders and environments.
    So a folder = an environment, and within the folder several SSIS projects can exist.
    What I have seen others do is creating a “Generic”environment. That is also a possibility, this is a matter of personal choice. I prefer having the environment having an explicit name identical to the foldername (e.g. DEV), just to decrease the risk that any changes are made in the wrong Generic environment variable values.

    Our main goal for this step is to have a way to automate setting up a folder and environment, including all (scripted) environment variables. This script should be rerunable to be able to add or update environment variables when needed.
    One thing you should know is that the database SSISDB has a number of stored procecures in the catalog schema, that I use to achieve this.
    Well, here’s the script. Don’t bother copy and paste while you can download all scripts here.

    pre_deployment.sql

    --\ -------------------------------------------------------------------
    ---) Script to create folder, environment and settings.
    --/ -------------------------------------------------------------------
    /*
    (c) Copyright 2016 - hansmichiels.com
     
    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.
     
    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    GNU General Public License for more details.
     
    You should have received a copy of the GNU General Public License
    along with this program. If not, see http://www.gnu.org/licenses/.
    */

    USE [SSISDB]
    GO

    DECLARE @environment_variables_table TABLE(
      [id] int identity(1, 1),
      --)> Just an autonumber to be able to go through the table without cursor.
      [variable_name] nvarchar(128),
      --) Name of the variable
      [data_type] nvarchar(128),
      --) Variable datatype e.g. [ Boolean | Byte | DateTime | Decimal | Double | Int16 | Int32 | Int64 | SByte | Single | String | UInt32 | UInt64 ]
      --) Check all possible values using this query: SELECT DISTINCT [ssis_data_type] FROM [SSISDB].[internal].[data_type_mapping]
      [sensitive] bit,
      --) Indication of the environment variable is sensitive (e.g. a password).
      [value] nvarchar(4000),
      --) The variable value.
      [description] nvarchar(1024),
      --) Extra description for the variable.
      [create_only] bit
      --) Indication that a variable should only be created when it does not exist yet, but should not be replaced when it already exists.
      --) (so the value in the script does not overwrite the existing value).
      --) In this way you can prevent later configuration changes in an environment being undone by the values in the script.
      )

    DECLARE @id int
    DECLARE @max_id int
    DECLARE @folder_id int
    DECLARE @folder_name sysname = N'unknown'
    DECLARE @environment_name nvarchar(128) = N'unknown'
    DECLARE @environment_description nvarchar(128) = N'unknown'
    DECLARE @variable_name nvarchar(128)
    DECLARE @data_type nvarchar(128)
    DECLARE @sensitive bit
    DECLARE @value sql_variant
    DECLARE @description nvarchar(1024)
    DECLARE @create_only bit --
    DECLARE @exists bit
    DECLARE @nsql nvarchar(max)
    DECLARE @message nvarchar(255)

    DECLARE @false BIT = 0, @true BIT = 1

    --\
    ---) Environment settings.
    ---) This is a generic script for all environments, using the @@SERVERNAME (SQL Server Instance name)
    ---) to set values right for each environment.
    --/

    IF @@SERVERNAME LIKE 'VWS63-SQL1%1'
    BEGIN
        SELECT
          @folder_name = N'DEV',
          @environment_name = N'DEV',
          @environment_description = N'My development environment'
    END
    IF @@SERVERNAME = '<SQL INSTANCE NAME>'
    BEGIN
        SELECT
          @folder_name = N'TST',
          @environment_name = N'TST',
          @environment_description = N'My test environment'
    END
    IF @@SERVERNAME = '<SQL INSTANCE NAME>'
    BEGIN
        SELECT
          @folder_name = N'PRD',
          @environment_name = N'PRD',
          @environment_description = N'My production environment'
    END

    IF @environment_name = N'unknown'
    BEGIN
        RAISERROR('Unknown environment', 16, 1)
        -- Make sure that the script does not continue when the environment settings could not be set due to an unknown @@SERVERNAME.
        GOTO Finally
    END

    ----\
    -----) Create folder, when needed.
    ----/
    IF NOT EXISTS (
        SELECT 1
        FROM [SSISDB].[internal].[folders] f
        WHERE f.name = @folder_name
        )
    BEGIN
        SET @message = 'Folder “' + @folder_name + '” is being created.'
        RAISERROR(@message , 0, 1) WITH NOWAIT;
        EXEC [SSISDB].[catalog].[create_folder] @folder_name, @folder_id OUTPUT;
    END

    --\
    ---) Create environment, when needed.
    --/
    IF NOT EXISTS (
        SELECT 1
        FROM [SSISDB].[internal].[environments] e
        JOIN [SSISDB].[internal].[folders] f
            ON e.folder_id = f.folder_id
        WHERE e.[environment_name] = @environment_name
            AND f.name = @folder_name
        )
    BEGIN
        SET @message = 'Environment “' + @environment_name + '” is being created.'
        RAISERROR(@message , 0, 1) WITH NOWAIT;
        EXEC [SSISDB].[catalog].[create_environment] @folder_name, @environment_name, @environment_description;
    END

    IF @environment_name = N'DEV'
    BEGIN
      INSERT INTO @environment_variables_table
          ( [create_only], [variable_name], [data_type], [sensitive], [value], [description] )
      VALUES
      -- Just a few examples of environment variables, add your own here or generate this part from existing environments or deployed packages, see scripts below.
          ( @true, N'Parameter1', N'Int32', @false, N'1', N'Example parameter 1 (DEV).' )
        , ( @true, N'Parameter2', N'String', @false, N'Hello world from script (DEV).', N'Example parameter 2 (DEV).' )
        , ( @true, N'Parameter3', N'Decimal', @false, N'9.95', N'Example parameter 3 (DEV).' )
        , ( @true, N'ProjectParameter1', N'String', @false, N'Hello world from script (DEV).', N'Example projectparameter 1 (DEV).' )
        , ( @true, N'ProjectParameter2', N'Int32', @false, N'4', N'Example projectparameter 2 (DEV).' )
        
    END

    IF @environment_name = N'TST'
    BEGIN
      INSERT INTO @environment_variables_table
          ( [create_only], [variable_name], [data_type], [sensitive], [value], [description] )
      VALUES
      -- Just a few examples of environment variables, add your own here or generate this part from existing environments or deployed packages, see scripts below.
          ( @true, N'Parameter1', N'Int32', @false, N'1', N'Example parameter 1 (TST).' )
        , ( @true, N'Parameter2', N'String', @false, N'Hello world from script (TST).', N'Example parameter 2 (TST).' )
        , ( @true, N'Parameter3', N'Decimal', @false, N'9.95', N'Example parameter 3 (TST).' )
        , ( @true, N'ProjectParameter1', N'String', @false, N'Hello world from script (TST).', N'Example projectparameter 1 (TST).' )
        , ( @true, N'ProjectParameter2', N'Int32', @false, N'4', N'Example projectparameter 2 (TST).' )
        
    END

    IF @environment_name = N'PRD'
    BEGIN
      INSERT INTO @environment_variables_table
          ( [create_only], [variable_name], [data_type], [sensitive], [value], [description] )
      VALUES
      -- Just a few examples of environment variables, add your own here or generate this part from existing environments or deployed packages, see scripts below.
          ( @true, N'Parameter1', N'Int32', @false, N'1', N'Example parameter 1 (PRD).' )
        , ( @true, N'Parameter2', N'String', @false, N'Hello world from script (PRD).', N'Example parameter 2 (PRD).' )
        , ( @true, N'Parameter3', N'Decimal', @false, N'19.95', N'Example parameter 3 (PRD).' )
        , ( @true, N'ProjectParameter1', N'String', @false, N'Hello world from script (PRD).', N'Example projectparameter 1 (PRD).' )
        , ( @true, N'ProjectParameter2', N'Int32', @false, N'4', N'Example projectparameter 2 (PRD).' )
        
    END

    SELECT @id = 1, @max_id = MAX([id]) FROM @environment_variables_table
    WHILE @id <= @max_id
    BEGIN
        SELECT
          @variable_name = v.variable_name,
          @data_type = v.data_type,
          @sensitive = v.sensitive,
          @value = v.value,
          @description = v.[description],
          @create_only = v.[create_only],
          @exists = 0
        FROM @environment_variables_table v
        WHERE [id] = @id;

        IF EXISTS (
            SELECT 1
            FROM
                [SSISDB].[internal].[environment_variables] v
            JOIN
                [SSISDB].[internal].[environments] e ON e.environment_id = v.environment_id
            JOIN
                [SSISDB].[internal].[folders] f
                ON e.folder_id = f.folder_id
            WHERE
                v.[name] = @variable_name
                AND e.environment_name = @environment_name
                AND f.name = @folder_name
            )
        BEGIN
            IF @create_only = 1
            BEGIN
                SET @message = @variable_name + ' already exists and is not replaced.'
                RAISERROR(@message , 0, 1) WITH NOWAIT;
                SET @exists = 1;
            END ELSE BEGIN
                SET @nsql = N'EXECUTE [catalog].[delete_environment_variable] '
                  + N'@folder_name = N'''+ @folder_name + ''', @environment_name = N''' + @environment_name + ''', '
                  + N'@variable_name = N''' + @variable_name + ''''
                PRINT @nsql;
                EXEC sp_executesql @nsql;
            END
        END

        IF @exists = 0
        BEGIN
            SET @message = 'Creating variable “' + @variable_name + '”.';
            RAISERROR(@message , 0, 1) WITH NOWAIT;

            SET @nsql = N'EXECUTE [catalog].[create_environment_variable] '
              + N'@folder_name = N'''+ @folder_name + ''', @environment_name = N''' + @environment_name + ''', '
              + N'@variable_name = N'''+ @variable_name + ''', @data_type = N''' + @data_type + ''', '
              + N'@sensitive = ' + CONVERT(NVARCHAR, @sensitive) + ', @description = N''' + @description + ''', '
              + CHAR(13) + CHAR(10) + N'@value = ' +
              CASE UPPER(@data_type)
              WHEN 'String' THEN 'N''' + CONVERT(NVARCHAR(1000), @value) + ''' '
              ELSE CONVERT(NVARCHAR(1000), @value)
              END + '; '
            -- PRINT @nsql;
            EXEC sp_executesql @nsql;
        END
        SET @id = @id + 1
    END

    RAISERROR('Creating folder, environment and settings has completed ..', 0, 1) WITH NOWAIT;

    Finally:

    GO
    --\
    ---) ON a server where the VARIABLES already exist, you can use this query to generate the 'VALUES' part of the INSERT command used above.
    --/
    /*
    DECLARE @environment_name nvarchar(128) = N'DEV'

    SELECT
        m.*,
        v.name,
        N', ( N''' + v.name + ''', N''' + m.[ssis_data_type] + ''', '
          + CONVERT(NCHAR(1), v.sensitive) + ', N'''
          + CASE v.sensitive
            WHEN 0 THEN CONVERT(NVARCHAR(4000), v.value)
            ELSE '< type sensitive value here TODO >' END
          + ''', N''' + v.description + ''' )' AS [sql_values_part]
    FROM
        [SSISDB].[internal].[environment_variables] v
    JOIN
        [SSISDB].[internal].[environments] e
       ON e.environment_id = v.environment_id
    JOIN
        [SSISDB].[internal].[folders] f
        ON e.folder_id = f.folder_id
    JOIN
        [SSISDB].[internal].[data_type_mapping] m
        ON m.sql_data_type = v.base_data_type
        AND ( m.sql_data_type != 'tinyint' OR m.ssis_data_type = 'Byte')
    WHERE
        e.environment_name = @environment_name
    */

    --\
    ---) ON a server where the PACKAGES are ALREADY DEPLOYED, you can use this query to generate the 'VALUES' part of the INSERT command used above.
    ---) Manually check the output before running the sql script!
    --/
    /*
    SELECT
      N', ( N''' + q.[variable_name] + ''', N''' + q.[data_type] + ''', '
        + CONVERT(NCHAR(1), q.[sensitive]) + ', N'''
        + CASE q.sensitive
            WHEN 0 THEN CONVERT(NVARCHAR(4000), q.value)
            ELSE '< type sensitive value here TODO >' END
        + ''', N''' + q.[description] + ''' )' AS [sql_values_part]
    FROM
      (
      SELECT TOP 2147483647
          prm.[parameter_name] AS [variable_name],
          prm.[parameter_data_type] AS [data_type],r
          prm.[sensitive],
          COALESCE(prm.[default_value], prm.[design_default_value]) AS [value],
          prm.[description],
          ROW_NUMBER() OVER (PARTITION BY prm.[parameter_name] ORDER BY prm.[parameter_name]) AS [Occurrence]
      FROM
          [SSISDB].[internal].[object_parameters] prm
      -- [variable_name], [data_type], [sensitive], [value], [description]
      WHERE
          prm.[value_type] = 'V'
          AND prm.[parameter_name] NOT LIKE 'CM.%'
          AND LEFT(prm.[parameter_name], 1) != '_'
          AND NOT ( prm.[object_type] = 30 AND LEFT(prm.[object_name], 1) = '_') -- Naming convention for internally used SSIS Packages: start with _
      ORDER BY
          prm.[parameter_name]
      ) q
    WHERE
      q.Occurrence = 1
    */

    GO

    ssis-e10-428
    Example output of this script, when executed using Management Studio.

    ssis-e10-429
    And the second time, when a parameter is added and “create_only” is true for the existing parameters.

    Deploy your SSIS Packages

    This is about how to deploy an .ispac file in an automated way.
    An .ispac file is a zipfile that contains all packages in a SSIS project that can be deployed all at once. You create it when you chose Build or Rebuild in Visual Studio when you have the SSIS project open.
    By the way, if you rename it and give it a .zip extension, you can simply look what is inside.

    Amongst the files you can download here there are demo SSIS projects that I used for this on SQL Server 2012, 2014 and 2016.

    ssis-e10-423
    The demo SSIS project ..
    ssis-e10-424
    Menu-option to rebuild the SSIS solution ..
    ssis-e10-425
    ssis-e10-426
    After rebuilding the solution from the menu you will find the ispac files in the bin\<configuration name> folder of each project. In the screenprints above you can see how the bin subfolder is related to the chosen configuration in Visual Studio.

    To install an ispac file, I use the command line options of the Integration Services Deployment Wizard.

    For a default installation you can find the Integration Services deployment wizard (ISDeploymentWizard.exe) here:

    Version Path
    SQL Server 2012 C:\Program Files\SQL Server\110\DTS\Binn
    SQL Server 2014 C:\Program Files\SQL Server\120\DTS\Binn
    SQL Server 2016 C:\Program Files\SQL Server\130\DTS\Binn

    You will find the exact command line in one of the following paragraphs, where all steps are glued together.
    On 64-bit machines there is also a 32-bit version of this tool, but to be honest I do not see a reason why you would want to use the 32-bit version instead of the 64-bit, therefore I did not list the paths here.

    ssis-e10-427
    After installation you can find the folder, project and environment here ..

    Setup package configuration

    After the packages/SSIS projects are deployed, the next thing to do is adding an environment reference to the project(s) and connecting environment variables to package parameters and project parameters, based on the name.
    I use this script for that:

    post_deployment.sql

    --\ -------------------------------------------------------------------
    ---) Script to add environment reference to project(s) and to connect
    ---) environment variables to package parameters, based on the name
    --/ -------------------------------------------------------------------
    /*
    (c) Copyright 2016 - hansmichiels.com
     
    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.
     
    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    GNU General Public License for more details.
     
    You should have received a copy of the GNU General Public License
    along with this program. If not, see http://www.gnu.org/licenses/.
    */

    USE [SSISDB]
    GO

    DECLARE @object_parameter_value_table TABLE(
      [id] int identity(1, 1),
      [object_type] smallint,
      [object_name] nvarchar(260),
      [parameter_name] nvarchar(128),
      [project_name] NVARCHAR(128)
      )

    DECLARE @project_names_table TABLE(
      [id] int identity(1, 1),
      [project_name] NVARCHAR(128)
      )

    DECLARE @message nvarchar(255)

    DECLARE @id int
    DECLARE @max_id int
    DECLARE @folder_name sysname = N'unknown'
    DECLARE @environment_name nvarchar(128) = N'unknown'
    DECLARE @environment_description nvarchar(128) = N'unknown'
    DECLARE @project_name NVARCHAR(128)

    DECLARE @object_type SMALLINT
    DECLARE @object_name NVARCHAR(260)
    DECLARE @value_type CHAR(1)
    DECLARE @parameter_name NVARCHAR(128)
    DECLARE @parameter_value SQL_VARIANT

    DECLARE @reference_id BIGINT

    --\
    ---) Environment settings
    --/

    IF @@SERVERNAME LIKE 'VWS63-SQL1%1'
    BEGIN
        SELECT
          @folder_name = N'DEV',
          @environment_name = N'DEV',
          @environment_description = N'My development environment'
    END
    IF @@SERVERNAME = '<SQL INSTANCE NAME>'
    BEGIN
        SELECT
          @folder_name = N'TST',
          @environment_name = N'TST',
          @environment_description = N'My test environment'
    END
    IF @@SERVERNAME = '<SQL INSTANCE NAME>'
    BEGIN
        SELECT
          @folder_name = N'PRD',
          @environment_name = N'PRD',
          @environment_description = N'My production environment'
    END

    IF @folder_name = N'unknown'
    BEGIN
        RAISERROR('Unknown environment', 16, 1)
        -- Make sure that the script does not continue when the environment settings could not be set due to an unknown @@SERVERNAME.
        GOTO Finally
    END

    INSERT INTO @project_names_table([project_name])
      SELECT
          p.name
      FROM
          [SSISDB].[internal].[folders] f
      JOIN
          [SSISDB].[internal].[environments] e
          ON e.folder_id = f.folder_id
      JOIN
          [SSISDB].[internal].[projects] p
          ON p.folder_id = f.folder_id
      WHERE
          f.name = @folder_name
          AND e.environment_name = @environment_name
          -- Optional project_name filter here:
          -- AND p.name IN ('Project_1', 'Project_2')

    --\
    ---) Add environment reference to project(s).
    --/
    SELECT @id = 1, @max_id = MAX([id]) FROM @project_names_table
    WHILE @id <= @max_id
    BEGIN
        SELECT
          @project_name = v.[project_name]
        FROM @project_names_table v
        WHERE [id] = @id;

        IF NOT EXISTS( SELECT 1
            FROM
                [SSISDB].[internal].[folders] f
            JOIN
                [SSISDB].[internal].[environments] e
                ON e.folder_id = f.folder_id
            JOIN
                [SSISDB].[internal].[projects] p
                ON p.folder_id = f.folder_id
            JOIN
                [SSISDB].[internal].[environment_references] r
                ON r.environment_name = e.environment_name
                AND p.project_id = r.project_id
            WHERE
                f.name = @folder_name
                AND e.environment_name = @environment_name
                AND p.name = @project_name
            )
        BEGIN
            SET @message = 'An environment reference for project “' + @project_name + '” is being created.'
            RAISERROR(@message , 0, 1) WITH NOWAIT;

            EXEC [SSISDB].[catalog].[create_environment_reference]
                @environment_name=@environment_name, @reference_id=@reference_id OUTPUT,
                @project_name=@project_name, @folder_name=@folder_name, @reference_type='R';
            --Select @reference_id
        END

        SET @id = @id + 1
    END

    --\
    ---) Connect environment variables to package parameters, based on the name
    --/
    INSERT INTO @object_parameter_value_table (
        [object_type],
        [object_name],
        [parameter_name],
        [project_name]
        )
    SELECT
        prm.[object_type],
        prm.[object_name],
        prm.[parameter_name],
        prj.name
    FROM
        [SSISDB].[internal].[folders] f
    JOIN
        [SSISDB].[internal].[environments] e
        ON e.folder_id = f.folder_id
    JOIN
        [SSISDB].[internal].[environment_variables] ev
        ON e.environment_id = ev.environment_id
    JOIN
        [SSISDB].[internal].[projects] prj
        ON prj.folder_id = f.folder_id
    JOIN
        @project_names_table prjsel
        ON prjsel.project_name = prj.name
    JOIN
        [SSISDB].[internal].[object_parameters] prm
        ON prj.project_id = prm.project_id
        AND prm.parameter_name = ev.name
    WHERE
        prm.[value_type] != 'R'
        AND prm.value_set = 0
        AND prm.[parameter_name] NOT LIKE 'CM.%'
        AND LEFT(prm.[parameter_name], 1) != '_' -- Naming convention for internally used parameters: start with _
        AND NOT ( prm.[object_type] = 30 AND LEFT(prm.[object_name], 1) = '_') -- Naming convention for internally used SSIS Packages: start with _
        AND f.name = @folder_name
        AND e.environment_name = @environment_name
    ORDER BY
        prm.object_name, prm.parameter_name

    SELECT @id = 1, @max_id = MAX([id]) FROM @object_parameter_value_table
    WHILE @id <= @max_id
    BEGIN
        SELECT
            @object_type = v.[object_type],
            @object_name = v.[object_name],
            @parameter_name = v.[parameter_name],
            @project_name = v.[project_name]
        FROM
            @object_parameter_value_table v
        WHERE
            [id] = @id;

        SELECT @value_type = 'R', @parameter_value = @parameter_name;
        
        SET @message = 'Parameter “' + @parameter_name + '” (of object “' + @object_name + '”) is mapped to environment variable.'
        RAISERROR(@message , 0, 1) WITH NOWAIT;

        EXEC [SSISDB].[catalog].[set_object_parameter_value]
          @object_type, @folder_name, @project_name, @parameter_name,
          @parameter_value, @object_name, @value_type

        SET @id = @id + 1
    END

    RAISERROR('Adding environment reference to project(s) and mapping environment variables to package parameters has completed ..', 0, 1) WITH NOWAIT;

    Finally:

    GO

    Also important is that I use a naming convention to decide whether or not to create a link between a package parameter and an environment variable.
    For this I borrowed a naming convention from c# coding, where a variable starting with an underscore is a private variable within a class.
    So what I have done is the following:

    • To exclude one package parameter from being mapped to an environment variable, start the name with _ (underscore). I know you could just use parameter names that do not exist as environment variables, but this naming convention makes your design more clear. The naming convention also makes it possible to generate the script to create environment variables after you have initially deployed your packages, you can then just skip the package parameter names that start with _.
    • To exclude all package parameters for a SSIS package, start the package name with _ (e.g. when this package is always executed from other packages and does not need environment mappings).

    For the demo I use a solution with two projects, one of them with three packages, and the second one with just one package. This is done to demonstrate how connecting package- and project parameters to environment variables (step 3) works.

    Project SSIS-E10-P1

    • ExampleMaster.dtsx: contains Parameter1 and Parameter2. Both parameters should be mapped to environment variables with the same name.
    • Export_Persons.dtsx: contains Parameter1, Parameter2 and _Parameter3. Only the parameters not starting with the underscore should be mapped to the environment variables with the same name.
    • _LoadDimPerson.dtsx: contains Parameter1, Parameter2 and Parameter4. Not a single parameter should be mapped to an environment variable, because the package name starts with _
    • The project contains ProjectParameter1 which should be mapped to an environment variable with the same name.

    Project SSIS-E10-P2

  • AnyPackage.dtsx: contains Parameter1 and Parameter5. Only Parameter1 will be mapped, because no environment variable with name Parameter5 exists.
  • The project contains ProjectParameter1 and ProjectParameter2. Both parameters should be mapped to environment variables with the same name.
  • ssis-e10-431
    Example output windows when this script is run after having deployed the ispac files.

    ssis-e10-430
    And the result: the environment is “selectable” and the parameters show that they are mapped to an environment variable by having an underlined environment variable name in the “Value” column.

    Glueing everything together using Powershell

    Alright, what do we have so far?

    1. A sql script for setting up the folder and environment with variables.
    2. One or more .ispac files that can silently be deployed using ISDeploymentWizard.exe and its command line options.
    3. A sql script for connecting package- and project parameters to environment variables.

    So our PowerShell script simply needs to execute a sql script, then deploy the packages, and then execute another sql script. That shouldn’t be too difficult.
    This is the script that does exact those steps. If I have some time left I want to split the configuration and the actual script over different files/scripts.

    deployment_x64.ps1
    ######################################################################
    # Script:
    # ssis_deployment_script.ps1
    # Author:
    # Hans Michiels
    # Description:
    # This scripts consists of three parts:
    # 1. Executes pre_deployment.sql, which sets up an environment
    # with its variables.
    # 2. Installs one or more SSIS Deployment packages (ispac files)
    # that are in the same folder as this script.
    # 3. Executes post_deployment.sql, which configures the installed
    # packages to use the environment variables.
    #
    # (c) Copyright 2016 - hansmichiels.com
    #
    # This program is free software: you can redistribute it and/or modify
    # it under the terms of the GNU General Public License as published by
    # the Free Software Foundation, either version 3 of the License, or
    # (at your option) any later version.
    #
    # This program is distributed in the hope that it will be useful,
    # but WITHOUT ANY WARRANTY; without even the implied warranty of
    # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    # GNU General Public License for more details.
    #
    # You should have received a copy of the GNU General Public License
    # along with this program. If not, see http://www.gnu.org/licenses/.
    #
    ######################################################################

    cls
    # Get the folder where this script is located
    $ScriptPath = split-path -parent $MyInvocation.MyCommand.Definition
    # Write-Output $ScriptPath;

    ######################################################################
    # CONFIGURATION
    ######################################################################
    # Configure SQL instance name and (SSIS) foldername here.
    $SQLInstance = “VWS63-SQL161”
    $FolderName = “DEV”
    # SQL Server 2016 default location:
    $ISDeploymentWizard = “C:\Program Files\Microsoft SQL Server\130\DTS\Binn\ISDeploymentWizard.exe”
    # My location:
    # $ISDeploymentWizard = “C:\MSSQL\Shared features\130\DTS\Binn\ISDeploymentWizard.exe”

    ######################################################################
    # MAIN SCRIPT
    ######################################################################
    # \
    # 1. Calls pre_deployment sql script
    # /
    $File = $ScriptPath + “\pre_deployment.sql”
    sqlcmd -S $SQLInstance -i $File

    # \
    # 2. Installs each ispac file found in the folder where the powershell script is located.
    # /
    Get-ChildItem $ScriptPath -Filter *.ispac |
    Foreach-Object {
        $ProjectName = $_.BaseName
        Write-Host “Deploying project $ProjectName ..”;
        & $ISDeploymentWizard /Silent /ModelType:Project /SourcePath:”$ScriptPath\$ProjectName.ispac” /DestinationServer:”$SQLInstance” /DestinationPath:”/SSISDB/$FolderName/$ProjectName” | Out-Null
        Write-Host “Deployment of project $ProjectName completed ..”;
    }
    # \
    # 3. Calls post_deployment sql script
    # /
    $File = $ScriptPath + “\post_deployment.sql”
    sqlcmd -S $SQLInstance -i $File

    And this is the result of the execution of the Powershell script. From now on no more manual deployments!
    ssis-e10-435Powershell output window ..

    Download all scripts and SSIS Solutions for SQL Server 2012, 2014 and 2016 here.

    Conclusion / Wrap up

    The goal of this blog post was to explain to you how to automate the deployment of SSIS packages including the configuration and environment setup.
    I have explained how:

    • to create a folder, an environment and environment variables in SSISDB using a sql script;
    • to deploy your SSIS Packages /ispac files using Powershell and the command line options of Integration Services Deployment Wizard;
    • to setup your package configuration after deployment (or: connect package- and project parameters to environment variables);
    • to execute all of the above using Powershell.

    You could also read that I use a naming convention to be able to exclude individual package parameters or all parameters of a package from being connected to environment variables.

    (c) 2016/2017 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.


    UPDATED: How to upgrade your SSIS Packages from SQL Server 2012 to 2016 (SSIS Series)

    Errata

    Important, please read!
    Hi, I noticed something weird when upgrading my packages as described below, and thought it would be fair to share this with you.

    Script Tasks might not get upgraded well

    There seems to be an issue with upgrading packages with Script Tasks in it. The packages I upgraded with just plain SSIS stuff did not have that problem.
    The following warnings are given:
    Warning 0x40016044: DFT_Export_Person: Found SQL Server Integration Services 2012 Script Component SCR_Add_RowNumber that requires migration!
    Warning 0x40016044: SCR_Foo_Bar: Found SQL Server Integration Services 2012 Script Task “ST_a1fbd3cd5d924758a2c98397289876d0” that requires migration!
    However, the wizard still says that the upgrade is successful:
    The message that the upgrade is successful

    I think these warnings should be big fat errors and I will show you why.

    Before the upgrade: the script task has a ReadOnlyVariable and a ReadWriteVariable

    Before the upgrade: the script task has code in public void Main()

    After the upgrade: the script task does not have the ReadOnlyVariable nor the ReadWriteVariable

    After the upgrade: the code in public void Main() has disappeared

    As you know I am not anti-Microsoft, in the past I have thanked them for being so cool towards developers.
    But as far as this upgrade process is concerned I wonder: what must the developer at Microsoft have thought when making this a warning instead of an error? Was it monday morning, was he or she drunk or on drugs? I don’t know, the package is now cripple and will not work. That is not a warning.

    So I “warn” you, pay attention to your script tasks, if you use them.

    Use “Upgrade All Packages” option

    After upgrading, my builds started to take a lot of time (up to 4 minutes, I suspect an on-the-fly upgrade was being done). While investigating this, I found that in the project file all packages are also listed with a PackageFormatVersion per package in it, which was still 6!
    I am not sure if this is a Visual Studio bug, if the PackageFormatVersion of the dtsx file says 8, while the dtproj file says 6 for the same package.
    Anyway, after I modified the procedure slightly, my builds were fast again (20 seconds instead of 4 minutes).
    To achieve this, I opened the SSIS 2012 project using Visual Studio 2015, then chose Upgrade All Packages from the contextmenu.
    ssis-e09-422
    After that I changed the TargetServerVersion as described below.
    Sorry for any inconveniece caused by my initial article, and I hope this update will help.

    (The original post starts below)

    Introduction

    If you are intending upgrading your SQL Server installations to 2016, you’ll probably want to do the same with your SSIS packages.
    And I can tell you, coming from SQL Server 2012 or 2014 this is fairly easy to do.
    But with my experience I want to help you doing this not the wrong way.

    Problem

    I made the mistake by using the SSIS Package Upgrade Wizard while the projectversion still was SQL Server 2012.
    Funny thing is that the packages seemed to upgrade, however I got all kinds of errors and they just did not work.

    Solution

    While the solution seemed so easy.
    For this article I prepared a SSIS project made with SQL Server 2012.
    In the picture below you can how it looks like when opened with SQL Server Data Tools 2015 on a Server with SQL Server 2016 installed.

    ssis-e09-413After the project name, between brackets you can read that this is a SSIS 2012 project.

    ssis-e09-414Also, if you open a package in Notepad (be careful not to change it) you can see that the PackageFormatVersion is 6. This means SQL Server 2012.

    The simplest way to upgrade all packages in the project is to change the version of the SSIS project! Please be sure to backup your entire project and all SSIS packages first.
    To upgrade all packages, right click on the project node (in the picture on SSIS-E09-Packages (SQL Server 2012) and select Properties.
    Then select Configuration Properties and change the TargetServerVersion to SQL Server 2016.

    ssis-e09-415Changing the TargetServerVersion ..

    Then read the message, follow the instructions and click Yes.

    ssis-e09-416Upgrade instructions and warning

    The good news is, that’s basicly it! When you open a package with Notepad again, you can see that it is upgraded to PackageFormatVersion 8. This means SQL Server 2016.

    ssis-e09-418Package which is upgraded to PackageFormatVersion 8 (SQL Server 2016).

    In my experience with more complex packages than used for this demo, they all upgraded with no problems. I used all the standard SSIS components. Script Tasks and Script Components did not have to be recompiled.

    Still a warning about the Upgrade All Packages menu item, available from the SSIS Packages node. You don’t need it if you simply change the TargetServerVersion of the project as just described. If you use Upgrade All Packages with the wrong TargetServerVersion it seems to work, but the output packages do not work.

    ssis-e09-419The Upgrade All Packages menu-item

    Conclusion / Wrap up

    When you do it the right way, upgrading SSIS Packages from 2012 to 2016 is easy to do. However I did not test this with 2014, I am quite sure it will work in the exact same way.

    As you might have noticed I have changed the frequency of my blog posts, due to several circumstances. I still try to do 2 or 3 a month.

    Last updated: 27 November, 2016.

    (c) 2016 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.

    Using the project deployment model with environments (SSIS Series)

    Introduction

    Hi, as announced in my previous post of July, 14, I took a summer break. I was really useful and I enjoyed it.
    From now on I will try to write some interesting posts weekly again.
    Today I want to explain the project deployment model of SSIS, introduced already in SSIS 2012.
    Also you are gonna find a (hopefully useful) script below that creates folders, environments and environment variables, and creates connections from package parameters to environment variables.
    But that’s for later, first a little introduction.

    Folders and environments

    What I found very confusing when I started using the project deployment model, already some years ago, is that an environment in this context is not what I expected it to be.
    From a software development perspective an environment is an installed and working version of your software, optionally using one or more databases and configuration setttings exclusively for that environment. For instance a Development or a Test environment.
    But the project deployment model of SSIS does not agree with that definition.
    For the project deployment model an environment is only a set of values for all environment variables, and nothing more.
    Can you think of what that means? Suppose you have a server, both used for Development and Test, and you have databases and cubes with a “_DEV” or a “_TST” suffix to their names.
    But as far as SSIS is concerned, you can only install one version of an SSIS package! Yeah, you can execute it with a different set of parameter values stored in SSIS environment variables, but that’s it.
    You cannot have two different versions of the same package installed on the same instance of SQL Server – at least not using only SSIS Environments.
    I found it not so intuitive that you need to use folders inside the Integration Services Catalog to achieve this.
    So far the theory, let us try this stuff out together.

    Let’s try it

    I’ll just click an example together, you do not need to follow me, my goal is that you understand the concepts so that you can apply it to your own projects.
    By the way, I am using SQL Server 2016, but it should also work in SQL Server 2012 and 2014.

    My test setup is very basic:

    • I have made two databases, [SSIS_S01E08_DEV] AND [SSIS_S01E08_TST], as development- and testenvironment, respectively.
    • Both databases have a table [tbl].[Fruit], but the table in the DEV database has one extra column, [FruitDescription], to simulate a change that is under development but not deployed to test yet.
    • I have made one SSIS Package AddFruit.dtsx, that inserts a row into the table [tbl].[Fruit] passing the value for [FruitName] only. During the demo I will change the package so that it also provides the [FruitDescription] column value.

    Playing around with the example above, I can demonstrate the use of folders and environments when the package is deployed using the project deployment model.

    The initial SSIS Package

    The initial SSIS Package contains just one Execute SQL Task, one parameter named FruitName and one OLEDB Connection manager named MyDatabase.connectionmanager.
    The connection manager is configured by an expression so that the property ConnectionString gets the value of a project parameter also named ConnectionString as demonstrated in the images below.
    SSIS-S01E08-280The Execute SQL Task

    SSIS-S01E08-281The package parameter

    SSIS-S01E08-282The project parameter

    SSIS-S01E08-283The OLEDB connection manager configured with an expression
    SSIS-S01E08-284Then rebuild the Solution

    Deploying the SSIS Package

    Once the solution has been rebuilt, you can deploy the package with the Integration Services Deployment Wizard. You can also call this utility from the command line, but explaining that is out of scope for this blog post.
    To execute this wizard, you have to locate the .ispac file, it is in a subfolder of your projects bin folder.
    If you have created additional solution configurations in your Visual Studio solution, this subfolder might be different from the image below (where the subfolder is Development).
    SSIS-S01E08-285Locating the .ispac file, that was created when the solution was rebuilt.

    SSIS-S01E08-286First step of the wizard

    SSIS-S01E08-287The default deployment model is Project Deployment, the one you need. The other option is Package Deployment.

    SSIS-S01E08-288Select a server name and a path [1]

    SSIS-S01E08-289Select a server name and a path [2]: create a new folder

    SSIS-S01E08-290Select a server name and a path [3]: name the folder after your environment

    SSIS-S01E08-292Select a server name and a path [4]: then click Next

    SSIS-S01E08-293Review and deploy.

    SSIS-S01E08-295Deployment succeeded.

    SSIS-S01E08-296In the Integration Services Catalog you can find your project and package.

    SSIS-S01E08-297You could configure it now, but there are no SSIS environments yet (note: these are a set of environment variable values).

    SSIS-S01E08-298Don’t configure just yet. We will do this later.

    SSIS-S01E08-299Here you could create an environment .. don’t do it now !

    SSIS-S01E08-TST-environment_setup.sql

    --\
    ---) A script to create a folder and environment in an SSIS Catalog.
    ---) Author: Hans Michiels
    --/
    /*
    (c) Copyright 2016 - hansmichiels.com
     
    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.
     
    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    GNU General Public License for more details.
     
    You should have received a copy of the GNU General Public License
    along with this program. If not, see http://www.gnu.org/licenses/.
    */

    USE [SSISDB]
    GO

    DECLARE @environment_variables_table TABLE(
      [id] int identity(1, 1),
      [variable_name] nvarchar(128),
      [data_type] nvarchar(128),
      [sensitive] bit,
      [value] nvarchar(4000),
      [description] nvarchar(1024)
      )

    DECLARE @object_parameter_value_table TABLE(
      [id] int identity(1, 1),
      [object_type] smallint,
      [object_name] nvarchar(260),
      [parameter_name] nvarchar(128)
      )

    DECLARE @id INT
    DECLARE @max_id INT
    DECLARE @folder_name sysname
    DECLARE @environment_name nvarchar(128)
    DECLARE @environment_description nvarchar(128)
    DECLARE @project_name nvarchar(128)
    DECLARE @variable_name nvarchar(128)
    DECLARE @data_type nvarchar(128)
    DECLARE @sensitive bit
    DECLARE @value sql_variant
    DECLARE @value_tinyint tinyint
    DECLARE @description nvarchar(1024)
    DECLARE @nsql nvarchar(max)

    DECLARE @object_type smallint
    DECLARE @object_name nvarchar(260)
    DECLARE @value_type CHAR(1)
    DECLARE @parameter_name nvarchar(128)
    DECLARE @parameter_value sql_variant

    --\
    ---) Environment settings
    --/
    SELECT
      @folder_name = N'TST',
      @environment_name = N'TST',
      @environment_description = N'My demo Test environment',
      @project_name=N'SSIS_S01E08'

    --\
    ---) Script to create environment and settings.
    --/
    IF NOT EXISTS (
        SELECT 1
        FROM [SSISDB].[internal].[environments]
        WHERE [environment_name] = @environment_name
        )
    BEGIN
        EXEC [SSISDB].[catalog].[create_environment] @folder_name, @environment_name, @environment_description;
    END

    INSERT INTO @environment_variables_table
      ( [variable_name], [data_type], [sensitive], [value], [description] )
      VALUES
          ( N'ConnectionString', N'String', 0, N'Data Source=.;Initial Catalog=SSIS_S01E08_TST;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;', N'Connection string for the database.' )
        , ( N'FruitName', N'String', 0, N'Tamarillo', N'' )
        --, ( N'Int32Example', N'Int32', 0, N'1', N'' )
        --, ( N'BooleanExample', N'Boolean', 0, N'1', N'' )
        --, ( N'ByteExample', N'Byte', 0, N'5', N'' )

    SELECT @id = 1, @max_id = MAX([id]) FROM @environment_variables_table
    WHILE @id <= @max_id
    BEGIN
        SELECT
          @variable_name = v.variable_name,
          @data_type = v.data_type,
          @sensitive = v.sensitive,
          @value = v.value,
          @description = v.[description]
        FROM @environment_variables_table v
        WHERE [id] = @id;

        IF EXISTS (
            SELECT 1
            FROM [SSISDB].[internal].[environment_variables] v
            JOIN [SSISDB].[internal].[environments] e ON e.environment_id = v.environment_id
            WHERE v.[name] = @variable_name
            AND e.environment_name = @environment_name
            )
        BEGIN
            SET @nsql = N'EXECUTE [catalog].[delete_environment_variable] '
              + N'@folder_name = N'''+ @folder_name + ''', @environment_name = N''' + @environment_name + ''', '
              + N'@variable_name = N''' + @variable_name + ''''
            PRINT @nsql;
            EXEC sp_executesql @nsql;
        END

        PRINT '/*'
        PRINT @variable_name
        PRINT CONVERT(nvarchar(128), SQL_VARIANT_PROPERTY(@value, 'BaseType'));
        PRINT '*/'

        SET @nsql = N'EXECUTE [catalog].[create_environment_variable] '
          + N'@folder_name = N'''+ @folder_name + ''', @environment_name = N''' + @environment_name + ''', '
          + N'@variable_name = N'''+ @variable_name + ''', @data_type = N''' + @data_type + ''', '
          + N'@sensitive = ' + CONVERT(NVARCHAR, @sensitive) + ', @description = N''' + @description + ''', '
          + CHAR(13) + CHAR(10) + N'@value = ' +
          CASE UPPER(@data_type)
          WHEN 'String' THEN 'N''' + CONVERT(NVARCHAR(1000), @value) + ''' '
          ELSE CONVERT(NVARCHAR(1000), @value)
          END + '; '
        PRINT @nsql;
        EXEC sp_executesql @nsql;

        SET @id = @id + 1
    END

    --\
    ---) Add environment reference to project.
    --/
    Declare @reference_id bigint
    IF NOT EXISTS(SELECT 1
      FROM [SSISDB].[internal].[environment_references] r
      JOIN [SSISDB].[internal].[projects] p
        ON p.project_id = r.project_id
      WHERE p.name = @project_name
        AND r.environment_name = @environment_name
      )
    BEGIN
        EXEC [SSISDB].[catalog].[create_environment_reference]
            @environment_name=@environment_name, @reference_id=@reference_id OUTPUT,
            @project_name=@project_name, @folder_name=@folder_name, @reference_type='R';
        --Select @reference_id
    END

    --\
    ---) Connect environment variables to package parameters, based on the name
    --/
    INSERT INTO @object_parameter_value_table (
        [object_type],
        [object_name],
        [parameter_name]
        )
    SELECT
        prm.[object_type],
        prm.[object_name],
        prm.[parameter_name]
    FROM
        [SSISDB].[internal].[object_parameters] prm
    JOIN
        [SSISDB].[internal].[environment_variables] ev
        ON ev.name = prm.parameter_name
    JOIN
        [SSISDB].[internal].[projects] prj
        ON prj.project_id = prm.project_id
    WHERE
        prm.[value_type] != 'R'
    AND
        prm.value_set = 0
    AND
        prj.name = @project_name
    ORDER BY
        prm.object_name, prm.parameter_name

    SELECT @id = 1, @max_id = MAX([id]) FROM @object_parameter_value_table
    WHILE @id <= @max_id
    BEGIN
        SELECT
          @object_type = v.[object_type],
          @object_name = v.[object_name],
          @parameter_name = v.[parameter_name]
        FROM @object_parameter_value_table v
        WHERE [id] = @id;

        SELECT @value_type = 'R', @parameter_value = @parameter_name;
        
        EXEC [SSISDB].[catalog].[set_object_parameter_value]
          @object_type, @folder_name, @project_name, @parameter_name,
          @parameter_value, @object_name, @value_type

        SET @id = @id + 1
    END

    GO

    My script that facilitates to set up multiple environments quickly (download all the used scripts here).

    In the demo it would not be a problem to do all the configuration by hand, but if you have a lot of SSIS Packages with a lot of parameters, you really do not want to do this, especially not if you have different environments, like Development, Test and Production.
    This is why I made a script that automates this work for me:

    • It creates an environment, if it doesn’t exist yet.
    • It (re)creates environment variables, defined in the script.
    • It adds an environment reference to the project, if it doesn’t exist yet.
    • It connects environment variables to package parameters, based on the name.
      For this to work, it is important that package parameters with the same name used in different packages mean the same thing!
      For instance if you have a parameter SourceFolder used in different packages, it should be the same folder! If not, rename the parameters so that they have a unique name and – when mapped to environment variables – can both be assigned a different value, e.g. CrmSourceFolder and ErpSourceFolder.

    We run the script for the TST environment.

    SSIS-S01E08-300The script was executed successfully.

    SSIS-S01E08-301After the script has run, a TST folder with an TST environment inside exists. You can view or edit its properties.

    SSIS-S01E08-302When the Variables pane is selected on the left, the environment variables, created by the script, are shown.

    SSIS-S01E08-303View package configuration.

    SSIS-S01E08-304The package parameter FruitName is now connected to environment variable FruitName.

    SSIS-S01E08-305When I choose Execute ..

    SSIS-S01E08-306… all I have to do is check Environment, it will automaticly show .\TST

    SSIS-S01E08-307After execution of the package, a row exists in the table (and one from a previous test)

    SSIS-S01E08-308We can also execute the package with a SQL Agent Job ..

    SSIS-S01E08-310Execute the package with a SQL Agent Job: all I have to do is check Environment, it will automaticly show .\TST ..

    SSIS-S01E08-311.. and all values are connected properly ..

    SSIS-S01E08-312And yes, it works!

    Extending the package with a FruitDescription

    I must admit, I cheated a little. I skipped the part of deploying the first version of the SSIS Package to the DEV environment.
    So now I am going to deploy to a new DEV Folder and Environment for the first time, while it is will be the second version of the package. This is not a recommended practice in real projects.
    Anyway, the package gets an extra parameter FruitDescription.
    SSIS-S01E08-313Added FruitDescription

    SSIS-S01E08-314Modify the Execute SQL Task to insert also the value for @FruitDescription [1]

    SSIS-S01E08-315Modify the Execute SQL Task to insert also the value for @FruitDescription [2]

    SSIS-S01E08-316Then Rebuild

    SSIS-S01E08-317Then run deployment wizard again, but now create a DEV folder

    SSIS-S01E08-318Script modifications [1]: use a DEV folder and a DEV environment name.

    SSIS-S01E08-319Add FruitDescription to the list of environment variables and change the values: connection string must use the DEV database and fruitname and -description something different than in the TST environment.

    Then run the script.
    SSIS-S01E08-320The package was deployed using the deployment wizard.

    SSIS-S01E08-321SSIS Environment was created using the script

    SSIS-S01E08-322Environment variables where added using the script

    SSIS-S01E08-323Environment variables were connected to package parameters using the script

    SSIS-S01E08-324Package execution test: note the new parameter FruitDescription

    SSIS-S01E08-325Test result

    SSIS-S01E08-326Final result: different versions of the same SSIS Package installed side by side on the same machine. But in addition to SSIS Environments with variable values, folders inside the Integration Services Catalog are used for that!

    Download the scripts here.

    Conclusion / Wrap up

    With the project deployment model of SSIS, introduced in SQL Server 2012, deployment and configuration of SSIS Packages has become easier.
    However it is important to understand how folders and environments in the Integration Services Catalog work. I have tried to explain this, and provided a script to set up multiple environments quickly.

    Why did you use those weird fruit examples?

    D=Development
    T=Test
    😉
    Read more here:
    Durian
    Tamarillo

    (c) 2016 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.

    Implement logging in a Script Task (SSIS Series)

    Introduction

    This post is a follow-up on two older posts and will be the last one before my summer break.

    Today I want to enhance the SSIS package made in post Fixing corrupt csv files in a SSIS Package (SSIS Series) by using the Plug and Play Logging Solution.

    Problem

    As long as a script runs well, you might not need logging inside. BUT .. when something GOES wrong, having logging in the script can reduce the time needed to troubleshoot the issue.
    In the example for fixing csv files, anything can be wrong with a file, causing an error in the script. In that case it is for instance useful to know which file was being processed when the error occurred.

    Prerequisite

    For this to work, you have to install my Plug and Play Logging Solution, which can be downloaded from this post. It is just one SQL Script you have to run, that creates a [logdb] database with objects inside. You can also add those objects to a different database that already exists.
    It is however not mandatory to rebuild the Fixing corrupt csv files in a SSIS Package example, as you can also add logging to your own Script Task.

    Solution

    This post is a enhancement on the Fixing corrupt csv files in a SSIS Package post, so you might need to read that post first, if you did not do that already.

    Therefore I will dive directly into the changes that are needed to add logging in the Script Task.
    This will be done with lots of screenprints with some comment in between.

    SSIS-S01E07-169Adding a package parameter “LoggingConnectionString”
    First you need to add a package parameter LoggingConnectionString that can be used in an expression of the OLE DB Connection and as input variable for the Script Task.

    SSIS-S01E07-170OLE DB Connection configuration ..
    Then add an OLE DB Connection for the logging database.

    SSIS-S01E07-171OLE DB Connection properties ..
    Use an expression to let the ConnectionString of the OLE DB Connection be equal to the value of the package parameter LoggingConnectionString.

    SSIS-S01E07-173OLE DB Connection – Expression configuration [1]

    SSIS-S01E07-172OLE DB Connection – Expression configuration [2]

    SSIS-S01E07-174Final result
    By the fx icon you can see that the connection manager uses an expression.

    SSIS-S01E07-176Enable logging
    Now enable logging for the package.

    SSIS-S01E07-177Enable logging [2]
    Add a log provider for SQL Server and let it use the logdb.oledbConnection by selecting it under the Configuration column header.

    SSIS-S01E07-178Enable logging [3]
    Then select all events. Filtering on what is actually logged is done by the logging solution (by the value of @MaxMessageClass, see this blog post for more details).

    SSIS-S01E07-179Select the Script Task
    Select the Script Task and add the following Variables to ReadOnlyVariables:

    • System::ExecutionInstanceGUID
    • System::PackageID
    • $Package::LoggingConnectionString


    • SSIS-S01E07-180The added ReadOnlyVariables in the red rectangles

      Below you will find a number of screenprints of the script task to talk you through the changes.
      You can download the C# script here.

      SSIS-S01E07-181Overview
      First make sure the Namespaces region is as shown.
      Then fold the namespace with the guid in the name, and paste the entire namespace HansMichielsCom.PlugAndPlaySSISLoggingSolution underneath it.
      This code is put in a separate namespace, so that it could also be placed in a .NET assembly that is added to the GAC (Global Assembly Cache). When you would do this, you do not have to add the code to every Script Task.
      For the example of today, we just put this namespace inside the Script Task to make things not too complicated for now.

      SSIS-S01E07-182Using the HansMichielsCom.PlugAndPlaySSISLoggingSolution namespace
      As a result, you have to tell the first guid-like namespace, that you want to call code inside the second namespace. Therefore add the using statement as shown above.

      SSIS-S01E07-183Constant used for logging
      Below you will see some printscreens with changed parts in the script.

      SSIS-S01E07-184Method GetSsisLogWriter to instantiate a SsisLogWriter object

      SSIS-S01E07-187Method Main is extended with logging.

      SSIS-S01E07-188Pass the logWriter as parameter to other methods ..

      SSIS-S01E07-189IMPORTANT: Bugfix in CheckAndReturnHeader!

      SSIS-S01E07-190IMPORTANT: Bugfix in CheckAndReturnHeader!
      (header == null) is added to cope with empty files.

      Testing, testing, one, two ..



      SSIS-S01E07-191Test preparations [1]

      SSIS-S01E07-193Test preparations [2]

      SSIS-S01E07-194Test execution

      SSIS-S01E07-195Test result: logging rows done inside the script are in the log table.

      Conclusion / Wrap up

      In this post I have demonstrated how to implement logging in SSIS Script Tasks using my Plug and Play Logging Solution.
      This type of logging gives more control on what to log and how to log it than when you implement logging using SSIS events.
      The examples given are very basic. You can use your imagination to implement logging of errors using a try .. catch block, or use all available parameters of logWriter.AddLogEntry to change the Retention Class, Message Class, and so on.

      In the summer I will take some time for study, reflection, holiday, and still .. work.
      My next post will be early September at the latest, maybe earlier.

      Download the C# script here.

      (c) 2016 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.

    Zip sourcefiles in a SSIS Package (SSIS Series)

    Introduction

    As I am in a (un)zipping mood, this week the opposite problem of last week.
    The example is not so ready-to-use as in my previous post, as it depends on choices made in your development process of other SSIS packages.
    Still I hope you can adjust this example to match your needs.

    Problem

    If you have imported and archived (csv- or xml) files, these files can take up a lot of disk space. The archive folder can also become overwhelmingly full with ten thousands of files, which can make it slowly accessible and difficult to manage.

    Solution

    To tackle this problem I have made a fully configurable SSIS package, that can be executed from a different package with an Execute Package Task.
    The package is made using SSIS 2014.

    The zipping itself is done with 7za.exe, for more information on how to get and install this free program see my previous blog post.

    Scenario

    When I used this concept, a different SSIS package moved files to a subfolder of an archive folder after they were processed, but their name was prefixed with the LoadCycleID with leading zeros, for instance 001345_customers.xml.

    The archive folder and subfolder were separated, because the archive folder was an environment variable and the same for all packages, while the subfolder was different for different SSIS packages, and contained the name of the source system of the files.
    If you only have one archive folder, just change the example to use that instead of the concatenation of archive folder and subfolder.

    Overview

    Here is a picture of the end result:

    SSIS-S01E06-124
    The Control Flow

    Detailed steps

    Parameters and variables

    First you need to add a few parameters and variables to the package.

    Parameters:

    Name Data type Required Description
    _7zaPathAndFilename String False The full path and filename of 7za.exe.
    ArchiveFolder String True The root folder for archiving processed files.
    ArchiveSubFolder String True The subfolder of the ArchiveFolder in which the files to be zipped exist, the subfolder can for instance be named after a source system like SAP or CRM.
    LoadCycleID Integer True The ID of the LoadCycle that is processing these files. In your situation it could also be called ExecutionID, SessionID and so on, the name is not really important.

    SSIS-S01E06-125
    Package parameters

    Variables:

    However not all variables might be necessary, I like the concept of being able to configure all values in one place.
    The data type of all variables is String, except LoadCycleID which is an Integer.
    Every variable uses an expression to set its value.
    Here is a detailed list of all variables and the expression used for each one.

    Name Expression
    ArchiveFilesToDeleteInFolder @[$Package::ArchiveFolder] + "\\" + @[$Package::ArchiveSubFolder]
    ArchiveFilesToDeletePattern @[User::LoadCycleIDString] + "*"
    ArchiveFilesToZipPattern @[$Package::ArchiveFolder] + "\\" + @[$Package::ArchiveSubFolder] + "\\" + @[User::LoadCycleIDString] + "*.*"
    ArchiveFilesWorkingDirectory @[$Package::ArchiveFolder]
    ArchiveZipFile @[User::LoadCycleIDString] + "_" + @[$Package::ArchiveSubFolder] + ".zip"
    ArchiveZipPathFile_Final @[$Package::ArchiveFolder] + "\\" +@[$Package::ArchiveSubFolder] + "\\" + @[User::LoadCycleIDString] + "_" + @[$Package::ArchiveSubFolder] + ".zip"
    ArchiveZipPathFile_Temp @[$Package::ArchiveFolder] + "\\" + @[User::LoadCycleIDString] + "_" + @[$Package::ArchiveSubFolder] + ".zip"
    LoadCycleID @[$Package::LoadCycleID]
    LoadCycleIDString RIGHT("00000" + (DT_STR, 10, 1252) @[$Package::LoadCycleID], 6)

    Important note: if you expect that the value for LoadCycleID will grow beyond 999999 over time, you should add more prefixed zero’s to LoadCycleIDString and increase the number 6, the second parameter of the RIGHT function.

    SSIS-S01E06-126
    SSIS variables of the package

    The Control Flow

    The control flow is quite straight forward, only three tasks and we’re done.

    EPR Zip files in archive subfolder

    This task creates one zipfile containing multiple other files in the ArchiveSubFolder.
    From the SSIS Toolbox drag an Execute Process Task to the Control Flow panel and open the Execute Process Task Editor.
    On the General and Process panes you do not have to change anything.

    SSIS-S01E06-128
    SSIS-S01E06-129
    No changes needed here

    On the Expressions pane, press the ellipsis next to Expressions. In the Property Expressions Editor add the following:

    Property Expression
    Arguments "a "+ "\"" + @[User::ArchiveZipFile] + "\" " + "\"" + @[User::ArchiveFilesToZipPattern] + "\" -mx=9"
    Executable @[$Package::_7zaPathAndFilename]
    WorkingDirectory @[User::ArchiveFilesWorkingDirectory]

    Please note that in the Arguments expression -mx=9 is used. This is the compression level, level 9 means “ultra” compression. This level gives the smallest zip files but is also the slowest compression level. You can for instance also use 5 for “normal” or 3 for “fast” compression, if this is what you want.

    SSIS-S01E06-127
    Expressions for the Execute Process Task

    SCR Delete uncompressed files

    This task deletes the files that were added to the zip file in the previous task.

    In the Script Task Editor, add both variables User::ArchiveFilesToDeleteInFolder and User::ArchiveFilesToDeletePattern to ReadOnlyVariables, then press Edit Script ….

    Then in the namespaces region add one namespace, as indicated below.

    Namespaces region
    #region Namespaces
    using System;
    using System.Data;
    using Microsoft.SqlServer.Dts.Runtime;
    using System.Windows.Forms;
    // Added namespace:
    using System.IO;
    #endregion

    Adjust public void main() as follows:

    public void Main

    public void Main()
    {
      // Copy SSIS variable values to C# string variables.
      string archiveFilesToDeleteInFolder = Dts.Variables["User::ArchiveFilesToDeleteInFolder"].Value.ToString();
      string archiveFilesToDeletePattern = Dts.Variables["User::ArchiveFilesToDeletePattern"].Value.ToString();

      // Delete each file that just has been added to the zip archive.
      foreach (string f in Directory.EnumerateFiles(archiveFilesToDeleteInFolder, archiveFilesToDeletePattern))
      {
        File.Delete(f);
      }

     Dts.TaskResult = (int)ScriptResults.Success;
    }


    WARNING: Make sure that no files that meet the file name pattern of files to add to the zip file are added to the ArchiveSubFolder while or after the zipfile is (being) created. Otherwise files that are NOT zipped might get deleted!

    Download the script here.

    FST Move zip file

    The File System Task is to move the zip file to the ArchiveSubFolder, because the zipfile is initially created in a temporary folder and not in the ArchiveSubFolder.
    This is to prevent that it would be deleted by coincidence, if you use a value for ArchiveFilesToDeletePattern that ends with .*
    Configure the File System Task as shown in the picture below:

    SSIS-S01E06-130
    File System Task Configuration

    Download the script here.

    Let the car drive

    Time to test the package and tell the result in pictures (if you don’t mind).

    SSIS-S01E06-131
    Before: files of LoadCycleID 001346 need to be zipped ..

    SSIS-S01E06-132
    Execute the package ..

    SSIS-S01E06-133
    All csv files replaced with one zip file ..

    SSIS-S01E06-134
    .. that contains the csv files ..

    Conclusion / Wrap up

    In this blog post you could read the steps needed to build a SSIS package that creates a zip file containing a number of files in an archive(sub)folder.
    This makes the archive smaller and easier to manage.
    However beware that you only delete files that were added to the zip file. For more details see the warning above.

    (c) 2016 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.

    Unzip specific files from lots of zipped archives (SSIS Series)

    Introduction

    This week I want to discuss the trouble you may have when searching hundreds of zip files for specific files inside them.
    Recognizable?
    At least I faced this problem some time ago, and solved it using SSIS and the commandline version of 7-zip, named 7za.exe.
    The choice for SSIS was purely based on convenience, as you could tackle this problem also using Powershell, C# or almost any other scripting language.
    But I thought SSIS is just a neat and quick way to do this.

    Problem

    Finding specific files in huge amounts of zip files.

    Solution

    To tackle this problem I have made a fully configurable SSIS package, that can be executed with different parameters.
    The main task of the package is to extract specific files (based on a file name pattern) from zipfiles in a folder and – optionally – subfolders.
    The unzipping itself is done with 7za.exe, which is the commandline version of 7z, the great open source zip program, that exists already for several years and is used by millions of users.

    By the way, the package is made using SSIS 2014.

    Preparations: download and install 7za

    It is important to download this package from the official 7z web site, just to avoid that you would get a copy from some deceitful web site infected with malware.
    To go to the official 7-zip download page click on this link:
    7-zip download page
    Most important is the 7-Zip Extra: standalone console version.
    For the example I have used version 16.02 (2016-05-21) for Windows, but it should also work with version 9.20 (2010-11-18), then called 7-Zip Command Line Version.

    If you download the newest version, you also need software to unzip the 7z160-extra.7z file, for instance the regular 32-bit or 64-bit version of 7-zip, which can also be downloaded on the same page.

    SSIS-S01E05-100
    Download page of 7-zip

    I assume you will use the newest version 7z160-extra.7z
    Once unzipped, just take the time to read License.txt, readme.txt and history.txt.
    Then copy the three files 7zxa.dll, 7za.dll and 7za.exe to a folder on the Windows computer where SSIS runs, usually a Windows Server.
    If you run a x64 version of Windows, you can take the x64 version of the 7za files, which are in a subfolder “x64” of the unzipped files.

    SSIS-S01E05-101
    Unzipped 7za files ..

    I have created a subfolder “7za” in the “C:\Program files” folder of my virtual Windows Server, and copied the x64-version of the three mentioned files to it, but you can put them in any folder you like.
    When you have done this, the real fun can start ..

    Overview

    Here is a picture of the end result:

    SSIS-S01E05-102
    The Control Flow

    Detailed steps

    Parameters and variables

    First you need to add a few parameters and variables to the package.

    Parameters:

    Name Data type Required Description
    _7zaPathAndFilename String False The full path and filename of 7za.exe.
    FilenamePattern String True The filename pattern of files you want to unzip.
    SourceRootFolder String True The main folder where you want to search for zip files.
    TraverseSubfolders Boolean False Indication if you want to look only in SourceRootFolder, or also in its subfolders.
    UnzipRootFolder String True The folder to which files are unzipped. Important: this may NOT be a subfolder of SourceRootFolder.
    ZipFilenamePattern String True The filename pattern of the zipfiles from which you want to unzip files.

    SSIS-S01E05-103
    Package parameters

    Variables:

    Name Data type Description
    FLC_UnzipFolder String The subfolder where the files will be unzipped. This is a subfolder of the UnzipRootFolder, with the same relative subpath as the zipfile has in SourceRootFolder. Additionally, the zipfilename has been replaced with a subfolder that is the same as the zipfilename without the file extension.
    FLC_ZipPathAndFileName String The zipfile full path, that is target of investigation if any files with FilenamePattern are zipped inside.

    SSIS-S01E05-104
    SSIS variables needed to do the job

    Developing the package

    FELC Unzip files

    This foreach loop container finds all zip files.
    In the Foreach Loop Editor, select Collection in the left pane.
    As we want to make the solution generic by using all those parameters mentioned above, we have to add a few expressions.

    SSIS-S01E05-105
    Foreach Loop container Collection configuration

    Add the following three expressions (in my opinion the editor is not very userfriendly, the window is too small, but it can be done):

    Property Expression
    Directory @[$Package::SourceRootFolder]
    FileSpec @[$Package::ZipFilenamePattern]
    Recurse @[$Package::TraverseSubfolders]

    Please note that these expressions will set the values for Folder, Files and the Traverse subfolders checkbox in the Enumerator configuration part of the screen (inside the purple square in the picture above).
    If you close and reopen the Foreach Loop Editor, you will notice that the configuration fields now use the values from the corresponding parameters, except the value Fully qualified for Retrieve file name, which is hardcoded.

    SSIS-S01E05-106
    Example of how to set an expression..

    Then, still in the Foreach Loop Editor, select Variable Mappings in the left pane and map variable User::FLC_ZipPathAndFileName to Index 0, as shown in the picture below.

    SSIS-S01E05-107
    Variable Mappings

    EXPR Set Unzip folder

    The next nifty thing that needs to be done, is creating a folder to store the unzipped files. As we do not know if zipped files with the same name exist in multiple zip files, the safest thing to do is to create a directory structure identical to the directory structure of the zip files.
    Additionally, the basename of the zip file (so without file extension) is used as an extra subfolder. In this way we are sure that unzipped files with the same name can be stored without overwriting other ones.

    To do this, from the SSIS Toolbox drag an Expression Task inside the Foreach Loop container.
    Open the Expression Task and paste this expression into the Expression textbox.

    @[User::FLC_UnzipFolder] =
    REPLACE(LEFT(@[User::FLC_ZipPathAndFileName],
    FINDSTRING(@[User::FLC_ZipPathAndFileName], "."
    + TOKEN(@[User::FLC_ZipPathAndFileName], ".",
    TOKENCOUNT(@[User::FLC_ZipPathAndFileName], ".")),1) -1),
    @[$Package::SourceRootFolder], @[$Package::UnzipRootFolder])

    Then press the Evaluate Expression button to check that the expression is valid.

    SSIS-S01E05-108
    Configuring the Expression Task

    FSYS Create Unzip folder

    In the previous step we have composed the full path to unzip files into, but this folder structure might not exist yet.
    Therefore we use A File System Task named FSYS Create Unzip folder to create the folder structure.
    The good news is that the File System Task can create multiple (sub)folders at once.
    Configure it as shown in the picture.
    SSIS-S01E05-109
    Configuration of FSYS Create Unzip folder

    EPR Unzip Selected Files

    Below you see a few screenshots for the configuration of the Execute Process Task named EPR Unzip Selected Files.
    The funny thing is, that in the Process pane, you do not have to change anything. Executable and Arguments are set using expressions (coming next) and all the other properties can keep their default values.

    SSIS-S01E05-110
    Process configuration.

    SSIS-S01E05-111
    Setting Executable and Arguments using expressions
    Arguments (copy-friendly):

    "-y x \"" + @[User::FLC_ZipPathAndFileName] + "\" -o" + @[User::FLC_UnzipFolder] + " " + @[$Package::FilenamePattern] + " -r"

    SCR Remove Empty folders

    There is one problem with the steps done so far. If a zipfile does not contain a file that matches the FilenamePattern, then the folderstructure is made for nothing.
    However you could just leave these empty folders, I think the solution is a bit better if these empty folders are deleted.
    To do this, from the SSIS Toolbox drag an Script Task inside the Foreach Loop container.

    SSIS-S01E05-112
    In the Script Task Editor, add both variables User::FLC_UnzipFolder and $Package::UnzipRootFolder to ReadOnlyVariables

    In the Script Task Editor window, press Edit Script ….

    Please note that you can download the C# script here..

    Then in the namespaces region add two namespaces, as indicated below.

    Namespaces region
    #region Namespaces
    using System;
    using System.Data;
    using Microsoft.SqlServer.Dts.Runtime;
    using System.Windows.Forms;
    // Added namespaces:
    using System.IO;
    using System.Linq;
    #endregion

    Adjust public void main() as follows:

    public void Main

    public void Main()
    {
      // Copy SSIS variable value of FLC_UnzipFolder to C# string variable.
      string unzipFolder = Dts.Variables["User::FLC_UnzipFolder"].Value.ToString();

      // Source: http://stackoverflow.com/questions/22520761/delete-all-empty-directories-in-specified-path
      // My credits go to author "Jon" from Berlin.
      // Because not all zipfiles might contain files with the FilenamePattern we are looking for, an empty path might have been created.
      // This script deletes these empty folders, so that in the end only folders will exist with unzipped files or subfolders in it.
      var di = new DirectoryInfo(unzipFolder);
      var root = Dts.Variables["$Package::UnzipRootFolder"].Value.ToString(); // no trailing slash!
      while (di.FullName != root
              && !di.EnumerateFiles().Any()
              && !di.EnumerateDirectories().Any())
      {
        di.Delete();
        di = di.Parent;
      }

     Dts.TaskResult = (int)ScriptResults.Success;
    }

    Download the script here.

    The proof of the pudding is in the eating

    Time to do some testing.
    I have created a few folders with subfolders and subsubfolders. You can see the folder structure in the last picture below.
    As you might have noticed the default values for my parameters already prepared for the testcase: I want to unzip only customer files from zip files with 201605 in the name.

    SSIS-S01E05-113
    Executing the package in Debug mode..

    Below you can see the result. Only customer files are unzipped. The subfolders under the Archive folder that contain no customer zip files (but only currency files), do not exist under the Unzipped folder.

    SSIS-S01E05-114
    Folder structure of Archive and Unzipped rootfolders..

    SSIS-S01E05-115
    Unzipped “Customer” files ..

    Important notes for using with huge amounts of zip files

    From my experience I must also warn you. If you run this package on a folder containing gigabytes of zipfiles, there are two things to be aware of:

    1. The execution of the package can potentially take a long time, several hours is not unusual.
    2. Make sure you have enough disk space. Suppose you have the following hypothetical scenario: zip files: 100GB, average zip compression rate: 20%, percentage of files to be unzipped: 30%. In this case you are going to unzip files that have a zipped size of 30% of 100GB, is 30GB. Once unzipped, their size will be 30GB * (1/0,20) = 150GB. You need to have that disk space available on the disk where the Unzipped folder resides.

    Conclusion / Wrap up

    With this SSIS package you can selectively unzip files from archived zip files.
    When using it, make sure you have enough disk space, and be aware that the package can run for several hours, depending on the number and (average) size of the zip files to process.

    (c) 2016 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.