Tag Archives: zip

Free download: Blog Yearbook 2016

I have made a compilation of all my blog posts of the past year, and it was more work than you would think.
Especially because it also contains a non-published new article.

After filling in the form you can download the e-book for free.

    I promise you the following:

    • I will not give or sell your email address to any third party.
    • I will not spam you.
    • If I start with a newsletter, you can easily unsubscribe or change the frequency.

    (c) 2016 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.

    Zip sourcefiles in a SSIS Package (SSIS Series)

    Introduction

    As I am in a (un)zipping mood, this week the opposite problem of last week.
    The example is not so ready-to-use as in my previous post, as it depends on choices made in your development process of other SSIS packages.
    Still I hope you can adjust this example to match your needs.

    Problem

    If you have imported and archived (csv- or xml) files, these files can take up a lot of disk space. The archive folder can also become overwhelmingly full with ten thousands of files, which can make it slowly accessible and difficult to manage.

    Solution

    To tackle this problem I have made a fully configurable SSIS package, that can be executed from a different package with an Execute Package Task.
    The package is made using SSIS 2014.

    The zipping itself is done with 7za.exe, for more information on how to get and install this free program see my previous blog post.

    Scenario

    When I used this concept, a different SSIS package moved files to a subfolder of an archive folder after they were processed, but their name was prefixed with the LoadCycleID with leading zeros, for instance 001345_customers.xml.

    The archive folder and subfolder were separated, because the archive folder was an environment variable and the same for all packages, while the subfolder was different for different SSIS packages, and contained the name of the source system of the files.
    If you only have one archive folder, just change the example to use that instead of the concatenation of archive folder and subfolder.

    Overview

    Here is a picture of the end result:

    SSIS-S01E06-124
    The Control Flow

    Detailed steps

    Parameters and variables

    First you need to add a few parameters and variables to the package.

    Parameters:

    Name Data type Required Description
    _7zaPathAndFilename String False The full path and filename of 7za.exe.
    ArchiveFolder String True The root folder for archiving processed files.
    ArchiveSubFolder String True The subfolder of the ArchiveFolder in which the files to be zipped exist, the subfolder can for instance be named after a source system like SAP or CRM.
    LoadCycleID Integer True The ID of the LoadCycle that is processing these files. In your situation it could also be called ExecutionID, SessionID and so on, the name is not really important.

    SSIS-S01E06-125
    Package parameters

    Variables:

    However not all variables might be necessary, I like the concept of being able to configure all values in one place.
    The data type of all variables is String, except LoadCycleID which is an Integer.
    Every variable uses an expression to set its value.
    Here is a detailed list of all variables and the expression used for each one.

    Name Expression
    ArchiveFilesToDeleteInFolder @[$Package::ArchiveFolder] + "\\" + @[$Package::ArchiveSubFolder]
    ArchiveFilesToDeletePattern @[User::LoadCycleIDString] + "*"
    ArchiveFilesToZipPattern @[$Package::ArchiveFolder] + "\\" + @[$Package::ArchiveSubFolder] + "\\" + @[User::LoadCycleIDString] + "*.*"
    ArchiveFilesWorkingDirectory @[$Package::ArchiveFolder]
    ArchiveZipFile @[User::LoadCycleIDString] + "_" + @[$Package::ArchiveSubFolder] + ".zip"
    ArchiveZipPathFile_Final @[$Package::ArchiveFolder] + "\\" +@[$Package::ArchiveSubFolder] + "\\" + @[User::LoadCycleIDString] + "_" + @[$Package::ArchiveSubFolder] + ".zip"
    ArchiveZipPathFile_Temp @[$Package::ArchiveFolder] + "\\" + @[User::LoadCycleIDString] + "_" + @[$Package::ArchiveSubFolder] + ".zip"
    LoadCycleID @[$Package::LoadCycleID]
    LoadCycleIDString RIGHT("00000" + (DT_STR, 10, 1252) @[$Package::LoadCycleID], 6)

    Important note: if you expect that the value for LoadCycleID will grow beyond 999999 over time, you should add more prefixed zero’s to LoadCycleIDString and increase the number 6, the second parameter of the RIGHT function.

    SSIS-S01E06-126
    SSIS variables of the package

    The Control Flow

    The control flow is quite straight forward, only three tasks and we’re done.

    EPR Zip files in archive subfolder

    This task creates one zipfile containing multiple other files in the ArchiveSubFolder.
    From the SSIS Toolbox drag an Execute Process Task to the Control Flow panel and open the Execute Process Task Editor.
    On the General and Process panes you do not have to change anything.

    SSIS-S01E06-128
    SSIS-S01E06-129
    No changes needed here

    On the Expressions pane, press the ellipsis next to Expressions. In the Property Expressions Editor add the following:

    Property Expression
    Arguments "a "+ "\"" + @[User::ArchiveZipFile] + "\" " + "\"" + @[User::ArchiveFilesToZipPattern] + "\" -mx=9"
    Executable @[$Package::_7zaPathAndFilename]
    WorkingDirectory @[User::ArchiveFilesWorkingDirectory]

    Please note that in the Arguments expression -mx=9 is used. This is the compression level, level 9 means “ultra” compression. This level gives the smallest zip files but is also the slowest compression level. You can for instance also use 5 for “normal” or 3 for “fast” compression, if this is what you want.

    SSIS-S01E06-127
    Expressions for the Execute Process Task

    SCR Delete uncompressed files

    This task deletes the files that were added to the zip file in the previous task.

    In the Script Task Editor, add both variables User::ArchiveFilesToDeleteInFolder and User::ArchiveFilesToDeletePattern to ReadOnlyVariables, then press Edit Script ….

    Then in the namespaces region add one namespace, as indicated below.

    Namespaces region
    #region Namespaces
    using System;
    using System.Data;
    using Microsoft.SqlServer.Dts.Runtime;
    using System.Windows.Forms;
    // Added namespace:
    using System.IO;
    #endregion

    Adjust public void main() as follows:

    public void Main

    public void Main()
    {
      // Copy SSIS variable values to C# string variables.
      string archiveFilesToDeleteInFolder = Dts.Variables["User::ArchiveFilesToDeleteInFolder"].Value.ToString();
      string archiveFilesToDeletePattern = Dts.Variables["User::ArchiveFilesToDeletePattern"].Value.ToString();

      // Delete each file that just has been added to the zip archive.
      foreach (string f in Directory.EnumerateFiles(archiveFilesToDeleteInFolder, archiveFilesToDeletePattern))
      {
        File.Delete(f);
      }

     Dts.TaskResult = (int)ScriptResults.Success;
    }


    WARNING: Make sure that no files that meet the file name pattern of files to add to the zip file are added to the ArchiveSubFolder while or after the zipfile is (being) created. Otherwise files that are NOT zipped might get deleted!

    Download the script here.

    FST Move zip file

    The File System Task is to move the zip file to the ArchiveSubFolder, because the zipfile is initially created in a temporary folder and not in the ArchiveSubFolder.
    This is to prevent that it would be deleted by coincidence, if you use a value for ArchiveFilesToDeletePattern that ends with .*
    Configure the File System Task as shown in the picture below:

    SSIS-S01E06-130
    File System Task Configuration

    Download the script here.

    Let the car drive

    Time to test the package and tell the result in pictures (if you don’t mind).

    SSIS-S01E06-131
    Before: files of LoadCycleID 001346 need to be zipped ..

    SSIS-S01E06-132
    Execute the package ..

    SSIS-S01E06-133
    All csv files replaced with one zip file ..

    SSIS-S01E06-134
    .. that contains the csv files ..

    Conclusion / Wrap up

    In this blog post you could read the steps needed to build a SSIS package that creates a zip file containing a number of files in an archive(sub)folder.
    This makes the archive smaller and easier to manage.
    However beware that you only delete files that were added to the zip file. For more details see the warning above.

    (c) 2016 hansmichiels.com – Do not steal the contents – spread the link instead – thank you.