How to add a rownumber to a Data Flow Task in SSIS 2012

(This blog post was first published on 20-jun-2013 on my former blog)

The problem

For one of my projects I needed to add the file line number to the staging table, so that the raw data that was inserted into the database, would be tracable back to the source files exact line.

First I wanted to use the RowCount component in the Data Flow Task, it was easy to configure (only assign a variable) but it appeared that the variable only got the value of the current row after the Data Flow Task had finished, so you knew afterwards how many rows had gone through the dataflow. But that was not what I needed! I want an instant row number, telling me which row is flowing through right now!

The solution: a script component

So what we have to do is add a script component to the Data Flow Task.

20130620_pscr1

 

But remember, if we work with a ReadWrite variable, we will have the same problem as with the RowCount component: the value of the variable is not updated until the dataflow task has finished, so in the flow it will have the initial value for every row.

To work around this, we have to add an output column to the script component as follows:

20130620_pscr2

 

Now we have to edit the script. No ReadOnlyVariables or ReadWriteVariables need to be set, just press the <Edit Script> button.

20130620_pscr3

As the script really is just a C# class, all we have to do is add a private variable, initialize it, increase it for every row and assign the value to the outputcolumn “RowNumber”.

I will show you how:

Simply add the parts in red boxes to the script:

20130620_pscr4

20130620_pscr5

That’s it. Simple, hey?

Now, there is one more thing to do. Open your OLE DB Destination and make a mapping for the RowNumber column. In my example the database field is also called RowNumber, but this is of course not required, you can give it any name you like.

20130620_pscr6

Now you’re done, run the script and you will see that the RowNumber column in the destination table will increase for every row that is inserted!

Hans Michiels

Hans Michiels

Hans is an Independent Business Intelligence and Datawarehouse Consultant & Microsoft SQL Server Consultant, working in the Netherlands. He has been working in the software industry since 1996, with SQL Server since the year 2001, and since 2008 he has a primary focus on datawarehouse- and business intelligence projects using Microsoft technology, using a Datavault and Kimball architecture. He has a special interest in Datawarehouse Automation and Metadata driven solutions. * Certified in Data Vault Modeling: Certified Data Vault 2.0 Practitioner, CDVDM (aka Data Vault 1.0) * Certified in MS SQL Server: * MCSA (Microsoft Certified Solutions Associate) SQL Server 2012 - MCITP Business Intelligence Developer 2005/2008 - MCITP Database Developer 2005/2008 - MCITP Database Administrator 2005/2008

More Posts

13 thoughts on “How to add a rownumber to a Data Flow Task in SSIS 2012”

  1. Hi,
    When i add the code parts in C# , Input0Buffer isn’t recognized.
    Do i have to add some namespaces ? and how can i do this ?
    Thanks.

    1. Hi Erik, it should work out of the box. If you open the C# script task you should see the method Input0_ProcessInputRow and it should compile without adding any namespaces.
      Which version of SSIS are you using?

  2. Hi Hans,

    I’m getting an “PipelineBuffer has encountered an invalid row index value” error.

    We are using VS 2015.
    Any thoughts are appreciated.

    1. Hi Peter,

      I never had to deal with this error.
      It is difficult to tell what the reason could be by the information provided.
      Which version of SSIS are you using?
      If you can send me the SSIS Package I can have a look, but I can’t promise a solution.
      For this I will send my email address to you by email.

      Best regards,
      Hans

  3. Hi Hans,
    I’m getting the same issue as Peter… “PipelineBuffer has encountered an invalid row index value”.

    Please send me an e-mail and I will send you a screenshot of my code.

    I’m using VS 2010.

    Thanks!
    -Wesley

    1. I figured it out! You have to make sure to use the correct override method. By default, SSIS C# has the “Input0_ProcessInput” method when adding a transformation script task; however, your process specifies the “Input0_ProcessInputRow” method. If you rename the existing method to the correct method, you will get a prompt asking you if you want to “rename an inherited method” or something to that effect and you want to say “Yes” and then save changes. Re-open the script task, open the script, close it and save it and it will not have the “red X” showing. I re-ran my package and it worked!

      1. Hello Wesley,
        very good that you found it! Sorry I didn’t reply earlier, I was busy and just saw your two comments.

  4. I am new to scripting, but I have your script working to create a key, the problem is that sometimes I need to use it to raise the starting seed a bit, I modified this portion to go from a 0 to
    m_rowNumber = 200000;
    expecting the Row Number to start at 200,000 and then go up from there, but it still starts at 1. What am I doing wrong. I am using this with SQL 2017

    1. Hello Dave,
      Sorry I missed your comment some months ago.
      But to be honest I do not understand what could have gone wrong. If you put the assignment of the value 200000 in the PreExecute() method I would – just as you – expect it to start counting at 200000.
      Best regards,
      Hans

Leave a Reply

Your e-mail address will not be published. Required fields are marked *