July 2012

You are currently browsing the monthly archive for July 2012.

The other day I needed a counter in my SSIS Control Flow.  Using the Foreach Loop container I was looping over a bunch of files and needed to count the number of files that got imported successfully.  The counter had to comply with the following requirements:

  • Easy to implement
  • Fast

Let’s test some different methods on getting this implemented. 

Counting Possibilities

Two of the methods are possible as of SSIS 2005 while the third one is new since SSIS 2012. So yeah, the screenshots are taken using SQL Server 2012 RTM.

Setting The Scene

We’ll start from a new package and create a package variable of type Int32.  This variable will keep track of the count.

To be able to performance test the different possibilities, I’m using a For Loop container.  This loop is using another package variable called loop, type Int32, and performs the loop 10,000 times.  In case you don’t know how to set up such a loop, check out this screenshot:

Using a For Loop container to test task performance

So my package contains the following two variables:

Two package variables of type Int32

The cnt variable is the one to actually hold the count.

To test performance I’ll be using the command-line dtexec utility so that the debugging overhead does not mess up the numbers.   I’ve also executed each method at least three times to ensure a certain result was not “by accident”.

Using a Script Task

The most logical component that came to mind was a Script Task.

Using a couple lines of code, C# in this case, the counter value can be incremented with one:

int cnt = (int)Dts.Variables["cnt"].Value;
Dts.Variables["cnt"].Value = ++cnt;

The above code assumes that the cnt variable has been put in the ReadWriteVariables property:

The User::cnt variable is in the ReadWriteVariables for the script to use

Here’s what that looks like in the Control Flow:

For Loop container with a Script Task

So how fast does this execute?

Looping 10,000 times over the Script Task takes 8 seconds

About eight seconds, that’s acceptable.

However, what I don’t like about this method is how long it takes to implement.  I mean, it takes more than just a couple of seconds, right?  And the fact that you actually need to use custom .NET code to perform such a trivial task as adding one to a number.  Using .NET code is good for complex situations, when there’s no other option.  But the risk of a bug is always larger, imagine I wrote cnt++ instead of ++cnt, what do you think the results would be? (Hint: my laptop would crash before the counter reaches 10,000).

On to another option then!

Using a Execute SQL Task

Instead of resorting to .NET code, increasing a number by one is easy to achieve using a simple T-SQL statement, right?  So I thought, let’s try a Execute SQL Task!

Here’s the query:

select ? + 1 as cnt

What does the task look like?

Using Execute SQL Task to increase a package variable

ResultSet has been set to Single row.

The Parameter Mapping page has got the User::cnt variable specified:

Specifying the User::cnt package variable in the Parameter Mapping page

And the Result Set page has got the single value from the single row mapped to the User::cnt package variable:

Mapping the cnt result to the User::cnt variable

What do you say, easier to implement than the Script method?  I do think so!

This method has one limitation though: it needs a Connection Manager connecting to a SQL Server database.  However, in most ETL packages you’ll probably have that present already.  What I was a bit worried about though is the overhead of connecting to the server, how much will it be?

Let’s find out!

The Execute SQL Task needs almost a minute to increase the counter

That’s right, using the Execute SQL Task to increment the variable 10,000 times takes about a minute.  On my laptop.  Connecting to the tempdb on my local instance.  When testing this over a network, it even resulted in timings over four minutes.  So this solution is really unacceptable in my opinion.

However, we can give it one more try.  A Connection Manager has got a property called RetainSameConnection.  By default this is set to False which means that our test above has opened and closed ten thousand connections to my local tempdb.  Oh my, of course that takes a while!

Setting it to True gives us the following result:

Setting RetainSameConnection to True speeds up the process by three

About twenty seconds, which is about one third of the previous result.  That surely is better.  And what’s perhaps even more interesting is that a similar result is achieved when connecting to a database over the network: from over four minutes down to twenty seconds.  So yeah, this would work for me.

Sidenote: for other interesting use of the RetainSameConnection property, check out this post regarding transactions by Matt Masson.

But we’re not stopping just yet.  As of SQL Server 2012, we’ve got a third possibility!

Using an Expression Task (SQL2012!)

Instead of resorting to custom .NET code or (ab)using the Execute SQL Task, in SSIS 2012 we’ve got a new task: the Expression Task.

The new Expression Task in a For Loop ContainerThe Expression Task builds and evaluates SSIS expressions, nifty!

As you can read in the SSIS Toolbox, the Expression Task builds and evaluates SSIS Expressions that set variable values at runtime.  That surely sounds exactly like what we’d need, doesn’t it?

So how does that work?

Using the Expression Task to increase the counter

Really easy to set up, we just specify that the User::cnt variable should be set to itself plus one.  When put in a For Container, we’ve got a counter!

But how does it perform?

The Expression Task is the easiest and the fastest method, mission accomplished!

About seven seconds, which is even slightly faster than the Script Task method!

With that we’ve found the method that complies with both requirements: easy to implement and performs fast!  Now how am I going to convince my clients to upgrade to 2012, hmm, need to think a bit…

Conclusion

We found out that the new Expression Task is a very useful task for its purpose.  In our case we used it to create a counter.

If you’re not on SQL Server 2012 yet, better stick to either Script Task or Execute SQL Task with RetainSameConnection set to True.

Have fun!

Valentino.

Share

Tags: , ,

I recently discovered something which I wish I’d found years ago!  Have you ever been in the situation that you need some information regarding a certain Active Directory user or group but you don’t have access to the Active Directory Users and Computers MMC snap-in?  Read on!

The Good Old NET Command

Remember this one?  You may already have seen it in its net use form to create a mapped network drive.  Net use has even gotten a Wikipedia page dedicated to it.  But use is not its only use!

Options of the net command

Let’s look into a couple more interesting uses of this oldie.

Getting User Account Info

As you can see in the screenshot below, you can get some details on an AD user account by running the following command:

net user <account_name> /domain

The output:

Using "net user" to find AD account info

Recently I had a suspicion that a certain account at work was locked out because our nightly job crashed with a weird error.  Using this command I was indeed able to tell that the account was locked.  Can always be interesting to find out first before waking up the local sysadmin.

When an account is locked, the Account active setting will say Locked instead of Yes.

What’s also interesting here is that list of Global Group Memberships.  If you’re experiencing security issues because a certain account does not seem to have access to something while the sysadmin has ensured you that it has been added to the appropriate groups, here’s your quick way to find out for sure!

Getting AD Group Details

The following command can be used to get a list of everyone who’s part of a certain AD group:

net group <group_name /domain

The output looks like this (yes, I’ve hidden the aliases of my colleagues):

Using the "net group" command to find who belongs to the group

Again, practical when you’re troubleshooting security issues.

Stopping And Starting Services

The following command can be used to stop a service on the local machine:

net stop <service_name>

Replace “stop” with “start” to start it again.

Please note: you’ll need to run the DOS box as administrator for this to work.

Here’s me restarting my SSIS 2008 service:

Using "net stop" and "net start" to manipulate local Windows services

Can be useful to create a batch script to stop/start all you SQL Server services on your development PC, especially if you’ve got several versions installed and want all resource available to the one currently in use.  That way you can quickly switch between versions and still keep resources available for other tasks.

In case you’re not sure what the service is called, open up the list (Start > Run > service.msc > enter), double-click your service and have a look at the Service name property:

Finding out the name of a service

Finding the name of your computer

The next command can be used to find the name or your PC, plus some other details:

net config workstation

The output:

Using "net config workstation" to get your computer's name

But I actually use another really simple command for that:

hostname

Use "hostname" to get the name of your machine

Yep, as simple as that, hostname prints the name of your PC.

Conclusion

That’s it for this one, a couple of good uses of the long-forgotten net command.

Have fun!

Valentino.

Share

Tags:

© 2008-2017 BI: Beer Intelligence? All Rights Reserved