SSIS


About two months ago I was contacted by Nat Dunn, founder of Webucator – a Microsoft Certified Partner for Learning Solutions.  He wanted to know if I’d be okay with them creating a video with content based on one of my blog posts.  This video would be part of a new SQL Server Solutions from the Web course.  Who am I to refuse such a kind request?

The blog post which has been filmed is the one in which I show beginners the very useful Map Items by Matching Names functionality in SSIS: Silly SQL #1: OLE DB Destination [SSIS].

Interested in watching the movie?  Check it out on YouTube:

Silly SQL #1 OLE DB Destination SSIS

Please note that the situation explained in the movie is something you’ll only run into when making changes to existing packages.  This would be the case when requirements have changed and you need to add additional columns, or when you’re working with template packages.

Liked the movie?  Have a look at what else Webucator has got to offer in terms of SQL Training!

Have fun!

Valentino.


Sometimes we take silly little things we do in our daily life for granted and assume everyone else is aware of them too.  That’s where we’re wrong: as I’ve found out, they aren’t!  Not always anyway.  That brought me to the idea to start this Silly SQL blog series.  Each post will explain one little thing I do or use regularly that makes my life easier.  Here’s the first one!

Last week I noticed a co-worker making a lot of keyboard noise while implementing a Data Flow Transformation in an SSIS ETL package.  When I turned around to have a look at his screen I saw he was working on an OLE DB Destination, nothing wrong with that.  Basically he was hitting the down arrow followed by TAB twice, down arrow again and so on in order to set up matching input columns with destination columns.  This method worked because we decided to put the incoming fields in the same order as the columns in the destination table and we also gave them the same name using aliases in the source query.

However, for tables with over 200 fields this method is quite tiresome (and annoying for colleagues unless they’re using a headset).  Nice as I am I decided to help him out.  I asked him if I could borrow his mouse for a second and then right-clicked in the grey area in between the two tables:

The Map Items by Matching Names functionality

You should have seen his face when he saw that window appear, and even more when I selected Map Items by Matching Names!  Apparently this is some functionality that’s been hidden really well because in that same week I caught another co-worker in exactly the same situation.  And these are not junior profiles I’m talking about!

If you’re now thinking “Hang on, I never have to set up the matches myself?” that may be true!  If you’re always creating new packages and the names are matching then BIDS will set up the matching fields automatically when you open the Mappings page.  But we are working with previously-defined templates to speed up development.  In that case BIDS will not set up the matches so that functionality shown above really comes in handy!

A fast way to find out if all fields have been matched is to click the Input Column header:

Click Input Columns header to put unmatched items on top!

This will order the items with the unmatched ones, recognized by <ignore>, on top!

See, the things you take for granted aren’t always that for others, as proven here.

That’s it for now, let’s see if I can come up with another silly thing for the next post!

Update: this post was turned into a movie by the good folks of Webucator: check it out!

In the meantime: have fun!

Valentino.


For the first time ever, I’m not only participating in the organization of the SQL Server Days, I’ll be speaking as well!  Exciting times!

In my session, Cleaning up the mess with SSIS and DQS, you’ll learn a couple of tricks on dealing with dirty data.

Here’s the abstract for my session:

Are you loading data from exotic sources such as Excel and flat files?

Or are you dealing with manually-entered data through an application that allows, well, practically anything?  And do you sometimes run into trouble because the incoming data is not as expected?

Then you should really join us in this session in which I’ll demonstrate (yes, demos!) several different techniques that can be used to cleanse your dirty data!

Check out the agenda for more details!

We’ve got speakers from all over the world: South Africa, Italy, Slovenia, UK, US and of course… Belgium!

Not registered yet?  Do it now!

Ow, and did you know we’ve introduced a special feature?  It’s called Bring a colleague for free.  If you register, you can register an additional person for free.  Only the first 50 free registrations count though, so be fast!

Bring your colleague

In the meantime: have fun and see you there!

Valentino.


As you may already know, it is possible to use the Execute SQL Task to populate a package variable with a result set.

In case you’re not that familiar with this technique yet, here are a quick two words on setting that up.  You just give it a query, set the ResultSet property to Full result set and configure a package variable in the Result Set property window.  The package variable’s type is System.Object.

But what exactly is this mysterious System.Object and how can we explore it?  Well, that depends.  More precisely, it depends on the Connection Type which you’ve chosen in the Execute SQL Task properties.

Let’s explore two possibilities: ADO.NET and OLE DB.  Our end goal is straightforward: retrieve the number of records in the result set.

The query which I’m using in the Execute SQL task is this one:

select ProductAlternateKey
from dbo.DimProduct
where Color = 'blue'

On my AdventureWorksDW2012 database it should return 28 records.

Exploring the ADO.NET result set

The first step is finding out what type exactly this result set object is.  Hook up a Script Task to your Execute SQL task and put a breakpoint on it.  Now run your package and examine the Locals window:

Debugging the Control Flow to find the object type

Well look at that, it’s a System.Data.DataSet!  Using this knowledge it’s fairly simple to produce code that fetches the record count:

DataSet ds = (DataSet)Dts.Variables["MyResultset"].Value;
MessageBox.Show(ds.Tables[0].Rows.Count.ToString());

Note: don’t forget to add the package variable to the ReadOnlyVariables before opening the code editor.

The System.Data namespace is included by default in the using statements, no worries there.  So we can just cast the variable to a DataSet.  The DataSet object contains a DataTableCollection called Tables.  As there’s only one result set, it is located at index zero.  We travel down the object tree to finally find the Count property of the Rows DataRowCollection.

And here’s the result:

The message box shows 28 items
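The same navigation can also be tried outside SSIS.  Below is a minimal sketch that uses a stand-in DataSet instead of the package variable.  The class name ResultSetDemo, its methods and the three sample keys are made up for illustration; in a real Script Task the DataSet would come from Dts.Variables["MyResultset"].Value as shown above.

```csharp
using System;
using System.Data;

public static class ResultSetDemo
{
    // Builds a stand-in for the DataSet that the Execute SQL Task stores
    // in the package variable. The sample keys are hypothetical.
    public static DataSet BuildSampleDataSet()
    {
        DataTable table = new DataTable("Result");
        table.Columns.Add("ProductAlternateKey", typeof(string));
        table.Rows.Add("BK-0001");
        table.Rows.Add("BK-0002");
        table.Rows.Add("BK-0003");

        DataSet ds = new DataSet();
        ds.Tables.Add(table);
        return ds;
    }

    // Same path as in the Script Task: first table, then its row collection.
    public static int CountRows(DataSet ds)
    {
        return ds.Tables[0].Rows.Count;
    }

    // Once you have the DataTable you can also read the values themselves.
    public static string[] ReadKeys(DataSet ds)
    {
        DataTable table = ds.Tables[0];
        string[] keys = new string[table.Rows.Count];
        for (int i = 0; i < table.Rows.Count; i++)
        {
            keys[i] = (string)table.Rows[i]["ProductAlternateKey"];
        }
        return keys;
    }
}
```

Calling ResultSetDemo.CountRows(ResultSetDemo.BuildSampleDataSet()) gives the number of rows in the stand-in table, and ReadKeys shows that the individual column values are reachable in the same way.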

That’s all there is to it, easy huh?  Let’s move on to our second option, OLE DB.

Exploring the OLE DB result set

Once again we start at the beginning: with the debugging of the Control Flow to find out what object type our mysterious System.Object is:

The OLE DB result set gives us a System.__ComObject, hmm...

Hmm, System.__ComObject, that’s … special.  Ow right, the OLE DB provider uses a COM wrapper.  How can we “unwrap” our object and introduce it to the .NET world?  Let’s see if we can find out what’s hidden behind that wrapper, by using the following code:

MessageBox.Show(Microsoft.VisualBasic.Information.TypeName(Dts.Variables["MyResultset"].Value));

TypeName is a VB.NET function and retrieves the data type of the parameter passed into it.

To get this to run in a C# SSIS task you first need to add the Microsoft.VisualBasic reference:

Adding a reference to the VB.NET assembly

Executing the package results in this:

Result type: Recordset

So, our result is Recordset, hmm, well, I think we more or less knew this already.  What kind of Recordset?  Well, an ADO Recordset.  We know this because the following code works:

System.Data.OleDb.OleDbDataAdapter da = new System.Data.OleDb.OleDbDataAdapter();
DataTable dt = new DataTable();
da.Fill(dt, Dts.Variables["MyResultset"].Value);
MessageBox.Show(dt.Rows.Count.ToString());

Basically, we use the Fill method of the OleDbDataAdapter to fill a System.Data.DataTable with the data from the ADO Recordset.  The version of the method in our example (there are several overloads) accepts two parameters:

public int Fill(
    DataTable dataTable,
    Object ADODBRecordSet
)

With the DataTable filled we once again have access to a Rows DataRowCollection, exactly the same as in our ADO.NET example in fact.  Executing the package now results in exactly the same message box as shown earlier: 28 records!

Beware of pitfalls

If you mix the two methods up you’ll get funky errors such as:

System.InvalidCastException: Unable to cast COM object of type ‘System.__ComObject’ to class type ‘System.Data.DataSet’. Instances of types that represent COM components cannot be cast to types that do not represent COM components; however they can be cast to interfaces as long as the underlying COM component supports QueryInterface calls for the IID of the interface.

and also

System.ArgumentException: Object is not an ADODB.RecordSet or an ADODB.Record.

So be careful, use the right object types for your particular System.Object.
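One way to avoid mixing the two up is to check the runtime type before casting.  What follows is only a sketch under a few assumptions: the helper class ResultSetReader and its ToDataTable method are hypothetical names, and it assumes the .NET Framework that SSIS Script Tasks run on, where System.Data.OleDb is available out of the box.  It treats any COM object as an ADO Recordset, so the ArgumentException shown above can still surface if the COM object turns out to be something else.

```csharp
using System;
using System.Data;
using System.Data.OleDb;

public static class ResultSetReader
{
    // Hypothetical helper: returns a DataTable for either kind of
    // System.Object produced by the Execute SQL Task.
    public static DataTable ToDataTable(object resultSet)
    {
        if (resultSet == null)
        {
            throw new ArgumentNullException("resultSet");
        }

        // ADO.NET connection type: the variable holds a DataSet.
        DataSet ds = resultSet as DataSet;
        if (ds != null)
        {
            if (ds.Tables.Count == 0)
            {
                throw new ArgumentException("The DataSet contains no tables.", "resultSet");
            }
            return ds.Tables[0];
        }

        // OLE DB connection type: the variable holds a System.__ComObject
        // wrapping an ADO Recordset, so let the adapter fill a DataTable.
        if (resultSet.GetType().IsCOMObject)
        {
            DataTable dt = new DataTable();
            using (OleDbDataAdapter da = new OleDbDataAdapter())
            {
                da.Fill(dt, resultSet);
            }
            return dt;
        }

        throw new ArgumentException(
            "Unsupported result set type: " + resultSet.GetType().FullName,
            "resultSet");
    }
}
```

In the Script Task this would be called as ResultSetReader.ToDataTable(Dts.Variables["MyResultset"].Value), after which Rows.Count works the same way for both connection types.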

Conclusion

In this article I’ve demonstrated a couple of methods which can be used to retrieve information from the mysterious System.Object result set in an SSIS package.

Have fun!

Valentino.

Additional References

Execute SQL Task

Result Sets in the Execute SQL Task

OleDbDataAdapter Class


Once again I’ve been wasting some time because of a silly bug.  This time it was due to the OLE DB Source component and the way it works with parameters.  If you are in a situation where you know your query is working fine and yet no records are going down the data flow, here’s a possible solution!

Disclaimer: this issue exists up until SQL Server 2008 R2.  Read on for details!

Update: after being advised to do so by several people, including Jamie Thomson, I’ve filed a bug at MS Connect: SSIS OLE DB Source incorrectly returns zero records in combination with parameter and comment

The Situation

I had a Data Flow with an OLE DB Source that uses one parameter, for instance:

select ProductAlternateKey, EnglishProductName
from dbo.DimProduct
--some really smart comment goes here
where Color = ?

I knew the query was working fine because when executed through SSMS and with the question mark replaced with ‘blue’, it would return 28 rows:

28 records in Management Studio

But when executed in BIDS, through either Execute Package or Execute Task, it would return zero records:

Zero records, zilch, nada, niente, none at all!

So I thought something must be going wrong with the package variable that gets passed into the source parameter, somehow.  I’m not going into details on what I tried out in my attempt to get this working, but I can tell you that I started to get really irritated.  My colleague Koen Verbeeck can confirm this because I called him over to my desk to help me think! (thanks btw!)

After some further tinkering with the data flow, we had our smart moment of the day and decided to launch SQL Server Profiler to see what BIDS was sending to the server!  I’m not sure if you’re aware of this but BIDS is doing some metadata-related stuff when preparing queries.  As far as I can tell, it also tries to determine the parameter type by running the following query:

 set fmtonly on select Color from  dbo.DimProduct
--some really smart comment goes here where 1=2 set fmtonly off

When creating this statement, it seems to use the whole FROM clause of the original query, including any trailing comments.  It combines that with a SELECT statement that contains the field that gets filtered and it appends " where 1=2 set fmtonly off".

But alas, apparently it’s not aware that lines can be commented out by using a double dash.  So part of its generated statement is commented out.  What it should have done is use some CRLFs, especially in front of the WHERE clause.  But it didn’t.

So, as a result of that, FMTONLY remains on while the SELECT statement gets executed, resulting in zero records!
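To make the mechanism a bit more tangible, here is a small simulation.  It is not the actual BIDS code, which I haven’t seen: the class FmtOnlyDemo and both methods are hypothetical and merely mimic the concatenation observed in Profiler.  The comment stripping is deliberately naive (it ignores string literals and block comments) and only approximates how the server treats a double dash.

```csharp
using System;
using System.Text;

public static class FmtOnlyDemo
{
    // Hypothetical reconstruction of the statement seen in Profiler:
    // the parts are glued together with spaces, without any line break.
    public static string BuildMetadataProbe(string column, string fromClause)
    {
        return "set fmtonly on select " + column + " " + fromClause
             + " where 1=2 set fmtonly off";
    }

    // Drops everything from "--" up to the end of each line, which is
    // roughly what is left for the server to execute.
    public static string StripLineComments(string sql)
    {
        StringBuilder sb = new StringBuilder();
        foreach (string line in sql.Split('\n'))
        {
            int idx = line.IndexOf("--", StringComparison.Ordinal);
            sb.Append(idx >= 0 ? line.Substring(0, idx) : line);
            sb.Append('\n');
        }
        return sb.ToString();
    }

    // True when the "set fmtonly off" part survives comment stripping.
    public static bool FmtOnlyGetsSwitchedOff(string fromClause)
    {
        string probe = BuildMetadataProbe("Color", fromClause);
        return StripLineComments(probe).Contains("set fmtonly off");
    }
}
```

With a FROM clause that ends in a double-dash comment, such as "from dbo.DimProduct\n--some really smart comment goes here", FmtOnlyGetsSwitchedOff returns false because the appended text lands on the commented-out line.  With a block-style comment at the end instead, it returns true.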

For those unfamiliar with the FMTONLY setting:

Returns only metadata to the client. Can be used to test the format of the response without actually running the query.

And I can actually confirm what I’m stating here by changing the query to the following:

set fmtonly off;
select ProductAlternateKey, EnglishProductName
from dbo.DimProduct
--some really smart comment goes here
where Color = ?

28 records down the pipe!

We've got data!

But this hack is a little too dirty to put in production.  So what else can we do?  Well, use block-style comments instead and we won’t face the issue!

select ProductAlternateKey, EnglishProductName
from dbo.DimProduct
/* some even smarter comment goes here */
where Color = ?

So, as I mentioned at the start of the post, this behavior can be reproduced using SSIS versions prior to 2012.  What about 2012 then?  Here’s the result of the Data Flow using the first query mentioned above:

SSIS 2012: we've got data, even with the "faulty" query!

Alright, that works better!  Now let’s use Profiler to check what’s going on here.  This is the first statement that gets executed:

exec [sys].sp_describe_undeclared_parameters N'select ProductAlternateKey, EnglishProductName
from dbo.DimProduct
--some really smart comment goes here
where Color = @P1'

Further down, I also see this one:

exec [sys].sp_describe_first_result_set N'select ProductAlternateKey, EnglishProductName
from dbo.DimProduct
--some really smart comment goes here
where Color = @P1',N'@P1 nvarchar(15)',1

It is using an entirely different approach, no longer using the FMTONLY setting!  Hang on, this rings a bell!  Look what the BOL page for SET FMTONLY (2012 version) specifies:

Do not use this feature. This feature has been replaced by sp_describe_first_result_set (Transact-SQL), sp_describe_undeclared_parameters (Transact-SQL), sys.dm_exec_describe_first_result_set (Transact-SQL), and sys.dm_exec_describe_first_result_set_for_object (Transact-SQL).

Cool stuff!

Conclusion

If you’re not on SQL Server 2012 yet, be careful with comments in OLE DB Sources in the SSIS Data Flow!  Ow, and get the SQL Server Profiler off its dusty shelf now and then!

Have fun!

Valentino.


© 2008-2017 BI: Beer Intelligence? All Rights Reserved