September 2009

You are currently browsing the monthly archive for September 2009.

Recently I was investigating an issue related to ordering data.  As a test, I ran the following script:

– Sorting data from a table variable
declare @tbl table ( ProductNumber varchar(25) );
insert into @tbl select ‘BBBB’;
insert into @tbl select ‘AAAA’;
insert into @tbl select ‘A-B’;
insert into @tbl select ‘A123′;
select * from @tbl order by ProductNumber asc;

– Sorting data from a temporary table
create table #tbl ( ProductNumber varchar(25) );
insert into #tbl select ‘BBBB’;
insert into #tbl select ‘AAAA’;
insert into #tbl select ‘A-B’;
insert into #tbl select ‘A123′;
select * from #tbl order by ProductNumber asc;
drop table #tbl;

And it gave me this result:

Results from script

As you can see, the order of the data coming from a temporary table is different than the data from a table variable, even though the same data type is used.  At first I thought, how on earth is this possible?  After spending some time pondering about this problem (and after a colleague reported that when he executed the above script, the result was as expected), it came to me.  Don’t tell me it’s a collation problem?!  Well, it is.  (In case you don’t know collation, in short “Collations specify the rules for how strings of character data are sorted and compared”.  More info through that previous link and here.)

When I executed the script, my Management Studio session was connected to AdventureWorks2008, one of my test databases.  When I switched to tempdb, the result was normal – both queries returned the same result.  Then I had a look at the collations.  My tempdb is using Latin1_General_CI_AS while the AdventureWorks2008 database is apparently using SQL_Latin1_General_CP1_CI_AS.

A temporary table is created in the tempdb and thus uses the collation of the tempdb for its string columns.  And a table variable uses the same collation as the database to which the session is connected.  Which is quite logical because otherwise you would get collation conflicts when using the table variable in combination with a table from the active database in the same query.  I will demonstrate that with the following script:

declare @tbl table ( ProductNumber varchar(25) );
insert into @tbl select ‘BBBB’;
select * from AdventureWorks.Production.Product P
where P.ProductNumber in (select ProductNumber from @tbl);

On my (SQL Server 2008) server I also have the old SQL2005 AdventureWorks database up and running, which is using the Latin1_General_CI_AS collation.  Executing the above script (which references that AdventureWorks DB) while being connected to AdventureWorks2008 (or any other DB that uses a different collation from Latin1_General_CI_AS) results in the following error:

Msg 468, Level 16, State 9, Line 3
Cannot resolve the collation conflict between “SQL_Latin1_General_CP1_CI_AS” and “Latin1_General_CI_AS” in the equal to operation.

Changing the active database to any other DB that uses the Latin1_General_CI_AS collation and then executing the query results in a positive execution.

Now, to get back to the initial issue of sorting inconsistency, even though the collations are not exactly the same, they are both Latin1, Case Insensitive, Accent Sensitive and still they don’t sort the data in the same way??  For an explanation on that I found the following page on the Microsoft Support site: Comparing SQL collations to Windows collations.  In short: a Windows collation (such as Latin1_General_CI_AS) uses a different comparison algorithm than a SQL collation (SQL_Latin1_General_CP1_CI_AS).  A Windows collation uses the same algorithm as for Unicode data, even when the data is non-Unicode.

All that means is that in our test example, in order to get the data in the right order all the time, we could switch to Unicode fields:

– Sorting data from a table variable
declare @tbl table ( ProductNumber nvarchar(25) );
insert into @tbl select ‘BBBB’;
insert into @tbl select ‘AAAA’;
insert into @tbl select ‘A-B’;
insert into @tbl select ‘A123′;
select * from @tbl order by ProductNumber asc;

– Sorting data from a temporary table
create table #tbl ( ProductNumber nvarchar(25) );
insert into #tbl select ‘BBBB’;
insert into #tbl select ‘AAAA’;
insert into #tbl select ‘A-B’;
insert into #tbl select ‘A123′;
select * from #tbl order by ProductNumber asc;
drop table #tbl;

Executing that returns the following result, no matter what the active database is:

Script result when using Unicode fields

And it also means that you need to be careful when retrieving data from different sources (which is how I came across the issue in the first place).  If you need to have your data sorted in a particular, consistent way and you’re doing that by using an ORDER BY in a SELECT statement on the source system, double-check if all sources are using the same collation!

In case the above left you wondering about what collation to use: the SQL collations are there for backward compatibility.  For new developments use the Windows collations!

See my Fun With Strings article for another story related to collation.

Additional reference material:

SQL Server Central – Comparing Table Variables with Temporary Tables

Share

Tags: , ,

In this short article I will be talking about two functions in the SQL Server Reporting Services (SSRS) function stack.  Those functions are IIF() and Switch().  And I’ll be showing you how easy it is to add an Else part to the Switch function.

Two commonly-used functions in Reporting Services are the IIF() and the Switch().  These are two functions of the Program Flow type, or Decision Functions as they are called on this MSDN page.

In case you’re wondering why it’s so difficult to find a function reference for the built-in functions of SSRS, it’s because these are actually Visual Basic functions and Microsoft refers to those for any detailed explanation.  Click this link for the IIF() function in the Visual Basic Language Reference, and this one for the Switch().

Anyone who’s done some programming most likely already knows the if <expression> then <some_code> else <other_code> statement.  If <expression> evaluates to true then <some_code> gets executed, else <other_code>  gets executed.

The IIF() works in the same way.  According to its description it

Returns one of two objects, depending on the evaluation of an expression.

This is its definition:

Public Function IIf( _
ByVal Expression As Boolean, _
ByVal TruePart As Object, _
ByVal FalsePart As Object _
) As Object

Here’s a simple example:

=IIf(Fields!YearlyIncome.Value >= 60000,"High","Low")

Using this expression, the "High" string is returned when the value of the YearlyIncome field is equal to or above 600, while the string "Low" is returned when the value is below 600.

Now have a look at the following example.  It has been nicely structured with indentation and line breaks to make reading easier.

=IIF
(
    Sum(Fields!LineTotal.Value) >= 100,
    "Violet",
    IIF()
    (
        Sum(Fields!LineTotal.Value) < 25,
        "Transparent",
        "Cornsilk"
    )
)

As you see, it shows a nested IIF inside another one.  Imagine that there were several more nestings and that line breaks were not used by the coder.  Would be a nightmare to read, right?

That’s why the Switch() was invented.  The description for the Switch function reads:

Evaluates a list of expressions and returns an Object value corresponding to the first expression in the list that is True.

And this is the function definition:

Public Function Switch( _
    ByVal ParamArray VarExpr() As Object _
) As Object

In Reporting Services, the VarExpr parameter is simply an even list of expressions and/or object references separated by commas.  Which comes down to something like this: Switch(<expr1>, val1, <expr2>, val2).

Here’s a simple example:

=Switch
(
    Fields!State.Value = "OR", "Oregon",
    Fields!State.Value = "WA", "Washington"
)

This expression says that if the value for the State field is "OR" then the Switch function will return "Oregon", and so on…

Now, to get to the point of this article, the Switch function does not contain an ELSE part like the IIF does.

But I wouldn’t be writing this if there wasn’t a workaround, would I?  If you read the Switch’s description closely, it says that it will return the first expression in the list that is true.  So each expression is evaluated in the order that they are passed to the function.  To get ELSE-like behavior we would need an expression that evaluates to True but only when all other expressions are False.  So, why not use True as expression?  It’s the simplest expression that I can think of and it does the works!

Have a look at the following, it’s a rewrite of the last IIF example mentioned earlier.

=Switch
(
    Sum(Fields!LineTotal.Value) >= 100, "Violet",
    Sum(Fields!LineTotal.Value) < 25, "Transparent",
    True, "Cornsilk"
)

So, which one do you think is the most readable?  The IIF, or the Switch?  These are only simple examples that I’ve been using, imagine situations with ten or more possibilities.  Well, I think you’ve got my point by now.

Quick tip for users of Report Builder 2.0: to be able to format your expression with line breaks and tabs, you need to use CTRL + ENTER or CTRL + TAB in the Expression Builder.  Just hitting ENTER will close the popup window.  It’s quite annoying if you’re used to the BIDS interface, but it works :-)

Happy reporting,

Valentino.

Share

Tags: , , ,

I came across an issue when playing around with Report Builder 2.0.  I had created a report using an embedded data source.  Once I’d published the report to the report server, I couldn’t get it to run anymore.  Instead it gave me the following error:

This report cannot be run in report builder because it contains one or more embedded data sources with credential options that are not supported.  Instead of embedded data sources use shared data sources or save and view the report on the server.

Okay, no problem I thought, let’s just create a shared data source and switch to that one then.  So I opened up the Data Source Properties in Report Builder and selected the Use a shared connection or report model radio button.

Unfortunately, when running the report it threw me that same error?!  And when I open the Data Source properties again, my change was undone!  It was still using the embedded data source.

As far as I’m concerned that should be a bug.

The only way that I could switch my data source to a shared connection was by creating a new data source, which means you also need to move all datasets connected to the original data source.

Quick tip: if you first rename the original data source and datasets to something like srcMyDataset_OLD, you can give the correct name to the new one straightaway.

So I guess that’s another workaround on my list :-)

This issue was encountered while using Report Builder 2.0 (10.0.2531.0).  I tried to reproduce it using Report Builder 3.0 (10.50.1092.20 – that’s the version of the SQL Server 2008 R2 August CTP) and I couldn’t.  Which means it has been fixed.  Good on you Microsoft!

Share

Tags: , , , , , ,

Small disclaimer: in this post “wide” is not referring to the “Wide Tables” feature, but just means a regular table with many columns.

If you’ve ever needed to load a wide table, for instance one that consists of about 250 columns, you may have come to the conclusion that some manipulations in your SSIS package take forever.  Here’s a quick tip how to speed up your package development.

This tip is actually quite generic and applicable to daily Windows use.  It’s just that sometimes you forget about the most simple things because you’re used to a different way of working.  Why make it easy when difficult works fine, right?  Well, maybe not.

Okay, here it goes: don’t forget that you’ve got a keyboard, do not always use the mouse! The Business Intelligence Development Studio offers a very graphical way of designing your data flow (or control flow for that matter) but sometimes it’s really faster to use the keyboard.

An example.  Imagine the Derived Column Transformation.  To avoid typographical errors you want to select a column name from the tree view on the left in the Derived Column Transformation Editor.  And then you discover that the list of column names is not ordered alphabetically, and on top of that, there’s no way to rearrange them!  Good luck finding your column.

Derived Column Transformation Editor

And then you remember that you’ve got a keyboard.  By typing the first letters of the column name, the selected item will jump down to the first match.  Isn’t that a great invention?!  In the screenshot above I typed “pu” and it selected PurchaseOrderNumber, which is about 32 columns down from the top.

This tip not only works in our Derived Column Transformation but in any screen where a similar tree view appears, such as in the “Inputs and Outputs Properties” tab of the Advanced Editor of an OLE DB Source:

Advanced Editor of OLE DB Source

This can be useful if you need to tell the component that you’ve ordered the data on certain columns.  And why on earth would you want to do that, you may ask yourself.  Well, there are certain components that expect the data to be ordered, such as the Merge item.

If you apply this tip whenever you need to locate something in a tree view, you’ll find that your live has become just a bit easier.

And if you’re not satisfied with the way of working as described above, there’s always the XML!  That’s right, just right-click on your package in the Solution Explorer and select View Code.

Context menu on package in Solution Explorer

Then you’ll discover that a .dtsx file is made up completely of XML.  If you spend some time examining the XML code, you’ll probably find what you’re looking for.

As an example let’s say we want to locate our PurchaseOrderNumber from earlier and change its SortKeyPosition property to 1 (which means the data coming in is ordered by PurchaseOrderNumber).  So I open the XML and scroll down a bit.  I recognize a component which I’ve named “OLE_SRC Wide Table”:

<component id=”1″ name=”OLE_SRC Wide Table”

I also recognize my SELECT statement.  Scrolling down a bit further I see the

<outputs>

followed by a bunch of outputColumn items.  I put the cursor at the <outputs> line and hit CTRL-F.  Then I type “pur” and hit ENTER.

Find and Replace window in dtsx XML

The selection jumps to the column that we’re after.  I close the Find and Replace window and scroll to the right to find the sortKeyPosition property and I update its value to 1.

To confirm that this is working I open up the Advanced Editor again and locate our column.  SortKeyPosition is set to 1!

SortKeyPosition has been updated through XML

I hope that with this I showed you that sometimes it’s interesting to think about what you’re actually doing and try to find a way to do it better, and faster.

Share

Tags: , , , ,

© 2008-2017 BI: Beer Intelligence? All Rights Reserved