Excel

You are currently browsing articles tagged Excel.

Here’s a quick one about my favorite setting of the week.  And for once it’s not related to SQL Server but a bit more general.  It’s meant to improve the everyday use of a PC.  Yet a bit more precisely, it’s got to do with Microsoft Office (2007 or 2010, either will do).

Ever since I got a PC at my client, I was annoyed by the fact that Word would open all documents in the same window.  As a result of that, when I had two documents open, I couldn’t use ALT-TAB to switch between them.  (I prefer using my apps maximized.)  That really annoyed me.  Furthermore, each time when I clicked the closing X at the top right of the screen, it would close down all open documents instead of the one in front of me.  Aaarghh!!

And today I finally found the setting to make it behave decently.  It’s not so complicated once you know where to find it.  (In fact, it’s a simple checkbox.)

Curious?  I’ll explain how to change the setting in Word, but Excel will work just as fine.

Open the Word Options by going to File > Options.  Then switch to the Advanced page.  Locate the Display section.  In that section you’ll find a checkbox labeled Show all windows in the Taskbar.  Guess what that does?!

Options: Advanced - Show all windows in the Taskbar

So that one more aggravation gone!

Have fun!

Valentino.

Share

Tags: , ,

As a BI developer I regularly use Excel to analyze my data on correctness.  In Excel it’s easy to filter out unrelated records, as you’ll probably already know.

How do I get my data in Excel?

With a simple copy/paste.  Since SQL Server 2008, you can just right-click on the results in the Management Studio and select Copy with Headers.  This will put the selected data into memory, including the column headers.

SSMS 2008: right-click on results to Copy with Headers

You can even make a selection in the Results pane and only copy that, as shown in the screenshot above.  Here’s what it looks like when pasted into Excel:

Data with headers pasted into Excel

How do I make the columns wider?

Excel will not expand the column widths by default, shown in that last screenshot.  You could manually enlarge them one by one, but that’s not convenient when you last filled column is called “DI”.  Instead, I use the following trick.  First I select the whole sheet by clicking on the grey square in the top-left corner:

Select the whole sheet using the grey square

Then I double-click on the vertical divider in between columns A and B.  This will adjust all column widths so that the data fits nicely.

Double-click the divider in between two columns to adjust width automatically

Does Excel know SELECT DISTINCT?

The other day I needed to filter out all duplicate records out of a really large dataset.  I first tried by activating the regular filter.  This is an Excel feature that I use a lot.  Just clicking the Filter button on the Data ribbon will add the small dropdowns next to each column’s header:

Excel's Filter functionality

But in those dropdowns I couldn’t find an option which would get me to the required result.

(Note that I added a couple of duplicate records in my dataset.)

After a little search, I found the magic checkbox!  I didn’t know that it could be so easy.

To find it, click the Advanced button in the Data ribbon.

The Advanced button on the Data ribbon

In the Advanced Filter window that appears you see a checkbox called Unique records only.

Advanced Filter: Unique records only checkbox

There’s your SELECT DISTINCT!

And here’s the sheet with the duplicates filtered out:

Excel sheet after filtering out duplicate records with the "Unique records only" option

Do you also see that the filter dropdowns are gone, although I did not remove them?  The Advanced Filter cannot be used in combination with the regular filter.  Once you activate the regular filter by clicking the Filter button, your duplicates will re-appear.

If you want to further filter your data after filtering out all duplicate records, you should select the Copy to another location option in the Advanced Filter window.  Copy it below your data on the same sheet and apply the regular filter on that new set of data.

Filtered set of data below full dataset on same sheet

Copying to another sheet is not supported.  Apparently you can only copy to the active sheet.  Should you try it out anyway, you’ll get this message:

Warning: You can only copy filtered data to the active sheet

Happy filtering!

Valentino.

Share

Tags: ,

A while ago I wrote an article about how you can use the SQL Server Import and Export wizard to import Excel data into a SQL Server database.  In this sequel I’m going to show you some problems which you may encounter when using that wizard.  Just like any good old wizard, he’s only as good as his recipes.  If a recipe is missing an ingredient, the resulting potion will probably not behave as expected and before you know it it explodes in your face.

I’ll be using the same Excel file as in my previous article.  In case you’re having some problems understanding the data or locating certain screens mentioned in this article, I recommend you to first read the prequel.

Some Common Wizard Pitfalls And Their Solution

Drop And Re-Create Destination Table

Imagine that for your import process you’re planning to import that Excel data regularly.  So the logical step is to make sure that each time the process runs, it starts from scratch.  Following that, what seems to be an interesting checkbox is located on the Column Mappings screen.  This checkbox is called Drop and re-create destination table.

Column Mappings: Drop and re-create destination table

So you activate that checkbox and execute the package.  However, it ends with an error in the Executing phase:

SQL Server Import and Export Wizard: error while Executing

Here’s what the error details say:

image

Error 0xc002f210: Drop table failed… Cannot drop the table dbo.ProductList$ because it does not exist or you do not have permission.

So why does this error occur?  Let’s examine the SSIS package that was generated.  This is what the Control Flow looks like:

A Control Flow that doesn't take a non-existing table into account

New here is that Drop table SQL Task.  Taking a closer look at the task, here’s the query:

drop table [dbo].[ProductList$]
GO

So the first task is to drop the table.  This flow does not take into account that the table may not yet exist, as is the case here, causing the error.

However, do you notice the blue arrow connecting the Drop table task with the next SQL Task?  That means that it doesn’t require the execution of the task to end successfully, the flow will continue even when an error occurred (unlike when the green connector – meaning Success – is used).

Double-clicking the blue connector gives us the Precedence Constraint Editor window:

Precedence Constraint Editor

Indeed, the value for the constraint is set to Completion.

This also explains why the wizard continued executing all phases, even though an error occurred.  And the end result was that the table is really created, containing the data as expected.

If you’d like to avoid the error and handle the non-existing table, you could replace the query in the Drop table SQL Task with the following one:

if exists
(
    select * from INFORMATION_SCHEMA.TABLES T
    where T.TABLE_NAME = 'ProductList$'
        and T.TABLE_SCHEMA = 'dbo'
)
    drop table [dbo].[ProductList$];

It first checks if the table exists, taking schema into account, and will only perform the DROP TABLE statement if the table actually exists.

Field Length Insufficient

On to the next possible issue.  When examining our Excel sheet we find out that the content of the Class column is either blank or just one character:

Possible values of the Class field

So we decide to change its type to char(1), again using the Column Mappings screen.

Column Mappings: change type to char(1)

However, clicking Finish at the end of the wizard gives us a fatal error:

SQL Server Import and Export Wizard: Operatoin stopped in Error

And these are the details of the error:

    • Error 0xc02020c5: Data Flow Task 1: Data conversion failed while converting column “Class” (63) to column “Class” (139).  The conversion returned status value 4 and status text “Text was truncated or one or more characters had no match in the target code page.”.(SQL Server Import and Export Wizard)
    • Error 0xc020902a: Data Flow Task 1: The “output column “Class” (139)” failed because truncation occurred, and the truncation row disposition on “output column “Class” (139)” specifies failure on truncation. A truncation error occurred on the specified object of the specified component.

      (SQL Server Import and Export Wizard)

    • Error 0xc0047022: Data Flow Task 1: SSIS Error Code DTS_E_PROCESSINPUTFAILED.  The ProcessInput method on component “Data Conversion 0 – 0″ (131) failed with error code 0xC020902A while processing input “Data Conversion Input” (132). The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running.  There may be error messages posted before this with more information about the failure.

      (SQL Server Import and Export Wizard)

Wow, that’s quite a long list for such a simple change!  Apparently it wasn’t such a good idea to limit the length of this field.  But why does this error occur?  After all, the values in this field are just one character long.  Except, after taking a closer look at the Excel sheet it turns out that these values contain a trailing space!

One way of avoiding this issue is by specifying a higher length for the field.  In this particular case however, you may decide to modify your Data Flow so that it removes the trailing space.  Here’s what the Data Flow currently looks like:

Data Flow to import Excel data

The Data Conversion transformation takes care of converting the Class column into a DT_STR of length 1.

Data Conversion Transformation Editor

We’re going to replace that Data Conversion Transformation with another one, a Derived Column transformation.  This is one of the transformations which I’m using all the time.

Derived Column Transformation Editor

As you can see in the screenshot, I’ve set it up so that it uses the TRIM() function on the Class input column to remove any leading or trailing spaces.  Furthermore I’m using a cast to DT_STR of length 1 to ensure the correct field type.  The resulting column is called Class_STR, to clearly indicate that it has been converted to DT_STR.

This is the resulting Data Flow:

Data Flow: Destination component complains about lineage ID

Looks like we’ve got an issue with our Destination component now.  Which is quite logical: we still need to tell it that it should use the newly-created Class_STR column.  So double-click the component.

 Restore Invalid Column References Editor

Double-clicking the Destination component will open up the Restore Invalid Column References Editor.  In Available Columns, select the new Class_STR column to replace the Class column which was used previously.

That’s it, your SSIS package will now remove the trailing spaces from the Class column and store it in a column of char(1).

Excel Layout Change

Here’s another common issue when dealing with Excel SSIS imports.  If you’re not 100% in control of that Excel sheet, someone will someday make a structural change to it and it will cause your import process to fail.

In this really simple example I’ve opened up the Excel sheet and renamed the Color column to Colour.  Which is something that may happen in real life: a British person takes over the maintenance of that product list and sees that the Color column is spelled the wrong way, and corrects it without informing anyone.

What does that mean for our import?  Here’s the result when manually executing the package using DTExecUI.exe:

DTExecUI: Package Execution Progress

In this case execution will fail because the package cannot find the Color column in the Excel sheet.  More precisely it says VS_NEEDSNEWMETADATA.  This is a really common error when using Integration Services, but you need to manually update the package to handle such changes.

In other words: try to be in control of that sheet as much as you can, and if possible: set up another way to maintain such data.  For instance by using a Master Data Management system.  But that’s stuff for later on, in future articles.

Conclusion

As long as you’re aware of some of the common issues into which you may run, I still think the Import and Export Wizard is an interesting option to start your first SSIS package.  When running into an issue, I recommend to open up the SSIS package and have a closer look through the Business Intelligence Development Studio.  Then make any changes there so you can handle the errors.

Happy importing!

Valentino.

Share

Tags: , , ,

In a previous article I’ve shown you how to import data from an Excel sheet using the OPENROWSET() function.  And I concluded by stating that it’s not the best option when automating your data import.

Today I’ll repeat the Excel data import process by using SQL Server Integration Services, also known as SSIS.

I’ll be using SQL Server 2008 R2, but I’m quite sure that the process is very similar to the first release of 2008, and even to 2005.  The Excel file that I will be importing is the one used in my previous article, and I’ll also refer to some parts of that article, so you may want to have a read over that one when something here isn’t clear.

Furthermore I’m using a Windows 7 64-bit machine, with the ACE 14 driver (beta) installed.  To avoid any discussion about versions and for my own (future) reference, here’s the result of a SELECT @@VERSION:

Microsoft SQL Server 2008 R2 (CTP) – 10.50.1352.12 (X64)   Oct 30 2009 18:06:48   Copyright (c) Microsoft Corporation  Enterprise Evaluation Edition (64-bit) on Windows NT 6.1 <X64> (Build 7600: )

Create SSIS Package To Import Excel Data

Usually you will start by creating a new package in an Integration Services project, add an Excel source to a new Data Flow, throw in some Data Flow Transformations and end your flow with an OLE DB Destination connecting to your SQL Server.

But that’s not the approach that I’ll take in this article.  I’ll make use of a shortcut (and meanwhile I’m showing you how well integrated some components really are).

First, I’m creating a new database called ExcelImport, using the Management Studio (aka SSMS).  Once the database is created, right-click on it and choose Tasks > Import Data.

Start the Import Data Wizard through Management Studio

This will open up the SQL Server Import And Export Data Wizard.  Like all good wizards, it starts with a Welcome screen containing an introductory text about its purpose – something about “create simple packages that import and export data between many popular data formats including databases, spreadsheets” – and so on.  It also includes a checkbox that says “Do not show this starting page again”.  That’s my favorite object on the page :-)   Okay, I agree, the page is useful for people who have never seen the wizard before and who may have opened it up by accident, but that’s about as far as its use goes methinks.

So, do whatever you like with the checkbox and click Next.  That opens up the Choose a Data Source screen.

SQL Server Import and Export Wizard - Choose a Data Source

In that screen you have several options in the Data Source dropdown.  The one we’re interested in is called Microsoft Excel.  Once that option is selected, the controls further down the screen change into what is shown in the screenshot.  Select your file and the right version, in my case I’ve got an Excel 2007 file.  If your sheet contains a header row, activate the First row has column names checkbox.

Clicking Next will open up the Choose a Destination screen.

SQL Server Import and Export Wizard - Choose a Destination

In that screen, select Microsoft OLE DB Provider for SQL Server as Destination.  Ensure that your SQL Server instance is the right one in the Server Name dropdown and the Authentication is filled out as expected.  The correct database should be selected by default because we did a right-click on it to start the wizard.

Another Next click opens up the Specify Table Copy or Query window.

SQL Server Import and Export Wizard - Specify Table Copy or Query

Here we can choose between either retrieving the full table – all rows, all columns – or writing a query ourselves.  I’ll go for the first option.

Click Next once more to open the Select Source Tables and Views screen.  That name seems a bit weird in the context of an Excel import but I guess that’s not really important here.  Just read it as “Select the sheet that you’d like to import”.

SQL Server Import and Export Wizard - Select Source Table and Views 

I’ll go for the ProductList$ Source.  The sheet in my Excel file is called ProductList.  Note that the Destination is editable – I’m changing the destination table to dbo.ProductList (with the dollar sign removed).

If you’d like to view your data right now you can hit the Preview button.  It opens up a window such as this one:

SQL Server Import and Export Wizard - Select Source Table and Views - Preview Data 

That should look familiar.

A more interesting button is the one called Edit Mappings.  Clicking that button opens a screen that lets you change the destination table’s schema.  By default, all columns that seem to be numeric are mapped to a type of float and all the others are mapped to nvarchar with a length of 255.  Depending on your situation you can either leave it as it is (in case you’re just loading a staging table and want to handle data conversions later on) or you should review each column (for instance when you’re erasing and loading the full table every night and this is the actual table being used by other processes).

Note: whenever a column contains a blank cell, its type will be nvarchar(255), even when all other values are numeric.  Also, if you don’t need Unicode support, don’t use nvarchar but change it to varchar instead.

Here’s the Column Mappings screen with some of the defaults changed:

The Column Mappings window

Everything that I changed has been indicated using the yellow marker.  I’ve changed some field types and lenghts, made one field non-nullable and adapted the destination name.

Clicking OK followed by Next brings us to the following screen, Review Data Type Mapping.

SQL Server Import and Export Wizard - Review Data Type Mapping

This screen gives another overview of all the columns that we’re planning to import.  Note that each column of which we’ve changed the type has gotten a nice yellow warning sign.  This happens because a data conversion needs to occur, and there’s always something that can go wrong when converting data.  The On Error and On Truncation columns show the action that should happen when such an event occurs.  We’ll leave them all at Use Global.  The Global settings are located at the bottom of the screen and are both set to Fail.  That’s the best option at the moment (the only other one is Ignore but that means you won’t get any notification in case of an error or truncation problem).

After clicking Next we end up at the Save and Run Package window.

SQL Server Import and Export Wizard - Save and Run Package

I have activated the Save SSIS Package checkbox and chose the Do not save sensitive data option.  By default it is saved in SQL Server, which is actually the MSDB.  That’s good because we’re going to examine the contents of the package later on so we want to keep it.

The Run immediately checkbox was activated by default.  This will execute the package in the last step of the wizard.

Another Next click and we’re on the Save SSIS Package screen.

SQL Server Import and Export Wizard - Save SSIS Package

Here we can give our package a decent name, such as ExcelImport.  You can also see the server on which I’m saving the package.

The final Next click brings us to the Complete the Wizard screen, woohoo!

 SQL Server Import and Export Wizard - Complete the Wizard

We can see a short overview of what we’ve configured in the previous steps.  Click Finish to execute and save the package!

SQL Server Import and Export Wizard - The execution was successful

And we’ve successfully executed the package!

Taking A Closer Look At The SSIS Package

When connecting to the Integration Services server (through SSMS for instance), I now have a new package in the root of the MSDB folder.

The ExcelImport SSIS package, stored in MSDB

It’s located there because I chose to save the package to SQL Server.

Adding An Existing Package To An SSIS Project

We’re now going to open it in the Business Intelligence Development Studio (aka BIDS).  So, open the BIDS and create a new SSIS Project (or use an existing one, doesn’t really matter).

Once the project is open, right-click the SSIS Packages folder in the Solution Explorer and select Add Existing Package.

Right-click SSIS Packages folder in Solution Explorer to Add Existing Package

That opens up the Add Copy of Existing Package window.

Sidenote: do you see that SSIS Import and Export Wizard option in the previous screenshot?  That’s right, the wizard that we’ve used extensively in the earlier part of this article can be launched from here as well.

Add Copy of Existing Package

Select SSIS Package Store as location of the package and enter the name of your server in the second dropdown.  Once that is done you can click the button with the ellipsis and select your package under the MSDB node.

Clicking OK will add the package to your project in the Solution Explorer.  Double-click it to open it up.

The Control Flow

In the Control Flow you can see two components: a SQL Task that contains a CREATE TABLE statement and a Data Flow.

Here’s what the CREATE TABLE statement looks like:

CREATE TABLE statement in the Execute SQL Task

As you can see, the table is created using the column names and types just like we configured them through the wizard.

Important to note here is that the Control Flow does not take anything else into account.  For instance, what happens if we execute the package twice?  It will fail because the table already exists!

In case you don’t believe me, just try it out!

Execute SQL Statement failed There is already an object named 'ProductList' in the database. 

The Data Flow

Opening up the Data Flow we see that it contains three components: an Excel source component that uses an Excel Connection Manager to connect to our now well-known Excel sheet, a Data Conversion Transformation to take care of the conversions that we requested and an OLE DB Destination that uses an OLE DB Connection Manager to connect to our SQL Server.

The Data Flow to transfer Excel data into SQL Server

Important to note here is that whenever an issue occurs, such as a conversion problem, the flow will fail.

In production environments, certain events need to be taken into account.  The purpose of this article was just to show you how you can use a wizard to generate an SSIS package for you.  You can now use this package as the basis for a well-developed Excel Import template.

Conclusion

With this article I hope to have shown you how to use Integration Services to import Excel data, and also that the Management Studio knows how to use other SQL Server components, such as SSIS, quite well.

If you’re running into some issues while using the wizard, or you just like reading what I write, check out my follow-up article covering some common pitfalls.

Need to go to sleep now, long drive tomorrow, PASS European Conference in Germany!  I do hope that Adam Saxton will be there because I was planning to see his presentations all day long.  I already read that one of the speakers – Brent Ozar – won’t be able to make it.  Darned ash cloud…  Next time someone starts talking to me about Azure I’ll run away screaming.

Just kidding :-)

Have fun!

Valentino.

References

Microsoft Support: How to import data from Excel to SQL Server

How to: Run the SQL Server Import and Export Wizard

Share

Tags: , , , ,

The purpose of this article is to demonstrate how to retrieve data from an Excel sheet and put it in a table in a SQL Server database.

Introduction

Anyone who’s ever used a computer for a significant amount of time has probably come into contact with Excel, the spreadsheet application part of the Microsoft Office suite. Its main purposes are to perform calculations and create charts and pivot tables for analysis.

But people have great imagination and invent new uses for it every day.  I’ve even seen it used as a picture album.  (Sorry dad, but I know you won’t be reading this anyway. :-) )  Ever since he had this specific YACI, or “Yet Another Computer Issue”, because his PC wasn’t powerful enough to open his 45 MB Excel file, uh, “picture collection”, he took some evening classes.  He’s now putting his Photoshopped pictures in PowerPoint…  Anyway, let’s get back on track now.

Another use, and the one that’s the subject of this article, is when Excel has been used as a database.  Come on, you know what I’m talking about, with the first row containing the column headers followed by possibly thousands of data rows.  The following screenshot contains an example, and is also the file that I will be using in this article.  I took all records from the Production.Product table in the AdventureWorks 2008R2 database and dumped them in Excel.

An Excel sheet used as a data store

At some point people will realize, either because someone told them or because they lost some data due to inattentiveness, that it wasn’t a really good idea to keep all that data in an Excel sheet.  And they’ll ask you to put it in a real database such as SQL Server.

That’s what I’m going to show you in the next paragraphs: how to import data from Excel into SQL Server.

Using OPENROWSET() To Query Excel Files

There are actually several different ways to achieve this.  In this article I will use the OPENROWSET() function.  This is a T-SQL function that can be used to access any OLE DB data source.  All you need is the right OLE DB driver.  The oldest version which I could confirm that contains this function is SQL Server 7.0, good enough to say that any version supports it.

My sample Excel files are located in C:\temp\.  This folder contains two files: Products.xls and Products.xlsx.  The first file is saved in the old format, Excel 97-2003, while the second file was saved from Excel 2010.  Both files contain the same data.  The sheet containing the list of products is called ProductList.

And here are the queries:

--Excel 2007-2010
SELECT * --INTO #productlist
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0 Xml;HDR=YES;Database=C:\temp\Products.xlsx',
    'SELECT * FROM [ProductList$]');

--Excel 97-2003
SELECT * --INTO #productlist
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
    'Excel 8.0;HDR=YES;Database=C:\temp\Products.xls',
    'select * from [ProductList$]');

These queries are just returning the data from the Excel file into the Results window, when executed using the Management Studio.  To insert the data into a table, uncomment the INTO clause.  When uncommented, the statement retrieves the data from the Excel sheet and puts it into a newly-created local temporary table called #productlist.

Furthermore, the query assumes that the first row contains the header.  If that’s not the case, replace HDR=YES with HDR=NO.

Note: if you get an error message when running the query, look further down in this article.  I’ve covered a couple of them.

With the INTO clause uncommented and the query executed, the temporary table can now be queried just like any other table:

SELECT * FROM #productlist

What Type Is Your Data?

Let’s have a look if this method of using a SELECT INTO in combination with OPENROWSET and a temporary table is smart enough to interpret the correct data types of the data coming in.  Use the following command to describe the metadata of the temporary table:

USE tempdb;
GO
sp_help '#productlist';

Because a temporary table is stored in the tempdb, the sp_help command should be issued against that database.

Here’s the part of the output in which we’re interested:

The data types used when combining OPENROWSET with SELECT INTO

As you can see, anything that looks like text will be put in a field of type nvarchar(510) and anything that looks like a number (integers, floating-point numbers, datetime values, …) is put into a float(53).  Not a lot of intelligence there.  This is the result when no formatting was put on the cells in Excel.

As an experiment I’ve changed the format of some fields in the Excel file and then retried the SELECT INTO statement.  What did I change?  I identified ProductID as being a number without any decimals, changed StandardCost and ListPrice to a currency with four decimal digits and I changed SellStartDate and SellEndDate to a custom date/time format showing both date and time.

The effect on the table creation was not completely as I would have expected:

SELECT INTO with some field types changed

ProductID is still being stored into a float field, even though in Excel it’s defined as having no decimals.  And the datetime values are not recognized either.  Okay, I used a custom format there, so maybe it’s due to that.

It’s up to you of course how you use this method of importing the data.  You can put your records into a temporary table to process further, or you can create a table with the expected data types upfront and import the data directly into that one.

Some Possible Issues

Let’s cover some issues related to this method.

Enable ‘AD Hoc Distributed Queries’

The OPENROWSET() function expects that the ‘Ad Hoc Distributed Queries’ option is enabled on the server.  When that’s not the case you’ll see the following message:

Msg 15281, Level 16, State 1, Line 1

SQL Server blocked access to STATEMENT ‘OpenRowset/OpenDatasource’ of component ‘Ad Hoc Distributed Queries’ because this component is turned off as part of the security configuration for this server. A system administrator can enable the use of ‘Ad Hoc Distributed Queries’ by using sp_configure. For more information about enabling ‘Ad Hoc Distributed Queries’, see “Surface Area Configuration” in SQL Server Books Online.

This is one of the advanced options.  To enable it you can use the following command:

sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO

sp_configure 'Ad Hoc Distributed Queries', 1;
GO
RECONFIGURE;
GO

To get a good look at all the different settings, just run the sp_configure procedure without any parameters.

Note: if you’re not the administrator of the server, you should talk to the DBA who’s responsible before attempting this.

The File Needs To Be Closed

When the Excel file is not closed, you’ll end up with the following error:

Msg 7399, Level 16, State 1, Line 1

The OLE DB provider “Microsoft.Jet.OLEDB.4.0″ for linked server “(null)” reported an error. The provider did not give any information about the error.

Msg 7303, Level 16, State 1, Line 1

Cannot initialize the data source object of OLE DB provider “Microsoft.Jet.OLEDB.4.0″ for linked server “(null)”.

So close the file and try the query again.

OLE DB Driver Not Installed

The OPENROWSET() function uses OLE DB, so it needs a driver for your data source, in this case for Excel.  If the right driver is not installed, you’ll see the following error (or similar, depends on the version used).

Msg 7302, Level 16, State 1, Line 1

Cannot create an instance of OLE DB provider “Microsoft.ACE.OLEDB.12.0″ for linked server “(null)”.

To solve the issue, install the right driver and try again.

How can you tell what drivers are installed?  Open up the ODBC Data Source Administrator window (Start > Run > type ODBCAD32.EXE and enter) and have a look in the Drivers tab.  The following screenshot (taken on a Dutch Windows XP) shows both the JET 4.0 driver for Excel 97-2003 and the fairly-new ACE driver for Excel 2007.

odbcad32.exe - ODBC Data Source Administrator

The drivers can be downloaded from the following pages on the Microsoft site:

Excel 97-2003 Jet 4.0 driver

Excel 2007 ACE driver – 12.00.6423.1000

Excel 2010 ACE driver (beta) – 14.00.4732.1000

Sidenote: the Excel 2010 driver is not supported on Windows XP, but I was able to query the 2010 Excel sheet using the 2007 driver.  I guess that this is the result of the Office Open XML standard which was introduced in Office 2007.

Driver backward-compatibility

The ACE drivers are backwards-compatible.  So the following queries are working perfectly:

--old Excel with new ACE driver - working query 1
SELECT * --INTO #productlist
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 8.0;HDR=YES;Database=C:\temp\Products.xls',
    'SELECT * FROM [ProductList$]');

--old Excel with new ACE driver - working query 2
SELECT * --INTO #productlist
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0;HDR=YES;Database=C:\temp\Products.xls',
    'SELECT * FROM [ProductList$]');

In other words, you won’t be needing that first link for the Jet driver.  For the full story have a look at this blog post by Adam Saxton of the CSS SQL Server Escalation Services team.

The 64-bit Story

So, what if you’re running a 64-bit OS?  I’ll start by saying that I had quite some issues getting OPENROWSET to work, but finally I managed it.  Following is a list of my attempts, each time with the resulting message.  And finally I’ll show you how I got it to work.  The problem was something really unexpected…

ACE 14 64-bit through SSMS

My main laptop is running Windows 7 64-bit, Office 2010 64-bit and SQL Server 2008 R2 64-bit.  So I installed the 64-bit version of the ACE 14 driver, which happens to be the first OLE DB driver for Excel that ships in 64-bit.  But when I execute my query I’m getting the following message:

Msg 7403, Level 16, State 1, Line 1

The OLE DB provider “Microsoft.ACE.OLEDB.14.0″ has not been registered.

Is this because SSMS ships only in 32-bit?  Maybe, but I’m not able to install the 32-bit driver.  It doesn’t allow me to because I’ve got Office in 64-bit installed.  The installer throws me the following error:

Microsoft Access database engine 2010 (beta) - You cannot install the 32-bit version of Access Database engine for Microsoft Office 2010 because you currently have 64-bit Office products installed...

ACE 12 32-bit on a 64-bit machine

When I check the installed drivers using the 32-bit version of the ODBC Data Source Administrator (located in C:\Windows\SysWOW64), I notice that the ACE 12 driver is installed.  However, trying to use that one from the Management Studio gives me this:

Msg 7399, Level 16, State 1, Line 1

The OLE DB provider “Microsoft.ACE.OLEDB.12.0″ for linked server “(null)” reported an error. The provider did not give any information about the error.

Msg 7330, Level 16, State 2, Line 1

Cannot fetch a row from OLE DB provider “Microsoft.ACE.OLEDB.12.0″ for linked server “(null)”.

The Results pane shows all the columns with the right column names, retrieved from Excel.  But the driver seems to have a problem retrieving the actual data.

This issue with error 7330 is mentioned in the following thread on the SQL Server MSDN forum, but unfortunately the proposed solution didn’t solve the problem in my case.

64-bit SQLCMD using ACE 14 driver

I also tried using the 64-bit version of sqlcmd.exe, but strangely enough that throws the same error.

Using sqlcmd 64-bit to query Excel

I actually expected this last method to work, after all, everything is now running in 64-bit.  But alas, it didn’t…

One more go…

After some more trial and error, I have actually found a way to get the query to work.  I don’t have a logical explanation on why it’s behaving the way it is, but, well, it is working…

This query is running fine:

SELECT * --INTO #productlist
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0 Xml;HDR=YES;Database=C:\temp\Products.xlsx',
    'SELECT * FROM [ProductList$]');

But this one isn’t:

--Excel 2007-2010
SELECT * --INTO #productlist
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0 Xml;HDR=YES;Database=C:\temp\Products.xlsx',
    'SELECT * FROM [ProductList$]');

It’s exactly the same query, only difference is the comment line at the start.  And even weirder, if I add a space after the double-dash, the query works fine as well!

Then I decided to remove the commented INTO clause.  This made the weird behavior disappear.  So for some reason SQL Server doesn’t like the OPENROWSET function combined with comments inside the query.  The strange behavior also disappears when a space is added between the double-dash and the INTO keyword.

Uh, computers can be so much fun, right? :-)

If anyone has got an explanation on this strange behavior: please do post a comment!  For now my conclusion is: don’t use comments when creating an OPENROWSET query.

IMPORTANT UPDATE (April 11, 2010): it seems that the current installer for the ACE 14 driver contains a bug and registers it as being “Microsoft.ACE.OLEDB.12.0” instead of “Microsoft.ACE.OLEDB.14.0” .  This explains some of the issues shown above.  Some evidence on the issue:

Microsoft Connect: Access Database Engine 2010 installation issue to use with ADO access technology to access data from Jet database (.mdb files)

The ‘Microsoft.ACE.OLEDB.14.0′ provider is not registered ….. (see last comment)

Excel Services, ODC and Microsoft.ACE.OLEDB.14.0

Conclusion

The above has shown that OPENROWSET() can be a useful function, given the right circumstances.  But in the wrong setting it can be quite cumbersome to get to work.

I would recommend this method only for one-off quick imports, such as when you as a developer are given a bunch of data in a spreadsheet and need to get it into the database, one way or another.  I would not use it for an automated import process.  For that we’ve got a more interesting alternative which I’ll cover in an upcoming article.

Have fun!

Valentino.

References

BOL 2008: Special Table Types (incl. temporary tables)

BOL 2008: OPENROWSET() function

BOL 2008: the INTO clause

CSS SQL Server Engineers: How to get a x64 version of Jet?

Share

Tags: , , ,

« Older entries

© 2008-2017 BI: Beer Intelligence? All Rights Reserved