Avoid looping when inserting data

Sometimes you’re in a situation where your stored procedure needs to insert several new records and some fields are based on calculations, or one of the fields contains a kind of batch number that needs to increase for every record where another certain field contains the same value.

In this situation you may have a habit of building a loop to insert the records, such as the following:

(I’ve used the Person.Address table from the AdventureWorks DB for demonstration purposes – it doesn’t contain too many fields :-) )

declare @counter int;
set @counter = 1;
while @counter <= 10000
begin
    insert into Person.Address (AddressLine1, AddressLine2, City, 

        StateProvinceID, PostalCode, ModifiedDate)
    values ('the first address line', 'second add line', 'some city', 1,
        'PC-0000' + cast(@counter as varchar(10)), getdate());
    set @counter = @counter + 1;
end

Executing the loop above takes 3.5 seconds on my laptop.  This is not very long, but it can be better.

If you think about this, what is going on in those 3.5 seconds?  Not only are we inserting 10000 records, every index in that table is getting 10000 updates.  Besides the clustered index there are 3 additional indexes.  In total this makes 40000 changes that are being processed on the table.  And that’s not all, every insert happens inside a transaction, so 10000 transactions.  SQL Server needs to keep track of what has been modified, so writes to the transaction log are going on as well.

Now, how can we avoid this?  The answer lies in the table variable!  Create a table variable that will hold your new records, loop to create the records in the table variable and then, after the loop, insert all records from the table variable into your actual table.

-- declaration of table var 
declare @table table
(
    [AddressLine1] [nvarchar](60) NOT NULL,
    [AddressLine2] [nvarchar](60) NULL,
    [City] [nvarchar](30) NOT NULL,
    [StateProvinceID] [int] NOT NULL,
    [PostalCode] [nvarchar](15) NOT NULL,
    [ModifiedDate] [datetime] NOT NULL
);
declare @counter int;
set @counter = 1; 

while @counter <= 10000
begin
    -- insert in table var instead of actual table  
   insert into @table (AddressLine1, AddressLine2, City, 

        StateProvinceID, PostalCode, ModifiedDate)
    values ('the first address line', 'second add line', 'some city', 1, 

        'PC-0000' + cast(@counter as varchar(10)), getdate());
    set @counter = @counter + 1;
end 

-- insert in actual table 
insert into Person.Address (AddressLine1, AddressLine2, City, 

    StateProvinceID, PostalCode, ModifiedDate)
select * from @table;

This now executes in 453 milliseconds, isn’t that great?  That’s an improvement of about 700%!

A good reason for this is that we now reduced the number of transactions to 1 because the data is being inserted in one go.  And the changes to the indexes are happening all at once, which should be better as well.

In my opinion, in specific situations this can be an interesting concept to optimize your code performance.

See Books Online for more info on the table variable itself: http://msdn.microsoft.com/en-us/library/ms175010.aspx

Share

Tags: , , ,

© 2008-2012 A Developer's Blog All Rights Reserved