Mladen Prajdić Blog

Blog about stuff and things and stuff. Mostly about SQL server and .Net

More on Flat File Bulk Import methods speed comparison in SQL Server 2005

Well i've done some more testing of bulk import methods i described in my previous post. The difference between previsous tests and these tests are these:

1. Database recovery model was set to Bulk logged. (This improved performance by 2-3 seconds for each method)
2. I tried BULK INSERT and BCP with and without TABLOCK option.
3. I also measured general processor utilization and RAM. IO reads and writes were similar in all methods.

everything else was the same.

Method Time (miliseconds) CPU utilization (%) RAM change (MB)
Bulk insert with Tablock 4648 75 +5MB
SSIS FastParse 5812 75 +30Mb
BCP with Tablock 6300 100 +5MB
Bulk insert 7812 75 +5MB
OpenRowset 8750 95 +5MB
BCP 11250 100 +5MB

SSIS also has a bulk insert tablock option set to true in the SQL Server destination. So my guess is that approx 1 second overhead comes from package startup time. However SSIS  uses more memory than Bulk insert.

So if you can use TABLOCK option Bulk insert is the way to go. If not SSIS is.

It must be taken into account that the the test flat file i used had only 4 integer columns and you could set the FastParse option to all of them. In real world there are no such ideal conditions. And since i had only integer type i don't have to specify collations.

Modified scripts are:

BULK INSERT testBulkInsert
FROM     'd:\work\test.txt'
WITH (
    FORMATFILE='d:\work\testImport-f-n.Fmt',
    TABLOCK     
)

insert into testOpenRowset(c1, c2, c3, c4) SELECT t1.c1, t1.c2, t1.c3, t1.c4 FROM OPENROWSET ( BULK 'd:\work\test.txt', FORMATFILE = 'd:\work\testImport-f-n.Fmt' ) AS t1(c1, c2, c3, c4);

exec master..xp_cmdshell 'bcp test.dbo.testBCP in d:\work\test.txt -T -b1000000 -fd:\work\testImport-f-n.Fmt -h"tablock"'

kick it on DotNetKicks.com

Legacy Comments


Adam Machanic
2006-07-24
re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005
Mladen,

The BULK INSERT and BCP tests are not equivalent. Two differences:

A) You are setting number of rows per batch in the BCP but not in BULK INSERT. Try removing that from the BCP (assuming that you're inserting more than one million rows -- if not, it won't make a difference)

B) You didn't set TABLOCK mode in the BCP. To do that, add the option: -h"tablock"


Mladen
2006-07-24
re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005
acctually rows per batch worked a little strange for me.

In bcp if i didn't specify it, it automatically took 1000 rows per batch. So i had to set it to 1mil.

In bulk insert i didn0t specify it because it took all 1mil as one batch.

i acctually did specify it at first but saw no difference so i omitted it.

as for BCP with tablock (thanx for the tip)
it takes approx 6300 ms with 100% cpu utilization and +5 MB RAM increase.

I'll add it to the list tommorow.

Rudi
2006-08-25
re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005
Hi,

Any idea where a solution using ADO.NET 2.0 SqlBulkCopy
[http://msdn2.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.aspx] reading in the same flat-file would fit in?

I do not see an option to set TabLock, so I imagine, it would be the slowest option, right?

Would there be a way to improve the performance if the [C#] code would spin up multiple threads, and divide the work among the threads (given that the machine is a multi CPU box). Or will SQL Server then serialize the work?

[I will in the future (2-3 months) need to develop something like this.]

Mladen
2006-08-25
re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005
sorry but i have no idea about proformance of SqlBulkCopy.
i'd be very surprised to see it beat BCP with tablock option.
multiple threads in c# will probably work.
your biggest problem will be network bandwidth.
SQL will serialize work as it sees fit.

but i'll try to test this method on the weekend and post the results if i find time. :)

Mladen
2006-08-27
re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005
Rudi:
if you haven't seen it yet take a look here:
http://weblogs.sqlteam.com/mladenp/archive/2006/08/26/11368.aspx

tablock can be specified in the constructor of the SqlBulkCopy class.

Rudi
2006-08-28
re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005
Mladen,

Thanks for the results and the time!!! Very much appreciated.

At least now we do know the relative cost of using a full C# solution compared to the native SQL solutions.

Mladen
2006-08-28
re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005
You're welcome. :)

rockmoose
2006-10-13
re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005
Just had to share this find :
Import 1TB in 1 Hour
http://blogs.msdn.com/sqlcat/archive/2006/05/19/602142.aspx

Mladen
2006-10-14
re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005
Very nice!!
Thanx moose.