I want some Moore

Blog about stuff and things and stuff...
mostly about SQL server and .Net
posts - 161, comments - 1487, trackbacks - 33

My Links

SQLTeam.com Links

News

Hi! My name is 
Mladen Prajdić  I'm from Slovenia and I'm currently working as a .Net (C#) and SQL Server developer. I'm also a MCP and MCTS for SQL Server. 
Welcome to my blog.

Search this Blog
 

My Blog Feed via Email


Get your Google PageRank
Users Online: who's online

Article Categories

Archives

Post Categories

Cool software

Other Blogs

Other stuff

SQL stuff

More on Flat File Bulk Import methods speed comparison in SQL Server 2005

Well i've done some more testing of bulk import methods i described in my previous post. The difference between previsous tests and these tests are these:

1. Database recovery model was set to Bulk logged. (This improved performance by 2-3 seconds for each method)
2. I tried BULK INSERT and BCP with and without TABLOCK option.
3. I also measured general processor utilization and RAM. IO reads and writes were similar in all methods.

everything else was the same.

Method Time (seconds) CPU utilization (%) RAM change (MB)
Bulk insert with Tablock 4648 75 +5MB
SSIS FastParse 5812 75 +30Mb
BCP with Tablock 6300 100 +5MB
Bulk insert 7812 75 +5MB
OpenRowset 8750 95 +5MB
BCP 11250 100 +5MB

SSIS also has a bulk insert tablock option set to true in the SQL Server destination. So my guess is that approx 1 second overhead comes from package startup time. However SSIS  uses more memory than Bulk insert.

So if you can use TABLOCK option Bulk insert is the way to go. If not SSIS is.

It must be taken into account that the the test flat file i used had only 4 integer columns and you could set the FastParse option to all of them. In real world there are no such ideal conditions. And since i had only integer type i don't have to specify collations.

Modified scripts are:

BULK INSERT testBulkInsert
FROM     'd:\work\test.txt'
WITH (
    FORMATFILE='d:\work\testImport-f-n.Fmt',
    TABLOCK     
)

insert into testOpenRowset(c1, c2, c3, c4)
SELECT    t1.c1, t1.c2, t1.c3, t1.c4
FROM    OPENROWSET
    ( 
        BULK 'd:\work\test.txt', 
        FORMATFILE = 'd:\work\testImport-f-n.Fmt'
    ) AS t1(c1, c2, c3, c4);

exec master..xp_cmdshell 'bcp test.dbo.testBCP in d:\work\test.txt -T -b1000000 -fd:\work\testImport-f-n.Fmt -h"tablock"'
kick it on DotNetKicks.com

Print | posted on Saturday, July 22, 2006 3:05 PM

Feedback

# Bulk Insert時の性能比較(オプションの有効活用がポイント)

Bulk Insert??????(???????????????)
7/23/2006 9:40 AM | 米田 Blog ( SQL Server MEMO )

# re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005

Mladen,

The BULK INSERT and BCP tests are not equivalent. Two differences:

A) You are setting number of rows per batch in the BCP but not in BULK INSERT. Try removing that from the BCP (assuming that you're inserting more than one million rows -- if not, it won't make a difference)

B) You didn't set TABLOCK mode in the BCP. To do that, add the option: -h"tablock"

7/24/2006 12:17 AM | Adam Machanic

# re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005

acctually rows per batch worked a little strange for me.

In bcp if i didn't specify it, it automatically took 1000 rows per batch. So i had to set it to 1mil.

In bulk insert i didn0t specify it because it took all 1mil as one batch.

i acctually did specify it at first but saw no difference so i omitted it.

as for BCP with tablock (thanx for the tip)
it takes approx 6300 ms with 100% cpu utilization and +5 MB RAM increase.

I'll add it to the list tommorow.
7/24/2006 12:31 AM | Mladen

# re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005

Hi,

Any idea where a solution using ADO.NET 2.0 SqlBulkCopy
[http://msdn2.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.aspx] reading in the same flat-file would fit in?

I do not see an option to set TabLock, so I imagine, it would be the slowest option, right?

Would there be a way to improve the performance if the [C#] code would spin up multiple threads, and divide the work among the threads (given that the machine is a multi CPU box). Or will SQL Server then serialize the work?

[I will in the future (2-3 months) need to develop something like this.]
8/25/2006 6:13 PM | Rudi

# re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005

sorry but i have no idea about proformance of SqlBulkCopy.
i'd be very surprised to see it beat BCP with tablock option.
multiple threads in c# will probably work.
your biggest problem will be network bandwidth.
SQL will serialize work as it sees fit.

but i'll try to test this method on the weekend and post the results if i find time. :)
8/25/2006 6:33 PM | Mladen

# re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005

Rudi:
if you haven't seen it yet take a look here:
http://weblogs.sqlteam.com/mladenp/archive/2006/08/26/11368.aspx

tablock can be specified in the constructor of the SqlBulkCopy class.
8/27/2006 1:11 PM | Mladen

# re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005

Mladen,

Thanks for the results and the time!!! Very much appreciated.

At least now we do know the relative cost of using a full C# solution compared to the native SQL solutions.
8/28/2006 9:30 AM | Rudi

# re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005

You're welcome. :)
8/28/2006 9:37 AM | Mladen

# re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005

Just had to share this find :
Import 1TB in 1 Hour
http://blogs.msdn.com/sqlcat/archive/2006/05/19/602142.aspx
10/13/2006 9:54 PM | rockmoose

# re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005

Very nice!!
Thanx moose.
10/14/2006 3:14 PM | Mladen

# re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005

Thanks for the results and the time!!! Very much appreciated.
3/27/2008 5:52 PM | Freebie

# re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005

Thanks for the post Mladen! I've been wondering which method was fastest. Glad to see that bulk insert is the fastest, especially since a lot of hosting companies don't allow xp_cmdshell.

Jason
9/1/2008 6:42 PM | Jason

Post Comment

Title  
Name  
Email
Url
Comment   
Please add 1 and 4 and type the answer here:

Powered by: