More on Flat File Bulk Import methods speed comparison in SQL Server 2005
Well i've done some more testing of bulk import methods i described in my previous post. The difference between previsous tests and these tests are these:
1. Database recovery model was set to Bulk logged. (This improved performance by 2-3 seconds for each method)
2. I tried BULK INSERT and BCP with and without TABLOCK option.
3. I also measured general processor utilization and RAM. IO reads and writes were similar in all methods.
everything else was the same.
Method | Time (miliseconds) | CPU utilization (%) | RAM change (MB) |
Bulk insert with Tablock | 4648 | 75 | +5MB |
SSIS FastParse | 5812 | 75 | +30Mb |
BCP with Tablock | 6300 | 100 | +5MB |
Bulk insert | 7812 | 75 | +5MB |
OpenRowset | 8750 | 95 | +5MB |
BCP | 11250 | 100 | +5MB |
SSIS also has a bulk insert tablock option set to true in the SQL Server destination. So my guess is that approx 1 second overhead comes from package startup time. However SSIS uses more memory than Bulk insert.
So if you can use TABLOCK option Bulk insert is the way to go. If not SSIS is.
It must be taken into account that the the test flat file i used had only 4 integer columns and you could set the FastParse option to all of them. In real world there are no such ideal conditions. And since i had only integer type i don't have to specify collations.
Modified scripts are:
BULK INSERT testBulkInsert FROM 'd:\work\test.txt' WITH ( FORMATFILE='d:\work\testImport-f-n.Fmt', TABLOCK )insert into testOpenRowset(c1, c2, c3, c4) SELECT t1.c1, t1.c2, t1.c3, t1.c4 FROM OPENROWSET ( BULK 'd:\work\test.txt', FORMATFILE = 'd:\work\testImport-f-n.Fmt' ) AS t1(c1, c2, c3, c4);
exec master..xp_cmdshell 'bcp test.dbo.testBCP in d:\work\test.txt -T -b1000000 -fd:\work\testImport-f-n.Fmt -h"tablock"'
Legacy Comments
Adam Machanic
2006-07-24 |
re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005 Mladen, The BULK INSERT and BCP tests are not equivalent. Two differences: A) You are setting number of rows per batch in the BCP but not in BULK INSERT. Try removing that from the BCP (assuming that you're inserting more than one million rows -- if not, it won't make a difference) B) You didn't set TABLOCK mode in the BCP. To do that, add the option: -h"tablock" |
Mladen
2006-07-24 |
re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005 acctually rows per batch worked a little strange for me. In bcp if i didn't specify it, it automatically took 1000 rows per batch. So i had to set it to 1mil. In bulk insert i didn0t specify it because it took all 1mil as one batch. i acctually did specify it at first but saw no difference so i omitted it. as for BCP with tablock (thanx for the tip) it takes approx 6300 ms with 100% cpu utilization and +5 MB RAM increase. I'll add it to the list tommorow. |
Rudi
2006-08-25 |
re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005 Hi, Any idea where a solution using ADO.NET 2.0 SqlBulkCopy [http://msdn2.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.aspx] reading in the same flat-file would fit in? I do not see an option to set TabLock, so I imagine, it would be the slowest option, right? Would there be a way to improve the performance if the [C#] code would spin up multiple threads, and divide the work among the threads (given that the machine is a multi CPU box). Or will SQL Server then serialize the work? [I will in the future (2-3 months) need to develop something like this.] |
Mladen
2006-08-25 |
re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005 sorry but i have no idea about proformance of SqlBulkCopy. i'd be very surprised to see it beat BCP with tablock option. multiple threads in c# will probably work. your biggest problem will be network bandwidth. SQL will serialize work as it sees fit. but i'll try to test this method on the weekend and post the results if i find time. :) |
Mladen
2006-08-27 |
re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005 Rudi: if you haven't seen it yet take a look here: http://weblogs.sqlteam.com/mladenp/archive/2006/08/26/11368.aspx tablock can be specified in the constructor of the SqlBulkCopy class. |
Rudi
2006-08-28 |
re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005 Mladen, Thanks for the results and the time!!! Very much appreciated. At least now we do know the relative cost of using a full C# solution compared to the native SQL solutions. |
Mladen
2006-08-28 |
re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005 You're welcome. :) |
rockmoose
2006-10-13 |
re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005 Just had to share this find : Import 1TB in 1 Hour http://blogs.msdn.com/sqlcat/archive/2006/05/19/602142.aspx |
Mladen
2006-10-14 |
re: More on Flat File Bulk Import methods speed comparison in SQL Server 2005 Very nice!! Thanx moose. |