I want some Moore

Blog about stuff and things and stuff. Mostly about SQL server and .Net
posts - 219, comments - 2287, trackbacks - 33

My Links

Advertisement

News

Hi! My name is 
Mladen Prajdić  I'm from Slovenia and I'm currently working as a .Net (C#) and SQL Server developer.

I also speak at local user group meetings and conferences like SQLBits and NT Conference
Welcome to my blog.
SQL Server MVP

My Books

SQL Server MVP Deep Dives 2
The Red Gate Guide to SQL Server Team based Development Free e-book

My Blog Feed via Email
Follow MladenPrajdic on Twitter


Users Online: who's online

Article Categories

Archives

Post Categories

Cool software

Other Blogs

Other stuff

SQL stuff

Flat File Bulk Import methods speed comparison in SQL Server 2005

In a previous post I explained where an SSIS FastParse option can be found
when importing data from Flat files.
I finally had some time to explore all four of SQL Server 2005 Bulk import utilities.
Those are:

1. BCP
2. Bulk Insert
3. OpenRowset with BULK option
4. SQL Server Integration Services - SSIS

I ran each bulk import option 12 times, disregarded best and worst time and averaged the remaining ten times.
Results are:

1. SSIS - FastParse ON = 7322 ms
2. SSIS - FastParse OFF = 8387 ms
3. Bulk Insert = 10534 ms
4. OpenRowset = 10687 ms
5. BCP = 14922 ms

So speed gain is quite large when using FastParse.
I was also surprised that SSIS - FastParse OFF method was faster by 20% to Bulk Insert and OpenRowset
and around 40% faster than BCP.

Since my desire was to test how much faster is importing flat files when FastParse option is used
I created a text file containing 4 bigint columns with 1,000,000 rows.

The script i used to create a sample test file in C#:

string str;
StreamWriter sw = new StreamWriter(@"d:\work\test.txt");
for (int i = 1; i <= 1000000; i++)
{
    str = i.ToString() + "|" + Convert.ToString(i * 2) + "|" + Convert.ToString(i * 3) + "|" + Convert.ToString(i / 2);
    sw.WriteLine(str);
}
sw.Close();
I also created this format file for use with BCP, Bulk Insert and OpenRowset:
9.0
4
1       SQLBIGINT        0       8       "|"   1     c1       ""
2       SQLBIGINT        0       8       "|"   2     c2       ""
3       SQLBIGINT        0       8       "|"   3     c3       ""
4       SQLBIGINT        0       8       "\r\n"   4     c4       ""

SSIS Package was a very simple one with a Flat File source and SQL server destination objects.

The sql script i used is:

create database test
go
use test
go
-- ran for each SSIS test run
-- SSIS data type for each column was "eight-byte signed integer [DT_I8]"
drop table testFastParse
create table testFastParse(c1 bigint, c2 bigint, c3 bigint, c4 bigint)
go
-- insert data using OPENROWSET 
create table testOpenRowset(c1 bigint, c2 bigint, c3 bigint, c4 bigint)
go
DBCC DROPCLEANBUFFERS
declare @start datetime
set @start = getdate()
insert into testOpenRowset(c1, c2, c3, c4)
SELECT    t1.c1, t1.c2, t1.c3, t1.c4
FROM    OPENROWSET( BULK 'd:\work\test.txt', 
        FORMATFILE = 'd:\work\testImport-f-n.Fmt') AS t1(c1, c2, c3, c4);
select  getdate() - @start as ElapsedTime
drop table testOpenRowset
-- insert data using Bulk Insert
create table testBulkInsert(c1 bigint, c2 bigint, c3 bigint, c4 bigint)
go
DBCC DROPCLEANBUFFERS
declare @start datetime
set @start = getdate()
BULK INSERT testBulkInsert
   FROM 'd:\work\test.txt'
   WITH (FORMATFILE='d:\work\testImport-f-n.Fmt')
select  getdate() - @start as ElapsedTime
drop table testBulkInsert
go
-- insert data using BCP
create table testBCP(c1 bigint, c2 bigint, c3 bigint, c4 bigint)
go
DBCC DROPCLEANBUFFERS
exec master..xp_cmdshell 'bcp test.dbo.testBCP in d:\work\test.txt -T -b1000000 -fd:\work\testImport-f-n.Fmt'
drop table testBCP
go
drop database test

Print | posted on Monday, July 17, 2006 5:28 PM |

Feedback

Gravatar

# SSIS: Comparison of Bulk Import methods in SQL Server 2005

I have just stumbled across an excellent post from mladen where he compares the performance of the following...
7/18/2006 5:44 AM | Jamie Thomson - SSIS Jibber Jabb
Gravatar

# re: Flat File Bulk Import methods speed comparison in SQL Server 2005

Allow me to differ on your conclusions.

You are not using Bulk insert neither its format file correctly.

BULK INSERT is by far the fastest method!!!

Clues:
DB recovery model should be switched to BULK_LOGGED
Format file should have the right collation to prevent unicode parsing

... etc
7/20/2006 10:17 PM | Noel Diaz- DBA
Gravatar

# re: Flat File Bulk Import methods speed comparison in SQL Server 2005

i'll check on the recovery model. And since all 4 methods do the same thing my doesn't that mean that you can rule out the Logging.

I though that you only need collation for string types and not integer types?

Since the method is used to demonstrate how SSIS's FASTPARSE can be use to speed things up and it doesn't work on string datatypes there's no need for a collation.

Thanx for the input.
7/21/2006 10:55 AM | Mladen
Gravatar

# Bulk Insert時の性能比較(オプションの有効活用がポイント)

Bulk Insert??????(???????????????)
7/23/2006 9:40 AM | 米田 Blog ( SQL Server MEMO )
Gravatar

# re: Flat File Bulk Import methods speed comparison in SQL Server 2005

>>I though that you only need collation for string types and not integer types? <<

Your "TEXT" file is made up of STRINGS. The format file should specify SQLCHAR for all instead!!

To verify that, to can even change the length to "0" because it will simply use the delimiters.

BULK LOGGING is *only* achived under certain conditions and those are created by either using bcp, bulk insert or the OLEDB stream provider "WITH" the database recovery mode set to BULK_LOGGED.

Hope this helps
8/3/2006 4:53 AM | Noel Diaz- DBA
Gravatar

# re: Flat File Bulk Import methods speed comparison in SQL Server 2005

create procedure dbo.gettexttotable
@Texto TEXT
as

DECLARE @LinhaAtual VARCHAR(8000)
DECLARE @Tamanho BIGINT
DECLARE @Inicio BIGINT
DECLARE @Separador VARCHAR(1)
DECLARE @TempImport TABLE(Idx BIGINT IDENTITY(1,1), GroupID BIGINT, Linha VARCHAR(8000), Tamanho BIGINT, Inicio BIGINT)

SET @Tamanho = 1
SET @Inicio = 1
SET @Separador = '='
SET @GroupID = 0

WHILE (@Inicio < DATALENGTH(@Texto)) BEGIN
SET @Tamanho = CHARINDEX(@Separador, SUBSTRING(@Texto, @Inicio, DATALENGTH(@Texto)), 1)
SET @LinhaAtual = SUBSTRING(SUBSTRING(@Texto, @Inicio, DATALENGTH(@Texto)), 1, @Tamanho)
SET @LinhaAtual = REPLACE(@LinhaAtual,@Separador,'')
IF UPPER(SUBSTRING(@LinhaAtual, 1, 1)) = 'C' OR UPPER(SUBSTRING(@LinhaAtual, 1, 1)) = 'F' BEGIN
SET @GroupID = @GroupID + 1
END
INSERT INTO @TempImport(GroupID, Linha, Tamanho, Inicio) VALUES(@GroupID, @LinhaAtual, @Tamanho-1, @Inicio)
SET @Inicio = @Inicio + @Tamanho
END




Hardway, but, works when i dont have acess on server harddisk... =\

and i think its a very faster...

i just send by parameter reading from textfile by asp!!!

=)
3/22/2007 7:42 PM | Fernando
Gravatar

# re: Flat File Bulk Import methods speed comparison in SQL Server 2005

What about OpenXML ?
4/8/2007 8:06 PM | Nabin Padhi
Gravatar

# re: Flat File Bulk Import methods speed comparison in SQL Server 2005

Yeah, XML bulk load is what I am looking for info on.
12/27/2007 2:53 AM | Kfhoz
Gravatar

# re: Flat File Bulk Import methods speed comparison in SQL Server 2005

look here:
http://weblogs.sqlteam.com/mladenp/archive/2007/06/18/60235.aspx
12/27/2007 9:37 AM | Mladen
Gravatar

# How many user pull data from SQL Server 2005 at a time

Please, Any one tell.
How many User pull data at time from sqlserver 2005 ,they used same database


Thanx In Advanced..
3/23/2008 5:09 AM | Manohar Shinde
Gravatar

# re: Flat File Bulk Import methods speed comparison in SQL Server 2005

Poor old BCP getting a bad wrap here. If you had bothered to use the load hint "TABLOCK" I think the results would have been much closer to the top. Same goes for BULK INSERT.

-- insert data using BCP
create table testBCP(c1 bigint, c2 bigint, c3 bigint, c4 bigint)
go
DBCC DROPCLEANBUFFERS
exec master..xp_cmdshell 'bcp test.dbo.testBCP in d:\work\test.txt -T -b1000000 -fd:\work\testImport-f-n.Fmt -h"TABLOCK"'
drop table testBCP
go
3/25/2008 9:55 PM | Dan Guru
Gravatar

# re: Flat File Bulk Import methods speed comparison in SQL Server 2005

well i have in a follow up to this post:
http://weblogs.sqlteam.com/mladenp/archive/2006/07/22/10742.aspx
3/25/2008 10:08 PM | Mladen
Gravatar

# re: Flat File Bulk Import methods speed comparison in SQL Server 2005

Yes but how can I get a format file to ignore a header record, I've tried FIRSTROW=2 and I still get errors loading?
12/9/2008 11:45 PM | Dean
Comments have been closed on this topic.

Powered by:
Powered By Subtext Powered By ASP.NET