Mike Femenella Blog

..or maybe I'm just the guy with the shovel behind the horses in the parade

How to tell what your insert, update is doing

Lately I’ve been working on a data migration SSIS package for a customer and some of it is just a straight forward execute sql tasks in SSIS on some very large tables. I pull 10 million rows into a staging area, massage some data into the proper format, insert it into it’s final production table resting place, easy enough. I’m testing it on a very low powered 32 bit server and it takes a while. At one point today I started wondering if it was just locked up or was still running.

I’ve used the sys.partitions table before as an alternate method of getting row counts for a table as it already has that information handy for me. This time I was querying the table while the insert was running and I was able to see exactly what was happening. The code below will walk you through an easy example. This will work the same in 2005, or 2008.

First, just create a throw away table:

Create table Loadtest
(
IDVal int identity(1,1),
CustomerName varchar(200),
CustomerCity varchar(200)
)

 

Now let’s create a looping insert that gives us enough time to look at things. You can just stop it when you have had enough. It’s nothing exciting here, I’m just creating an loop 2 million times and doing an insert..like I said, stop it when you’ve had enough:

declare @counter int
set @counter=2000000
While @counter>0
BEGIN
    Insert into LoadTest(CustomerName,CustomerCity)
        values('Mike Femenella','Matthews')
        SET @counter=@counter-1
END       

While your loop is running, open up another query analyzer window to the same database and run this:

SELECT sp.index_id,Rows,si.name,si.type_desc
      FROM sys.partitions sp
      inner join sys.indexes si
      on sp.object_id=si.object_id
      AND sp.index_id=si.index_id
      where object_name(sp.object_id)='LoadTest'

Since the table is not indexed you’ll see the row count and that it’s populating the heap, hit F5 a few times and you can watch the rows field increment. If you have doubts your insert statement is actually running, this is a good way to check that it is alive and you can also get an idea of it’s progress if you have a guess as to how many rows “should” be there.

--Unindexed, you see the heap being populated.     

   index_id        Rows    name    type_desc
    0                   24252    NULL    HEAP

 

So that was mildly interesting and sort of cool. Of course we can take it a step further (and I always love doing that). Go ahead and stop the loop when you have have a million or so rows in there, you need a large amount of data in the LoadTest table so you have enough time to run the sys.partitions query again. I was letting my loop run while writing the first part of the blog so I have 2 million rows in LoadTest.

Let’s create a mirror image of our LoadTest table and since I’m at the end of my work week and feeling very creative I’m going to call it LoadTest_Copy, go ahead and F5 and create the table.. Award winning material here isn’t it?

Create table Loadtest_Copy
(
IDVal int ,
CustomerName varchar(200),
CustomerCity varchar(200)
)

GO
Create Clustered index ixc_IDVal on LoadTest_Copy(IDVal)
Create index ix_CustomerName on LoadTest_Copy(CustomerName)
Create index ix_CustomerCity on LoadTest_Copy(CustomerCity)

GO

In another window modify your sys.partition query and change the name from LoadTest to LoadTest_Copy. Run this now and you have 3 rows, one for the clustered index and one for each of the other indexes. Somewhat more realistic and more interesting.

SELECT sp.index_id,Rows,si.name,si.type_desc
      FROM sys.partitions sp
      inner join sys.indexes si
      on sp.object_id=si.object_id
      AND sp.index_id=si.index_id
      where object_name(sp.object_id)='LoadTest_Copy'

 

Results:

index_id    Rows    name                           type_desc
1                0          ixc_IDVal                     CLUSTERED
2                0          ix_CustomerName       NONCLUSTERED
3                0          ix_CustomerCity         NONCLUSTERED

Now, do an insert from LoadTest into LoadTest_Copy. Start it and then flip over to your sys.partition query, run it while the insert is happening and look at the results.

You can keep hitting F5 and watch the numbers change. Notice it builds the clustered index first, which of course makes sense, and then will go through each of the other 2 indexes building them out.

index_id    Rows         name                           type_desc
1                   449900    ixc_IDVal                    CLUSTERED
2                   0             ix_CustomerName      NONCLUSTERED
3                   0            ix_CustomerCity          NONCLUSTERED

 

index_id    Rows            name                      type_desc
1               2000000        ixc_IDVal                 CLUSTERED
2               1065245        ix_CustomerName   NONCLUSTERED
3               0                   ix_CustomerCity     NONCLUSTERED

 

index_id    Rows          name                      type_desc
1               2000000       ixc_IDVal                 CLUSTERED
2               2000000       ix_CustomerName   NONCLUSTERED
3               126137         ix_CustomerCity     NONCLUSTERED

 

And finally now that it’s done:

index_id         Rows        name                       type_desc
1                    2000000    ixc_IDVal                  CLUSTERED
2                    2000000    ix_CustomerName    NONCLUSTERED
3                    2000000    ix_CustomerCity       NONCLUSTERED

 

So next time you’re wondering if your insert or update statement is actually doing anything useful, you can leverage some of the underlying DMV data to make sure things are still on track and it’s a good way to tell if you have time to run out for coffee. That’s the last SQL related thing I’m doing, time for vacation…cheers!