How to tell what your insert, update is doing
Lately I’ve been working on a data migration SSIS package for a customer and some of it is just a straight forward execute sql tasks in SSIS on some very large tables. I pull 10 million rows into a staging area, massage some data into the proper format, insert it into it’s final production table resting place, easy enough. I’m testing it on a very low powered 32 bit server and it takes a while. At one point today I started wondering if it was just locked up or was still running.
I’ve used the sys.partitions table before as an alternate method of getting row counts for a table as it already has that information handy for me. This time I was querying the table while the insert was running and I was able to see exactly what was happening. The code below will walk you through an easy example. This will work the same in 2005, or 2008.
First, just create a throw away table:
Create table Loadtest
(
IDVal int identity(1,1),
CustomerName varchar(200),
CustomerCity varchar(200)
)
Now let’s create a looping insert that gives us enough time to look at things. You can just stop it when you have had enough. It’s nothing exciting here, I’m just creating an loop 2 million times and doing an insert..like I said, stop it when you’ve had enough:
declare @counter int
set @counter=2000000
While @counter>0
BEGIN
Insert into LoadTest(CustomerName,CustomerCity)
values('Mike Femenella','Matthews')
SET @counter=@counter-1
END
While your loop is running, open up another query analyzer window to the same database and run this:
SELECT sp.index_id,Rows,si.name,si.type_desc
FROM sys.partitions sp
inner join sys.indexes si
on sp.object_id=si.object_id
AND sp.index_id=si.index_id
where object_name(sp.object_id)='LoadTest'
Since the table is not indexed you’ll see the row count and that it’s populating the heap, hit F5 a few times and you can watch the rows field increment. If you have doubts your insert statement is actually running, this is a good way to check that it is alive and you can also get an idea of it’s progress if you have a guess as to how many rows “should” be there.
--Unindexed, you see the heap being populated.
index_id Rows name type_desc
0 24252 NULL HEAP
So that was mildly interesting and sort of cool. Of course we can take it a step further (and I always love doing that). Go ahead and stop the loop when you have have a million or so rows in there, you need a large amount of data in the LoadTest table so you have enough time to run the sys.partitions query again. I was letting my loop run while writing the first part of the blog so I have 2 million rows in LoadTest.
Let’s create a mirror image of our LoadTest table and since I’m at the end of my work week and feeling very creative I’m going to call it LoadTest_Copy, go ahead and F5 and create the table.. Award winning material here isn’t it?
Create table Loadtest_Copy
(
IDVal int ,
CustomerName varchar(200),
CustomerCity varchar(200)
)
GO
Create Clustered index ixc_IDVal on LoadTest_Copy(IDVal)
Create index ix_CustomerName on LoadTest_Copy(CustomerName)
Create index ix_CustomerCity on LoadTest_Copy(CustomerCity)
GO
In another window modify your sys.partition query and change the name from LoadTest to LoadTest_Copy. Run this now and you have 3 rows, one for the clustered index and one for each of the other indexes. Somewhat more realistic and more interesting.
SELECT sp.index_id,Rows,si.name,si.type_desc
FROM sys.partitions sp
inner join sys.indexes si
on sp.object_id=si.object_id
AND sp.index_id=si.index_id
where object_name(sp.object_id)='LoadTest_Copy'
Results:
index_id Rows name type_desc
1 0 ixc_IDVal CLUSTERED
2 0 ix_CustomerName NONCLUSTERED
3 0 ix_CustomerCity NONCLUSTERED
Now, do an insert from LoadTest into LoadTest_Copy. Start it and then flip over to your sys.partition query, run it while the insert is happening and look at the results.
You can keep hitting F5 and watch the numbers change. Notice it builds the clustered index first, which of course makes sense, and then will go through each of the other 2 indexes building them out.
index_id Rows name type_desc
1 449900 ixc_IDVal CLUSTERED
2 0 ix_CustomerName NONCLUSTERED
3 0 ix_CustomerCity NONCLUSTERED
index_id Rows name type_desc
1 2000000 ixc_IDVal CLUSTERED
2 1065245 ix_CustomerName NONCLUSTERED
3 0 ix_CustomerCity NONCLUSTERED
index_id Rows name type_desc
1 2000000 ixc_IDVal CLUSTERED
2 2000000 ix_CustomerName NONCLUSTERED
3 126137 ix_CustomerCity NONCLUSTERED
And finally now that it’s done:
index_id Rows name type_desc
1 2000000 ixc_IDVal CLUSTERED
2 2000000 ix_CustomerName NONCLUSTERED
3 2000000 ix_CustomerCity NONCLUSTERED
So next time you’re wondering if your insert or update statement is actually doing anything useful, you can leverage some of the underlying DMV data to make sure things are still on track and it’s a good way to tell if you have time to run out for coffee. That’s the last SQL related thing I’m doing, time for vacation…cheers!