Finding group of records with a certain status

Thu Sep 20, 2007 by Peter Larsson in optimization

I recently was given the task to optimize some code prior to a client's upgrade to SQL Server 2005.
The objective for the old code was to get all orders where status for all orderlines where set to 'DROP'.

SELECT     t.OrderID,
           t.OrderDate
FROM       tblOrder AS t
INNER JOIN (
               SELECT     a.OrderID,
                          CASE
                              WHEN SUM(b.Type) = COUNT(*) THEN 'DROP'
                              ELSE NULL
                          END AS [Type]
               FROM       tblOrder AS a
               INNER JOIN (
                              SELECT     OrderID,
                                         CASE
                                             WHEN [Status] = 'DROP' THEN 1
                                             ELSE 0
                                         END AS Type
                              FROM       tblOrder
                          ) AS b ON b.OrderID = a.OrderID
               GROUP BY   a.OrderID
            ) AS c ON c.OrderID = t.OrderID AND c.Type = 'DROP'

After some testing I come up with this alternative

SELECT     w.OrderId,
           w.OrderDate
FROM       tblOrder AS w
INNER JOIN (
                SELECT    OrderID
                FROM      tblOrder
                GROUP BY  OrderID
                HAVING    MIN(CASE WHEN Status = 'DROP' THEN 1 ELSE 0 END) = 1
            ) AS d ON d.OrderID = w.OrderID

I immediately saw that this code run much faster. Then I started to think if the time spent optimizing the code was worth the time.
For testing I set up two identical SQL machines, one with SQL Server Enterprise 2005 SP2 and the other with SQL Server Enterprise 2000 SP4.
SQL Profiler was configured to measure CPU, DURATION, READS and WRITES.

These are the results I got with the old query on both boxes

SQL Server 2000				SQL Server 2005
DURATION	CPU	READS	WRITES	DURATION	CPU	READS	WRITES
1373	938	40236	25	546	516	22185	0
1310	922	40170	25	583	515	22433	0
1283	891	40170	25	548	485	22199	0
1360	953	40170	25	493	500	22210	0
1376	953	40170	25	494	500	22300	0

These are the results I got for the new query on both boxes

SQL Server 2000				SQL Server 2005
DURATION	CPU	READS	WRITES	DURATION	CPU	READS	WRITES
296	266	27322	25	185	156	18112	0
296	250	27272	25	164	172	18106	0
283	265	27272	25	167	172	18106	0
266	266	27272	25	169	172	18106	0
230	250	27272	25	168	156	18106	0

My first reaction was that the time spent optimizing the query was worth the time because the old query went from {1300, 900, 40000, 25} down to {300, 250, 27000, 25} on the SQL Server 2000 box, which is a speed improvement between 200% and 300%.

But the real surprise is that the old query without any optimization run almost as fast on the SQL Server 2005 box than the optimized query on SQL Server 2000 box.

My conclusion is that SQL Server 2005 has a better query engine and produces better execution plans than SQL Server 2000. This do not take away the fact that the optimized query is faster on both boxes.

The lesson learned here is that you can never throw in enough hardware to solve badly written code.

For completeness, I also post the execution plans for both queries

Old query

SQL Server 2000

SQL Server 2005

New query

SQL Server 2000

SQL Server 2005

Continued here

http://weblogs.sqlteam.com/peterl/archive/2007/09/24/Finding-group-of-records-with-a-certain-status-II.aspx

Legacy Comments


Hugo Kornelis 2007-09-20	re: Finding group of records with a certain status Hi Peter, Did you also try this variation? SELECT a.OrderID, a.OrderDate FROM Orders AS a WHERE NOT EXISTS (SELECT * FROM Orders AS b WHERE b.OrderID = a.OrderID AND b.Status <> 'Drop');

Peso 2007-09-20	re: Finding group of records with a certain status Thanks for the tip. I haven't tried it yet, but shouldn't it be = 'Drop' in the EXISTS case?

Hugo Kornelis 2007-09-21	re: Finding group of records with a certain status "shouldn't it be = 'Drop' in the EXISTS case?" Nope. This is the old "don't use no double negations, not never!" thingie. Instead of searching for orders that have all rows at status Drop, we search for orders that have no rows at any other status other than Drop - if an order has no rows with Status <> Drop, then obviously, all statuses must be = Drop. Best, Hugo

Peso 2007-09-21	re: Finding group of records with a certain status You are absolutely correct. That proves the point that you should never write code when sitting on an airplane.

dineshrajan 2010-08-09	re: Finding group of records with a certain status declare @Order table ( orderid int identity(1,1) not null, ordername varchar(100) ) declare @OrderType table ( OrderTypeId int identity(1,1) not null, orderid int, Typename varchar(100) ) insert into @Order(ordername) values ('X100'),('XY120'),('ZE100') insert into @OrderType(orderid,Typename) values (1,'Pending'),(1,'Completed'),(1,'Draft') ,(2,'Completed'),(2,'Completed'),(2,'Completed') select * from @Order select * from @OrderType select O.* from @Order O join ( select distinct orderid,[StatusCount]= sum(case when Typename = 'Completed' THEN 1 ELSE 0 END),[Count]= COUNT(*) from @OrderType group by orderid ) T1 on O.orderid = T1.orderid and T1.[Count] = T1.StatusCount