Thinking outside the box

Patron Saint of Lost Yaks
posts - 203, comments - 734, trackbacks - 4

My Links

Advertisement

News

Archives

Post Categories

PIVOT conundrum

I started out with typing

SELECT @@VERSION

and got the result as

Microsoft SQL Server 2005 - 9.00.3215.00 (Intel X86) 
       Dec 8 2007 18:51:32 
       Copyright (c) 1988-2005 Microsoft Corporation
       Developer Edition on Windows NT 5.1 (Build 2600: Service Pack 3)


So far so good. Now I created some sample data like this

-- Prepare sample data
DECLARE     @Sample TABLE (RecNo TINYINT, ID TINYINT, EventDate SMALLDATETIME, OrderStatus VARCHAR(9))

INSERT      @Sample
SELECT      1, 10, '7/5/2008', 'Opened' UNION ALL
SELECT      2, 11, '7/5/2008', 'Closed' UNION ALL
SELECT      3, 12, '7/5/2008', 'Closed' UNION ALL
SELECT      4, 13, '7/6/2008', 'Opened' UNION ALL
SELECT      4, 14, '7/6/2008', 'Opened' UNION ALL
SELECT      4, 15, '7/6/2008', 'Closed' UNION ALL
SELECT      1, 16, '7/7/2008', 'Opened' UNION ALL
SELECT      2, 17, '7/8/2008', 'Postponed' UNION ALL
SELECT      3, 18, '7/8/2008', 'Closed'

I wanted to PIVOT the data, and I wrote this query

SELECT      p.*
FROM        (
                  SELECT      EventDate,
                              OrderStatus
                  FROM        @Sample
            ) AS s
PIVOT       (
                  COUNT(s.OrderStatus)
                  FOR s.OrderStatus IN ([Opened], [Closed], [Postponed])
            ) AS p

The result came back as expected.

EventDate         Opened Closed Postponed
2008-07-05        1          2          0
2008-07-06        2          1          0
2008-07-07        1          0          0
2008-07-08        0          1          1


Now I wanted to see if there was a discrepancy if counting another column, so I wrote this query

SELECT      p.*
FROM        (
                  SELECT      EventDate,
                              OrderStatus
                  FROM        @Sample
            ) AS s
PIVOT       (
                  COUNT(s.EventDate)
                  FOR s.OrderStatus IN ([Opened], [Closed], [Postponed])
            ) AS p

And got this resultset back.

Opened Closed Postponed
4          4          1

Huh!? What happened here? The column EventDate is now missing.
It seems that PIVOT operator destroys the column used for aggregation.

To verify, I wrote a simpler query

SELECT      p.*
FROM        @Sample AS s
PIVOT       (
                  COUNT(s.EventDate)
                  FOR s.OrderStatus IN ([Opened], [Closed], [Postponed])
            ) AS p
 
The resultset is now
 
RecNo ID         Opened Closed Postponed
1          10         1          0          0
2          11         0          1          0
3          12         0          1          0
4          13         1          0          0
4          14         1          0          0
4          15         0          1          0
1          16         1          0          0
2          17         0          0          1
3          18         0          1          0

And yes, the EventDate column is destroyed. And what more? All columns from @Sample table is present with the exception for the translation from OrderStatus to the target column.

So I went back and wrote a simpler query for counting the OrderStatus column to avoid destroying the EventDate column.

SELECT      p.*
FROM        @Sample AS s
PIVOT       (
                  COUNT(s.OrderStatus)
                  FOR s.OrderStatus IN ([Opened], [Closed], [Postponed])
            ) AS p

As I suspected, the EventDate column is back, but also all other columns from @Sample table.

RecNo ID         EventDate         Opened Closed Postponed
1          10         2008-07-05        1          0          0
1          16         2008-07-07        1          0          0
2          11         2008-07-05        0          1          0
2          17         2008-07-08        0          0          1
3          12         2008-07-05        0          1          0
3          18         2008-07-08        0          1          0
4          13         2008-07-06        1          0          0
4          14         2008-07-06        1          0          0
4          15         2008-07-06        0          1          0

My conclusion is that the PIVOT aggregates on all columns not being translated with the FOR keyword.
 

Print | posted on Tuesday, July 15, 2008 4:00 PM | Filed Under [ Optimization SQL Server 2008 Algorithms SQL Server 2005 ]

Feedback

Gravatar

# re: PIVOT conundrum

This is called implicit grouping. A good read is Itzik Ben-Gan's article "Enhancing PIVOT for Crosstab Queries". If you are a SQL Server Magazine subscriber, you can access it using Instanct Doc #93907.
7/15/2008 5:53 PM | John
Gravatar

# re: PIVOT conundrum

So in Pivot the coloumn which we want to pivot must be fixed i.e we should know well in advance.
plz tell me more on pivot whn the data is not same thn how to use Pivot in that case????
7/17/2008 8:28 AM | Smitha
Comments have been closed on this topic.

Powered by:
Powered By Subtext Powered By ASP.NET