Jeff's SQL Server Blog

Random Thoughts & Cartesian Products with Microsoft SQL Server
posts - 157, comments - 2686, trackbacks - 64

My Links

Advertisement

News

Welcome to my weblog. My name is Jeff Smith, I am software developer in Boston, MA and I was recently named a 2009 SQL Server MVP. Check in frequently for tips, tricks, commentary and ideas on SQL Server and .NET programming.


Subscribe





Archives

Post Categories

Programming

Sports

SQL

Pivot function (C#)

public static DataTable Pivot(IDataReader dataValues, string keyColumn, string pivotNameColumn, string pivotValueColumn)

        {

            DataTable tmp = new DataTable();

            DataRow r;

            string LastKey = "//dummy//";

            int i, pValIndex, pNameIndex;

            string s;

            bool FirstRow = true;

 

            // Add non-pivot columns to the data table:

 

            pValIndex = dataValues.GetOrdinal(pivotValueColumn);

            pNameIndex = dataValues.GetOrdinal(pivotNameColumn);

 

            for (i=0; i <= dataValues.FieldCount -1; i++)

                if (i != pValIndex &&  i != pNameIndex )

                    tmp.Columns.Add(dataValues.GetName(i),dataValues.GetFieldType(i));

          

            r = tmp.NewRow();

 

            // now, fill up the table with the data:

            while (dataValues.Read())

            {

                // see if we need to start a new row

                if (dataValues[keyColumn].ToString() != LastKey)

                {

                    // if this isn't the very first row, we need to add the last one to the table

                    if (!FirstRow)

                        tmp.Rows.Add(r);

                    r = tmp.NewRow();

                    FirstRow = false;

                    // Add all non-pivot column values to the new row:

                    for (i=0; i<= dataValues.FieldCount-3;i++) 

                        r[i] = dataValues[tmp.Columns[i].ColumnName];

                    LastKey = dataValues[keyColumn].ToString();

                }

                // assign the pivot values to the proper column; add new columns if needed:

                s = dataValues[pNameIndex].ToString();

                if (!tmp.Columns.Contains(s))

                    tmp.Columns.Add(s, dataValues.GetFieldType(pValIndex));

                r[s] = dataValues[pValIndex];

            }

 

            // add that final row to the datatable:

            tmp.Rows.Add(r);


            // Close the DataReader

            dataValues.Close();          

        

            // and that's it!

            return tmp;

        }

Note: If you are using .NET 2.0 or above, you can use the change shown here to ensure that the new pivoted columns created by this function are sorted.

Print | posted on Wednesday, May 11, 2005 3:02 PM |

Feedback

Gravatar

# Cross Tabs at the Presentation Layer

5/11/2005 3:49 PM | Welcome to Jeff's Blog
Gravatar

# .NET CrossTabs: Transforming a Normalized IDataReader into a Pivoted DataTable

5/15/2005 10:10 PM | Welcome to Jeff's Blog
Gravatar

# .NET CrossTabs versus SQL Server CrossTabs

5/15/2005 10:14 PM | Welcome to Jeff's Blog
Gravatar

# re: Pivot function (C#)

I think you forgot to mention in your post the bandwidth cost of returning the entire resultset to the client.

Showing the cost of returning the entire resultset plus the cost of creating the pivot query would in both methods will i think strenthen the result of your conclusion.
6/21/2005 5:44 AM | Little Monster
Gravatar

# re: Pivot function (C#)

Little Monster -- don't forget, the client isn't doing any summarization at all; only fully summarized rows are returned to the client which simply pivots them.

For example, if you have a 10,000 row transaction table which will be summarized into data for 10 clients over 12 months, only 120 fully aggregated rows should be returned to the client which are then pivoted into the final 10 by the client-side code. A T-SQL pivot would indeed only return 10 rows (with 11 more columns of course), but at the huge expense (as shown) of having SQL Server do all the work and forcing presentation code into your database layer.

So the key point is that the client is not doing any pivot calculations or aggregating at all; it is simply changing the format of data that is returned. Bandwith isn't really a consideration, since we are working with raw *data* in its native, summarized form.
6/21/2005 10:36 AM | Jeff
Gravatar

# re: Pivot function (C#)

Hi,

The above code inserts the data in a datatable. How do i insert it into an excel sheet?

You can email me at the following sandybeach50@hotmail.com
6/23/2005 11:03 AM | Sand
Gravatar

# re: Pivot function (C#)

>>The above code inserts the data in a datatable. How do i insert it into an excel sheet?

You don't. Excel has pivoting capabilities built in. Otherwise, it is an exercise on taking a DataTable and exporting it to Excel which isn't really the scope of this post. If there is a popular need for this, let me know.
6/30/2005 8:52 PM | Jeff
Gravatar

# re: Pivot function (C#)


Hi,

This code sounds great.
But... will this surely work for adding data into a SQL table or will I have to add sumthing to this ?

please let me know.

Thanx
- PD
7/22/2005 8:52 AM | PD
Gravatar

# re: Pivot function (C#)

Jeff,
Thanks for the code above. I'm writing a SQL Mobile app and need to transform row into cols to display in a datgrid. Only challenge I found with your example is that it gives a cartesian product. (e.g. 10 rows x 8 pivot cols always gives 80 rows....even some of the rows actually 'match'). So, I hacked a version to use the KeyColumn passed in and update the row if the key already existed. Hope you don't mind. Please pass along if this helps anyone else. (Also tweaked your version slightly based on having a key). Hasn't been extensively tested but gives the expected result. One other note, I did use 2.0 DataTable object but could easily be modified to not need it....
Tim
---------------------------
public static DataTable Pivot(IDataReader dataValues, string keyColumn, string pivotNameColumn, string pivotValueColumn)
{
DataTable tmp = new DataTable();
DataRow r;
int i, pValIndex, pNameIndex;
string s;

// Add non-pivot columns to the data table:
pValIndex = dataValues.GetOrdinal(pivotValueColumn);
pNameIndex = dataValues.GetOrdinal(pivotNameColumn);

// Loop through columns
for (i = 0; i <= dataValues.FieldCount - 1; i++)
{
if (i != pValIndex && i != pNameIndex)
{
DataColumn dc = new DataColumn(dataValues.GetName(i), dataValues.GetFieldType(i));
tmp.Columns.Add(dc);

// Add key column
if (dc.ColumnName == keyColumn)
tmp.PrimaryKey = new DataColumn[] { dc };
}
}

// now, fill up the table with the data
while (dataValues.Read())
{
// assign the pivot values to the proper column; add new columns if needed:
s = dataValues[pNameIndex].ToString();
if (!tmp.Columns.Contains(s))
tmp.Columns.Add(s, dataValues.GetFieldType(pValIndex));

// Create new row after adding any additional columns
r = tmp.NewRow();

// Add pivot value
r[s] = dataValues[pValIndex];

// Add all non-pivot column values to the new row:
for (i = 0; i <= dataValues.FieldCount - 3; i++)
r[i] = dataValues[tmp.Columns[i].ColumnName];

// Look for key
DataRow rowFound = tmp.Rows.Find(r[keyColumn]);

// Add new record if not found
if (null == rowFound)
{
tmp.Rows.Add(r);
}
else // Key already exists .. just update it
{
rowFound[s] = dataValues[pValIndex];
}
}
// Close the DataReader
dataValues.Close();
// and that's it!
return tmp;
}
9/5/2005 7:35 PM | Tim
Gravatar

# re: Pivot function (C#)

Hi Tim --

Thanks for the code -- of course I don't mind at all if anyone makes modifications or takes the general idea I've presented and improves upon it. The whole point isn't really to provide a function that is perfect, but to show how easy and more efficient it is to have your presentation layer do formatting and NOT yoru database layer.

I am not sure, however, what you mean by the code "generates a cartesian product". Remember, as written, you should passed in a fully summarized and sorted datareader -- grouped by the columns you are pivotting, so that for a particular value that row or column shouldn't already exist in
your data. Remember, the key idea is that SQL does the summarizing, sorting and calculations, and we are just taking the result and formatting it into columns -- we shouldn't need to do any summarizing or sorting or checks to see if more than 1 value needs to be aggregated for a particular set of key values since that should have been done already by SQL.

Double-check the article for more info on this. If you can provide some sample data and the SELECT statement you are using and what is happening specifically using the original function, it will be much easier to determine what might be going wrong but right not I am not 100% sure what exactly is going on based on how you described the situation.
9/6/2005 9:09 AM | Jeff
Gravatar

# re: Pivot function (C#)

Jeff,

Sorry for my ambiguity. Good points about letting the query do any summarizing and grouping. I'm basically trying to build an Excel like-list in the datagrid. My basic row element is a company and I'm trying to show a list of all or select categories and whether, for the select companies, those categories are appropriate. A company can have multiple categories but only but each CompanyID ~ CategoryID should be unique.
Tables:
tblCompany, tblCategory, and tblCompanyCategories.
---------------------
The other hitch is that I want to show not just matching categories but all or select ones. So some form of outer join is necessary. I then want to transform the result into a list of unique companies, catgories now as columns, and a true or false if a particular category is relevant. Here's the query -- one of the ugliest I have ever written but can't think of a better way due to SQL Server Mobile's lack of support for SPROCs.
SQL:
SELECT tblCompany.CompanyID, tblCompany.Company, tblCompany.Ticker, CONVERT(bit, tblCategory.CategoryID) AS HasCategory, tblCategory.Category
FROM tblCompany INNER JOIN tblCompanyCategories ON tblCompany.CompanyID = tblCompanyCategories.CompanyID
LEFT OUTER JOIN tblCategory ON tblCompanyCategories.CategoryID = tblCategory.CategoryID
WHERE ( tblCompanyCategories.CategoryID IN (1, 2, 3, 4, 5, 6, 13, 14, 15, 16, 17)
AND tblCompany.CompanyID IN (SELECT CompanyID FROM tblCompanyCategories AS tblTemp
WHERE CategoryID = ?)) ORDER BY tblCategory.SortOrder
---------------------
The '?' is a shorthand notation for a parameter that SQL Mobile does thankfully support.

Now, when I pivot this using the standard approach (as I'm most people normally require). I get a unque row for each set of companies and categories. So, if I selected categoryid = 1, I'd have one row where perhaps it matched, but another row for each other category...in addition to the categories for each already holding a column place. I'm sure there are technical terms for these concepts but I don't know them. I think your default approach provides a single column transform whereas I was looking for a multi-column. i.e.

CompanyID Col1 Col2 Col3
1 x
1 x
1 x

vs.

CompanyID Col1 Col2 Col3
1 x x x


...or...maybe I just did it wrong! Either way, please don't mistake me. I have you to thank for getting me started. It took very little effort, using your example, to tweak it to what I wanted. I onle posted my example because I have seen people asking in a lot of different forums how to do it. You gave the best, cleanest example, and I thought I had a way of doing a multi-col pivot that I have also seen others asking about.

Am I doing it wrong?
thanks again,
tim
9/6/2005 10:40 PM | Tim
Gravatar

# re: Pivot function (C#)

Tim --

It appears the problem is that you have not sorted your resultset by CompanyID. To quote from the article:

>>All you need to do is open up an IDataReader object that returns the data you wish to Pivot. The data should be fully summarized by SQL Server already -- i.e., you should GROUP BY your pivot column plus any other columns you wish to return, and you should aggregate your pivot value accordingly. The DataReader should also be sorted so that all rows which will be pivoted into one are sorted together.

>>For example, consider the Orders table in the Northwind database. Suppose you want to display CompanyNames as rows, ProductNames as columns, and the total quantity of orders per CompanyName per ProductName in your crosstab. The DataReader's recordset should be grouped by CompanyName and ProductName, it should include a SUM(Qty) calculation, and it should be sorted by CompanyName. (see the example for this exact scenario using Northwind).

Again, it may pay to re-read the article one more time and double-check all parts of the example using northwind (i.e., look at the tables, the view that I use, the SELECT statement, and so on). It does seems that adding proper grouping and a sort to your SQL is all you would need to make the existing function work perfectly.
9/7/2005 8:33 AM | Jeff
Gravatar

# re: Pivot function (C#)

I'm sorry. I did miss that point. I'll re-run it later. And put a timer on both methods just for grins.

Thanks so much again. And sorry to take up so much of your time with this.
9/7/2005 11:16 AM | Tim
Gravatar

# re: Pivot function (C#)

Jeff,

Uncle. I re-ran it, after re-reading the article and sorting on my key of CompanyID and then used the Queryperformancecounter from kernel32 using p/Invoke to time the runs. Your version runs many times faster. I'd give a better ratio but for some reason the timer doesn't give me a reliable answer. My return datatable as something like 152 rows, using the same categoryid, which sort to 47 unique companies. On your algorthym the first run is usually on the order of 3 ms. And subquent runs are at or below 1 ms. My version reports, using the same timer algorthym, something like 800000 ms....but I say it isn't reliable because that'd be like 13 mins. but there's actually no perceptible difference in performance in terms of responsiveness. Seems to be a glitch in the Queryperformancecounter. But I don't don't yours is faster. Of course, this is on the desktop. I'll also test them both in Compact Framework to see if I can't get at least a better metric on mine....

Thanks again!
9/7/2005 10:15 PM | Tim
Gravatar

# re: Pivot function (C#)

Ok...Last word on this, promise. Just tested on CF and realized why I was getting a bad result from my testing. I was forgetting to set the start ref using the test on my version. After repeated testing the two approaches seem run about the same length on the desktop. Avg. of 2-4 ms. On CF they both take a bit longer but are also comparable. Its been fun....In case anyone is interested, I'm using an excellent example by Dan Fox and Jon Box for the timer that I found at MSDN. The link is at:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnnetcomp/html/netcfperf.asp

This was designed for CF but you can simpley replace 'coredll.dll' import reference to 'kernel32.dll' to use it from the desktop.
9/7/2005 10:41 PM | Tim
Gravatar

# re: Pivot function (C#)


Subject : Pivot function

There is a problem ( not a bug ).

As per the requirement of Pivot the keycolumn has to be sorted, where as incase of Month in the pivotcolumn the result can not be availed in ascending order. Please let me know if you have any solution to sort the pivotcolumn as well.

Regards,
Fayeez
10/2/2005 4:21 AM | Fayeez
Gravatar

# re: Pivot function (C#)

Your problem is the one of ordering things by month names or other date formats. Don't do this. Order your results by VALUES -- i.e., month numbers or actual datetime datatypes.

that is, don't do this:

select ...
from ...
order by MonthName(SomeDate)

this returns a VARCHAR, which puts December before January (D comes before J). Makes sense, right? SQL doesn't know or care if your varchar strings are the names of months, it will order them alphabetically.

always do this:

select ...
from ...
order by Month(SomeDate)

Here, the Month() function returns a numeric value from 1-12 for each month. This orders correctly. In your SELECT portion, feel free to return the monthname (or better yet, derive + format it and display it at the client layer, NOT in SQL) but do not sort by month names.

And, of course, whever you do ORDER or GROUP by a month, make sure you GROUP and/or ORDER by a Year as well if your data spans multiple years otherwise Jan2004 gets lumped in with Jan2005.

Hope this helps.
10/3/2005 8:52 AM | Jeff
Gravatar

# re: Pivot function (C#)

Jeff, Awesome solution! I've been struggling to get Sql to this for days....
11/9/2005 9:10 AM | William
Gravatar

# re: Pivot function (C#)

Jeff,

Thanks for the routine. I'm going to bookmark your site. Its great to find good routines like this one to get real work done.

Thanks again.

Steve.Buchanan@gmail.com
7/19/2006 9:13 AM | Steve Buchanan
Gravatar

# re: Pivot function (C#)


good solution , i have used this and it works great !!!
9/25/2006 1:46 PM | Esha
Gravatar

# re: Pivot function (C#)

Thanks for the routine. I made a couple small changes. First, I changed this to accept a reference to an existing DataTable, instead of a DataReader. Second, I added the following if statment to the code below to accept null data points in the pivotNameColumn: if (s.Length > 0)

The reason I needed to support null data points in the pivotNameColumn is b/c the query to populate the source DataTable used a left outer join for the name and value columns, so my source DataTable and destination DataTable looked like this:

Source DataTable:
key Name Value
1
2 email <email1>
3 email <email2>
4

Destination DataTable:
key email
1
2 <email1>
3 <email2>
4

Thanks again. This saved me a ton of time.

public static DataTable Pivot(ref DataTable dataValues, string keyColumn, string pivotNameColumn, string pivotValueColumn)
{
try
{
DataTable tmp = new DataTable();
DataRow r;
string LastKey = "//dummy//";
int i, pValIndex, pNameIndex;
string s;
bool FirstRow = true;

// Add non-pivot columns to the data table:
pValIndex = dataValues.Columns.IndexOf(pivotValueColumn);
pNameIndex = dataValues.Columns.IndexOf(pivotNameColumn);

for (i = 0; i <= dataValues.Columns.Count - 1; i++)
if (i != pValIndex && i != pNameIndex)
{
tmp.Columns.Add(dataValues.Columns[i].ColumnName.ToString(), dataValues.Columns[i].DataType);
}
r = tmp.NewRow();

// now, fill up the table with the data:
foreach (DataRow row in dataValues.Rows)
{
// see if we need to start a new row
if (row[keyColumn].ToString() != LastKey)
{
// if this isn't the very first row, we need to add the last one to the table
if (!FirstRow)
{
tmp.Rows.Add(r);
}
r = tmp.NewRow();
FirstRow = false;

// Add all non-pivot column values to the new row:
for (i = 0; i <= dataValues.Columns.Count - 3; i++)
{
r[i] = row[tmp.Columns[i].ColumnName];
}
LastKey = row[keyColumn].ToString();
}

// assign the pivot values to the proper column; add new columns if needed:
s = row[pNameIndex].ToString();
if (s.Length > 0)
{
if (!tmp.Columns.Contains(s) && s != null)
{
tmp.Columns.Add(s, dataValues.Columns[pValIndex].DataType);
}
r[s] = row[pValIndex];
}
}
// add that final row to the datatable:
tmp.Rows.Add(r);

// and that's it!
return tmp;
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
return null;
}
}
5/20/2007 3:25 PM | Tim Oehlke
Gravatar

# re: Pivot function (C#)

Hi Jeff,

You are saying that formatting and such should be done on the presentation layer? All formatting? How are you defining formatting for a statement like that? I am curious because I am about to impliment a solution on a project that requires a pivot and I am considering whether to do it in SQL or my C# app.
Thanks! Love your site!

Richard
6/10/2007 1:43 AM | Richard
Gravatar

# re: Pivot function (C#)

Yes, Richard, all formatting should be done in your presentation layer -- and this usually includes pivots. A relational database should not be returning results with varying column names; the structure of the data returned should be consistent. If your presentation layer is capable, it is just about always easier, shorter and more efficient to simply let your presentation layer pivot data. This includes Reporting Services, Crystal Reports, Access, even Excel via pivot tables. see:

http://weblogs.sqlteam.com/jeffs/archive/2005/05/12/5127.aspx

for a performance comparison for doing this in C# versus at the database layer.
6/10/2007 2:07 PM | Jeff
Gravatar

# re: Pivot function (C#)

Good discussion, however once I have a the pivot results from your function or a SQL pivot, how would I consolidate the results (squash the rows) so that the following NULLS would be removed to look like:

partcodeid Advantage G&G GBC SWR
2 NULL NULL NULL 456
3 NULL 789 NULL NULL
5 NULL NULL NULL 6666


Advantage G&G GBC SWR
- 789 - 456
- - - 6666


Thanks!
10/18/2007 4:13 PM | Mike
Gravatar

# re: Pivot function (C#)

Mike -- those results make no sense .. I have no idea what you are trying to do there. Can you try to be more specific? As I mentioned several times, you need to do all summarizing in SQL *BEFORE* the function comes into play; if you don't want to return 1 row per PartCodeID, don't group by that column in your SELECT.
10/18/2007 4:26 PM | Jeff
Gravatar

# re: Pivot function (C#)

This has been a wonderful posting. I agree that the performance is much better in the presentation layer. Further, the implementation using a datareader as opposed to a dataadapter increases the efficiency. As I have been using the code, I started to wonder why we could not simply use an inheritance of the ObjectDataSource in ASP 2.0 and create a custom PivotDataSource using the functions listed here.

I have looked into the implementation of an ICollection to support composite primary key fields. However, I don't think this will work for multiple column fields. I just don't see how the standard GridView control could support it. A PivotView control would need to be developed and with AJAX and LINQ the possibility of this control seems bery promising.

Ever thought about that?
10/20/2007 6:32 PM | Franco
Gravatar

# re: Pivot function (C#)

It works!
Thank you!
8/21/2009 2:34 AM | apache
Gravatar

# re: Pivot function (C#)

Great article Jeff!

Is there any chance you could provide the Pivot function in VB? Thanks!
2/3/2010 4:55 PM | Andy
Gravatar

# re: Pivot function (C#)

Hello,

I have a problem using the above code. It is generating 3 empty columns at the end. Plz letme know how to handle this.
DataTable t1 = Pivot(dr, "NAME", "Date", "SUM(XX)");

public static DataTable Pivot(IDataReader dataValues, string keyColumn, string pivotNameColumn, string pivotValueColumn)
{
DataTable tmp = new DataTable();
DataRow r;
string LastKey = "//dummy//";
int i, pValIndex, pNameIndex;
string s;
bool FirstRow = true;
// Add non-pivot columns to the data table:
pValIndex = dataValues.GetOrdinal(pivotValueColumn);
pNameIndex = dataValues.GetOrdinal(pivotNameColumn);
for (i = 0; i <= dataValues.FieldCount - 1; i++)
if (i != pValIndex && i != pNameIndex)
tmp.Columns.Add(dataValues.GetName(i), dataValues.GetFieldType(i));
r = tmp.NewRow();
// now, fill up the table with the data:
while (dataValues.Read())
{
// see if we need to start a new row
if (dataValues[keyColumn].ToString() != LastKey)
{
// if this isn't the very first row, we need to add the last one to the table
if (!FirstRow)
tmp.Rows.Add(r);
r = tmp.NewRow();
FirstRow = false;
// Add all non-pivot column values to the new row:
for (i = 0; i <= dataValues.FieldCount - 3; i++)
r[i] = dataValues[tmp.Columns[i].ColumnName];
LastKey = dataValues[keyColumn].ToString();
}
// assign the pivot values to the proper column; add new columns if needed:
s = dataValues[pNameIndex].ToString();
if (!tmp.Columns.Contains(s))
tmp.Columns.Add(s, dataValues.GetFieldType(pValIndex));
if (s == "") { continue; }
r[s] = dataValues[pValIndex];
}
// add that final row to the datatable:
tmp.Rows.Add(r);
// Close the DataReader
dataValues.Close();
// and that's it!
return tmp;
}

Thanks in advance,
Bharath

Thanks,
Bharath
7/7/2010 6:35 AM | BHarath
Gravatar

# re: Pivot function (C#)

if i bind it to gridview how to create sum in footer.
9/22/2010 3:42 PM | kwangba
Comments have been closed on this topic.

Powered by:
Powered By Subtext Powered By ASP.NET