Jeff Smith Blog

Random Thoughts & Cartesian Products with Microsoft SQL Server

Splitting a single DataTable into Parent/Child DataTables for Hierarchical Processing (e.g., nested ASP.NET Repeater controls)

In ASP.NET, we often would like to output "grouped" data on our web pages, like this:

Customer Product     Sales
-------- ----------  -----
ABC Foundation
         Product 1   $200
         Product 2   $437
         Product 3   $523
       
The XLZ Company
         Product 1   $240
         Product 2   $892
         Product 3   $395
  
The easiest way to do this is with nested Repeater controls; one for the outer group (Customers, this case), and within that Repeater's ItemTemplate we'd have another Repeater control for the details (Products). 

To use nested repeaters, you would return two separate result sets from SQL: 
  1. A "Customers" result set, listing one row per Customer to display on the report
  2. A "Products" result set, listing one row per Product to display.
You would put both of those results into a single DataSet as DataTables, and then create a relation between the two (on the CustomerID column).   Now, in our DataSet, we have a parent-child relation and each row in the parent Customers DataTable will have related child rows in the Products DataTable.

Then, in ASP.NET, we create nested repeaters:  the outer repeater is bound to the Customers DataTable, and the inner repeater gets bound for each Customer row to the child rows in the Products DataTable.    There are lots of good examples detailing this technique here and here, so I won't get into specifics on that in this space.

However -- what if you already have a stored procedure that returns this data in a single result set? The nested Repeater controls won't do you much good, since we need two separate (but related) DataTables.  Or, what if it is inefficient to return these results separately in SQL?  For example, suppose we want to output only the products for customers that have a total sales of greater than $200 for a specified time period.  We'd need to process the sales transactions table twice in SQL to generate the list of Customers separate from the list of Products.  Thus, our SQL becomes twice as inefficient as it needs to be to use this technique.

To accommodate those possibilities, I recently wrote a simple function in C# that takes a single DataTable and creates a related parent DataTable in the same DataSet based on specified parameters.  This is useful if you only have a single result set from SQL (or any other data source) and you'd like to break that data up into two separate parent-child DataTables in order to use nested repeaters or any other type of parent-child processing on the data.

As an example, if you start with a single DataTable like this:

Products:

CustomerID  CustomerName    Product     Sales
----------  --------------- ----------  -----
1           ABC Foundation  Product 1   $200
1           ABC Foundation  Product 2   $437
1           ABC Foundation  Product 3   $523
2           The XLZ Company Product 1   $240
2           The XLZ Company Product 2   $892
2           The XLZ Company Product 3   $395

We can pass that DataTable to the CreateParentTable() function and you would end up with two related DataTables in the DataSet, like this:

Customers:

CustomerID  CustomerName   
----------  ---------------
1           ABC Foundation 
2           The XLZ Company

Products:

CustomerID  Product     Sales
----------  ----------  -----
1           Product 1   $200
1           Product 2   $437
1           Product 3   $523
2           Product 1   $240
2           Product 2   $892
2           Product 3   $395

Then we can simply bind the outer Repeater to the Customers DataTable, and then bind the inner Repeater to a call to each DataRow's GetChildRows() method to get the related products.  All from a single SQL result set.

Also, notice that the function removes the CustomerName column from the Products DataTable, but leaves the CustomerID column for the relation. 

Here's the function:

/// <summary>
///   Creates a parent DataTable within the DataSet using distinct rows from
///   an existing "source" DataTable, based on the column(s) specified.
///   The source table then becomes the "child" table of the newly created
///   parent. A DataRelation is also created between the parent table and the
///   child table, using the column(s) specified.
/// </summary>
/// <param name="sourceTable">
///    The source DataTable, which must be within a DataSet.  
///    This source table will become the "child" table.
/// </param>

/// <param name="parentTableName">
///    This name will be assigned to the parent table once it is created.
/// </param>

/// <param name="relationColumns">
///    Specify the columns used to relate the parent table to the child table.
/// </param>

/// <param name="additionalColumns">
///    Any additional column(s) in the source table that will be extracted to the
///    parent table.  These columns will be removed from the source table.
/// </param>

/// <param name="relationName">
///    The name of the relation that will be created between the parent and
///    the child table.
/// </param>

///
public static void AddParentTable(DataTable sourceTable, string parentTableName,
     string[] relationColumns, string[] additionalColumns, string relationName)

{
    DataSet dataSet = sourceTable.DataSet;

    if (dataSet == null)
    {
        throw new Exception("The source DataTable must be contained in a DataSet");
    }

    // generate the set of columns to use to create the Parent table:
    string[] cols = new string[relationColumns.Length+additionalColumns.Length];
    relationColumns.CopyTo(cols,0);
    additionalColumns.CopyTo(cols,relationColumns.Length);

    // create the parent table, copying unique rows from the Child table:
    DataTable parent = sourceTable.DefaultView.ToTable(parentTableName, true, cols);
   
    // add the parent table to the DataSet:
    dataSet.Tables.Add(parent);

    // remove the additional columns from the child table that were
    // copied to the parent table:

    foreach (string s in additionalColumns)
        sourceTable.Columns.Remove(s);

    // create the relation between the new parent table and the child table:
    DataColumn[] parentColumns = new DataColumn[relationColumns.Length];
    DataColumn[] childColumns = new DataColumn[relationColumns.Length];

    for (int i=0; i< relationColumns.Length;i++)
    {
        parentColumns[i] = parent.Columns[relationColumns[i]];
        childColumns[i] = sourceTable.Columns[relationColumns[i]];
    }

    // And, finally, add the relation to the parent table:
    parent.ChildRelations.Add(relationName, parentColumns, childColumns);
}

The parameters are all explained in the XML documentation.  Feel free to make modifications or overloads of this function.  Some good overloads might be one that accepts a single relation column name, or that accepts a parameter that determines whether or not columns are removed from the child table.
   
Using our example, we'd call it like this:

SqlConnection cn = new SqlConnection(<... your connection string ...>)
SqlCommand cm = new SqlCommand(<... the name of the stored proc ...>, cn);
cm.CommandType= CommandType.StoredProcedure;
cn.Open();

SqlDataAdapter da = new SqlDataAdapter(cm);
DataSet ds = new DataSet();
da.Fill(ds, "Products");

CreateParentTable(ds.Tables["Products"], "Customers", new string[] {"CustomerID"},
    new string[] {"CustomerName"}, "CustomerProducts")


After that is executed, the DataSet now contains an additional "Customers" table, and a relation called "CustomerProducts" to the "Products" table.   The relation is on "CustomerID", and the "Customers" table also includes the "CustomerName" column. 

Thus, if we had nested repeaters in our ASP.NET code like this:

 <table>
    <tr>
        <th>Customer</th>
       <th>Product</th>
       <th>Sales</th>
    </tr>
<asp:Repeater runat="server" ID="rpt1">
    <ItemTemplate>
       <tr>
            <td colspan="3"><%# ((System.Data.DataRowView) Container.DataItem)["CustomerName"] %></td>
       </tr>
       <asp:Repeater runat="server" ID="rpt2" 
          DataSource='<%# ((System.Data.DataRowView) Container.DataItem).Row.GetChildRows("CustomerProducts") %>'>
            <ItemTemplate>
            <tr>
                <td></td>
                <td> <%# ((System.Data.DataRow) Container.DataItem)["Product"] %></td>
                <td> <%# ((System.Data.DataRow) Container.DataItem)["Sales"] %></td>
            </tr>
        </ItemTemplate>
       </asp:Repeater>
    </ItemTemplate>
</asp:Repeater>
</table>

we can simply assign rpt1's DataSource to our DataSet, set its DataMember as "Customers", call DataBind(), and we are good to go:

rpt1.DataSource = ds;
rpt1.DataMember = "Customers";
rpt1.DataBind();

A couple of more notes:
  1. The outer Repeater's DataItems are DataRowView objects, but the inner Repeater's DataItems are DataRows.  This is because we called the DataRowView's DataRow's GetChildRows() method, and not the DataRowView's CreateChildView() method.  I have found that this provides much better performance.  For small results, you may wish to call CreateChildView().

  2. Instead of databinding in the HTML markup, you may wish to bind the inner Repeaters during the outer Repeater's DataBinding() event.  This keeps your HTML markup a little cleaner and shorter.

  3. This function only works in .NET 2.0 or above, since it uses ToTable() method of the DataRowView class.
For more on nested repeaters, see:
Note that all of those example use two separate SQL result sets.  Using the function provided, we can do the same with a single result.

see also: