Jeff Smith Blog

Random Thoughts & Cartesian Products with Microsoft SQL Server

Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?

Introduction

A question I see very often in the SQLTeam forums is how to return data in a summarized form by concatenating multiple values into single CSV string columns.  For example, taking data like this:

FarmerName           FruitName 
-------------------- ----------
Farmer Ted           Apple     
Farmer Ted           Orange    
Farmer Fred          Orange    
Farmer Fred          Grapes    
Farmer Fred          Grapefruit
Farmer Jed           Orange    

... and returning the results from the database in this format:

Farmer         FruitList
---------      -------
Farmer Ted     Apple,Oranges
Farmer Fred    Grapefruit,Grapes,Orange
Farmer Jed     Orange

Notice that we are only returning one row per Farmer, and the "FruitList" column returned is a concatenated comma-separated list of values in the Fruit column.

Seems simple enough ... let's take a look.  Here's our schema:

create table Farmers
(
    FarmerID int primary key,
    Name varchar(20)
)

create table Fruit
(
    FruitID int primary key,
    FruitName varchar(10)
)

create table FarmersFruit
(
    FarmerID int,
    FruitID int,
    primary key (FarmerID, FruitID)
)

And here's our sample data if you'd like to play along at home:

insert into Farmers(FarmerID, Name)
select 1,'Farmer Ted' union all
select 2,'Farmer Fred' union all
select 3,'Farmer Jed'

insert into Fruit(FruitID, FruitName)
select 1,'Apple' union all
select 2,'Orange' union all
select 3,'Grapes' union all
select 4,'Grapefruit'

insert into FarmersFruit (FarmerID, FruitID)
select 1,1 union all
select 1,2 union all
select 2,2 union all
select 2,4 union all
select 2,3 union all
select 3,2

Notice that the data is stored in a nice, simple, normalized manner.   We can return all of the Farmers and their Fruit with a stored procedure that consists of a simple SELECT with some JOINS like this:

create procedure FarmerSummary
as
select
  F.Name as FarmerName, F.FarmerID, FR.FruitName, FR.FruitID

from
  Farmers F

inner join
  FarmersFruit FF on F.FarmerID = FF.FarmerID

inner join
  Fruit FR on FF.FruitID = FR.FruitID

order by
  F.FarmerID, FR.FruitID


Executing FarmerSummary return results with one fruit per row and each Farmer (potentially) repeated multiple times, as is standard in SQL when joining tables in many-to-many relations:

FarmerName           FarmerID    FruitName  FruitID
-------------------- ----------- ---------- -----------
Farmer Ted           1           Apple      1
Farmer Ted           1           Orange     2
Farmer Fred          2           Orange     2
Farmer Fred          2           Grapes     3
Farmer Fred          2           Grapefruit 4
Farmer Jed           3           Orange     2

(6 row(s) affected)

Concatenating Multiple Rows in T-SQL

That is all fine and good, but how do we return only one row per Farmer with all of their fruit together in a single column?  Concatenating the fruit names into a string is done typically by creating a simple User-Defined Function:

create function GetFruitList(@FarmerID int)
returns varchar(8000)
as
begin
    declare @ret varchar(8000)
    set @ret = ''

    select @ret = @ret + ',' + F.FruitName
    from Fruit F
    inner join FarmersFruit FF on F.FruitID = FF.FruitID
    where FF.FarmerID = @FarmerID
    order by FruitName asc

    return substring(@ret,2,8000)
end   

The function accepts a FarmerID, and returns a CSV-string of fruit that the farmer "farms".  (An exciting example, I know!)   There are other ways to get the same result (SQL 2005 has some useful XML features we can use, or we could create a CLR aggregate function), but I find the UDF to be the easiest way to go.

We can use this function to get our desired results by simply calling it for each row in the Farmers table:

create procedure FarmerSummaryList
as
select
  f.Name, dbo.GetFruitList(F.FarmerID) as Fruits

from
  Farmers f

order by
  f.FarmerID


Our fancy new FarmerSummaryList procedure returns the results just the way we'd like to display them:

Name                 Fruits
-------------------- --------------------------
Farmer Ted           Apple,Orange
Farmer Fred          Grapefruit,Grapes,Orange
Farmer Jed           Orange

(3 row(s) affected)

Presentation versus Data Implications

Ok, so far we have not broken any new ground.  I've used this technique before and it works quite well and very efficiently, and I'm sure that maybe you have as well.  However, we need to carefully consider what is happening when we employ a function like this and return results in this manner.

Let's say that I am designing a web application, and for the "Farmer Summary" page that shows one row per farmer in a grid, I'd like to show all of the fruit that each farmer is associated with separated by commas.  The FarmerSummary stored procedure works great for this, and it allows me to quickly bind the procedure results to a grid in ASP.NET and output everything without the need for anything special.

However, suppose on another page we decide to display the Fruit for each farmer separated by line breaks in the grid, instead of separated by commas?  That is, instead of

Apples,Oranges

we want to show:

Apples
Oranges

In HTML terms, we might choose to delimit our list by <BR> tags. What should we do with our FarmerSummaryList stored procedure?  We could call it as is, and do a replace on our Front End to replace all "," characters with "<BR>".  That would work, but what if items in the list themselves contain commas?  (In fact, that is an issue of any approach that builds CSV lists in general -- we must ensure that our chosen delimiter cannot exist in the data, and what should be returned if it does?)   We could also create another stored procedure, FarmerSummaryListBR or something like that, that delimits by "<BR>" instead of commas and use that for this specific web page.

The issue here is that suddenly our database code is now becoming dependent on how we wish to format our results.  Altering formatting now requires that we edit our SQL code, not just our ASP.NET code.  And if we alter our SQL code to do formatting, other applications or clients that consume our data might suddenly not work correctly.  Imagine that we have a report that uses the FarmerSummaryList stored procedure -- what happens to our report if we change it to use <BR> tags?  Our report will suddenly output those <BR> tags and become a mess.

A More Flexible T-SQL Solution

One way to fix this is to make our code more generic and reusable, and create it so that formatting is not embedded and hard-coded.  Instead of writing our user-defined function to always return CSV strings, I would suggest instead to write it like this:

alter function GetFruitList(@FarmerID int, @Delimiter varchar(10))
returns varchar(8000)
as
begin
    declare @ret varchar(8000)
    set @ret = ''
   
    select @ret = @ret + @Delimiter + F.FruitName

    from Fruit F
    inner join FarmersFruit FF on F.FruitID = FF.FruitID
    where FF.FarmerID = @FarmerID
    order by FruitName asc

    return substring(@ret,len(@Delimiter + 'x'),8000)
end

Notice that the delimiter is now a parameter; we must pass in what we wish to delimit our string by when calling the function.  This allows us to use a comma or a comma plus a space or a "<BR>" tag or anything else when calling the function.  We also use a little trick at the end to get the length of the delimiter by appending an 'x' to it, in case the delimiter contains trailing spaces which are not counted by the len() function.  

With this new UDF, we can then alter our stored procedure to also accept a delimiter which is simply passed to the function:

alter procedure FarmerSummaryList @Delimiter varchar(10)
as
select
  f.Name, dbo.GetFruitList(F.FarmerID,@Delimiter) as Fruits

from
  Farmers f

order by
  f.FarmerID


Now, when we can call our stored procedure we can specify the delimiter to use to get the exact output we need:

FarmerSummaryList '<br>'

Name                 Fruits
-------------------- -----------------------------
Farmer Ted           Apple<br>Orange
Farmer Fred          Grapefruit<br>Grapes<br>Orange
Farmer Jed           Orange

(3 row(s) affected)

Well, it seems like we solved our little dilemma; we should be good to!

Encountering Limits and Issues with the T-SQL Approach

Suppose now that we are happy with what we've done so far, but then we realize that our web application needs to let users click on each fruit name in our "FruitList" string to jump to the fancy "Show Fruit" page, passing the corresponding FruitID as a URL parameter.  This of course is very typical of a nicely hyper-linked web application.  Our current stored procedure, despite the added flexibility of our @Delimiter parameter, will not help us with this.   What should we do here?  Should we make it much more complicated and add in format strings and other options?  Should we concatenate both the names and the FruitID values into our string somehow, and then parse and extract those values in our application?  If so, what format should we use to do this?  Also, what if we want a "fruit description" to show when you hover over the link via the title attribute, or use other values from the Fruit table in a similar manner?  Perhaps we need to come up with more complex strings embedding all of this data like this:

Apple,1,A delicious treat;Orange,2,Oranges from Florida;Grapes,3,A Whole Bunch Of Them

Hmmm ... Now things are getting a bit more complicated, and once again we find that our output and formatting is getting mixed up with our database code.  We shouldn't need to concatenate and create string formats using our nice clean raw data only to then parse those strings at our client to extract back out that raw data!  That's almost as bad as working with a poorly designed unnormalized database in the first place, with CSV columns and things like that. Isn't there an easier way to do this?

Well, actually, there is. And it's not very exciting for all of the talented T-SQL programmers out there.  In fact, it's quite boring:  Just use the simple, basic FarmerSummary stored procedure.  And do all formatting on the web page in ASP.NET. 

That's it. 

Format At Your Presentation Layer, Not Your Database Layer

Suppose that instead of writing UDFs and worrying about exactly how to return multiple concatenated values in a single string per row from the database, we simply do the following in our ASP.NET code:

  1. Retrieve the clean, normalized, raw data from the FarmerSummary stored procedure with a DataReader
  2. Create a new DataTable, with string columns for FarmerName and FruitList
  3. Loop through the rows in the DataReader, concatenating together a string of HTML hyperlinks for each fruit (No formatting or parsing is require to extract Fruit names or ID's)
  4. Each time a new farmer is encountered, finish off the current HTML string, add a row to the return DataTable, and begin a new HTML string
  5. When done, add the last HTML string and FamerID to the return DataTable
  6. Bind the DataGrid or Repeater or whatever you are using to display the data to the newly created DataTable

We are essentially doing in our loop what our user defined function is doing, only now we are doing it a the client, so incorporating specific formatting into our process makes sense.  We can change our formatting any way we want -- all at the client; we can add class or target attributes to our hyperlink, alter the base address, and so on, all without changing any database code.  And we can output those values separated by commas, or <BR> tags, or in an unordered list, as well.

Here is some sample code demonstrating this technique for our FruitSummary stored procedure.  It just returns the modified, summarized DataTable as indicated above:

        public static DataTable GetFarmerSummary()
        {
            SqlConnection cn = new SqlConnection( <your connection here> );
            SqlCommand cm = new SqlCommand("FarmerSummary", cn);
            cm.CommandType= CommandType.StoredProcedure;

            cn.Open();

            SqlDataReader dr = cm.ExecuteReader();
            string list = "";

            DataTable ret = new DataTable();
            DataRow row = null;
            ret.Columns.Add("FarmerID",typeof(int));
            ret.Columns.Add("FarmerName", typeof(String));
            ret.Columns.Add("FruitList", typeof(String));

            while (dr.Read())
            {
                if ((row == null) || ((int) row["FarmerID"] != (int) dr["FarmerID"]))
                {
                    if (row != null)
                        row["FruitList"] = list;
                    list = "";
                    row = ret.Rows.Add(new object[] {dr["FarmerID"], dr["FarmerName"], list});
                }
                list += String.Format("<a href='Fruit.aspx?ID={0}'>{1}</a><br>", dr["FruitID"],dr["FruitName"]);
            }

            row["FruitList"] = list;
           
            dr.Close();
            cn.Close();

            return ret;
        }

Notice that there really isn't much to it; most of the code is just initializing things, opening the connection, creating the return DataTable, and so on.  The loop is very basic and is easily alterable to build the "FruitList" any way that you want, as shown.  You could use a StringBuilder if necessary, though for very short lists it may not be worth it.  We can call our FarmerSummary stored procedure over and over, many different ways, each time formatting and outputting the results differently, and all we need is one stored procedure and only our client code needs to change. 

Note that is only one approach; there are hundreds of ways to do the same thing in ASP.NET.  We can create custom controls and classes to do this more generically, and we could also trap events on our web page to hide or cancel the creation of specific items in repeaters or lists or grids and format things just the way we need.

Conclusion

So, remember: Avoid putting presentation code in your database layer whenever possible.  You may be doing it without even realizing it, and returning CSV lists from your database is a great example.  By simply returning clean and normalized data to your clients, you make it much easier for those clients to consume that data and to ultimately present and output it any way that they wish.

Finally, you may remember that I recommend this exact same approach when pivoting data:  simply write clean, normalized, basic SQL, and do your formatting/pivoting at the client.  It's easier, it's faster, it's more flexible, it's the way to go.

see also:

Legacy Comments


Sheldon
2007-10-10
re: Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?

What about in cases where you get 100 results but if you did the concat (with commas or whatever) in SQL there are only 10 results? Won't the fact that you are getting less rows speed up the application? Especially when the SQL Sever is a different physical server than the Web Server.

Jeff
2007-10-10
re: Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?
Good point, Sheldon, it all depends on the bottlenecks. You'd have to weigh the cost of the extra data returned versus the cost (in SQL and at the front end) of processing that data, plus the maintenance implications.

Think of the way reporting tools work -- you return data like this in SQL:

custID, custName, tranID, tranAmount
ABC,ABC Company,1,$100
ABC,ABC Company,2,$120
ABC,ABC Company,3,$120
DEF,DEF Corp,4,$392
DEF,DEF Corp,5,$120
..etc...

and on the report you might have a group header on CustID.... but SQL still returns the customer name over and over for each row, just because that's the way relational databases return their results. It's the same basic idea. The only way to avoid that is to return a hierarchical result via XML or some other format.

Regina Obe
2007-10-12
re: Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?
What about using techniques such as the XPATH or CLR to build aggregate functions such as described in the below

http://weblogs.asp.net/dwahlin/archive/2006/09/26/Using-SQL-Server-2005-CLR-Features.aspx - granted XPath syntax is pretty ugly and unintuitive, but more generic.

The concatenate aggregate function packaged as a sample project in SQL Server 2005 (located in :\Program Files\Microsoft SQL Server\90\Samples\Engine\Programmability\CLR\) does what you describe too, but in a really generic way and you can tweak it to take in a conditional delimeter I think (although I admit I haven't tried that). I personally just changed it not to take any delimiter to be a simple glue function.

http://technet.microsoft.com/en-us/library/ms131056.aspx

You can then do

select
Farmers.Name, dbo.Concatenate(FF.FruitName) as Fruits
from Farmers INNER JOIN FarmersFruit FF ON Farmers.FarmerID = FF.FarmerID
inner join Fruit F on F.FruitID = FF.FruitID
GROUP BY Farmers.Name, Farmers.FarmerID
ORDER BY Farmers.NAME

I really wish SQL Server 2005 supported creating aggregates in SQL without forcing DBAs to hope into Visual Studio to create these functions in C# or Vb.

I particularly like PostgreSQL's approach. The way PostgreSQL allows users to create aggregates in various languages (and writing the code right in the database) is just so much easier than the way SQL Server does it. Pretty much just one line of code in PostgreSQL to do the 30 someodd lines that SQL Server forces you to write in CLR.

http://www.commandprompt.com/ppbook/r23733

An if you chose plperl or something like that to write it in, you can just write it in perl without having to hope out of the database, write the code, register your library etc. The code lives right in the db.






Jeff
2007-10-12
re: Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?
Regina -- I appreciate your feedback, but I mentioned both of the options you suggested right in the article .... The point of the article isn't about how to concatenate things in SQL; it's about if we *should*, and what the ramifications are, and the pros and cons of doing it at the database versus at the client, and what to do when things get more complicated (such as the hyperlink example) ...

maybe give it a re-read ?

Regina Obe
2007-10-12
re: Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?
Jeff,

Oops sorry about that, I guess I did miss that line.

I guess the UDF is the easiest way to go, but from an efficiency and reusability stand point, I feel the CLR aggregate function if you have SQL Server 2005 is the better solution. With the UDF you'd be creating a UDF for every table you would need to glue values together. What if you needed a where in there like only a certain category of the fruits in the Farmer's basket.

Of course this is just a difference in opinion so don't take any offense and I know that wasn't quite the point of your article - just thought I would mention it.

Jeff
2007-10-12
re: Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?
Again, Regina, I think you are completely missing the point of the article. It is not about *how* to do this, it's about *where* this should be done -- at the database or the client, and the pros and cons of the different approaches, and the implications of doing this at the database as discussed in the examples .... Maybe try re-reading a 3rd time?? :)

Regina
2007-10-12
re: Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?
Okay - I see your point now and see where it has some benefits. For the most part I disagree.

The reason is that most of the time when I need to do this is in cases like where Sheldon described where I'm returning 1000s of Farmers.

It seems extremely impractical to loop thru all those records (say 1000x20) on the client side just because of the amount of data you would have to pass thru the pipe in that case and most of the time it is simply to just display them in one row like if you need to glue together a sequence of comments. In that case I would use a | or some other delimeter and use a Split function to split into an array and rejoin back or just simply violate your tenet of "Format At Your Presentation Layer, Not Your Database Layer", just because the code you have to write is just so much shorter, easier to understand and efficient when done at the database layer.

Plus I think- stored procedures are pretty dangerous to use except in the specific applications they were designed for anyway.

So 3 -1 = 2 for database.

Jeff
2007-10-15
re: Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?
Again, Regina, I will use the reporting tool analogy. How does it work when you write a Crystal or Access or Reporting services report, and you'd like a group header on customerID/CustomerName and to show detail for sales? Also, suppose there is a "CustomerDescription" column with text that you wish to show on the customer header as well? You only show the CustomerID, CustomerName and CustomerDescription once per customer, yet in the data returned from SQL Server it is repeated over and over for every single row in your data. The report handles the grouping and suppressing and formatting, and in SQL we just return the raw data.

How do you handle situations like this in a typical reporting tool?

Regina Obe
2007-10-15
re: Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?
Jeff,

That's why I said I don't completely disagree with you. I think the example you presented was not a good one. First of all I don't consider C# code presentation layer. It is massaging the data into a format suitable to be pushed into a presentation layer (your aspx file or grid view or whatever). So you have simply pushed your logic from the database layer to what I call the business layer.

Now let me present an example with Crystal. Let us say you have to keep track of 2 sets of things - Farmers Fruits and Farmers Insurance and on the report you need to group Farmers into States and for each state you have to give a tally of the number of Farmers in each state. In this case I pull my data down so I only have one record per farmer and separate glob fields for Fruits and Insurance because I don't particularly care about the formating of how those are listed I just want a simple list. If I were to normalize this - firstly I really can't because Fruits and Insurance don't mix very nicely and it would screw up my counts unless I did some massaging in my C# which is NOT PRESENTATION LAYER. Even if I were able to my C# code that massages it into a format suitable for Crystal or whatever report consumption would look pretty ugly.

The alternative when I do care about the formatting would be to resort to subreports or some xmlish format which is much slower and messier to control on the report. Yes when I really care about formatting I do that, but most of the time - for these short list things - a simple list is good enough

Jeff
2007-10-15
re: Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?
>>That's why I said I don't completely disagree with you. I think the example you presented was not a good one. First of all I don't consider C# code presentation layer. It is massaging the data into a format suitable to be pushed into a presentation layer (your aspx file or grid view or whatever). So you have simply pushed your logic from the database layer to what I call the business layer.

Regina -- C# code can be *both*. Your business layer can return a standard, normalized DataTable, and you can have a *presentation* C# function (not in your business layer, but at your asp.net presentation layer) that puts it into a "presentation" dataTable for DataGrid consumption. And that's just one simple example, you can use asp.net databinding events and many other simple ways in ASP/ASP.NET/etc to do this .... the point is that it takes about 5 lines of code in C#, it allows your SQL to be clean, simple and standard, it eliminates concatenation in SQL and parsing, and it completely leaves all presentation at your presentation layer.

I would NEVER suggest have a function like this in your business layer, because that would be (as you said) just doing in C# what the SQL concatenation is doing; you'd still be mixing data and presentation, you'd still need to unparse these strings and rejoin them, you'd still have re-use issues, etc ... I am sorry that my example confused you, it was just showing an example of how simple the algorithm can be, not a suggestion how to implement this in your business layer code as opposed to in T-SQL.

Regina
2007-10-16
re: Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?
Jeff,

I guess this is something I have given a great deal of thought to too and I still think about it a lot when the question poses its head and sometimes I go one way or the other, but you seem to kind of dismiss it as - YOU should really try not to do this which really bothers me. Like the example I provided, I would be interested in knowing how you would do this at the presentation level because that is a use case where I often side with putting it in the DB.

Like I said for simple formatting - I don't see the point of making the presentation layer work harder to produce nothing of interest and frankly parsing delimited text seems just as easy to me as what you are doing (and I can have a real function that takes the text and breaks it out and have the function called right from the .ASPX side like a formatting function.

C# was not designed to be a presentational language even though yes you can use it that way - just as I can use SQL to solve convoluted mathematical equations. If indeed there was a lot of formatting and grouping etc. (which is the realm of aspx and report writers) and farmer fruits was the focus of my reports (as it would be with sales orders) - then by all means yes normalize. To me those are 2 separate things. Simple format vs. (complex format and grouping)

My problem with your putting html in your C# datatable and saying (ah but its a presentational datatable what the hell does that mean?) is that you have completely taken out the ability to style it and all that from the .ASPX side. That is why I don't consider that you are using your C# in a useful presentational way and therefore a poor example of a format at the presentational layer.

The issue about reuse - lets say you have to link the farmer on multiple pages on your site as I have often had to do. Then in those cases to prevent repeating said C# code - I would move it out of the code behind file into a separate class where it can be used by multiple pages so I don't have to re repeat it. But hay one day on one page I need the styling of the link slightly different from the other pages or I don't want a link at all - so I need to either create another copy of my code on the .aspx.cs page or make my C# presentational datatable function take in a whole bunch of arguments - which brings us back to what you complained about with the stored procedure.

Now like I said - for reporting use cases - I consider stored procedures - particularly bad uses - because they are opaque things and you don't want opaque things when you are reporting or drawing a datagrid unless you know that page owns that stored proc. Someone can take out the field in the stored proc or sort by god forbid fruits instead of farmers and you are screwed anyway. You would be better off using a stored function, or making sure you have a 1 to 1 with your use cases or just god forbid writing the sql in your report writer and therefore you wouldn't have these issues you presented at all.

Jeff
2007-10-16
re: Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?
Alright, one last try. It seems that you keep missing the point of what I am trying to write ....

>>C# was not designed to be a presentational language even though yes you can use it that way - just as I can use SQL to solve convoluted mathematical equations.

that is wrong. The ToString() method formats. String.Format() formats and presents code. In ASP.NET, every piece of data that appears on the page comes from C# (or VB.NET) code. C# is a programming language, every asp.net page and every windows form is ENTIRELY MADE OF C# CODE. And in order to format data grids, lists, tables, or anything else, you use C# code to do it-- conditional formatting, suppressing rows, presenting dates and numbers, fonts, colors, borders, sizes, etc. It is a presentation code.

If you feel that C# cannot be "presentation code", despite the fact that asp.net and windows forms are inherited from base c# classes and the very way that we create these forms is by writing C# code, then I am not sure what to tell you.

>> If indeed there was a lot of formatting and grouping etc. (which is the realm of aspx and report writers) and farmer fruits was the focus of my reports (as it would be with sales orders) - then by all means yes normalize. To me those are 2 separate things. Simple format vs. (complex format and grouping)

Not sure what you are saying here -- sounds like you are agreeing with me, so that's good.

>> My problem with your putting html in your C# datatable and saying (ah but its a presentational datatable what the hell does that mean?) is that you have completely taken out the ability to style it and all that from the .ASPX side.

Again, you are not getting it. This is *not* a library function. This is *not* designed for re-use. It is just a simple code-behind function in a specific page that formats the generic data in one specific way for one specific data grid. As I said about 10 times so far, it is JUST AN EXAMPLE OF ONE WAY TO DO IT. You can use events on your ASP.NET page to conditional concatenate and/or suppress rows, or build an ASP.NEt table manually by looping through rows, or other simple code-behind functions to do formatting and grouping, all on your ASP.NET page, all at your presentation layer. Or you can create an asp.net custom control that does this more generically.

To repeat: I AM NOT SUGGESTING THAT FUNCTION AS A "LIBRARY" FUNCTION TO BE RE-USED IT, OR TO BE PUT IN YOUR DATA ACCESS LAYER -- IT IS JUST AN *EXAMPLE* THAT IS BOUND TO THE SPECIFIC PAGE USING IT.

>>That is why I don't consider that you are using your C# in a useful presentational way and therefore a poor example of a format at the presentational layer.

Again, see above.

>>The issue about reuse - lets say you have to link the farmer on multiple pages on your site as I have often had to do. Then in those cases to prevent repeating said C# code - I would move it out of the code behind file into a separate class where it can be used by multiple pages so I don't have to re repeat it. But hay one day on one page I need the styling of the link slightly different from the other pages or I don't want a link at all - so I need to either create another copy of my code on the .aspx.cs page or make my C# presentational datatable function take in a whole bunch of arguments - which brings us back to what you complained about with the stored procedure.

And yet again, see above.

I apologize for losing my patience, and for emphasizing things in an obnoxious ALL CAPITALS manner, but I am not sure how else to get my points across.

Jeff
2007-10-16
re: Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?
Let me also try taking a different angle.

To summarize my position:

1) concatenating in SQL can be done, it can be useful -- just be careful and consider if you are forcing specific presentation into your SQL Code. If you do this, consider accepting a @Delimiter parameter.

2) if the concatenation gets more complex, as in my hyperlink example, be very careful because now you are doing lots of string manipulation when maybe you might not need to. And, you are getting closer to needing one specific stored procedure for every different way you want to present the exact same data.

3) thus, consider keeping your SQL code short, simple and generic, and formatting at your presentation layer. Here's a quick example of one way to do this, to demonstrate how easy it is.

that's it. that's my position.

Regina
2007-10-16
re: Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?
Jeff,

I apologize too for being kind of dense about the matter.

I like your summarization of your position. It is much clearer and I think we are pretty much in agreement.

My similar summary
Concatenation in SQL can be done and can be useful - just be aware when you are doing it and if it can be done simpler in your presentation layer.

Case in point I do a lot of concatenation in views that are linked via MS Access for users who then use that to create reports and queries. They know nothing about C# or VB and how to prevent multiple joins from messing up their counts and often they just want to see their comments in a single field. There is no way to easily inject C# and VB in there for these users without taking away some of their ability to do their own reports.

My other point about C# not being presentation (I do use it a lot for creating formatting functions and in row bound event of grid views so don't get me wrong I just don't use it for generating data tables because the idea of reshaping data into another container seems something more suited for a database to do (I'll think about that a bit more - I guess there are cases such as with your pivot example where it makes sense). So maybe I was getting needlessly obsessed about your choice of example) . Very sorry about that.


Jeff
2007-10-16
re: Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?
No problem, Regina, glad we are on the same page. It's always good to have a discussion to help clarify viewpoints and present ideas.

As for the "presentation" DataTable, I do agree that it is perhaps not the best example and it can cause confusion as to where this code lives; to be 100% honest, I used that as an example simply because it was something that I could wrap all in one function easily in the blog post, as opposed to doing something with a DataGrid and code-behind on the events which would take much more work to format and explain.

Perhaps I will add some clarifying comments or other examples to the blog post at some point this week. thanks for bringing up your points, it is appreciated.

Reggie
2008-01-11
re: Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?
Does this only work in SQL Server 2005?

Jeff
2008-01-11
re: Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?
>> Does this only work in SQL Server 2005?

Does what work? The entire point of the article is NOT to do anything specific in SQL, just simple format and concatenate data at your application layer. It's easier, shorter, usually more efficient, more flexible, and so on.

The UDF example shown works in SQL >= 2000.

Reggie
2008-01-11
re: Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?
>> The UDF example shown works in SQL >= 2000.

This is what I was asking. Thanks for your response.

Jay
2009-03-25
re: Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?
I like this approach for SQL2005:

SELECT STUFF((SELECT ', '+last_name FROM participant_demographics FOR XML PATH('')),1,2,'') last_names