Jeff Smith Blog

Random Thoughts & Cartesian Products with Microsoft SQL Server

The Mailbag: More on Ordering by Distinct Values …

I promise to get back to writing articles on a more regular basis soon, but in the meantime, here's a comment from Nathan A. on using DISTINCT and ORDER BY:

Jeff,

This is actually a problem I have been puzzling over for quite a while now. I actually need to do that sort. I wonder if I may have to create another column that has the list of ordering values in it in increasing order so for my example above assuming a letter table and a number table that contains the numbers for letters, the number table would not change but I would add this second column to the letter table:

B 0
A 0, 1
C 2
B 3

But I don't like that idea as it is a denormalization and would require extra maintenance.

I can think of a way to do it with a specified number of subsequent rows to sort by. In your example you order by the minimum value of the number column. If we wanted to order by the minimum, then by the second minimum, then by the third minimum we can use nested queries to select each of those values in different columns and order by them. I have developed  query like this but there are some issues. First of all, getting the second and third lowest values requires nested queries themselves so this would require many nested queries (not sure if that is a problem). The number of nested queries increases based on how many levels down you want to sort by. The other problem with this method is that you have to specify how many levels down you want to sort by, you can't just sort by a concatenation of all numbers.

The real world example of this problem actually seems like it would be useful in many situations. Consider a task table, that has a list of tasks for people to accomplish and an assignee table that has a list of people assigned to the task. A task can have many people assigned to it. I want to get a list of tasks sorted by their assignees in alphabetical order so if the joined table looked like this:

Task Assignee
1 John
1 Mark
2 John
3 Mark

The result would be in this order:
2 John
1 John, Mark
3 Mark


Let me know what you think the best approach is.


Hi Nathan --

Well, one way to handle this is to write a User Defined Function that returns a string concatenating distinct Assignees for the Task provided as a parameter.  We can use a UDF similar to this one as an example.  UDF's such as these are the simplest and most efficient way I have seen to handle concatenation at the database layer, though there are other methods you can try.

So, let's say we have this for a schema and sample data:

create table Tasks (TaskID int primary key, TaskName varchar(10))
go

create table TaskAssignees
(   TaskID int references Tasks(TaskID), Assignee varchar(10), primary key (TaskID, Assignee))
go

insert into Tasks
select 1,'Task A' union all
select 2,'Task B' union all
select 3,'Task C'

insert into TaskAssignees
select 1, 'John' union all
select 1, 'Mark' union all
select 2, 'John' union all
select 3, 'Mark' union all
select 3, 'Ed'

We can create a UDF like this:
   
create Function GetTaskAssignees(@TaskID int)
returns varchar(100)
as
    begin
    declare @ret varchar(100)
    set @ret = ''

    select @ret= @ret + ', ' + Assignee
    from TaskAssignees
    where TaskID = @TaskID
    order by Assignee

    return substring(@ret,3,100)
    end

... and get the output you are looking for like this:

select TaskID, TaskName, dbo.GetTaskAssignees(TaskID) as Assignees
from Tasks
order by Assignees

TaskID      TaskName   Assignees
----------- ---------- --------------
3           Task C     Ed, Mark
2           Task B     John
1           Task A     John, Mark

So, this should  actually work for well, though for large sets of data performance may be an issue.  As an added bonus, this handles presentation of the names assigned to each task for you as well.

Depending on the data, however, you may need to concatenate items of a fixed length, padded by spaces, instead of simply comma-separated.   This would apply if you are sorting by numeric values, such as:

1,23,45
1,3
6,12,4

Notice that "1,23,25" sorts before  "1,3" in the example above, since it is just comparing two strings.  To solve this, you'd have to write the UDF to output like this:

1 , 3
1 ,23 ,45
6 ,12 , 4

That way, " 3" (padded to the right with a space) sorts correctly before "23".  You could also pad with leading zeroes, or padding after the value instead of before.  The trick here is to identify how much padding you need.

If you  need assistance with altering the example UDF shown to pad the output, let me know.

Legacy Comments


Anonymous
2008-12-30
re: The Mailbag: More on Ordering by Distinct Values ...
You can also do this (SQL 2005 and above):

SELECT
DISTINCT
t.TaskID
,t.taskName
,STUFF((SELECT ', ' + t2.Assignee FROM TaskAssignees AS t2 WHERE t2.taskID = t.taskID FOR XML PATH('')), 1, 2, '') AS EMP_NAMES
FROM dbo.Tasks AS t
inner join dbo.TaskAssignees AS TA
on t.taskID = TA.taskID
ORDER BY Emp_Names

smeet
2009-01-28
Can we use EXEC (@sqlvar) in UDF
Can't we use [b]exec[/b]ute statement in functions?????

If yes then how???????
plz provide example too.......

i have created a function and due to some reason i have to store the sql statement in a variable to execute the statement.....
But sql says invalid use of EXECUTE in function.......


****************--------*********************
My second question is : How can i store a value returned from execute(tsql) in a variable.

Ex:------------- -------------->>>>>>>>
declare @sal decimal (16,2)
declare @tsql varchar(4000)

set @tsql = "select " + @sal + " = salary from employee"
exec (@tsql)
-------------------------------

i need the returned salary in the variable @sal, so that i can further manipulate it.
I dont want to create a physical table or to store the returned value in a temp table from executing the sql variable.

If it is possible in procedure then also fine. But with no use of table/temp table

Thanks

sqlgeek
2009-02-16
re: The Mailbag: More on Ordering by Distinct Values ...
thanks
great post


http://sql-insights.blogspot.com/


Catto
2009-08-05
re: The Mailbag: More on Ordering by Distinct Values ...
Hey Now,

That sure is a great post.

Thx 4 the info,
Catto

George
2009-09-22
re: The Mailbag: More on Ordering by Distinct Values ...
Come back Jeff, we miss your writings! Hope all is well :)