I promise to get back to writing articles on a more regular basis soon, but in the meantime, here's a comment from Nathan A. on using DISTINCT and ORDER BY:
This is actually a problem I have been puzzling over for quite a while now, and I actually need to do that sort. I wonder if I may have to create another column that holds the list of ordering values in increasing order. For my example above, assuming a letter table and a number table that contains the numbers for each letter, the number table would not change, but I would add this second column to the letter table:
A 0, 1
But I don't like that idea as it is a denormalization and would require extra maintenance.
I can think of a way to do it with a specified number of subsequent rows to sort by. In your example you order by the minimum value of the number column. If we wanted to order by the minimum, then by the second minimum, then by the third minimum, we could use nested queries to select each of those values into different columns and order by them. I have developed a query like this, but there are some issues. First of all, getting the second and third lowest values requires nested queries themselves, so this would require many nested queries (not sure if that is a problem), and the number of nested queries increases with how many levels down you want to sort by. The other problem with this method is that you have to specify how many levels down you want to sort by; you can't just sort by a concatenation of all the numbers.
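For illustration only, sorting by the lowest and then the second-lowest number could be sketched roughly as below. The table and column names (Letters, LetterNumbers, Letter, Number) are assumptions based on the letter/number example, not the actual query, and each extra sort level would need another, more deeply nested subquery.

```sql
-- Hypothetical tables: Letters(Letter), LetterNumbers(Letter, Number)
select l.Letter,
       -- lowest number for this letter
       (select min(n1.Number)
          from LetterNumbers n1
         where n1.Letter = l.Letter) as Min1,
       -- second-lowest: the smallest number greater than the lowest
       (select min(n2.Number)
          from LetterNumbers n2
         where n2.Letter = l.Letter
           and n2.Number > (select min(n3.Number)
                              from LetterNumbers n3
                             where n3.Letter = l.Letter)) as Min2
from Letters l
order by Min1, Min2
```

A letter with only one number gets NULL for Min2, which SQL Server sorts ahead of any value, so it lands before letters that share the same minimum but have more numbers.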
The real-world example of this problem actually seems like it would be useful in many situations. Consider a task table that has a list of tasks for people to accomplish, and an assignee table that has a list of people assigned to each task. A task can have many people assigned to it. I want to get a list of tasks sorted by their assignees in alphabetical order, so if the joined table looked like this:
The result would be in this order:
1 John, Mark
Let me know what you think the best approach is.
Hi Nathan --
Well, one way to handle this is to write a User Defined Function that returns a string concatenating the distinct Assignees for the Task provided as a parameter. We can use a UDF similar to this one as an example. UDFs such as these are the simplest and most efficient way I have seen to handle concatenation at the database layer, though there are other methods you can try.
So, let's say we have this for a schema and sample data:
create table Tasks (TaskID int primary key, TaskName varchar(10))
create table TaskAssignees
( TaskID int references Tasks(TaskID), Assignee varchar(10), primary key (TaskID, Assignee))
insert into Tasks
select 1,'Task A' union all
select 2,'Task B' union all
select 3,'Task C'
insert into TaskAssignees
select 1, 'John' union all
select 1, 'Mark' union all
select 2, 'John' union all
select 3, 'Mark' union all
select 3, 'Ed'
We can create a UDF like this:
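create Function GetTaskAssignees(@TaskID int)
returns varchar(100)
as
begin
  declare @ret varchar(100)
  set @ret = ''
  select @ret = @ret + ', ' + Assignee
  from TaskAssignees
  where TaskID = @TaskID
  order by Assignee
  return substring(@ret, 3, 100)  -- drop the leading ', '
end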
... and get the output you are looking for like this:
select TaskID, TaskName, dbo.GetTaskAssignees(TaskID) as Assignees
from Tasks
order by Assignees
TaskID TaskName Assignees
----------- ---------- --------------
3 Task C Ed, Mark
2 Task B John
1 Task A John, Mark
So, this should actually work well, though for large sets of data performance may be an issue, since the function runs once for every row returned. As an added bonus, this handles presentation of the names assigned to each task for you as well.
Depending on the data, however, you may need to concatenate items of a fixed length, padded by spaces, instead of simply comma-separated. This would apply if you are sorting by numeric values, such as:
Notice that "1,23,25" sorts before "1,3" in the example above, since it is just comparing two strings. To solve this, you'd have to write the UDF to output like this:
1 , 3
1 ,23 ,45
6 ,12 , 4
That way, " 3" (right-aligned, with a space in front of it) sorts correctly before "23". You could also pad with leading zeroes or, for text values, pad after the value instead of before. The trick here is to identify how much padding you need.
If you need assistance with altering the example UDF shown to pad the output, let me know.
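As a starting point, a padded variant of the UDF might look something like the sketch below. It follows the same pattern as GetTaskAssignees, but the TaskNumbers table, its Number column, the function name, and the 5-character field width are all assumptions made for illustration; pick a width that fits the largest value you expect.

```sql
-- Hypothetical table: TaskNumbers(TaskID int, Number int)
create function GetTaskNumbersPadded(@TaskID int)
returns varchar(200)
as
begin
    declare @ret varchar(200)
    set @ret = ''
    -- Right-align each number in a 5-character field so that string
    -- comparison agrees with numeric order (assumes values 0 to 99999).
    select @ret = @ret + ',' + right(space(5) + cast(Number as varchar(5)), 5)
    from TaskNumbers
    where TaskID = @TaskID
    order by Number
    return substring(@ret, 2, 200)  -- drop the leading ','
end
go

-- Usage: sort tasks by their padded number lists
select TaskID, dbo.GetTaskNumbersPadded(TaskID) as Numbers
from Tasks
order by Numbers
```

Padding with leading zeroes instead would only mean swapping space(5) for replicate('0', 5). If the padded string looks awkward on screen, you can sort by it and display an unpadded version in a separate column.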