Jeff's SQL Server Blog

Random Thoughts & Cartesian Products with Microsoft SQL Server
posts - 144, comments - 1810, trackbacks - 64

My Links

SQLTeam.com Links

News

Welcome to my weblog. My name is Jeff Smith, and I am a software developer in Boston, MA. Check in frequently for tips, tricks, commentary and ideas on SQL Server and .NET programming.


Subscribe





Archives

Post Categories

Programming

Sports

SQL

SELECT DISTINCT and ORDER BY

Let's take a look at another one of those stupid, arbitrary SQL Server error messages that Bill Gates clearly only created because Micro$oft is evil and incompetent and they want to annoy us (and probably kill baby squirrels, too):

Msg 145, Level 15, State 1, Line 4
ORDER BY items must appear in the select list if SELECT DISTINCT is specified.

This message pops up when you ask for DISTINCT rows for one set of columns, but you'd like to have the results ordered by one or more columns not specified in your distinct set.  For some reason, SQL Server will not allow this!  Why not?

Let's look at an example.  Suppose we have the following data:

Letter  Value
----    ------   
A       1
A       1
B       3
B       3
C       2

From that, let's say we'd like see only DISTINCT letters, ordered by value.  Clearly, this means we want to return:

A
C
B

... right?  "A" has the lowest values of 1, followed "C" with a value of 2, and finally "B" with values of 3.  Yet, if we write:

select distinct letter
from x
order by Value

we receive back the above error message.  Why?

Remember, the "rules" of what is allowed and not allowed in SQL is not determined by how your data looks at a particular point in time; it is a constant.  Just because one set of data seems to be perfectly orderable in this scenario does not mean that all data is always going to be perfectly orderable as well.  What if we had this set of data instead:

Letter  Value
----    -----   
A       1
A       1
B       3
B       0
C       2

Look carefully at the above, and think about what should be returned when asking for distinct Letters ordered by Value. 

The answer is .... there is no correct answer!  B contains a value of both 0 and 3, so should it appear first, or should it appear last?  The answer is -- who knows?!  It is not specified when simply asking for distinct Letters ordered by Value.   It is not a clear, complete, deterministic question that you are asking SQL Server to answer; it is like asking someone "what color is it when you add 4 and 3?"  -- the question makes no sense and cannot be conclusively answered as stated.

Now, one person might say, "well, clearly, we want to B to be returned first, since it has the value of 0 and that is lower than all the other values", and another might say "well, clearly, we want B to be returned last, since it has the value of 3 and that is clearly higher than all the other values."  And both people would technically be right, and both are entitled to their logic and their needs -- but that logic and those requirements need to be specified; they cannot be implied, since there is more than one way to interpret the question.  Computers are funny that way, right? They are pretty darn smart, but they always insist on being told exactly what it is we want.   How annoying!

(I often use this analogy regarding users or programmers who refuse to express their requirements or needs logically and completely: It is like a magic Genie saying "you can have anything in the world, all you need to do is tell me specifically what you want and make it perfectly clear and you will get it!", and the users thinking about it, and eventually concluding "nahh... not worth it, sounds like too much work; can't you just figure out what I want?")

So, what is the solution here? To make the error "just go away", we can try just adding the Value column to the SELECT clause like this:

select distinct Letter, Value
from x
order by Value

In our original set of data, this works fine -- it returns the results we want and expect, only with an additional column.  If your data has a 1-1 relation between the ORDER BY columns and the SELECT columns, then simply adding the ORDER BY columns to your SELECT does the trick. 

However, running that SELECT on the second set of data (in which "B" has values of 0 and 3) results in "B" being repeated twice:

B    0
A    1
C    2
B    3

To solve this, like usual, we simply must decide what we want, and explicitly tell SQL Server what that is.  If we want to order by the letter using the lowest associated value, then we simply ask for that ... not by using DISTINCT, but by using GROUP BY:

select Letter
from x
group by Letter
order by min(Value)

This results in:

B
A
C

... And if we want to do the opposite, and order by the Letter with the greatest associated value, then we write:

select Letter
from x
group by Letter
order by max(Value)

A
C
B

We may even want to add up all of the values for each letter and order by that, or maybe there is some other rule. The key is that SQL Server cannot help you until you specifically express what you are looking for.  And that is what this error message is all about -- "Hey, what you are asking for can be interpreted many different ways; can you be more specific?"

SQL (and any other programming language) really is like the magic Genie; it can give you whatever you want, but there is a catch -- you need to ask for it!


see also:

Print | posted on Thursday, December 13, 2007 2:58 PM

Feedback

# re: SELECT DISTINCT and ORDER BY Errors

Nice. I've been working with SQL for 12 years (gulp!) and never knew that you could use aggregate functions in your order by clause. Of course, I almost never use order by . . . but now I know (and knowledge is power!)
12/13/2007 3:21 PM | Bob

# re: SELECT DISTINCT and ORDER BY

Bob you never use order by ?
12/14/2007 12:57 PM | Jon

# re: SELECT DISTINCT and ORDER BY

Yawn, "micro$oft" ? yeah because dell, apple, oracle and the like are not-for-profit charity organisations that help little old ladies over the road.

m()n$4Nt0, now that's more like it. I'd be more worried about disappearing bees due to GM crops, than I would be of MS.

http://en.wikipedia.org/wiki/Colony_Collapse_Disorder

No bees = no food. Yikes!
12/18/2007 4:39 AM | adolf garlic

# re: SELECT DISTINCT and ORDER BY

>>yeah because dell, apple, oracle and the like are not-for-profit charity organisations that help little old ladies over the road.

What? There's no way that those companies are not-for-profit charity organizations! You are totally wrong about this, I even looked it up!

12/19/2007 8:27 AM | Jeff Smith

# re: SELECT DISTINCT and ORDER BY

" Bob you never use order by ?"

Not never but very rarely. I manage a data warehouse with a Business Objects front-end; Data Loading doesn't care about alpha order and BO sorts your queries for you.

When I do use it, it's for much simpler purposes than this.
12/20/2007 1:32 PM | Bob

# re: SELECT DISTINCT and ORDER BY

ItemRS.Open "select distinct isnull(itemname,' ')from orderdetail", frmpomenu.Pocon, adOpenForwardOnly, adLockReadOnly
This is my query, in which i have to specify order by itemname. The resulting recordset will be populated to a combo.

Pls help me to rectify this query.
1/10/2008 5:03 AM | SUBA

# re: SELECT DISTINCT and ORDER BY

you rock
3/28/2008 11:26 AM | duncan campbell

# re: SELECT DISTINCT and ORDER BY

Thank you. This should really be the first result in Google, not a page or two down the list. A clear, concise and well written solution to the error.
4/27/2008 9:54 PM | Ben Williams

# re: SELECT DISTINCT and ORDER BY

Thank you. This should really be the first result in Google, not a page or two down the list. A clear, concise and well written solution to the error.
4/27/2008 9:54 PM | Ben Williams

# re: SELECT DISTINCT and ORDER BY

SELECT DISTINCT *
FROM web_artistas
WHERE web_artistas.id NOT IN (
SELECT web_muestras_artistas.idartista
FROM web_muestras_artistas WHERE web_muestras_artistas.idmuestra = '1') AND web_artistas.ididioma = '1' ORDER BY web_artistas.apellido ASC

In 4.0.25-standard version, not runs... but in version 5 yes.
Any solution for this?

THANKS A LOT!
5/7/2008 5:03 PM | Marcos Mans

# re: SELECT DISTINCT and ORDER BY

Really Nice suggestions.

Thanks Jeff Smith :)
7/4/2008 6:17 AM | bbp

# re: SELECT DISTINCT and ORDER BY

Stewy -- my response is here.
7/17/2008 9:03 AM | jeff

# re: SELECT DISTINCT and ORDER BY

Haha we are probably have same case LoL....

Thanks anyway for the tip!
This might help me!!!

==================================================================================
Quote:
"You can learn from your mistakes but not from being perfect"
7/19/2008 1:45 AM | AiryceSoft

Post Comment

Title  
Name  
Email
Url
Comment   
Please add 3 and 3 and type the answer here:

Powered by: