SELECT DISTINCT and ORDER BY
Let's take a look at another one of those stupid, arbitrary SQL Server error messages that Bill Gates clearly only created because Micro$oft is evil and incompetent and they want to annoy us (and probably kill baby squirrels, too):
Msg 145, Level 15, State 1, Line 4
ORDER BY items must appear in the select list if SELECT DISTINCT is specified.
ORDER BY items must appear in the select list if SELECT DISTINCT is specified.
This message pops up when you ask for DISTINCT rows for one set of columns, but you'd like to have the results ordered by one or more columns not specified in your distinct set. For some reason, SQL Server will not allow this! Why not?
Let's look at an example. Suppose we have the following data:
Letter Value
---- ------
A 1
A 1
B 3
B 3
C 2
---- ------
A 1
A 1
B 3
B 3
C 2
From that, let's say we'd like to see only DISTINCT letters, ordered by value. Clearly, this means we want to return:
A
C
B
C
B
... right? "A" has the lowest values of 1, followed "C" with a value of 2, and finally "B" with values of 3. Yet, if we write:
select distinct letter
from x
order by Value
from x
order by Value
we receive back the above error message. Why?
Remember, the "rules" of what is allowed and not allowed in SQL is not determined by how your data looks at a particular point in time; it is a constant. Just because one set of data seems to be perfectly orderable in this scenario does not mean that all data is always going to be perfectly orderable as well. What if we had this set of data instead:
Letter Value
---- -----
A 1
A 1
B 3
B 0
C 2
---- -----
A 1
A 1
B 3
B 0
C 2
Look carefully at the above, and think about what should be returned when asking for distinct Letters ordered by Value.
The answer is .... there is no correct answer! B contains a value of both 0 and 3, so should it appear first, or should it appear last? The answer is -- who knows?! It is not specified when simply asking for distinct Letters ordered by Value. It is not a clear, complete, deterministic question that you are asking SQL Server to answer; it is like asking someone "what color is it when you add 4 and 3?" -- the question makes no sense and cannot be conclusively answered as stated.
Now, one person might say, "well, clearly, we want to B to be returned first, since it has the value of 0 and that is lower than all the other values", and another might say "well, clearly, we want B to be returned last, since it has the value of 3 and that is clearly higher than all the other values." And both people would technically be right, and both are entitled to their logic and their needs -- but that logic and those requirements need to be specified; they cannot be implied, since there is more than one way to interpret the question. Computers are funny that way, right? They are pretty darn smart, but they always insist on being told exactly what it is we want. How annoying!
(I often use this analogy regarding users or programmers who refuse to express their requirements or needs logically and completely: It is like a magic Genie saying "you can have anything in the world, all you need to do is tell me specifically what you want and make it perfectly clear and you will get it!", and the users thinking about it, and eventually concluding "nahh... not worth it, sounds like too much work; can't you just figure out what I want?")
So, what is the solution here? To make the error "just go away", we can try just adding the Value column to the SELECT clause like this:
select distinct Letter, Value
from x
order by Value
from x
order by Value
In our original set of data, this works fine -- it returns the results we want and expect, only with an additional column. If your data has a 1-1 relation between the ORDER BY columns and the SELECT columns, then simply adding the ORDER BY columns to your SELECT does the trick.
However, running that SELECT on the second set of data (in which "B" has values of 0 and 3) results in "B" being repeated twice:
B 0
A 1
C 2
B 3
A 1
C 2
B 3
To solve this, like usual, we simply must decide what we want, and explicitly tell SQL Server what that is. If we want to order by the letter using the lowest associated value, then we simply ask for that ... not by using DISTINCT, but by using GROUP BY:
select Letter
from x
group by Letter
order by min(Value)
from x
group by Letter
order by min(Value)
This results in:
B
A
C
A
C
... And if we want to do the opposite, and order by the Letter with the greatest associated value, then we write:
select Letter
from x
group by Letter
order by max(Value)
A
C
B
from x
group by Letter
order by max(Value)
A
C
B
We may even want to add up all of the values for each letter and order by that, or maybe there is some other rule. The key is that SQL Server cannot help you until you specifically express what you are looking for. And that is what this error message is all about -- "Hey, what you are asking for can be interpreted many different ways; can you be more specific?"
SQL (and any other programming language) really is like the magic Genie; it can give you whatever you want, but there is a catch -- you need to ask for it!
see also:
- By The Way ... DISTINCT is not a function ...
- Is it a String Literal or an Alias?
- SELECT * FROM TABLE -- except for these columns
- But *WHY* Must That Column Be Contained in an Aggregate Function or the GROUP BY clause?
- In SQL, it's a Case Expression, *not* a Case Statement
- Sometimes the problem isn't the code. It's the specs.
Legacy Comments
Bob
2007-12-13 |
re: SELECT DISTINCT and ORDER BY Errors Nice. I've been working with SQL for 12 years (gulp!) and never knew that you could use aggregate functions in your order by clause. Of course, I almost never use order by . . . but now I know (and knowledge is power!) |
Jon
2007-12-14 |
re: SELECT DISTINCT and ORDER BY Bob you never use order by ? |
adolf garlic
2007-12-18 |
re: SELECT DISTINCT and ORDER BY Yawn, "micro$oft" ? yeah because dell, apple, oracle and the like are not-for-profit charity organisations that help little old ladies over the road. m()n$4Nt0, now that's more like it. I'd be more worried about disappearing bees due to GM crops, than I would be of MS. http://en.wikipedia.org/wiki/Colony_Collapse_Disorder No bees = no food. Yikes! |
Jeff Smith
2007-12-19 |
re: SELECT DISTINCT and ORDER BY >>yeah because dell, apple, oracle and the like are not-for-profit charity organisations that help little old ladies over the road. What? There's no way that those companies are not-for-profit charity organizations! You are totally wrong about this, I even looked it up! |
Bob
2007-12-20 |
re: SELECT DISTINCT and ORDER BY " Bob you never use order by ?" Not never but very rarely. I manage a data warehouse with a Business Objects front-end; Data Loading doesn't care about alpha order and BO sorts your queries for you. When I do use it, it's for much simpler purposes than this. |
SUBA
2008-01-10 |
re: SELECT DISTINCT and ORDER BY ItemRS.Open "select distinct isnull(itemname,' ')from orderdetail", frmpomenu.Pocon, adOpenForwardOnly, adLockReadOnly This is my query, in which i have to specify order by itemname. The resulting recordset will be populated to a combo. Pls help me to rectify this query. |
duncan campbell
2008-03-28 |
re: SELECT DISTINCT and ORDER BY you rock |
Ben Williams
2008-04-27 |
re: SELECT DISTINCT and ORDER BY Thank you. This should really be the first result in Google, not a page or two down the list. A clear, concise and well written solution to the error. |
Ben Williams
2008-04-27 |
re: SELECT DISTINCT and ORDER BY Thank you. This should really be the first result in Google, not a page or two down the list. A clear, concise and well written solution to the error. |
Marcos Mans
2008-05-07 |
re: SELECT DISTINCT and ORDER BY SELECT DISTINCT * FROM web_artistas WHERE web_artistas.id NOT IN ( SELECT web_muestras_artistas.idartista FROM web_muestras_artistas WHERE web_muestras_artistas.idmuestra = '1') AND web_artistas.ididioma = '1' ORDER BY web_artistas.apellido ASC In 4.0.25-standard version, not runs... but in version 5 yes. Any solution for this? THANKS A LOT! |
bbp
2008-07-04 |
re: SELECT DISTINCT and ORDER BY Really Nice suggestions. Thanks Jeff Smith :) |
Stewy
2008-07-15 |
re: SELECT DISTINCT and ORDER BY I have an issue with both DISTINCT and GROUP BY. The issue is that using either one, the results comes back ordered as if using order by. I need the unique results in the order they are in the database. How can I do this? Thanks |
jeff
2008-07-17 |
re: SELECT DISTINCT and ORDER BY Stewy -- my response is here. |
AiryceSoft
2008-07-19 |
re: SELECT DISTINCT and ORDER BY Haha we are probably have same case LoL.... Thanks anyway for the tip! This might help me!!! ================================================================================== Quote: "You can learn from your mistakes but not from being perfect" |
lorddef
2008-10-15 |
re: SELECT DISTINCT and ORDER BY Also in addition to what "adolf garlic" said, the behaviour you mention is not SQL server specific, it's actually part of the SQL standard. SQL server is doing thinjgs the correct way, even if it is a PITA. |
Jeff Smith
2008-10-15 |
re: SELECT DISTINCT and ORDER BY lorddef -- Yes, that is kinda my point -- this behavior is the correct, logical way as explained. (Hint: When someone who writes a Microsoft SQL Server blog makes a comment about how "micro$oft is killing baby squirrels", consider the possibility that perhaps they are making fun of the anti-Microsoft ignorance out there ...) |
Daniel
2008-11-06 |
re: SELECT DISTINCT and ORDER BY I have the same question that Stewy asked. I have read your response to him but that didn't help me. Basically is it possible to return the distinct field as well as the field that was used to order the results, but only the 1st occurance. |
jeff
2008-11-06 |
re: SELECT DISTINCT and ORDER BY Daniel -- how about an example? What you are asking is very vague and doesn't seem to make much sense without further details. |
vadimas
2008-11-14 |
re: SELECT DISTINCT and ORDER BY What if I want a result like: B 0 A 1 C 2 i.e. I want the values as well? |
David Silverlight
2008-12-08 |
re: SELECT DISTINCT and ORDER BY That was really brilliant!!! I just posted a reference to it in my own forum(URL Below). A few minutes before that, I had come to the conclusion that it just could not be done, but I googled just a bit more and came across your post. http://www.community-credit.com/cs/forums/p/2405/5716.aspx#5716 |
Nathan A
2008-12-10 |
re: SELECT DISTINCT and ORDER BY Your solution only helps if the sort field does not have duplicates. Let's take your example but add one record: B 0 A 1 C 2 B 3 A 0 If we just use: select Letter from x group by Letter order by min(Value) A contains a minimum value of 0 and so does B, who should be first? Let's assume you want to sort by minimum value, but if two letters agree on their minimum value then you want to sort on the next minimum value and so on. How would you do that? |
vuxes
2008-12-16 |
re: SELECT DISTINCT and ORDER BY How do I get both columns Letter and Value to display? |
Jeff
2008-12-16 |
re: SELECT DISTINCT and ORDER BY Nathan -- that's an interesting puzzle to solve; has nothing really to do with the orginal article, but it is interesting nonetheless. Let me think about that one -- might make a good blog post. However, SQL isn't traditionally good at handling situations such as the one you describe. |
Jeff
2008-12-16 |
re: SELECT DISTINCT and ORDER BY vuxes -- please read the article. you are not giving a clear specification as to what you want and what should happen. |
Nathan A
2008-12-19 |
re: SELECT DISTINCT and ORDER BY Jeff, This is actually a problem I have been puzzling over for quite a while now. I actually need to do that sort. I wonder if I may have to create another column that has the list of ordering values in it in increasing order so for my example above assuming a letter table and a number table that contains the numbers for letters, the number table would not change but I would add this second column to the letter table: B 0 A 0, 1 C 2 B 3 But I don't like that idea as it is a denormalization and would require extra maintenance. I can think of a way to do it with a specified number of subsequent rows to sort by. In your example you order by the minimum value of the number column. If we wanted to order by the minimum, then by the second minimum, then by the third minimum we can use nested queries to select each of those values in different columns and order by them. I have developed query like this but there are some issues. First of all, getting the second and third lowest values requires nested queries themselves so this would require many nested queries (not sure if that is a problem). The number of nested queries increases based on how many levels down you want to sort by. The other problem with this method is that you have to specify how many levels down you want to sort by, you can't just sort by a concatenation of all numbers. The real world example of this problem actually seems like it would be useful in many situations. Consider a task table, that has a list of tasks for people to accomplish and an assignee table that has a list of people assigned to the task. A task can have many people assigned to it. I want to get a list of tasks sorted by their assignees in alphabetical order so if the joined table looked like this: Task Assignee 1 John 1 Mark 2 John 3 Mark The result would be in this order: 2 John 1 John, Mark 3 Mark Let me know what you think the best approach is. |
Jeff
2008-12-23 |
re: SELECT DISTINCT and ORDER BY Nathan see: http://weblogs.sqlteam.com/jeffs/archive/2008/12/23/60803.aspx for a new post on this topic. Thanks! |
prabakaran
2009-01-02 |
re: SELECT DISTINCT and ORDER BY Thanks for your valuable information. It was really of use to me. -Prabakaran.K www.usjobcareer.com The No.1 Job and Career Search Portal |
Eric
2009-01-02 |
re: SELECT DISTINCT and ORDER BY Hi Jeff, Terrific and informative article! However, I would like to echo the questions of vadimas and vexus: how do you display the value along with the letter? In other words, how do I get this result: B 0 A 1 C 2 I don't want the concatenated results that Nathan was looking for, just the lowest value for each letter. I would expect the query to look something like: select Letter, value from x group by Letter order by max(Value) But of course that throws an error. Thank you for your help! |
Jeff S
2009-01-02 |
re: SELECT DISTINCT and ORDER BY Eric -- MAX() returns the highest value, not the lowest, so you'd use MIN(). To return anything in a resultset, simply add it to the SELECT clause: select Letter, MIN(value) from x group by Letter order by MIN(Value) That's it! |
Harish KV
2009-02-26 |
SELECT DISTINCT and ORDER BY Hi... Really a great solution...the way you give the solution with the example is simply superb... I hav to use the word which is more powerful than "Thnx"...bt ...am helpless...i couldn't find.... |
Rob
2009-03-11 |
re: SELECT DISTINCT and ORDER BY Damn helpful, thanks! |
Carl
2009-03-19 |
re: SELECT DISTINCT and ORDER BY Great stuff. I was using messy temp tables. this is sooooo much easier. thanks for posting |
DaveM
2009-03-31 |
re: SELECT DISTINCT and ORDER BY Exactly what I needed! |
Fuud
2009-04-06 |
re: SELECT DISTINCT and ORDER BY Thank you!!! |
KC
2009-04-15 |
re: SELECT DISTINCT and ORDER BY Thanks a lot for this. It helped me save a lot of time |
Stephane
2009-04-24 |
re: SELECT DISTINCT and ORDER BY Your Blog has been added in my favorites :) |
AndyP
2009-05-15 |
re: SELECT DISTINCT and ORDER BY Ditto the first comment. Thanks very much! |
JG
2009-06-24 |
re: SELECT DISTINCT and ORDER BY Cheers Jeff. Exactly what I was looking for, and the explanation was awesome. Nice to know why the error occurs as well how to get around it! |
Syni
2009-08-19 |
re: SELECT DISTINCT and ORDER BY Hi The reasons and ways to avoid this error have discussed in this site with good examples. By making small changes in the query http://sqlerror104.blogspot.com/2009/08/order-by-items-must-appear-in-select_19.html |
Anirban
2009-08-20 |
re: Nathan's problem Hello Guys, I understand that this is a T-SQL related forum, but it should be interesting to see how other databases deal with the same problem. In this case, I would like to present a solution that would work in both MySQL and sqlite. The table in question has the records: ------------------ Task Assignee ------------------ 1 John 1 Mark 2 John 3 Mark Let the name of the table be X select Task, group_concat(Assignee) as AssigneeGroup from X group by Task order by AssigneeGroup This should give the result: ------------------ Task AssigneeGroup ------------------ 2 John 1 John, Mark 3 Mark I have tested this on MySQL. sqlite should also give the same result. |
prax
2009-09-17 |
re: SELECT DISTINCT and ORDER BY What a clever usage of aggregate function in the order by clause ! You'll be really enjoying working with SQL!! Thanks a lot. |
Abu Abdullah
2009-11-17 |
re: SELECT DISTINCT and ORDER BY Hello Thanks for sharing your information. I am trying the following: How to implement your idea on the following query. When I tried to add the Min() and Group By, it fails. I am tring to get the top 5 minumum numbers. I don't need repeatative numbers. I also tried distinct on PeriodicConsumptions.ConsumptionKWH field, but it fails. SELECT top 5 PeriodicConsumptions.ConsumptionKWH , MeterReadings.MeterNo, MeterReadings.LastKWH, MeterReadings.AccountNo, MeterReadings.MultiplyFactor, MeterReadings.LastReading, MeterReadings.ThisMonthReading, MeterReadings.NumberofDays, MeterReadings.ReadingGYear, MeterReadings.ReadingGMonth, MeterReadings.ReadingDate FROM MeterReadings, PeriodicConsumptions Where MeterReadings.AccountNo = PeriodicConsumptions.AccountNo and MeterReadings.ReadingGYear =PeriodicConsumptions.ConsumptionYear and MeterReadings.ReadingGMonth =PeriodicConsumptions.ConsumptionMonth and MeterReadings.ReadingDate = PeriodicConsumptions.FromDate ORDER BY (PeriodicConsumptions.ConsumptionKWH) ASC I hope to see answers. Thanks |
Amr Abdelrahman
2009-12-03 |
re: SELECT DISTINCT and ORDER BY Thanks a lot, its too usefull |
ochriste
2009-12-23 |
re: SELECT DISTINCT and ORDER BY Thanks, that saved me from cracking my skull on the wall! |
Robert
2010-01-08 |
re: SELECT DISTINCT and ORDER BY Thank you so much for this bit. It gets tiresome trying to translate "uber geek". This is clear and totally captures the essence of the issue. |
Iswan
2010-02-19 |
re: SELECT DISTINCT and ORDER BY Hi, I really need a solution for this : I have the data : 1 A 100 1 B 51 2 C 21 2 D 26 2 E 12 And I need something like this : 1 B 51 2 E 12 Is there any way to implement this using similar query? |
rohan
2010-05-24 |
re: SELECT DISTINCT and ORDER BY u know what, thats greate work. u made my life much easy. was searching for this for a long time and that just worked like a magic. thanks for sharing... good work. |
Rick_kap
2010-06-10 |
re: SELECT DISTINCT and ORDER BY Hi, I've been reading through this page and i have a simular problem. I was wondering if someone could help. Here is is the SQL statement that i'm using now. strSQL = "SELECT TOP 5 user_id, subjectID, user_score, time_taken FROM result WHERE subjectID NOT LIKE '"& word &"' AND quiz_date >= DATEADD(mm, -1, CURRENT_TIMESTAMP) ORDER BY u_score DESC, time_sec ASC" The table shows top 5 users in the last month and the problem i face is that if user_id is in the top 5 more than once he is displayed. What i would like to happen is for the SQL query to ignore him/her seconded time round and move onto the next record. Is this possible to do?? Thanks in advance Rick |
Dean Hiller
2010-06-14 |
re: SELECT DISTINCT and ORDER BY We have a case where we are doing a join between two tables. We want distinct values in the left table but the join with the right table causes duplicate values. In the example above then, we never have B 0 and B 3....It would always be B 0 and B 0 as it would be the same row once grouped. How do we do a query like that? (We encountered this same problem in oracle....GREAT article by the way!!) select t1.* from table t1 inner join table2 t2 on t2.x=t1.x group by t1.* order by t1.startdate; We don't want to list out all the columns in group by(want hibernate to do that itself). It gets worse actually as we join with many other tables and need all those values as well, but they are just duplicates when joining with this one table. Maybe it has to be a subquery somehow? It's like I need to eliminate duplicates in a subquery.....hmmmmm. ah, I figured it out....at this point, yeah you have to go to a subquery and narrow down the results so the join would not result in duplicate rows and put the group by in the subquery and the order by in the master query so it looks like this in the master query select t1.* from table t1 where t1.tpvid in (subquery) this article helped alot in getting me to this point...great article. thanks, Dean |