Jeff's SQL Server Blog

Random Thoughts & Cartesian Products with Microsoft SQL Server
posts - 157, comments - 2686, trackbacks - 64

My Links

Advertisement

News

Welcome to my weblog. My name is Jeff Smith, I am software developer in Boston, MA and I was recently named a 2009 SQL Server MVP. Check in frequently for tips, tricks, commentary and ideas on SQL Server and .NET programming.


Subscribe





Archives

Post Categories

Programming

Sports

SQL

In SQL, it's a Case Expression, *not* a Case Statement

A Google search for the phrase

sql "case statement"

returns 127,000 results.  Meanwhile, if we do a search for the phrase

sql "case expression"

we get back only 43,900 results.

Why do I bring this up?  Because that's a pretty good indicator that most people think the CASE construct in SQL is a statement that somehow alters code that is executed or compiled, when in fact it is really just an expression, that simply accepts and evaluates expressions as arguments and always simply returns a single value.  That's it!  In fact, a good case (pun intended) can be made that it is really just a fancy function with an odd, flexible syntax, and nothing else.

We see this a lot when people try to do things like this:

SELECT ...
FROM ..
ORDER BY CASE WHEN @Sort=1 THEN Col1 ASC ELSE Col1 DESC END

And, of course, it doesn't work -- note the inclusion of the sort direction indicators (ASC/DESC) within the CASE parameters.  And people wonder "why isn't the case statement [sic] working" ?  They treat it almost as if it were some sort of run-time conditional compilation directive, that dynamically alters the code itself regardless of what commands are put within the case parameters, and the end result is an entirely new SQL command.  That's not how it works!  It is just a simple function, an expression, that returns a single value.  And the arguments of the CASE, like any other function or expression, can also only be expressions.  You cannot stuff random snippets of T-SQL syntax in there, such as sort order indicators or references to database objects (i.e., table names), since those are not expressions that can be evaluated and returned as a single value.

I've talked a lot here about the difference between code and data, and this seems to be yet another place where the confusion surfaces.   A CASE statement expression accepts and returns data,  it doesn't accept and return code.  It's really that simple.

It may help to imagine that CASE has a syntax that uses parenthesis, such as:

CASE(WHEN ... END)

which perhaps makes it more apparent that CASE is just a fancy function that returns a value.  Unfortunately, that syntax is not valid; if it were, it might make code more readable
-- especially for beginner -- and easier to visualize  as a function or expression.

I'm sure some of the confusion comes from people with VB backgrounds, where CASE is indeed a statement, and segments of the CASE contain code that is executed, or not, depending on different conditions.

So, the next time you hear someone mention a "case statement" when discussing SQL -- stop them!  That's right, just jump up in the middle of a meeting and yell "Noooo!!" and run across the room and throw hot coffee on their face! They might be a great SQL developer, and they might even personally understand exactly how a CASE works, but by using the phrase "case statement" it propagates the misunderstanding and the confusion about what CASE really is and what it does.

I hope I've made a compelling case ... and just in case this doesn't make sense, the best case scenario might be to ...  ok, that's enough!  Sorry!  Case closed!


see also:

Print | posted on Thursday, May 03, 2007 9:15 AM | Filed Under [ Miscellaneous T-SQL ]

Feedback

Gravatar

# re: In CASE you didn't know ....

It is probably just laziness as most people are used to saying "case statement" from other languages.
5/4/2007 4:59 AM | aob
Gravatar

# re: In CASE you didn't know ....

Thanks for clearing that up Jeff, but the coffee in the face comment is going to bring on another type of case....more legal in nature hehe.
5/4/2007 9:20 AM | Salman
Gravatar

# re: In CASE you didn't know ....

Great comment, Salman! :)
5/4/2007 9:21 AM | Jeff
Gravatar

# re: In CASE you didn't know ....

SELECT ...
FROM ..
ORDER BY CASE WHEN @Sort=1 THEN Col1 ELSE 0 END ASC, CASE WHEN @Sort=1 THEN 0 ELSE Col1 END DESC

;)
5/6/2007 4:19 AM | Chris
Gravatar

# re: In CASE you didn't know ....

Wow! I had no idea you could do that, the point of my article was that I thought it was impossible to sort dynamically using a CASE expression!! ;)

In all seriousness, you seem to be implying that Col1 is numeric by returning 0 as the default sort constant; if that is the case, it's even easer:

order by col1 * (case when @sort = 1 then 1 else -1 end) asc

or even

order by col1 * (@sort1*2-1) asc

But typically, you'd use NULL as the standard default constant to avoid the need for any implicit conversions depending on the data type.
5/6/2007 10:31 AM | Jeff
Gravatar

# re: In CASE you didn't know ....

I'm actually one of those beginners you were talking about. To me, it looks like Chris is creating a statement that depending on the variable @sort, looks like:

SELECT... FROM... ORDER BY COL1 ASC, 0 DESC

or

SELECT... FROM... ORDER BY 0 ASC, COL1 DESC

Sorting the 0th column will do nothing so the equivalent expression is to sort by COL1 ASC or COL1 DESC. The difference between your statement in the article and Chris', is that you are trying to return both the column name AND the sort direction whereas Chris is returning the single element, the column name, and applying the sort direction outside the CASE expression.

As a beginner, I always have problems getting CASE to work the way I expect it to, but based on this article, I will definately think about it differently.
5/6/2007 2:40 PM | RyanB
Gravatar

# re: In CASE you didn't know ....

RyanB -- Actually, the way you've described how Chris's SELECT is working is exactly how CASE doesn't work. The CASE is not returning an element, as you say, such as either a column name or a column number. The CASE expression is returning a single VALUE. The CASE expression is evaluated for each row in the table, and the result is not the column name itself, but the VALUE within that column name, over and over, for each row.

The 0 is a constant value, a numeric literal, not a column indicator, because CASE cannot return a "column reference", it is just an *expression* that returns *data* -- numbers, dates, bits, text, etc -- never code or database object references or anything else.

Maybe this will help: the case DOESN'T work like this:

"select data from the table; if @sort = .. then order by .., else order by ..."

it DOES work like this:

"select data from the table, and evaluate CASE when @Sort = ... END for EACH ROW, ordering by that expression."

I hope this helps to clear things up a little, if it still doesn't make sense, definitely let me know. Thanks for your feedback!!
5/6/2007 3:07 PM | Jeff
Gravatar

# re: In CASE you didn't know ....

I don't have SQL in front of me, so I can't try it, but does Chris' method work? If so, that would suggest otherwise. I don't believe that COL1 is a numeric as you interpreted it. As you explain it, CASE can only return a value, so if COL1 is a numeric, what does the resulting statement look like? I don't understand how that works unless CASE can return a column as a value.

I'll do some experimenting, but however it turns out, at least this article has got me thinking. Thanks.
5/6/2007 4:32 PM | RyanB
Gravatar

# re: In CASE you didn't know ....

Oh... I think I see what you are getting at now. If COL1 is a numerical value, and you are sorting by that numerical value (as we are), then the entire statement is something like:

SELECT... FROM... ORDER BY 1 ASC, 0 DESC
SELECT... FROM... ORDER BY 2 ASC, 0 DESC
SELECT... FROM... ORDER BY 3 ASC, 0 DESC

OR

SELECT... FROM... ORDER BY 0 ASC, 3 DESC
SELECT... FROM... ORDER BY 0 ASC, 2 DESC
SELECT... FROM... ORDER BY 0 ASC, 1 DESC

In the first case, items are first sorted in ascending order of the value in COL1, and then further "sorted" by 0, resulting in an ascending list. In the second case, items are all "sorted" by 0, and then put in descending order by the resulting value of COL1. The value is returned on a per row basis.

Wouldn't it hold just as true though if COL1 were a string? I really just need to sit down in front of SQL I think... I'm not sure I fully understand this yet. To me there isn't a difference. Assuming Chris' statement works, I don't see how it's evaluated based on what you're telling me. If you have some brilliant insight on how to explain this to me, please let me know, otherwise I think I just need to experiment some.
5/6/2007 4:55 PM | RyanB
Gravatar

# re: In CASE you didn't know ....

RyanB -- Again, you wrote this:

>>what does the resulting statement look like?

What I am trying to say is: The "resulting statement" NEVER CHANGES. A CASE does NOT dynamically alter code, it just RETURNS A VALUE. That's it. I'm afraid you are still not quite getting the concept.

Chris's statement simply sorts by two CASE expressions. The first CASE expression, for each row in the table, returns a value that is either equal to the column value or equal to "0" (0 must get converted to the column's data type). The second CASE expression, for each row in the table, also returns a value that is either equal to the column value or equal to "0". That's it.

The SQL, any way you slice it, simply reads:

SELECT ... ORDER BY <expression1> ASC, <expression2> DESC

The only affect that CASE has is that it determines the way the expressions are evaluated and what values are returned. The "resulting statement" never changes. Maybe another example will help. Look at this:

SELECT .. ORDER BY LEFT(SomeValue,2) ASC

What is happening there? Is the LEFT "altering the statement returned" ? No, it just a formula, a function, and expression that we are sorting by. For each row in the table, the LEFT() is evaluated, and the results of those evaluations -- the values returned -- are used to sort the result. CASE works EXACTLY the same way as LEFT() or any other function, it is just more complex and more flexible.


5/6/2007 5:02 PM | Jeff
Gravatar

# re: In CASE you didn't know ....

>>Wouldn't it hold just as true though if COL1 were a string?

Yes, but the zero would need to be implicitly converted to a string. Again, since a CASE is just an expression, it always returns the same data type -- a single CASE expression can't sometimes return a STRING, other times a number, other times a date. Thus, if the column is a string or a date or anything other than numeric, the 0 returned would have to be converted to that datatype, adding some overhead. That is why I recommend using NULL if you want to use that technique, since no data types or conversions are necessary. Any constant value of the correct datatype will have the same effect -- you can return 0, 2924, -243232, 1.1, etc -- since the value is a constant, it has no effect on the sort.

And yet again, don't think of it as the SQL statement changing for each row, think in terms of the DATA RETURNED for each which ultimately determines the sort.

so, if I have these values:

1
2
3
4

and you wonder what happens when Chris's SQL is executed, think of the RESULT:

when @sort =1

col1, case1, case2
1,1,0
2,2,0
3,3,0
4,4,0

and since we order by case1 asc, case2 desc, it comes out 1-4. But when @sort=2, the resulting data (NOT code!) is:

col1,case1,case2
1,0,1
2,0,2
3,0,3
4,0,4

and we are ordering by case 1 asc, case 2 desc, the result comes out 4-1.

I hope this helps.
5/6/2007 5:08 PM | Jeff
Gravatar

# re: In CASE you didn't know ....

Nice to see some discussion on this, as the article only provided what didn't work, i thought i'd provide an example of how it should be done to achieve the same thing.

But, yes the example will work - although i only provided an example for a numeric value.

You could also use Null instead of 0 which would probably be more correct as it will work with any type.

Basically the idea of it is that depending on the value of @Sort, one of the cases will always be canceled out and return a constant, which will allow the opposing statement to be the actively sorted one.

Using case some pretty advanced sorting can be implemented.

eg.

SELECT ..
FROM ..
ORDER BY CASE
WHEN TypeID = 1 THEN 1
WHEN TypeID in (2,3) THEN 2
WHEN TypeID = 4 AND ValueID < 10 THEN 3
WHEN TypeID = 4 AND ValueID >= 20 THEN 4
WHEN Name like 'Test%' THEN 99
ELSE 5 END, DateField DESC
5/6/2007 10:04 PM | Chris
Gravatar

# re: In CASE you didn't know ....

OK, so yes Jeff, that is what I was trying to understand. I suggested it was the resulting statement, but what I really should have meant was the result. That last break down makes it clear in my head; I wasn't making the connection that the sort happens after the results are tabulated. I was still caught up in it being some run-time dynamic code, but I see how that fits now. I suppose if I looked at an execution plan I would have picked up on that.

Chris, thanks for the example as well. I usually spend some time fighting with CASE expressions trying to make them work, at least I know now why I'm always fighting them and I have a much better understanding about how to fix my problems in the future.
5/7/2007 1:45 AM | RyanB
Gravatar

# re: In CASE you didn't know ....

Everywhere else CASE is a statement nor a function. But we have to learn another lesson in the MS SQL Server World, one that I really love.

MSSQL dialect and SQL in general is a very concise language in terms of form, results and meaning. I have learned that in the worst way. Like when a table scales to millions of records and you find a little misconception turned into a 1000 seconds of extra cpu processing! But well, I love to see myself in the worst mirror ever! :)
5/7/2007 12:52 PM | MRamirez
Gravatar

# re: In CASE you didn't know ....

case, in the hands of someone who knows what they are doing, is probably the most effective tool to overcome inherent sql deficiencies. thanks, chris, for showing me yet another way to use case effectively (i have a pretty nice technique for multiple count(*) that I use).
5/9/2007 2:15 AM | matt
Gravatar

# re: In SQL, it's a Case Expression, *not* a Case Statement

You know if case is creating such a big problem, I say the fault is in the language designers for using CASE in a place that could be deliberately misinterpreted. Perhaps using another label for the same expression function would have solved this problem. But it's ultimately the laziness of the original designer of the sql extension that's to be blamed because language design does not happen in a vacuum and people who have to use such language features need unambiguous conventions.
2/9/2008 1:50 PM | Jay T
Gravatar

# openrowset

Great discussion, may god keeps increasing your knowledge.
am using openrowset for querrying, the result displays ?????????? instead of arabic name
kindly suggest ..

regards,
Waseem
4/28/2008 4:52 AM | waseem
Gravatar

# MS Sql Tip

MSSqlTips has a good article about this...
http://www.mssqltips.com/tip.asp?tip=1455
10/31/2008 8:03 AM | John
Comments have been closed on this topic.

Powered by:
Powered By Subtext Powered By ASP.NET