Mladen Prajdić Blog

Blog about stuff and things and stuff. Mostly about SQL server and .Net

How SQL Server short-circuits WHERE condition evaluation

It does when it feels like it, but not in the way you immediately think of.

A few days ago Jeff posted about short-circuiting wondering how it works and since I've played with this a long while ago I thought I'd share my results. Hopefully it will make it clearer how SQL Server behaves when evaluating conditions in its WHERE clause. So let's see how it's done in greater detail.

First we must create our test table with some sample data which will be defined like this:

IF OBJECT_ID('t1') IS NOT NULL
     DROP TABLE t1

CREATE TABLE t1(id INT PRIMARY KEY, val VARCHAR(10)) INSERT INTO t1 SELECT 1, '19000101' UNION ALL SELECT 2, 'val 2' UNION ALL SELECT 3, '20080303'

Now let's play a bit. :)

 

AND Conditions

Run this statement:

SELECT * FROM t1 
WHERE  id / 0 = 1
       AND id = 2
       AND CONVERT(DATETIME, val) > GETDATE() 

You'll get this error:

Msg 241, Level 16, State 1, Line 10
Conversion failed when converting datetime from character string.

The execution plan shows us how SQL Server parameterizes our select statement:

SELECT * FROM [t1] WHERE [id]/@1=@2 AND [id]=@3 AND CONVERT([datetime],[val],0) > getdate() 

The most obvious condition to fail is of course the divide by zero. You'd think it would be the first to evaluate since it's obviously an illegal call and everything else would be discarded. However because of the statement parameterization this doesn't happen because at the condition evaluation time the values 0 and 1 aren't known. For SQL Server they are still valid parameters whose condition cost must be evaluated. Going back to our error message we can see that the only row with an invalid date is the one where id = 2. This tells us that query processor first looked up the row by id, then applied the datetime conversion. Thus we get our order. Following query only strengthens our deduction since now we get the "Divide by zero error encountered"  because the date successfully converted.

SELECT * FROM t1 
WHERE  id / 0 = 1
       AND id = 3
       AND CONVERT(DATETIME, val) > GETDATE() 

This example shows us how SQL Server short-circuits conditions based on some internal threshold that measures condition importance and their evaluation order. There's nothing we can do about this and the order of condition evaluation when actually reading data is unknown to us.

 

OR Conditions

Run this statement:

SELECT * FROM t1 
WHERE  id = 3
       OR id / 0 = 1

Here it is a whole different story than above with only AND conditions. Here you immediately see the error "Divide by zero error encountered". If you take a closer look you'll see that we never get to the execution plan phase. Why does this differ so much from the AND conditions?  Because OR's immediately mean Table/Index scan. Therefore no parameterization is done and at the query evaluation time all of the values are known. This is when the divide by zero is caught before the execution plan is even built or taken from cache and our query fails. It doesn't matter that our first condition is already true.

 

Literal Conditions

Run these statements:

SELECT 1 WHERE (1 = 1) OR (1 / 0 = 0)
SELECT 1 WHERE (1 = 0) OR (1 / 0 = 0)

We see that with literal conditions the short circuiting is being done on 1=1 and 1=0 and the 1 / 0 isn't evaluated at all, but this is not to be mistaken to work identically with real data as is shown above.

 

CASE Statement Conditions

Run these 2 statements:

SELECT * FROM t1 
WHERE  CASE
           WHEN ISDATE(val) = 0 THEN 0
           WHEN CONVERT(DATETIME, val) > GETDATE() THEN 1            
           ELSE 0
       END = 1

SELECT * FROM t1 WHERE CASE WHEN CONVERT(DATETIME, val) > GETDATE() THEN 1
WHEN ISDATE(val) = 0 THEN 0
ELSE 0 END = 1

These 2 statements nicely illustrate that the order of condition evaluation in the CASE statement is always from first to last. The first statement will return one row since the second rows with invalid date will be filtered out by the ISDATE() function, while in the second statement it won't be filtered out and it will throw an error. This is of course quite logical because a CASE statement is simply a big if... else if... else... statement written differently.

 

If C++/C#/VB has short-circuiting why can't SQL Server have it?

To truly answer this let's take a look at how both work with conditions. C++/C#/VB all have short circuiting defined in the language specifications to speed up code execution. Why bother evaluating  N OR conditions when the first one is already true or M AND conditions when the first one is already false.

We as developers have to be aware that SQL Server works differently. It is a cost based system. To get the optimal execution plan for our query the query processor has to evaluate every where condition and assign it a cost. These costs are then evaluated as a whole to form a threshold that must be lower than the defined threshold SQL Server has for a good plan. If the cost is lower than the defined threshold the plan is used, if not the whole process is repeated again with a different mix of condition costs. Cost here is either a scan or a seek or a merge join or a hash join etc... Because of this the short-circuiting as is available in C++/C#/VB simply isn't possible. You might think that forcing use of index on a column counts as short circuiting but it doesn't. It only forces the use of that index and with that shortens the list of possible execution plans. The system is still cost based.

 

As a developer you must be aware that SQL Server does not do short-circuiting like it is done in other programming languages and there's nothing you can do to force it to.

 

kick it on DotNetKicks.com
 

Legacy Comments


Craig
2008-02-27
re: How SQL Server short-circuits WHERE condition evaluation
Nice article.

Solmaz
2008-03-11
re: regarding automation testing for sharepoint 2007
Hello,

I am new to Sharepoint development environment and the platform.

I Need to do Automated testing of applications for Microsoft Office SharePoint Server2007.
I don't have any idea about it, I wonder if you could help me?

Any useful links or tutorials or any other source which can be helpful to me.

Looking forward to receiving your email.



Many Thanks

Mladen
2008-03-11
re: How SQL Server short-circuits WHERE condition evaluation
maybe this site isn't a good place to ask about sarepoint development, since i don't do that :)

Mladen
2008-03-26
re: How SQL Server short-circuits WHERE condition evaluation
rb, thanx for letting me know about this.
since i've never really seriously programmed in VB i've just assumed that it also has short circuiting built in.
just another example of how assumptions can be wrong, right? :))

vladimir
2008-04-03
re: How SQL Server short-circuits WHERE condition evaluation
The versions of VB prior to VB.NET do not have short-circuits, but VB.NET (finally) has. Because of backward-compatibility it introduced 2 additional logical operators AndAlso & OrElse, that do the short-circuiting.

Krunal
2009-02-04
re: How SQL Server short-circuits WHERE condition evaluation
Good one ir help me lot. [:)]

Andrej
2009-03-13
re: How SQL Server short-circuits WHERE condition evaluation
useful post Mladen.. Except I couldn't get the same result on myw sql2k server. Probably some different configuration...

rgdz.

Craig
2010-02-25
re: How SQL Server short-circuits WHERE condition evaluation
Hi Mladen,

is there anything we can do with hints and the like to suggest to the optimiser what's likely to be of lower cost?

e.g. I have the following (simplified) query:

DECLARE @LastProcDate as DateTime
SET @LastProcDate = (SELECT MAX(Table2.LastModified) FROM Table2)

SELECT *
FROM Table1 T1 INNER JOIN Table2 T2 ON T1.ID = T2.ID
WHERE T2.LastModified < T1.LastModified AND T1.LastModified > @LastProcDate

Now, I know that T1.LastModified > @LastProcDate is going to produce more FALSE responses than the other half of the AND clause. What can I do to influence the optimiser?

Thanks
Craig

ghezal
2010-03-06
re: How SQL Server shrt-circuits WHERE condition evaluation
hi dear
when I want to search from Letter_To then it show this message
.Net sqlclient problem syntax error for the 'OR' and the code is this one
Dim sqry As String = "select * from LetterTable where Letter_No=" & txtLNO.Text & "or Letter_To ='" & txtTo.Text & "';"
so is there any body that help me, plz thanks in advance. what should I write instead of OR.