I want some Moore

Blog about stuff and things and stuff...
mostly about SQL server and .Net
posts - 161, comments - 1487, trackbacks - 33

My Links

SQLTeam.com Links

News

Hi! My name is 
Mladen Prajdić  I'm from Slovenia and I'm currently working as a .Net (C#) and SQL Server developer. I'm also a MCP and MCTS for SQL Server. 
Welcome to my blog.

Search this Blog
 

My Blog Feed via Email


Get your Google PageRank
Users Online: who's online

Article Categories

Archives

Post Categories

Cool software

Other Blogs

Other stuff

SQL stuff

How SQL Server short-circuits WHERE condition evaluation

It does when it feels like it, but not in the way you immediately think of.

A few days ago Jeff posted about short-circuiting wondering how it works and since I've played with this a long while ago I thought I'd share my results. Hopefully it will make it clearer how SQL Server behaves when evaluating conditions in its WHERE clause. So let's see how it's done in greater detail.

First we must create our test table with some sample data which will be defined like this:

IF OBJECT_ID('t1') IS NOT NULL
     DROP TABLE t1

CREATE TABLE t1(id INT PRIMARY KEY, val VARCHAR(10))
INSERT INTO t1
SELECT 1, '19000101' UNION ALL 
SELECT 2, 'val 2' UNION ALL 
SELECT 3, '20080303'

Now let's play a bit. :)

 

AND Conditions

Run this statement:

SELECT * FROM t1 
WHERE  id / 0 = 1
       AND id = 2
       AND CONVERT(DATETIME, val) > GETDATE() 

You'll get this error:

Msg 241, Level 16, State 1, Line 10
Conversion failed when converting datetime from character string.

The execution plan shows us how SQL Server parameterizes our select statement:

SELECT * FROM [t1] WHERE [id]/@1=@2 AND [id]=@3 AND CONVERT([datetime],[val],0) > getdate() 

The most obvious condition to fail is of course the divide by zero. You'd think it would be the first to evaluate since it's obviously an illegal call and everything else would be discarded. However because of the statement parameterization this doesn't happen because at the condition evaluation time the values 0 and 1 aren't known. For SQL Server they are still valid parameters whose condition cost must be evaluated. Going back to our error message we can see that the only row with an invalid date is the one where id = 2. This tells us that query processor first looked up the row by id, then applied the datetime conversion. Thus we get our order. Following query only strengthens our deduction since now we get the "Divide by zero error encountered"  because the date successfully converted.

SELECT * FROM t1 
WHERE  id / 0 = 1
       AND id = 3
       AND CONVERT(DATETIME, val) > GETDATE() 

This example shows us how SQL Server short-circuits conditions based on some internal threshold that measures condition importance and their evaluation order. There's nothing we can do about this and the order of condition evaluation when actually reading data is unknown to us.

 

OR Conditions

Run this statement:

SELECT * FROM t1 
WHERE  id = 3
       OR id / 0 = 1

Here it is a whole different story than above with only AND conditions. Here you immediately see the error "Divide by zero error encountered". If you take a closer look you'll see that we never get to the execution plan phase. Why does this differ so much from the AND conditions?  Because OR's immediately mean Table/Index scan. Therefore no parameterization is done and at the query evaluation time all of the values are known. This is when the divide by zero is caught before the execution plan is even built or taken from cache and our query fails. It doesn't matter that our first condition is already true.

 

Literal Conditions

Run these statements:

SELECT 1 WHERE (1 = 1) OR (1 / 0 = 0)
SELECT 1 WHERE (1 = 0) OR (1 / 0 = 0)

We see that with literal conditions the short circuiting is being done on 1=1 and 1=0 and the 1 / 0 isn't evaluated at all, but this is not to be mistaken to work identically with real data as is shown above.

 

CASE Statement Conditions

Run these 2 statements:

SELECT * FROM t1 
WHERE  CASE
           WHEN ISDATE(val) = 0 THEN 0
           WHEN CONVERT(DATETIME, val) > GETDATE() THEN 1            
           ELSE 0
       END = 1

SELECT * FROM t1 
WHERE  CASE
           WHEN CONVERT(DATETIME, val) > GETDATE() THEN 1           
           WHEN ISDATE(val) = 0 THEN 0           
           ELSE 0
       END = 1

These 2 statements nicely illustrate that the order of condition evaluation in the CASE statement is always from first to last. The first statement will return one row since the second rows with invalid date will be filtered out by the ISDATE() function, while in the second statement it won't be filtered out and it will throw an error. This is of course quite logical because a CASE statement is simply a big if... else if... else... statement written differently.

 

If C++/C#/VB has short-circuiting why can't SQL Server have it?

To truly answer this let's take a look at how both work with conditions. C++/C#/VB all have short circuiting defined in the language specifications to speed up code execution. Why bother evaluating  N OR conditions when the first one is already true or M AND conditions when the first one is already false.

We as developers have to be aware that SQL Server works differently. It is a cost based system. To get the optimal execution plan for our query the query processor has to evaluate every where condition and assign it a cost. These costs are then evaluated as a whole to form a threshold that must be lower than the defined threshold SQL Server has for a good plan. If the cost is lower than the defined threshold the plan is used, if not the whole process is repeated again with a different mix of condition costs. Cost here is either a scan or a seek or a merge join or a hash join etc... Because of this the short-circuiting as is available in C++/C#/VB simply isn't possible. You might think that forcing use of index on a column counts as short circuiting but it doesn't. It only forces the use of that index and with that shortens the list of possible execution plans. The system is still cost based.

 

As a developer you must be aware that SQL Server does not do short-circuiting like it is done in other programming languages and there's nothing you can do to force it to.

 

kick it on DotNetKicks.com
 

Print | posted on Monday, February 25, 2008 10:54 AM

Feedback

# re: How SQL Server short-circuits WHERE condition evaluation

Nice article.
2/27/2008 3:40 AM | Craig

# re: regarding automation testing for sharepoint 2007

Hello,

I am new to Sharepoint development environment and the platform.

I Need to do Automated testing of applications for Microsoft Office SharePoint Server2007.
I don't have any idea about it, I wonder if you could help me?

Any useful links or tutorials or any other source which can be helpful to me.

Looking forward to receiving your email.



Many Thanks
3/11/2008 11:31 AM | Solmaz

# re: How SQL Server short-circuits WHERE condition evaluation

maybe this site isn't a good place to ask about sarepoint development, since i don't do that :)
3/11/2008 4:47 PM | Mladen

# re: How SQL Server short-circuits WHERE condition evaluation

Quote: "If C++/C#/VB has short-circuiting why can't SQL Server have it?"

FYI VB doesn't have short-circuiting [http://support.microsoft.com/kb/817250]
3/26/2008 11:14 PM | RB

# re: How SQL Server short-circuits WHERE condition evaluation

rb, thanx for letting me know about this.
since i've never really seriously programmed in VB i've just assumed that it also has short circuiting built in.
just another example of how assumptions can be wrong, right? :))
3/26/2008 11:20 PM | Mladen

# re: How SQL Server short-circuits WHERE condition evaluation

The versions of VB prior to VB.NET do not have short-circuits, but VB.NET (finally) has. Because of backward-compatibility it introduced 2 additional logical operators AndAlso & OrElse, that do the short-circuiting.
4/3/2008 5:34 PM | vladimir

# re: How SQL Server short-circuits WHERE condition evaluation

hi hi hi
9/24/2008 5:59 AM | paja

Post Comment

Title  
Name  
Email
Url
Comment   
Please add 6 and 6 and type the answer here:

Powered by: