Low-Hanging Fruit of Sargable Expressions

Can you spot the performance problem with the two queries below? These queries both illustrate a common and nasty performance problem that robs CPU cycles and consumes I/O bandwidth.

CREATE TABLE dbo.Foo(

    Bar varchar(10) NOT NULL

            CONSTRAINT PK_Foo PRIMARY KEY CLUSTERED,

    Data char(100) NOT NULL,

    CreateDate datetime NULL

);

 

CREATE INDEX Foo_CreateDate ON dbo.Foo(CreateDate);

--select single row by key

SELECT Data FROM dbo.Foo

WHERE Bar = 1;

 

--select rows created within last week

SELECT Data FROM dbo.Foo

WHERE DATEDIFF(day, CreateDate, GETDATE()) <= 7;

 

Both of these queries are examples of non-sargable conditions.  The first SELECT statement shows the gotcha of data type precedence.  The second query demonstrates the anti-pattern of applying a function to an indexed column instead of specifying the column alone on one side of the comparison operator.

Data Type Precedence

The WHERE clause predicate of the first query isn't sargable because SQL Server must convert the varchar "Bar" column to integer before the comparison to the integer literal 1.  SQL Server has well-defined rules which dictate that, when two expressions are compared, the side with the lower precedence data type is converted to that of the higher data type.  A column's index cannot be used for a seek operation when the column is converted to a different data type.  Implicit conversion due to precedence rules is especially insidious.

The order of data type precedence is listed in the Books Online (http://msdn.microsoft.com/en-us/library/ms190309.aspx).  You don't need to memorize the entire data type precedence list to write efficient queries, though.  Just pay attention to detail regarding literals and data types to ensure the data types match whenever possible.  Be aware of column data types during query development.

The problem with the first query can be remedied simply enclosing the literal in quotes. The two quotes turn the clustered index scan into an efficient seek:

--select single row by key

 

SELECT Data FROM dbo.Foo

WHERE Bar = '1';

Columns Must Stand Alone

The WHERE clause predicate of the second SELECT statement is also non-sargable because "CreateDate" is only passed as an argument to the DATEDIFF function. Columns must stand alone on one side of the comparison operator in order for an index seek to occur.  Remember that the indexed values are the actual column values, not the results of a function (with the exception of a indexed computed column, which is beyond the scope of this post).

The performance issue with the second query can be addressed by reformulating the predicate so that "CreateDate" is alone.  The query version below allows a seek using the non-clustered index.  Whether or not the index is actually used depends on whether SQL Server deems it advantageous to do so based on estimates of qualifying rows.

--select rows created within last 7 days

SELECT Data FROM dbo.Foo

WHERE CreateDate >= DATEADD(day,-7,CAST(GETDATE() AS DATE));

 

 

Summary

Sargable conditions are low-hanging fruit that you should pay attention to during query development or when investigating a performance issue.  Not only does one need indexes to provide an efficient data access path, queries need to be written so that they can take advantage of them.