Low-Hanging Fruit of Sargable Expressions
Can you spot the performance problem with the two queries below? These queries both illustrate a common and nasty performance problem that robs CPU cycles and consumes I/O bandwidth.
CREATE TABLE dbo.Foo( Bar varchar(10) NOT NULL CONSTRAINT PK_Foo PRIMARY KEY CLUSTERED, Data char(100) NOT NULL, CreateDate datetime NULL ); CREATE INDEX Foo_CreateDate ON dbo.Foo(CreateDate); --select single row by key SELECT Data FROM dbo.Foo WHERE Bar = 1; --select rows created within last week SELECT Data FROM dbo.Foo WHERE DATEDIFF(day, CreateDate, GETDATE()) <= 7; |
Both of these queries are examples of non-sargable conditions. The first SELECT statement shows the gotcha of data type precedence. The second query demonstrates the anti-pattern of applying a function to an indexed column instead of specifying the column alone on one side of the comparison operator.
Data Type Precedence
The WHERE clause predicate of the first query isn't sargable because SQL Server must convert the varchar "Bar" column to integer before the comparison to the integer literal 1. SQL Server has well-defined rules which dictate that, when two expressions are compared, the side with the lower precedence data type is converted to that of the higher data type. A column's index cannot be used for a seek operation when the column is converted to a different data type. Implicit conversion due to precedence rules is especially insidious.
The order of data type precedence is listed in the Books Online (http://msdn.microsoft.com/en-us/library/ms190309.aspx). You don't need to memorize the entire data type precedence list to write efficient queries, though. Just pay attention to detail regarding literals and data types to ensure the data types match whenever possible. Be aware of column data types during query development.
The problem with the first query can be remedied simply enclosing the literal in quotes. The two quotes turn the clustered index scan into an efficient seek:
--select single row by key SELECT Data FROM dbo.Foo WHERE Bar = '1'; |
Columns Must Stand Alone
The WHERE clause predicate of the second SELECT statement is also non-sargable because "CreateDate" is only passed as an argument to the DATEDIFF function. Columns must stand alone on one side of the comparison operator in order for an index seek to occur. Remember that the indexed values are the actual column values, not the results of a function (with the exception of a indexed computed column, which is beyond the scope of this post).
The performance issue with the second query can be addressed by reformulating the predicate so that "CreateDate" is alone. The query version below allows a seek using the non-clustered index. Whether or not the index is actually used depends on whether SQL Server deems it advantageous to do so based on estimates of qualifying rows.
--select rows created within last 7 days SELECT Data FROM dbo.Foo WHERE CreateDate >= DATEADD(day,-7,CAST(GETDATE() AS DATE));
|
Summary
Sargable conditions are low-hanging fruit that you should pay attention to during query development or when investigating a performance issue. Not only does one need indexes to provide an efficient data access path, queries need to be written so that they can take advantage of them.
Legacy Comments
Remote DBA
2009-03-09 |
re: Low-Hanging Fruit of Sargable Expressions Very interesting to know that these conditions are called non-sargable. BTW another issue with the first query will be conversion error during the scan, so it is always good to match the types in advance. thanks, Vadym. |
CodeMonkey
2009-03-16 |
re: Low-Hanging Fruit of Sargable Expressions Your sargable date comparison query won't return the same results as the non-sargable version. In fact, if the CreateDate column doesn't include a time component, you might not even get the ones from 7 days back at all. The sargable example is affected by the time component, while the non-sargable one ignores the time component. Still a good tip, but a reminder that while they might look the same at first glance, DATEDIFF and DATEADD do not work the same. And that one should always validate that your query returns the same results before and after optimization. |
Dan Guzman
2009-03-17 |
re: Low-Hanging Fruit of Sargable Expressions CodeMonkey, you are of course right that the datetime queries are not equivalent. I did recognize this when I drafted it but forgot to go back and fix it up before I posted. I'll change the sarable version from CreateDate >= DATEADD(day, -7, GETDATE()) to CreateDate >= DATEADD(day, -7, CAST(GETDATE() AS DATE); |