Data Types of Parameters and Execution Plans
In a previous blog, I mentioned a performance problem that I've been working on for a few weeks. I can finally say that this issue has been resolved. The problem was that the database server (SQL Server 2005) was running at 95%-100% CPU utilization at all times.
Here is the query that was causing our problem (object names have been changed):
SELECT Table1Id FROM Table1 WHERE Unit = ?
We were receiving a bad execution plan for this query due to nvarchar being used for the Unit parameter. The Unit column is defined as varchar(50) in the table. See the below demonstration for the table structure including constraints and indexes.
The application is written in Java and uses the JDBC driver for SQL Server from Inet software. The application does not specify the data types of the parameters in their queries and relies on the driver to do that for them. It turns out that the driver is choosing nvarchar(4000) for the Unit parameter. Once the application was modified to convert the parameter to varchar, CPU utilization dropped to below 5%.
Here is what the above query was changed to in the application:
SELECT Table1Id FROM Table1 WHERE Unit = cast(? AS VARCHAR)
I am able to duplicate the issue with pure T-SQL. Please see below for a demonstration of the problem.
Setup the environment:
SET NOCOUNT ONCREATE TABLE Table1 ( Table1Id int IDENTITY(1, 1) NOT NULL, Unit varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL, CONSTRAINT PK_Table1 PRIMARY KEY NONCLUSTERED (Table1Id), CONSTRAINT uniq_Unit UNIQUE CLUSTERED (Unit) )
INSERT INTO Table1 (Unit) SELECT TOP 1000000 REPLACE(NEWID(), '-', '') FROM SomeLargeTable
INSERT INTO Table1 (Unit) VALUES('abcdefghijklmnopqrstuvwxyz')
CREATE STATISTICS Stats1 ON Table1 (Table1Id) WITH FULLSCAN CREATE STATISTICS Stats2 ON Table1 (Unit) WITH FULLSCAN
Turn on the option in SQL Server Management Studio to include the actual execution plan by hitting Ctrl+M or by selecting "Include Actual Execution Plan" in the Query menu.
Bad execution plan:
DECLARE @Unit1 nvarchar(50)
SET @Unit1 = 'abcdefghijklmnopqrstuvwxyz'SELECT Table1Id FROM Table1 WHERE Unit = @Unit1
Good execution plan:
DECLARE @Unit2 varchar(50)
SET @Unit2 = 'abcdefghijklmnopqrstuvwxyz'SELECT Table1Id FROM Table1 WHERE Unit = @Unit2
Here are the two execution plans together, ran as a batch to show cost (nvarchar/bad one is first):
Notice how the query that used nvarchar for the parameter costs 100% in the batch.
Cleanup the environment:
DROP TABLE Table1
This issue would have been avoided if the application used stored procedures or if it wasn't left up to the driver to determine what data type to use for the parameter.
Note: This occurs on the SQL collations only and not the Windows ones. Thanks Mladen for testing that out.
Legacy Comments
Jeff
2007-11-16 |
re: Data Types of Parameters and Execution Plans Great post, Tara, very nice example. - jeff |
WesleyB
2007-11-16 |
re: Data Types of Parameters and Execution Plans I frequently saw this issue with tinyint columns being compared to int parameters (in SQL Server 2000). In Visual Studio 2008 For Database Professionals they have actually included an automatic way of detecting these mismatches for stored procedures. |
Anon
2007-11-16 |
re: Data Types of Parameters and Execution Plans Do you believe this issue to be a type precedence one? I found that if the Table1.Unit is created as an nvarchar(50), both select statements will produce the desired clustered index seek. |
Mladen
2007-11-16 |
re: Data Types of Parameters and Execution Plans if memory server correctly this is called datatype upcasting... |
Anon
2007-11-16 |
re: Data Types of Parameters and Execution Plans That's why I was asking. Nvarchar is higher on the precedence list over varchar. I assume that there is an implicit downcast from nvarchar to varchar and that is why there is the discrepancy between plans when using the varchar as the column type. It seems that it's a precedence issue, but I don't know (and would like to know). |
Mladen
2007-11-16 |
re: Data Types of Parameters and Execution Plans it depends on the collation. if you use SQL_* collation you get tara's plans if you use windows collation you get seeks for both |