In a previous blog, I mentioned a performance problem that I've been working on for a few weeks. I can finally say that this issue has been resolved. The problem was that the database server (SQL Server 2005) was running at 95%-100% CPU utilization at all times.
Here is the query that was causing our problem (object names have been changed):
SELECT Table1Id
FROM Table1
WHERE Unit = ?
We were receiving a bad execution plan for this query due to nvarchar being used for the Unit parameter. The Unit column is defined as varchar(50) in the table. See the below demonstration for the table structure including constraints and indexes.
The application is written in Java and uses the JDBC driver for SQL Server from Inet software. The application does not specify the data types of the parameters in their queries and relies on the driver to do that for them. It turns out that the driver is choosing nvarchar(4000) for the Unit parameter. Once the application was modified to convert the parameter to varchar, CPU utilization dropped to below 5%.
Here is what the above query was changed to in the application:
SELECT Table1Id
FROM Table1
WHERE Unit = cast(? AS VARCHAR)
I am able to duplicate the issue with pure T-SQL. Please see below for a demonstration of the problem.
Setup the environment:
SET NOCOUNT ON
CREATE TABLE Table1
(
Table1Id int IDENTITY(1, 1) NOT NULL,
Unit varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
CONSTRAINT PK_Table1 PRIMARY KEY NONCLUSTERED (Table1Id),
CONSTRAINT uniq_Unit UNIQUE CLUSTERED (Unit)
)
INSERT INTO Table1 (Unit)
SELECT TOP 1000000 REPLACE(NEWID(), '-', '')
FROM SomeLargeTable
INSERT INTO Table1 (Unit) VALUES('abcdefghijklmnopqrstuvwxyz')
CREATE STATISTICS Stats1 ON Table1 (Table1Id) WITH FULLSCAN
CREATE STATISTICS Stats2 ON Table1 (Unit) WITH FULLSCAN
Turn on the option in SQL Server Management Studio to include the actual execution plan by hitting Ctrl+M or by selecting "Include Actual Execution Plan" in the Query menu.
Bad execution plan:
DECLARE @Unit1 nvarchar(50)
SET @Unit1 = 'abcdefghijklmnopqrstuvwxyz'
SELECT Table1Id
FROM Table1
WHERE Unit = @Unit1

Good execution plan:
DECLARE @Unit2 varchar(50)
SET @Unit2 = 'abcdefghijklmnopqrstuvwxyz'
SELECT Table1Id
FROM Table1
WHERE Unit = @Unit2

Here are the two execution plans together, ran as a batch to show cost (nvarchar/bad one is first):

Notice how the query that used nvarchar for the parameter costs 100% in the batch.
Cleanup the environment:
DROP TABLE Table1
This issue would have been avoided if the application used stored procedures or if it wasn't left up to the driver to determine what data type to use for the parameter.
Note: This occurs on the SQL collations only and not the Windows ones. Thanks Mladen for testing that out.