Mladen Prajdić Blog

Blog about stuff and things and stuff. Mostly about SQL server and .Net

SQL Server - Find missing and unused indexes

Indexes are one of the most important database features. Without them your database will crawl under a table in fear of simple queries on large tables or complex queries on small tables. That’s why one of the most important things a DBA or a database developer should know is basic index maintenance.

Performance problems are usually result of missing indexes. Index tuning is more of an art than it is a science since possible workloads are so many that there is no one rule to rule them all. It’s a mixture of testing, implementing, retesting and reimplementing. For this purpose The SQL Server team has created the Database Tuning Advisor that takes a recorded workload (from SQL Profiler), analyzes it and suggests index improvements. However this is a tool for some really deep analysis, so we’ll take a look at some lightweight options.

On the other hand too many unused indexes bloat the database and unnecessarily increase the database size. This can cause problems with larger backups that take longer to create. This is why there’s a delicate balance to keep when maintaining indexes.

With SQL Server 2005 we got introduced to Dynamic Management Views (DMV’s). They are a powerful tool when troubleshooting SQL Server performance. Their downside is that most of them hold aggregated values since server start. That’s why we have to poll them twice at the start and at the end of the desired interval to get meaningful results. However for our purpose we don’t have to do this since we want to know which indexes are not used since our server started. We can find this information because when building an execution plan the SQL Server keeps information about every index it has used and could have used because it’s a cost based engine.

 

Finding missing indexes

Needed Dynamic Management Views (DMV’s)

As I said earlier SQL Server keeps data about possible missing indexes. If an actual execution plan is included when running a query then the missing indexes are also displayed there.

The data about missing indexes is stored in the following DMV’s which all exclude info about spatial indexes:

sys.dm_db_missing_index_groups

This DMV returns only 2 columns with information about which indexes are in which group.

sys.dm_db_missing_index_group_stats

This DMV returns information about each missing indexes group. It returns info like the estimated average impact or how many seeks, scans and compilations/recompilations would benefit from adding the missing indexes.

sys.dm_db_missing_index_details

This DMV returns detailed information about each missing index like table name that is missing an index and CSV’s of columns that the index would be beneficial on.

sys.dm_db_missing_index_columns

This a Dynamic Management Function (DMF) that accepts an index_handle parameter. It returns columns that should be in the suggested index identified with the index_handle that can be obtained from sys.dm_db_missing_index_details and sys.dm_db_missing_index_groups. It does not include spatial indexes. The column named column_usage returns info on how this column would benefit for a particular index. EQUALITY and INEQUALITY mean that the column would be used in a where clause predicate. INCLUDE means that the column should be an included column on an existing non-clustered index.

 

A simple example

USE AdventureWorks;
GO
-- run some queries
SELECT  City, ModifiedDate
FROM    Person.Address
WHERE   StateProvinceID < 1000 AND ModifiedDate > DATEADD(yyyy, -5, GETDATE());
GO
SELECT  City, StateProvinceID, PostalCode
FROM    Person.Address
WHERE   StateProvinceID = 15733;
GO
-- get the missing indexes that would be beneficial for speeding up above queries
SELECT  D.index_handle, [statement] AS full_object_name, unique_compiles, avg_user_impact, user_scans, user_seeks, column_name, column_usage
FROM    sys.dm_db_missing_index_groups G
        JOIN sys.dm_db_missing_index_group_stats GS ON G.index_group_handle = GS.group_handle
        JOIN sys.dm_db_missing_index_details D ON G.index_handle = D.index_handle
        CROSS APPLY sys.dm_db_missing_index_columns (D.index_handle) DC
ORDER BY D.index_handle, [statement];

 

The DMV’s return raw data so you’ll have to do some string magic to build the CREATE INDEX statements out of it.

 

Limitations

This cool feature of course has some limitations. Some of those are:

  • Database Engine Tuning Advisor kicks its behind. Think of Missing Indexes DMV’s as a really lightweight DTA. So when doing a really in-depth index analysis, don’t rely only on the Missing Indexes DMV’s. Just start with them.

  • Missing Indexes DMV’s don’t provide any information about new index overhead like space or IO/CPU overhead on updates, inserts and deletes.

  • There’s no information about the column order in the suggested index or whether it should be clustered or non-clustered.

  • Missing Indexes DMV’s consider only per query indexes and not per workload indexes.

  • Missing Indexes DMV’s can track a maximum of 500 indexes.

  • Trivial execution plans (plans for really simple SQL Statements) are not considered.

 

 

Finding unused indexes

Since SQL Server keeps data of all used indexes, getting the unused indexes is a simple matter of comparing used indexes to all existing indexes. Those that exist but are not used are of course unused indexes.

SELECT  OBJECT_SCHEMA_NAME(I.OBJECT_ID) AS SchemaName,
        OBJECT_NAME(I.OBJECT_ID) AS ObjectName,
        I.NAME AS IndexName        
FROM    sys.indexes I   
WHERE   -- only get indexes for user created tables
        OBJECTPROPERTY(I.OBJECT_ID, 'IsUserTable') = 1 
        -- find all indexes that exists but are NOT used
        AND NOT EXISTS ( 
                    SELECT  index_id 
                    FROM    sys.dm_db_index_usage_stats
                    WHERE   OBJECT_ID = I.OBJECT_ID 
                            AND I.index_id = index_id 
                            -- limit our query only for the current db
                            AND database_id = DB_ID()) 
ORDER BY SchemaName, ObjectName, IndexName 

 

Note that both these methods are only useful if your server has been running long enough and has been through peek usage. The data in the mentioned DMV’s is cleared on the next instance restart. If we want to save this information for further analysis we should create a scheduled job that periodically queries the DMV’s and saves the information to tables.

 

kick it on DotNetKicks.com

Legacy Comments


Tara
2009-04-09
re: SQL Server - Find missing and unused indexes
http://sqlserverpedia.com/wiki/Index_Related_DMV_Queries

;-)

Saggi Neumann
2009-04-09
re: SQL Server - Find missing and unused indexes
Hey,

Since the DMV sys.dm_db_index_usage_stats also tracks changes in indexes (not only seeks and scans) in the user_updates column, it might list some indexes that aren't used for querying but are updated. Your query would just get you indexes on tables that are not touched. I'd change it to something like:


SELECT OBJECT_SCHEMA_NAME(I.OBJECT_ID) AS SchemaName,
OBJECT_NAME(I.OBJECT_ID) AS ObjectName,
I.NAME AS IndexName
FROM sys.indexes I
WHERE -- only get indexes for user created tables
OBJECTPROPERTY(I.OBJECT_ID, 'IsUserTable') = 1
-- find all indexes that exists but are NOT used
AND NOT EXISTS (
SELECT index_id
FROM sys.dm_db_index_usage_stats
WHERE OBJECT_ID = I.OBJECT_ID
AND I.index_id = index_id
-- limit our query only for the current db
AND database_id = DB_ID()
AND (user_seeks >0 OR user_scans > 0)
)
ORDER BY SchemaName, ObjectName, IndexName

Cheers,
S. Neumann

Mladen
2009-04-10
re: SQL Server - Find missing and unused indexes
cool!
didn't think of that at all.

thanx!

Bernd Kriszio
2010-09-19
re: SQL Server - Find missing and unused indexes
Perhaps some indexes are unused, but I don't think it would be a good idea to drop the indexes created by primary keys, or unique indexes. I add the following lines

and i.is_primary_key = 0
and i.is_unique = 0

Bernd