Indexes are one of the most important database features. Without them your database will crawl under a table in fear of simple queries on large tables or complex queries on small tables. That’s why one of the most important things a DBA or a database developer should know is basic index maintenance.
Performance problems are usually result of missing indexes. Index tuning is more of an art than it is a science since possible workloads are so many that there is no one rule to rule them all. It’s a mixture of testing, implementing, retesting and reimplementing. For this purpose The SQL Server team has created the Database Tuning Advisor that takes a recorded workload (from SQL Profiler), analyzes it and suggests index improvements. However this is a tool for some really deep analysis, so we’ll take a look at some lightweight options.
On the other hand too many unused indexes bloat the database and unnecessarily increase the database size. This can cause problems with larger backups that take longer to create. This is why there’s a delicate balance to keep when maintaining indexes.
With SQL Server 2005 we got introduced to Dynamic Management Views (DMV’s). They are a powerful tool when troubleshooting SQL Server performance. Their downside is that most of them hold aggregated values since server start. That’s why we have to poll them twice at the start and at the end of the desired interval to get meaningful results. However for our purpose we don’t have to do this since we want to know which indexes are not used since our server started. We can find this information because when building an execution plan the SQL Server keeps information about every index it has used and could have used because it’s a cost based engine.
Finding missing indexes
Needed Dynamic Management Views (DMV’s)
As I said earlier SQL Server keeps data about possible missing indexes. If an actual execution plan is included when running a query then the missing indexes are also displayed there.
The data about missing indexes is stored in the following DMV’s which all exclude info about spatial indexes:
This DMV returns only 2 columns with information about which indexes are in which group.
This DMV returns information about each missing indexes group. It returns info like the estimated average impact or how many seeks, scans and compilations/recompilations would benefit from adding the missing indexes.
This DMV returns detailed information about each missing index like table name that is missing an index and CSV’s of columns that the index would be beneficial on.
This a Dynamic Management Function (DMF) that accepts an index_handle parameter. It returns columns that should be in the suggested index identified with the index_handle that can be obtained from sys.dm_db_missing_index_details and sys.dm_db_missing_index_groups. It does not include spatial indexes. The column named column_usage returns info on how this column would benefit for a particular index. EQUALITY and INEQUALITY mean that the column would be used in a where clause predicate. INCLUDE means that the column should be an included column on an existing non-clustered index.
A simple example
-- run some queries
SELECT City, ModifiedDate
WHERE StateProvinceID < 1000 AND ModifiedDate > DATEADD(yyyy, -5, GETDATE());
SELECT City, StateProvinceID, PostalCode
WHERE StateProvinceID = 15733;
-- get the missing indexes that would be beneficial for speeding up above queries
SELECT D.index_handle, [statement] AS full_object_name, unique_compiles, avg_user_impact, user_scans, user_seeks, column_name, column_usage
FROM sys.dm_db_missing_index_groups G
JOIN sys.dm_db_missing_index_group_stats GS ON G.index_group_handle = GS.group_handle
JOIN sys.dm_db_missing_index_details D ON G.index_handle = D.index_handle
CROSS APPLY sys.dm_db_missing_index_columns (D.index_handle) DC
ORDER BY D.index_handle, [statement];
The DMV’s return raw data so you’ll have to do some string magic to build the CREATE INDEX statements out of it.
This cool feature of course has some limitations. Some of those are:
Database Engine Tuning Advisor kicks its behind. Think of Missing Indexes DMV’s as a really lightweight DTA. So when doing a really in-depth index analysis, don’t rely only on the Missing Indexes DMV’s. Just start with them.
Missing Indexes DMV’s don’t provide any information about new index overhead like space or IO/CPU overhead on updates, inserts and deletes.
There’s no information about the column order in the suggested index or whether it should be clustered or non-clustered.
Missing Indexes DMV’s consider only per query indexes and not per workload indexes.
Missing Indexes DMV’s can track a maximum of 500 indexes.
Trivial execution plans (plans for really simple SQL Statements) are not considered.
Finding unused indexes
Since SQL Server keeps data of all used indexes, getting the unused indexes is a simple matter of comparing used indexes to all existing indexes. Those that exist but are not used are of course unused indexes.
SELECT OBJECT_SCHEMA_NAME(I.OBJECT_ID) AS SchemaName,
OBJECT_NAME(I.OBJECT_ID) AS ObjectName,
I.NAME AS IndexName
FROM sys.indexes I
WHERE -- only get indexes for user created tables
OBJECTPROPERTY(I.OBJECT_ID, 'IsUserTable') = 1
-- find all indexes that exists but are NOT used
AND NOT EXISTS (
WHERE OBJECT_ID = I.OBJECT_ID
AND I.index_id = index_id
-- limit our query only for the current db
AND database_id = DB_ID())
ORDER BY SchemaName, ObjectName, IndexName
Note that both these methods are only useful if your server has been running long enough and has been through peek usage. The data in the mentioned DMV’s is cleared on the next instance restart. If we want to save this information for further analysis we should create a scheduled job that periodically queries the DMV’s and saves the information to tables.