So, Dave and I were having a conversation about clustered indexes over lunch yesterday.
Yeah, we're this boring.
Anyway, I have an issue at work where I have a table that has a horrible clustered index. It's not very selective (the first key can return upwards of 80K rows) and the first key changes frequently when the row is updated (leading to fragmentation). It's great for one semi-frequent query in our system, but really isn't useful for most others. The problem is that SQL Server (talking SQL Server 2000 here) uses the index A LOT in the form of clustered index scans. To the point of ignoring every other index on the table. If I go in and search anything but the PK, it will ignore every other index in the table and use the clustered index. Period. No discussion. I've gone to using index hints to force it to use the correct index for the query, which actually improves the search by a factor of 10 to 100 most times.
The problem is that SQL Server assumes that a Bookmark Lookup step has a near-infinite cost. So it will do anything, including the equivalent of a full table scan, to avoid that bookmark lookup. Which is just insane in most cases.
The table also has this nasty trigger on it that basically serializes any rowset insert, updates, or deletes. There are good reasons for this, but it still sucks. There are no set-based operations on any data modifications in this table. As a result, data modifications take a very long itme.
So then the discussion took a turn.
"Why do you have a clustered index".
Well, because you have to.
"No, you don't"
hmm. No clustered index. No bookmark lookups. No completely ludicrous query plans to avoid the bookmark lookup. No tablespace fragmentation. Horrible insert performance, but it's not gonna get any worse than it already is with that damn trigger on there anyway. Every index on the table stand on it's own merits, equally likely to be chosen based on its selectivity, not on some artificial and horrible decision made by the query processor to avoid a bookmark lookup that would likely cost nothing anyway.
Hmm.
If this was true, we'd see behavior where if you have a table with a clustered index, pretty much any non-clustered index that isn't a fully covering index is next to useless. Which is generally the observed behavior.
So the choices are: Clustered index and every other index is covering for a particular query, or no clustered index and let the other indexes fight it out for selectivity for each query.
The nonclustered index approach looks to be more flexible in this case. The cost, of course, is insert performance. Which is what I'll be testing today.