Ah, this is
not an anti-identity rant, don't worry!
Though, in a round-a-bout sort of way, it is yet another argument against always blindly using them -- but not in the way you might expect.
There is a simple rule I'd like to propose, let me know what you think:
"If your client code, SQL code, or configuration files reference the primary key column of a table to determine any application logic, that primary key column should not be an identity."
Now, I am not saying that primary key column can't be a meaningless integer. I am just saying it should not be an
auto-generated meaningless integer.
Working with an inherited web application (in "classic" ASP .... yuck), there were lots of tables like this:
groupID, groupName
1, Administrator
2, User
3, Finance User
5, Account Manager
8, ReadOnly
and there was
lots of code written like this:
if groupID = 1 then show delete button
if groupID =3 or groupID=5 and groupID != 8 then show the create new project button
... repeated everywhere, throughout the application and in the SQL code. Yet, the groupID column was an identity. I discussed a similar issue
here, but this is a slightly different approach. In this case, let us assume that we really want to -- or need to -- embed that "groupID" value everywhere in our code. And we want that table to have an integer primary key to keep things short and simple and precise in our database code. That's all fine, it happens, and it certainly is common.
But why does that groupID need to be an identity?
For static, non-changing system tables, why assign an identity primary key? Why not just make it an integer? And once a year or so when someone adds and entry to this table (usually manually, behind the scenes with an INSERT command or opening the table directly) what is so bad about having them manually "figure out" the next ID to use and manually insert it?
The database will always have that primary key constraint on the column -- we don't gain any integrity by using an identity. We just gain ... well, what exactly do we gain when the table is static?
The advantage of
not using an identity is this: if we wish to confer meaning on these ID values, we have complete control over them. Moving from one system to the next means we can just copy them over and we don't need to worry about setting IDENTITY INSERT ON and things like that. We don't have to worry about gaps in the sequence or re-seeding the tables. We can guarantee that we have complete control over these values, and they never will be created anywhere unless we
explicitly create them.
If we do use an identity, that's great the next "ID" will pop up magically, but what good does that really do for us in this scenario? I understand that when an order is generated, or a customer is added, or for things like that, an IDENTITY is the way to go. But for static tables, I really honestly don't bother with an identity. Just put an INT primary key on there and you are good to go.
The reason I bring this up is as the result of a mini-debate with another developer. If a system table such as "Status" will only ever have 3 values, and we wish to assign each of those values an integer "ID" as a primary key, why bother with an identity? Why not just use a standard, non-auto-generating integer column as the primary key?
Do the pros of an identity
in this situation outweigh the cons?
Then again, I am a bit of a control freak, I admit ... But I think that key values in your data tables that your applications and systems rely on are a pretty good thing to have control over.
(Another minor point oft-overlooked: Don't forget about considering the use of tinyint or smallint data types for these columns as well. )