byrmol Blog

Garbage

"Look up" tables, domains and a apology.

First of the bat I'd like to apologies to Ryan Rinaldi for my assumption about his intelligence based on one comment of his in my blog 4 Developers and a DBA.

As always, Lavos, has provided the karma check that I needed so I decided to visit his blog.
He seems like an intelligent guy, and to assume otherwise based on one comment is rude. So once again sorry Ryan.

The relational model is surprisingly simple. In fact, so easy, I taught my wife the basics in about 30 minutes.
It went something like this…

A type is set of values, and can be just about anything.. string, number, fingerprint, image, DNA. The term domain is its synonym.
In RM, we define the type (and in proper implementation of RM, the operators as well), and hence the set of values, that define the type.

A type also comes with a name (Attribute). So a "column" in RM is a attribute, domain pair.
The only other requirement is that an attribute, domain pair must take one and only one value of that type (1NF)

"Columns" are "grouped" into sets (No 2 columns have the same name) and this is what we call the "heading".(The DDL we create in database speak)
One or a combination of the columns in the "group" must uniquely identify the "row".

When we give this set of columns a name we have made a relation (table).
Values that enter the table must conform to the heading and must be unique according to the key defined.

That's it! I obviously left out the algebra, calculus and functional dependency because quite frankly she would fall a sleep.

To see a more thorough definition from the master RM educator (Date) see here Link

From my experience, people completely miss the point of domains (types).
Which brings us to the topic of "Look Up" tables….

The primary use of a "look up" table is used to restrict values that may be entered in a column via a Foreign Key constraint.
This is exactly what a domain does. It only allows values that we defined in our type.

If we take the "Look Up" table approach to its logical conclusion, why don't you have a "Look Up" table for your integer columns or for that matter every column?

An example is Gender. Without getting too pedantic (chromosomal minorities) but including "Unknown" values (anything but NULL) there are really only 3 values. Male, Female and Unknown. So in SQL speak it would be declared thus
Gender VARCHAR(7) NOT NULL DEFAULT('Unknown') CHECK(Gender in ('Male', 'Female', 'Unknown'))

Obviously a trivial example, but the thought process is the same for any domain.

A non trivial example is street types. We have a type with over 25 values in a CHECK constraint. Last month we added 1 more. (Governments.. do they have anything better to do?)
Now a lot of you will be saying that it is far easier to add a row to a table than to alter a check constraint. And you would probably be right but I suspect you would let a user do that too. I don't trust users and when the shit hits the fan I am responsible for the integrity and accuracy of that data.

How many typos (which effectively give duplicates) do you see in some of your "Look Up" tables? So I don't give them a chance to mess with domains values, they can suggest them and after review they can be entered.

The decision is obviously on a model by model basis but here is a quick and dirty guide as when to use a CHECK constraint instead of a look up table: when the set of values are not changing frequently (I leave it up to you to define "frequent") or the word "type" is used to describe the attribute name.

Lastly, what about the hard coded values in your application? This shows a lack of imagination. For a start, all domains should be documented and in our case we create "virtual" views for each of them in a database named (what else?) Domains as well as its definition in SQL in a table named DomainDefinitions that we use for rapidly creating new tables.

CREATE VIEW Genders
AS
SELECT CAST('Male' AS VARCHAR(7)) AS Gender
UNION ALL
SELECT CAST('Female' AS VARCHAR(7)) AS Gender
UNION ALL
SELECT CAST('Unknown' AS VARCHAR(7)) AS Gender
GO

The .NET Dataset has been a real boon for us in this area as we simply load all our domains into it once and then flip a switch to repopulate when any changes are made to these domains…

It is obviously a little bit more work, but the payoffs include better performance (CPU versus IO), better data accuracy (mainly bought about by processes) and a feel good factor that you are staying as close to theory as your DBMS will allow.

EDIT: Fixed a dodgy "a" tag.. Why does Opera handle it and not IE?

Legacy Comments


Ryan A. Rinaldi
2004-02-18
re: "Look up" tables, domains and a apology.
First, thank you for the apology. I would just like you to know that I wasn't offended. I was upset that I said something that was taken the wrong way, which was entirely my fault.

Second, the link you provided is broken. :)

Third, I do see a value in CHECK constraints. Just in my travels I very rarely would run into an occasion where I had a list of values that I needed to restrict a column to, that did not also have some sort of meta-data attached. This requires the uses of a relation.

Thanks for writing a follow up post. I appreciate the clarification and it makes it quite clear how and why you are doing things the way you are doing them. :)

DavidM
2004-02-18
re: "Look up" tables, domains and a apology.
I knew you would cope.. but still poor manners on my behalf. I should get my wife to moderate my posts, but she reckons I'm full of shit to begin with, so nothing would get through!

Sorry about the link... It seems to work from my end.. (I use Opera if that makes any difference). Although I am seeing some strange highlighting....

Lavos has kept me in check a couple of times.. He did a famous one when I went on a rage about nulls. I (and another 2 DBA's) manage 38 databases supporting 43 applications. This release (we have one every 3 months), adds 2 more databases and 2 more apps. We have to be data nazi's as any mistake is VERY quickly picked up by the media.

Having a central repository (and hence process) for them makes a lot of sense in our environment

What type of Meta-Data? Lavos used that reason as well and for the life of me I can't figure out what extra meta data you would need.

Lavos
2004-02-18
re: "Look up" tables, domains and a apology.
Honestly, a lot of it is stuff that could be kept in a data dictionary or other machine readable format outside of the database.

The first example I can think of is extra descriptive text, column headers, sort orders, and other associated codes. There are several ways to do it, but the extra table *seems* like it would be the easiest. Your views should be just as easy to use as long as they don't want admin's to change the values.


I guess part of my push on meta-data is because I'm attempting to use generative programming techniques in my work, which relies on additional information.

I really liked reading that small part about your setup. I hadn't thought about storing domains that way before. I'll have to convert part of a project I did over, because it really is a more effective way of expressing the intent.

Unfortunately, I don't think my company's collaboration partner will like this approach.

DavidM
2004-02-18
re: "Look up" tables, domains and a apology.
Display Order..Of Course! Alphabetical seems to be our standard.

Adam Weigert
2004-02-18
re: "Look up" tables, domains and a apology.
Sadly, I have not come across a simple domain constraint set in my work. Well, lets say they started out as simple domain constraints, but then they wanted to add additional properties to each of the values. :( So, like everything, there are always exceptions.

SteveH
2004-02-19
re: "Look up" tables, domains and a apology.
<:-)>Couldn't you handle display order using the domain technique just by naming the values something like.

FirstOne
_NoThisOneFirst
__ThisOneShouldBeFirst
___UhOhThisIsReallyFirst

Then an alphabetic sort would work.

This method seems to work for my users when they name folders. </:-)>

Marc Shiker
2004-02-19
re: "Look up" tables, domains and a apology.
Thanks for the post and the insight. It caused a productive discussion between a coworker and myself on how we handle that situation. I've always, by default without thinking of an alternative solution, created the lookup table to store the possible values of the "type" for the table. What you are suggesting sure makes a ton of sense.

Brett
2004-02-19
re: "Look up" tables, domains and a apology.
Geez what the hell did I miss....

Just one question...How do you add new values to a constraint?

Oh, never mind, just read that...

What is your production environment like?

What's the amount of time allotted for development, QA, regression, and the release window? What about a backout strategy?

I'll let my user admin add the value.

ok, now for a MOO




DavidM
2004-02-19
re: "Look up" tables, domains and a apology.
Brett,

In the "Domains" database we have the views, the DomainsDefintion and one other table (Domain Usage) that tracks where it is used (Database, table, Column) and the Check Constraints Name.

When a new value is added to a domain, we alter the view (about 2 minutes work), alter the constraint definition (about another minutes work) and then run a stored procedure that automatically drops and recreates the check constraint (WITH NOCHECK, if we don't have this option, the locks just kill us) in all the database and tables affected... Works like a charm...

I would like to add that we have several "Look Up" tables simply because SQL Server does not have decent UDT support. The domains in question only have a single "property". We couldn't do it if we had say a domain of type Point for example...

We will be joining the beta of Yukon in June, and we hope to show that these "Look Ups" can be converted to a domain..

Andres Aguiar
2004-02-24
re: "Look up" tables, domains and a apology.
Another use case is when you need to refer to specific values in your code.

For example, you need to know if the person is male or female to apply a business rule. In that case you'll need to define an enum class in your application that maps to the same values.

I like the overall idea, but you have a lot of redundancy in your model. Each time you add a new value you have to change the constraint, the view, and eventually an enum class. This seems like a good scenario for code generation ;)


DavidM
2004-02-24
re: "Look up" tables, domains and a apology.
Andres,

Thanks for the feedback.

You example is only valid if your business rules are in your app and not in the database itself..
see my post about the database as your BLL.

As I have said, the choice (as always) is on a model by model basis...

There is zero redundancy in this model, only administration. Be definition, if you have added a value to a domain, you have changed the type itself. This should not be considered trivial by any stretch of the imagination. This techinque is really usefull for enterprises (ie multiple database) but probably overkill for a single database/application.

One of the major benefits we get is when it comes to OLAP. Beautifully shared dimensions only need to be declared once as we "know" that all domains contain exactly the same values regardless of the database in question... Vital for us...

1234
2008-06-03
re: "Look up" tables, domains and a apology.
I thik it is better to include examples i mean code in .net