Database design can be very complicated, and it truly is an art as opposed to a science; sometimes there are multiple correct ways to model the same data with pros and cons to each. I can understand that
normalization can be tricky to comprehend and to implement, and that concepts like stored procedures and foreign keys and even indexes and constraints can take time to grasp.
But -- what about
Data Types? They are so basic, so simple, so fundamental; not only for database design, but for
any sort of programming in general ... what excuse is there for not using correct data types for the columns in a table design?
I see it time and time again in the
SQLTeam forums -- "dates" that don't sort properly, "numbers" that don't add correctly, "boolean" columns containing 10 different values, invalid entries that
somehow show up in "date" columns, and so on ... Of course, since we are rarely provided any DDL to review, it often takes dozens of posts going back and forth until we finally realize: "wait ... you
aren't using a datetime data type to store these dates??? Arggh!!"
Look, I can understand why you think that columns like
Period1, Period2, ..., Period12 are a good idea in your tables at first ... I can accept that you have nothing but IDENTITY primary keys ... I see the perceived advantage of using dynamic SQL created in your client applications, instead of using stored procedures, to keep things "simple" ... but using correct data types is so basic, so important, so fundamental, and so crucial to pretty much
anything working correctly in a database that there is just no excuse or reason to abuse them!
In short, even a poorly designed database, with one giant "master" table with no normalization or logic anywhere in sight, should
still at least use a Money data type to store currency values!
Perhaps the confusion comes from Excel users, where data types are handled behind the scenes ... or maybe "old school" VB programmers used to using variant data types (or worse -- undeclared variables!) to store values... But when you design a table in any database, you are always explicitly stating the data types of the columns -- there's nothing hidden, or no option to ignore them. You
must declare a data type when creating a column, so how can anyone justify using VARCHAR to store a date?
Well, I suppose we are all guilty on way or another of abusing data types. I often use Integers when a byte will suffice, or a DateTime when a smalldatetime would do just fine, or VARCHARs that are larger than necessary. But at least, fundamentally, those values
act the way they are supposed to -- it is just a matter of optimizing disk space in these cases. Using an Integer instead of a Byte may waste some space, but at least I can still calculate,sort, join and compare those values correctly -- which is what a database is all about.
So, please, the next time you create a table -- think carefully about the data types of your columns. It's not a "formatting" declaration, like in Excel, or some sort of general guideline that SQL will follow, or something to determine later on. Each column's data type is a crucial, fundamental aspect of that table and your entire schema, and greatly determines the efficiency of your code in terms of both performance and simplicity. So many complicated SQL "problems" and "puzzles" that beginner programmers encounter can be solved easily and quickly by simply using the correct data types on your tables. It's not all you need to do to ensure a good database design, but it is the very first step.
see also: