10 Common Design Mistakes
1) No logical model
Obvious, isn’t it? But I can almost guarantee that everyone has started a DB project without one at some stage. You are lucky if you see an ERD, and you should buy a Lotto ticket if you see a Functional Dependency analysis.
As an example, my current project’s schema can be broken into 7 functional areas. I produced ERDs and FDs for 5 of these areas but ignored the other 2. Despite changing requirements, the “planned” 5 areas remain almost untouched, while the “unplanned” schemas have undergone countless re-writes. By the way, the 5 “planned” areas make up 90% of the business model.
I wasted my time and the client’s because I got lazy…
2) Physical Contamination
While preparing your logical model, you start thinking too hard about how it will look in the DBMS. It is a subtle issue that can distort your model.
This is how it affected my current project... Confusion and Clarity
3) Using NULL
Look out for the flames! Besides the relational heresy, NULL is hardly ever modeled at the logical level, but is jammed into the physical database as an afterthought.
Do you ever hear middle-tier programmers complaining that C#/VB/Java etc. cannot assign NULL to an integer or decimal (value types) in the middle/UI tier? Oh wait… you will when the database starts spitting out NULLs at them…
Programmer: “What will I display to the user?”
Dodgy DBA: “NULL”
Programmer: “But the UI control is limited to integers only!”
Dodgy DBA: “I don’t know… how about 0?”
Programmer: “Idiot!”
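To see why the middle tier screams, here is a minimal sketch of the three-valued logic surprise (standard SQL; the table and data are invented for illustration):

    -- A hypothetical table where Bonus was made nullable as an afterthought.
    CREATE TABLE Employee (
        EmployeeId   INT          NOT NULL PRIMARY KEY,
        EmployeeName VARCHAR(50)  NOT NULL,
        Bonus        DECIMAL(9,2) NULL
    );

    INSERT INTO Employee VALUES (1, 'Alice', 500.00);
    INSERT INTO Employee VALUES (2, 'Bob', NULL);

    -- Bob appears in NEITHER result: NULL is not equal to anything,
    -- and not unequal to anything either.
    SELECT EmployeeName FROM Employee WHERE Bonus = 500.00;   -- Alice only
    SELECT EmployeeName FROM Employee WHERE Bonus <> 500.00;  -- empty

    -- And the aggregate silently ignores him.
    SELECT AVG(Bonus) FROM Employee;  -- 500.00, not 250.00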
4) No Domains/types
How many times do you see people declare columns as simply the base type with no additional constraints?
e.g. NumberOfSiblings INT
I don’t know many people who have minus 65 children….
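The fix is cheap. A poor man’s domain, sketched with a standard CHECK constraint (names are illustrative; 50 is an arbitrary sanity bound):

    CREATE TABLE Person (
        PersonId         INT NOT NULL PRIMARY KEY,
        -- Constrain the base type down to what the business actually means:
        -- a count of siblings is a smallish, non-negative integer.
        NumberOfSiblings INT NOT NULL
            CONSTRAINT CK_Person_Siblings CHECK (NumberOfSiblings BETWEEN 0 AND 50)
    );

    -- Rejected by the DBMS instead of silently stored:
    INSERT INTO Person (PersonId, NumberOfSiblings) VALUES (1, -65);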
5) Denormalising
Nothing quite like receiving the wrong results at the speed of light, is there? If you have to denormalise, you MUST tell the database about it. And in my experience it is bloody hard to do that with current SQL DBMSs…
A good example of this was at my last project. A central query required a 38 table JOIN. Response time from the DBMS (in this case DB2) was (I thought, at least) an impressive 220ms average. “Not good enough” was the call from the powers that be. A big meeting took place and, after almost 6 hours of utter crap, it was decided that all this data would be stored redundantly in another table just for this query. The resultant query was almost instantaneous. The bigwigs were happy but the DBAs were not. Why? Because we had to add triggers to keep it in sync and then run verification procedures every night to ensure correctness.
You’re probably thinking, “Suck it up Dave! That’s your job!” And good for you! But now there is an extra level of management (not cheap), and surprisingly (to the managers at least) the DBMS has to work harder.
Manager: “Why are the clock cycles up 12%?”
DBA: “Because I have to verify your busted arse table!”
Manager: “Do you have to?”
DBA: “Do you remember when we made the front page because we sent the wrong information to some of our clients?”
Manager: “Oh….”
It was actually page 3, but front page sounds better…
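The machinery in question looked roughly like the sketch below (heavily simplified, written in T-SQL rather than DB2 syntax, and with invented names; the real thing touched far more tables):

    -- One of the triggers keeping the redundant table in sync.
    -- Multiply this by every source table that feeds the copy...
    CREATE TRIGGER trg_OrderLine_SyncReport
    ON OrderLine
    AFTER INSERT, UPDATE, DELETE
    AS
    BEGIN
        -- Re-derive the affected rows of the redundant table.
        DELETE FROM OrderReport
        WHERE OrderId IN (SELECT OrderId FROM inserted
                          UNION
                          SELECT OrderId FROM deleted);

        INSERT INTO OrderReport (OrderId, TotalAmount)
        SELECT OrderId, SUM(Quantity * UnitPrice)
        FROM OrderLine
        WHERE OrderId IN (SELECT OrderId FROM inserted
                          UNION
                          SELECT OrderId FROM deleted)
        GROUP BY OrderId;
    END;

    -- ...plus the nightly paranoia check the DBAs got stuck with:
    SELECT COALESCE(r.OrderId, c.OrderId) AS SuspectOrderId
    FROM OrderReport r
    FULL OUTER JOIN (SELECT OrderId, SUM(Quantity * UnitPrice) AS TotalAmount
                     FROM OrderLine
                     GROUP BY OrderId) c ON c.OrderId = r.OrderId
    WHERE r.OrderId IS NULL OR c.OrderId IS NULL
       OR r.TotalAmount <> c.TotalAmount;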
6) Performance obsession
Almost identical to the “Physical Contamination” issue.
You know the type of people I am talking about.
“A string SEEK takes 2 milliseconds longer than an integer, so damn the model and stuff this meaningless number in there…at least it goes fast”
Yeah right… As fast as a lemming to a cliff.
I am fully aware of the lemming myth. Hail Disney!
7) No default values
Not mandatory of course, but defaults make the database more responsive to change. As an example, a changing requirement might require an additional column to be added to a table. Without a default, you WILL have to rewrite all your DML code and most probably client-side code as well.
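A sketch (invented names; T-SQL syntax):

    -- Existing DML, written before anyone knew about the new column:
    --   INSERT INTO Customer (CustomerId, CustomerName) VALUES (42, 'Acme');

    -- With a default, the new column is invisible to old DML:
    ALTER TABLE Customer
        ADD PreferredLanguage CHAR(2) NOT NULL DEFAULT 'EN';

    -- Without the DEFAULT, every existing INSERT that omits the column
    -- now fails (or the column must be made NULLable -- see mistake 3).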
8) Identity madness
Excessive use of artificial/surrogate keys effectively turns your lovely RM-based logical model into a pointer-based cluster f***.
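If you must use surrogates, at least keep the natural key declared and enforced so the table still says something about the business. A minimal sketch (T-SQL; names invented):

    CREATE TABLE Product (
        ProductId   INT IDENTITY(1,1) NOT NULL PRIMARY KEY, -- the surrogate
        ProductCode CHAR(10)          NOT NULL UNIQUE,      -- the key the business knows
        ProductName VARCHAR(60)       NOT NULL
    );

    -- Drop the UNIQUE constraint and ProductId is nothing but a pointer:
    -- the same ProductCode can then appear twice and the DBMS won't care.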
9) Relying on the application for correctness
One of the primary rules in data management is to trust no one and no thing to provide correctness EXCEPT the database. The fact that current DBMSs suck so badly at even simple rules is no excuse for passing that responsibility off to the application.
10) Relying on Stored Procedures for correctness
Stored procedures are a mixed blessing. They are great for security and client interaction, but they can make you lazy when implementing business rules. If it can be done at DDL time… DO IT!
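A sketch of the difference (T-SQL; invented names). The rule declared at DDL time holds for every stored proc, ad-hoc query and bulk load; the proc-only version holds only for callers who remember to use the proc:

    -- The rule in the DDL: enforced everywhere, forever.
    CREATE TABLE Account (
        AccountId INT           NOT NULL PRIMARY KEY,
        Balance   DECIMAL(12,2) NOT NULL
            CONSTRAINT CK_Account_Balance CHECK (Balance >= 0)
    );
    GO

    -- The same rule buried in a procedure: enforced only for its callers.
    CREATE PROCEDURE WithdrawFunds
        @AccountId INT,
        @Amount    DECIMAL(12,2)
    AS
    BEGIN
        UPDATE Account
        SET Balance = Balance - @Amount
        WHERE AccountId = @AccountId
          AND Balance >= @Amount;  -- anyone bypassing the proc bypasses the rule
    END;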
Legacy Comments
Jeff S
2005-09-13
re: 10 Common Design Mistakes Very nice as usual, David! I'd like to add one: focusing too much on how your reports/application will LOOK when deciding how to model your tables. For example, knowing that on a given report you will want to see Actuals and Budgets side by side on one row -- so you feel that you need to create a column called "Actual" and a column called "Budget" in your table, instead of a properly normalized table with Budget rows and Actual rows.
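For illustration, a minimal sketch of the normalized shape Jeff describes (table and column names are invented):

    CREATE TABLE LedgerAmount (
        AccountId  INT           NOT NULL,
        PeriodId   INT           NOT NULL,
        AmountType CHAR(1)       NOT NULL CHECK (AmountType IN ('A', 'B')), -- Actual / Budget
        Amount     DECIMAL(12,2) NOT NULL,
        PRIMARY KEY (AccountId, PeriodId, AmountType)
    );

    -- The side-by-side report layout is just a presentation query:
    SELECT AccountId, PeriodId,
           SUM(CASE WHEN AmountType = 'A' THEN Amount ELSE 0 END) AS Actual,
           SUM(CASE WHEN AmountType = 'B' THEN Amount ELSE 0 END) AS Budget
    FROM LedgerAmount
    GROUP BY AccountId, PeriodId;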
Alex Weatherall
2005-09-13
re: 10 Common Design Mistakes Hi. While you make many valid points, can they all be described as design mistakes? I would like to see this list again with a bit more balance regarding certain points.
3) Using NULLs: Why is this a design mistake? Creating all columns as nullable - big mistake. But defining a column as NULLable to correspond with a requirement from your business needs etc. isn't necessarily a mistake.
5) Denormalising: surely this isn't a design mistake - you've taken your design to its natural normalised conclusion, and then for performance, maintenance, or other reasons you make the informed decision to denormalise a specific part of the design. The design mistake comes from not considering the normalised design in the first place (see your own blog on storing graphic coordinates).
10) Stored Procedures: Not sure what you are meaning - if you are saying that data access logic is OK in the client layer, then I think in general I disagree (flexible reporting etc. excluded). You make the point yourself in point 9): don't assume that the application will do things correctly. Well, this includes accessing the database using DDL.
I understand that you're making these points from a worst practice point of view, which is why I think the balanced argument is required. Thanks Alex
Brett
2005-09-13
Alex, the NULL comment was from a logical relational model bent. Forget "flexibility" - all data access should be done in sprocs. Balanced argument? I thought David was spot on. If you re-read this Alex, why don't you post what your background is in?
DavidM
2005-09-13
re: 10 Common Design Mistakes Thanks Jeff. You are dead right - modelling based on output falls into the "physical contamination" issue.
Alex, thanks for the feedback. You are correct that some of the points are not really "design" related, but they are mistakes none the less. As Brett pointed out, the use of NULL without a business case is the real issue. "Denormalising" implies that you have normalised first; the mistake is not informing the DBMS about it, and the consequences that follow. Your reference to my other blogs does not make sense - it was a logical to physical implementation mistake. The PointArray type is in 1NF and not related to the denormalising point. DDL = Data Definition Language... not DML (Data Manipulation Language). I am saying business rules should be embedded at the "CREATE TABLE" stage (if possible) and not in the stored procedure layer. This is as fair and balanced as I get... Feel free to write a more "balanced" side (whatever that is) but I certainly will not...
melvin
2005-09-13
re: 10 Common Design Mistakes All very good design principles. I'd also like to add:
- We're in agreement that it's the DBMS' responsibility to stop all data combinations that are outside of the business model (eg. a person can't have negative age). However, today's leading DBMS products can't handle all the constraints one might want implemented solely within DDL. As such, I've had great success implementing comprehensive "complex" constraints (a.k.a. "business rules") as table-level triggers, as well as generating, from a single source, a rule verification stored procedure to be run on-demand by the paranoid DBA. Unique RAISERRORs are issued for every rule broken. Deferred constraint checks are implemented via a final call from the front-end prior to the actual COMMIT (too bad there's no such thing as an "ON COMMIT" trigger yet). Yeah, I can hear it now: "performance sucks"! Whether you execute the "rules" within your transaction logic (see next point on stored procedures below), or within table insert/update/delete triggers, or just before the transaction COMMIT, it still needs to be done. These complex constraints are just SELECT statements with multi-table JOINs that catch illegal state combinations (eg. disallow order discounts > 10% to new customers of < 1 yr). The trick is to only execute the rules that may be affected at trigger-time and at commit-time. Brownie points go to the developer that can suck out (at run-time) the NULL and DDL-defined column constraints from the database catalog and implement them as field-level checks on your front-end screens.
- Stored procedures make excellent, version-able entry points for the front-end application(s) to call. They (1) pare down the data to the minimum required (a backplane is faster than Gbit E), and (2) batch data between user "think" times by returning multiple result sets (parent and child rows; don't forget the drop-down list boxes too) and, as such, are optimized for high-latency networks. But they needn't contain any business rules (see previous point); just the transaction logic. Usually, all the stored procedures need to do is exit on the first INSERT/UPDATE/DELETE statement failure (you are error checking, right?). And none of this business with issuing SELECTs that check for the existence of the row you are just about to insert; just do the damn INSERT already. Let the RDBMS cough up the system message, then map the system message to a pretty user message at the front-end. Any unexpected system messages should cause your application to panic with "Internal Error" then abort. I've successfully implemented all DDL-defined constraints and (recoverable) system messages (dup key, FK violations, deadlock, ...) using a mapping table: sysmsg, differentiator code -> usermsg, usermsg substitution params; where differentiator is a unique string on the message text, eg. the name of the check constraint.
- Yeah, too many surrogate keys make you go nuts. But composite keys are a bit of a pain. It's really nice when all your foreign keys are one column wide though. Still, your data usually came from elsewhere so you'll need alternate (unique) keys (as the row is known to your sources) on your tables as well. And it isn't really wise to use someone else's key unless you really trust your source to not introduce duplicates, or your table is never expected to be a superset of the source, or won't succumb to temporal issues (eg. using a social security number for the employee PK; you may want to add an employee before his SS number is known). FYI: apparently, all NULLs are the same when it comes to optional unique constraints; this is another place SQL falls down. So you'll need to implement the constraint with a trigger or a unique index on a view (non-nullable rows) defined on that table; ick!
*** If you've been paying close attention, you may have guessed I've written about my experiences with MS SQL 2000. Some of my design choices may not translate to Oracle or DB2; they may not have the facility to return multiple result sets from stored procedures or issue many RAISERROR-equivalents on a given DML statement.
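A minimal sketch of the table-level trigger approach melvin describes (SQL Server 2000 syntax; the table names, rule number and cutoff are invented for illustration):

    CREATE TRIGGER trg_Order_ComplexRules
    ON Orders
    AFTER INSERT, UPDATE
    AS
    BEGIN
        -- Complex constraint: disallow order discounts > 10% to customers of < 1 yr.
        IF EXISTS (SELECT 1
                   FROM inserted i
                   JOIN Customer c ON c.CustomerId = i.CustomerId
                   WHERE i.DiscountPct > 10
                     AND DATEDIFF(day, c.CustomerSince, GETDATE()) < 365)
        BEGIN
            -- A unique, mappable error for the front-end message table.
            RAISERROR ('Rule ORD001: discount over 10%% not allowed for customers under one year.', 16, 1)
            ROLLBACK TRANSACTION
            RETURN
        END
    END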
melvin
2005-09-13
re: 10 Common Design Mistakes Ooops! Of course what I meant by "All very good design principles." in my earlier message is the direct opposite of the "mistakes" you list. The consequences of which you describe should hopefully hammer that home.
Alex Weatherall
2005-09-14
re: 10 Common Design Mistakes Brett, David, you are correct - I misread your point about stored procedures and DDL (I read DML - doh!), and I was reading this from an implementation point of view on SQL Server, rather than an ERM design perspective. Thanks Alex
Claes W - rockmoose
2005-09-14
re: 10 Common Design Mistakes This is great, very good post, very good. Some guys are working their asses off to nullify the nulls... http://blogs.msdn.com/somasegar/archive/2005/08/11/450640.aspx
Claes W - rockmoose
2005-09-14
re: 10 Common Design Mistakes I mean, YOUR post is great. The link was trivia that You might find interesting.
Piatkow
2006-02-03
re: 10 Common Design Mistakes Nothing wrong with implementing denormalised structures or nulls as long as you UNDERSTAND and JUSTIFY the compromises that you are making. You only do that by producing a fully normalised model, without nulls, then examining and justifying every step back from that. My personal experience when I was a working DBA was that most "requirements" for nulls could actually be traced back to sloppy business analysis.
Azadeh Khojandi
2010-03-29
re: 10 Common Design Mistakes Very nice, thanks