Let us start with a simple question:
What is the goal of software development, be it database or .Net (or any other language)?
The first answer would be: Customer satisfaction!
And you'd be right. However, there's more to customer satisfaction than the immediate effect of delivering the product. We have to think about future change requests, maintenance periods, and so on.
Almost every business application out there consists of two basic parts: a database back end and some kind of front end that consumes the data. In our case the front end is anything with access to the database.
With regard to future code changes and the addition of new features, what is the best way to construct this code? I would say the best way is modular, or black box, design, which states that solutions to big problems are built from solutions to smaller problems. This means we have small independent modules that are black boxes to the outside world: each performs some operation based on input parameters and optionally provides output for other modules to consume. So how does this apply to our back end and front end systems?
Front end - object oriented code
Dependency injection (a form of Inversion of Control) is the leading principle that enables our goal of keeping code in modules that have no outer dependencies. The input (constructor) arguments are usually interfaces that provide knowledge about the other modules our module (class) uses. With this setup we can easily test each module by itself using automated unit tests. Also, changing a module does not break other modules unless we change its interface. And this is what we want.
This also enables us to keep our code separated and non-repeating. The worst thing we can do is have code that does the same thing in more than one place. Think maintenance nightmare!
A new feature request came in? No problem! Build it, test it, and use the interfaces of existing modules to access their functionality if needed. A change to an existing feature? Again, no problem. Change the module holding the feature and write new tests to validate that it works correctly. And we're done.
A module in object oriented code is a class that exposes its functionality through an interface.
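The build-it, test-it workflow above can be sketched in a few lines. This is a minimal illustration only, written in Java rather than the article's C# (the same shape applies), and all the names here (OrderReport, OrderRepository, FakeOrderRepository) are hypothetical:

```java
// The dependency is expressed as an interface, not a concrete class.
interface OrderRepository {
    int countOrders();
}

// The module under test knows only the interface it is given.
class OrderReport {
    private final OrderRepository repository;

    // Constructor injection: the dependency arrives from outside.
    OrderReport(OrderRepository repository) {
        this.repository = repository;
    }

    String summary() {
        return "Total orders: " + repository.countOrders();
    }
}

// A fake used only by tests: no database, no other modules required.
class FakeOrderRepository implements OrderRepository {
    public int countOrders() {
        return 3;
    }
}

public class Main {
    public static void main(String[] args) {
        // The module is exercised in complete isolation.
        OrderReport report = new OrderReport(new FakeOrderRepository());
        System.out.println(report.summary()); // prints: Total orders: 3
    }
}
```

Because OrderReport never constructs its own dependency, the test can substitute a fake and verify the module by itself.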
A simple example just to demonstrate dependency injection (DI), using a Star Trek theme for all the geeks out there :)
Move method implementations:

// ENTERPRISE
public void Move(Speed speed)
{
    if (speed == Speed.Impulse)
    {
        // impulse branch (body not shown in the original)
    }
    else if (speed == Speed.Warp)
    {
        // warp branch (body not shown in the original)
    }
    else
    {
        throw new Exception("Can't go faster than warp!");
    }
}

// BORG CUBE
public void Move(Speed speed)
{
    if (speed == Speed.Impulse)
    {
        // impulse branch (body not shown in the original)
    }
    else if (speed == Speed.Warp)
    {
        // warp branch (body not shown in the original)
    }
    else
    {
        _ITransWarpEngine.GoReallyFast();
    }
}
The method implementations are irrelevant; the point is that our ships are built from modules. They know only about each module's interface, which is why the module itself can be changed with no problem. We could completely change the internals of the ImpulseEngine without having to change a single thing in our ship, as long as the drive's interface stayed the same. And yes, we could refactor this simple example further.
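That swap can be made concrete with a runnable sketch (again in Java for illustration; the engine names and return strings are invented, not from the original example):

```java
// One interface, two interchangeable implementations.
interface ImpulseEngine {
    String fire();
}

class FusionImpulseEngine implements ImpulseEngine {
    public String fire() {
        return "fusion burst";
    }
}

// Completely different internals, same interface:
// the Enterprise class below needs no change at all.
class IonImpulseEngine implements ImpulseEngine {
    public String fire() {
        return "ion stream";
    }
}

class Enterprise {
    private final ImpulseEngine engine;

    // The ship knows only the ImpulseEngine interface.
    Enterprise(ImpulseEngine engine) {
        this.engine = engine;
    }

    String move() {
        return "Enterprise moves on " + engine.fire();
    }
}

public class Main {
    public static void main(String[] args) {
        System.out.println(new Enterprise(new FusionImpulseEngine()).move());
        System.out.println(new Enterprise(new IonImpulseEngine()).move());
    }
}
```

Swapping FusionImpulseEngine for IonImpulseEngine changes the engine's internals entirely, yet Enterprise compiles and runs untouched.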
If you ask me, dependency injection is every developer's dream. Unfortunately that dream usually comes to a screeching halt when we start dealing with databases.
Back end - databases
The concept of DI is meant for object oriented logic, which is excellent for code but totally fails when dealing with databases. A module in code is a class that implements an interface which exposes its functionality. There is no such concept in databases.
So what is a module in a database? Table, row, stored procedure? Wrong.
A module in the database is DATA.
Remember: Data is the core of the business application. Bad data, lousy business. Doesn't matter if you have the most amazing application known to mankind.
This is a classic example of a denormalized schema. The problem lies in the ShippingAddress column: since it is in the Orders table, it will contain duplicates whenever we ship different orders to the same address. Because of this we can, and eventually will, get data inconsistency. And that is bad for a number of reasons.
CREATE TABLE [Order]
(
    OrderId INT PRIMARY KEY,
    OrderDate DATETIME NOT NULL,            -- illustrative column; the original listing was truncated
    ShippingAddress NVARCHAR(200) NOT NULL  -- denormalized: the full address is repeated on every order
)
This is the common problem with denormalized design: if we edit and save the address for order 4, we change it independently of order 1, despite both orders having the same address. Now we have duplicated data and we don't know which copy is correct. In a normalized design this isn't possible.
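The update anomaly looks like this in practice (a hypothetical statement; the key column OrderId and the address values are assumed for illustration):

```sql
-- Suppose orders 1 and 4 both ship to '10 Main Street'.
-- Changing the address on order 4 alone leaves order 1 stale:
UPDATE [Order]
SET ShippingAddress = '12 New Street'
WHERE OrderId = 4;
-- Order 1 still says '10 Main Street'.
-- Two copies of "the same" address now disagree, and nothing
-- in the schema tells us which one is correct.
```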
This is the table above normalized into two tables: Order and Address. This way each piece of data lives in only one place, so it can't get corrupted by bad input. Note that the AddressValue column should be split further, but for brevity's sake let's keep it simple.
CREATE TABLE [Address]
(
    AddressId INT PRIMARY KEY,
    AddressValue NVARCHAR(200) NOT NULL  -- should be split further in a real schema
)

CREATE TABLE [Order]
(
    OrderId INT PRIMARY KEY,
    OrderDate DATETIME NOT NULL,  -- illustrative column; the original listing was truncated
    AddressId INT NOT NULL REFERENCES [Address] (AddressId)
)
Data normalization is not just a whim of every developer's arch nemesis, the DBA ;). It's a simple concept used to guarantee data consistency and validity, so when the time comes to change or extend our schema we don't have to change it in multiple places in our database. Just as dependency injection enables simple, testable change in code by modularizing objects, so does database normalization by applying the same goal to data.
So don't fear normalization. Embrace it and make your life easier in the long run.
Denormalizing on purpose
The most common, and quite valid, argument is that a fully normalized database schema is slow. This is true as far as it goes: querying four normalized, joined tables is slower than querying one denormalized table.
However, this argument is usually used in the wrong context: people tend to just leave their data denormalized.
Always first normalize data and then denormalize it if the need arises.
With SQL Server this problem can easily be solved with indexed views. With them we query the denormalized view but update the normalized base tables, and thus achieve the performance we need without storing duplicated data.
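A sketch of such an indexed view over the Order and Address tables (the column names OrderId, OrderDate, AddressId and AddressValue are assumed here; SQL Server requires WITH SCHEMABINDING, two-part table names, and a unique clustered index to materialize the view):

```sql
CREATE VIEW dbo.OrderWithAddress
WITH SCHEMABINDING  -- required for an indexed view
AS
SELECT o.OrderId, o.OrderDate, a.AddressValue
FROM dbo.[Order] AS o
JOIN dbo.[Address] AS a ON a.AddressId = o.AddressId;
GO

-- The unique clustered index materializes the join result,
-- so reads hit precomputed "denormalized" rows while writes
-- still go to the normalized base tables.
CREATE UNIQUE CLUSTERED INDEX IX_OrderWithAddress
ON dbo.OrderWithAddress (OrderId);
```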
Dependency injection and database normalization are two entirely different concepts that strive for the same thing: the ability to make quick and stable changes.
Keeping the area of change small helps us make any kind of change fast, be it small or large. In the long run this is what we want. Let's face it: how many times has an important business application that supports some business process had a lifetime of six months or less without the slightest change? Almost never. And because of that, it pays to design our business software properly. It really isn't that hard.