Most Valuable Yak (Rob Volk) Blog

…and other neat SQL Server tricks

T-SQL Tuesday #72: Data Modeling

It's that time again: T-SQL Tuesday! This month the lovely and amazing Mickey Stuewe (b | t) is hosting and asks us to write about the trials and tribulations of data modeling.

Below is a tale that occurs somewhat frequently (in my humble experience anyway), I’ve changed some of the details so as not to embarrass anyone besides myself. Any resemblance to actual data models, living or dead, is really unfortunate.

(Yes, the excessive code is deliberate)

I really love ice cream!

I decided to open my own ice cream parlor and design the database I'd need to support it:

CREATE TABLE IceCream(IceCreamID int NOT NULL IDENTITY(1,1) PRIMARY KEY, IceCreamName nvarchar(255) NOT NULL);

Naturally we need to support some accessories:

CREATE TABLE IceCreamToppings(IceCreamToppingID int NOT NULL IDENTITY(1,1) PRIMARY KEY, IceCreamToppingName nvarchar(255) NOT NULL);

And of course we need data:

INSERT IceCream(IceCreamName) VALUES ('vanilla'),('chocolate'),('strawberry'),('Neopolitano'),('pistachio'),('mint chocolate chip');
INSERT IceCreamToppings(IceCreamToppingName) VALUES ('chocolate sprinkles'),('candy sprinkles'),('chocolate chips'),
('blueberry sauce'),('whipped cream'),('butterscotch'),('m&ms'),('oreo crumbles');

Business took off into the stratosphere!

My customers also love putting ice cream on breakfast foods, so I offered pancakes and waffles!

CREATE TABLE Pancakes(PancakeID int NOT NULL IDENTITY(1,1) PRIMARY KEY, PancakeName nvarchar(255) NOT NULL);
CREATE TABLE PancakeToppings(PancakeToppingID int NOT NULL IDENTITY(1,1) PRIMARY KEY, PancakeToppingName nvarchar(255) NOT NULL);
CREATE TABLE Waffles(WaffleID int NOT NULL IDENTITY(1,1) PRIMARY KEY, WaffleName nvarchar(255) NOT NULL);
CREATE TABLE WaffleToppings(WaffleToppingID int NOT NULL IDENTITY(1,1) PRIMARY KEY, WaffleToppingName nvarchar(255) NOT NULL);

INSERT Pancakes(PancakeName) VALUES ('Homemade'),('frozen');
INSERT Waffles(WaffleName) VALUES ('Belgian'),('Eggo round'),('Eggo Square');
INSERT PancakeToppings(PancakeToppingName) VALUES ('chocolate chips'),('blueberries'),('blueberry sauce'),('maple syrup'),('apples'),('cinnamon sugar'),('powdered sugar');
INSERT WaffleToppings(WaffleToppingName) VALUES ('chocolate chips'),('blueberries'),('blueberry sauce'),('maple syrup'),('apples'),('cinnamon sugar'),('powdered sugar');

WOW!

They loved this so much I started offering other breakfast options:

CREATE TABLE Omelettes(OmeletteID int NOT NULL IDENTITY(1,1) PRIMARY KEY, OmeletteName nvarchar(255) NOT NULL);
CREATE TABLE OmeletteStuffings(OmeletteStuffingID int NOT NULL IDENTITY(1,1) PRIMARY KEY, OmeletteStuffingName nvarchar(255) NOT NULL);

INSERT Omelettes(OmeletteName) VALUES ('whole eggs'),('egg whites');
INSERT OmeletteStuffings(OmeletteStuffingName) VALUES ('sausage'),('mushrooms'),('green peppers'),('ham'),('spinach'),('feta cheese'),('onion'),('red peppers');

And then lunch offerings soon followed:

CREATE TABLE Pizza(PizzaID int NOT NULL IDENTITY(1,1) PRIMARY KEY, PizzaName nvarchar(255) NOT NULL);
CREATE TABLE PizzaToppings(PizzaToppingID int NOT NULL IDENTITY(1,1) PRIMARY KEY, PizzaToppingName nvarchar(255) NOT NULL);

CREATE TABLE Burgers(BurgerID int NOT NULL IDENTITY(1,1) PRIMARY KEY, BurgerName nvarchar(255) NOT NULL); CREATE TABLE BurgerToppings(BurgerToppingID int NOT NULL IDENTITY(1,1) PRIMARY KEY, BurgerToppingName nvarchar(255) NOT NULL);

CREATE TABLE Sandwiches(SandwichID int NOT NULL IDENTITY(1,1) PRIMARY KEY, SandwichName nvarchar(255) NOT NULL); CREATE TABLE SandwichToppings(SandwichToppingID int NOT NULL IDENTITY(1,1) PRIMARY KEY, SandwichToppingName nvarchar(255) NOT NULL);

INSERT Pizza(PizzaName) VALUES ('deep dish'),('sicilian'),('regular'); INSERT PizzaToppings(PizzaToppingName) VALUES ('pepperoni'),('sausage'),('mushrooms'),('green peppers'),('ham'),('spinach'),('feta cheese'),('onion'),('pineapple'); INSERT Burgers(BurgerName) VALUES ('beef'),('kobe'),('angus'),('turkey'),('chicken'); INSERT BurgerToppings(BurgerToppingName) VALUES ('bacon'),('mayo'),('mustard'),('lettuce'),('tomato'),('red onion'),('white onion'),('sauteed onion'),('mushrooms'),('sauteed mushrooms'),('pickles'); INSERT Sandwiches(SandwichName) VALUES ('roast beef'),('ham'),('turkey'),('pastrami'),('baloney'); INSERT SandwichToppings(SandwichToppingName) VALUES ('bacon'),('mayo'),('mustard'),('lettuce'),('tomato'),('red onion'),('white onion'),('sauteed onion'),('mushrooms'),('sauteed mushrooms'),('pickles');

And then, my business went international!

People wanted my food, but didn't speak English, so I offered translations for everything:

CREATE TABLE IceCreamToppingsTranslation(LanguageID int NOT NULL, IceCreamToppingID int NOT NULL, TranslatedText nvarchar(4000) NOT NULL);
CREATE TABLE WaffleToppingsTranslation(LanguageID int NOT NULL, WaffleToppingID int NOT NULL, TranslatedText nvarchar(4000) NOT NULL);
CREATE TABLE PancakeToppingsTranslation(LanguageID int NOT NULL, PancakeToppingID int NOT NULL, TranslatedText nvarchar(4000) NOT NULL);
CREATE TABLE PizzaToppingsTranslation(LanguageID int NOT NULL, PizzaToppingID int NOT NULL, TranslatedText nvarchar(4000) NOT NULL);
CREATE TABLE BurgerToppingsTranslation(LanguageID int NOT NULL, BurgerToppingID int NOT NULL, TranslatedText nvarchar(4000) NOT NULL);
-- repeat ad nauseum

And then...

At some point, I was too busy serving food to design databases anymore, and couldn't deal with the performance issues that were cropping up, so I hired ace database designer Portia N. Carrera, and she promptly replied:

"This design is ridiculous! And it will only get worse as time goes on! No one in their right mind 
would ever do such a thing!" (edited to remove excessive but totally justified profanity) 

Well, they would. They did. **

"Bull----! How did it get like this?" 

Well, that's kinda the thing. I started small, and added stuff. It's easy and natural, and fairly sensible as long as things stay small. Unfortunately as things grew I never went back to look at what's there and question whether this is the "right" way or if there's a better way. This became technical debt and the effort needed to change it was insurmountable.

"Here's a good take on technical debt:" https://medium.com/@MostlyHarmlessD/on-technical-debt-7bac65edf349 

It describes the reasoning/mentality around these decisions, with much of it focused on time constraints. Portia calls this "5 minute/5 second design" and curses everyone's lack of foresight (when she's in a bad mood). ***

Well Portia, how would you fix it?

"Like this:" 

CREATE TABLE FoodTypes(FoodTypeID int NOT NULL PRIMARY KEY, FoodType nvarchar(64) NOT NULL);
INSERT FoodTypes(FoodType) VALUES(N'Ice Cream'),(N'Pancakes'),(N'Waffles'),(N'Omelettes'),(N'Pizza'),(N'Burgers'),(N'Sandwiches');

CREATE TABLE Foods(FoodID int NOT NULL PRIMARY KEY, FoodTypeID int NOT NULL, Food nvarchar(64) NOT NULL); INSERT Foods(FoodTypeID,Food) VALUES(1,N'Vanilla'),(1,N'Chocolate'),(1,N'Strawberry') – Ice cream ,(2,N'Homemade'),(2,N'Frozen') – Pancakes ,(3,N'Belgian'),(3,N'Eggo Round'),(3,N'Eggo Square') – Waffles ,(4,N'Whole Eggs'),(4,N'Egg Whites') – Omelettes ,(5,N'Deep Dish'),(5,N'Sicilian'),(5,N'Regular'); – Pizza

CREATE TABLE FoodExtras(FoodExtraID int NOT NULL PRIMARY KEY, FoodTypeID int NOT NULL, FoodExtraName nvarchar(64) NOT NULL); INSERT FoodExtras(FoodTypeID,FoodExtraName) VALUES(1,N'Chocolate sprinkles'),(1,N'Candy sprinkles'),(1,N'Chocolate chips'),(1,N'Blueberry sauce') – Ice cream toppings ,(2,N'Chocolate chips'),(2,N'Blueberries'),(2,N'Blueberry sauce'),(2,N'Maple syrup') – Pancake toppings ,(3,N'Chocolate chips'),(3,N'Blueberries'),(3,N'Blueberry sauce'),(3,N'Maple syrup') – Waffle toppings ,(4,N'Sausage'),(4,N'Mushrooms'),(4,N'Green Peppers'),(4,N'Ham') – Omelette stuffings ,(5,N'Pepperoni'),(5,N'Sausage'),(5,N'Mushrooms'),(5,N'Green Peppers'),(5,N'Ham'); – Pizza toppings

CREATE TABLE FoodExtraTranslations(FoodExtraID int NOT NULL, LanguageID int NOT NULL, TranslatedText nvarchar(4000) NOT NULL); INSERT FoodExtraTranslations(FoodExtraID,LanguageID,TranslatedText) VALUES(1,1031,N'German Chocolate Sprinkles'),(1,1036,N'French Chocolate Sprinkles') ,(2,1031,N'German Candy Sprinkles'),(1,1036,N'French Candy Sprinkles') ,(13,1031,N'Sausage auf Deutsch'),(13,1036,N'Sausage au français');

She continued:

"If this looks suspiciously like the dreaded Entity-Attribute-Value model, that's because it is. 
There are definite disadvantages to EAV, but for this particular case it's a reasonable option to support 
this data. The biggest advantage is that no new structure is required; your data becomes row-based rather than 
table-based. If you get a new type of food, you just add a row to the FoodTypes table, then the associated rows 
for extras and their translations. You'll probably wonder why Food and FoodExtras are separate, and they really 
don't have to be. I'll leave that redesign for you to figure out, you're not paying me enough for that." 

But I can't rewrite all my application code! They're expecting tables named Waffles, Pancakes, etc. and it's too much work to change it!

"Oh give me a ------- break! Just create views with the same name that only SELECT those rows:" 

CREATE VIEW Pancakes(PancakeID,PancakeName) AS SELECT FoodID, Food FROM Foods WHERE FoodTypeID=2 WITH CHECK OPTION; -- Pancakes
GO
CREATE VIEW Waffles(WaffleID,WaffleName) AS SELECT FoodID, Food FROM Foods WHERE FoodTypeID=3 WITH CHECK OPTION; -- Waffles
GO
CREATE VIEW Pizza(PizzaID,PizzaName) AS SELECT FoodID, Food FROM Foods WHERE FoodTypeID=5 WITH CHECK OPTION; -- Pizza
GO

"And if you're doing things correctly, you're calling a stored procedure to do all your INSERT/UPDATE/DELETE operations, 
those can be modified easily:" 

CREATE PROCEDURE AddFood @foodType int, @foodName nvarchar(64) AS
INSERT Foods(FoodTypeID,Food) VALUES(@foodType,@foodName);
GO

ALTER PROCEDURE AddPizza @pizzaName nvarchar(64) AS EXEC AddFood 5, @pizzaName; – original AddPizza code commented out now… GO

Portia continued:

"BTW, you shouldn't hard-code your IDs either, but I'll let you fix that. And if you're doing it with ad-hoc SQL, 
or an ORM of some kind, you can still use the views if you add an INSTEAD OF trigger to handle INSERTs and UPDATEs. You're 
not paying me enough to do that either, but it's easy enough to write a code generator to do all that." 

Lessons Learned

If you can’t make or take the time to design them up front, take time every few months to review your table designs. If you see a lot of similar or identical tables, look to consolidate them. See if a row-based approach can be done. Also make sure to test performance, sometimes it may not improve with a new design.

But more importantly, TAKE MORE THAN 5 MINUTES to consider your database designs! It's too easy to "get it done right now" and leave it to someone else to fix later, if ever. Technical debt compounds faster than credit card debt.

** This is based on a model I've seen in my career, with about 130 "types" of things that all have an associated translation table attached.

*** And if you think that technical debt is unavoidable, or not that bad, consider working at Facebook: http://www.darkcoding.net/software/facebooks-code-quality-problem/