Brett Kaiser (x002548) Blog

Not Just a Number - Brett Kaiser

Database Design based on existing data

I really like to avoid doing database design based on existing data structures.  I really prefer to start with a functional flow of the business, target the data (be it existing electronic data, memo;s, post it's, whatever, data is data) for each function, organize all the data across functions into general entities, then take all of the data elemets or attributes and associate them with the higher level Entities.  From all of this, I then would form a logical data model, develop a data dictionary and present it to the the Clients for accuracy.  Then worry about the physical implementation.

BUT!  Sometimes that's not possible.  BUT! Good Normalization rules still apply.  In this post, macca asked a question. that based on an existing data file, how would you build a database.  Since there are 34 unique “Things” should they build that many tables?  Well, in my opinion, the answer was no.  It should be 1 table (as we find out later though, it's really 2).  The interesting part about this problem was the normalization of the data and the code that breaks it up into it's normalized componenets.

How data ends up so denormalized in the first place always suprises me. I mean I understand X400 directory structures and the like as an original “slick” pre XML days to tag data, but it's gonna be really hard for me to accept this, or to accept XML.  I just don't see how it beats the relational model.  I mean how dynamic do you need to be?

In any case, this problem caused me more time than usual to provide the poster a solution, so I figured I'd post it.  Hey Thursday was a foggy day.  Knocked it off this morning in relative short order, but Thursday I just couldn't “see” it.  Kinda like smashing your face on to your desk over and over...anyway....enjoy

USE Northwind
GO

SET NOCOUNT ON
CREATE TABLE myStage99(RowNum int IDENTITY(1,1), col1 varchar(8000))
GO

INSERT INTO myStage99(Col1)
SELECT 'one/a/001, one/a/002, one/a/500' UNION ALL
SELECT 'one/b/001, one/b/002, one/b/500' UNION ALL
SELECT 'one/c/001, one/c/002, one/c/500' UNION ALL
SELECT 'one/d/001, one/d/002, one/d/500' UNION ALL
SELECT 'one/e/001, one/e/002, one/e/500' UNION ALL
SELECT 'one/f/001, one/f/002, one/f/500' UNION ALL
SELECT 'two/a/001, two/a/002, two/a/500' UNION ALL
SELECT 'two/b/001, two/b/002, two/b/500' UNION ALL
SELECT 'two/f/001, two/f/002, two/f/500' UNION ALL
SELECT 'seven/a/001, seven/a/002, seven/a/500' UNION ALL
SELECT 'seven/b/001, seven/b/002, seven/b/500' UNION ALL
SELECT 'seven/f/001, seven/f/002, seven/f/500'
GO

CREATE TABLE mySection_SubSection99([Section] varchar(20), SubSection char(1), SubSectionNum char(3))
GO

DECLARE   @RowNum int, @MAX_RowNum int, @s int, @e int, @col1 varchar(8000), @parse varchar(24)
 , @Section varchar(20), @SubSection char(1), @SubSectionNum char(3)

SELECT @RowNum = 1, @MAX_RowNum = MAX(RowNum), @s = 1, @e = 1 FROM myStage99

WHILE @RowNum <= @MAX_RowNum
  BEGIN
 SELECT @Col1 = Col1 FROM myStage99 WHERE RowNum = @RowNum
 SELECT @e = CHARINDEX(',',Col1,@s)-1 FROM myStage99 WHERE RowNum = @RowNum
 WHILE @e > -1
   BEGIN
  SELECT @Parse = SUBSTRING(Col1, @s, @e-@s+1) FROM myStage99 WHERE RowNum = @RowNum
  
  INSERT INTO mySection_SubSection99([Section], SubSection, SubSectionNum)
  SELECT    SUBSTRING(@Parse,1,CHARINDEX('/',@Parse)-1) AS [Section]
   , SUBSTRING(@Parse,CHARINDEX('/',@Parse)+1,1) AS SubSection
   , RIGHT(@Parse,3) AS SubSectionNum

  SELECT @s = @e + 3
  SELECT @e = CHARINDEX(',',Col1,@s)-1 FROM myStage99 WHERE RowNum = @RowNum
   END

 SELECT @Parse = REVERSE(SUBSTRING(REVERSE(Col1), 1, CHARINDEX(',',REVERSE(Col1))-2))
   FROM myStage99 WHERE RowNum = @RowNum

 INSERT INTO mySection_SubSection99([Section], SubSection, SubSectionNum)
 SELECT    SUBSTRING(@Parse,1,CHARINDEX('/',@Parse)-1) AS [Section]
  , SUBSTRING(@Parse,CHARINDEX('/',@Parse)+1,1) AS SubSection
  , RIGHT(@Parse,3) AS SubSectionNum

 SELECT @RowNum = @RowNum + 1, @s = 1, @e = 1
  END
GO

CREATE TABLE [dbo].[mySection99] (
 [Section] [varchar] (20) NOT NULL PRIMARY KEY
) ON [PRIMARY]
GO

INSERT INTO mySection99([Section])
SELECT DISTINCT [Section] FROM mySection_SubSection99
GO

ALTER TABLE dbo.mySection_SubSection99 ADD CONSTRAINT
 FK_mySection_SubSection99_mySection99 FOREIGN KEY
 (
 [Section]
 ) REFERENCES dbo.mySection99
 (
 [Section]
 )
GO

SELECT * FROM mySection99
GO

SELECT * FROM mySection_SubSection99
GO

DROP TABLE myStage99
DROP TABLE mySection_SubSection99
DROP TABLE mySection99
GO