byrmol Blog

Garbage

Towards a Point Array constraint

In my last blog, I asked the question for breaking 1NF for a particular situation involving Points.

Graz hit the nail on the head with the question about the relevance of the internal data components to the outside world.  Type vs Entity is always a case by case evaluation ... So here is this one

The “base“ object stores 2 Points.  All of the initial sub classes of this base object follows this representation.  Initially I decomposed a Point to columns X and Y, giving columns X,Y,Xi,Yi for the objects spatial position.  So from the get go I am contaminated by implementation issues screwing the model and my mind in the process. I think it is because of this that I have chosen an entity representation for a Point. 

Enter the free-hand drawing object.... Not 2 points but n points where n >=2.  So there I was locked in a entity mindset thinking “I'll have to create another table just for this object with a point per row.  Poor MSDE“.  Soon after I blogged.

One comment in the feedback was a simple question “So, how does this problem relate to 1NF?“, coupled with Graz's comment it forced me to rethink the whole representation. 

I have chosen instead to represent the spatial information for all objects as a PointArray type.  In SQL Server (none of this “it will be there in the next version” crap thank you very much!) this is a bit of work.

Dave's Rules of the constraint game:

  1. Nothing but TSQL
  2. Must be declarative enforced
  3. No external processes (sp_OA% I am looking at you!)

A COMMA is used to delimit the X and Y values and COLON is used to delimit Points in the array. eg 12,34:54,36: represents 2 Points.

Some helpers function first..You'll need a Numbers table (n)

CREATE FUNCTION dbo.StringCount
(
@String VARCHAR(8000),
@CountString VARCHAR(10)
)
RETURNS INT
AS
BEGIN
DECLARE @Return INT
SET @Return = LEN(@String) - LEN(REPLACE(@String,@CountString,''))
RETURN @Return
END

GO

CREATE FUNCTION RowParser
(
@Text TEXT,
@Separator VARCHAR(3)
)
RETURNS TABLE
AS
RETURN
(
SELECT n,
  SUBSTRING(@Text, n,
  CASE SIGN(CHARINDEX(@Separator, @Text,n)-n)
   WHEN -1 THEN
     CASE PATINDEX('%,%', SUBSTRING(@Text, n, ABS(CHARINDEX(@Separator, @Text,n)-n)))
      WHEN 0 THEN DATALENGTH(@Text)-n+1
      ELSE PATINDEX('%,%', SUBSTRING(@Text, n, ABS(CHARINDEX(@Separator, @Text,n)-n))) -1
     END
   ELSE ABS(CHARINDEX(@Separator, @Text,n)-n)
   END) AS Data
FROM Numbers
WHERE n BETWEEN 0 AND (DATALENGTH(@Text) - DATALENGTH(@Separator))
AND ((SUBSTRING(@Text,n-DATALENGTH(@Separator),DATALENGTH(@Separator)) = @Separator AND n > 1) OR n = 1)
)

Here is the function that will be used in the CHECK constraint for this type. 

CREATE FUNCTION dbo.IsPointArray
(
@PointArray VARCHAR(8000)
)
RETURNS BIT
AS
BEGIN
DECLARE @Return BIT
SET @Return = 0
--No Letters, Same number of COMMA as COLON
-- Start with a number and ends with a COLON
-- No consecutive delimiters
-- 2 or more points
IF @PointArray NOT LIKE '%[^0-9,:]%' AND dbo.StringCount(@PointArray,',') = dbo.StringCount(@PointArray,':')
 AND @PointArray LIKE '[0-9]%:' AND @PointArray NOT LIKE '%[,:][,:]%'
 AND dbo.StringCount(@PointArray,':') >=2
BEGIN
 --Add 2nd IF to overcome no short-circuit evaluation
 --Parse the PointArray into Point Rows
 --If more than 1 COMMA is found in any row the delimiter sequence is out of step
 IF NOT EXISTS (SELECT 1 from dbo.RowParser(@PointArray,':')
   WHERE dbo.StringCount(Data,',') != 1)
  SET @Return = 1
END
RETURN @Return
END
go

declare @CSV VARCHAR(8000)
--SET @CSV = '1,2,3:5,4:4:5,23:' --Bad
SET @CSV = '22,354:45,23:2345,465:' --Good
--SET @CSV = '22,354:' --Bad
Select dbo.IsPointArray(@CSV)

Not exactly elegant, but in the initally testing I have done, appears to constrain the input enough.  Obviously the front end client will have a regex expression to do some validation but the DBA inside me says “Trust no one.. especially you Dave.. Oh yeah and what the hell is a colour blind DBA doing designing front end graphics?”

I'd love to here of a simpler method that follows Dave's rules...

Legacy Comments


mkbosmans
2005-02-08
re: Towards a Point Array constraint
Yeah, this is kinda what I meant when asking “So, how does this problem relate to 1NF?“
You just create your own 'datatype' that encapsulates the whole freehand object and you don't have to worry about atomicity etc. anymore. Perhaps this is even better enforced as a user defined datatype and rules. I suppose it could be done with little effort, along the lines Jeff points out: http://weblogs.sqlteam.com/jeffs/archive/2004/12/02/2959.aspx

robvolk
2005-02-09
re: Towards a Point Array constraint
I suppose I'm being curmudgeony, but what happens if you need to store splines, curves, etc., that have control points as well as endpoints? You'd almost certainly need a new type anyway, but it might be something to think about for your data model.

Then of course there's always polar coordinates :)

Adam Machanic
2005-02-14
re: Towards a Point Array constraint
If you figure out a good way to represent curves and splines (good meaning efficient) you stand to make a lot of cash in the GIS biz...