Most Valuable Yak (Rob Volk) Blog

…and other neat SQL Server tricks

Regular Expressions are cool…

I recently had to modify tables to add some auditing columns and triggers. Being a good fair to middling crazed, insane DBA and wanting to ensure data integrity and good design, I made the columns not nullable, and therefore also had to add defaults for these new columns:

ALTER TABLE [dbo].[MyTable100]
    ADD [DateCreated]  DATETIME     CONSTRAINT [DF_MyTable_DateCreated] DEFAULT (getdate()) NOT NULL,
        [DateModified] DATETIME     CONSTRAINT [DF_MyTable_DateModified] DEFAULT (getdate()) NOT NULL,
        [UserCreated]  VARCHAR (64) CONSTRAINT [DF_MyTable_UserCreated] DEFAULT (suser_sname()) NOT NULL,
        [UserModified] VARCHAR (64) CONSTRAINT [DF_MyTable_UserModified] DEFAULT (suser_sname()) NOT NULL;

Unfortunately by making them not nullable, the ALTER statement has to rewrite all the data pages to insert the default values.  This kinda-sorta locked down the table(s) and made the database unresponsive for about, oh, 2 hours or so.  (Stooooooooopid multi-million row tables!)  It also had the side effect of making months worth of transactions appear as if they were created instantly by the same user.  For some odd reason my boss felt neither circumstance was a good idea to deploy to production.

Happily the solution to both problems was to make the new columns nullable:

ALTER TABLE [dbo].[MyTable]
    ADD [DateCreated]  DATETIME      NULL,
        [DateModified] DATETIME      NULL,
        [UserCreated]  VARCHAR (64)  NULL,
        [UserModified] VARCHAR (64)  NULL;

And add the constraints afterwards:

ALTER TABLE [dbo].[MyTable]
    ADD CONSTRAINT [DF_MyTable_DateCreated] DEFAULT (getdate()) FOR [DateCreated],
        CONSTRAINT [DF_MyTable_DateModified] DEFAULT (getdate()) FOR [DateModified],
        CONSTRAINT [DF_MyTable_UserCreated] DEFAULT (suser_sname()) FOR [UserCreated],
        CONSTRAINT [DF_MyTable_UserModified] DEFAULT (suser_sname()) FOR [UserModified];

The problem was turning the original, single statement into the 2 new statements, without doing it by hand.  (Did I mention there were a few hundred tables in 8 different databases?)  Playing around a bit I discovered that this syntax works perfectly:

ALTER TABLE [dbo].[MyTable]
    ADD [DateCreated]  DATETIME      NULL, CONSTRAINT [DF_MyTable_DateCreated] DEFAULT (getdate()) FOR [DateCreated],
        [DateModified] DATETIME      NULL, CONSTRAINT [DF_MyTable_DateModified] DEFAULT (getdate()) FOR [DateModified],
        [UserCreated]  VARCHAR (64)  NULL, CONSTRAINT [DF_MyTable_UserCreated] DEFAULT (suser_sname()) FOR [UserCreated],
        [UserModified] VARCHAR (64)  NULL, CONSTRAINT [DF_MyTable_UserModified] DEFAULT (suser_sname()) FOR [UserModified];

Wow, that's pretty close to the original!  (And it runs instantly since it only requires schema locks on the table)  It's still annoying because I have to copy the new column name and paste it at the end of the DEFAULT definition.

Normally this is where I break out my 1337 4@X0r skillz and INFORMATION_SCHEMA and generate the SQL I need.  Sadly the cruel, cold mistress named CHANGE CONTROL had other plans:  we have to build from source and deploy with zero changes to the script.  Minor changes are allowed but must be exactly repeatable/reproducible.  (And also checked into source control, but that's for another time.)

If you've ever played with regular expressions (regex) you'll know that it's not hard to reformat the first statement into the last.  If you haven't used regex before, or always get frustrated by them, hopefully this example will convince you to spend more time with them.  SQL Server Management Studio has a find/replace feature that can use regex.  Here's what I used to get the job done (each regex follows after the "-> "):

   Find-> {.*}{\[.*\]}{.*}{CONSTRAINT}{.*}{NOT NULL}{.*}
Replace-> \1\2\3 NULL, CONSTRAINT\5FOR \2\7 

The Find expression breaks up each line into the following groups, using { and } to denote each:

1. All characters before left square bracket [ -> {.*}
2. All characters between (and including) square brackets -> {\[.*\]}


3. All characters between right square bracket ] and the word CONSTRAINT -> {.*}


4. The word CONSTRAINT -> {CONSTRAINT}


5. All characters between CONSTRAINT and NOT NULL -> {.*}


6. The words NOT NULL -> {NOT NULL}


7. All characters after NOT NULL -> {.*}

The Replace expression uses backreferences (\1, \2, etc.) to do the following:

1. Rebuild the first 3 captured groups exactly as before -> \1\2\3
2. Add a space and the word NULL ->  NULL


3. Add a comma, space, and the word CONSTRAINT -> , CONSTRAINT


4. Rebuild the 5th captured group (all characters between CONSTRAINT and NOT NULL) -> \5


5. Add the FOR keyword and a space -> FOR


6. Add the 2nd captured group, which is the column name and its enclosing brackets -> \2


7. Add the remaining characters captured after NOT NULL -> \7

Try it out for yourself!  You can, of course, do a lot more with regular expressions than this, but it's a good introduction for a relatively common problem. In case you're wondering, the find/replace ran in about 15-20 seconds, and the original 2 hour deployment now takes less than 1 minute.

One final note:  if you've used regex in Perl or some other language, you'll probably wonder why I'm capturing groups with {} instead of the standard () characters.  I wonder that too; my only explanation is Microsoft are idiots hate Unix and Perl people have their own unexplained reasons for using non-standard characters. It's not the first time.

Enjoy!

Legacy Comments


Andrew
2010-08-23
re: Regular Expressions are cool...
Yeah, I'm a big PCRE (Perl Compatible Regular Expressions) fan from my perl hacking days in college and am confounded when I come across regex implementations that are different. I'm okay with .NET adding named back-references or additional characters types. I can easily opt-out of using those. But little things like braces instead of parens throw off my mojo so much I don't use SSMS/VS. I just copy out to one of many PCRE tools, do the replacement there, and paste back into SSMS/VS. Or write a little perl script to operate on the source controlled files directly.

My tool of choice these days is The Regex Coach, which amazingly enough, is written in LISP. I wasn't aware the language was still in use, much less had GUI libraries. It even does a step-by-step replacement for when I get dumb and need to debug.

But yeah, I have a few stored perl scripts mostly for changing T-SQL style, like converting the auto-generated CREATE TABLE statements from SQL into something I consider more legible.