I had to do some data cleanup the other day, and really needed some regular expression replacements to do the job. Since .NET has a great RegularExpressions namespace, and since SQL 2005 allows you to integrate .NET CLR functions in your T-SQL code, I thought I'd go ahead and experiment with creating a RegExReplace() function.
I am not so sure that I recommend using a function like this in production (there are lots of pros and cons of CLR integration in SQL databases), but for data cleaning, quick tasks, or just learning how to use new features or technology, it is very interesting and easy to do. All you need is a SQL Server 2005 database (Express is fine) and Visual Studio 2005.
Open up Visual Studio 2005 and create a new SQL Server Project, and after giving it a name and location, you will be prompted to connect to the SQL Server 2005 database in which you'd like to add your code.
Once the project is created, choose Project->Add User Defined Function, and name the .cs file anything you like, such as "RegExFunction.cs".
Once the file has been added to your project, open it up and paste in the following code (changes made to the original template are in bold):
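using System.Data.SqlTypes;
using System.Text.RegularExpressions;
using Microsoft.SqlServer.Server;
public partial class UserDefinedFunctions
{
    [SqlFunction]
    public static SqlString RegExReplace(SqlString expression, SqlString pattern, SqlString replace)
    {
        // Any NULL argument yields a NULL result.
        if (expression.IsNull || pattern.IsNull || replace.IsNull) return SqlString.Null;
        Regex r = new Regex(pattern.ToString());
        return new SqlString(r.Replace(expression.ToString(), replace.ToString()));
    }
}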
It's really quite simple: within the class definition, just define public static methods that accept and return SqlTypes. If those methods are marked with the SqlFunction attribute, they become available in your database code as T-SQL User-Defined Functions once deployed! Quite cool.
In this example, our function accepts three SqlString parameters, and if any are null, we return null. If they are all legit, we construct a Regex object from the pattern passed in, do the replace, and return the result. Note that this will not be especially efficient, since the Regex object is created and destroyed for each call, but it does work, and it is interesting at the very least to play around with. You might also want to experiment with other options provided by the Regex class, such as ignoring whitespace in the pattern or ignoring case. This particular code is very basic and doesn't do any error checking (an invalid pattern, for instance, makes the Regex constructor throw), so you may wish to make improvements or optimizations in your own implementation.
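To make the ideas about options and error checking a little more concrete, here is a rough sketch of what such a variant might look like. The name RegExReplaceEx and the ignoreCase parameter are my own inventions for illustration, and returning NULL for an invalid pattern is just one possible choice. It uses the static Regex.Replace overload, which in .NET 2.0 can reuse recently parsed patterns from the Regex class's internal cache rather than rebuilding them on every call.

```csharp
using System;
using System.Data.SqlTypes;
using System.Text.RegularExpressions;
using Microsoft.SqlServer.Server;

public partial class UserDefinedFunctions
{
    // Hypothetical variant of RegExReplace: the name and the ignoreCase
    // parameter are illustrative, not part of the original function.
    [SqlFunction]
    public static SqlString RegExReplaceEx(SqlString expression, SqlString pattern,
                                           SqlString replace, SqlBoolean ignoreCase)
    {
        // Same rule as before: any NULL string argument yields NULL.
        if (expression.IsNull || pattern.IsNull || replace.IsNull)
            return SqlString.Null;

        // IsTrue is false for a NULL flag, so NULL means "case-sensitive".
        RegexOptions options = RegexOptions.None;
        if (ignoreCase.IsTrue)
            options |= RegexOptions.IgnoreCase;

        try
        {
            // Static overload: no Regex instance is constructed by our code.
            return new SqlString(
                Regex.Replace(expression.Value, pattern.Value, replace.Value, options));
        }
        catch (ArgumentException)
        {
            // Assumption: an invalid pattern returns NULL instead of
            // failing the whole query. Rethrowing is equally defensible.
            return SqlString.Null;
        }
    }
}
```

Whether swallowing a bad pattern is a good idea depends on the job; for a one-off cleanup you may well prefer the error so that a typo in the pattern doesn't silently NULL out a column.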
Now that your code is ready to go, choose Build->Deploy Solution. If all goes well, your assembly and new function have been deployed to your SQL database!
There is one final thing you must do before you can use the function, and that is to configure your server to allow CLR code to execute, if it hasn't been configured already. To do this, you must execute the following T-SQL statement:
sp_configure 'clr enabled',1
followed by either a server stop/re-start, or executing:
RECONFIGURE
Once that is complete, you can now use your new function like any other User Defined T-SQL function. For example,
(1 row(s) affected)
Now you can do a standard Regular Expression Replacement within your database directly, for example as an UPDATE:
SET MessyColumn = dbo.RegExReplace(MessyColumn, ... , ....)
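Before pointing an UPDATE like this at real data, it can be worth trying the pattern on a few sample values outside the database first. Here is a small sketch of how that might look in plain C#; the class name, the two patterns, and the sample values mentioned afterwards are all made up for illustration and have nothing to do with any particular table.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for previewing cleanup patterns on sample strings.
public static class PatternPreview
{
    // Replaces each run of whitespace with a single space, then trims the ends.
    public static string CollapseWhitespace(string value)
    {
        return Regex.Replace(value, @"\s+", " ").Trim();
    }

    // Removes every character that is not an ASCII digit.
    public static string DigitsOnly(string value)
    {
        return Regex.Replace(value, @"[^0-9]", "");
    }
}
```

For instance, PatternPreview.CollapseWhitespace("  Acme   Corp \t Ltd ") gives "Acme Corp Ltd", and PatternPreview.DigitsOnly("(555) 123-4567") gives "5551234567". Once a pattern behaves the way you expect, the same pattern and replacement strings can be passed to dbo.RegExReplace as its second and third arguments.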
Here's my two cents on using CLR code in a database: If the code is purely a generic function or tool that has nothing specific to do with your data, and it fits and works logically in a database querying language, and there is no way to efficiently implement that code in T-SQL, then it may be worthwhile to implement that function via the CLR. This is a pretty good example. A bad example would be a .NET function that returns a CustomerName when passed a customerID, or something along those lines. That's just my take on things, for what it's worth.
So, use wisely and have fun!