Regular Expression Replace in SQL 2005 (via the CLR)
I had to do some data clean up the other day, and really needed some regular expression replacements to do the job.
Since .NET has a great RegularExpressions namespace, and since SQL 2005 allows you to integrate .NET CLR functions in your T-SQL code, I thought I'd go ahead and experiment with creating a RegExReplace() function.
I am not so sure that I recommend using a function like this in production (there's lots of pros and cons of CLR integration in SQL databases), but for data cleaning or quick tasks or just learning how to use new features or technology, it is very interesting and easy to do. All you need is a SQL Server 2005 database (Express is fine) and Visual Studio 2005.
Open up Visual Studio 2005 and create a new SQL Server Project, and after giving it a name and location, you will be prompted to connect to the SQL Server 2005 database in which you'd like to add your code.
Once the project is created, choose Project->Add User Defined Function, and name the .cs file anything you like, such as "RegExFunction.cs".
Once the file has been added to your project, open it up and paste in the following code (changes made to the original template are in bold):
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Text.RegularExpressions;
public partial class UserDefinedFunctions
{
[Microsoft.SqlServer.Server.SqlFunction(IsDeterministic=true,IsPrecise=true)]
public static SqlString RegExReplace(SqlString expression, SqlString pattern, SqlString replace)
{
if (expression.IsNull || pattern.IsNull || replace.IsNull)
return SqlString.Null;
Regex r = new Regex(pattern.ToString());
return new SqlString(r.Replace(expression.ToString(), replace.ToString()));
}
};
It's really quite simple; within the class definition, just define public static methods that accept and return SQLTypes, and if those methods are marked with the SqlFunction attribute, when deployed they become available in your database code as T-SQL User-Defined Functions! Quite cool.
In this example, our function is accepting 3 SQLString parameters, and if any are null, we return null. If they are all legit, we construct a RegEx object from the pattern passed in, do the replace, and return the result. Note that this will not be especially efficient, since the RegEx object is created and destroyed for each call, but it does work and it is interesting at the very least to play around with. You might also want to experiment with other options, such as ignoring whitespace or case sensitivity, provided by the RegEx class. This particular code is very basic, and doesn't handle error checking or anything like that, you may wish to make improvements or optimizations in your own implementation.
Now that your code is ready to go, choose Build->Deploy Solution. If all goes well, your assembly and new function have been deployed to your SQL database!
There is one final thing you must do before you can use the function, and that is configure your server to allow CLR code to execute, if it hasn't been configured already. To do this, you must execute the following T-SQL statement:
followed by either a server stop/re-start, or executing:
Once that is complete, you can now use your new function like any other User Defined T-SQL function. For example,
-------------------------
137
(1 row(s) affected)
Now you can do a standard Regular Expression Replacement within your database directly, for example as an UPDATE:
SET MessyColumn = dbo.RegExReplace(MessyColumn, ... , ....)
WHERE ...
Here's my two cents on using CLR code in a database: If the code is purely a generic function or tool that has nothing specific to do with your data, and it fits and works logically in a database querying language, and there is no way to efficiently implement that code in T-SQL, then it may be worthwhile to implement that function via the CLR. This is a pretty good example. A bad example would be a .NET function that returns a CustomerName when passed a customerID, or something along those lines. That's just my take on things, for what it's worth.
So, use wisely and have fun!
More Resources:
- .NET Regular Expressions Cheatsheet
- SQL CLR Blog: More Regular Expression example functions
- MSDN: Using CLR Integration in SQL Server 2005
- SQL Team Article: Regular Expressions in SQL Server 2000
- SQL Server 2005: CROSS APPLY
- Taking a look at CROSS APPLY
- SQL Server 2005: Specifying Partitions for Aggregate Functions
- SQL Server 2005: Using EXCEPT and INTERSECT to compare tables
- Regular Expression Replace in SQL 2005 (via the CLR)
- More with SQL Server 2005 : Top n Per Group, Paging, and Common Table Expressions
- SQL Server 2005: Using PARTITION and RANK in your criteria
- Lots of great SQL Server 2005 Express Downloads
Legacy Comments
Kamran
2007-09-18 |
re: Regular Expression Replace in SQL 2005 (via the CLR) Thanks alot |
Randy
2007-09-26 |
re: Regular Expression Replace in SQL 2005 (via the CLR) thanks. just getting started in .net need some simple examples. |
Chanrith
2007-11-15 |
re: Regular Expression Replace in SQL 2005 (via the CLR) This was good. Thanks! I agree that this should be used very seldomly and really is for things that you would normally have to bend over backwords to achieve in traditional T-SQL. Definitely not something to use for on the fly query calls but using them to run aggregated updates or inserts is perfect. |
sove
2008-02-11 |
re: Regular Expression Replace in SQL 2005 (via the CLR) thanks. just getting started in .net need some simple examples. |
Jason
2008-09-26 |
re: Regular Expression Replace in SQL 2005 (via the CLR) This helped alot with a problem we had. We were using Extended SP's to call VBScript.RegEx, but had problems with strings over 4000 characters. But we did notice using the CLR seems to be much slower. 50k rows, and doing a replace took 3:48 while the old Extended SP took 2:12. Any hints on optimizing this (I personally have never coded in C# yet, so this is my first stab into this). I did instantiate the object as a compiled (Regex r = new Regex(pattern.ToString(), RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace | RegexOptions.Compiled);) |
Metpor Tasarim Sihirbazi
2009-03-07 |
re: Regular Expression Replace in SQL 2005 (via the CLR) thanks alot |
Elmanuel Minh Duong
2009-06-12 |
re: Regular Expression Replace in SQL 2005 (via the CLR) # Hello Every Body !!!! i'm a new member, your can help me !!!! Regular Expression Replace in database. with webform application. ex: i have 2 table : tblTag (tagID, tagvalue,ColumnName,tableName) and tblProfile(Username, address,birthday, tel, fisrtName, lastname, email....). Now, i want to replace record in database. when members upload file word from client to server. Ex: Userneme :[username] =>>> save data to database flow record. with value had user enter it. Address :[address]=>>.______________________________________. So, i can't understand..you can help me. Thank's you so much |
Aravamudan
2009-07-30 |
re: Regular Expression Replace in SQL 2005 (via the CLR) Thank you very much. It helped a lot :) |
Peter
2009-09-07 |
re: Regular Expression Replace in SQL 2005 (via the CLR) Is it also posible to do this only in a query? |
diancitie
2010-02-02 |
re: Regular Expression Replace in SQL 2005 (via the CLR) Thanks for article. Keep up sharing. Great article.Thanks. |
Jason Clark
2010-04-08 |
re: Regular Expression Replace in SQL 2005 (via the CLR) HUGE help on a project I'm working with! Thanks for sharing the code! |
Dan
2010-06-23 |
re: Regular Expression Replace in SQL 2005 (via the CLR) Can this be modified to connect to different servers across environments? By this I mean if I develop on an enterprise DEV server and set that connection string per your instructions, can I then choose a different server in code based on Configuration string? Thanks. |
Juan Zamora
2010-08-19 |
re: Regular Expression Replace in SQL 2005 (via the CLR) Hi I made some modifications to your code since it does not allow to use the regex with more than 4000 characters on database. Best Regards, Juan [Microsoft.SqlServer.Server.SqlFunction(IsDeterministic = true, IsPrecise = true)] public static SqlChars RegExReplace([SqlFacet(MaxSize = -1)]SqlString expression, SqlString pattern, SqlString replace) { if (expression.IsNull || pattern.IsNull || replace.IsNull) return SqlChars.Null; Regex r = new Regex(pattern.ToString()); return new SqlChars(r.Replace(expression.ToString(), replace.ToString())); } |