Monday, June 15, 2009
I’ll be talking about the following topics:
What should developers know about database design so they don't have performance and logical problems? What's new in SQL Server 2008 that helps solve some business problems that sometime required "hacking" before. Concurrency design models and isolation levels.
The presentation will start on Tuesday June 16th at 2:00 PM EST / 6:00 PM UTC / 8:00 PM CET
More info at PASS Application Development SIG
UPDATE: you can view the presentation here.
Friday, June 05, 2009
There are times when you want to have one .cs file in multiple projects. However if you do Add existing item you’ll notice that the file is copied to each project’s folder. This is not what we want.
The solutions is of course pretty simple once you know where to look. In the Add existing item dialog you have to add the file as a Link as is shown on the picture:

And there you go. you can have one file in multiple projects.
Friday, May 29, 2009
I’ve written two articles on SQLTeam.com about a great new feature in SQL Server 2008 called Extended Events. They are the new low level, high performance eventing system in SQL Server. They use less system resources and provide better tracking of SQL Server performance than previous methods like Perfmon and SQL Trace/Profiler events.
1. Introduction to SQL Server 2008 Extended Events
This is an introductory article where we take a look at performance troubleshooting and system monitoring and what they lack in previous SQL Server versions. After that we get to know the Extended Events architecture, new terminology and we try them out with a simple example.
2. Advanced SQL Server 2008 Extended Events with Examples
Part 2 of the series takes a look at some performance considerations we must be aware of when using Extended Events like synchronous or asynchronous target target, predicate short circuiting and event action size. After we take a look at 7 examples that show a different way we can use them to troubleshoot our system. Examples are meant to show Extended Events power and give some ideas on how to use them for more advanced monitoring.

Thursday, May 07, 2009
Recently i was looking for a SQL Server 2008 based template that would look cool. Since I haven’t found anything I’ve made my own. It’s a dark background template that you can use freely.
Suggestions for improvements are welcome in the comments. Also if you have an idea for a light background based template, I’d love to hear it.
|
The title look
|
The slide look
|
|

|

|
Here you go: SQL Server 2008 Dark Template.
Enjoy it!
Tuesday, April 28, 2009
A while back we had an interesting problem at work. We were calculating MD5 hashes for some values in both .Net and SQL Server and although the input values were all the same our MD5 hashes were different. After some time spent looking dumbfounded at the code I’ve realized what the bug was and started laughing. I saw that HashBytes function was the “culprit”. HashBytes function was introduced in SQL server 2005 to simplify creating hashes in the database. It can convert values to MD2, MD4, MD5, SHA, or SHA1 formats. HashBytes' result depends on the input text’s data type and can at the first glance give a bit different results that we might think, especially when combined with the .Net framework. Of course hashing itself works correctly in both. Let’s see what going on with some examples. From here on when i say hash i mean MD5 hash.
Example in SQL Server
-- note the size 15 of all datatypes
declare @val1 varchar(15),
@val2 nvarchar(15),
@val3 char(15),
@val4 nchar(15)
-- example of 1 byte/char text
-- all variables are of same length
select @val1 = '1234567890',
@val2 = N'1234567890',
@val3 = '1234567890',
@val4 = N'1234567890'
-- all 4 return different results
select HASHBYTES('md5', @val1) as MD5_varchar, -- result = 0xE807F1FCF82D132F9BB018CA6738A19F
-- just to show that collation doesn't change the hash
HASHBYTES('md5', @val1 collate Cyrillic_General_BIN2) as MD5_varchar_collation, -- result = 0xE807F1FCF82D132F9BB018CA6738A19F
HASHBYTES('md5', @val2) as MD5_Nvarchar, -- result = 0xE15E31C3D8898C92AB172A4311BE9E84
HASHBYTES('md5', @val3) as MD5_char, -- result = 0x2120C3F3423F89BA8A65ABD933321884
HASHBYTES('md5', @val4) as MD5_Nchar -- result = 0x90DEF5840F3A31174CA44E2022F743B6
-- example of 2 bytes/char text
-- 中文 means Chinese in written text
-- converting 2 bytes/char text to varchar cuts the text in half
select @val1 = '中文', -- this cuts the text in half
@val2 = N'中文', -- this stores the whole text
@val3 = '中文', -- this cuts the text in half
@val4 = N'中文' -- this stores the whole text
-- all 4 return different results
select HASHBYTES('md5', @val1) as MD5_varchar, -- result = 0xEA03FCB8C47822BCE772CF6C07D0EBBB
HASHBYTES('md5', @val2) as MD5_Nvarchar, -- result = 0x73C6C8CD2F94355EF015E5265D5E65B1
HASHBYTES('md5', @val3) as MD5_char, -- result = 0xA13C45A38853677887B4839071537634
HASHBYTES('md5', @val4) as MD5_Nchar -- result = 0xEADEBD3BD72A481C43C828E0C550145C
The catch here is the data type difference. Nvarchar and nchar take twice more bytes to store data than varchar and char. Since char and nchar pad the right side of the string with spaces to fill the gap up to defined data type length the spaces are also used in MD5 calculation. A collation plays no part in calculating hash values since collations are only applied to sorts and comparisons.
However care must be taken with texts that need 2 bytes per char of storage space like Chinese text. Storing it in a varchar variable cuts it in half.
Example in .Net - C#
For generating MD5 hashes I’ve used the code from this site which turned up as the first result on Google for “MD5 in C#”:
public string GetMD5Hash(string input)
{
System.Security.Cryptography.MD5CryptoServiceProvider x = new System.Security.Cryptography.MD5CryptoServiceProvider();
//byte[] bs = System.Text.Encoding.ASCII.GetBytes(input);
//byte[] bs = System.Text.Encoding.UTF7.GetBytes(input);
//byte[] bs = System.Text.Encoding.UTF8.GetBytes(input);
byte[] bs = System.Text.Encoding.Unicode.GetBytes(input);
//byte[] bs = System.Text.Encoding.UTF32.GetBytes(input);
bs = x.ComputeHash(bs);
System.Text.StringBuilder s = new System.Text.StringBuilder();
foreach (byte b in bs)
{
s.Append(b.ToString("x2").ToLower());
}
string password = s.ToString();
return password;
}
// hashes for the simple “1234567890” text in all encodings
ASCII: e807f1fcf82d132f9bb018ca6738a19f
UTF7: e807f1fcf82d132f9bb018ca6738a19f
UTF8: e807f1fcf82d132f9bb018ca6738a19f
Unicode: e15e31c3d8898c92ab172a4311be9e84
UTF32: 6a57502c29a5081f03cb70e0ad38ecc7
// hashes for the complex “中文” text in all encodings
ASCII: ea03fcb8c47822bce772cf6c07d0ebbb
UTF7: eb02105e5c51a33f21e8da7f8102cfda
UTF8: a7bac2239fcdcb3a067903d8077c4a07
Unicode: 73c6c8cd2f94355ef015e5265d5e65b1
UTF32: 65fe91b81ed1107566f9f9f5ed4ccaf1
All strings in .Net store their chars in 2 bytes by default. When hashing values we have to take this into account and use proper text Encoding. .Net supports 5 encodings: ASCII (7 bits per char), UTF7 (7 bits per char), UTF8 (8 bits = 1 byte per char), Unicode (UTF-16) (16 bits = 2 bytes per char) and UTF32 (32 bits = 4 bytes per char).
For text with only first 127 chars in the ASCII table ASCII, UTF7 and UTF8 encodings all return the same hash, but with UTF16 and UTF32 comes endianness so they don’t return the same hash values. Also note that the ASCII and UTF7 encoding aren’t recommended to be used anymore except in legacy apps. UTF8 should be used instead.
For text with complex chars we have to use Unicode or UTF32 or we loose char information.
Putting .Net and SQL Server together
It turns out that such a simple thing as hashing can become a serious issue if we’re not careful. Trouble always awaits when dealing with text and encodings. :)
When using SQL Server’s varchar data type the .Net encoding to go with is UTF8 since it’s the fastest and most optimized of the three (ASCII, UTF7, UTF8). When using the nvarchar data type to go is Unicode (UTF16) but we also have to know the texts endianness to create correct hashes. UTF32 is practically useless in this case because SQL Server doesn’t have a data type that stores text in 4 bytes/char so we’ll never get the same results if we use it.
This advice only applies when creating hashes both in .Net and SQL server and comparing them. If we’re creating hashes in .Net and only store them in a database then we don’t have to worry about this.
Wednesday, April 08, 2009
Indexes are one of the most important database features. Without them your database will crawl under a table in fear of simple queries on large tables or complex queries on small tables. That’s why one of the most important things a DBA or a database developer should know is basic index maintenance.
Performance problems are usually result of missing indexes. Index tuning is more of an art than it is a science since possible workloads are so many that there is no one rule to rule them all. It’s a mixture of testing, implementing, retesting and reimplementing. For this purpose The SQL Server team has created the Database Tuning Advisor that takes a recorded workload (from SQL Profiler), analyzes it and suggests index improvements. However this is a tool for some really deep analysis, so we’ll take a look at some lightweight options.
On the other hand too many unused indexes bloat the database and unnecessarily increase the database size. This can cause problems with larger backups that take longer to create. This is why there’s a delicate balance to keep when maintaining indexes.
With SQL Server 2005 we got introduced to Dynamic Management Views (DMV’s). They are a powerful tool when troubleshooting SQL Server performance. Their downside is that most of them hold aggregated values since server start. That’s why we have to poll them twice at the start and at the end of the desired interval to get meaningful results. However for our purpose we don’t have to do this since we want to know which indexes are not used since our server started. We can find this information because when building an execution plan the SQL Server keeps information about every index it has used and could have used because it’s a cost based engine.
Finding missing indexes
Needed Dynamic Management Views (DMV’s)
As I said earlier SQL Server keeps data about possible missing indexes. If an actual execution plan is included when running a query then the missing indexes are also displayed there.
The data about missing indexes is stored in the following DMV’s which all exclude info about spatial indexes:
sys.dm_db_missing_index_groups
This DMV returns only 2 columns with information about which indexes are in which group.
sys.dm_db_missing_index_group_stats
This DMV returns information about each missing indexes group. It returns info like the estimated average impact or how many seeks, scans and compilations/recompilations would benefit from adding the missing indexes.
sys.dm_db_missing_index_details
This DMV returns detailed information about each missing index like table name that is missing an index and CSV’s of columns that the index would be beneficial on.
sys.dm_db_missing_index_columns
This a Dynamic Management Function (DMF) that accepts an index_handle parameter. It returns columns that should be in the suggested index identified with the index_handle that can be obtained from sys.dm_db_missing_index_details and sys.dm_db_missing_index_groups. It does not include spatial indexes. The column named column_usage returns info on how this column would benefit for a particular index. EQUALITY and INEQUALITY mean that the column would be used in a where clause predicate. INCLUDE means that the column should be an included column on an existing non-clustered index.
A simple example
USE AdventureWorks;
GO
-- run some queries
SELECT City, ModifiedDate
FROM Person.Address
WHERE StateProvinceID < 1000 AND ModifiedDate > DATEADD(yyyy, -5, GETDATE());
GO
SELECT City, StateProvinceID, PostalCode
FROM Person.Address
WHERE StateProvinceID = 15733;
GO
-- get the missing indexes that would be beneficial for speeding up above queries
SELECT D.index_handle, [statement] AS full_object_name, unique_compiles, avg_user_impact, user_scans, user_seeks, column_name, column_usage
FROM sys.dm_db_missing_index_groups G
JOIN sys.dm_db_missing_index_group_stats GS ON G.index_group_handle = GS.group_handle
JOIN sys.dm_db_missing_index_details D ON G.index_handle = D.index_handle
CROSS APPLY sys.dm_db_missing_index_columns (D.index_handle) DC
ORDER BY D.index_handle, [statement];
The DMV’s return raw data so you’ll have to do some string magic to build the CREATE INDEX statements out of it.
Limitations
This cool feature of course has some limitations. Some of those are:
-
Database Engine Tuning Advisor kicks its behind. Think of Missing Indexes DMV’s as a really lightweight DTA. So when doing a really in-depth index analysis, don’t rely only on the Missing Indexes DMV’s. Just start with them.
-
Missing Indexes DMV’s don’t provide any information about new index overhead like space or IO/CPU overhead on updates, inserts and deletes.
-
There’s no information about the column order in the suggested index or whether it should be clustered or non-clustered.
-
Missing Indexes DMV’s consider only per query indexes and not per workload indexes.
-
Missing Indexes DMV’s can track a maximum of 500 indexes.
-
Trivial execution plans (plans for really simple SQL Statements) are not considered.
Finding unused indexes
Since SQL Server keeps data of all used indexes, getting the unused indexes is a simple matter of comparing used indexes to all existing indexes. Those that exist but are not used are of course unused indexes.
SELECT OBJECT_SCHEMA_NAME(I.OBJECT_ID) AS SchemaName,
OBJECT_NAME(I.OBJECT_ID) AS ObjectName,
I.NAME AS IndexName
FROM sys.indexes I
WHERE -- only get indexes for user created tables
OBJECTPROPERTY(I.OBJECT_ID, 'IsUserTable') = 1
-- find all indexes that exists but are NOT used
AND NOT EXISTS (
SELECT index_id
FROM sys.dm_db_index_usage_stats
WHERE OBJECT_ID = I.OBJECT_ID
AND I.index_id = index_id
-- limit our query only for the current db
AND database_id = DB_ID())
ORDER BY SchemaName, ObjectName, IndexName
Note that both these methods are only useful if your server has been running long enough and has been through peek usage. The data in the mentioned DMV’s is cleared on the next instance restart. If we want to save this information for further analysis we should create a scheduled job that periodically queries the DMV’s and saves the information to tables.
Thursday, February 19, 2009
I was tagged by Denis Gobo in the latest blog meme: What was your first computer and games you’ve played?
Well mine was the Sinclair ZX Spectrum.
I got it when i was 9 years old and I only used it to play games. No programming for this kid just yet. That came a lot later.
This baby had a tape drive whose sound I really enjoyed imitating to annoy my mom :)
So what were some of the games? These are the four I remember:
Grand Prix Circuit
This was my first intro into the world of F1. I can still remember the “eeeeee cccccc beep beep beep” sounds of the engine and tires. Ahhhh, those were the days…
The Chessmaster 2000
My first chess program. I was taught chess at the age of 4 by my uncle and have played it more or less actively since. I remember staying up late for many of nights trying to beat it.
Frogger
The one, the only, the ultimate - the frog! Up, down, all around, left, right, jump with might! Such simplicity yet so playable. Even today I like to play a game of Frogger now and then.
Popeye
This one I enjoyed playing but i haven no clue why. Probably to bring out my romantic side as the game was filled with hearts and winning the lady. :)
That’s it.
I’m tagging Jeff Smith, Peter Larsson, Adam Machanic and Robert C. Cain to tell us what they had and played with.
Monday, February 09, 2009
This new version brings one completely new feature and a few improved old ones.
The new feature is Window Connection Coloring.
It includes a colored strip indicator that can be docked to any side of the window.
Improved features include:
You can also take a look at the full feature list.
Hope you enjoy the best free Add-In for SQL Server Management Studio out there!
And if you're feeling extra generous there's always PayPal. After all i do have to pay for hosting and stuff, right? :)
Saturday, January 31, 2009
On Saturday 31st January 2009 at around 14:40 UTC something “historical” happened. Google virtually stopped working. It might as well died. Every single search result was marked with “This site may harm your computer.”. And it redirected you to a warning page.
Of course this was probably because of some bug in their harmful content detection software or a failed upgrade or something similar. I doubt it was a virus. It doesn’t really matter why but for an hour I was forced to use Live search. Yahoo was just too much typing in the address bar. :)
However this brings on an interesting thought. What would happen if this search outage happened on a Wednesday which is usually the busiest week day. This would result in huge loss of revenue and traffic for everyone. If we just take a look at this blog: 95% of all traffic comes from Google searches. Yep… that’s 95% !! And others have pretty much the same deal. You do the numbers for other sites that have several tens of thousands of users per day. This was a worldwide failure. USA, Central and South America, Europe and Asia (I haven’t talked with anyone from the African continent). You name it, it failed.
And the best thing is that Google only provides a service you’re not committed to using so you can’t do anything about your lost revenue. And as much fun as this brought to people imagine the consequences on a larger scale.
Today around 1.5 billion people use the internet. Of them on average 70% use Google Search. That’s a bit over 1 billion people. So let’s make a really pessimistic estimate that each of those people spend $1 per day on the internet. They spend it because they found something they wanted to buy. Using equal hourly distribution for the day (earth spins :) and Google was down 1 hour) we get little less than $42 million loss. And remember that’s just for $1 per person per day. There are no real statistics I could find on this but i wouldn’t be surprised if the real estimate was 10 times higher. Some might argue that people would just go search somewhere else. Wrong. If you’re reading this then you’re not an average internet user. An average internet user is set in their way. if something doesn’t work they go away and forget about it.
I know Google’s slogan is Don’t be evil, but still… are you sure about that? Are you prepared to be totally dependent on them? What can you do about it? I know I have no clue, do you? They still provide me with 95% of traffic :)
On to Live search. I have to say that they’re improving. I actually found what i was looking for there.
But change is hard. It’s interesting to see how my eye and arm/mouse movements are synced with Google’s search layout. It simply hurts to switch. Ok… it doesn’t hurt but it’s still annoying.
Which begs the question: “Why change your search engine at all?” I know I have no other reason then when the primary one fails.
Search results were pretty much the same now that I can cross reference them on Google. Therefore if Microsoft has any plans to steal people form Google they’ll have to have some added value in their search. I have no clue as to what that might be, but I’m not a marketing guru. :)
And of course here’s a picture to prove the “historical” event for posterity with a bit of irony:

Monday, January 05, 2009
Comparing result sets
There are times, although not often that you have to compare 2 result sets. This usually happens when you're analyzing data for whatever reason or unit testing a database.
Usual methods include
- using the UNION of both queries because it filters duplicate data and checking if the row counts are the same
- using the EXCEPT if you have SQL Server 2005+
- using NOT EXISTS which is very ugly and not worth it
- using a FULL OUTER JOIN and looking at nulls or some other method I haven't mentioned
- returning result sets to client and comparing them there
- some other method I haven't mentioned
But for a while now I've been using this method using BINARY_CHECKSUM, CHECKSUM and CHECKSUM_AGG functions, since they are available in all versions of SQL Server. I prefer BINARY_CHECKSUM over CHECKSUM because BINARY_CHECKSUM is case sensitive. In a case-insensitive database CHECKSUM returns the same value for 'abcd' and 'ABCD' while BINARY_CHECKSUM does not.
Here's the code I use for result set comparison:
USE AdventureWorks
GO
IF (
SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*))
FROM (
-- first resultset to compare
SELECT *
FROM Person.Address
) t1
)
=
(
SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*))
FROM (
-- second resultset to compare
SELECT *
FROM Person.Address
WHERE AddressID < 1000
) t1
)
BEGIN
SELECT 'Resultsets equal'
END
ELSE
BEGIN
SELECT 'Resultsets NOT equal'
END
Deleting duplicates
Deleting duplicates becomes an easy task with this method both in SQL Server 2000 and 2005+:
CREATE TABLE #t1(ID INT NULL, VALUE VARCHAR(2))
INSERT INTO #t1(ID, VALUE) VALUES (1,'aa')
INSERT INTO #t1(ID, VALUE) VALUES (2,'bb')
INSERT INTO #t1(ID, VALUE) VALUES (1,'aa')
INSERT INTO #t1(ID, VALUE) VALUES (1,'aa')
INSERT INTO #t1(ID, VALUE) VALUES (3,'cc')
INSERT INTO #t1(ID, VALUE) VALUES (3,'cc')
GO
-- BINARY_CHECKSUM(<column names>): <column names> are columns that we want to compare duplicates for
-- if you want to compare the full row then change BINARY_CHECKSUM(<column names>) -> BINARY_CHECKSUM(*)
-- for SQL Server 2000+ a loop
-- save checksums and rowcounts for duplicates
SELECT BINARY_CHECKSUM(ID, VALUE) AS ChkSum, COUNT(*) AS Cnt
INTO #t2
FROM #t1
GROUP BY BINARY_CHECKSUM(ID, VALUE) HAVING COUNT(*)>1
DECLARE @ChkSum BIGINT, @rc INT
-- get the first checksum and set the rowcount to the count - 1
-- because we want to leave one duplicate
SELECT TOP 1 @ChkSum = ChkSum, @rc = Cnt-1 FROM #t2
WHILE EXISTS (SELECT * FROM #t2)
BEGIN
-- rowcount is one less than the duplicate rows count
SET ROWCOUNT @rc
DELETE FROM #t1 WHERE BINARY_CHECKSUM(ID, VALUE) = @ChkSum
-- remove the processed duplicate from the checksum table
DELETE #t2 WHERE ChkSum = @ChkSum
-- select the next duplicate rows to delete
SELECT TOP 1 @ChkSum = ChkSum, @rc = Cnt-1 FROM #t2
END
SET ROWCOUNT 0
GO
SELECT * FROM #t1
-- for SQL Server 2005+ a cool CTE
;WITH Numbered
AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY ChkSum ORDER BY ChkSum) AS RN, *
FROM (
SELECT BINARY_CHECKSUM(ID, VALUE) AS ChkSum
FROM #t1
) t
)
DELETE FROM Numbered WHERE RN > 1;
GO
SELECT * FROM #t1
DROP TABLE #t1;
DROP TABLE #t2;
It's fast, simple, clean, understandable and works like a charm. If you have a better way do let me know.
Monday, December 29, 2008
This is something I haven't noticed before but I think it's mighty cool!
When you choose to Include the Actual Execution Plan in SSMS 2008 it suggest any missing indexes it thinks that are needed. Plus it also calculates the performance benefit from adding them.
I'm guessing the number (green 84.7361 in the picture) is in percent... So it's like a Database Tuning Advisor "Lite" :))
And best of all this also works when you're connected to a SQL Server 2000 or 2005.
This is what it looks like:
I'm starting to like this new SSMS more and more.
Wednesday, December 03, 2008
I've written a second article here on SQL Team on how to schedule jobs in SQL Server 2005 Express.
Intro
In my previous article we saw how to make simple job scheduling in SQL Server 2005 Express work. We limited the scheduling to one time or daily repeats. Sometimes this isn't enough. In this article we'll take a look at how to make a scheduling solution based on Service Broker worthy of the SQL Server Agent itself including run once, daily, weekly, monthly absolute and relative scheduling:
Scheduling Jobs in SQL Server Express - Part 2
And part 1 of this series:
Scheduling Jobs in SQL Server Express - Part 1
Monday, November 03, 2008
The victim of my carnage being a connection to the SQL Server 2005.
How many times have you issued a query that you had no idea how long it would run, just to loose your patience after 10 minutes and rolled it back, forgetting that it'll probably take 10+ minutes to rollback anyway?
Come on don't be shy... raise your hand... I know you're one of them :)
Although there is no way to get the rollback time from the ROLLBACK command (if there is do let me know) we can use an extended way of calling the KILL command.
KILL command has an option WITH STATUSONLY that tells you how long until will the killed process completes.
We call it like this:
So basically you can do this in one window:
SELECT @@SPID as SpidToKill -- remeber this. It was 60 in my case
BEGIN TRAN
-- our really long query
ROLLBACK
And after you issue a rollback run the following in another window:
KILL 60 -- 60 was the remembered @@SPID from the above
KILL 60 WITH STATUSONLY;
the message you'll get is this:
/*
spid 60: Transaction rollback in progress. Estimated rollback completion: 2% Estimated time left: 130 seconds.
*/
And after running it a few more times you can see that the times are nicely displayed:
/*
SPID 60: transaction rollback in progress. Estimated rollback completion: 7%. Estimated time remaining: 92 seconds.
SPID 60: transaction rollback in progress. Estimated rollback completion: 18%. Estimated time remaining: 81 seconds.
SPID 60: transaction rollback in progress. Estimated rollback completion: 29%. Estimated time remaining: 71 seconds.
SPID 60: transaction rollback in progress. Estimated rollback completion: 94%. Estimated time remaining: 3 seconds.
-- we get this message after the spid has completed dieing
Msg 6106, Level 16, State 1, Line 2
Process ID 60 is not an active process ID.
*/
We can see that the progress report is quite handy to have. and if it takes around 10 minutes you at least know you have time for some coffee.
So don't let those long lasting rollbacks eat your nerves. Happy KILL-ing!
Tuesday, October 21, 2008
XML encoding is necessary if you have to save XML text in an XML document. If you don't escape special chars the XML to insert will become a part of the original XML DOM and not a value of a node.
Escaping the XML means basically replacing 5 chars with new values.
These replacements are:
| < | -> | < |
| > | -> | > |
| " | -> | " |
| ' | -> | ' |
| & | -> | & |
Here are 4 ways you can encode XML in C#:
1. string.Replace() 5 times
This is ugly but it works. Note that Replace("&", "&") has to be the first replace so we don't replace other already escaped &.
string xml = "<node>it's my \"node\" & i like it<node>";
encodedXml = xml.Replace("&", "&").Replace("<", "<").Replace(">", ">").Replace("\"", """).Replace("'", "'");
// RESULT: <node>it's my "node" & i like it<node>
2. System.Web.HttpUtility.HtmlEncode()
Used for encoding HTML, but HTML is a form of XML so we can use that too. Mostly used in ASP.NET apps. Note that HtmlEncode does NOT encode apostrophes ( ' ).
string xml = "<node>it's my \"node\" & i like it<node>";
string encodedXml = HttpUtility.HtmlEncode(xml);
// RESULT: <node>it's my "node" & i like it<node>
3. System.Security.SecurityElement.Escape()
In Windows Forms or Console apps I use this method. If nothing else it saves me including the System.Web reference in my projects and it encodes all 5 chars.
string xml = "<node>it's my \"node\" & i like it<node>";
string encodedXml = System.Security.SecurityElement.Escape(xml);
// RESULT: <node>it's my "node" & i like it<node>
4. System.Xml.XmlTextWriter
Using XmlTextWriter you don't have to worry about escaping anything since it escapes the chars where needed. For example in the attributes it doesn't escape apostrophes, while in node values it doesn't escape apostrophes and qoutes.
string xml = "<node>it's my \"node\" & i like it<node>";
using (XmlTextWriter xtw = new XmlTextWriter(@"c:\xmlTest.xml", Encoding.Unicode))
{
xtw.WriteStartElement("xmlEncodeTest");
xtw.WriteAttributeString("testAttribute", xml);
xtw.WriteString(xml);
xtw.WriteEndElement();
}
// RESULT:
/*
<xmlEncodeTest testAttribute="<node>it's my "node" & i like it<node>">
<node>it's my "node" & i like it<node>
</xmlEncodeTest>
*/
Each of the four ways is different, so use each one where you fell appropriate. You can't go wrong with SecurityElement though. :)
Thursday, October 16, 2008
SQL Server Profiler is a well known tool for tracing different activity that happens between your server and the clients connected to it.
But very few people know that you can have custom events and trace them in the SQL Server Profiler. They can be found under User configurable event group:
We can have up to 10 custom events.
Firing these custom events is done by executing sp_trace_generateevent stored procedure which takes event id as the first input parameter. These event ID's span from 82 (UserConfigurable:0) to 91 (UserConfigurable:9)
A simple code example to demonstrate this:
USE AdventureWorks
GO
CREATE PROCEDURE spDoStuff
@AddressID int
AS
UPDATE Person.Address
SET ModifiedDate = GETDATE()
WHERE AddressID = @AddressID
IF @@ROWCOUNT = 0
BEGIN
DECLARE @msg NVARCHAR(128)
SELECT @msg = N'No data updated. Rowcount 0. AddressId = ' + CAST(@AddressID AS NVARCHAR(10))
EXEC sp_trace_generateevent 82, @msg
END
SELECT * FROM Person.Address WHERE AddressID = @AddressID
GO
-- UPDATES OK
EXEC spDoStuff 1
GO
-- this id doesn't exist. trace event is executed as shown in the below picture
EXEC spDoStuff -22
GO
DROP PROCEDURE spDoStuff
The UserConfigurable:0 event was successfully fired as is shown in the profiler trace.
These custom events are highly underused and there isn't even much talk about them at all. That is a shame because they can be a very helpful addition to your debugging or general execution tracing. By using them you can easily narrow the profiler trace and with that the load on the server over long periods of time making them ideal for finding seldom occurring problems in production environment.
There is one downside to them though. Running sp_trace_generateevent requires ALTER TRACE permissions to run. Most likely the user running the stored procedure doesn't have that permission. Unless you're running under SA of course :)
A workaround for this is to putting it simply create a wrapper stored procedure around the sp_trace_generateevent stored procedure, sign the wrapper with a certificate and allow public access to it.