This is one of those things that you probably know, but have not run into it lately. That was the case for me recently as I was troubleshooting an issue in someone else's code, and spent several hours on it before stumbling across this cause.
The issue at hand is that people forget about the true nature of NULL in a database column. Another way of thinking about NULL is to substitute the word UNKNOWN. NULL is not anything...it is unknown. NULL does not equal anything. That is why you have to use the magic phrase IS NULL instead of = NULL in your SQL queries:
SELECT COUNT(*) FROM Person.Address
-- 19,614 total records in the table
SELECT * FROM Person.Address WHERE AddressLine2 = NULL
-- ZERO Records Returned
SELECT * FROM Person.Address WHERE AddressLine2 IS NULL
-- 19,252 Records Returned
And not only is NULL not equal to anything, it is not LIKE anything either. That's pretty easy for people to grasp, once they have bought into the equals issue...
SELECT * FROM Person.Address WHERE AddressLine2 LIKE '%2%'
-- 61 Records returned
Notice that none of the rows returned have NULL in the AddressLine2 column because NULL is not LIKE 2. But what trips up some database developers, and was at the root of my trouble today, is that while NULL is not LIKE anything, it also is not NOT LIKE anything either.
SELECT * FROM Person.Address WHERE AddressLine2 NOT LIKE '%2%'
-- 19,553 Records EXPECTED
-- Only 301 Records actually returned
In the above example, the developer was expecting the inverse of the LIKE clause, that is, he expected the other 19,553 records to be returned, but if you look at the actual results, you see that what you get are only the records that have a value (NOT NULL) and that value is not like the search value. All of the records that have a NULL AddressLine2 are excluded from the result set.
How do you fix it?
There are a few different approaches you could take to fixing it. My first suggestion would be to disallow NULLs in the column in the first place.
-- When removing NULL from an existing table, first UPDATE the table with values.
UPDATE Person.Address SET AddressLine2 = '' WHERE AddressLine2 IS NULL
-- NOW Alter the table to disallow NULL
ALTER TABLE Person.Address
ALTER COLUMN AddressLine2 NVARCHAR(60) NOT NULL
Now don't get me wrong, I'm not one of those guys who forbids NULL in his databases. I'm like the ability to have NULL in many situations. To me there is a certain amount of truthiness to saying, "Hey, I don't know what the value is." instead of just using an empty string or zero (especially zero...zero is a real value...there's a big difference between having zero dollars in your savings account and not knowing how much is in savings). On the other hand, there are times when a column absolutely must have a value. It is NOT acceptable for the bank to not know how much is in my savings account, for example.
So, if you are going to still allow NULL in your database, then you need to interpret or convert NULL on the fly using ISNULL or a CASE statement, or some other related approach.
SELECT * FROM Person.Address WHERE ISNULL(AddressLine2, '') LIKE '%2%'
-- 61 Records returned
SELECT * FROM Person.Address WHERE ISNULL(AddressLine2, '') NOT LIKE '%2%'
-- 19,553 Records returned
This returned the expected results, but be warned, there are serious performance ramifications to using functions (e.g. ISNULL() ) in your WHERE clause, and better options exist. Some recommended reading on that subject are the article Avoid enclosing Indexed Columns in a Function in the WHERE clause by Bill Graziano and the blog post Optimizing Conditional WHERE Clauses: Avoiding ORs and CASE Expressions by Jeff Smith.
P.S. For another interesting challenge with NULL, check out Peter Larson's NULL NOT IN Conundrum.
posted @ Monday, June 08, 2009 10:53 PM