Thinking outside the box

Patron Saint of Lost Yaks
posts - 203, comments - 734, trackbacks - 4

My Links

Advertisement

News

Archives

Post Categories

NULL NOT IN conundrum

The scenario is very simple. You want to get all ID’s from one table that does not exists in another table.
This is how the two simple tables are set up.

DECLARE     @Table1 TABLE
            (
                         ID INT
            )

INSERT      @Table1
SELECT      1 UNION ALL
SELECT      2 UNION ALL
SELECT      3 UNION ALL
SELECT      4 UNION ALL
SELECT      5 UNION ALL
SELECT      6 UNION ALL
SELECT      7

DECLARE     @Table2 TABLE
            (
                         ID INT
            )

INSERT      @Table2
SELECT      1 UNION ALL
SELECT      3 UNION ALL
SELECT      4 UNION ALL
SELECT      5 UNION ALL
SELECT      7 UNION ALL
SELECT      NULL

Now you run this piece of code to get the missing ID’s, right?

SELECT      t1.ID
FROM        @Table1 AS t1
WHERE       t1.ID NOT IN (SELECT x.ID FROM @Table2 AS x)

But wait! Why did the I get an empty resultset in return? I should get ID 2 and 6 back.
The reason to this perhaps unexpected result is that you have a NULL value in the IN list. When the SQL Engine tries to match the outer values from Table1 with the values taken from Table2 in the IN list, a match against NULL is unknown. So we can’t decide if the outer value exists or not, so the SQL Engine does not return the value, because it is interpreted as the outer value exists in the IN list.

You can solve this by writing an extra WHERE in the IN clause, like this

SELECT      t1.ID
FROM        @Table1 AS t1
WHERE       t1.ID NOT IN (SELECT x.ID FROM @Table2 AS x WHERE x.ID IS NOT NULL)

Now you get ID’s 2 and 6 back.

 

Print | posted on Thursday, March 19, 2009 10:02 AM | Filed Under [ SQL Server 2008 SQL Server 2005 SQL Server 2000 ]

Feedback

Gravatar

# re: NULL NOT IN conundrum

Hi Peter.
I never knew that in/not in so strongly depends on the nullability of the returned list.
thanks, Vadym.
3/19/2009 10:52 AM | Remote DBA
Gravatar

# re: NULL NOT IN conundrum

Wouldn't it be better to do this?

SELECT t1.ID
FROM @Table1 AS t1
LEFT JOIN @Table2 AS t2
ON t1.ID = t2.ID
WHERE t2.ID IS NULL

This is probably a much faster operation than doing a NOT IN.
3/20/2009 1:32 AM | Eric H
Gravatar

# re: NULL NOT IN conundrum

As far as I know the fastest operation for this case (and also safe for null values) is not exists

SELECT t1.ID
FROM @Table1 AS t1
where not exists ( select 1
from @Table2 AS t2
where t1.ID = t2.ID)

Thanks, Vadym.
3/20/2009 8:20 AM | Remote DBA
Gravatar

# re: NULL NOT IN conundrum

You have to look at the definition of [NOT] IN Let me quote from SQL FOR SMARTIES Chapter 14:

14.03. NULLs and the IN() Predicate

NULLs make some special problems in a NOT IN() predicate with a subquery. Consider these two tables:

CREATE TABLE Table1 (x INTEGER);
INSERT INTO Table1 VALUES (1), (2), (3), (4);

CREATE TABLE Table2 (x INTEGER);
INSERT INTO Table2 VALUES (1), (NULL), (2);

Now execute the query:

SELECT *
FROM Table1
WHERE x NOT IN (SELECT x FROM Table2)

Let's work it out step by painful step:

1) do the subquery
SELECT *
FROM Table1
WHERE x NOT IN (1, NULL, 2);

2) convert the NOT IN() to its definition
SELECT *
FROM Table1
WHERE NOT (x IN (1, NULL, 2));

3) expand IN() predicate
SELECT *
FROM Table1
WHERE NOT ((x = 1) OR (x = NULL) OR (x = 2));

4) apply DeMorgan's law:
SELECT *
FROM Table1
WHERE ((x <> 1) AND (x <> NULL) AND (x <> 2

5) constant logical expression
SELECT *
FROM Table1
WHERE ((x <> 1) AND UNKNOWN AND (x <> 2));

6) Reduction of OR to constant
SELECT *
FROM Table1
WHERE UNKNOWN;

7) Results are always empty.

Now try this with another set of tables

CREATE TABLE Table3 (x INTEGER);
INSERT INTO Table3 VALUES (1), (2), (NULL), (4);

CREATE TABLE Table4 (x INTEGER);
INSERT INTO Table3 VALUES (1), (3), (2);

Let's work out the same query step by painful step again.

1) do the subquery
SELECT *
FROM Table3
WHERE x NOT IN (1, 3, 2);

2) convert the NOT IN() to Boolean expression
SELECT *
FROM Table3
WHERE NOT (x IN (1, 3, 2));

3) expand IN() predicate
SELECT *
FROM Table3
WHERE NOT ((x = 1) OR (x = 3) OR (x = 2));

4) DeMorgan's law:
SELECT *
FROM Table3
WHERE ((x <> 1) AND (x <> 3) AND (x <> 2));

5) Computed result set; I will show it as a UNION with substitutions

SELECT *
FROM Table3
WHERE ((1 <> 1) AND (1 <> 3) AND (1 <> 2)) -- FALSE
UNION ALL
SELECT *
FROM Table3
WHERE ((2 <> 1) AND (2 <> 3) AND (2 <> 2)) -- FALSE
UNION ALL
SELECT * FROM Table3
WHERE ((CAST(NULL AS INTEGER) <> 1)
AND (CAST(NULL AS INTEGER) <> 3)
AND (CAST(NULL AS INTEGER) <> 2)) -- UNKNOWN
UNION ALL
SELECT *
FROM Table3
WHERE ((4 <> 1) AND (4 <> 3) AND (4 <> 2)); -- TRUE

6) Result is one row = (4).

3/24/2009 7:16 PM | Joe Celko
Gravatar

# re: NULL NOT IN conundrum

My favorite way to describe this ...

You want to make a sandwich, so you need ham, cheese, mayo, lettuce, tomatoes and bread. In the fridge you have some cheese, some bread and possibly some other stuff, but you don't know what it is (NULL).

What should go on your shopping list that is NOT IN the fridge? "Some stuff, but you don't know what is is (NULL)" is the only answer.
3/27/2009 9:23 PM | Jay
Gravatar

# re: NULL NOT IN conundrum (In answer to Jay)

@Jay:

If I definitely have cheese and bread, and there is something else in my fridge and I don't know what it is, "the only answer" is that I would need ham, mayo, lettuce, and tomatoes - there is no way the unidentified "Stuff" (NULL) is going in my sandwich!

6/14/2009 12:59 AM | Magoo
Gravatar

# re: NULL NOT IN conundrum

Nice article, but I think that your title would work better if you capitalized it the way that you did in the text of the article. That is, "NULL is not LIKE and not NOT LIKE" instead of "NULL is NOT LIKE and NOT NOT LIKE".
4/29/2010 10:08 AM | strategies pour le video keno
Comments have been closed on this topic.

Powered by:
Powered By Subtext Powered By ASP.NET