In SQL, the general rule of thumb is that the number of rows returned from a SELECT will be zero if your criteria did not match any data. However, there is an important exception to this rule: it does
not apply when asking for aggregate calculations such as SUM(), MIN() or MAX(), without any grouping. This is rather interesting and important to know and look out for, as it can cause some confusion and recently some of my ASP.NET code failed due to this. Let's take a look.
We'll be using the following simple table for our examples:
create table tmp (f1 int, f2 int)
insert into tmp
select 1,1 union all
select 1,2 union all
select 2,3
Now, if we SELECT from this table without any criteria, we will get 3 rows returned:
select f1,f2 from tmp
f1 f2
----------- -----------
1 1
1 2
2 3
(3 row(s) affected)
If we put criteria on our SELECT that doesn't match any rows -- for example, where f2 > 3 -- we will get back zero rows:
select f1,f2 from tmp where f2 > 3
f1 f2
----------- -----------
(0 row(s) affected)
So far, so good. Let's try adding a GROUP BY clause to this SELECT. If we want just the the MAX(f2) per f1, we get back two rows:
select f1, max(f2) as MaxF2 from tmp group by f1
f1 MaxF2
----------- -----------
1 2
2 3
(2 row(s) affected)
Once again, let's constrain our results so that no data matches:
select f1, max(f2) as MaxF2 from tmp where f2 > 3 group by f1
f1 MaxF2
----------- -----------
(0 row(s) affected)
... and once again, no rows are returned. This should all be expected, and when writing our client code, we can do simple checks to verify if any data matched our SELECT statement's criteria simply by checking the row count. In the case of a SqlDataReader, the Read() method will return False immediately and we know that no rows were returned, so no data matched our criteria.
So, it seems that we have simply verified the following:
In a SELECT statement, if the WHERE clause results in no matching rows in the underlying table(s), zero rows are returned in the result set.
However, there is an exception to this rule! Let's get the SUM() of the f1 column from the entire table, without any criteria or grouping. As expected, this will result in a single row returned with the total:
select sum(f1) as TotalF1 from tmp
TotalF1
-----------
4
(1 row(s) affected)
Next, let's filter the result, just as before, so that no rows match our criteria, and again ask for the SUM(f1). Since no rows match our criteria, we expect to once again get 0 results returned. However, notice what happens:
select sum(f1) as TotalF1 from tmp where f1 > 4
TotalF1
-----------
NULL
(1 row(s) affected)
A row is returned, with a Null value! Yet, there are *no* rows that match our criteria! This is very interesting, and rather important to know. Please note that there is a big difference between getting a single row back with a NULL value as opposed to getting no rows back at all, even though we might logically interpret the results the same way. In fact, if you just are selecting only aggregate calculations from a table, without any grouping, one row will
always be returned -- never more, never less (note that this does not apply if you add a HAVING clause -- see below).
As an example, say that you have a simple stored procedure like this:
create procedure GetPreviousWeek @CurrentWeek DateTime
as
select Max(Week) as PreviousWeek
from Calendar
where Week < @CurrentWeek
Written as is, it is important to remember that you will
always get a
single row back, even if there is no previous week before the @CurrentWeek parameter! I had a similar stored procedure, and incorrectly wrote my client logic like this:
SqlDataReader r = cmd.ExecuteReader()
if (r.Read())
{
// there is a previous week
}
else
{
// there is not a previous week
}
That was based on the assumption that the WHERE clause would result in no rows being returned if there was no previous week and therefore no matching data. However, since the stored procedure was only calculating and returning an aggregate total, a single row is always returned, which meant that my logic was flawed. What I really needed was something like this:
SqlDataReader r = cmd.ExecuteReader()
if (r.Read())
{
if (r["PreviousWeek"] == dbNull.Value)
{
// there is no previous week
}
else
{
// there IS a previous week value
}
}
else
{
// this should never occur!
}
Notice the difference in the logic.
In short, I think it makes sense: If I ask for only an aggregate
value for some data, even if that data has no rows, I should always get a value back -- though, of course, it may be NULL. However, if I specifically ask for a
set of rows (by not aggregating or by providing a GROUP BY clause) then I should expect that I might not receive any rows back at all if no data matches my criteria.
Consider a "real world" example: If I asked you for the total population of all U.S. states that start with "Z", you would be able to reply with a single value (NULL, or perhaps 0). But, if I ask you to list all of the states that start with "Z" along with their individual population, the answer would be an empty set; you would not be able to give me a value or set of values, you would reply that "no states start with Z so there is no data to return."
Please note that adding a HAVING clause to filter your SELECT may indeed result in zero rows being returned, since the HAVING is applied after the aggregate is calculated. For example:
select sum(f1) as TotalF1 from tmp having sum(f1) > 10
TotalF1
-----------
(0 row(s) affected)
In that case, the single value is calculated, but it is not returned in the final results because the HAVING clause prevented it.
So, keep this in mind when returning single, aggregate values in a SELECT: You will always get a single row returned, and never an "empty set", unless you also provide a HAVING clause filter. If you have any columns in your GROUP clause, however, this does not apply and you may indeed get an empty set returned if your criteria does not match any data.
see also: