Server: Msg 8120, Level 16, State 1, Line 1
Column 'xyz' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Arrghh !!! There it is, yet again .. that annoying error message. Why is SQL so picky about this? What's the deal!?
A very important concept for anyone who ever works with SQL to fully understand is this:
the aggregate function requirement for non-grouped columns is not some arbitrary syntax restriction of SQL in general or of SQL Server's implementation! It is simple logic, like dividing by zero -- it must be handled and dealt with explicitly, whether we are writing SQL or working it out on paper by hand.
Suppose I ask you this: "For each city in Massachusetts, what is the total number of residents and what is their age?"
Now, think logically and carefully about what I am asking you. Does it make sense? The total number of people does -- it is just a single value, the count of how many people live in the city. But how can I return "their age"? What is really being asked for? A long list of every age of every person, all delimited by commas (e.g., a list of thousands of repeating numbers)? A distinct list of all ages, also separated by commas (e.g., 0,1,2,3 ....) ? Perhaps the average age? The median age? The minimum and/or the maximum age? The result of adding up the ages of everyone in the city?
As you can see, we cannot answer the question, since it does not make logical sense without providing more specific information about what you need.
Now, if we relate this to SQL, the original question directly translates to something like this:
select city, count(*) as residents, age
from people
where state='ma'
group by city
If we run that, the result is that "annoying" error:
Server: Msg 8120, Level 16, State 1, Line 1
Column 'people.age' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
In other words, the computer has the same restriction we do -- it cannot answer that question unless you are more specific! As you hopefully can see now, this is not some arbitrary syntax error or some SQL Server implementation quirk -- it is a logic error. We
must qualify what we mean by "their age" when asking the question in English; what
exactly do we want returned?
If it turns out that we want the average age of the residents, now we can ask a complete and logical question in English: "For each city, what is the total number of residents and what is their
average age?" And that directly translates to a more complete and valid SQL statement:
select city, count(*) as residents, avg(age) as AverageAge
from people
where state='ma'
group by city
... which will now compile and execute without errors.
If this concept still doesn't make sense, please let me know in the comments and I can try to clarify further; it is so crucial to understand this when writing SQL and thinking logically about the data you are trying to return in a SELECT statement or a report.
(More on GROUP BY
here.)
see also: