Starting with SQL Server 2005 RTM and through the latest SQL Server 2008 build at the time of this writing, there is a pretty nasty bug with respect to the COUNT(expression) function and its rule that when using an expression, it filters out NULLs in its count results.
Let me explain with a table called, NAME:
First_Name
John
Adam
Kris
NULL
Paula
NULL
George
‘select count(*) from NAME’ will return the value of 7.
‘select count(First_Name) from NAME’ will return the value of 5 because there are two NULL values that are being excluded by rule.
So far so good. The bug in this case shows up under conditions where you are COUNTing a column from what I’ll call a “grandchild table.” In other words, if I have a table, SYS.OBJECTS, and it is joined to SYS.COLUMNS, which is then joined to another table (ANOTHERTABLE) that may or may not have matching SYS.COLUMNS in it and you are counting a column (eliminating NULLs) within ANOTHERTABLE, you may get erroneous results.
A working example is below, thanks to Ionel’s repro query included in the Connect bug submission:
select o.object_id,
count(x.column_id) should_be_zero_always
from sys.objects o
left join sys.columns c
on o.object_id = c.object_id
left join (
select column_id = 2
where 0 = 1
) x
on c.column_id = x.column_id
group by o.object_id
You can run the above code in any database, and will likely see that 1s show up in the column titled, “should_be_zero_always.”
As you can see, the above query is going to be quite common where you do not want to filter results from the driving table and are forced to use an outer join.
The data behind the scenes looks like this:
o.object_id c.object_id x.column_id cnt_returns_1_y_n
1131151075 NULL NULL y
1147151132 1147151132 1 n
1147151132 1147151132 2 n
1147151132 1147151132 3 n
1147151132 1147151132 4 n
1147151132 1147151132 5 n
1147151132 1147151132 6 n
1163151189 NULL NULL y
1179151246 NULL NULL y
Where we have a NULL in c, x is also forced to be NULL (even though x is hardcoded to return an empty data set in the example above), COUNT(x.column_id) will return a 1 for this scenario, when it should be returning a zero.
This works correctly in SQL Server 2000, I might add.
The bug is currently being tracked internally at Microsoft, and in theory, no future CTPs (which would be for releases beyond 2008) won’t have the issue, but there is no E.T.A. on when we’ll see a fix in 2005 or 2008. Hopefully by the end of the year if we’re lucky and it doesn’t get de-prioritized.
I would encourage you to review the Connect bug submission and at least vote, but by all means, please leave your feedback on the issue while you are there.
https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=380304