Brett Kaiser (x002548) Blog

Not Just a Number - Brett Kaiser

Show me The TOP n Number of things based on a key

OK, another one that seems to be often asked, and I'm sure there's a SQL Team article out there...I'm just lazy.  Anyway, this post got me interested in this again (and I always have trouble trying to remember the answer), so I decided to post some solutions, so I don't have to remember. With the Help of Pat Phelan, Rudy Limeback, and Mr. B. Lindman (aka the Blind Dude), we came up with the following.

Notice the differences in the Plans.  While Pat's Subquery looks cleaner, Rudy's Join seems more effecient.  Any comments on the plans would be greatly appreciated.

EDIT: Sunsande points out some other shortcomings in the SQL.  With their proposed SQL you not only get the top n results, but you do get ties.  However n top results would require dynamic sql for this.  Still I think it looks to be the best solution so far.  Thanks

DECLARE @N int, @sql nvarchar(4000)
     SET @N = 3  --The number of records to return for each grouping.

SET @sql =
 N'  SELECT a.SiteId, a.EmpId, a.Sales '
+ '    FROM myTable99 a '
+ '   WHERE a.Sales IN (SELECT DISTINCT TOP ' + CONVERT(varchar(15),@n) + ' b.Sales '
+ '         FROM myTable99 b '
+ '        WHERE a.EmpID = b.EmpId '
+ '     ORDER BY b.sales DESC) '
+ 'ORDER BY a.EmpID, a.Sales DESC, a.SiteID '

EXECUTE sp_executesql @sql

The Question here was, Show my the TOP 3 Sales for an Employee by Store

USE Northwind
GO

SET NOCOUNT ON
CREATE TABLE myTable99(SiteId int, EmpId int, Sales money
  , PRIMARY KEY(EmpId, Sales, SiteId))
GO

INSERT INTO myTable99(SiteId, EmpId, Sales)
SELECT 1, 1, 10.00 UNION ALL
SELECT 2, 1, 15.00 UNION ALL
SELECT 3, 1, 20.00 UNION ALL
SELECT 4, 1, 50.00 UNION ALL
SELECT 5, 1, 10.00 UNION ALL
SELECT 6, 1, 5.00 UNION ALL
SELECT 1, 2, 100.00 UNION ALL
SELECT 2, 2, 1500.00 UNION ALL
SELECT 3, 2, 2000.00 UNION ALL
SELECT 4, 2, 5000.00 UNION ALL
SELECT 5, 2, 1000.00 UNION ALL
SELECT 6, 2, 500.00 UNION ALL
SELECT 1, 3, 1.00 UNION ALL
SELECT 2, 3, 1.50 UNION ALL
SELECT 3, 3, 2.00 UNION ALL
SELECT 4, 3, 5.00 UNION ALL
SELECT 5, 3, 1.00 UNION ALL
SELECT 6, 3, .50
GO


SET SHOWPLAN_TEXT ON
GO

DECLARE @N int
     SET @N = 3  --The number of records to return for each grouping.

SELECT a.SiteId, a.EmpId, a.Sales
   FROM myTable99 a
   WHERE (SELECT Count(*)
      FROM myTable99 b
      WHERE  b.EmpId   = a.EmpId
         AND a.Sales  <= b.Sales) <= @N
   ORDER BY a.EmpID, a.Sales DESC, a.SiteID


    SELECT  a.EmpId, a.SiteId, a.sales
      FROM myTable99 a
INNER JOIN myTable99 b
        ON a.EmpId  = b.EmpId
       AND a.Sales <= b.Sales
  GROUP BY a.EmpId, a.SiteId, a.sales
  HAVING COUNT(*) <= @N
ORDER BY a.EmpId, a.sales desc
GO

SET SHOWPLAN_TEXT OFF
GO

SET NOCOUNT OFF
DROP TABLE myTable99
GO

 

 

Legacy Comments


Brett
2005-02-11
Interesting caviat
Seems that if there's any duplicate data that is part of the results, then the query ignores that data...I had the PK on all three fields.I guess it shouldn't have it on Sales..anyway, the fix would be:

SELECT a.EmpId, a.SiteId, a.sales
FROM (SELECT DISTINCT EmpId, SiteId, sales FROM myTable99) a
INNER JOIN (SELECT DISTINCT EmpId, SiteId, sales FROM myTable99) b
ON a.EmpId = b.EmpId
AND a.Sales <= b.Sales
GROUP BY a.EmpId, a.SiteId, a.sales
HAVING COUNT(*) <= @N
ORDER BY a.EmpId, a.sales desc
GO

sunsande
2005-02-23
All of yours solutions are not complete ...
Hi, there

I also became to that problem while working on a MS Access project. It is beautiful in math or programming sense and I liked it a lot after I solved it :).

Assume that the data for the EmpId=1 is :

SELECT 1, 1, 50.00 UNION ALL
SELECT 2, 1, 50.00 UNION ALL
SELECT 3, 1, 50.00 UNION ALL
SELECT 4, 1, 50.00 UNION ALL
SELECT 5, 1, 10.00 UNION ALL
SELECT 6, 1, 5.00 UNION ALL

Update yours databases and rerun the queries. Surprise! You don't show any information about EmpID=1 ?
Why?

Because you assumed implicitly in your queries that an Employee can't make more than N (3 in our case) sales of the same type.

The condition HAVING COUNT(*) <= @N won't work with the example above just because there are 4 top sales and 4 is never <= 3.

Here is my solution:

I assumed to show TOP 3 sales values. In the common case it doesn't mean TOP 3 records but the TOP 3 groups of records.
For the example above it will return :

1, 1, 50.00
2, 1, 50.00
3, 1, 50.00
4, 1, 50.00
5, 1, 10.00
6, 1, 5.00

These are the 3 sales groups 50, 10, and 5.

Here is the query:
SELECT a.SiteId, a.EmpId, a.Sales
FROM myTable99 a
WHERE a.Sales IN (SELECT DISTINCT TOP 3 b.Sales
FROM myTable99 b WHERE a.EmpID = b.EmpId
ORDER BY b.sales DESC)
ORDER BY a.EmpID, a.Sales DESC, a.SiteID

And as a final note: if you want exactly 3 rows per EmpID you can run again TOP N over the groups in the query I suggested. But why shoud we prefer:

1, 1, 50.00
2, 1, 50.00
3, 1, 50.00

instead of

2, 1, 50.00
3, 1, 50.00
4, 1, 50.00

for example. Well this is another topic :).

Thank you for BLOGing :)!

sunsande
(sunsande(aD)gmail<Tot>com)


sloan
2005-08-19
re: Show me The TOP n Number of things based on a key
I think the queries are on the same stage as:
http://support.microsoft.com/kb/q153747/

HOWEVER, these queries do NOT scale well.

If you follow the Microsft Examples (NorthWind database), and then add 1,000,000 Products, the perforamce is horrible (Over 1 Hr).

Here is a sql server 2000 script to add 1,000,000 Products. Run it, and then try the KB, and you'll see what I mean.

The below codes is NOT production code, its only a hacky way to get 1,000,000 records in the database, with some randomness to them.


declare @counter int
select @counter = 0
declare @previousMaxID int
declare @currentID varchar(12)
declare @catID int
declare @randomUnitsInStock int


while @counter < 994922
begin

select @previousMaxID = MAX(ProductID) from Products
select @currentID = convert(varchar(12), @previousMaxID + 1)

select @catID = @previousMaxID % 8
Select @catID = @catID + 1

Select @randomUnitsInStock = @previousMaxID % 10000

INSERT INTO Products
(
ProductName,SupplierID,CategoryID,QuantityPerUnit,UnitPrice,UnitsInStock,UnitsOnOrder,ReorderLevel,Discontinued
)
VALUES
( 'Product_' + @currentID , @catID , @catID , @currentID + 'QuanityPerUnit' , 100.00, @randomUnitsInStock, 100, 100, 0)

select @counter = @counter + 1
end