Davide Mauri Blog

Experiences with SQL Server

A better NTILE implementation

The implementation of the NTILE(n) windowing function is a little bit slow and requires a temp worktable that generates A LOT of I/O. Probably that because, as the BOL says, if you have a number of rows that is not divisible by your "n" bucket value, the function has to make the "Larger groups come before smaller groups in the order specified by the OVER clause".

If you're using NTILE for statistical purposes and so you don't care about having larger groups before smaller one, mostly because the difference among the groups population tipically will be of only one unit, you can implement NTILE with the following script:

SELECT 
   CustomerKey,
   CEILING(ROW_NUMBER() OVER (ORDER BY YearlyIncome ASC) / ((SELECT CAST(COUNT(*) + 1 AS FLOAT) FROM DimCustomer) / @n) ) AS NTileFast
FROM 
   
DimCustomer

where @n is a variable that contains your bucket value. For example, if you want a NTILE(5) your @n value will be 5.

You can make some test using the AdventureWorksDW database:

DECLARE @n INT;
SET @n = 5;

WITH CTE AS (
   SELECT 
      CustomerKey,
      NTILE(@n) OVER (ORDER BY YearlyIncome ASC) AS NTileStd
   FROM 
      DimCustomer
)
SELECT
   COUNT(*),
   NTileStd
FROM
   CTE
GROUP BY
   NTileStd
ORDER BY
   2;

WITH CTE AS (
   SELECT 
      CustomerKey,
      CEILING(ROW_NUMBER() OVER (ORDER BY YearlyIncome ASC) / ((SELECT CAST(COUNT(*) + 1 AS FLOAT) FROM DimCustomer) / @n) ) AS NTileFast
   FROM 
   DimCustomer
)
SELECT
   COUNT(*),
   NTileFast
FROM
   CTE
GROUP BY
   NTileFast
ORDER BY
   2;

You'll notice that the first will make 38490 I/O (!!!) where the second one will only make 1036 I/O, which is 37 time LESS!!!!

I have discovered this behaviour with my collegue Marco Russo using a milions rows table and as you may image 37 times less I/O DOES the difference! :-)

Legacy Comments


Joe Celko
2006-04-23
re: A better NTILE implementation
Very nice job! But I wonder why the NTILE() is so bad?

Davide Mauri
2006-04-24
re: A better NTILE implementation
I think that the performance of NTILE() are so bad because of the definition of NTILE (present in the Sql Server's Books on Line): "Larger groups come before smaller groups in the order specified by the OVER clause".
I really don't know if it has a mathematical basis or not, anyway the fact that larger groups has to come before smaller ones, force the query optimizer to use a temporary worktable that - i think - scans with a cursor so that it can obey the above rule when creating the final resultset.
Of course this is my guessing from what I see reading the execution plan.

ML
2006-04-26
re: A better NTILE implementation
Have you suggested improving the function to MS?
http://lab.msdn.microsoft.com/productfeedback/Default.aspx

Davide Mauri
2006-04-27
re: A better NTILE implementation
Not yet. I'm going to drop an email to the optimizer team right now :-)