Dan Guzman Blog

Ad-Hoc Rollup by date/time Interval

I often use aggregate queries to rollup data by an arbitrary date/time interval.  I'll share some techniques that I use to accomplish the task in case you find these useful, using the same table below:

CREATE TABLE dbo.WebStats

(

      RequestTimestamp datetime NOT NULL,

      Page varchar(255) NOT NULL

);

CREATE CLUSTERED INDEX WebStats_cdx ON dbo.WebStats(RequestTimestamp, Page);

 

INSERT INTO dbo.WebStats (RequestTimestamp, Page)

VALUES

      ('2010-01-01T00:00:00', 'Default.aspx')

      ,('2010-01-01T00:00:15', 'Default.aspx')

      ,('2010-01-01T00:01:05', 'Order.aspx')

      ,('2010-01-01T00:01:30', 'Default.aspx')

      ,('2010-01-01T00:01:40', 'OrderStatus.aspx')

      ,('2010-01-01T00:02:05', 'Default.aspx')

      ,('2010-01-01T00:03:05', 'ProductInfo.aspx')

      ,('2010-01-01T00:03:30', 'Default.aspx');

GO

 

Simple Rollup

Without an auxiliary table, a little DATEADD magic can do the trick.  Here's an example that summarizes web page requests by minute for the specified date/time range:

DECLARE

      @StartTimestamp datetime = '2010-01-01T00:00:00'

      ,@EndTimestamp datetime = '2010-01-02T00:00:00';

 

SELECT

      DATEADD(minute, DATEDIFF(minute, @StartTimestamp, RequestTimestamp), @StartTimestamp) AS Interval,

      COUNT(*) AS PageRequests

FROM dbo.WebStats

GROUP BY

      DATEADD(minute, DATEDIFF(minute, @StartTimestamp, RequestTimestamp), @StartTimestamp)

ORDER BY

      Interval; 

 

Results:

Interval

PageRequests

2010-01-01 00:00:00.000

2

2010-01-01 00:01:00.000

3

2010-01-01 00:02:00.000

1

2010-01-01 00:03:00.000

2

2010-01-01 00:29:00.000

1

2010-01-01 00:31:00.000

1

2010-01-01 00:42:00.000

1

2010-01-01 02:01:00.000

2

2010-01-01 02:03:00.000

2

2010-01-01 02:31:00.000

1

2010-01-01 02:44:00.000

1

2010-01-01 02:49:00.000

1

 

Arbitrary Intervals

The simple rollup method works well for any of the pre-defined units provided by the DATEADD function (year, quarter, month, day, hour, minute, second or week).  However, it lacks the flexibility to roll up to an arbitrary interval like 15 minutes or 30 seconds.  A little DATEADD/DATEDIFF math addresses this gap.  Below is an example of a 30-minute interval rollup using this technique:

DECLARE

      @StartTimestamp datetime = '2010-01-01T00:00:00'

      ,@EndTimestamp datetime = '2010-01-01T04:00:00'

      ,@IntervalSeconds int = 1800; --30 minutes

SELECT

      DATEADD(second

            ,DATEDIFF(second, @StartTimestamp

            ,RequestTimestamp)

            / @IntervalSeconds * @IntervalSeconds, @StartTimestamp) AS Interval

      ,COUNT(*) AS PageRequests

FROM dbo.WebStats

WHERE

      RequestTimestamp >= @StartTimestamp

      AND RequestTimestamp < @EndTimestamp

GROUP BY

      DATEADD(second

            ,DATEDIFF(second, @StartTimestamp

            ,RequestTimestamp) / @IntervalSeconds * @IntervalSeconds, @StartTimestamp)

ORDER BY

      Interval;

 

Interval

PageRequests

2010-01-01 00:00:00.000

9

2010-01-01 00:30:00.000

2

2010-01-01 02:00:00.000

4

2010-01-01 02:30:00.000

3

 

Missing Intervals

You probably noticed that periods with no activity at all are omitted rather than reporting a zero value.  One method to include the missing intervals is with an outer join to a temporal table containing all the desired intervals.  Ideally, the temporal table would be a permanent one but I've found it impractical to maintain such a table for ad-hoc needs.  Fortunately, a utility numbers CTE is a handy way to generate the needed intervals dynamically.  The example below provides up to 65,536 interval values and can be easily extended as needed.

DECLARE

      @StartTimestamp datetime = '2010-01-01T00:00:00'

      ,@EndTimestamp datetime = '2010-01-01T04:00:00'

      ,@IntervalSeconds int = 1800; --30 minutes

 

WITH

      T2 AS (SELECT 0 AS Num UNION ALL SELECT 0),

      T4 AS (SELECT 0 AS Num FROM T2 AS A CROSS JOIN T2 AS B),

      T256 AS (SELECT 0 AS Num FROM T4 AS A CROSS JOIN T4 AS B CROSS JOIN T4 AS C CROSS JOIN T4 AS D),

      T65536 AS (SELECT ROW_NUMBER() OVER(ORDER BY A.Num) AS Num FROM T256 AS A CROSS JOIN T256 AS B)

SELECT

      DATEADD(second

            ,(Num-1) * @IntervalSeconds, @StartTimestamp) AS Interval

      ,COUNT(WebStats.RequestTimestamp) AS PageRequests

FROM T65536

LEFT JOIN dbo.WebStats ON

      WebStats.RequestTimestamp >= DATEADD(second, (Num-1) * @IntervalSeconds, @StartTimestamp)

      AND WebStats.RequestTimestamp < DATEADD(second, Num * @IntervalSeconds, @StartTimestamp)

WHERE

      Num <= DATEDIFF(second, @StartTimeStamp, @EndTimestamp) / @IntervalSeconds

GROUP BY

      DATEADD(second

            ,(Num-1) * @IntervalSeconds, @StartTimestamp)

ORDER BY

      Interval;  

 

Interval

PageRequests

2010-01-01 00:00:00.000

9

2010-01-01 00:30:00.000

2

2010-01-01 01:00:00.000

0

2010-01-01 01:30:00.000

0

2010-01-01 02:00:00.000

4

2010-01-01 02:30:00.000

3

2010-01-01 03:00:00.000

0

2010-01-01 03:30:00.000

0