Ad-Hoc Rollup by date/time Interval
I often use aggregate queries to rollup data by an arbitrary date/time interval. I'll share some techniques that I use to accomplish the task in case you find these useful, using the same table below:
CREATE TABLE dbo.WebStats ( RequestTimestamp datetime NOT NULL, Page varchar(255) NOT NULL );
CREATE CLUSTERED INDEX WebStats_cdx ON dbo.WebStats(RequestTimestamp, Page); INSERT INTO dbo.WebStats (RequestTimestamp, Page) VALUES ('2010-01-01T00:00:00', 'Default.aspx') ,('2010-01-01T00:00:15', 'Default.aspx') ,('2010-01-01T00:01:05', 'Order.aspx') ,('2010-01-01T00:01:30', 'Default.aspx') ,('2010-01-01T00:01:40', 'OrderStatus.aspx') ,('2010-01-01T00:02:05', 'Default.aspx') ,('2010-01-01T00:03:05', 'ProductInfo.aspx') ,('2010-01-01T00:03:30', 'Default.aspx'); GO |
Simple Rollup
Without an auxiliary table, a little DATEADD magic can do the trick. Here's an example that summarizes web page requests by minute for the specified date/time range:
DECLARE @StartTimestamp datetime = '2010-01-01T00:00:00' ,@EndTimestamp datetime = '2010-01-02T00:00:00'; SELECT DATEADD(minute, DATEDIFF(minute, @StartTimestamp, RequestTimestamp), @StartTimestamp) AS Interval, COUNT(*) AS PageRequests FROM dbo.WebStats GROUP BY DATEADD(minute, DATEDIFF(minute, @StartTimestamp, RequestTimestamp), @StartTimestamp) ORDER BY Interval; |
Results:
Interval |
PageRequests |
2010-01-01 00:00:00.000 |
2 |
2010-01-01 00:01:00.000 |
3 |
2010-01-01 00:02:00.000 |
1 |
2010-01-01 00:03:00.000 |
2 |
2010-01-01 00:29:00.000 |
1 |
2010-01-01 00:31:00.000 |
1 |
2010-01-01 00:42:00.000 |
1 |
2010-01-01 02:01:00.000 |
2 |
2010-01-01 02:03:00.000 |
2 |
2010-01-01 02:31:00.000 |
1 |
2010-01-01 02:44:00.000 |
1 |
2010-01-01 02:49:00.000 |
1 |
Arbitrary Intervals
The simple rollup method works well for any of the pre-defined units provided by the DATEADD function (year, quarter, month, day, hour, minute, second or week). However, it lacks the flexibility to roll up to an arbitrary interval like 15 minutes or 30 seconds. A little DATEADD/DATEDIFF math addresses this gap. Below is an example of a 30-minute interval rollup using this technique:
DECLARE @StartTimestamp datetime = '2010-01-01T00:00:00' ,@EndTimestamp datetime = '2010-01-01T04:00:00' ,@IntervalSeconds int = 1800; --30 minutes SELECT DATEADD(second ,DATEDIFF(second, @StartTimestamp ,RequestTimestamp) / @IntervalSeconds * @IntervalSeconds, @StartTimestamp) AS Interval ,COUNT(*) AS PageRequests FROM dbo.WebStats WHERE RequestTimestamp >= @StartTimestamp AND RequestTimestamp < @EndTimestamp GROUP BY DATEADD(second ,DATEDIFF(second, @StartTimestamp ,RequestTimestamp) / @IntervalSeconds * @IntervalSeconds, @StartTimestamp) ORDER BY Interval; |
Interval |
PageRequests |
2010-01-01 00:00:00.000 |
9 |
2010-01-01 00:30:00.000 |
2 |
2010-01-01 02:00:00.000 |
4 |
2010-01-01 02:30:00.000 |
3 |
Missing Intervals
You probably noticed that periods with no activity at all are omitted rather than reporting a zero value. One method to include the missing intervals is with an outer join to a temporal table containing all the desired intervals. Ideally, the temporal table would be a permanent one but I've found it impractical to maintain such a table for ad-hoc needs. Fortunately, a utility numbers CTE is a handy way to generate the needed intervals dynamically. The example below provides up to 65,536 interval values and can be easily extended as needed.
DECLARE @StartTimestamp datetime = '2010-01-01T00:00:00' ,@EndTimestamp datetime = '2010-01-01T04:00:00' ,@IntervalSeconds int = 1800; --30 minutes WITH T2 AS (SELECT 0 AS Num UNION ALL SELECT 0), T4 AS (SELECT 0 AS Num FROM T2 AS A CROSS JOIN T2 AS B), T256 AS (SELECT 0 AS Num FROM T4 AS A CROSS JOIN T4 AS B CROSS JOIN T4 AS C CROSS JOIN T4 AS D), T65536 AS (SELECT ROW_NUMBER() OVER(ORDER BY A.Num) AS Num FROM T256 AS A CROSS JOIN T256 AS B) SELECT DATEADD(second ,(Num-1) * @IntervalSeconds, @StartTimestamp) AS Interval ,COUNT(WebStats.RequestTimestamp) AS PageRequests FROM T65536 LEFT JOIN dbo.WebStats ON WebStats.RequestTimestamp >= DATEADD(second, (Num-1) * @IntervalSeconds, @StartTimestamp) AND WebStats.RequestTimestamp < DATEADD(second, Num * @IntervalSeconds, @StartTimestamp) WHERE Num <= DATEDIFF(second, @StartTimeStamp, @EndTimestamp) / @IntervalSeconds GROUP BY DATEADD(second ,(Num-1) * @IntervalSeconds, @StartTimestamp) ORDER BY Interval; |
Interval |
PageRequests |
2010-01-01 00:00:00.000 |
9 |
2010-01-01 00:30:00.000 |
2 |
2010-01-01 01:00:00.000 |
0 |
2010-01-01 01:30:00.000 |
0 |
2010-01-01 02:00:00.000 |
4 |
2010-01-01 02:30:00.000 |
3 |
2010-01-01 03:00:00.000 |
0 |
2010-01-01 03:30:00.000 |
0 |