Another running streaks algorithm
It has been some years since this article was posted http://www.sqlteam.com/article/detecting-runs-or-streaks-in-your-data and things has evolved since then. So I thought about using XML to solve the case. If the number of records are large, maybe this approach http://weblogs.sqlteam.com/peterl/archive/2009/08/12/Calculating-Running-Streak-over-many-records.aspx will be better for you?
Below, I am using the same sample data as the original.
DECLARE @Sample TABLE
(
GameDate DATETIME,
Result CHAR(1)
)
INSERT @Sample
SELECT '1/1/2000', 'W' UNION ALL
SELECT '1/12/2000', 'L' UNION ALL
SELECT '1/15/2000', 'W' UNION ALL
SELECT '1/17/2000', 'W' UNION ALL
SELECT '1/22/2000', 'W' UNION ALL
SELECT '2/1/2000', 'L' UNION ALL
SELECT '2/5/2000', 'W' UNION ALL
SELECT '2/8/2000', 'L' UNION ALL
SELECT '2/16/2000', 'W' UNION ALL
SELECT '2/19/2000', 'L' UNION ALL
SELECT '2/25/2000', 'L' UNION ALL
SELECT '2/28/2000', 'L' UNION ALL
SELECT '3/15/2000', 'L' UNION ALL
SELECT '3/19/2000', 'W' UNION ALL
SELECT '3/25/2000', 'W'
For this to work, you must have a Tally table present. A tally table is nothing more than a table having numbers sequentially stored, ranging from 1 (or 0) to about 100,000. I am using 100,000 and it's very rarely I have had use for a bigger tally numbers table. So, if you don't have a Tally numbers table, go create one.
And now on to the suggestion.
DECLARE @Results VARCHAR(MAX)
SET @Results = (
SELECT '' + Result
FROM @Sample
ORDER BY GameDate
FOR XML PATH('')
)
;WITH Yak(Start, Streak)
AS (
SELECT MIN(Number) AS Start,
COUNT(*) AS Streak
FROM (
SELECT Number,
ROW_NUMBER() OVER (ORDER BY Number) AS recID
FROM dbo.TallyNumbers
WHERE Number BETWEEN 1 AND DATALENGTH(@Results)
AND SUBSTRING(@Results, Number, 1) = 'W'
-- Or <> 'L' for Undefeated (if t for "tie" is used)
) AS d
GROUP BY Number - recID
)
SELECT s.GameDate,
y.Streak
FROM Yak AS y
INNER JOIN (
SELECT ROW_NUMBER() OVER (ORDER BY GameDate) AS recID,
GameDate
FROM @Sample
) AS s ON s.recID = y.Start
ORDER BY s.GameDate
Legacy Comments
Ryan Randall
2009-08-13 |
re: Another running streaks algorithm Here's another idea to play with... -- ; with a as (select GameDate, Result, row_number() over (order by GameDate) as r, row_number() over (partition by Result order by GameDate) as pr from @Sample) , b as (select *, case when Result = 'W' then r-pr else pr-1 end as p1, case when Result = 'W' then pr-1 else r-pr end as p2 from a) select min(GameDate) as GameDateFrom, max(GameDate) as GameDateTo, max(p2)-min(p2)+1 as Streak from b where Result = 'W' group by p1 order by 1 --/ |
Ryan Randall
2009-08-13 |
re: Another running streaks algorithm Slightly more refined... -- select Result, min(GameDate) as GameDateFrom, max(GameDate) as GameDateTo, count(*) as Streak from (select *, case Result when 'W' then 1 else -1 end * (row_number() over (order by GameDate) - row_number() over (partition by Result order by GameDate)) as p2 from @Sample) a group by Result, p2 order by 2 --/ |
Peso
2009-08-13 |
re: Another running streaks algorithm Now that's the Ryan I am used too! |
Ryan Randall
2009-08-13 |
re: Another running streaks algorithm Hehe - I realise now that this boils down to the same thing as http://www.sqlteam.com/article/detecting-runs-or-streaks-in-your-data, but using row_number() which wasn't available then. -- select Result, min(GameDate) as GameDateFrom, max(GameDate) as GameDateTo, count(*) as Streak from (select *, (row_number() over (order by GameDate) - row_number() over (partition by Result order by GameDate)) as p2 from @Sample) a group by Result, p2 order by 2 --/ |
Ryan Randall
2009-08-13 |
re: Another running streaks algorithm > Now that's the Ryan I am used too! Hehe - wish I'd have thought it through a bit more first though! |
empire3
2012-05-04 |
re: Another running streaks algorithm This algorithm posted by Ryan Randall is super. But has anyone modified it to detect runs for values less than some threshold? Like temperature data, for example. We'd like to use this algorithm to detect runs of daily temperatures less than 32 degrees. But it's not a simple modification. Probably need to use a subquery that uses case to convert the temperature into High or Low, but I don't see how to do it. And very much need this algorithm. |
concord 11
2012-06-27 |
concord 11 Like temperature data, for example. We'd like to use this algorithm to detect runs of daily temperatures less than 32 degrees. But it's not a simple modification. Probably need to use a subquery that uses case to convert the temperature into High or Low, but I don't see how to do it. And very much need this algorithm. |
WiseOldMan
2012-08-14 |
re: Another running streaks algorithm This is great for one team's data. Suppose however you have many teams over many years and you want to find the longest streak for each one? |