Peter Larsson Blog

Patron Saint of Lost Yaks

Another running streaks algorithm

It has been some years since this article was posted http://www.sqlteam.com/article/detecting-runs-or-streaks-in-your-data and things has evolved since then. So I thought about using XML to solve the case. If the number of records are large, maybe this approach http://weblogs.sqlteam.com/peterl/archive/2009/08/12/Calculating-Running-Streak-over-many-records.aspx will be better for you?

Below, I am using the same sample data as the original.

DECLARE @Sample TABLE
        (
            GameDate DATETIME,
            Result CHAR(1)
        )

INSERT @Sample
SELECT '1/1/2000', 'W' UNION ALL
SELECT '1/12/2000', 'L' UNION ALL
SELECT '1/15/2000', 'W' UNION ALL
SELECT '1/17/2000', 'W' UNION ALL
SELECT '1/22/2000', 'W' UNION ALL
SELECT '2/1/2000', 'L' UNION ALL
SELECT '2/5/2000', 'W' UNION ALL
SELECT '2/8/2000', 'L' UNION ALL
SELECT '2/16/2000', 'W' UNION ALL
SELECT '2/19/2000', 'L' UNION ALL
SELECT '2/25/2000', 'L' UNION ALL
SELECT '2/28/2000', 'L' UNION ALL
SELECT '3/15/2000', 'L' UNION ALL
SELECT '3/19/2000', 'W' UNION ALL
SELECT '3/25/2000', 'W'

For this to work, you must have a Tally table present. A tally table is nothing more than a table having numbers sequentially stored, ranging from 1 (or 0) to about 100,000. I am using 100,000 and it's very rarely I have had use for a bigger tally numbers table. So, if you don't have a Tally numbers table, go create one.

And now on to the suggestion.

DECLARE @Results VARCHAR(MAX)

SET     @Results = (
                    SELECT      '' + Result
                    FROM        @Sample
                    ORDER BY    GameDate
                    FOR XML     PATH('')
                   )

;WITH Yak(Start, Streak)
AS (
    SELECT      MIN(Number) AS Start,
                COUNT(*) AS Streak
    FROM        (
                    SELECT  Number,
                            ROW_NUMBER() OVER (ORDER BY Number) AS recID
                    FROM    dbo.TallyNumbers
                    WHERE   Number BETWEEN 1 AND DATALENGTH(@Results)
                            AND SUBSTRING(@Results, Number, 1) = 'W'
                                -- Or <> 'L' for Undefeated (if t for "tie" is used)
                ) AS d
    GROUP BY    Number - recID
)

SELECT      s.GameDate,
            y.Streak
FROM        Yak AS y
INNER JOIN  (
                SELECT ROW_NUMBER() OVER (ORDER BY GameDate) AS recID,
                        GameDate
                FROM    @Sample
            ) AS s ON s.recID = y.Start
ORDER BY    s.GameDate

Legacy Comments


Ryan Randall
2009-08-13
re: Another running streaks algorithm
Here's another idea to play with...

--
; with
a as (select GameDate, Result, row_number() over (order by GameDate) as r,
row_number() over (partition by Result order by GameDate) as pr
from @Sample)
, b as (select *, case when Result = 'W' then r-pr else pr-1 end as p1,
case when Result = 'W' then pr-1 else r-pr end as p2 from a)
select min(GameDate) as GameDateFrom, max(GameDate) as GameDateTo, max(p2)-min(p2)+1 as Streak
from b where Result = 'W' group by p1 order by 1
--/

Ryan Randall
2009-08-13
re: Another running streaks algorithm
Slightly more refined...

--
select Result, min(GameDate) as GameDateFrom, max(GameDate) as GameDateTo, count(*) as Streak
from (select *, case Result when 'W' then 1 else -1 end * (row_number() over (order by GameDate) -
row_number() over (partition by Result order by GameDate)) as p2 from @Sample) a
group by Result, p2 order by 2
--/

Peso
2009-08-13
re: Another running streaks algorithm
Now that's the Ryan I am used too!

Ryan Randall
2009-08-13
re: Another running streaks algorithm
Hehe - I realise now that this boils down to the same thing as http://www.sqlteam.com/article/detecting-runs-or-streaks-in-your-data, but using row_number() which wasn't available then.

--
select Result, min(GameDate) as GameDateFrom, max(GameDate) as GameDateTo, count(*) as Streak
from (select *, (row_number() over (order by GameDate) -
row_number() over (partition by Result order by GameDate)) as p2 from @Sample) a
group by Result, p2 order by 2
--/

Ryan Randall
2009-08-13
re: Another running streaks algorithm
> Now that's the Ryan I am used too!

Hehe - wish I'd have thought it through a bit more first though!

empire3
2012-05-04
re: Another running streaks algorithm
This algorithm posted by Ryan Randall is super. But has anyone modified it to detect runs for values less than some threshold?

Like temperature data, for example. We'd like to use this algorithm to detect runs of daily temperatures less than 32 degrees. But it's not a simple modification. Probably need to use a subquery that uses case to convert the temperature into High or Low, but I don't see how to do it. And very much need this algorithm.

concord 11
2012-06-27
concord 11
Like temperature data, for example. We'd like to use this algorithm to detect runs of daily temperatures less than 32 degrees. But it's not a simple modification. Probably need to use a subquery that uses case to convert the temperature into High or Low, but I don't see how to do it. And very much need this algorithm.

WiseOldMan
2012-08-14
re: Another running streaks algorithm
This is great for one team's data. Suppose however you have many teams over many years and you want to find the longest streak for each one?