Thinking outside the box

Patron Saint of Lost Yaks
posts - 203, comments - 734, trackbacks - 4

My Links

Advertisement

News

Archives

Post Categories

Another running streaks algorithm

It has been some years since this article was posted http://www.sqlteam.com/article/detecting-runs-or-streaks-in-your-data and things has evolved since then. So I thought about using XML to solve the case. If the number of records are large, maybe this approach http://weblogs.sqlteam.com/peterl/archive/2009/08/12/Calculating-Running-Streak-over-many-records.aspx will be better for you?

Below, I am using the same sample data as the original.

DECLARE @Sample TABLE
        (
            GameDate DATETIME,
            Result CHAR(1)
        )

INSERT @Sample
SELECT '1/1/2000', 'W' UNION ALL
SELECT '1/12/2000', 'L' UNION ALL
SELECT '1/15/2000', 'W' UNION ALL
SELECT '1/17/2000', 'W' UNION ALL
SELECT '1/22/2000', 'W' UNION ALL
SELECT '2/1/2000', 'L' UNION ALL
SELECT '2/5/2000', 'W' UNION ALL
SELECT '2/8/2000', 'L' UNION ALL
SELECT '2/16/2000', 'W' UNION ALL
SELECT '2/19/2000', 'L' UNION ALL
SELECT '2/25/2000', 'L' UNION ALL
SELECT '2/28/2000', 'L' UNION ALL
SELECT '3/15/2000', 'L' UNION ALL
SELECT '3/19/2000', 'W' UNION ALL
SELECT '3/25/2000', 'W'

For this to work, you must have a Tally table present. A tally table is nothing more than a table having numbers sequentially stored, ranging from 1 (or 0) to about 100,000. I am using 100,000 and it's very rarely I have had use for a bigger tally numbers table. So, if you don't have a Tally numbers table, go create one.

And now on to the suggestion.

DECLARE @Results VARCHAR(MAX)

SET     @Results = (
                    SELECT      '' + Result
                    FROM        @Sample
                    ORDER BY    GameDate
                    FOR XML     PATH('')
                   )

;WITH Yak(Start, Streak)
AS (
    SELECT      MIN(Number) AS Start,
                COUNT(*) AS Streak
    FROM        (
                    SELECT  Number,
                            ROW_NUMBER() OVER (ORDER BY Number) AS recID
                    FROM    dbo.TallyNumbers
                    WHERE   Number BETWEEN 1 AND DATALENGTH(@Results)
                            AND SUBSTRING(@Results, Number, 1) = 'W'
                                -- Or <> 'L' for Undefeated (if t for "tie" is used)
                ) AS d
    GROUP BY    Number - recID
)

SELECT      s.GameDate,
            y.Streak
FROM        Yak AS y
INNER JOIN  (
                SELECT ROW_NUMBER() OVER (ORDER BY GameDate) AS recID,
                        GameDate
                FROM    @Sample
            ) AS s ON s.recID = y.Start
ORDER BY    s.GameDate

Print | posted on Wednesday, August 12, 2009 12:49 AM | Filed Under [ SQL Server 2008 Algorithms SQL Server 2005 ]

Feedback

Gravatar

# re: Another running streaks algorithm

Here's another idea to play with...

--
; with
a as (select GameDate, Result, row_number() over (order by GameDate) as r,
row_number() over (partition by Result order by GameDate) as pr
from @Sample)
, b as (select *, case when Result = 'W' then r-pr else pr-1 end as p1,
case when Result = 'W' then pr-1 else r-pr end as p2 from a)
select min(GameDate) as GameDateFrom, max(GameDate) as GameDateTo, max(p2)-min(p2)+1 as Streak
from b where Result = 'W' group by p1 order by 1
--/
8/13/2009 11:46 AM | Ryan Randall
Gravatar

# re: Another running streaks algorithm

Slightly more refined...

--
select Result, min(GameDate) as GameDateFrom, max(GameDate) as GameDateTo, count(*) as Streak
from (select *, case Result when 'W' then 1 else -1 end * (row_number() over (order by GameDate) -
row_number() over (partition by Result order by GameDate)) as p2 from @Sample) a
group by Result, p2 order by 2
--/
8/13/2009 12:36 PM | Ryan Randall
Gravatar

# re: Another running streaks algorithm

Now that's the Ryan I am used too!
8/13/2009 12:40 PM | Peso
Gravatar

# re: Another running streaks algorithm

Hehe - I realise now that this boils down to the same thing as http://www.sqlteam.com/article/detecting-runs-or-streaks-in-your-data, but using row_number() which wasn't available then.

--
select Result, min(GameDate) as GameDateFrom, max(GameDate) as GameDateTo, count(*) as Streak
from (select *, (row_number() over (order by GameDate) -
row_number() over (partition by Result order by GameDate)) as p2 from @Sample) a
group by Result, p2 order by 2
--/
8/13/2009 12:55 PM | Ryan Randall
Gravatar

# re: Another running streaks algorithm

> Now that's the Ryan I am used too!

Hehe - wish I'd have thought it through a bit more first though!
8/13/2009 12:58 PM | Ryan Randall
Gravatar

# re: Another running streaks algorithm

This algorithm posted by Ryan Randall is super. But has anyone modified it to detect runs for values less than some threshold?

Like temperature data, for example. We'd like to use this algorithm to detect runs of daily temperatures less than 32 degrees. But it's not a simple modification. Probably need to use a subquery that uses case to convert the temperature into High or Low, but I don't see how to do it. And very much need this algorithm.
5/4/2012 5:22 AM | empire3
Gravatar

# concord 11

Like temperature data, for example. We'd like to use this algorithm to detect runs of daily temperatures less than 32 degrees. But it's not a simple modification. Probably need to use a subquery that uses case to convert the temperature into High or Low, but I don't see how to do it. And very much need this algorithm.
6/27/2012 5:01 AM | concord 11
Gravatar

# re: Another running streaks algorithm

This is great for one team's data. Suppose however you have many teams over many years and you want to find the longest streak for each one?
8/14/2012 7:23 PM | WiseOldMan
Comments have been closed on this topic.

Powered by:
Powered By Subtext Powered By ASP.NET