Thinking outside the box

Patron Saint of Lost Yaks
posts - 199, comments - 687, trackbacks - 4

My Links

Advertisement

News

Archives

Post Categories

Another running streaks algorithm

It has been some years since this article was posted http://www.sqlteam.com/article/detecting-runs-or-streaks-in-your-data and things has evolved since then. So I thought about using XML to solve the case. If the number of records are large, maybe this approach http://weblogs.sqlteam.com/peterl/archive/2009/08/12/Calculating-Running-Streak-over-many-records.aspx will be better for you?

Below, I am using the same sample data as the original.

DECLARE @Sample TABLE
        (
            GameDate DATETIME,
            Result CHAR(1)
        )

INSERT @Sample
SELECT '1/1/2000', 'W' UNION ALL
SELECT '1/12/2000', 'L' UNION ALL
SELECT '1/15/2000', 'W' UNION ALL
SELECT '1/17/2000', 'W' UNION ALL
SELECT '1/22/2000', 'W' UNION ALL
SELECT '2/1/2000', 'L' UNION ALL
SELECT '2/5/2000', 'W' UNION ALL
SELECT '2/8/2000', 'L' UNION ALL
SELECT '2/16/2000', 'W' UNION ALL
SELECT '2/19/2000', 'L' UNION ALL
SELECT '2/25/2000', 'L' UNION ALL
SELECT '2/28/2000', 'L' UNION ALL
SELECT '3/15/2000', 'L' UNION ALL
SELECT '3/19/2000', 'W' UNION ALL
SELECT '3/25/2000', 'W'

For this to work, you must have a Tally table present. A tally table is nothing more than a table having numbers sequentially stored, ranging from 1 (or 0) to about 100,000. I am using 100,000 and it's very rarely I have had use for a bigger tally numbers table. So, if you don't have a Tally numbers table, go create one.

And now on to the suggestion.

DECLARE @Results VARCHAR(MAX)

SET     @Results = (
                    SELECT      '' + Result
                    FROM        @Sample
                    ORDER BY    GameDate
                    FOR XML     PATH('')
                   )

;WITH Yak(Start, Streak)
AS (
    SELECT      MIN(Number) AS Start,
                COUNT(*) AS Streak
    FROM        (
                    SELECT  Number,
                            ROW_NUMBER() OVER (ORDER BY Number) AS recID
                    FROM    dbo.TallyNumbers
                    WHERE   Number BETWEEN 1 AND DATALENGTH(@Results)
                            AND SUBSTRING(@Results, Number, 1) = 'W'
                                -- Or <> 'L' for Undefeated (if t for "tie" is used)
                ) AS d
    GROUP BY    Number - recID
)

SELECT      s.GameDate,
            y.Streak
FROM        Yak AS y
INNER JOIN  (
                SELECT ROW_NUMBER() OVER (ORDER BY GameDate) AS recID,
                        GameDate
                FROM    @Sample
            ) AS s ON s.recID = y.Start
ORDER BY    s.GameDate

Print | posted on Wednesday, August 12, 2009 12:49 AM | Filed Under [ SQL Server 2008 Algorithms SQL Server 2005 ]

Feedback

Gravatar

# re: Another running streaks algorithm

Here's another idea to play with...

--
; with
a as (select GameDate, Result, row_number() over (order by GameDate) as r,
row_number() over (partition by Result order by GameDate) as pr
from @Sample)
, b as (select *, case when Result = 'W' then r-pr else pr-1 end as p1,
case when Result = 'W' then pr-1 else r-pr end as p2 from a)
select min(GameDate) as GameDateFrom, max(GameDate) as GameDateTo, max(p2)-min(p2)+1 as Streak
from b where Result = 'W' group by p1 order by 1
--/
8/13/2009 11:46 AM | Ryan Randall
Gravatar

# re: Another running streaks algorithm

Slightly more refined...

--
select Result, min(GameDate) as GameDateFrom, max(GameDate) as GameDateTo, count(*) as Streak
from (select *, case Result when 'W' then 1 else -1 end * (row_number() over (order by GameDate) -
row_number() over (partition by Result order by GameDate)) as p2 from @Sample) a
group by Result, p2 order by 2
--/
8/13/2009 12:36 PM | Ryan Randall
Gravatar

# re: Another running streaks algorithm

Now that's the Ryan I am used too!
8/13/2009 12:40 PM | Peso
Gravatar

# re: Another running streaks algorithm

Hehe - I realise now that this boils down to the same thing as http://www.sqlteam.com/article/detecting-runs-or-streaks-in-your-data, but using row_number() which wasn't available then.

--
select Result, min(GameDate) as GameDateFrom, max(GameDate) as GameDateTo, count(*) as Streak
from (select *, (row_number() over (order by GameDate) -
row_number() over (partition by Result order by GameDate)) as p2 from @Sample) a
group by Result, p2 order by 2
--/
8/13/2009 12:55 PM | Ryan Randall
Gravatar

# re: Another running streaks algorithm

> Now that's the Ryan I am used too!

Hehe - wish I'd have thought it through a bit more first though!
8/13/2009 12:58 PM | Ryan Randall

Post Comment

Title  
Name  
Email
Url
Comment   

Powered by:
Powered By Subtext Powered By ASP.NET