Peter Larsson Blog

Patron Saint of Lost Yaks

How to determine if you should use full or differential backup?

Or ask yourself, "How much of the database has changed since last backup?".
Here is a simple script that will tell you how much (in percent) have changed in the database since last backup.

-- Prepare staging table for all DBCC outputs
DECLARE @Sample TABLE
        (
            Col1 VARCHAR(MAX) NOT NULL,
            Col2 VARCHAR(MAX) NOT NULL,
            Col3 VARCHAR(MAX) NOT NULL,
            Col4 VARCHAR(MAX) NOT NULL,
            Col5 VARCHAR(MAX)
        )
 
-- Some intermediate variables for controlling loop
DECLARE @FileNum BIGINT = 1,
        @PageNum BIGINT = 6,
        @SQL VARCHAR(100),
        @Error INT,
        @DatabaseName SYSNAME = 'Yoda'
 
-- Loop all files to the very end
WHILE 1 = 1
    BEGIN
        BEGIN TRY
            -- Build the SQL string to execute
            SET     @SQL = 'DBCC PAGE(' + QUOTENAME(@DatabaseName) + ', ' + CAST(@FileNum AS VARCHAR(50)) + ', '
                            + CAST(@PageNum AS VARCHAR(50)) + ', 3) WITH TABLERESULTS'
 
            -- Insert the DBCC output in the staging table
            INSERT  @Sample
                    (
                        Col1,
                        Col2,
                        Col3,
                        Col4
                    )
            EXEC    (@SQL)
 
            -- DCM pages exists at an interval
            SET    @PageNum += 511232
        END TRY
 
        BEGIN CATCH
            -- If error and first DCM page does not exist, all files are read
            IF @PageNum = 6
                BREAK
            ELSE
                -- If no more DCM, increase filenum and start over
                SELECT  @FileNum += 1,
                        @PageNum = 6
        END CATCH
    END
 
-- Delete all records not related to diff information
DELETE
FROM    @Sample
WHERE   Col1 NOT LIKE 'DIFF%'
 
-- Split the range
UPDATE  @Sample
SET     Col5 = PARSENAME(REPLACE(Col3, ' - ', '.'), 1),
        Col3 = PARSENAME(REPLACE(Col3, ' - ', '.'), 2)
 
-- Remove last paranthesis
UPDATE  @Sample
SET     Col3 = RTRIM(REPLACE(Col3, ')', '')),
        Col5 = RTRIM(REPLACE(Col5, ')', ''))
 
-- Remove initial information about filenum
UPDATE  @Sample
SET     Col3 = SUBSTRING(Col3, CHARINDEX(':', Col3) + 1, 8000),
        Col5 = SUBSTRING(Col5, CHARINDEX(':', Col5) + 1, 8000)
 
-- Prepare data outtake
;WITH cteSource(Changed, [PageCount])
AS (
    SELECT      Changed,
                SUM(COALESCE(ToPage, FromPage) - FromPage + 1) AS [PageCount]
    FROM        (
                    SELECT CAST(Col3 AS INT) AS FromPage,
                            CAST(NULLIF(Col5, '') AS INT) AS ToPage,
                            LTRIM(Col4) AS Changed
                    FROM    @Sample
                ) AS d
    GROUP BY    Changed
    WITH ROLLUP
)
-- Present the final result
SELECT  COALESCE(Changed, 'TOTAL PAGES') AS Changed,
        [PageCount],
        100.E * [PageCount] / SUM(CASE WHEN Changed IS NULL THEN 0 ELSE [PageCount] END) OVER () AS Percentage
FROM    cteSource

Legacy Comments


Simon Sabin
2010-10-22
re: How to determine if you should use incremental or differential backup?
Don't you mean FULL or differential?

Peso
2010-10-22
re: How to determine if you should use incremental or differential backup?
I sure do. Will change title now...
Thank you Simon.

Paul S. Randa;
2010-10-23
re: How to determine if you should use full or differential backup?
Saw this in Steve's Database Weekly. Your script is using the wrong iterator - should be 511232 not 510232. And it will stop on the first offline file instead of going to the next online one. I suspect it will stop when it hits a log file too. See www.sqlskills.com/... for how to only iterate over online data files.

Thanks

Peso
2010-10-24
re: How to determine if you should use full or differential backup?
Wow! Your vesion is much better. Of course!

Bob
2010-10-25
re: How to determine if you should use full or differential backup?
So what is your decision point?
Is the intent to use this in a scheduled task that takes either a full or diff backup based on the percent changed?