Brett Kaiser (x002548) Blog

Not Just a Number - Brett Kaiser

Deltas

It's all about the update...

EDIT: 9/9/2004:  Well that wasn't much of an explanation.  What the hell are deltas.....is it a place in Mississippi where they have great music and food?  Well, yes....but deltas here are meant to describe a process, where by a database table has a set of data WITH a primary key, and a file is supplied daily with “updates“ (and I use that term loosely) that need to be applied to the table.  Both the file and the table do not have to have the same structure, number of columns, datatypes and/or sizes...but the MUST have a key that is unique (on a flat file?...well yeah....).

 

1st.  Load the file to a staging table.  (make sure the PK is defined to make sure you're good to go)

2nd.  You check to see what data in the staging table is not in the table.  These are INSERTS

3rd.  You need to see what's in that staging table that's missing in the table.  These are DELETES.

4th.  Lastly you need to see what is the same and what changes need to be applied.  These are the UPDATES

Below is a method that I believe is the most effecient to do this.

Guess this could apply to any platform.  Defenitley influenced by SQT Team...but I think I first saw this from the Dr., Jeff Smith.

http://www.sqlteam.com/forums/pop_profile.asp?mode=display&id=6859

Deletes and inserts would be index seek operation, but also benefit from the reduction in the number of look ups based on the delta derived table.  The update should see huge performace gains.

USE Northwind
GO

SET NOCOUNT ON
GO

SELECT * FROM Employees

-- Create some test data
SELECT EmployeeID
,LastName
,FirstName
,Title
,TitleOfCourtesy
,BirthDate
,HireDate
,Address
,City
,Region
,PostalCode
,Country
,HomePhone
,Extension
,ReportsTo
,PhotoPath
 INTO EMP1 FROM Employees  -- Pretend this is the New File

SELECT EmployeeID
,LastName
,FirstName
,Title
,TitleOfCourtesy
,BirthDate
,HireDate
,Address
,City
,Region
,PostalCode
,Country
,HomePhone
,Extension
,ReportsTo
,PhotoPath
 INTO EMP2 FROM Employees  -- Pretend this is the database table that needs to be acted upon
GO

-- Create some changes
DELETE FROM EMP1 WHERE EmployeeID = 1
INSERT INTO EMP1 (LastName, FirstName) SELECT 'Kaiser','Brett'
UPDATE EMP1 SET LastName = 'White' WHERE EmployeeId = 4
GO

--Check For Deltas
SELECT * INTO DELTAS
  FROM (
  SELECT * FROM EMP1 
      UNION ALL
  SELECT * FROM EMP2
 ) AS XXX
GROUP BY
   EmployeeID
 , LastName
 , FirstName
 , Title
 , TitleOfCourtesy
 , BirthDate
 , HireDate
 , Address
 , City
 , Region
 , PostalCode
 , Country
 , HomePhone
 , Extension
 , ReportsTo
 , PhotoPath
HAVING COUNT(*) = 1

-- Add New recocrds
INSERT INTO EMP2 (LastName, FirstName)
     SELECT l.LastName, l.FirstName
       FROM EMP1 l
  LEFT JOIN EMP2 r
  ON l.EmployeeId = r.EmployeeId
      WHERE r.EmployeeID Is NULL
        AND EXISTS (SELECT * FROM DELTAS d WHERE l.EmployeeId = d.EmployeeId)

-- Delete missing
DELETE FROM EMP2
WHERE EmployeeID IN (
    SELECT r.EmployeeID 
      FROM EMP1 l
RIGHT JOIN EMP2 r
 ON l.EmployeeId = r.EmployeeId
     WHERE l.EmployeeID Is NULL
       AND EXISTS (SELECT * FROM DELTAS d WHERE r.EmployeeId = d.EmployeeId))

-- Update the changes ....update all columns for simplicity
UPDATE r
  SET r.LastName  = l.Lastname
    , r.FirstName = l.Firstname
      FROM EMP1 l
INNER JOIN EMP2 r
 ON l.EmployeeId = r.EmployeeId
INNER JOIN ( SELECT EmployeeId FROM DELTAS GROUP BY EmployeeId HAVING COUNT(*) =2) AS d
 ON r.EmployeeId = d.EmployeeId
GO

-- Take a peek -- files should now match
   SELECT 'NEW FILE' AS SOURCE, * FROM EMP1
UNION ALL
   SELECT 'TARGET FILE' AS SOURCE, * FROM EMP2
 ORDER BY EmployeeID
GO

-- Clean up this mess
SET NOCOUNT OFF
GO

DROP TABLE DELTAS
DROP TABLE EMP1
DROP TABLE EMP2
GO