Deltas
It's all about the update...
EDIT: 9/9/2004: Well that wasn't much of an explanation. What the hell are deltas.....is it a place in Mississippi where they have great music and food? Well, yes....but deltas here are meant to describe a process, where by a database table has a set of data WITH a primary key, and a file is supplied daily with “updates“ (and I use that term loosely) that need to be applied to the table. Both the file and the table do not have to have the same structure, number of columns, datatypes and/or sizes...but the MUST have a key that is unique (on a flat file?...well yeah....).
1st. Load the file to a staging table. (make sure the PK is defined to make sure you're good to go)
2nd. You check to see what data in the staging table is not in the table. These are INSERTS
3rd. You need to see what's in that staging table that's missing in the table. These are DELETES.
4th. Lastly you need to see what is the same and what changes need to be applied. These are the UPDATES
Below is a method that I believe is the most effecient to do this.
Guess this could apply to any platform. Defenitley influenced by SQT Team...but I think I first saw this from the Dr., Jeff Smith.
http://www.sqlteam.com/forums/pop_profile.asp?mode=display&id=6859
Deletes and inserts would be index seek operation, but also benefit from the reduction in the number of look ups based on the delta derived table. The update should see huge performace gains.
USE Northwind
GO
SET NOCOUNT ON
GO
SELECT * FROM Employees
-- Create some test data
SELECT EmployeeID
,LastName
,FirstName
,Title
,TitleOfCourtesy
,BirthDate
,HireDate
,Address
,City
,Region
,PostalCode
,Country
,HomePhone
,Extension
,ReportsTo
,PhotoPath
INTO EMP1 FROM Employees -- Pretend this is the New File
SELECT EmployeeID
,LastName
,FirstName
,Title
,TitleOfCourtesy
,BirthDate
,HireDate
,Address
,City
,Region
,PostalCode
,Country
,HomePhone
,Extension
,ReportsTo
,PhotoPath
INTO EMP2 FROM Employees -- Pretend this is the database table that needs to be acted upon
GO
-- Create some changes
DELETE FROM EMP1 WHERE EmployeeID = 1
INSERT INTO EMP1 (LastName, FirstName) SELECT 'Kaiser','Brett'
UPDATE EMP1 SET LastName = 'White' WHERE EmployeeId = 4
GO
--Check For Deltas
SELECT * INTO DELTAS
FROM (
SELECT * FROM EMP1
UNION ALL
SELECT * FROM EMP2
) AS XXX
GROUP BY
EmployeeID
, LastName
, FirstName
, Title
, TitleOfCourtesy
, BirthDate
, HireDate
, Address
, City
, Region
, PostalCode
, Country
, HomePhone
, Extension
, ReportsTo
, PhotoPath
HAVING COUNT(*) = 1
-- Add New recocrds
INSERT INTO EMP2 (LastName, FirstName)
SELECT l.LastName, l.FirstName
FROM EMP1 l
LEFT JOIN EMP2 r
ON l.EmployeeId = r.EmployeeId
WHERE r.EmployeeID Is NULL
AND EXISTS (SELECT * FROM DELTAS d WHERE l.EmployeeId = d.EmployeeId)
-- Delete missing
DELETE FROM EMP2
WHERE EmployeeID IN (
SELECT r.EmployeeID
FROM EMP1 l
RIGHT JOIN EMP2 r
ON l.EmployeeId = r.EmployeeId
WHERE l.EmployeeID Is NULL
AND EXISTS (SELECT * FROM DELTAS d WHERE r.EmployeeId = d.EmployeeId))
-- Update the changes ....update all columns for simplicity
UPDATE r
SET r.LastName = l.Lastname
, r.FirstName = l.Firstname
FROM EMP1 l
INNER JOIN EMP2 r
ON l.EmployeeId = r.EmployeeId
INNER JOIN ( SELECT EmployeeId FROM DELTAS GROUP BY EmployeeId HAVING COUNT(*) =2) AS d
ON r.EmployeeId = d.EmployeeId
GO
-- Take a peek -- files should now match
SELECT 'NEW FILE' AS SOURCE, * FROM EMP1
UNION ALL
SELECT 'TARGET FILE' AS SOURCE, * FROM EMP2
ORDER BY EmployeeID
GO
-- Clean up this mess
SET NOCOUNT OFF
GO
DROP TABLE DELTAS
DROP TABLE EMP1
DROP TABLE EMP2
GO