SQL Server 2008: Change Data Capture (CDC)
UPDATE #3
Another question I have received in the feedback portion of this blog post could be summarized as follows:
"How does SQLCDC handle new columns being added to a source table being monitored/captured by SQLCDC?"
I'll first address the inverse of this question. How does SQLCDC handle columns that are removed from a source table? The awnser is that SQLCDC does the best it can by not altering the source table's corresponding Change Table schema so as to not break any downstream consuming applications/services. And thus any columns dropped from the source table remain in the change table with NULLs being inserted into the removed column going forward.
Now, moving on, in regard to this actual question, SQLCDC (for the same reasons cited above) does not alter a source table's corresponding change table when a new column gets added to a source table being captured by SQLCDC. To enable SQLCDC to capture the new column you simply need to create a second capture instance which will then capture the new column and it's data.
Note: A source table being captured by SQLCDC can only have up to 2 Capture Instances. If a source table incurs regular columns being added/removed you will need to recycle your capture instances.
--http://www.bivoyage.com
--http://weblogs.sqlteam.com/derekc
--Derek Comingore's Microsoft BI Blog Sample Code
--SQL CDC Blog Question Addressed: What is the process to get a new column tracked w/o losing the existing CDC table data?
USE [AdventureWorks]
GO
--Step 1: Add a New Column to the HumanResources.Employee Table
ALTER TABLE humanresources.employee
ADD [NewCol] INT;
GO
--A source table can only have 2 capture instances associated with it!
--Step 2: Create a new, second capture instance to facilitate the capture of change data for the new column
EXEC Sp_cdc_enable_table
humanresources ,
employee ,
'HumanResources_Employee_New' ,
1 ,
dbo
GO
--Step 3: Sample DML Statement
UPDATE humanresources.employee
SET newcol = 1
WHERE employeeid = 1
GO
--Step 4: Create a new, second UDF that uses the new capture instance to collect the new column's data
CREATE FUNCTION [dbo].[Udf_employee_newcol]
(@start_time DATETIME,
@end_time DATETIME)
RETURNS @Employee TABLE(employeeid INT,
nationalidnumber NVARCHAR(15),
contactid INT,
managerid INT,
title NVARCHAR(50),
birthdate DATETIME,
maritalstatus NCHAR(1),
gender NCHAR(1),
hiredate DATETIME,
salariedflag [FLAG],
vacationhours SMALLINT,
sickleavehours SMALLINT,
currentflag [FLAG],
rowguid UNIQUEIDENTIFIER,
[user] NVARCHAR(50),
[NewCol] INT,
cdc_operation VARCHAR(1))
AS
BEGIN
--declare local variables to hold LSNs
DECLARE @from_lsn BINARY(10),
@to_lsn BINARY(10)
--Map the time interval to a change data capture query range.
IF (@start_time IS NULL)
BEGIN
SELECT @from_lsn = sys.Fn_cdc_get_min_lsn('HumanResources_Employee_New')
END
ELSE
BEGIN
SELECT @from_lsn = sys.Fn_cdc_map_time_to_lsn('smallest greater than or equal',@start_time)
END
IF (@end_time IS NULL)
BEGIN
SELECT @to_lsn = sys.Fn_cdc_get_max_lsn()
END
ELSE
BEGIN
SELECT @to_lsn = sys.Fn_cdc_map_time_to_lsn('largest less than or equal',@end_time)
END
--if same then exit
IF (@from_lsn = sys.Fn_cdc_increment_lsn(@to_lsn))
BEGIN
RETURN
END
-- Query for change data
INSERT INTO @Employee
SELECT employeeid,
nationalidnumber,
contactid,
managerid,
title,
birthdate,
maritalstatus,
gender,
hiredate,
salariedflag,
vacationhours,
sickleavehours,
currentflag,
rowguid,
[user],
[NewCol],
CASE __$operation
WHEN 1
THEN 'D'
WHEN 2
THEN 'I'
WHEN 4
THEN 'U'
ELSE NULL
END AS cdc_operation
FROM cdc.Fn_cdc_get_net_changes_humanresources_employee_new(@from_lsn,@to_lsn,'all')
RETURN
END
GO
--Step 5: Query the new UDF for the Sample DML
SELECT *
FROM [dbo].[Udf_employee_newcol](NULL,NULL)
GO
--http://weblogs.sqlteam.com/derekc
--Derek Comingore's Microsoft BI Blog Sample Code
--SQL CDC Blog Question Addressed: What is the process to get a new column tracked w/o losing the existing CDC table data?
USE [AdventureWorks]
GO
--Step 1: Add a New Column to the HumanResources.Employee Table
ALTER TABLE humanresources.employee
ADD [NewCol] INT;
GO
--A source table can only have 2 capture instances associated with it!
--Step 2: Create a new, second capture instance to facilitate the capture of change data for the new column
EXEC Sp_cdc_enable_table
humanresources ,
employee ,
'HumanResources_Employee_New' ,
1 ,
dbo
GO
--Step 3: Sample DML Statement
UPDATE humanresources.employee
SET newcol = 1
WHERE employeeid = 1
GO
--Step 4: Create a new, second UDF that uses the new capture instance to collect the new column's data
CREATE FUNCTION [dbo].[Udf_employee_newcol]
(@start_time DATETIME,
@end_time DATETIME)
RETURNS @Employee TABLE(employeeid INT,
nationalidnumber NVARCHAR(15),
contactid INT,
managerid INT,
title NVARCHAR(50),
birthdate DATETIME,
maritalstatus NCHAR(1),
gender NCHAR(1),
hiredate DATETIME,
salariedflag [FLAG],
vacationhours SMALLINT,
sickleavehours SMALLINT,
currentflag [FLAG],
rowguid UNIQUEIDENTIFIER,
[user] NVARCHAR(50),
[NewCol] INT,
cdc_operation VARCHAR(1))
AS
BEGIN
--declare local variables to hold LSNs
DECLARE @from_lsn BINARY(10),
@to_lsn BINARY(10)
--Map the time interval to a change data capture query range.
IF (@start_time IS NULL)
BEGIN
SELECT @from_lsn = sys.Fn_cdc_get_min_lsn('HumanResources_Employee_New')
END
ELSE
BEGIN
SELECT @from_lsn = sys.Fn_cdc_map_time_to_lsn('smallest greater than or equal',@start_time)
END
IF (@end_time IS NULL)
BEGIN
SELECT @to_lsn = sys.Fn_cdc_get_max_lsn()
END
ELSE
BEGIN
SELECT @to_lsn = sys.Fn_cdc_map_time_to_lsn('largest less than or equal',@end_time)
END
--if same then exit
IF (@from_lsn = sys.Fn_cdc_increment_lsn(@to_lsn))
BEGIN
RETURN
END
-- Query for change data
INSERT INTO @Employee
SELECT employeeid,
nationalidnumber,
contactid,
managerid,
title,
birthdate,
maritalstatus,
gender,
hiredate,
salariedflag,
vacationhours,
sickleavehours,
currentflag,
rowguid,
[user],
[NewCol],
CASE __$operation
WHEN 1
THEN 'D'
WHEN 2
THEN 'I'
WHEN 4
THEN 'U'
ELSE NULL
END AS cdc_operation
FROM cdc.Fn_cdc_get_net_changes_humanresources_employee_new(@from_lsn,@to_lsn,'all')
RETURN
END
GO
--Step 5: Query the new UDF for the Sample DML
SELECT *
FROM [dbo].[Udf_employee_newcol](NULL,NULL)
GO
UPDATE #2
I have received several questions regarding how/if you can capture the user that commits DML statements to a change table. As I mentioned in my feedback for this post, there is no out-of-the-box solution to capture the user who commits the changes (DML) being captured, however you can build a custom solution such as the script below:
Hope this Helps!
Derek
--Derek Comingore's Microsoft BI Blog Sample Code
--www.bivoyage.com
--SQL CDC Blog Question Addressed: Is there a way to know who caused the change?
--set current database context
USE [AdventureWorks]
GO
--append new user column to the HumanResources.Employee Table
ALTER TABLE humanresources.employee
ADD [User] NVARCHAR(50)
GO
--disable any triggers
DISABLE Trigger ALL ON HumanResources.employee
GO
--enable CDC for AW
EXEC Sp_cdc_enable_db
GO
--enable CDC for HumanResources.Employee
EXEC Sp_cdc_enable_table
humanresources ,
employee ,
NULL ,
1 ,
dbo
GO
--create sample UDF to return change data
CREATE FUNCTION [dbo].[Udf_employee]
(@start_time DATETIME,
@end_time DATETIME)
RETURNS @Employee TABLE(employeeid INT,
nationalidnumber NVARCHAR(15),
contactid INT,
managerid INT,
title NVARCHAR(50),
birthdate DATETIME,
maritalstatus NCHAR(1),
gender NCHAR(1),
hiredate DATETIME,
salariedflag [FLAG],
vacationhours SMALLINT,
sickleavehours SMALLINT,
currentflag [FLAG],
rowguid UNIQUEIDENTIFIER,
[user] NVARCHAR(50),
cdc_operation VARCHAR(1))
AS
BEGIN
--declare local variables to hold LSNs
DECLARE @from_lsn BINARY(10),
@to_lsn BINARY(10)
--Map the time interval to a change data capture query range.
IF (@start_time IS NULL)
BEGIN
SELECT @from_lsn = sys.Fn_cdc_get_min_lsn('HumanResources_Employee')
END
ELSE
BEGIN
SELECT @from_lsn = sys.Fn_cdc_map_time_to_lsn('smallest greater than or equal',@start_time)
END
IF (@end_time IS NULL)
BEGIN
SELECT @to_lsn = sys.Fn_cdc_get_max_lsn()
END
ELSE
BEGIN
SELECT @to_lsn = sys.Fn_cdc_map_time_to_lsn('largest less than or equal',@end_time)
END
--if same then exit
IF (@from_lsn = sys.Fn_cdc_increment_lsn(@to_lsn))
BEGIN
RETURN
END
-- Query for change data
INSERT INTO @Employee
SELECT employeeid,
nationalidnumber,
contactid,
managerid,
title,
birthdate,
maritalstatus,
gender,
hiredate,
salariedflag,
vacationhours,
sickleavehours,
currentflag,
rowguid,
[user],
CASE __$operation
WHEN 1
THEN 'D'
WHEN 2
THEN 'I'
WHEN 4
THEN 'U'
ELSE NULL
END AS cdc_operation
FROM cdc.Fn_cdc_get_net_changes_humanresources_employee(@from_lsn,@to_lsn,'all')
RETURN
END
GO
--sample DML statement
UPDATE humanresources.employee
SET title = 'Marketing Director',
[User] = system_user
WHERE employeeid = 13
--get the net change of the sample DML including the User
SELECT *
FROM [dbo].[Udf_employee](NULL,NULL)
--www.bivoyage.com
--SQL CDC Blog Question Addressed: Is there a way to know who caused the change?
--set current database context
USE [AdventureWorks]
GO
--append new user column to the HumanResources.Employee Table
ALTER TABLE humanresources.employee
ADD [User] NVARCHAR(50)
GO
--disable any triggers
DISABLE Trigger ALL ON HumanResources.employee
GO
--enable CDC for AW
EXEC Sp_cdc_enable_db
GO
--enable CDC for HumanResources.Employee
EXEC Sp_cdc_enable_table
humanresources ,
employee ,
NULL ,
1 ,
dbo
GO
--create sample UDF to return change data
CREATE FUNCTION [dbo].[Udf_employee]
(@start_time DATETIME,
@end_time DATETIME)
RETURNS @Employee TABLE(employeeid INT,
nationalidnumber NVARCHAR(15),
contactid INT,
managerid INT,
title NVARCHAR(50),
birthdate DATETIME,
maritalstatus NCHAR(1),
gender NCHAR(1),
hiredate DATETIME,
salariedflag [FLAG],
vacationhours SMALLINT,
sickleavehours SMALLINT,
currentflag [FLAG],
rowguid UNIQUEIDENTIFIER,
[user] NVARCHAR(50),
cdc_operation VARCHAR(1))
AS
BEGIN
--declare local variables to hold LSNs
DECLARE @from_lsn BINARY(10),
@to_lsn BINARY(10)
--Map the time interval to a change data capture query range.
IF (@start_time IS NULL)
BEGIN
SELECT @from_lsn = sys.Fn_cdc_get_min_lsn('HumanResources_Employee')
END
ELSE
BEGIN
SELECT @from_lsn = sys.Fn_cdc_map_time_to_lsn('smallest greater than or equal',@start_time)
END
IF (@end_time IS NULL)
BEGIN
SELECT @to_lsn = sys.Fn_cdc_get_max_lsn()
END
ELSE
BEGIN
SELECT @to_lsn = sys.Fn_cdc_map_time_to_lsn('largest less than or equal',@end_time)
END
--if same then exit
IF (@from_lsn = sys.Fn_cdc_increment_lsn(@to_lsn))
BEGIN
RETURN
END
-- Query for change data
INSERT INTO @Employee
SELECT employeeid,
nationalidnumber,
contactid,
managerid,
title,
birthdate,
maritalstatus,
gender,
hiredate,
salariedflag,
vacationhours,
sickleavehours,
currentflag,
rowguid,
[user],
CASE __$operation
WHEN 1
THEN 'D'
WHEN 2
THEN 'I'
WHEN 4
THEN 'U'
ELSE NULL
END AS cdc_operation
FROM cdc.Fn_cdc_get_net_changes_humanresources_employee(@from_lsn,@to_lsn,'all')
RETURN
END
GO
--sample DML statement
UPDATE humanresources.employee
SET title = 'Marketing Director',
[User] = system_user
WHERE employeeid = 13
--get the net change of the sample DML including the User
SELECT *
FROM [dbo].[Udf_employee](NULL,NULL)
UPDATE #1
I received a question via email about "Where is the sys.sp_cdc_enable_table_change_data_capture_internal located?"...
The awnser is that its contained in the Resource database along with several other '_internal' CDC objects. If your curious about how to view the Resources database in sql 2005 (or 2008) you can see this article http://www.myitcommunity.com/articles/18/view.asp?id=9138 .
Cheers!
Derek
------------------------------------------------------------------------------------------------
Introduction
One of SQL Server 2008's biggest BI features being touted is Change Data Capture (CDC). CDC is basically suppose to be a built-in solution to the old-age practice in ETL solutions of identifying and using change identifiers columns in source systems. I have now spent a fair amount of time using this feature and more importantly how to leverage it inside of SSIS packages for incremental ETL solutions. My work here has been to prepare for an upcoming demonstration of CDC in SSIS. This post/Q&A is a brief summary of my findings thus far...
*CDC is being positioned as the 'design of choice' for SQL Server 2008+ OLTP database servers for exposing changed relational data for data warehousing consumption purposes.
What is Change Data Capture (CDC)?
CDC records (or captures) DML activity on designated tables. CDC works by scanning the transaction log for a designated table's 'captured columns' whose content has changed and then making those changes available for data syncronizing purposes in a relational format. As you can see this feature in entrenched in transaction log architecture and thus alot of the metadata in CDC is related around the concept of a Log Sequence Number (LSN).
So whats a LSN?
Here is the definition of a LSN per Books Online: "Every record in the Microsoft SQL Server transaction log is uniquely identified by a log sequence number (LSN). LSNs are ordered such that if LSN2 is greater than LSN1, the change described by the log record referred to by LSN2 occurred after the change described by the log record LSN. "
How do I get CDC?
CDC is a feature of SQL Server 2008 Enterprise, Developer, and Evaluation editions.
What are the target applications or consumers of the CDC technology?
ETL Solutions are the most common, however any data consuming application that requires syncronizing data could benefit from the technology.
Is CDC configurable via a UI or just TSQL?
As of this time, just TSQL.
How do you configure CDC?
- Enable CDC for a database
- Enables the current database via the USE statement *select is_cdc_enabled from sys.databases where [name] = 'AdventureWorks' to determine if DB is allready enabled *Also note that when you do this all of the below system objects get created in the selected database
- Enable CDC for a given table and it's selected columns
- Specify the captured table's schema, name, database role, capture instance name (defaults to schema_name), support net changes (bit, set it to 1 if you want both change data table-valued functions created), name of captured table's unique index, captured column list (null/default to all columns), filegroup for the change table (null/defaults to default filegroup)
- Query Change Data via 1 of 2 built in table-valued functions created during step #2
- For all changes (meaning a row is returned for each DML) use cdc.fn_cdc_get_all_changes_<capture_instance>
- For the net changes (meaning one row returned for each source row modified among 1 or more DMLs) use cdc.fn_cdc_get_net_changes_<capture_instance>
What are all of the CDC system objects available to me?
System Tables:
- cdc.captured_columns
- cdc.change_tables
- cdc.ddl_history
- cdc.index_columns
- cdc.lsn_time_mapping
- cdc.Schema_Name_CT (change tables) *this is just the default naming convention, configurable via the enable table sysproc
DMVs:
- sys.dm_cdc_log_scan_sessions
- sys.dm_repl_traninfo
- sys.dm_cdc_errors
System Stored Procedures:
- sys.sp_cdc_enabledb
- sys.sp_cdc_disabledb
- sys.sp_cdc_cleanup_change_table
- sys.sp_cdc_disable_db_change_data_capture
- sys.sp_cdc_disable_table_change_data_capture
- sys.sp_cdc_enable_db_change_data_capture
- sys.sp_cdc_enable_table_change_data_capture
- sys.sp_cdc_get_ddl_history
- sys.sp_cdc_get_captured_columns
- sys.sp_cdc_help_change_data_capture
System Functions:
- cdc.fn_cdc_get_all_changes_<capture_instance>
- cdc.fn_cdc_get_net_changes_<capture_instance>
- sys.fn_cdc_decrement_lsn
- sys.fn_cdc_get_column_ordinal ( 'capture_instance' , 'column_name' )
- sys.fn_cdc_get_max_lsn
- sys.fn_cdc_get_min_lsn
- sys.fn_cdc_has_column_changed
- sys.fn_cdc_increment_lsn
- sys.fn_cdc_is_bit_set
- sys.fn_cdc_map_lsn_to_time
- sys.fn_cdc_map_time_to_lsn
Do the change tables keep growing?
No, there is an automatic cleanup process that occurs every three days (and this is configurable). For more intense environments you can leverage the manual method using the system stored procedure: sys.sp_cdc_cleanup_change_table. When you execute this system procedure you specify the low LSN and any change records occuring before this point are removed and the start_lsn is set to the low LSN you specified.
How do you leverage CDC in SSIS Packages?
Books Online in CTP5 (November) actually has a sample package in the topic 'change data capture in integration services' and I found this to be a good starting point to build from. For my CDC/SSIS demo here is the Control/Data Flow I am using:
- Calculate Date Intervals (these will correspond to LSNs later) *also note that in both BOL and my own package we are using fixed intervals, in the real world this will be driven by a table solution which tells the SSIS package when the last successful execution occurs (starting point of next package iteration)
- Check is any data is available in the selected date/time interval. This is important because the rest of the package will fail if no data is ready. BOL recommends performing Thead.Sleep/WAITFORs here. I am not for demo purposes but its not a bad idea.
- Build the query via a SSIS variable *BOL states that SSIS packages cannot call the cdc.fn_cdc_getnet|all functions and must use a wrapper. Whether or not we end up being forced to do this, it is a good design practice, below is my custom function that SSIS calls passing in the start/end datetime values to get the actual change data in the data flow step below.
- Create a data flow task that executes the SSIS variable query (OLEDB source), and then splits the rows into via a conditional split based on the CDC_OPERATION column calculated in the function below.
CREATE FUNCTION [dbo].[Udf_promotion]
(@start_time DATETIME,
@end_time DATETIME)
RETURNS @Promotion TABLE(specialofferid INT,
DESCRIPTION NVARCHAR(255),
discountpct SMALLMONEY,
[Type] NVARCHAR(50),
category NVARCHAR(50),
startdate DATETIME,
enddate DATETIME,
minqty INT,
maxqty INT,
rowguid UNIQUEIDENTIFIER,
modifieddate DATETIME,
cdc_operation VARCHAR(1))
AS
BEGIN
--declare local variables to hold LSNs
DECLARE @from_lsn BINARY(10),
@to_lsn BINARY(10)
--Map the time interval to a change data capture query range.
IF (@start_time IS NULL)
BEGIN
SELECT @from_lsn = sys.Fn_cdc_get_min_lsn('Sales_SpecialOffer')
END
ELSE
BEGIN
SELECT @from_lsn = sys.Fn_cdc_map_time_to_lsn('smallest greater than or equal',@start_time)
END
IF (@end_time IS NULL)
BEGIN
SELECT @to_lsn = sys.Fn_cdc_get_max_lsn()
END
ELSE
BEGIN
SELECT @to_lsn = sys.Fn_cdc_map_time_to_lsn('largest less than or equal',@end_time)
END
--if same then exit
IF (@from_lsn = sys.Fn_cdc_increment_lsn(@to_lsn))
BEGIN
RETURN
END
-- Query for change data
INSERT INTO @Promotion
SELECT specialofferid,
DESCRIPTION,
discountpct,
[Type],
category,
startdate,
enddate,
minqty,
maxqty,
rowguid,
modifieddate,
CASE __$operation
WHEN 1
THEN 'D'
WHEN 2
THEN 'I'
WHEN 4
THEN 'U'
ELSE NULL
END AS cdc_operation
FROM cdc.Fn_cdc_get_net_changes_sales_specialoffer(@from_lsn,@to_lsn,'all')
RETURN
END
GO
(@start_time DATETIME,
@end_time DATETIME)
RETURNS @Promotion TABLE(specialofferid INT,
DESCRIPTION NVARCHAR(255),
discountpct SMALLMONEY,
[Type] NVARCHAR(50),
category NVARCHAR(50),
startdate DATETIME,
enddate DATETIME,
minqty INT,
maxqty INT,
rowguid UNIQUEIDENTIFIER,
modifieddate DATETIME,
cdc_operation VARCHAR(1))
AS
BEGIN
--declare local variables to hold LSNs
DECLARE @from_lsn BINARY(10),
@to_lsn BINARY(10)
--Map the time interval to a change data capture query range.
IF (@start_time IS NULL)
BEGIN
SELECT @from_lsn = sys.Fn_cdc_get_min_lsn('Sales_SpecialOffer')
END
ELSE
BEGIN
SELECT @from_lsn = sys.Fn_cdc_map_time_to_lsn('smallest greater than or equal',@start_time)
END
IF (@end_time IS NULL)
BEGIN
SELECT @to_lsn = sys.Fn_cdc_get_max_lsn()
END
ELSE
BEGIN
SELECT @to_lsn = sys.Fn_cdc_map_time_to_lsn('largest less than or equal',@end_time)
END
--if same then exit
IF (@from_lsn = sys.Fn_cdc_increment_lsn(@to_lsn))
BEGIN
RETURN
END
-- Query for change data
INSERT INTO @Promotion
SELECT specialofferid,
DESCRIPTION,
discountpct,
[Type],
category,
startdate,
enddate,
minqty,
maxqty,
rowguid,
modifieddate,
CASE __$operation
WHEN 1
THEN 'D'
WHEN 2
THEN 'I'
WHEN 4
THEN 'U'
ELSE NULL
END AS cdc_operation
FROM cdc.Fn_cdc_get_net_changes_sales_specialoffer(@from_lsn,@to_lsn,'all')
RETURN
END
GO
OK, but what if I just want to load the actual columns that changed (ie for UPDATEs)?
I have not created a demo that does this yet but I can tell you that the key to enable this is the metadata column returned from the cdc.fn_get_net|all functions, __$update_mask. Here is Books Online description of this metadata column:
"A bit mask with a bit corresponding to each captured column identified for the capture instance. This value has all defined bits set to 1 when __$operation = 1 or 2. When __$operation = 3 or 4, only those bits corresponding to columns that changed are set to 1."
*__$operation = 3|4 are the pre and post update operations
Overall Impression?
I'm quite impressed with the new CDC feature built into the SQL Server relational engine. It is fairly easy to configure and use. In terms of enhancements, I think the matter of replacing the typical 'ETL Run' table should be considered. For example, when we configure this new technology going forward to replace our old source 'change identifier' columns we will still need a solution (usually table-based) for tracking when our ETL Packages have last ran (both successful and failed). It would be nice if we could somehow configure the CDC technology to say 'this SSIS Package is the only consuming application, therefore purge the data in a transactional manner once I retrieve it (which we could do manual right now I believe)'. Basically, give the option of replacing the LSN filtering of records to just retrieve and purge to so I can rely on the fact that the change table only contains data I have yet to process. Also, CDC can only work with SQL Server. While this is an obvious constraint, it would be nice to overcome it since SQL Server BI Systems will typically query other types of RDBMS for their source system data and not just SQL Server.
Overall the feature is great, it replaces the need for source column change identifers, and it relocates the ETL querying load to a seperate change table in the source system. If your organization has a complete SQL Server based environment, CDC is a great candidate for your future ETL solutions.
Legacy Comments
SQLDev
2008-10-13 |
re: SQL Server 2008: Change Data Capture (CDC) How do we use CDC for related tables such as a master and detail where master having the PK and some basic details such as created by created on etc and the details having the entity specific attributes? How do I use the CDC data for a single record modification by joining the master and details CDC tables? Please help. |
Derek Comingore
2008-10-13 |
re: SQL Server 2008: Change Data Capture (CDC) Hi, CDC is only available in SQL Server 2008 Enterprise, Developer ,and Evaluation editions. Furthermore, CDC is enabled at two levels, first a database level and then on specific tables you identify via the sys.sp_cdc_enable_table command. sys.sp_cdc_enable_table [ @source_schema = ] 'source_schema', [ @source_name = ] 'source_name' , [ @role_name = ] 'role_name' [,[ @capture_instance = ] 'capture_instance' ] [,[ @supports_net_changes = ] supports_net_changes ] [,[ @index_name = ] 'index_name' ] [,[ @captured_column_list = ] 'captured_column_list' ] [,[ @filegroup_name = ] 'filegroup_name' ] [,[ @partition_switch = ] 'partition_switch' ] If you have a Master|Child detail tables you would have to enable both tables for CDC if you need to detect and consume the change data of both. |
Prasad
2008-10-14 |
re: SQL Server 2008: Change Data Capture (CDC) I liked the explanation. Thanks for putting this togehter. |
Automotive Software
2008-11-01 |
re: SQL Server 2008: Change Data Capture (CDC) This is a great enhancement to SQL that was otherwise always difficult to do. We did it with SQL 2005 via CLR triggers and it worked pretty well. |
m moen
2008-11-20 |
re: SQL Server 2008: Change Data Capture (CDC) Derek, This is a very thorough piece. Thanks for putting this together. |
yousef
2008-11-25 |
re: SQL Server 2008: Change Data Capture (CDC) Can I capture changes occuring on data sources other than SQL database? for example, Oracle data source>? |
Derek Comingore
2008-11-26 |
re: SQL Server 2008: Change Data Capture (CDC) No, SQL Server 2008 CDC is a feature of the relational database engine. In order to support change data capture on our systems either the vendor would have to build that feature of MSFT COULD potentially create a RDBMS-neutral capturing process. |
Durani
2009-02-26 |
re: SQL Server 2008: Change Data Capture (CDC) No wonder CDC is very helpfull, but how to find who did the changes ? Is there any way to find that info? |
Narasimha
2009-03-05 |
re: SQL Server 2008: Change Data Capture (CDC) If i want to use the CDC+SSIS as alternative solution against Transactional Replication, How can i send the initial data in the table to the target? Not sure the transactional log contains the entries of that table right from the begining.. |
rahul aggarwal
2009-03-06 |
re: SQL Server 2008: Change Data Capture (CDC) Hi, How can I change the schema and user for CDC. Because it creates its table in CDC schema. When ever I take the backup of this database. Backup doesn't include the schema, user and table that is created for the CDC, in the backup. Is there any way that I can take the whole backup or i can change the schema name for CDC. Thanks, |
Kevin
2009-05-26 |
re: SQL Server 2008: Change Data Capture (CDC) Since Yousef asked... Attunity provides a plug-in for SSIS for CDC to: Oracle, SQL Server 2000/2005, Mainframe and HP NonStop sources. Sets up in minutes and is easy to use. kevin.maguire@attunity.com |
T
2009-06-08 |
re: SQL Server 2008: Change Data Capture (CDC) Personally, I think CDC is a pile of crap. I have been trying to use it to track specific field changes on tables by placing a trigger under the CDC table. It runs for a while but then fails without telling which table it failed to load data into. My trigger may be in error but I am not sure which trigger failed. By looking at [dm_cdc_errors] how do you know what table it failed on? |
Ilona
2009-07-08 |
re: SQL Server 2008: Change Data Capture (CDC) What are the options besides Attunity in using SQL Server 2008 as the destination and sourcing CDC data from SQL 2005 and 2000. Maybe using replication? |
Derek
2009-07-08 |
re: SQL Server 2008: Change Data Capture (CDC) Hi Ilona! Attunity offers a 3rd party based CDC solution including support for SQL Server Integration Services (SSIS). The primary benefit of Attunity from a purely Microsoft perspective is that the solution supports pre-SQL Server 2008 data sources as opposed to the out-of-the-box SQL Server 2008 CDC technology which requires OLTP-based servers to run on SQL Server 2008. You are correct, in terms of purely Microsoft SQL Server technologies, replication is the next "closest" technology. You can also emulate CDC with SSIS packages continuously running as well as Service-Broker applications. Nice to Meet You! Derek SQL Server MVP |
Juan
2009-07-10 |
re: SQL Server 2008: Change Data Capture (CDC) Hi, nice work! Is there any way to know the user who made the change? Thanks! Juan |
bibi14
2009-07-17 |
re: SQL Server 2008: Change Data Capture (CDC) Hi Derek, Do you know if there is a way to update automatically the system tables when the tracked tables are modified (add/delete column) without "restarting" the tracking (and losing the information in the system tables) ?? Thanx |
Siva
2009-07-20 |
re: SQL Server 2008: Change Data Capture (CDC) Pretty decent work. Eager to know if there is a way to capture changes in a different database instead of having CDC tables in the same DB? Regards Shiv |
Chris
2009-07-31 |
re: SQL Server 2008: Change Data Capture (CDC) I would like to second the following questions which were previously asked but not answered yet: 1) Is there a way to know who caused the change? 2) Is there a way to point the CDC tables to a different database? 3) Once you enable CDC on a table and then modify that table (add/drop column for example), the resulting CDC table doesn't seem to reflect the change (I understand keeping the columns when they're deleted for history purpose, but new columns' data isn't tracked). What is the process to get the new column tracked w/o losing the existing CDC table data? |
Derek Comingore
2009-08-01 |
re: SQL Server 2008: Change Data Capture (CDC) Hi Chris and other Fellow BI Gurus I have done a bit of research and found that in regard to Chris's Question #1, there is no out-of-the-box solution with SQLCDC to find out who commits the DML statements, HOWEVER you can use a custom solution such as the script I'll post in the blog post itself. Hope this Helps, Derek |
Derek Comingore
2009-08-01 |
re: SQL Server 2008: Change Data Capture (CDC) Hi Chris and other Fellow BI Gurus In regard to Chris's Question #2, the sp_cdc_enable_DB and sp_cdc_enable_table system stored procedures enables change data capture for a designated table in the current database and therefore the answer is no, you cannot change (meaning relocating the CDC-related infrastructure to a seperate database). Hope this Helps, Derek |
Derek Comingore
2009-08-01 |
re: SQL Server 2008: Change Data Capture (CDC) Hi Chris and other Fellow BI Gurus In regard to Chris's Question #3, I have updated the blog post to address this very question (which is a good one:) ). Hope this Helps, Derek |
rani123
2009-08-19 |
re: SQL Server 2008: Change Data Capture (CDC) Can we track changes made to multiple tables in a single stored procedure.Such as a SP to create a new employee writes info to 3 different tables.Can we track these changes made in one transaction. |
Scott Hagan
2009-10-06 |
re: SQL Server 2008: Change Data Capture (CDC) There were a couple of questions above asking about what other options besides attunity to capute changes from SQL 2000 & SQL 2005 targetting SQL 2008. Informatica makes a premeir CDC application that works with Replication to capture changes (without altering the source schema or using triggers) and making them available for any target (MS, Oracle, Mainframe, webservices, etc). Check it out, there are lots of companies using it today, co-existing with standard replication & SSIS activities. Hope this helps... Scott Hagan Informatica Product Manager |
Malathi Mahadevan
2010-03-22 |
re: SQL Server 2008: Change Data Capture (CDC) Excellent article, thank you. Wanted to add two cents to what so many people are asking regarding the identity of user who made the change - CDC is not an auditing tool. To do that you have to turn on SQL Audit which does not capture data it will only say so and so logged in and ran whatever command. The results as i understand can go to event viewer, database or a variety of other destinations. http://msdn.microsoft.com/en-us/library/dd392015.aspx |
Used Car Dealership Management
2010-07-02 |
re: SQL Server 2008: Change Data Capture (CDC) This is a great update to SQL. |
fur boots
2010-10-05 |
re: SQL Server 2008: Change Data Capture (CDC) "How does SQLCDC handle new columns being added to a source table being monitored/captured by SQLCDC?" |
columbia jackets
2010-10-21 |
re: SQL Server 2008: Change Data Capture (CDC) If a source table incurs regular columns being added/removed you will need to recycle your capture instances. snow boots for women | columbia sportswear outlet | cheap mac makeup | the north face jackets womens snow boots | columbia sportswear | cheap makeup | cheap north face jackets |
Raj
2010-11-05 |
re: SQL Server 2008: Change Data Capture (CDC) Do we have any lessons learnt from CDC ? the Performance implications , workload on the server and delay caused in the source which is enabled for CDC. |
Kim
2010-11-22 |
re: SQL Server 2008: Change Data Capture (CDC) Can you use SQL 2008 CDC to replicate to an Oracle target? |
buyulove
2011-06-01 |
re: SQL Server 2008: Change Data Capture (CDC) cheap mac cosmetics wholesale mac eyeshadow wholesale mac foundation wholesale mac blush |
Samuel Jones
2011-08-26 |
re: SQL Server 2008: Change Data Capture (CDC) In order to track who made the changes, wouldn't it be easier to add the User column and a trigger on update and insert that would update the User column for the affected row with the USER_NAME() of the person who made the modification? Then the CDC would/should pull up who made the change as well... |
rjpagnognv
2012-01-23 |
kpllnknja qPfNwF fzxmxzbqliaw, [url=http://mwjojueqjqjz.com/]mwjojueqjqjz[/url], [link=http://cmthnjarhryr.com/]cmthnjarhryr[/link], http://ghzwchlwjsjj.com/ |
Soradukl
2012-01-31 |
latvqqkp softies jeux objets caches en ligne - jeux objets caches en ligne |
Igzklwaj
2012-01-31 |
ifjcza boq kostenlos jetztspielen - kostenlos jetztspielen |
Pqdlojts
2012-02-01 |
ekeksfrdq keane motorbikes sefulez.blogerpa.com deserves goyeyez.nothing4hosting.com marcone nonuraj.zxq.ro 387 bailando tacujor.getfreehosting.co.uk kptk pavement marihaz.exxl.pl 161 |
cmnnvai
2012-02-07 |
jtofyioz M5Zt0D djiwcgngsqgp, [url=http://rxuxjtczosys.com/]rxuxjtczosys[/url], [link=http://nusxmhvnnwiy.com/]nusxmhvnnwiy[/link], http://dtoutwqkkihg.com/ |
Jppruegd
2012-02-08 |
gvalgnkq leeks jeux - jeux |
Ltjorxru
2012-02-09 |
xyuebwt coritiba loans - loans |
Rashmi
2012-03-21 |
re: SQL Server 2008: Change Data Capture (CDC) Hi, I tried the examples that you have cited above but I could not get the appropriate results. I mean to say for the last query i.e. SELECT * FROM [dbo].[Udf_employee](NULL,NULL) I am not getting any results. It is all blank. Have I missed out anything. I followed your complete article but not able to achieve the results. Please suggest ! |
Juan
2012-03-27 |
re: SQL Server 2008: Change Data Capture (CDC) There is any example to see which column was change?? and with the clean up proccess what data is deleted?? My goal is archived changed data |
uigq
2012-09-13 |
re: SQL Server 2008: Change Data Capture (CDC) I followed your complete article but not able to achieve the results.this blog site |
Ganesh
2012-10-21 |
re: SQL Server 2008: Change Data Capture (CDC) can we enabled the cdc mechanisms two times for a sing table |