SSIS Performance Framework Part 1
You can expect:
This is a medium intensity post (you should not get a migraine, but I make no promises)
I Expect (that you have the following experience):
- The ability to POINT and CLICK on your OWN (I won't hold your hand like a little girl. Get over it.)
- at lease SOME SSIS experience
- SSIS Logging
- Stored Procedure writing
Performance Measuring in SSIS
SSIS Logging lets you see what the heck is going on during run-time of your packages. You can see where failures occur and that is what MOST people are anxious to see. I am not most people and I want to see how fast my packages are running. You can see total execution time from an SQL Agent job, but that isn't enough! I want to know how EVERYTHING is performing. Armed with this information I know exactly what to focus on if things are not what I expect them to be. This post will show you how to set up logging to a SQL server and then USE that logging info for the forces of good.Next is the Details tab. These settings are package level settings and will propagate to all objects in the package. So, when you see the OnError event checked, this will automatically be checked for any object you go to configure the logging for.
CAST(RIGHT([MESSAGE], LEN([MESSAGE]) - CHARINDEX('INPUT : ', [MESSAGE]) - 7) AS INT)
CREATE PROCEDURE [DBO].[CHECKPROGRESS]
(
@EXECUTIONID AS UNIQUEIDENTIFIER,
@MESSAGE AS VARCHAR(100),
@GOAL AS INT,
@PROCESS VARCHAR(50)
)
AS BEGIN
SELECT PROCESS = @PROCESS, TIMEINSECS = DATEDIFF(SS, MIN(STARTTIME), MAX(ENDTIME)),
RECCOUNT = SUM(CAST(RIGHT([MESSAGE], LEN([MESSAGE]) -
CHARINDEX('INPUT : ', [MESSAGE]) - 7) AS INT)),
RECSPERSEC = SUM(CAST(RIGHT([MESSAGE], LEN([MESSAGE]) -
CHARINDEX('INPUT : ', [MESSAGE]) - 7) AS NUMERIC)) /
DATEDIFF(SS, MIN(STARTTIME), MAX(ENDTIME)),
RECSPERHOUR = (SUM(CAST(RIGHT([MESSAGE], LEN([MESSAGE]) -
CHARINDEX('INPUT : ', [MESSAGE]) - 7) AS NUMERIC)) /
DATEDIFF(SS, MIN(STARTTIME), MAX(ENDTIME))) * 60 * 60,
GOAL = @GOAL,
PERCENTCOMPLETE = (SUM(CAST(RIGHT([MESSAGE], LEN([MESSAGE]) -
CHARINDEX('INPUT : ', [MESSAGE]) - 7) AS NUMERIC)) /
@GOAL) * 100 ,
ESTIMATEDTOTALTIME = CAST(ROUND(DATEDIFF(SS, MIN(STARTTIME),
MAX(ENDTIME)) / (SUM(CAST(RIGHT([MESSAGE], LEN([MESSAGE]) -
CHARINDEX('INPUT : ', [MESSAGE]) - 7) AS NUMERIC)) /
@GOAL), 0) AS INT),
SERVERETA = DATEADD(SS, (CAST(ROUND(DATEDIFF(SS,
MIN(STARTTIME), MAX(ENDTIME)) / (SUM(CAST(RIGHT([MESSAGE],
LEN([MESSAGE]) - CHARINDEX('INPUT : ', [MESSAGE]) - 7)
AS NUMERIC)) / @GOAL), 0) AS INT) - DATEDIFF(SS,
MIN(STARTTIME), MAX(ENDTIME))), GETDATE()),
CURRENTSERVERTIME = GETDATE()
FROM SYSDTSLOG90
WHERE EXECUTIONID = @EXECUTIONID
AND [EVENT] = 'ONPIPELINEROWSSENT' AND [MESSAGE] LIKE '%' + @MESSAGE + '%'
END
Here is a driver for that stored proc that matches my demo package.
DECLARE @MAX INT
DECLARE @EXECID UNIQUEIDENTIFIER
SELECT @MAX = MAX(ID)
FROM SYSDTSLOG90
WHERE STARTTIME BETWEEN CONVERT(VARCHAR, GETDATE(), 101)
AND CONVERT(VARCHAR, DATEADD(DD, 1,GETDATE()), 101)
SELECT @EXECID = EXECUTIONID
FROM SYSDTSLOG90
WHERE ID = @MAX
DECLARE @RESULTS TABLE ( PROCESS VARCHAR(50),
TIMEINSECS INT,
RECCOUNT INT,
RECSPERSEC REAL,
RECSPERHOUR REAL,
GOAL INT,
PERCENTCOMPLETE REAL,
ESTIMATEDTOTALTIME INT,
SERVERETA DATETIME,
CURRENTSERVERTIME DATETIME )
INSERT INTO @RESULTS
EXEC CHECKPROGRESS @EXECID,
'LOOKUP THE DATA IN THE OTHER DEMO TABLE', 1, 'LOOKUP'
INSERT INTO @RESULTS
EXEC CHECKPROGRESS @EXECID,
'RECORDS FOUND', 1, 'RECORDS FOUND'
INSERT INTO @RESULTS
EXEC CHECKPROGRESS @EXECID,
'RECORDS NOT FOUND', 1, 'RECORDS NOT FOUND'
SELECT * FROM @RESULTS
If you run you package and then create the stored proc and then run the driver you should see some valuable results. I've used this exact stuff to measure performance day over day, month over month, etc. Keep in mind that there are other factors that can affect performance (such as some dimwit running a CURSOR on your server...), so sometimes you need to average runtimes. Armed with this data, you can march up to Excel and tell it, "Give me charts fool!". Charts make management giggly and thus you can defend your well written code when all of a sudden the Virtual your DB is running on decides to do something funny! (true story)
This post is the foundation for this performance series. We will use the sysdtslog90 table extensively to determine actual performance and not just anecdotal performance. The OnError stuff will be used in subsequent posts. It is there for a reason if you were wondering.
-Madman out-
Legacy Comments
Mosher
2008-07-13 |
re: SSIS Performance Framework Part 1 Great job. |
Damien Alvarado
2008-10-16 |
re: SSIS Performance Framework Part 1 Excellent post, I've also been reading about the SSIS API and when you inherit a mess (I'm a consultant) these tools can really point you in the right direction to get immediate impact. We implemented a custom audit function but I think we could have used the built in logging component as you described. I enjoy your writing style very much. It sounds like you wrote it while having a beer. This is a good thing. |
James Nix
2008-10-20 |
re: SSIS Performance Framework Part 1 If you need assistance, feel free to contact me. - Madman - |