What's New in DTS 2005
What's new in DTS 2005...that was the title of tonight's local SQL Server User Group meeting. We're a bit spoiled here in the Pacific Northwest, with our local SQL Server User Group meeting on the main Microsoft campus in Redmond. It's not too tough to get speakers straight from The Company. Tonight it was Donald Farmer, Group Program Manager, SQL Server - DTS. As he put it, the short answer to the question is: Everything.
Obviously I can't give you here in one blog post all that he covered in an hour-plus presentation (not to mention the demos) but I'll hit some of the points that stood out for me.
- DTS will be named something new in a couple of weeks. Donald didn't say what, and in fact said he didn't have any idea. That's probably something that's handled by the marketing group, or something like that. Just taking a stab in the dark, I would guess something like ETL Services. That seems to fit with the current mode of naming things related to SQL Server (Analysis Services, Reporting Services, ...). But that's just a WAG on my part.
- Once data is in a pipeline (dataflow) it is just a rowset and you can now use all rowset commands like merge joins (UNION ALL). You can also split data flows into different paths based on conditions, so this could allow you to split data based on certain criteria, massage each path differently, and merge it back together.
- The new DTS Designer, which is a Visual Studio IDE type environment called BI Studio, now has a debugger including stop points, and the ability to inspect variables, etc. There are also Data Viewers that let you see data points displayed in various ways (grid, histogram, scatter plot, etc.)
- For loops and For Each loops. Imagine “For Each file in this folder”. Or “For Each Database on this Server”. Or “For Each Table in this Database”. I assume, although he didn't say so, that these could be nested.
- Error Flows: What happens when an error occurs? Now you can, for example, redirect the row that generated the error into a separate flow. Maybe that flow just logs it into a table. Maybe it does some other processing and later merges it back into the main flow. It's up to you, but now errors don't have to just cause a task to fail, although that setting is still an option, too.
- Very fancy package configuration files where you can set a gazillion different settings including global variables or other content that would be dynamic between different servers such as connection strings.
- Checkpoint Restart: You can define checkpoints along the way, and if the package gets hung up for some reason, it can be restarted from the last checkpoint rather than all the way back from the beginning. Of course there are also transactions and something called sequences. I didn't get the full idea of those, but they sound like a way to group tasks so that they act like one transaction.
- Data Sampling Task: Set a percentage, such as 10% and the tasks after that work on just the sample of the data. Could be good to get a feel for validity of data before processing the whole set.
- Raw File Source: This is an output of data in DTS's “raw“ format. The demo showed extracting data from an original data source, which in theory could be a complex and time consuming process. Then there were two flows generated, one that stored everything in a Raw File Source, and the other that ran through a Data Sampling Task. Later, if the sample passes your standards, another portion of the package could then rapidly read the Raw File Source to retrieve the full data set for processing without having to do the costly extraction process again.
- There is a Package Migration Wizard to migrate DTS 2000 packages to DTS 2005, but as you would expect, anything that is really complex should be rewritten manually to use the most efficient features now available.
- Database Maintenance Plans are now DTS packages. I think this is a good thing. Maintenance Plans were a bit too much of a black box previously.
One other thing that Donald pointed out was that many DBAs use DTS not for moving data, but for establishing workflows. That makes sense for some maintenance things you'd want to do. I can see it even more with the new features. Also, Donald mentioned that some features such as Text Mining may be Enterprise level only, but they are looking to put a lot of power in the lower level versions. I think everything identified above will be standard.
I haven't done a lot of complex tasks in DTS up to this point, but damn this is looking good! Oh, and of course you can still do scripting tasks that now use VB.NET. Overall, I'd say they are on a good path to hit their goal of becoming truly an Enterprise-class ETL system, which is also why I think they'll pick the name above, to emphasize ETL.
Legacy Comments
Scott Allen
2004-09-09 |
re: What's New in DTS 2005 ETL Services just doesn't roll off the tounge... blah! |
AjarnMark
2004-09-09 |
re: What's New in DTS 2005 >> ETL Services just doesn't roll off the tounge... blah! Well, I never claimed to be a marketing guru. It was just the first thing that came to mind based on the other names they've been using. And Microsoft is known for taking common or industry jargon words and turning them into product names: Office, Word, Project, Publisher... So I'm expecting something with ETL or maybe an even more generic term if one applies. We shall soon see whether "rolling off the tongue" is an important requirement. |
Brett
2004-09-09 |
What's not new? What's the percentage? >>ETL? What do Helicopters have to do with anything? Also...have we already built most of this already? For every file in a folder? Wait a minute...let me open my sql tool box Work flow? For admins? Ain't those scheduled jobs? Error Handling? You didn't say they upgraded it. Is it the same? Don't tell me...and this is all in DTS, er ETL, right? |
Tara
2004-09-09 |
re: What's New in DTS 2005 I saw the ETL name on a MS presentation today. The slide was talking about DTS 2005. I can't remember if ETL was the new name or just an acronym on the slide. |
AjarnMark
2004-09-10 |
re: What's New in DTS 2005 Brett, have you written something that will iterate through every file in a folder on the operating system and then import it into your database? I know you can get lists of files into SQL Server, but then writing the logic to loop through that list, pass the filename to an import routine, and have that import routine import the rows it could while either logging the error rows to another table or re-processing them through another algorithm would be quite a task. I'm sure you could do it, but wouldn't it be nice to have a built-in looping task and built-in error redirection that still allows the successful rows to come in and let's you do a variety of other tasks on the error rows instead of just reporting success or failure? For Workflow, in DTS, tasks do not have to be sequential, but some can be whereas in a job aren't they always sequential? And if one step has an error in a job, your choices are either quit with notification or continue to the next step. What a pain. Maybe you'd like some notifications and a redirection of only the error rows to another process. Overall I'd say it's much more robust. Tara, MS is marketing it as an "Enterprise-level ETL tool" so we'll be seeing the ETL abbreviation a lot. It will be interesting to see what name they actually come up with. |
Thomas Williams
2004-09-13 |
re: What's New in DTS 2005 G'day AjarnMark, and thanks for the write-up on your SIG. I use DTS a fair bit, and these improvements sound pretty cool...I wonder if your man Donald Farmer would be interested in coming to Melbourne, Australia to present again? |
Thomas Williams
2004-09-13 |
re: What's New in DTS 2005 http://dotnetjunkies.com/WebLog/thomasswilliams/archive/2004/09/13/25374.aspx |
Mads
2006-05-04 |
re: What's New in DTS 2005 Use VB.NET with DTS, can you also use C#? |