Introduction
SQL is a set-based language, and often we wish to pass sets of data as a parameter to a stored procedure. For example, you might wish to pass a set of Customers to an invoice generating stored procedure, or a set of employees which for you’d like to calculate a particular bonus. Sometimes we need even more complex data than simple lists; you might have a time sheet application that lets a user create or edit a weekly time sheet at the client, and then you need to accept and validate an entire week of data when submitted.
A common way to pass large sets of data to a stored procedure is to use an
XML or CSV character string. This can work, but it can be complicated to encode and then decode the strings passed in, and you don’t have a clean way to validate the data as it is passed. For example, you cannot force the CSV to accept only a list of valid customer ID’s, or a particular data type. The CSV format can also get very complicated if there is a lot of data passed in with multiple columns, especially if the data might contain characters that need to be escaped and then un-escaped to work in a CSV format.
SQL Server 2000 and 2005 do have a table data type, but you cannot pass a table variable to a stored procedure as an argument, so that doesn’t work.
So, is there any easy was a way for a stored procedure to accept really complicated data, and for that data to be completely validated with foreign key constraints, strong data types, and possibly even CHECK constraints? That would certainly make things quicker and easier. Well, believe it or not, SQL Server has everything you need
already in place to accept large, complicated sets of data as parameters in a stored procedure!
A Simplistic Single User Approach
Let’s create a simple example: Suppose we have an application that tracks Customer data, and at the client application we need to set it up so that users can quickly edit the Status of many customers all at once, and then click SAVE to update the database. Of course, this example can easily be expanded to include other attributes of the customer as well, such as the customer Name or perhaps a customer type.
Our stored procedure will accept a set of rows with two columns: CustomerID and StatusID. Both CustomerID and StatusID must be valid entries in the Customers and Status tables, respectively.
What we can do is simply create a regular, permanent table that we can use to store the set of parameter values:
Create table Param_CustomerStatus
(
CustomerID int references Customers (CustomerID) primary key,
StatusID int references Status (StatusID) not null
)
I’ll call that table a
parameter table from now on. With that table in place, we can write our stored procedure that updates the customer status for all of those parameter values, like this:
create procedure UpdateCustomerStatus
as
update Customer
set StatusID = x.StatusID
from Customer
inner join Param_CustomerStatus x on
Customer.CustomerID = x.CustomerID
delete from Param_CustomerStatus
To call our procedure, all we need to do is load up the Param_CustomerStatus table with our values and then call the UpdateCustomerStatus stored procedure. Simple and efficient! We know that our stored procedure only “accepts” valid CustomerID and StatusID values, and we can quickly and efficiently use standard INSERT statements to load up our parameter table without worrying about XML or CSV strings and all the headaches that go with it.
To allow for a client application to call this stored procedure, we can allow the client to use
SQLBulkCopy or something similar to directly bulk load the CustomerID/StatusID values into this table, or we can create a stored procedure that clients can call over and over to add values to it:
create procedure Param_CustomerStatusAdd @CustomerID int, @StatusID int
as
insert into Param_CustomerStatus (CustomerID, StatusID)
values (@CustomerID, @StatusID)
With these components in place, you can essentially pass very complex data to any stored procedure, just by simply filling up the parameter table first.
The Problem
There’s a catch to this simplistic approach, of course, and I’m sure most of you already spotted it: What happens in a multi-user environment? Two users might be filling up the Param_CustomerStatus table at the same time, or one user might empty it out while another is trying to fill it, causing all sorts of havoc when the UpdateCustomerStatus procedure is run.
A More Robust Multi-User Solution
Here’s a quick and easy way to allow to this work for multi-user environments: Before loading parameter values into the parameter table, you ask for a unique
parameter key value. Once you have the parameter key, you include that value with your parameter values when inserting rows into the parameter table. Then, when you call the stored procedure, you simply pass in the parameter key again. The stored procedure will use the key when accessing and emptying out the parameter table.
Let’s create a simple, single ParameterKey table that all of our parameter tables will use:
create table ParameterKeys
(
ParameterKey int identity primary key,
DateCreated datetime not null
)
With this table in place, we create our new and improved multi-user Param_CustomerStatus table:
create table Param_CustomerStatus
(
ParameterKey int references ParameterKeys (ParameterKey) on delete cascade,
CustomerID int references Customers (CustomerID) on delete cascade,
StatusID int references Status (StatusID) on delete cascade,
Primary key (ParameterKey, CustomerID)
)
Notice that now the primary key of our parameter table is the combination of ParameterKey and CustomerID, allowing multiple users to put entries into this table at the same time without violating PK constraints, as long as they all have their own unique parameter key.
You might also notice that we use cascading deletes on the foreign key constraints to ensure that our parameter table never stops the database from being able to update or delete entities in our data.
Now that we have our table in place, a simple stored procedure lets you get a parameter key when you are ready to fill up a parameter table:
create procedure Param_GetKey @ParameterKey int output
as
delete from ParameterKeys where DateCreated < dateadd(day,-2, getdate())
insert into ParameterKeys (DateCreated) values (getdate())
set @ParameterKey = scope_identity()
Notice that this stored procedure also removes any old parameter key values and any associated data with it. This is a nice way to ensure that if something happens abruptly and a parameter table isn’t properly emptied, things stay nice and clean over time. The cascading delete constraint immediately removes all parameter data for any old keys, all at once, from any and all parameter tables in your database.
We also can create a stored procedure to delete the data for a particular parameter key, which should be called when you are done using the parameter values:
create procedure Param_DeleteKey @ParameterKey int
as
delete from ParameterKeys where ParameterKey = @ParameterKey
Finally, we then can simply create our UpdateCustomerStatus procedure like this:
create procedure UpdateCustomerStatus @ParameterKey int
as
update Customer
set StatusID = x.StatusID
from Customer
inner join Param_CustomerStatus x on
Customer.CustomerID = x.CustomerID
where
x.ParameterKey = @ParameterKey
exec Param_DeleteKey @ParameterKey
It’s as simple as that! Everything works fine in a multi-user environment, everything is quick and set-based, and you can truly pass very complicated data to stored procedures. Nothing is stopping you from having multiple parameter tables for multiple parameters (all related to the same ParameterKeys table) and you can even have a single stored procedure accept very complicated, multi-table parameter values. Best of all, everything is strongly-typed and has foreign key constraints that guarantees that your parameter values are valid.
To allow for clients to insert parameter values to our parameter table, we might also create a stored procedure that the client application can use:
create procedure Param_CustomerStatusAdd @ParameterKey int, @CustomerID int, @StatusID int
as
insert into Param_CustomerStatus (ParameterKey, CustomerID, StatusID)
values (@ParameterKey, @CustomerID, @StatusID)
Note that you should add some validation to the stored procedure to alert the client if there is a problem adding the parameter; for example, they might try to add the same customer twice, or insert a customer that doesn't exist, or they might not specify a valid StatusID. This is yet another great feature of this technique -- as you "prepare" your parameter data, it is also validated by the parameter table constraints!
Here’s a quick example, where we could use a complicated SQL statement to tell us which customers we need to mark as inactive:
declare @key int
exec GetParameterKey @key output
insert into Param_CustomerStatus (ParameterKey, CustomerID, StatusID)
select @key, CustomerID, StatusID
from
-- long, complicated select here --
exec UpdateCustomerStatus @key
Now, clearly this is a simple example – you’d never need to call such a basic stored procedure in this way from your T-SQL code. But perhaps changing the customer status is more than a simple UPDATE, and it involves updating multiple tables in the database, archiving data, sending notifications, and so on. This allows you to generate the customer parameter list any way you need in multiple places, but all you need is one single stored procedure that uses that list to do the job in a quick and efficient set-based manner.
A Reporting Example
Here’s perhaps a more realistic example: Suppose we have a reporting application that let’s the user choose which Customers to include on a report, along with their ordering, and also lets them assign a comment to each customer. The report then uses this criteria and information to display the results.
First, we need a parameter table for this report:
create table Param_CustomerReportFilter
(
ParameterKey int references ParameterKeys(ParameterKey) on delete cascade,
CustomerID int references Customers (CustomerID) on delete cascade,
Sort int not null,
Comment varchar(1000),
Primary key (ParameterKey, CustomerID)
)
Since a client will be setting the parameter values, we can create a stored procedure like this to add parameters to the Param_CustomerReportFilter table:
create procedure Param_CustomerReportFilterAdd
@ParameterKey int,
@CustomerID int,
@Sort int,
@Comment varchar(1000)
as
insert into Param_CustomerReportFilter (ParamterKey, CustomerID, Sort, Comment)
values (@ParameterKey, @CustomerID, @Sort, @Comment)
As before, you should add some error checking and a return value to this stored procedure to inform the client if any problems arose when preparing the parameter data.
Finally, our report will accept a @ParameterKey parameter and join to the Param_CustomerReportFilter table to create its results:
create procedure CustomerReport @ParameterKey int
as
select
c.*, f.Comment
from
Customers c
inner join
Param_CustomerReportFilter f on
c.CustomerId = f.CustomerId and
f.ParameterKey = @ParameterKey
order by
f.sort
exec Param_DeleteKey @ParameterKey
Our client application simply needs to keep track of the customers that the user wants to see on the report, along with the sort and any notes for each. When the user is done and the report is requested, you simply get a parameter key, load up the Param_CustomerReportFilter table using either SQLBulkCopy or the Param_CustomerReportFilterAdd stored procedure, and then pass in that key when you finally call the CustomerReport stored procedure to return the report data.
A Few Notes
Here’s few more things to think about if you wish to use this technique:
Transaction Logs: Transaction logs can be an issue if you are filling and deleting these parameter tables often. If you use a simple recovery model, this shouldn’t be an issue. You could move the parameter tables to an alternate database on your server if necessary.
Persistent Parameter Values: You don’t need to empty out the parameter tables each time a stored procedure is called that uses the parameter values; you could make this optional with an additional parameter value (i.e., @EmptyParams bit = 1), or you could just let the parameter table get cleaned out every day or so by the “auto-clean” feature of GetParameterKey. Persisting the parameter values could be useful if you need to use them over and over for subsequent procedure calls.
Performance: Once your parameter table is filled, performance will be excellent since it is just a simple, indexed JOIN to the parameter table to get things done. Filling up the parameter table by executing single row INSERT stored procedures over and over for each parameter value might not be efficient if it is expensive to call database commands many times from the client. As mentioned, you can increase the performance of this step by using SqlBulkCopy to bulk copy the parameter values to the parameter tables, bypassing the single row insert stored procedures completely.
Conclusion
Like many simple techniques, there’s nothing special or revolutionary about doing things this way – we are just leveraging existing, basic features of SQL Server to get the job done. No technique is perfect, but this is a pretty simple way to pass very complex data to a stored procedure that has a clear, clean, predefined structure and that can be validated in many different ways. There is also no need to worry about clients not formatting CSV or XML values consistently or properly, or that certain limits regarding parameter size or values will be reached that may make working with CSV or XML strings difficult.
(
6/27/2007 Update: fixed an issue with the UPDATE statements ... thanks Hugo!)
see also: