Partitioning Automation - The Basics
Ok, let's start with the basic; again, remember that I assume that you have read the SQL Server 2005 Partitioned Tables and Indexes whitepaper. Another document on partitioning (which explaing more deeply the differences between LEFT and RIGHT partitioning) is also this one.
In any scenario where partitioning is used, if we want to use the partition swich trick we have to know in which filegroup the partition resides and the id of that partition. In order to make the partitioning switch work, in fact, we have to create the two support table in the same filegroup where the partition of the main table we have to change resides.
Confused? Well, let's make a simple example. Let's say we have a table called Orders which is partitioned over a RequestDate column, by month. Imagine that we have 12 filegroups (one for each month) namede FG1...FG12, and that, for any reason, we have to re-import, from our mainframe machine, all the orders of July.
With the aid of partitioning we can proceed with the following steps:
1) create a OldJulyOrders and NewJulyOrders table with the same schema of Orders (schema and indexes!)
2) fill NewJulyOrders with data coming from the mainframe
3) switch the partition in wich FG7 (July orders) resides to the OldJulyOrders table.
4) switch the table NewJulyOrders into the Order partition that contains July orders.
(If you want to all the scripts you need to run in order to run this scenario are in the next article, Partitioning Automation - Updating existing data).
As said before the "problem" is that the two New and Old tables must be created in the same filegroup of July’s order partition, which, in our case, is FG7.
So, how we can know the id of that partition? And, if we also don’t know the filegroup we have to use, how we can discover that information?
The answers, as you may already have guessed, are hidden in the system views; in particular, in these views:
sys.partition_functions
contains details for each partitioning function
sys.partition_range_values
contains data for each value range used in the partitiong functions
sys.destination_data_spaces
contains associations to map partitions to filegroups
sys.filegroups
contains filegroup (which is a data space) details
in addition there the $partition function, which returns the partition id in which our data will fall when applied on a partition:
select $partition.PF_DateRange('20050814') as boundary_id
where PF_Daterange is the partitioning function. In our case the results will be 8, since we have defined the following partitioning scheme and function:
create partition function PF_DateRange(datetime)
as
range left for values (
'20050131 23:59:59.997',
'20050228 23:59:59.997',
'20050331 23:59:59.997',
'20050430 23:59:59.997',
'20050531 23:59:59.997',
'20050630 23:59:59.997',
'20050731 23:59:59.997',
'20050831 23:59:59.997',
'20050930 23:59:59.997',
'20051031 23:59:59.997',
'20051130 23:59:59.997',
'20051231 23:59:59.997'
)
go
create partition scheme PS_DateRange
as
partition PF_DateRange
to
(
[FG1],
[FG2],
[FG3],
[FG4],
[FG5],
[FG6],
[FG7],
[FG8],
[FG9],
[FG10],
[FG11],
[FG12],
[PRIMARY]
)
go
create table Orders
(
OrderId int not null,
Description varchar(10) not null,
RequestDate datetime not null default(getdate())
) on PS_DateRange(RequestDate)
go
unfortunately the $partition function don’t give us the filegroup name, which we also need. No problem, here’s the sql script that do this:
select
prv.boundary_id,
prv.value,
fg.name
from
sys.partition_range_values prv
inner join
sys.partition_functions pf on prv.function_id = pf.function_id
inner join
sys.destination_data_spaces dds on dds.destination_id = prv.boundary_id
inner join
sys.filegroups fg on fg.data_space_id = dds.data_space_id
where
pf.name = 'PF_DateRange'
and
prv.boundary_id = $partition.PF_DateRange('20050814')
if you also what to see all your partitions along with their range value and the relative used filegroup, this is the script for you.
select
ps.name as PSName,
dds.destination_id as PartitionNumber,
fg.Name as FileGroupName,
prv.value as RangeValue
from
sys.indexes i
inner join
sys.partition_schemes ps on ps.data_space_id = i.data_space_id
inner join
sys.destination_data_spaces dds on dds.partition_scheme_id = ps.data_space_id
inner join
sys.filegroups fg on fg.data_space_id = dds.data_space_id
left outer join
sys.partition_range_values prv on prv.boundary_id = dds.destination_id
where
i.[object_id] = object_id('Orders')
And that's it! Now we can know the partition id (PartitionNumber column of the above script) and the filegroup name we have to use, in order to have the script in next articles working!
These scripts will be vital if you're going to implement a windowing scenario, since the mapping between Partition Number and Filegroup will change as soon as so begin to use merge and split functions over your partition (Eg.: The filegroup FG7 will be in the partition number 6 instead of 7). With the aid of these scripts we'll be able to get the right values anyway.
One final note: these scripts are taken from Kimberly’s whitepaper and I’ve just modified them a while to add some more data, so, again, forward your “thanks” to her!