Davide Mauri Blog

Experiences with SQL Server

Partitioning Automation - The Basics

Ok, let's start with the basic; again, remember that I assume that you have read the SQL Server 2005 Partitioned Tables and Indexes whitepaper. Another document on partitioning (which explaing more deeply the differences between LEFT and RIGHT partitioning) is also this one.

In any scenario where partitioning is used, if we want to use the partition swich trick we have to know in which filegroup the partition resides and the id of that partition. In order to make the partitioning switch work, in fact, we have to create the two support table in the same filegroup where the partition of the main table we have to change resides.

Confused? Well, let's make a simple example. Let's say we have a table called Orders which is partitioned over a RequestDate column, by month. Imagine that we have 12 filegroups (one for each month) namede FG1...FG12, and that, for any reason, we have to re-import, from our mainframe machine, all the orders of July.
With the aid of partitioning we can proceed with the following steps:

1) create a OldJulyOrders and NewJulyOrders table with the same schema of Orders (schema and indexes!)

2) fill NewJulyOrders with data coming from the mainframe

3) switch the partition in wich FG7 (July orders) resides to the OldJulyOrders table.

4) switch the table NewJulyOrders into the Order partition that contains July orders.

(If you want to all the scripts you need to run in order to run this scenario are in the next article, Partitioning Automation - Updating existing data).

As said before the "problem" is that the two New and Old tables must be created in the same filegroup of July’s order partition, which, in our case, is FG7.

So, how we can know the id of that partition? And, if we also don’t know the filegroup we have to use, how we can discover that information?

The answers, as you may already have guessed, are hidden in the system views; in particular, in these views:

sys.partition_functions
            contains details for each partitioning function

sys.partition_range_values
            contains data for each value range used in the partitiong functions

sys.destination_data_spaces
            contains associations to map partitions to filegroups

sys.filegroups
            contains filegroup (which is a data space) details

in addition there the $partition function, which returns the partition id in which our data will fall when applied on a partition:

select $partition.PF_DateRange('20050814') as boundary_id

where PF_Daterange is the partitioning function. In our case the results will be 8, since we have defined the following partitioning scheme and function:

create partition function PF_DateRange(datetime)

as

range left for values (

      '20050131 23:59:59.997',

      '20050228 23:59:59.997',

      '20050331 23:59:59.997',

      '20050430 23:59:59.997',

      '20050531 23:59:59.997',

      '20050630 23:59:59.997',

      '20050731 23:59:59.997',

      '20050831 23:59:59.997',

      '20050930 23:59:59.997',

      '20051031 23:59:59.997',

      '20051130 23:59:59.997',

      '20051231 23:59:59.997'

)

go

 

create partition scheme PS_DateRange

as

partition PF_DateRange

to

(

      [FG1],

      [FG2],

      [FG3],

      [FG4],

      [FG5],

      [FG6],

      [FG7],

      [FG8],

      [FG9],

      [FG10],

      [FG11],

      [FG12],

      [PRIMARY]

)

go

 

create table Orders

(

      OrderId           int not null,

      Description       varchar(10) not null,

      RequestDate       datetime not null default(getdate())

) on PS_DateRange(RequestDate)

go

unfortunately the $partition function don’t give us the filegroup name, which we also need. No problem, here’s the sql script that do this:

select

      prv.boundary_id,

      prv.value,

      fg.name

from

      sys.partition_range_values prv

inner join

      sys.partition_functions pf on prv.function_id = pf.function_id

inner join

      sys.destination_data_spaces dds on  dds.destination_id = prv.boundary_id

inner join

      sys.filegroups fg on fg.data_space_id = dds.data_space_id

where

      pf.name = 'PF_DateRange'

and
      prv.boundary_id = $partition.PF_DateRange('20050814')

 

if you also what to see all your partitions along with their range value and the relative used filegroup, this is the script for you.

 

select

      ps.name as PSName,

      dds.destination_id as PartitionNumber,

      fg.Name as FileGroupName,

      prv.value as RangeValue

from

      sys.indexes i

inner join

      sys.partition_schemes ps on ps.data_space_id = i.data_space_id

inner join

      sys.destination_data_spaces dds on dds.partition_scheme_id = ps.data_space_id

inner join

      sys.filegroups fg on fg.data_space_id = dds.data_space_id

left outer join

      sys.partition_range_values prv on prv.boundary_id = dds.destination_id

where

      i.[object_id] = object_id('Orders')

 

And that's it! Now we can know the partition id (PartitionNumber column of the above script) and the filegroup name we have to use, in order to have the script in next articles working!

 

These scripts will be vital if you're going to implement a windowing scenario, since the mapping between Partition Number and Filegroup will change as soon as so begin to use merge and split functions over your partition (Eg.: The filegroup FG7 will be in the partition number 6 instead of 7). With the aid of these scripts we'll be able to get the right values anyway.

One final note: these scripts are taken from Kimberly’s whitepaper and I’ve just modified them a while to add some more data, so, again, forward your “thanks” to her!