Manipulate XML data with non-xml columns and not using variable

Some time ago, I displayed how to work with XML data when searching for data stored in a XML column.
Here Some XML search approaches and here Updated XML search (test case with variables).

Today I thought I should demonstrate how to insert and update XML columns with data from other columns and not using variables. Well, I do have some examples of using variables in here because I wanted to display the core difference.
And this is my first blog post (not counting the previous announcement) since I become a MVP, I thought this blog post should be about something necessary.

Much of the XML manipulations I have made by trials and errors, because I haven’t found a single good source of information about these things. If you do know of such information, please comment.

First, we create a sample table to hold some important data, like this


CREATE TABLE    #Sample
                (
                    RowID INT IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
                    MemberID INT,
                    MemberData XML
                )

INSERT  #Sample
SELECT  1, '<meta><customergroup>F</customergroup><mosaic>Young educated man</mosaic></meta>' UNION ALL
SELECT  2, '<meta><age>24</age></meta>'

CREATE PRIMARY XML INDEX IX_PrimaryXML ON #Sample(MemberData)

CREATE XML INDEX IX_Element ON #Sample(MemberData)
 USING XML INDEX IX_PrimaryXML FOR PATH


As you see, I also created the primary XML index and a secondary XML index on the XML column.
Now let’s see what is stored in the important table.


SELECT  MemberID,
        MemberData,
        MemberData.value('/meta[1]/customergroup[1]', 'VARCHAR(8)') AS CustomerGroup,
        MemberData.value('/meta[1]/mosaic[1]', 'VARCHAR(200)') AS Mosaic,
        MemberData.value('/meta[1]/age[1]', 'TINYINT') AS Age,
        MemberData.value('/meta[1]/zipcode[1]', 'VARCHAR(5)') AS ZipCode,
        MemberData.value('/meta[1]/status[1]', 'VARCHAR(15)') AS [Status]
FROM    #Sample


The exercise now is to add one or more elements to the XML column, both with using a variable but also, more importantly, using a column. Create a helper table like this below.


DECLARE @Sample TABLE
        (
            MemberID INT,
            ZipCode VARCHAR(5),
            Mosaic VARCHAR(200)
        )

INSERT  @Sample
SELECT  1, '26737', 'SQLTeam' UNION ALL
SELECT  2, '12345', 'Smart woman in the countryside'


When an element doesn’t already exist, the element is added to the XML column. Beware that an additional element is created if one alerady exists!


-- New element for all
UPDATE      s
SET         MemberData.modify('insert <zipcode>{sql:column("x.ZipCode")}</zipcode> into (/meta)[1]')
FROM        #Sample AS s
INNER JOIN  @Sample AS x ON x.MemberID = s.MemberID


Now do the SELECT again and you will see that a zipcode element was added to both records.


SELECT  MemberID,
        MemberData,
        MemberData.value('/meta[1]/customergroup[1]', 'VARCHAR(8)') AS CustomerGroup,
        MemberData.value('/meta[1]/mosaic[1]', 'VARCHAR(200)') AS Mosaic,
        MemberData.value('/meta[1]/age[1]', 'TINYINT') AS Age,
        MemberData.value('/meta[1]/zipcode[1]', 'VARCHAR(5)') AS ZipCode,
        MemberData.value('/meta[1]/status[1]', 'VARCHAR(15)') AS [Status]
FROM    #Sample


As I said before, if an element already exist a new elemtent with same name is added. This behaviour can be avoided by using a WHERE MemberData.exists clause. Do the updates first since it only touches existing records with matching elements and then do the inserts with the MemberData.exists clause.


-- Update existing element
UPDATE      s
SET         MemberData.modify('replace value of (/meta/mosaic/text())[1] with sql:column("x.Mosaic")')
FROM        #Sample AS s
INNER JOIN  @Sample AS x ON x.MemberID = s.MemberID

-- Insert new element. Beware of duplicates.
UPDATE      s
SET         MemberData.modify('insert <mosaic>{sql:column("x.Mosaic")}</mosaic> into (/meta)[1]')
FROM        #Sample AS s
INNER JOIN  @Sample AS x ON x.MemberID = s.MemberID


Now do the SELECT again and you will see that the element mosaic was updated for member 1 and added to member 2. This is also true for member 1 because now this member has two elements with same name. 


SELECT  MemberID,
        MemberData,
        MemberData.value('/meta[1]/customergroup[1]', 'VARCHAR(8)') AS CustomerGroup,
        MemberData.value('/meta[1]/mosaic[1]', 'VARCHAR(200)') AS Mosaic,
        MemberData.value('/meta[1]/age[1]', 'TINYINT') AS Age,
        MemberData.value('/meta[1]/zipcode[1]', 'VARCHAR(5)') AS ZipCode,
        MemberData.value('/meta[1]/status[1]', 'VARCHAR(15)') AS [Status]
FROM    #Sample


And this was my reason for this blog post. How to add and replace elements with values from another column. So why not display how to do this with variables too?


DECLARE
@Status VARCHAR(18)

SET         @Status = 'Shipped'

-- Update single record without prior element
UPDATE      #Sample
SET         MemberData.modify('replace value of (/meta/status/text())[1] with sql:variable("@Status")')
WHERE       MemberID = 1

-- Create new element for single record
UPDATE      #Sample
SET         MemberData.modify('insert <status>{sql:variable("@Status")}</status> as last into (/meta)[1]')
WHERE       MemberID = 2


As you can see now, same rules apply. Member 1 did not get a new Status element since we wanted to update previous value. But member 2 did get a new element.


SELECT  MemberID,
        MemberData,
        MemberData.value('/meta[1]/customergroup[1]', 'VARCHAR(8)') AS CustomerGroup,
        MemberData.value('/meta[1]/mosaic[1]', 'VARCHAR(200)') AS Mosaic,
        MemberData.value('/meta[1]/age[1]', 'TINYINT') AS Age,
        MemberData.value('/meta[1]/zipcode[1]', 'VARCHAR(5)') AS ZipCode,
        MemberData.value('/meta[1]/status[1]', 'VARCHAR(15)') AS [Status]
FROM    #Sample


Don’t forget to do your housecleaning and drop the important table.


DROP TABLE  #Sample


Besides doing update and insert, you can also use the delete syntax to remove elements if you jus don’t to clear them.
I hope you liked this post.

Cheers!

Microsoft MVP

Today it happened. I received the Microsoft MVP Award for my contributions to the Microsoft SQL Server community.
I am very honored by the award and I will continue to work hard for the community to keep their trust in me.

I will continue to update this blog and help users out in the SQLTeam forums, SqlServerCentral forums, SQL Server Magazine forums and SQL Server Developer Center forums among a few other.

Thank you to everyone who reads this blog and leave comments. I still learn something new every day.


Best Regards,
Peter Larsson

Microsoft Connect - Enhanced Syntax For Insert Into Statement

I've posted a feedback on Microsoft Connect about how to enhance the INSERT INTO syntax.
Especially for INSERT INTO ... EXEC ...

Sometimes when you need the result from a stored procedure, the SP itself returns two (or more) resultsets.
And it's only possibly to fetch and store the first resultset.

What I have suggested is an enhanced syntax for INSERT INTO ... EXEC, like this

INSERT INTO Table1 (Col1, Col2), Table2 (ColX, ColY, ColZ)
EXEC usp_MyStoredProcedure @Param1, @Param2

In this example, usp_MyStoredProcedure returns three resultsets, of which I want to store the two first.
First resultset has two columns, and second resultset has three columns.

Let Microsoft know what you think about this suggestion.
https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=470881

Microsoft Connect - SSMS Debugger Issue

I've found an issue with the Debugger for SQL Server 2008 Management Studio a while ago.
This is my way to ask you to endorse a fix
https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=374183

Please let Microsoft know what you think about this suggestion.

Extended Get Nth Weekday of period

Recently I posted a function which returned the Nth weekday of a month, either from the beginning of month or from the end of month. Function is found here http://weblogs.sqlteam.com/peterl/archive/2009/06/17/How-to-get-the-Nth-weekday-of-a-month.aspx

I have fiddled around with it and have now extended the function to find the Nth weekday not only for a month, but also for a quarter or a year. Below is the function.

CREATE FUNCTION dbo.fnGetNthWeekdayOfPeriod
(
    @theDate DATETIME,
    @theWeekday TINYINT,
    @theNth SMALLINT,
    @theType CHAR(1)
)
RETURNS DATETIME
BEGIN
    RETURN (
                SELECT theDate
                FROM    (
                            SELECT DATEADD(DAY, theDelta +(@theWeekday + 6 - DATEDIFF(DAY, '17530101', theFirst) % 7) % 7, theFirst) AS theDate
                            FROM    (
                                        SELECT CASE UPPER(@theType)
                                                    WHEN 'M' THEN DATEADD(MONTH, DATEDIFF(MONTH, @theNth, @theDate), '19000101')
                                                    WHEN 'Q' THEN DATEADD(QUARTER, DATEDIFF(QUARTER, @theNth, @theDate), '19000101')
                                                    WHEN 'Y' THEN DATEADD(YEAR, DATEDIFF(YEAR, @theNth, @theDate), '19000101')
                                                END AS theFirst,
                                                CASE SIGN(@theNth)
                                                    WHEN -1 THEN 7 * @theNth
                                                    WHEN 1 THEN 7 * @theNth - 7
                                                END AS theDelta
                                        WHERE   @theWeekday BETWEEN 1 AND 7
                                                AND (
                                                        @theNth BETWEEN -5 AND 5
                                                        AND UPPER(@theType) = 'M'
                                                    OR
                                                        @theNth BETWEEN -14 AND 14
                                                        AND UPPER(@theType) = 'Q'
                                                    OR
                                                        @theNth BETWEEN -53 AND 53
                                                        AND UPPER(@theType) = 'Y'
                                                )
                                                AND @theNth <> 0
                                    ) AS d
                        ) AS d
                WHERE   CASE UPPER(@theType)
                            WHEN 'M' THEN DATEDIFF(MONTH, theDate, @theDate) 
                            WHEN 'Q' THEN DATEDIFF(QUARTER, theDate, @theDate) 
                            WHEN 'Y' THEN DATEDIFF(YEAR, theDate, @theDate) 
                        END = 0
            )
END

How to get the Weekday and Nth from a date

You call this function with a date. The function returns a table with one record and 3 columns.
First column is Weekday; Monday = 1, Tuesday = 2, Wednesday = 3; Thursday = 4, Friday = 5, Saturday = 6 and Sunday = 7.
Second column is number of occurencies of that date since beginning of selected period type.
Third columns is number of occurencies left of that period type.

CREATE FUNCTION dbo.fnGetWeekdayAndNths
(
    @theDate DATETIME,
    @theType CHAR(1)
)
RETURNS TABLE
AS
RETURN (   SELECT 1 + DATEDIFF(DAY, -53690, @theDate) % 7 AS theWeekday,
                    1 +(theDelta - 1) / 7 AS Beginning,
                    DATEDIFF(DAY, DATEADD(DAY, -1, thePeriod), @theDate) / 7 - 1 AS Ending
            FROM    (
                        SELECT CASE UPPER(@theType)
                                    WHEN 'M' THEN DATEADD(MONTH, DATEDIFF(MONTH, -53690, @theDate), -53659)
                                    WHEN 'Q' THEN DATEADD(QUARTER, DATEDIFF(QUARTER, -53690, @theDate), -53600)
                                    WHEN 'Y' THEN DATEADD(YEAR, DATEDIFF(YEAR, -53690, @theDate), -53325)
                                END AS thePeriod,
                                CASE UPPER(@theType)
                                    WHEN 'M' THEN DATEPART(DAY, @theDate)
                                    WHEN 'Q' THEN DATEDIFF(DAY, DATEADD(QUARTER, DATEDIFF(QUARTER, 0, @theDate), 0), DATEADD(QUARTER, DATEDIFF(QUARTER, -53690, @theDate), -53600))
                                    WHEN 'Y' THEN DATEPART(DAYOFYEAR, @theDate)
                                END AS theDelta
                    ) AS d
            WHERE   UPPER(@theType) IN('Y', 'Q', 'M')
        )

How to get the Nth weekday of a month

You call this function with three parameters:

1. Any date of the month in question
2. The weekday to calculate; Monday 1, Tuesday 2, Wednesday 3, Thursday 4, Friday 5, Saturday 6 and Sunday 7
3. The choice of weekday count; a positive number means from the beginning of month and a negative number means from the end of month

If a valid date cannot be calculated, NULL is returned. For an extended version which, besides month, also handles quarter and year, see new blog post here
http://weblogs.sqlteam.com/peterl/archive/2009/06/18/Extended-Get-Nth-Weekday-of-period.aspx

CREATE FUNCTION dbo.fnGetNthWeekdayOfMonth
(
    @theDate DATETIME
    @theWeekday TINYINT,
    @theNth SMALLINT
)
RETURNS DATETIME
BEGIN
    RETURN  (
                SELECT  theDate
                FROM    (
                            SELECT  DATEADD(DAY, 7 * @theNth - 7 * SIGN(SIGN(@theNth) + 1) +(@theWeekday + 6 - DATEDIFF(DAY, '17530101', DATEADD(MONTH, DATEDIFF(MONTH, @theNth, @theDate), '19000101')) % 7) % 7, DATEADD(MONTH, DATEDIFF(MONTH, @theNth, @theDate), '19000101')) AS theDate
                            WHERE   @theWeekday BETWEEN 1 AND 7
                                    AND @theNth IN (-5, -4, -3, -2, -1, 1, 2, 3, 4, 5)
                        ) AS d
                WHERE   DATEDIFF(MONTH, theDate, @theDate) = 0
            )
END

Timings of different techniques for finding missing records

I did some tests today to measure the different approaches for finding records present in one table but not in another. The results of CPU and Duration are presented below with some help from SQL Profiler.
Number of reads are equal between methods but different depending how many record in #TableB.

If there are other methods I haven't included, please let me know.


Method      TableA   TableB   CPU  Duration
----------  -------  -------  ---  --------
GROUP BY    1000000  1000000  748       754
LEFT JOIN   1000000  1000000  328       321
NOT EXISTS  1000000  1000000  265       288
NOT IN      1000000  1000000  296       293
EXCEPT      1000000  1000000  312       288
GROUP BY    1000000   500000  577      2984
LEFT JOIN   1000000   500000  328      2930
NOT EXISTS  1000000   500000  187      2991
NOT IN      1000000   500000  312      2861
EXCEPT      1000000   500000  234      2882
GROUP BY    1000000        0  453      5866
LEFT JOIN   1000000        0  280      5791
NOT EXISTS  1000000        0  125      5855
NOT IN      1000000        0  250      5825
EXCEPT      1000000        0  234      5798

Here is the code for testing

CREATE TABLE     #TableA
                 (
                          i INT PRIMARY KEY CLUSTERED
                 )
 
INSERT   #TableA
SELECT   Number
FROM     dbo.F_TABLE_NUMBER_RANGE(0, 999999)
ORDER BY Number
 
CREATE TABLE     #TableB
                 (
                          i INT PRIMARY KEY CLUSTERED
                 )
 
INSERT   #TableB
SELECT   Number
FROM     dbo.F_TABLE_NUMBER_RANGE(0, 999999)
ORDER BY Number
 
DELETE   f
FROM     (
                 SELECT   TOP 500000
                          i
                 FROM     #TableB
                 ORDER BY NEWID()
         ) AS f
 
-- GROUP BY
SELECT    i
FROM      (
              SELECT   0 AS s, i FROM #TableA UNION ALL
              SELECT   1, i FROM #TableB
          ) AS d
GROUP BY  i
HAVING    MAX(s) = 0
 
-- LEFT JOIN
SELECT      a.i
FROM        #TableA AS a
LEFT JOIN   #TableB AS b ON b.i = a.i
WHERE       b.i IS NULL
 
-- NOT EXISTS
SELECT   a.i
FROM     #TableA AS a
WHERE    NOT EXISTS (SELECT b.i FROM #TableB AS b WHERE b.i = a.i)
 
-- NOT IN
SELECT   a.i
FROM     #TableA AS a
WHERE    a.i NOT IN (SELECT b.i FROM #TableB AS b)

-- EXCEPT
SELECT i FROM #TableA
EXCEPT
SELECT i FROM #TableB
 
DROP TABLE    #TableA,
              #TableB

Extract XML structure automatically, part 2

For some time ago, I posted an algorithm how to get the XML structure automatically. Today I stumbled across another approach which seems to be faster. Reservations though I haven't tested this against large xml data yet.

However, the previous algorithm relied on a WHILE loop here
http://weblogs.sqlteam.com/peterl/archive/2009/03/05/Extract-XML-structure-automatically.aspx

but this new algorithm doesn't. It's all xml internal thingies going on.

DECLARE     @Nodes TABLE
            (
                         NodeID INT IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
                         ParentNodeName NVARCHAR(64),
                         NodeName NVARCHAR(64)
            )

DECLARE @Data XML

SET @Data = '
<root>
        <elementGroup>
                        <element>
                                     <stuff>
                                                 <comment>Stuff comment</comment>
                                     </stuff>
                                     <comment>Element comment</comment>
                        </element>
                        <comment>Element group comment</comment>
            </elementGroup>
            <comment>Root comment</comment>
</root>'

INSERT      @Nodes
            (
                         ParentNodeName,
                         NodeName
            )
SELECT      e.value('local-name(..)[1]', 'VARCHAR(MAX)') AS ParentNodeName,
            e.value('local-name(.)[1]', 'VARCHAR(MAX)') AS NodeName
FROM        @data.nodes('//*[local-name(.) > ""]') AS n(e)

;WITH Yak(NodeLevel, RootName, ElementName, NodeID, NodePath)
AS (
            SELECT     0,
                       ParentNodeName,
                       NodeName,
                       NodeID,
                       CAST(NodeID AS VARCHAR(MAX))
            FROM       @Nodes
            WHERE      ParentNodeName = ''

            UNION ALL

            SELECT     y.NodeLevel + 1,
                       n.ParentNodeName,
                       n.NodeName,
                       n.NodeID,
                       y.NodePath + ';' + CAST(n.NodeID AS VARCHAR(MAX))
            FROM       @Nodes AS n
            INNER JOIN Yak AS y ON y.ElementName = n.ParentNodeName
)

SELECT   RootName,
         REPLICATE('    ', NodeLevel) + ElementName AS ElementName,
         ROW_NUMBER() OVER (PARTITION BY RootName ORDER BY ElementName) AS SortedByElementName,
         ROW_NUMBER() OVER (PARTITION BY RootName ORDER BY NodeID) AS SortedByPresence
FROM     Yak
ORDER BY NodePath

I am proud and humble

A few months back, Adam Machanic launched a competition about "Grouped string concatenenation" here http://sqlblog.com/blogs/adam_machanic/archive/2009/02/27/t-sql-challenge-grouped-string-concatenation.aspx

Now Adam has publish his results here
http://sqlblog.com/blogs/adam_machanic/archive/2009/05/31/grouped-string-concatenation-the-winner-is.aspx and luckily my fourth suggestion was considered the overall winner!

When seeing my competitors, I am humble to see that my suggestion performed better than other suggestions made by such people as Itzik Ben-Gan, Remus Rusanu, Rob Farley and others.

Of course I am proud to win the MSDN Premium subscription, and now I have a hunch that next competition will be with Microsoft to get this baby activated   ;-)

Thanks to all who participated. I have learned a few new tricks.
And I will have to really investigate the execution plans for my two other finalists, Leonid Koyfman and Rick Holliday.

//Peter