Peter Larsson Blog

Patron Saint of Lost Yaks

Extract XML structure automatically, part 2

For some time ago, I posted an algorithm how to get the XML structure automatically. Today I stumbled across another approach which seems to be faster. Reservations though I haven't tested this against large xml data yet.

However, the previous algorithm relied on a WHILE loop here
http://weblogs.sqlteam.com/peterl/archive/2009/03/05/Extract-XML-structure-automatically.aspx

but this new algorithm doesn't. It's all xml internal thingies going on.

DECLARE     @Nodes TABLE
            (
                         NodeID INT IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
                         ParentNodeName NVARCHAR(64),
                         NodeName NVARCHAR(64)
            )

DECLARE @Data XML

SET @Data = '
<root>
        <elementGroup>
                        <element>
                                     <stuff>
                                                 <comment>Stuff comment</comment>
                                     </stuff>
                                     <comment>Element comment</comment>
                        </element>
                        <comment>Element group comment</comment>
            </elementGroup>
            <comment>Root comment</comment>
</root>'

INSERT      @Nodes
            (
                         ParentNodeName,
                         NodeName
            )
SELECT      e.value('local-name(..)[1]', 'VARCHAR(MAX)') AS ParentNodeName,
            e.value('local-name(.)[1]', 'VARCHAR(MAX)') AS NodeName
FROM        @data.nodes('//*[local-name(.) > ""]') AS n(e)

;WITH Yak(NodeLevel, RootName, ElementName, NodeID, NodePath)
AS (
            SELECT     0,
                       ParentNodeName,
                       NodeName,
                       NodeID,
                       CAST(NodeID AS VARCHAR(MAX))
            FROM       @Nodes
            WHERE      ParentNodeName = ''

            UNION ALL

            SELECT     y.NodeLevel + 1,
                       n.ParentNodeName,
                       n.NodeName,
                       n.NodeID,
                       y.NodePath + ';' + CAST(n.NodeID AS VARCHAR(MAX))
            FROM       @Nodes AS n
            INNER JOIN Yak AS y ON y.ElementName = n.ParentNodeName
)

SELECT   RootName,
         REPLICATE('    ', NodeLevel) + ElementName AS ElementName,
         ROW_NUMBER() OVER (PARTITION BY RootName ORDER BY ElementName) AS SortedByElementName,
         ROW_NUMBER() OVER (PARTITION BY RootName ORDER BY NodeID) AS SortedByPresence
FROM     Yak
ORDER BY NodePath

Legacy Comments


PeterNeo
2009-06-05
re: Extract XML structure automatically, part 2
I found the following code in SQLServerCentral but it does not use any CTE, LOOP etc.. check this one also

DECLARE @x xml;SET @x = N'<?xml version = "1.0"?>
<root xmlns:n = "uri:components:name"
xmlns:p = "uri:components:people">
<p:person>
<name>
<n:first-name>Joe</n:first-name>
<n:last-name>Johnson</n:last-name>
</name>
<name>
<n:first-name>Lennox</n:first-name>
<n:last-name>Lewis</n:last-name>
</name>
<name>
<n:first-name>Louie</n:first-name>
<n:last-name>McDougal</n:last-name>
</name>
</p:person>
</root>';

SELECT CASE NodeUri WHEN '' THEN ''
ELSE '{' + NodeUri + '}' END + NodeName AS [Name],
NodeUri,
NodeName,
NodeValue
FROM
(
SELECT node.value('fn:namespace-uri(.[1])', 'varchar(1000)') AS NodeUri,
node.value('fn:local-name(.[1])', 'varchar(1000)') AS NodeName,
node.query('./text()') AS NodeValue
FROM @x.nodes(N'//*') T(node)
) sub;

Jake
2010-11-21
re: Extract XML structure automatically, part 2
How can this be modified to return the full path, instead of just the parent?

Thanks,
Jake