Brett Kaiser (x002548) Blog

Not Just a Number - Brett Kaiser

T-SQL HTML? Parser

EDIT:  Ok, so as is pointed out in the comments below SQL Server comes with a very powerful XML Parser already OPENXML and sp_xml_preparedocument.  I posted a code snipet in the comments I got from BOL so you can see how it works.  While my routine operaes in a different manner, it seems to have been an exercise in futility...but it was fun writing it...annd hey maybe it will come in hany.  Who knows.

A poster over at SQLTeam asked a question on how to “read“ xml data from a string.  Now I had always toyed with the idea of parsing XML data so it could be loaded in to a normalized database.  We had already implemented here in prouction, the delivery of data to a Java front in XML format (for which the developers were very happy), but we never did the reverse.  Now this code below assumes that the xml data will not be malformed (i.e. there is no checking) and that the “elements“ of the xml are 1 per tag.  They all do not have to be there, so nulls would be inserted, it can have more or different tags, but they will get ignored, and if more than 1 “element“ tag type exists, the last one wins.  To modify this for more or different tags, you will only have to modify the IF/ELSE IF code block, and the dynamic sql component.  Happy Coding.

USE Northwind
GO

CREATE TABLE myTable99(fname char(20), lname char(30), phone char(20))
GO


DECLARE @xml varchar(8000), @data varchar(8000), @tag varchar(255), @sql varchar(8000)
, @fname char(20), @lname char(30), @phone char(20)
SET @xml = '<fname>Brett</fname><lname>Kaiser</lname><phone>555-1234<phone>'
SELECT @tag = '', @data = ''

WHILE LEN(@xml) <> LEN(@tag+@data+@tag)+3
  BEGIN
 SELECT @tag=SUBSTRING(@xml,CHARINDEX('<',@xml), CHARINDEX('>',@xml))
 SELECT @data = SUBSTRING(@xml,LEN(@tag)+1,CHARINDEX('<',@xml,LEN(@tag))-LEN(@tag)-1)
 
      IF @tag = 'fname'
  SELECT @fname = @data
 ELSE IF @tag = 'lanme'
  SELECT @lname = @data
 ELSE IF @tag = 'phone'
  SELECT @phone = @data
 
 SELECT @xml = SUBSTRING(@xml,LEN(@tag+@data+@tag)+2,LEN(@xml)-LEN(@tag+@data+@tag)+2)
 
 SELECT @xml,@fname,@data,@tag,LEN(@xml),LEN(@tag+@data+@tag)+2
  END

SELECT @tag=SUBSTRING(@xml,CHARINDEX('<',@xml), CHARINDEX('>',@xml))
SELECT @data = SUBSTRING(@xml,LEN(@tag)+1,CHARINDEX('<',@xml,LEN(@tag))-LEN(@tag)-1)

     IF @tag = ''
 SELECT @fname = @data
ELSE IF @tag = ''
 SELECT @lname = @data
ELSE IF @tag = ''
 SELECT @phone = @data

SELECT @sql = 'INSERT INTO myTable99(fname,lname,phone) '
  + 'SELECT ''' + RTRIM(@fname) + ''''
  + ',' + ''''  + RTRIM(@lname) + ''''
  + ',' + ''''  + RTRIM(@phone) + ''''
SELECT @sql

EXEC(@sql)

SELECT * FROM myTable99
GO

DROP TABLE myTable99
GO

Legacy Comments


Mladen
2006-07-27
re: T-SQL XML Parser
so what's wrong with using
OPENXML and sp_xml_preparedocument?

Brett
2006-07-27
re: T-SQL XML Parser
Care to post an example?

Brett
2006-07-27
re: T-SQL XML Parser
OK, Good Lord...I guess I wrote a HTNL parser. As maddy points out...this is quite amazing found this in BOL

DECLARE @idoc int
DECLARE @doc varchar(1000)
SET @doc ='
<ROOT>
<Customer CustomerID="VINET" ContactName="Paul Henriot">
<Order CustomerID="VINET" EmployeeID="5" OrderDate="1996-07-04T00:00:00">
<OrderDetail OrderID="10248" ProductID="11" Quantity="12"/>
<OrderDetail OrderID="10248" ProductID="42" Quantity="10"/>
</Order>
</Customer>
<Customer CustomerID="LILAS" ContactName="Carlos Gonzlez">
<Order CustomerID="LILAS" EmployeeID="3" OrderDate="1996-08-16T00:00:00">
<OrderDetail OrderID="10283" ProductID="72" Quantity="3"/>
</Order>
</Customer>
</ROOT>'
--Create an internal representation of the XML document.
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc
-- Execute a SELECT statement that uses the OPENXML rowset provider.
SELECT *
FROM OPENXML (@idoc, '/ROOT/Customer',1)
WITH (CustomerID varchar(10),
ContactName varchar(20))

Mladen
2006-07-28
re: T-SQL HTML? Parser
emmm you know brett i'm not really Maddy... :))

Brett
2006-07-28
re: T-SQL HTML? Parser
OK, Ye I admit I am losing my mind Mladen. With a release every day, it's been insane, and when I had a small window to "play" I took advatage...so forgive the confusion...

And thanks for the insight btw

Do you ever use it? How reliable can XML text be in a repeatable process I wonder?


Mladen
2006-07-28
re: T-SQL HTML? Parser
no biggie :)

i use it now and then... nothing hardcore though...
mostly for bulk insert of 10 rows from the client or something similar.
I find it preety reliable. hasn't crashed on me yet.