SQL Server: Network packet size difference between returning XML and TDS
A while back I was thinking about SQL Server’s capabilities of returning data as XML with FOR XML and it occurred to me that maybe returning 10.000 results as xml would be faster that the way SQL server returns them now in Tabular Data Stream (TDS). After 10 seconds I dismissed it as nonsense because if that were so the xml would be used by default. But it piqued my curiosity about what would be the actual size overhead of returning XML compared to TDS. I knew there was overhead but i didn’t think it would be so high.
So i decided to test it. I tried 2 queries which got same data from AdventureWorks.Person.Address table and for each I just changed the TOP N value to: 1, 10, 100, 1000, 10000 and no TOP N to get all rows.
Query 1:
SELECT * FROM Person.Address
Query 2:
Here I used alias A and root P to minimize the XML size. The AUTO option creates an XML item for each row and columns are treated as attributes the PATH wraps the row items with the root P item. This appears to be the smallest XML you can generate with unchanged column names.
SELECT * FROM Person.Address as A FOR XML AUTO, ROOT('P')
These are the results in table and graph form.
Legacy Comments
JRI
2009-12-01 |
re: SQL Server: Network packet size difference between returning XML and TDS Actually, it seem to grow very much linearly. Not exponentially at all. The ratio of XML size/TDS size is close to constant over the different sample size. The same goes for the ratio XML size/row count. |
eyechart
2009-12-02 |
re: SQL Server: Network packet size difference between returning XML and TDS 'network packet size' is a bit misleading. Nothing is changing from a network packet perspective. The ethernet packets being transmitted from server to client aren't being altered in any way. The TDS packets riding on top of the network layer are typically 4KB but can be up to 32KB in size (according to protocol spec). Typical ethernet frames are 1500 bytes, but could be jumbo packets if using gig or 10gig ethernet. TDS doesn't really even care about the network transport layer, you could be running old-school FDDI or token ring and it wouldn't care. SQL Server uses the MS-BINXML format which is just a way of encoding XML data in a binary format. This format is transmitted by being encapsulated in the TDS protocol. TDS is an application layer protocol that does more than just move data, it also handles authentication, rpc, sql queries, etc. I think this is just more proof that there is more bloat associated with moving XML data around. You can see the same thing by saving an excel spreadsheet in original XLS format and then in XLSX format - the XLSX format will be 3x to 4x the size of the XLS file. |
Mladen
2009-12-02 |
re: SQL Server: Network packet size difference between returning XML and TDS @ec: you're right... but i couldn't think of a better title. :) |
Brett
2009-12-03 |
re: SQL Server: Network packet size difference between returning XML and TDS GREAT Post |
Kristen
2010-01-26 |
re: SQL Server: Network packet size difference between returning XML and TDS We looked at JSON, in preference to XML, to get data to web pages (AJAX stuff). That is much "lighter", but even so quite a lot of overhead (column name headers repeated), so we are heading for a proprietary binary approach ... just like the old days ... |