XSLT Performance in .NET
by Dan Frumin, 07/14/2003
The Microsoft .NET Framework brings with it many new tools and improvements for developers. Among them is a very rich and powerful set of XML classes that allow the developer to tap into XML and XSLT in their applications. By now, everyone is familiar with XML, the markup language that is the basis for so many other standards. XSLT is a transformation-based formatter. You can use it to convert structured XML documents into some other form of text output -- quite often HTML, though it can also generate regular text, comma-separated output, more XML, and so on. If you haven't used XSLT before, you might want to read the previously published article titled "Five Quick Tips to Using XSLT."
Before the Microsoft .NET Framework was released, Microsoft published the XML SDK, now in version 4.0. The XML SDK is COM-based, and so can be used from any COM-capable development language, not just Microsoft .NET. Its object model is also a little different from that of the .NET implementation, and therefore requires a bit of learning to use. But in the end, the XML SDK can do the same things for XSLT that the .NET Framework offers.
Which raises the question: how do these two engines compare to each other in performance? This article will answer that question.
Methodology
In order to test the performance of the two parsers, I used a standard XML file for storing a catalog of books. A file with only one book looked like this:
<?xml version="1.0"?>
<catalog>
<book id="bk101">
<author>Gambardella, Matthew</author>
<title>XML Developer's Guide</title>
<genre>Computer</genre>
<price>44.95</price>
<publish_date>2000-10-01</publish_date>
<description>An in-depth look at creating
applications with XML.</description>
</book>
</catalog>
I chose to use an XML file on the file system, rather than a SQL 2000 XML Query, in order to avoid any of the performance noise that SQL might incur. I then created four different versions of the same XML file, containing one, 20, 100, and 500 books, respectively.
After setting up my XML sources, I put together four XSLT transform files of increasing complexity in both processing and output. The first file did nothing but output the book ID for every book node in the XML. The second file generated all of the book information in HTML table form. For the third and fourth files, I wanted to test some actual processing. The third file uses a for-each operation with a select that filters for the book with ID "bk101". For the fourth and last test file, I decided to apply a sort to all of the nodes, based on the author's name.
As an example, here's the complete XSLT text for the sorting test:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="html"/>
<xsl:template match="/catalog">
<table>
<xsl:apply-templates select="book">
<xsl:sort select="author"/>
</xsl:apply-templates>
</table>
</xsl:template>
<xsl:template match="book">
<tr>
<td><xsl:value-of select="@id"/></td>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="author"/></td>
<td><xsl:value-of select="genre"/></td>
<td><xsl:value-of select="price"/></td>
<td><xsl:value-of select="publish_date"/></td>
<td><xsl:value-of select="description"/></td>
</tr>
</xsl:template>
</xsl:stylesheet>
All that remained was applying each XSLT transform to each XML file and tracking the results.
Tools Used
I developed a simple test harness to time all of the individual combinations of XML and XSLT files. The test harness was written in C# as a command-line application. By default, it looked in a folder for every XML and XSLT file and ran each combination through two test classes, one written using the MSXML SDK and the other using the .NET Framework. The test manager then recorded the results in a CSV file that could later be loaded into Excel for analysis.
Each of the two classes was given a Test member that accepted two parameters, one a path to the XML file, the other a path to an XSLT file. Each test was designed to perform an end-to-end transformation of the XML. Because MSXML's implementation of XSLT returns a string with the transformed output, the .NET implementation of the test was also written to return its results as a string.
Each test was split up into five operations: creating the XML objects, preparing the XML objects for transformation (e.g., loading the file from the file system), creating the XSLT objects, preparing the XSLT objects for the transformation, and actually executing the transform itself. By splitting the end-to-end transformation into these individual operations, we gain a better understanding of the inner workings of the objects.
Individual timings were taken immediately after each of the five operations and summed up over one hundred runs, later to be averaged out. This process allows for a better representation of the time required by each operation.
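The sum-then-average scheme described above can be sketched roughly as follows. This is an illustration only, not the harness used for the article: the names OperationTimer, Time, EndRun, and Average are hypothetical, and it times with System.Diagnostics.Stopwatch (available from .NET 2.0 onward) rather than the Counter class used in the tested code.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper: sums the elapsed time of named operations across runs
// so that a per-run average can be reported afterwards.
public sealed class OperationTimer
{
    private readonly Dictionary<string, double> totals = new Dictionary<string, double>();
    private int runs;

    // Times one operation and adds its duration to that operation's running total.
    public void Time(string operation, Action body)
    {
        Stopwatch watch = Stopwatch.StartNew();
        body();
        watch.Stop();

        double soFar;
        totals.TryGetValue(operation, out soFar); // soFar stays 0 for a new key
        totals[operation] = soFar + watch.Elapsed.TotalMilliseconds;
    }

    // Call once after all operations of a single run have been timed.
    public void EndRun()
    {
        runs++;
    }

    // Average milliseconds per run for one operation.
    public double Average(string operation)
    {
        if (runs == 0) throw new InvalidOperationException("No runs recorded.");
        return totals[operation] / runs;
    }
}

public static class TimingDemo
{
    // Runs two placeholder operations one hundred times and returns the
    // average cost of the first. Real code would load and transform files here.
    public static double Run()
    {
        OperationTimer timer = new OperationTimer();
        for (int run = 0; run < 100; run++)
        {
            timer.Time("CreateXml", delegate { /* e.g. create the document object */ });
            timer.Time("PrepXml", delegate { /* e.g. load the file from disk */ });
            timer.EndRun();
        }
        return timer.Average("CreateXml");
    }
}
```

Keeping one total per operation, rather than one total per run, is what lets the cost of object creation be reported separately from the cost of the transform itself.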
In order to get accurate time measurements, I wrapped the QueryPerformanceCounter and QueryPerformanceFrequency functions offered by the Win32 API in the KERNEL32.DLL library. These functions offer much better timing resolution than the DateTime class in the .NET Framework.
Here's a snippet of code for wrapping those functions in your own class:
[System.Runtime.InteropServices.DllImport("KERNEL32")]
private static extern bool
QueryPerformanceCounter(ref long lpPerformanceCount);
[System.Runtime.InteropServices.DllImport("KERNEL32")]
private static extern bool
QueryPerformanceFrequency(ref long lpFrequency);
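A complete wrapper along these lines might look like the sketch below. The article shows only the two imports, so the bodies of Set and GetMilliseconds are assumptions inferred from how Counter is called in the tested code; the ReadFrequency helper is hypothetical. The sketch declares the parameters as out rather than ref, which is equivalent for these calls, and it runs only on Windows.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

// Sketch of a Counter-style timer built on the Win32 high-resolution counter.
// Only the two imported functions come from the article; the rest is assumed.
public static class Counter
{
    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool QueryPerformanceCounter(out long lpPerformanceCount);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool QueryPerformanceFrequency(out long lpFrequency);

    // Counter ticks per second; fixed for the lifetime of the process.
    private static readonly long frequency = ReadFrequency();

    // Counter value captured by the most recent call to Set.
    private static long start;

    private static long ReadFrequency()
    {
        long value;
        if (!QueryPerformanceFrequency(out value))
            throw new Win32Exception(); // picks up the last Win32 error code
        return value;
    }

    // Records the current counter value as the start of a measurement.
    public static void Set()
    {
        QueryPerformanceCounter(out start);
    }

    // Milliseconds elapsed since the last call to Set.
    public static double GetMilliseconds()
    {
        long now;
        QueryPerformanceCounter(out now);
        return (now - start) * 1000.0 / frequency;
    }
}
```

Because the start value is a single static field, this sketch is not safe for concurrent measurements; that matches the sequential Set/GetMilliseconds pattern in the tested code but would need rethinking in a multithreaded harness.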
The tests themselves were run on a P4 1.6GHz machine with 512MB of memory. The box had plenty of CPU and memory resources left at the time the tests were run, so performance should be reasonably steady. The tests were run through the command line, using a release build of the TestManager project.
Actual Code Tested
.NET Implementation
Counter.Set();
xmlDoc = new XmlDocument();
dCreateXml = Counter.GetMilliseconds();
Counter.Set();
xmlDoc.Load(xmlFile);
dPrepXml = Counter.GetMilliseconds();
Counter.Set();
xslt = new XslTransform();
dCreateXslt = Counter.GetMilliseconds();
Counter.Set();
xslt.Load(xsltFile);
dPrepXslt = Counter.GetMilliseconds();
Counter.Set();
StringWriter sw = new StringWriter();
xslt.Transform(xmlDoc, null, sw);
output = (string) sw.ToString();
dRunXslt = Counter.GetMilliseconds();
MSXML Implementation
Counter.Set();
xmlDoc = new DOMDocument40Class();
dCreateXml = Counter.GetMilliseconds();
Counter.Set();
xmlDoc.load(xmlFile);
dPrepXml = Counter.GetMilliseconds() ;
Counter.Set();
xsl = new FreeThreadedDOMDocument40Class();
xslt = new XSLTemplate40Class();
dCreateXslt = Counter.GetMilliseconds();
Counter.Set();
xsl.load(xsltFile);
xslt.stylesheet = xsl;
xslProc = xslt.createProcessor();
xslProc.input = xmlDoc;
dPrepXslt = Counter.GetMilliseconds() ;
Counter.Set();
xslProc.transform();
output = (string) xslProc.output;
dRunXslt = Counter.GetMilliseconds();
As mentioned above, the .NET implementation uses a StringWriter class to access the string output of the transformation. Getting the MSXML SDK implementation to work required a few unique steps. First, the actual XSLT object requires that the XML for the stylesheet be provided in a FreeThreadedDOMDocument object, rather than a regular DOMDocument object. Second, an IXSLProcessor object must be used to execute the transform itself. Beyond these little details, the code is self-explanatory.
Results and Observations
Figure 1. Results for Sorting XSLT
A few items jump out after analyzing the numbers generated by the tests, as illustrated using the sorting XSLT test.
- COM/Interop overhead is significant. Both the creation and deletion of intrinsic .NET objects are much cheaper than creating COM objects, especially since .NET's garbage collector can optimize the operations. As you can see from the left-most column, the CreateXML time increased linearly with file size for the MSXML parser. That's because in our tests, creation actually covers the creation of a new object and the implicit deletion of the old object. Unlike .NET, which simply places its objects on the GC queue, COM objects must be freed. In the case of a very large (500-node) XML file, freeing all of those internal objects takes quite a while (7 to 8ms).
- The XML parser in MSXML is significantly more efficient than the .NET equivalent, especially as the size of the XML file increases. Unfortunately, this benefit is somewhat offset by the cost of creating and deleting the MSXML COM objects. In fact, for smaller XML files (20 nodes), the combined cost of those two operations was actually greater than the combined cost of the .NET implementation's equivalent.
- The extra IXSLProcessor object incurs around a 1ms overhead. The MSXML objects consistently took an extra 1ms in preparing the XSLT for transformation. That can easily be attributed to the extra object creation and passing. This particular operation involves loading an XML file for the stylesheet. As the stylesheet grows in size, the MSXML parser should catch up to the .NET implementation. In addition, a quick scan of the graph shows that this 1ms is negligible compared to the cost of the rest of the operation, especially for larger source files. Nonetheless, developers should optimize this out if possible.
- The MSXML processor is consistently three to four times faster in larger and more complex transformations. This is the most important result, and most significant difference. The MSXML parser transformed the 100-node file in 3.0 to 3.5ms, depending on the complexity of the operation, with sorting being the most complex. By comparison, the .NET implementation required 11 to 13ms to execute the same transformation. The ratio of time (around 3x) appeared consistently up to the 500-node file, dropping only in the simplest case of a one-node XML file.
Recommendations
Looking at the results, we can see that in a single end-to-end operation, the cost of the COM overhead can offset the advantages gained in transformation. This is especially true for smaller XML files (20 to 40 nodes). However, the margin of difference grows as the input files grow in size and as the transformation grows in complexity. When dealing with these scenarios, developers should consider using MSXML as well as two techniques to optimize their applications.
First, consider storing the XSLT transform objects (including IXSLProcessor) in some shared location (e.g., a static member) for future use. This eliminates the cost of creating and preparing the XSLT objects and allows for a reusable transformation object that can simply be applied to XML input.
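The same caching idea can be sketched on the .NET side. Note the substitutions: the recommendation above concerns the MSXML objects, which need a COM reference to compile, so this sketch instead caches XslCompiledTransform, the successor to the XslTransform class used in the tests (available from .NET 2.0). The class name TransformCache and its members are hypothetical.

```csharp
using System.Collections.Concurrent;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical cache: each stylesheet is loaded and compiled once,
// then reused for every later transformation.
public static class TransformCache
{
    // Static member shared by all callers, keyed by the stylesheet's full path.
    private static readonly ConcurrentDictionary<string, XslCompiledTransform> cache =
        new ConcurrentDictionary<string, XslCompiledTransform>();

    // Returns the cached transform, compiling it on first use.
    // Under a race the stylesheet may be compiled more than once,
    // but only one instance is kept and returned.
    public static XslCompiledTransform Get(string xsltFile)
    {
        return cache.GetOrAdd(Path.GetFullPath(xsltFile), Load);
    }

    private static XslCompiledTransform Load(string path)
    {
        XslCompiledTransform xslt = new XslCompiledTransform();
        xslt.Load(path);
        return xslt;
    }

    // Applies a cached stylesheet to an XML file and returns the output as a string.
    public static string Apply(string xsltFile, string xmlFile)
    {
        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.Load(xmlFile);

        using (StringWriter sw = new StringWriter())
        {
            Get(xsltFile).Transform(xmlDoc, null, sw);
            return sw.ToString();
        }
    }
}
```

With a cache like this, only the first request for a given stylesheet pays the create and prepare costs measured earlier; later requests pay for loading the input and running the transform. The sketch never evicts entries or notices a changed file, which is acceptable only under the assumption that stylesheets rarely change.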
Second, developers should consider creating their own COM object garbage collector for the XML files, especially if they are large. The assumption is that the XSLT transform won't change often, but the input files will (e.g., through data changes as a result of order entry). Clearly, the creation time of the COM object itself is constant regardless of input file size. That means that most of the cost we see in the CreateXML step is actually part of the deletion of the COM object. After using an XML object, developers could place it into a simple queue and use a separate thread to free those objects. This eliminates another big chunk of time from the operation.
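One possible shape for such a queue is sketched below. It was not part of the tested code, and the class name DeferredComRelease is hypothetical. It relies on BlockingCollection (added in .NET 4.0) and on Marshal.FinalReleaseComObject, and it ignores COM threading-model questions: whether releasing from another thread actually moves the cost off the calling thread depends on the apartment the object lives in, so treat this as a starting point to measure rather than a guaranteed saving.

```csharp
using System.Collections.Concurrent;
using System.Runtime.InteropServices;
using System.Threading;

// Hypothetical helper: callers hand used COM objects to a queue,
// and a background thread releases them.
public static class DeferredComRelease
{
    private static readonly BlockingCollection<object> pending =
        new BlockingCollection<object>();

    static DeferredComRelease()
    {
        Thread worker = new Thread(ReleaseLoop);
        worker.IsBackground = true; // do not keep the process alive at exit
        worker.Name = "ComRelease";
        worker.Start();
    }

    // Queues the object for release instead of freeing it on the calling thread.
    // The caller must not use the object again after this call.
    public static void Enqueue(object comObject)
    {
        if (comObject != null)
            pending.Add(comObject);
    }

    // Blocks until an object is queued, then drops its COM references.
    private static void ReleaseLoop()
    {
        foreach (object item in pending.GetConsumingEnumerable())
        {
            // Non-COM objects are skipped and left to the regular GC.
            if (Marshal.IsComObject(item))
                Marshal.FinalReleaseComObject(item);
        }
    }
}
```

In the MSXML test code, this would mean calling DeferredComRelease.Enqueue(xmlDoc) once the output string has been read, rather than leaving the old document to be freed when the next one is created.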
A combination of these techniques and the MSXML objects could easily shave 60 to 70% off the time involved in such a transformation. This level of savings will directly translate into faster performance of the application, as well as greater scalability.
Dan Frumin is a long-time technology executive, with over 10 years of experience in the industry.
-
oh boy
2003-12-25 02:28:05 anonymous2
Oh come on, testing XSLT performance with XmlDocument and no warming up is just a joke. Have you tried XPathDocument?
-
.Net framework version
2003-09-02 06:34:32 anonymous2
We tried it with framework 1.1 vs 1.0. In the later version the not-too-slow bits got a bit less slow, and the oh-my-goodness-is-it-really-that-slow bits got slower.
Word on the street is that MS have the original author of MSXML working on refactoring it all big-time for .Net 2.0, which is an implicit admission of just how bad it is. We shall see...
-
let me sum it up: XSLT for .NET is a dog
2003-08-06 13:50:49 winfield
This is conventional wisdom. No one I know uses .NET XSLT for anything but the smallest of transforms...
-
Noble effort...
2003-07-29 01:04:56 anonymous2
It's hard to cover performance thoroughly and although there are lots of nit picks, I commend Dan for at least having the guts to try.
-
I need much more work to use msxml from .NET
2003-07-18 17:01:32 anonymous2
Performance measurement is hard and expensive to do right. I'm unconvinced by this. A few points.
1) I want realistic tests of what I'm doing. These tests don't measure up.
2) .NET has ticks with 100 nanosecond resolution. Is that not good enough?
3) There are other ways to use XSLT in .NET. Is this a realistic choice?
4) How does .NET 1.0 compare with 1.1?
5) I imagine the choice is more like ASP versus ASPX in which case running msxml from .NET is inappropriate.
6) Where are the units for the axes on the graph marked? (This is really basic!)
In summary, I'd love to see analysis of optimised code for each situation tested. I admire someone taking the time to have a shot at it but am unconvinced. At present I stick with .NET XSLT within a .NET application.
-
I need much more work to use msxml from .NET
2003-07-27 23:44:26 anonymous2
The nominal 100nS accuracy of .NET timers doesn't mean the clocks are that accurate; accuracy varies from system to system in the windows world.
If you want accurate measurement, resort to coding the _rdtsc opcode in native code and P/Invoke it, and work out the base round trip time which you need to factor out.
But if you do get some accurate values, could you publish them? The .NET EULA says 'no performance figures' without permission...
-
Missing Bits - Quite a few
2003-07-17 00:46:23 zarko
This is what I noticed at the first glance:
- XmlDocument used instead of the XPathDocument. Surely the one using the COM DLL doesn't have a choice but the one using .NET does. The nature of the former is to facilitate editing, and the nature of the latter is to facilitate XPath matching - thereby XSLT - when written well of course.
- Using StringWriter and calling .ToString() is a way different operation than just casting xslProc.output - almost as if author wanted to test how wild a code can get sending objects to GC and copying strings. A little modification of using explicit StringBuilder instead of the one hidden and thereby forced to GC in the StringWriter, would be at least a bit further apart from a tutorial sample.
- Article states the goal "to test the performance of the two PARSERS" which is a largely different task from anything that followed.
- It looks like the author insisted on measuring individual small transforms instead of batches of at least 100 or 1000 so he needed more accurate clock. That however means that large part of the measurement was a measurement of all kinds of jitters in the system.
- The rate of processing time increase indicated in that diagram (5-6 for the size jump from 100 to 500) would bring someone a Turing award - for a linear general sorting algorithm (5*lg5 ~ 11.6 not 5) in both engines :-) so it looks like it just shows the trend of increased load and doesn't actually sense anything else but noise.
- If the code from the article was actually used, then timestamps didn't sense any "COM objects must be freed" time - when it's encapsulated in an Interop object it lives by the same GC rules. So the closest guess of what that code actually measured would be - the GC reaction to Interop objects. Also, unless someone would manage to come up with a divine prediction logic, loading something like an XML structure has no option but to fragment the memory and therefore be subject to a GC in one form or the other.
- The only thing that timing shows clearly is that .NET GC gets more retentive when faced with Interop objects - probably since it knows what a hairy bulk is rolling behind a pointer, but more realistically because a component, and a foreign body, is not expected to be trashed every 10ms. What that means in turn is that such MSXML-based application would be very unstable - GC would explode in rather large bursts that could melt down the server. At the same time .NET native object would get the GC working more steadily all the time.
- Too simplistic XML format, very close to a DB rowset dump out, so the measurement becomes the one of how would these two XSLT engines handle a SQL-esque task they are not meant for in the first place.
- Too simplistic XSLT and mimicking the procedural way of processing particularly reflected in picking/selecting instead of following the tree - to the point of calling for-each loop "some actual processing". One normally doesn't have to touch things like sort in XSLT work at all - except for some small, standalone work - or book samples :-)
- That long rename-tags-to-TD XSLT in the article for example is equivalent to two very simple and fully declarative templates which a good XSLT compiler can optimize - that list of value-of/select-s no one can - they are spaghetti pointers of the XSLT world.
The reason I mentioned these points is that just by switching to XPathDocument I saw 2 to 10 times performance improvement in actual XSLT processing (not simulating a SQL query) and even for the utmost inefficient constructs like for-each, XPathDocument suffered at worst a 2x performance hit compared to the efficient XSLT code.
The reason is that XPathDocument expects to be queried, only queried and nothing but queried in accordance with the basic principle of the XML standard (tags are principal entities, attributes and textual content are secondary). For example, if one would really be faced with XML as simplistic as the one in this article he would simply get things done with a few XPaths without firing a transform at all.
XML has intrinsic threshold below which it becomes terribly inefficient for anything at all and if we measure below that threshold we just measure the color of a noise. XSLT has very wide area of use and can actually be very fast - if handled with care and not pushed out of its natural domain - structural transformation.
As for the MSXML used standalone (straight COM as the other reader mentioned), it fared something like an order of magnitude worse in a close to a real world scenario and mostly couldn't notice the difference between efficient and wild XSLT - meaning that its XSLT compiler hardly does any optimization work.
Good part of that of course came from the intrinsic problem that it really isn't a stand-alone animal but tries to act like one so it essentially has to create and then trash everything it has - the nature of a coarse grained component model.
I'm sure that, except for an exercise, no one would suggest writing a whole Web site or processing engine in C/C++ and ATL just in order to give MSXML a "kosher" environment. A noble cause otherwise, and an interesting one, but that seems to be the work industry at large is giving to Perl, Java and C#. MSXML always ends up being handled either by a script engine or a separate binary, just to be able to talk to the rest of the world.
ASPX "pages" can, in my experience, actually be much more efficient vehicle than ASP - but only after those using it find their peace with the paradigm shift and looking at the framework of pieces serving very different needs and not a monolithic engine to apply verbatim to any problem.
-
Compiled stylesheets
2003-07-17 00:35:53 anonymous2
The author has failed to realise and utilise msxml4's ability to cache and compile XSLT stylesheets. Look up IXSLTemplate in the msxml4 SDK documentation for an explanation.
The time to compile a stylesheet would explain the "slow" performance for preparing the XSLT for transformation, that the author has attributed merely to object creation.
This comparison between the two parsers is very poor and needs to be redone. I am surprised that O'Reilly even put their name to this.
-
.Net framework version
2003-07-16 01:52:08 anonymous2
What version of the .Net framework was used in this test? At least some XSLT operations (for example xsl:key) have been optimized quite a lot in 1.1.
-
COM garbage collector
2003-07-15 13:48:14 anonymous2
The offhand suggestion that one build a COM'ish garbage collector seems a bit nontrivial, if not moderately absurd.
The author also fails to account for the performance of MSXML using a straight COM implementation without Interop. He further neglects to look into the performance differences between classic ASP and ASPX. Given that XML is very important to the delivery of Web content, the difference between ASP and ASPX is a salient point for XML developers.






