I created a benchmark similar to the one that VTD-XML uses. Since most XML processing involves mutation, this benchmark parses an input XML file, executes various XPath queries on it, modifies the document in 2 of those queries, and then serializes the new document. The steps are listed below:
- Parse blog.xml, preparing to query the resulting document
- Perform the following xpath queries, or their equivalents, once each:
  - count(//*) (10390 for this document)
  - //item (a list of those 10390 items)
  - /blog/item (similar to the previous, except you know the path)
  - //text() (all text nodes)
  - count(//item)
  - count(/blog/item)
  - /blog/item[@num='a781']
  - /blog/item/body/p/a
- Mutate the document by removing the resulting nodes from the last 2 queries (performed inline with the queries)
- Serialize the modified document back out
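For readers who want to see the shape of one cycle, here is a minimal sketch of the steps above using only the XML APIs that ship with the JDK (javax.xml.parsers, javax.xml.xpath, javax.xml.transform). It is not the TransformDOM.java used in the benchmark; the class name BenchmarkCycleSketch and the helpers select, removeAll and runCycle are made up for illustration, and it takes the XML as a string instead of reading blog.xml so it can be run on its own.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch of one benchmark cycle; not the author's TransformDOM.java.
class BenchmarkCycleSketch {

    // Evaluate a node-set XPath expression against the document.
    static NodeList select(XPath xpath, String expr, Document doc) throws Exception {
        return (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
    }

    // Detach every matched node that is still attached to a parent.
    static void removeAll(NodeList nodes) {
        for (int i = 0; i < nodes.getLength(); i++) {
            Node n = nodes.item(i);
            if (n.getParentNode() != null) {
                n.getParentNode().removeChild(n);
            }
        }
    }

    // One cycle: parse, run the queries, delete the last two result sets, serialize.
    static String runCycle(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Read-only queries; results are discarded, only the work matters here.
            xpath.evaluate("count(//*)", doc, XPathConstants.NUMBER);
            select(xpath, "//item", doc);
            select(xpath, "/blog/item", doc);
            select(xpath, "//text()", doc);
            xpath.evaluate("count(//item)", doc, XPathConstants.NUMBER);
            xpath.evaluate("count(/blog/item)", doc, XPathConstants.NUMBER);

            // Mutating queries: remove the nodes each one returns.
            removeAll(select(xpath, "/blog/item[@num='a781']", doc));
            removeAll(select(xpath, "/blog/item/body/p/a", doc));

            // Serialize the modified document back out.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("benchmark cycle failed", e);
        }
    }

    public static void main(String[] args) {
        // Tiny stand-in for blog.xml, just to exercise the code path.
        String xml = "<blog>"
                + "<item num=\"a781\"><body><p>gone</p></body></item>"
                + "<item num=\"b1\"><body><p>hi <a href=\"u\">link</a> there</p></body></item>"
                + "</blog>";
        System.out.println(runCycle(xml));
    }
}
```

A real run would wrap runCycle in a timing loop over the contents of blog.xml; the sketch leaves that out to keep the steps visible.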
I implemented this benchmark for 4 products (the ones that have xpath or xpath-like support; if you know of another one, please send me some code, and I will be happy to run it and aggregate the results):
- DOM4J - TransformDOM4J.java
- Java6 DOM - TransformDOM.java
- VTD-XML - TransformVTD.java
- Tango Document - transformtango.d
After the run, I take the average cycle time and turn that into the following graph showing cycles per second. blog.xml is 1.3MB, so you can multiply these numbers by 1.3 to get the megabytes-per-second figure for each tool.
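The conversion is simple arithmetic, sketched here in Java. The class and method names are hypothetical, the cycle time is assumed to be measured in milliseconds, and the 20 ms in main is a made-up sample value, not a result from this benchmark.

```java
// Hypothetical helper for turning an average cycle time into throughput figures.
class ThroughputSketch {

    // Size of blog.xml as stated in the post.
    static final double FILE_SIZE_MB = 1.3;

    // Cycles per second from an average cycle time in milliseconds (assumed unit).
    static double cyclesPerSecond(double avgCycleMillis) {
        return 1000.0 / avgCycleMillis;
    }

    // Megabytes per second: each cycle processes the whole file once.
    static double megabytesPerSecond(double cyclesPerSecond) {
        return cyclesPerSecond * FILE_SIZE_MB;
    }

    public static void main(String[] args) {
        double avgCycleMillis = 20.0; // made-up sample value, not a measurement
        double cps = cyclesPerSecond(avgCycleMillis);
        System.out.printf("%.1f cycles/s, %.1f MB/s%n", cps, megabytesPerSecond(cps));
    }
}
```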

Some notes on the implementations:
- Tango, while not having an actual xpath parser, has the requisite power in its query language to pull this off with aplomb
- You will note that the VTD code does NOT delete the /blog/item[@num='a781'] node, because its XMLModifier is unable to perform deletes inside a delete. If someone knows how to fix this, please let me know
I would also note that these benchmarks were run on an Intel Q6700 quad-core machine at 2.66 GHz, with 4GB of RAM, running Ubuntu Linux.
