Monday, 10 Mar 2008

XML Benchmarks - Parse/Query/Mutate/Serialize

(8:41 am) Tags: [Software, Projects, D Programming Language]

I created a benchmark similar to the one that VTD-XML uses. Since most XML processing involves mutation, this benchmark parses an input XML file, executes various XPath queries against it (modifying the document in two of them), and then serializes the new document. The steps are listed below:

  1. Parse blog.xml, preparing to query the resulting document
  2. Perform the following XPath queries, or their equivalents, once each:
    • count(//*) (10390 for this document)
    • //item (a list of those 10390 items)
    • /blog/item (similar to the previous, except you know the path)
    • //text() (all text nodes)
    • count(//item)
    • count(/blog/item)
    • /blog/item[@num='a781']
    • /blog/item/body/p/a
  3. Mutate the document by removing the nodes returned by the last 2 queries (performed inline with the queries)
  4. Serialize the modified document back out
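
As a rough illustration of one benchmark cycle, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree`. It is not one of the benchmarked implementations. The `SAMPLE` document and the `run_cycle` name are hypothetical, the structure of the real blog.xml is assumed, and since ElementTree supports only a subset of XPath, `count()` and `text()` are replaced with equivalents.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for blog.xml; the real file's structure is an assumption.
SAMPLE = (
    "<blog>"
    "<item num='a780'><body><p>first <a href='#'>link</a></p></body></item>"
    "<item num='a781'><body><p>second</p></body></item>"
    "<item num='a782'><body><p>third <a href='#'>link</a></p></body></item>"
    "</blog>"
)


def run_cycle(xml_text):
    # Step 1: parse the document.
    root = ET.fromstring(xml_text)

    # Step 2: the queries, using ElementTree equivalents.
    element_count = sum(1 for _ in root.iter())   # count(//*)
    items_anywhere = list(root.iter("item"))      # //item
    items_by_path = root.findall("./item")        # /blog/item
    texts = list(root.itertext())                 # //text()
    item_count = len(items_anywhere)              # count(//item)
    path_count = len(items_by_path)               # count(/blog/item)

    # Step 3: the last two queries, removing their results inline.
    # ElementTree has no parent pointers, so build a child -> parent map.
    # Note: Element.remove() also drops any tail text after the element.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    removed = 0
    for query in ("./item[@num='a781']", "./item/body/p/a"):
        for node in root.findall(query):
            parent_of[node].remove(node)
            removed += 1

    # Step 4: serialize the modified document.
    output = ET.tostring(root, encoding="unicode")
    return {
        "element_count": element_count,
        "item_count": item_count,
        "path_count": path_count,
        "text_count": len(texts),
        "removed": removed,
        "output": output,
    }


if __name__ == "__main__":
    print(run_cycle(SAMPLE)["output"])
```

A library with full XPath support could run the queries verbatim; the equivalents here are in the spirit of the "or their equivalents" allowance in step 2.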

I created this benchmark for 4 products (the ones that have XPath or XPath-like support; if you know of another one, please send me some code, and I will be happy to run it and aggregate the results):

After the run, I take the average cycle time and turn that into the following graph showing cycles per second. blog.xml is 1.3MB, so you can multiply these numbers by 1.3 to get the megabytes-per-second figure for each tool.
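
The conversion from average cycle time to cycles per second and megabytes per second can be sketched as follows. This is only an illustration in Python: the `measure` and `to_mb_per_sec` names and the default run count are hypothetical, and the actual harness used for these results is not shown here. The 1.3 figure is the size of blog.xml in MB.

```python
import time


def to_mb_per_sec(cycles_per_sec, file_size_mb=1.3):
    # Each cycle processes the whole file once, so throughput is
    # cycles per second times the file size in megabytes.
    return cycles_per_sec * file_size_mb


def measure(cycle, runs=10, file_size_mb=1.3):
    # Time `runs` full cycles and average them (hypothetical harness).
    start = time.perf_counter()
    for _ in range(runs):
        cycle()
    avg_seconds = (time.perf_counter() - start) / runs
    cycles_per_sec = 1.0 / avg_seconds
    return cycles_per_sec, to_mb_per_sec(cycles_per_sec, file_size_mb)
```

For example, a tool that averages 0.1 seconds per cycle runs at 10 cycles per second, which works out to about 13 MB per second on this file.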

Some notes on the implementations:

I would also note that these benchmarks were run on an Intel Q6700 quad-core machine at 2.66 GHz with 4GB of RAM, running Ubuntu Linux.


XML Benchmarks - updated graphs with RapidXml

(8:25 am) Tags: [Software, Projects]

I have added the recently released RapidXml to the graphs. Note that the RAM usage for RapidXml skyrockets, which costs it efficiency. As noted on their homepage, they make a copy of the input buffer, because the input is ‘destroyed’ while parsing. I would assume that this memory usage would fit the machine it is running on, but that is a HUGE amount of allocation.
