Sunday, 9 Mar 2008

XML Benchmarks - pros and cons of each library

(7:04 pm) Tags: [Software, Projects, D Programming Language]

I started this post as a sidebar comparing the parsers in my benchmarks. I will post what I know and add to it as the community informs me. Consider this a living post. Where something is simply a fact, such as the language a library is written in, I list it as a Pro.

Product (parser style): Pros and Cons

Tango PullParser (pull)
  Pros:
  • Written in the D programming language
  • Tango devs are very aware of the cost of allocation and avoid it wherever possible
  • Extremely fast, extremely memory efficient
  Cons:
  • Beta-level code
  • Interfaces may change, since Tango is not yet 1.0
  • Not W3C XML compliant (ignores DOCTYPE, etc.)
Tango SaxParser (SAX)
  Pros:
  • Written in the D programming language, on top of Tango’s PullParser
  • Straight port of Java SAX code, with a small amount of D flavor
  • Useful for porting existing SAX-based code
  Cons:
  • Beta-level code
  • Interfaces may change, since Tango is not yet 1.0
  • As the benchmarks show, virtual calls (SAX makes a lot of them) are costly
  • Not W3C XML compliant
Tango Document (DOM)
  Pros:
  • Written in the D programming language; provides a DOM-style tree of XML to manipulate
  • Faster than all non-tree code tested so far
  • Not DOM compliant
  • Integrated query language, inspired by XPath
  Cons:
  • Beta-level code
  • Interfaces may change, since Tango is not yet 1.0
  • Not DOM compliant
  • Not W3C XML compliant
Phobos std.xml (DOM)
  Pros:
  • Written in the D programming language
  • Shipped in D 2.0’s standard library
  • DOM-style tree object model
  • Not DOM compliant
  Cons:
  • Not DOM compliant
  • Requires prior knowledge of the structure of the XML being parsed; cannot parse arbitrary XML
  • Not W3C XML compliant
RapidXml (DOM)
  Pros:
  • Written in C++, with ultimate performance in mind
  • Highly configurable; use only the feature set you need
  • Not DOM compliant
  Cons:
  • Not DOM compliant
  • Not W3C XML compliant (ignores DOCTYPE)
libxml2 (SAX)
  Pros:
  • Written in C
  • Extremely robust - passes all 1800 tests from the OASIS XML Test Suite
VTD-XML (DOM)
  Pros:
  • Written in Java, also available in C and C#
  • Indexes the XML for super fast querying
  • XPath support
Java SAX (SAX)
  Pros:
  • Written in Java
Java DOM (DOM)
  Pros:
  • Written in Java
  • W3C DOM compliant
  • W3C XML compliant
  • XPath support
Java StAX parsers (pull) (includes Aalto, Woodstox, and Javolution)
  Pros:
  • Written in Java
DOM4J (DOM)
  Pros:
  • Written in Java
  • XPath support
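
For readers unfamiliar with the (pull), (SAX), and (DOM) labels used above, here is a minimal sketch of the three styles in Java, using only the StAX, SAX, and DOM APIs that ship with the JDK. It is not the benchmark code: the class name ParseStyles, the method names, and the sample document are all made up for illustration. Each method counts the elements in the same small document, so the only thing that differs is how the parser is driven.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class ParseStyles {
    // Hypothetical sample input, not taken from the benchmark files.
    static final String SAMPLE =
            "<root><item id=\"1\"/><item id=\"2\">text</item></root>";

    // Pull (StAX): the caller owns the loop and asks the parser for the next event.
    static int countWithPull(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            int count = 0;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    count++;
                }
            }
            reader.close();
            return count;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // SAX (push): the parser owns the loop and invokes a virtual callback per event.
    static int countWithSax(String xml) {
        final int[] count = {0};
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                                 String qName, Attributes attrs) {
                            count[0]++;
                        }
                    });
            return count[0];
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // DOM: the whole document is built as an in-memory tree, then queried.
    static int countWithDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // "*" matches every element in the tree, including the root.
            return doc.getElementsByTagName("*").getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("pull: " + countWithPull(SAMPLE));
        System.out.println("sax:  " + countWithSax(SAMPLE));
        System.out.println("dom:  " + countWithDom(SAMPLE));
    }
}
```

All three methods should report the same element count for the sample. The pull version keeps control in the caller's loop, the SAX version pays for one callback per event, and the DOM version pays up front to build a tree that can then be walked or queried repeatedly.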
