I had forgotten to re-enable the comments posting script when I started posting again regularly. If anyone has tried to comment and not been able to, I apologize; it should be working now.
I have been getting questions concerning the performance of Tango in the XML benchmarks I have been running, with people wondering how something that is not C/C++ could be so fast. “They must be cheating!”
This post intends to explain how D, and by extension Tango, can perform so well, even against C/C++. To read more about D, please visit the D Programming Language home page. Tango is an alternative ‘standard’ library for the D programming language, designed to be a great library with extensive documentation that provides the greatest functionality in the most efficient manner possible. How do they do that, you ask?
Comments are open if other D people would like to add their $.02.
I created a benchmark similar to the one that VTD-XML uses. Since most XML processing is mutation, this benchmark parses an input XML file, executes various XPath queries against it, modifies the document in two places, and then serializes the new document. The steps are listed below:
I created this benchmark for 4 products (the ones that have XPath or XPath-like support; if you know of another one, please send me some code, and I will be happy to run it and aggregate the results):
After the run, I take the average cycle time and turn that into the following graph showing cycles per second. blog.xml is 1.3MB, so you can multiply these numbers by 1.3 to get the megabytes-per-second figure for each tool.

Some notes on the implementations:
I would also note that these benchmarks were run on an Intel Q6700 quad-core machine at 2.66 GHz, with 4GB of RAM, running Ubuntu Linux.
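For readers who want to see the shape of one benchmark cycle and the throughput arithmetic, here is a minimal sketch in Python using only the standard library. It is not the benchmark code itself: the tiny inline document, the element names, the two modifications, and the helper names `run_cycle`, `benchmark`, and `cycles_to_mb_per_sec` are all assumptions made for illustration. The conversion simply multiplies cycles per second by the input size in megabytes (1.3 for blog.xml).

```python
import time
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the input file; the real benchmark used blog.xml.
SAMPLE_XML = """<blog>
  <post id="1"><title>First</title><comments>3</comments></post>
  <post id="2"><title>Second</title><comments>0</comments></post>
</blog>"""


def run_cycle(xml_text):
    """One cycle: parse, query, modify the document twice, serialize."""
    root = ET.fromstring(xml_text)                 # parse
    titles = root.findall("./post/title")          # XPath-like query
    first = root.find("./post[@id='1']")           # XPath-like query
    titles[0].text = "Updated"                     # first modification
    first.set("edited", "true")                    # second modification
    return ET.tostring(root, encoding="unicode")   # serialize


def benchmark(xml_text, iterations=1000):
    """Return cycles per second, averaged over `iterations` cycles."""
    start = time.perf_counter()
    for _ in range(iterations):
        run_cycle(xml_text)
    elapsed = time.perf_counter() - start
    return iterations / elapsed


def cycles_to_mb_per_sec(cycles_per_sec, file_size_mb):
    """Convert cycles per second to megabytes per second."""
    return cycles_per_sec * file_size_mb


if __name__ == "__main__":
    cps = benchmark(SAMPLE_XML)
    size_mb = len(SAMPLE_XML.encode("utf-8")) / (1024 * 1024)
    print(f"{cps:.1f} cycles/s, {cycles_to_mb_per_sec(cps, size_mb):.3f} MB/s")
```

With a 1.3MB input, a tool that manages 10 cycles per second is processing 13 MB/s; the absolute numbers from this sketch say nothing about the tools in the graph.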
I have added the recent RapidXml to the graphs. Note that the RAM usage for RapidXml skyrockets, costing it efficiency. As noted on their homepage, they make a copy of the input buffer because the input is ‘destroyed’ while parsing. I would assume that this memory usage would fit the machine it is running on, but that is a HUGE amount of allocation.

I started writing this post as a sidebar to comparing the parsers in my benchmarks. I will post what I know and add more as the community informs me. Consider this a living post. Where something is just a fact, such as the language a parser was developed in, I list it as a Pro.
| Product | Pros | Cons |
| --- | --- | --- |
| Tango PullParser (pull) | | |
| Tango SaxParser (SAX) | | |
| Tango Document (DOM) | | |
| Phobos std.xml (DOM) | | |
| RapidXml (DOM) | | |
| libxml2 (SAX) | | |
| VTD-XML (DOM) | | |
| Java SAX (SAX) | | |
| Java DOM (DOM) | | |
| Java StAX parsers (pull; includes Aalto, Woodstox, and Javolution) | | |
| DOM4J (DOM) | | |
Aaron was kind enough to help me out with the RapidXml test. RapidXml is written in highly tuned C++, and it does give Tango a run for its money. I am really glad we are starting to add some non-Java alternatives, so we can see what native code can do. Without further ado, the code is bench_rapidxml.cpp, which was compiled via:
g++ bench_rapidxml.cpp -O2 -o bench
Results for hamlet.xml:
stonecobra@jeff-home:~/xmlbench$ vi bench_rapidxml.cpp
stonecobra@jeff-home:~/xmlbench$ g++ bench_rapidxml.cpp -O2 -o bench
stonecobra@jeff-home:~/xmlbench$ ./bench
Document Length: 279628 bytes
Data Length: 279629 bytes
Fastest:313.362203 MB/s
Fastest:312.956579 MB/s
Fastest:313.055406 MB/s
Fastest:301.303166 MB/s
Fastest:310.668081 MB/s
Fastest:310.523743 MB/s
Fastest:310.924893 MB/s
Fastest:310.434819 MB/s
Fastest:310.868351 MB/s
Fastest:310.745189 MB/s
Default:172.539398 MB/s
Default:172.309405 MB/s
Default:172.501116 MB/s
Default:172.385035 MB/s
Default:172.386038 MB/s
Default:172.455936 MB/s
Default:172.498550 MB/s
Default:172.357293 MB/s
Default:172.331007 MB/s
Default:172.326775 MB/s
strlen:3543.806666 MB/s
strlen:3589.165483 MB/s
strlen:3590.035209 MB/s
strlen:3560.508898 MB/s
strlen:3587.427295 MB/s
strlen:3590.035209 MB/s
strlen:3573.965308 MB/s
strlen:3589.551976 MB/s
strlen:3590.276875 MB/s
strlen:3565.793459 MB/s
Average parsing speed: 310.48 MB/sec in fastest mode, 172.41 MB/sec in default mode.
Results for soap_mid.xml:
stonecobra@jeff-home:~/xmlbench$ vi bench_rapidxml.cpp
stonecobra@jeff-home:~/xmlbench$ g++ bench_rapidxml.cpp -O2 -o bench
stonecobra@jeff-home:~/xmlbench$ ./bench
Document Length: 134334 bytes
Data Length: 134335 bytes
Fastest:197.352607 MB/s
Fastest:197.097866 MB/s
Fastest:196.779684 MB/s
Fastest:197.276936 MB/s
Fastest:197.096047 MB/s
Fastest:188.870551 MB/s
Fastest:197.026330 MB/s
Fastest:197.164297 MB/s
Fastest:197.156408 MB/s
Fastest:196.966655 MB/s
Default:121.320212 MB/s
Default:121.256024 MB/s
Default:121.385734 MB/s
Default:121.286215 MB/s
Default:121.236746 MB/s
Default:121.340896 MB/s
Default:121.295172 MB/s
Default:121.264861 MB/s
Default:121.311711 MB/s
Default:121.360322 MB/s
strlen:3608.479264 MB/s
strlen:3586.658061 MB/s
strlen:3619.080745 MB/s
strlen:3613.568366 MB/s
strlen:3619.694270 MB/s
strlen:3615.812122 MB/s
strlen:3615.403959 MB/s
strlen:3609.495937 MB/s
strlen:3615.914177 MB/s
strlen:3612.651269 MB/s
Average parsing speed: 196.28 MB/sec in fastest mode, 121.31 MB/sec in default mode.
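The averages quoted above are plain arithmetic means of the ten runs per mode. If you want to check them yourself, here is a small Python sketch that does so for the hamlet.xml numbers. The script is not part of the benchmark; the regular expression and the `average_by_mode` helper are my own illustration, and it assumes one `Mode:value MB/s` result per line.

```python
import re
from collections import defaultdict

# Fastest and Default results copied from the hamlet.xml run above.
HAMLET_OUTPUT = """\
Fastest:313.362203 MB/s
Fastest:312.956579 MB/s
Fastest:313.055406 MB/s
Fastest:301.303166 MB/s
Fastest:310.668081 MB/s
Fastest:310.523743 MB/s
Fastest:310.924893 MB/s
Fastest:310.434819 MB/s
Fastest:310.868351 MB/s
Fastest:310.745189 MB/s
Default:172.539398 MB/s
Default:172.309405 MB/s
Default:172.501116 MB/s
Default:172.385035 MB/s
Default:172.386038 MB/s
Default:172.455936 MB/s
Default:172.498550 MB/s
Default:172.357293 MB/s
Default:172.331007 MB/s
Default:172.326775 MB/s
"""

# Matches lines such as "Fastest:313.362203 MB/s".
LINE_RE = re.compile(r"^(\w+):([\d.]+) MB/s$")


def average_by_mode(output):
    """Group the MB/s samples by mode name and return the mean of each group."""
    samples = defaultdict(list)
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:  # skip prompts, byte counts, and anything else
            samples[match.group(1)].append(float(match.group(2)))
    return {mode: sum(values) / len(values) for mode, values in samples.items()}


if __name__ == "__main__":
    for mode, mean in average_by_mode(HAMLET_OUTPUT).items():
        print(f"{mode}: {mean:.2f} MB/s")
    # prints:
    # Fastest: 310.48 MB/s
    # Default: 172.41 MB/s
```

Pasting the soap_mid.xml lines into the string instead gives the second pair of averages the same way.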
All I can say at this moment is “finally”. I was about 2 weeks away from tossing the iPhone and scoring a Crackberry.
Push email is what I need MOST in a phone, and the iPhone wasn’t cutting the mustard. That may change sometime in the really near future, in ways we aren’t allowed to know about at this point, but the teaser seems to be enough.
I was helping someone on IRC in #d.tango try to use tango.text.xml to parse and display data from an xml document. We ended up building a simple example using HttpGet to get the document, Document to parse it, and Document’s xpath-like querying functionality to extract the useful bits.
import tango.io.File;
import tango.io.Stdout;
import tango.text.xml.Document;
import tango.net.http.HttpGet;

void main ()
{
    auto doc = new Document!(char);

    // fetch the weather feed over HTTP
    auto page = new HttpGet ("http://www.google.com/ig/api?weather=London");
    auto content = cast (char[]) page.read;

    // parse the response into a DOM
    doc.parse (content);

    // visit every forecast_conditions element, at any depth
    foreach (node; doc.query.descendant["forecast_conditions"])
    {
        Stdout.formatln ("forecast for {} is {} with a high of {}",
            node.query["day_of_week"].attribute.nodes[0].value,
            node.query["condition"].nodes[0].getAttribute("data").value,
            node.query["high"].nodes[0].getAttribute("data").value);
    }
}
The D programming language coupled with Tango as a standard library allows you to become a productive programmer.
Update: If you see backslashes before the quotes in the code when you try to run this example, please ignore them. For some reason, WordPress is mucking around with the output.
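If you do not have a D toolchain handy, the same walk over the document can be tried offline. The sketch below uses Python's standard library against a small inline sample; the sample document, its values, and the `forecasts` helper are assumptions for illustration, shaped only by the element and attribute names the D code above queries. Where the D example reads the first attribute of `day_of_week`, this sketch assumes that attribute is named `data`, like the others.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, shaped after the elements the D example queries.
SAMPLE = """<xml_api_reply>
  <weather>
    <forecast_conditions>
      <day_of_week data="Mon"/>
      <high data="18"/>
      <condition data="Sunny"/>
    </forecast_conditions>
    <forecast_conditions>
      <day_of_week data="Tue"/>
      <high data="15"/>
      <condition data="Rain"/>
    </forecast_conditions>
  </weather>
</xml_api_reply>"""


def forecasts(xml_text):
    """Return one formatted line per forecast_conditions element."""
    root = ET.fromstring(xml_text)
    lines = []
    # iter() visits matching descendants at any depth, like .descendant[...]
    for node in root.iter("forecast_conditions"):
        day = node.find("day_of_week").get("data")
        condition = node.find("condition").get("data")
        high = node.find("high").get("data")
        lines.append(f"forecast for {day} is {condition} with a high of {high}")
    return lines


if __name__ == "__main__":
    for line in forecasts(SAMPLE):
        print(line)
    # prints:
    # forecast for Mon is Sunny with a high of 18
    # forecast for Tue is Rain with a high of 15
```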
Because of a typo in my Aalto benchmark, I had accidentally benchmarked the default Java 6 StAX parser. This graph changes the axis to allow more players and adds the real Aalto numbers.