Wednesday
10
Jun 2009

Code Coverage in Eclipse with EclEmma

(8:38 am) Tags: [Software, Projects]

I am fortunate enough to be working on a project where I get to start from a clean slate. I set up test cases using JUnit, and Eclipse runs them within the IDE easily. Next I wanted to know how well the unit tests were covering the code.

Enter EclEmma, an Eclipse plugin for showing code coverage. Simply add the update site http://update.eclemma.org/ to your list of update sites, install the plugin, and restart Eclipse. Next, run your test case with the EclEmma run button, and you get a report of code coverage. It even highlights the code in the Java editor to show no/partial/full coverage of a line.

This is the way an Eclipse plugin should operate. Great work, EclEmma team! Now, if it only tested a webapp… I know I can do it manually, but this simple ‘click here’ experience leaves me not wanting to do that.

Popularity: 36%

Comments: Comments Off
Tuesday
3
Feb 2009

Sum all numbers in a file

(11:06 am) Tags: [Software, Why I like..., How do I..., Sysadmin]

You HAVE to love the elegance of unix and the utilities contained therein:

awk '{ sum += $1; } END { print sum; }'
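
For anyone who wants to try it, here is a quick sketch. The file name numbers.txt and its contents are made up for illustration; the one-liner itself reads from a file argument or from stdin.

```shell
# numbers.txt is a hypothetical sample file, one number per line
printf '10\n20\n12.5\n' > numbers.txt

# Add up the first whitespace-separated field of every line
awk '{ sum += $1 } END { print sum }' numbers.txt
# prints: 42.5

# "sum + 0" forces a numeric 0 instead of a blank line when the input is empty
awk '{ sum += $1 } END { print sum + 0 }' /dev/null
# prints: 0
```

Changing $1 to another field number sums a different column.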

Popularity: 41%

Comments: (3)
Tuesday
27
Jan 2009

keepalived: who needs a redundant load balancer setup?

(3:31 pm) Tags: [Software, Why I like..., Projects]

I was recently tasked with some performance work for a client. Their production web application written in CakePHP was having serious speed/load issues, so I jumped in and took a look.

After some initial testing, I determined that the load balancer serving HTTPS traffic to 2 web servers was only allowing 10 requests/second through, while each web server individually would handle more than double that. I set up a simple SSL/mod_proxy using apache from my own colocated server, and the throughput jumped fourfold to over 40 requests/sec. After checking all was well with the hosting company’s rented load balancer, we decided to ditch it.

I set up a simple load balancing solution using the proxy capabilities of nginx, proxying back to Apache. I did this so I could be sure that the Apache config was untouched. After getting that set up, and seeing the performance come back to expectations, I was then asked by the client to make it redundant (with failover).

I did some quick research, and found keepalived, a small project that is part of the larger Linux Virtual Server project. Ironically, the best config I found was in the docs for haproxy.


vrrp_script chk_haproxy {           # Requires keepalived-1.1.13
    script "killall -0 haproxy"     # cheaper than pidof
    interval 2                      # check every 2 seconds
    weight 2                        # add 2 points of prio if OK
}

vrrp_instance VI_1 {
    interface eth0
    state MASTER
    virtual_router_id 51
    priority 101                    # 101 on master, 100 on backup
    virtual_ipaddress {
        192.168.1.1
    }
    track_script {
        chk_haproxy
    }
}

I modified the virtual ip address, and the check script to look for ‘nginx’, and bammo, it just worked, right out of the box.
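
The check script leans on signal 0: nothing is actually delivered to the process, and the exit status only says whether a matching process exists and can be signalled. Here is a minimal sketch of the same idea using plain kill and a PID. The helper name is_running is made up for illustration and is not part of keepalived.

```shell
# is_running is a hypothetical helper: succeeds if the PID exists and can be signalled.
# "killall -0 nginx" performs the same kind of test, but by process name.
is_running() {
    kill -0 "$1" 2>/dev/null
}

# The current shell is certainly alive, so this branch is taken
if is_running "$$"; then
    echo "process $$ is running"
fi
```

Because the check only inspects an exit status, keepalived can run it every couple of seconds at very little cost.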

I am pleased with the simple configuration of keepalived, and that it ‘just worked’.

Popularity: 38%

Comments: Comments Off
Tuesday
6
Jan 2009

sparse revisions in subversion

(9:38 am) Tags: [Software]

Finally, an easy way to keep the root folder checked out, but only check out a few subfolders.
svn co --depth=empty is the best new feature in svn 1.5, bar none. Check out more info here and here.
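
As a rough sketch of that workflow: check out the root with no children, then pull in only the subfolders you want. The repository URL and folder names in the usage comment are placeholders, and sparse_checkout is a made-up wrapper name, not an svn command.

```shell
# sparse_checkout URL DEST SUBDIR...  (hypothetical wrapper, requires svn 1.5+)
sparse_checkout() {
    url=$1
    dest=$2
    shift 2
    # Check out only the root folder itself, with no files or children
    svn checkout --depth empty "$url" "$dest" || return 1
    # Deepen just the requested top-level subfolders to full depth
    for sub in "$@"; do
        svn update --set-depth infinity "$dest/$sub" || return 1
    done
}

# Example (placeholder URL and folder names):
#   sparse_checkout http://svn.example.com/repo/trunk work docs tools
```

The sketch only handles direct children of the root; a nested folder would need its parent brought in first.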

Popularity: 34%

Comments: Comments Off
Saturday
3
Jan 2009

DimDim mini-review

(12:14 pm) Tags: [Software]

We tried out a company conference call using DimDim yesterday morning, and here is a mini-review.

Pros

Cons

Looks like we will have to stay with our current crap solution, because screen sharing is our 90% use case.

Popularity: 36%

Comments: Comments Off
Friday
2
Jan 2009

Tango Patch Day

(2:19 pm) Tags: [D Programming Language]

Today was my day for patching up Tango, and I managed to make a good dent in getting Tango working on 64bit with GDC. As of svn revision 4235, I am compiling and running my programs on 64bit Ubuntu 8.10 with GDC and Tango.

Now to go see if I can get Mango going on 64bit.

Popularity: 39%

Comments: Comments Off
Thursday
1
Jan 2009

WebSVN install

(3:59 pm) Tags: [Software]

Installed WebSVN this afternoon for the corporate svn repository, and from the first use, it is one fast little script.

We were using Trac in the past, but I feel for svn browsing, WebSVN is going to become the de-facto standard.

Install was simple: just unzip, copy the sample config to your own, and edit a few options.

Popularity: 35%

Comments: Comments Off

rsync backups with TeraStation

(11:41 am) Tags: [Software, Life]

I just successfully set up a backup for my laptop to the home TeraStation, using instructions I found here.

If you just generically want to start using rsync to do backups, there is a great resource here.

I am already backing up via SVN for the really important stuff, and SVN backs up to S3, so I am covered there.

At work, we are using JungleDisk Workgroup to back up home folders, and that seems to be working out fairly well.

It feels good to start the new year with a backup…

Popularity: 40%

Comments: Comments Off
Friday
21
Nov 2008

Full 1920×1200 screen resolution on Dell Latitude E6500 with Ubuntu 8.10

(2:23 pm) Tags: [Software]

I am temporarily stepping away from Vista after some lockups that I can’t pin down to software or hardware. I downloaded the Ubuntu 8.10 CD ISO from the torrent, burned it onto a CD, and rebooted the laptop.

Install took about 15 minutes, after which I was up and running in Ubuntu. Great compatibility work, congrats to everyone in the community that makes stuff like this happen for people like me.

Once booted into Ubuntu, I thought the screen resolution was a bit off, so I checked, and sure enough, my Latitude E6500 with 15.4 inch 1920×1200 screen was running at 1280×1024. I tried running the screen resolution utility, but to no avail.

I found this post showing how to get it working, and these are the steps that I followed:

  1. Open a terminal
  2. Type: sudo apt-get install envyng-gtk
  3. Once installed, type sudo envyng -t
  4. Follow the prompts (I installed the recommended nVidia driver)
  5. Reboot as it recommends
  6. The laptop now boots at 1920×1200 resolution natively. Beautiful!

Having been a heavy CentOS/RedHat user, I am looking forward to learning the ways of Ubuntu.

Popularity: 38%

Comments: Comments Off
Wednesday
12
Mar 2008

Why is D/Tango so fast at parsing XML?

(9:05 am) Tags: [General, Projects, D Programming Language]

I have been getting questions concerning the performance of Tango in the XML benchmarks I have been running, with people wondering how something that is not C/C++ could be so fast. “They must be cheating!”

This post intends to explain how D, and subsequently Tango, can perform so well, even against C/C++. To read more about D, please visit the home page for D - D Programming Language. Tango is an alternate ‘standard’ library for the D programming language, with a design philosophy of building a great library with extensive documentation that provides the greatest functionality in the most efficient manner possible. How do they do that, you ask?

Comments are open if other D people would like to add their $.02.

Popularity: 76%

Comments: (3)
Monday
10
Mar 2008

XML Benchmarks - Parse/Query/Mutate/Serialize

(8:41 am) Tags: [Software, Projects, D Programming Language]

I created a benchmark similar to the one that VTD-XML uses. Basically, since most xml processing is mutation, this benchmark parses an input xml file, executes various xpaths on the file, modifying the document in 2 instances, and then serializes the new document. The steps are listed below:

  1. Parse blog.xml, preparing to query the resulting document
  2. Perform the following xpath queries, or their equivalents, once each:
    • count(//*) (10390 for this document)
    • //item (a list of those 10390 items)
    • /blog/item (similar to the previous, except you know the path)
    • //text() (all text nodes)
    • count(//item)
    • count(/blog/item)
    • /blog/item[@num='a781']
    • /blog/item/body/p/a
  3. Mutate the document by removing the resulting nodes from the last 2 queries (performed inline with the queries)
  4. serialize the modified document back out

I created this benchmark for 4 products (the ones that have xpath or xpath-like support; if you know of another one, please send me some code, and I will be happy to run it and aggregate the results):

After the run, I take the average cycle time, and turn that into the following graph showing cycles per second. blog.xml is 1.3MB, so you can multiply these numbers by 1.3 to get the megabytes per second number for each tool.
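
To make that conversion concrete, here is the arithmetic as a one-liner. The figure of 100 cycles per second is a made-up value, not a measured result; only the 1.3MB file size comes from the benchmark.

```shell
# cps: cycles per second (100 is a hypothetical value, not a measurement)
# size_mb: size of blog.xml in megabytes
awk -v cps=100 -v size_mb=1.3 'BEGIN { printf "%.1f MB/s\n", cps * size_mb }'
# prints: 130.0 MB/s
```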

Some notes on the implementations:

I would also note that these benchmarks were run on an Intel Q6700 quad core machine at 2.66 GHz, with 4GB of RAM, running Ubuntu Linux.

Popularity: 62%

Comments: (3)

XML Benchmarks - updated graphs with RapidXml

(8:25 am) Tags: [Software, Projects]

I have added the recent RapidXml results to the graphs. Note that the RAM usage for RapidXml skyrockets, costing it efficiency. As noted on their homepage, they make a copy of the input buffer, because the input is ‘destroyed’ while parsing. I would assume that this memory usage would fit the machine it is running on, but that is a HUGE amount of allocation.

Popularity: 58%

Comments: (2)
Sunday
9
Mar 2008

XML Benchmarks - pros and cons of each library

(7:04 pm) Tags: [Software, Projects, D Programming Language]

I have started writing this post as a sidebar for comparing the parsers in my benchmarks. I will post what I know, and add more to it as I am informed by the community. Consider this a living post. Where something is just a fact, such as the language a parser is written in, I list it as a Pro.

Tango PullParser (pull)
Pros
  • Written in the D programming language
  • Tango devs are very aware of the cost of allocation, and try to avoid it as often as possible.
  • Extremely fast, extremely memory efficient
Cons
  • Beta level code
  • Interfaces may change, since Tango is not yet 1.0
  • NOT W3C XML compliant (ignores DOCTYPE, etc)

Tango SaxParser (SAX)
Pros
  • Written in the D programming language, on top of Tango’s PullParser.
  • Straight port of Java SAX code, with a small amount of D flavor
  • Useful for porting existing SAX-based code
Cons
  • Beta level code
  • Interfaces may change, since Tango is not yet 1.0
  • As shown in the benchmarks, virtual calls (SAX does a lot of them) cost quite dearly
  • NOT W3C XML compliant

Tango Document (DOM)
Pros
  • Written in the D programming language, and a DOM-style tree of xml to manipulate
  • Faster than all non-tree code tested so far
  • Not DOM compliant
  • Integrated query language, inspired by XPath
Cons
  • Beta level code
  • Interfaces may change, since Tango is not yet 1.0
  • Not DOM compliant
  • NOT W3C XML compliant

Phobos std.xml (DOM)
Pros
  • Written in the D programming language
  • Shipped in D 2.0’s standard library
  • DOM-style tree object model
  • Not DOM compliant
Cons
  • Not DOM compliant
  • Requires previous knowledge of the structure of the xml being parsed. Cannot parse arbitrary XML
  • NOT W3C compliant

RapidXml (DOM)
Pros
  • Written in C++, with ultimate performance in mind
  • Highly configurable, use only the featureset you need.
  • Not DOM compliant
Cons
  • Not DOM compliant
  • Not W3C XML compliant (ignores DOCTYPE)

libxml2 (SAX)
Pros
  • Written in C
  • Extremely robust - passes all 1800 tests from the OASIS XML Tests Suite

VTD-XML (DOM)
Pros
  • Written in Java, also available in C, C#
  • Indexes the XML for super fast querying
  • XPath support

Java SAX (SAX)
Pros
  • Written in Java

Java DOM (DOM)
Pros
  • Written in Java
  • W3C DOM compliant
  • W3C XML compliant
  • XPath support

Java StaX parsers (pull) (includes Aalto, Woodstox, and javolution)
Pros
  • Written in Java

DOM4J (DOM)
Pros
  • Written in Java
  • XPath support

Popularity: 63%

Comments: Comments Off

XML Benchmarks - RapidXml

(6:56 pm) Tags: [Software, Projects]

Aaron was kind enough to help me out with the RapidXml test. RapidXml is written in highly-tuned C++, and does give Tango a run for the money. I am really glad we are starting to add some non-Java alternatives, so we can see what native code can do. Without further ado, the code is bench_rapidxml.cpp, which was compiled via:

g++ bench_rapidxml.cpp -O2 -o bench

Results for hamlet.xml:

stonecobra@jeff-home:~/xmlbench$ vi bench_rapidxml.cpp
stonecobra@jeff-home:~/xmlbench$ g++ bench_rapidxml.cpp -O2 -o bench
stonecobra@jeff-home:~/xmlbench$ ./bench
Document Length: 279628 bytes
Data Length: 279629 bytes
Fastest:313.362203 MB/s
Fastest:312.956579 MB/s
Fastest:313.055406 MB/s
Fastest:301.303166 MB/s
Fastest:310.668081 MB/s
Fastest:310.523743 MB/s
Fastest:310.924893 MB/s
Fastest:310.434819 MB/s
Fastest:310.868351 MB/s
Fastest:310.745189 MB/s
Default:172.539398 MB/s
Default:172.309405 MB/s
Default:172.501116 MB/s
Default:172.385035 MB/s
Default:172.386038 MB/s
Default:172.455936 MB/s
Default:172.498550 MB/s
Default:172.357293 MB/s
Default:172.331007 MB/s
Default:172.326775 MB/s
strlen:3543.806666 MB/s
strlen:3589.165483 MB/s
strlen:3590.035209 MB/s
strlen:3560.508898 MB/s
strlen:3587.427295 MB/s
strlen:3590.035209 MB/s
strlen:3573.965308 MB/s
strlen:3589.551976 MB/s
strlen:3590.276875 MB/s
strlen:3565.793459 MB/s

Average parsing speed: 310.48 MB/sec in fastest mode, 172.41 MB/sec in default mode.

Results for soap_mid.xml:

stonecobra@jeff-home:~/xmlbench$ vi bench_rapidxml.cpp
stonecobra@jeff-home:~/xmlbench$ g++ bench_rapidxml.cpp -O2 -o bench
stonecobra@jeff-home:~/xmlbench$ ./bench
Document Length: 134334 bytes
Data Length: 134335 bytes
Fastest:197.352607 MB/s
Fastest:197.097866 MB/s
Fastest:196.779684 MB/s
Fastest:197.276936 MB/s
Fastest:197.096047 MB/s
Fastest:188.870551 MB/s
Fastest:197.026330 MB/s
Fastest:197.164297 MB/s
Fastest:197.156408 MB/s
Fastest:196.966655 MB/s
Default:121.320212 MB/s
Default:121.256024 MB/s
Default:121.385734 MB/s
Default:121.286215 MB/s
Default:121.236746 MB/s
Default:121.340896 MB/s
Default:121.295172 MB/s
Default:121.264861 MB/s
Default:121.311711 MB/s
Default:121.360322 MB/s
strlen:3608.479264 MB/s
strlen:3586.658061 MB/s
strlen:3619.080745 MB/s
strlen:3613.568366 MB/s
strlen:3619.694270 MB/s
strlen:3615.812122 MB/s
strlen:3615.403959 MB/s
strlen:3609.495937 MB/s
strlen:3615.914177 MB/s
strlen:3612.651269 MB/s

Average parsing speed: 196.28 MB/sec in fastest mode, 121.31 MB/sec in default mode.
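
The averages can be recomputed straight from the benchmark output. This is a small sketch, assuming the "Label:value MB/s" line format shown in the transcripts; avg_mode is a made-up helper name.

```shell
# avg_mode LABEL: average the MB/s figures on stdin for lines starting with "LABEL:"
# (hypothetical helper; awk turns "313.36 MB/s" into 313.36 by reading the leading number)
avg_mode() {
    awk -F: -v label="$1" '
        $1 == label { sum += $2; n++ }
        END { if (n > 0) printf "%.2f\n", sum / n }
    '
}

# Usage against the binary built above:
#   ./bench | avg_mode Fastest
#   ./bench | avg_mode Default
```

The strlen lines are ignored unless you ask for them, so one pass over the output gives each mode its own average.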

Popularity: 56%

Comments: (3)
Friday
7
Mar 2008

iPhone Enterprise

(9:38 am) Tags: [Software, Life]

All I can say at this moment is “finally”. I was about 2 weeks away from tossing the iPhone and scoring a Crackberry.

iPhone - Enterprise

Push email is what I need the MOST in a phone, and the iPhone wasn’t cutting the mustard. That may change sometime in the really near future, on a date we aren’t allowed to know at this point, but the teaser seems to be enough.

Popularity: 57%

Comments: Comments Off
Thursday
6
Mar 2008

Tango XML - Querying the weather in London

(5:42 pm) Tags: [General, Projects, D Programming Language]

I was helping someone on IRC in #d.tango try to use tango.text.xml to parse and display data from an xml document. We ended up building a simple example using HttpGet to get the document, Document to parse it, and Document’s xpath-like querying functionality to extract the useful bits.

import tango.io.File;
import tango.io.Stdout;
import tango.text.xml.Document;
import tango.net.http.HttpGet;

void main ()
{
        auto doc = new Document!(char);

        // fetch the weather feed and parse it into a document tree
        auto page = new HttpGet ("http://www.google.com/ig/api?weather=London");
        auto content = cast (char[]) page.read;
        doc.parse (content);

        // walk every forecast_conditions element and print its details
        foreach (node; doc.query.descendant["forecast_conditions"])
        {
                Stdout.formatln ("forecast for {} is {} with a high of {}",
                                 node.query["day_of_week"].attribute.nodes[0].value,
                                 node.query["condition"].nodes[0].getAttribute("data").value,
                                 node.query["high"].nodes[0].getAttribute("data").value);
        }
}

The D programming language coupled with Tango as a standard library allows you to become a productive programmer.

Popularity: 68%

Comments: (1)
Tuesday
4
Mar 2008

XML Benchmarks - Updated graphs

(2:23 pm) Tags: [General, Software, Projects, D Programming Language]

Because of a typo in the Aalto benchmark run, I accidentally benchmarked the default Java6 StaX parser. This graph changes the axis to allow more players, and adds the real Aalto numbers. Click to view the graphs in full size.

Popularity: 68%

Comments: Comments Off
Wednesday
27
Feb 2008

XML Benchmarks - Updated graphs with StaX parsers

(10:22 pm) Tags: [Software, Projects, D Programming Language]

Thanks to Paul Findlay, we finally have a possible contender in the Java camp with Aalto.

This goes to show you how good library design and the D Programming Language come together to kick serious butt.

PS: I am looking for anyone to do comparisons with MSXML, RapidXML, etc. More native code help is needed. Send me email at scott aht dotnot daht org.

Popularity: 59%

Comments: (3)

XML Benchmarks - Aalto

(10:11 pm) Tags: [Software, Projects]

Next up from Paul Findlay: Aalto. Aalto.java:

// requires jar files from http://www.cowtowncoder.com/hatchery/aalto/index.html
// (and maybe some command line switches as per the same page)

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;

import java.io.*;

public class Aalto
{
    // Read the whole file into memory so disk I/O stays out of the timed loop
    public static byte[] getBytesFromFile(File file) throws IOException {
        InputStream is = new FileInputStream(file);
        long length = file.length();

        byte[] bytes = new byte[(int)length];

        int offset = 0;
        int numRead = 0;
        while (offset < bytes.length
               && (numRead=is.read(bytes, offset, bytes.length-offset)) >= 0) {
            offset += numRead;
        }

        if (offset < bytes.length) {
            throw new IOException("Could not completely read file "+file.getName());
        }

        is.close();
        return bytes;
    }

    public static void main (String args[]) throws Exception
    {
        int iterations = 2000;

        XMLInputFactory xmlif = XMLInputFactory.newInstance();
        xmlif.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.FALSE);
        xmlif.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
        xmlif.setProperty(XMLInputFactory.IS_COALESCING, Boolean.FALSE);

        byte[] content = Aalto.getBytesFromFile(new File(args[0]));
        ByteArrayInputStream bais = new ByteArrayInputStream(content);

        // 10 timed batches, each parsing the document `iterations` times
        for (int i = 0; i < 10; i++) {
            long start = System.currentTimeMillis();

            for (int j = 0; j < iterations; j++) {
                XMLStreamReader xr = xmlif.createXMLStreamReader(bais);
                while (xr.hasNext()) {
                    xr.next();
                }
                xr.close();
                bais.reset();
            }

            long stop = System.currentTimeMillis();
            double timer = (stop - start) / 1000.0;
            // throughput = bytes parsed / elapsed seconds, scaled to MB
            double total = (content.length * iterations) / (timer * (1024 * 1024));
            System.out.print(total);
            System.out.println(" MB/s");
        }
    }
}

How it was run:

echo "aalto"
javac -classpath aalto-0.9.jar Aalto.java
echo "hamlet.xml"
java -cp aalto-0.9.jar:stax2-3.0pr1.jar:. Aalto hamlet.xml
echo "soap_mid.xml"
java -cp alto-0.9.jar:stax2-3.0pr1.jar:. Aalto soap_mid.xml

Results:

stonecobra@jeff-home:~/xmlbench$ ./all
aalto
hamlet.xml
119.02434356083324 MB/s
149.60675553887623 MB/s
149.81687738654318 MB/s
149.4390819546354 MB/s
150.23889675946305 MB/s
150.36596659038446 MB/s
150.4507992936795 MB/s
151.09010863912005 MB/s
151.00455365121567 MB/s
151.13292249818468 MB/s
soap_mid.xml
41.88683525261311 MB/s
43.82856162166171 MB/s
43.896140352961176 MB/s
43.86607965078486 MB/s
43.552910290707864 MB/s
44.16855218759427 MB/s
44.16093954502489 MB/s
44.13811735404555 MB/s
44.267755915728124 MB/s
44.22191426307118 MB/s

Average for hamlet.xml: 147.22 MB/sec
Average for soap_mid.xml: 43.80 MB/sec

As noted on the website, Aalto does seem to be quite fast on the “fast path”. Impressive for a Java solution at this point.
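
For reference, the MB/s figure each run prints is bytes parsed divided by elapsed time. Here is a sketch of the same arithmetic outside Java. The helper mb_per_sec is a made-up name and the 3.5 second timing is invented for illustration; 279628 bytes is the hamlet.xml length reported in the RapidXml run.

```shell
# mb_per_sec BYTES ITERATIONS SECONDS  (hypothetical helper)
# Mirrors the Java expression: (content.length * iterations) / (timer * (1024 * 1024))
mb_per_sec() {
    awk -v bytes="$1" -v iters="$2" -v secs="$3" \
        'BEGIN { printf "%.2f MB/s\n", (bytes * iters) / (secs * 1024 * 1024) }'
}

# 2000 passes over hamlet.xml in an invented 3.5 seconds
mb_per_sec 279628 2000 3.5
```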

Update: 2008-03-03 13:15 PST: Thanks to Paul Findlay for catching my misspelling of the aalto.jar in the java run command. These numbers posted are actually for the default Java6 StaX parser, and not Aalto. Re-running, I get:

stonecobra@jeff-home:~/xmlbench$ ./all
aalto
hamlet.xml
138.74820070137716 MB/s
148.31704212905834 MB/s
148.73064235808528 MB/s
148.73064235808528 MB/s
148.56492576492863 MB/s
148.85517261961868 MB/s
148.97991159108764 MB/s
149.18827510380245 MB/s
149.14655578749824 MB/s
149.23001776611466 MB/s
soap_mid.xml
79.94439040256923 MB/s
85.83643927646042 MB/s
86.3571861274804 MB/s
86.64922936768157 MB/s
85.72156950158394 MB/s
86.5906628050809 MB/s
87.09101673699332 MB/s
87.03185164410135 MB/s
87.06142413871369 MB/s
87.20958857734321 MB/s

Average for hamlet.xml: 147.85 MB/sec
Average for soap_mid.xml: 85.95 MB/sec
Much more impressive numbers from the Java camp. Graphs will be updated later today.

Popularity: 37%

Comments: Comments Off

XML Benchmarks - Javolution

(9:56 pm) Tags: [Software, Projects]

Another benchmark from Paul Findlay, using Javolution. Here is Javolution.java:

// requires jar files from http://javolution.org/javolution-5.2.6-bin.zip

import javolution.xml.stream.XMLInputFactory;
import javolution.xml.stream.XMLStreamReader;

import java.io.*;

public class Javolution
{
    // Read the whole file into memory so disk I/O stays out of the timed loop
    public static byte[] getBytesFromFile(File file) throws IOException {
        InputStream is = new FileInputStream(file);
        long length = file.length();

        byte[] bytes = new byte[(int)length];

        int offset = 0;
        int numRead = 0;
        while (offset < bytes.length
               && (numRead=is.read(bytes, offset, bytes.length-offset)) >= 0) {
            offset += numRead;
        }

        if (offset < bytes.length) {
            throw new IOException("Could not completely read file "+file.getName());
        }

        is.close();
        return bytes;
    }

    public static void main (String args[]) throws Exception
    {
        int iterations = 2000;

        XMLInputFactory factory = XMLInputFactory.newInstance();

        byte[] content = Javolution.getBytesFromFile(new File(args[0]));
        ByteArrayInputStream bais = new ByteArrayInputStream(content);

        // 10 timed batches, each parsing the document `iterations` times
        for (int i = 0; i < 10; i++) {
            long start = System.currentTimeMillis();

            for (int j = 0; j < iterations; j++) {
                XMLStreamReader xr = factory.createXMLStreamReader(bais);
                while (xr.hasNext()) {
                    xr.next();
                }
                xr.close();
                bais.reset();
            }

            long stop = System.currentTimeMillis();
            double timer = (stop - start) / 1000.0;
            // throughput = bytes parsed / elapsed seconds, scaled to MB
            double total = (content.length * iterations) / (timer * (1024 * 1024));
            System.out.print(total);
            System.out.println(" MB/s");
        }
    }
}

How it was run:

javac -classpath javolution.jar Javolution.java
echo "hamlet.xml"
java -cp javolution.jar:. Javolution hamlet.xml
echo "soap_mid.xml"
java -cp javolution.jar:. Javolution soap_mid.xml

Results:

stonecobra@jeff-home:~/xmlbench$ ./all
javolution
hamlet.xml
50.6551508686574 MB/s
51.165395577138696 MB/s
51.19486307315164 MB/s
51.19486307315164 MB/s
51.18503680384777 MB/s
51.23420590740574 MB/s
51.229284746527114 MB/s
51.23420590740574 MB/s
51.23420590740574 MB/s
51.229284746527114 MB/s
soap_mid.xml
44.98275478234452 MB/s
45.975555578724986 MB/s
46.000317996451415 MB/s
46.00857806432652 MB/s
45.99206089395699 MB/s
46.066481704465005 MB/s
46.08305238133712 MB/s
46.08305238133712 MB/s
46.09134219108371 MB/s
46.066481704465005 MB/s

Average for hamlet.xml: 51.16 MB/sec
Average for soap_mid.xml: 45.93 MB/sec

Most of the Java camp is starting to look the same.

Popularity: 31%

Comments: Comments Off