<blog>

<item num="a1046">
<title>LibraryLookup: Aleph</title>
<date>2004/07/22</date>
<body>

<p>
Thanks to Janet Lefkowitz, a librarian at the <a href="http://www.mslib.huji.ac.il/main/siteNew/?langId=1">Hebrew University of Jerusalem</a>, the <a href="http://weblog.infoworld.com/udell/LibraryLookup/">LibraryLookup</a> project has added support for a fifteenth class of OPAC (online public access catalog) system: <a href="http://www.exlibris.co.il/">Ex Libris</a> <a href="http://www.exlibris.co.il/aleph.htm">Aleph</a>. This was an interesting collaboration. I'd looked at a few different Aleph systems, and found that their URLs varied from one implementation to the next in ways that I didn't have time to unravel. But Janet was willing to do this research, and she presented me with a set of Aleph URLs that illustrated the variations. I updated the <a href="http://weblog.infoworld.com/udell/stories/2002/12/11/librarylookupGenerator.html">Build your own Bookmarklet</a> (BYOB) script accordingly. 
</p>
<p>
A technical note: the BYOB script falls back on one of those guilty pleasures of scripting, <tt>eval</tt>. The fifteen URL templates are stored in a JavaScript array, keyed by OPAC name, like so:
<pre class="code javascript">
queries['eosQ'] = '/VAR1/search/AdvancedSearch.asp?selectField1=IS&amp;txtSearch1=\'+isbn,';
</pre>
When an OPAC's bookmarklet needs to be parameterized, the form provides fill-in boxes like so:
<pre>
&lt;input name="eosQ" value="WEBOPAC"/>
</pre>
The script needs to replace VAR1, in the URL template, with the value of the fill-in box. In order to capture that value, it has to choose from a namespace that looks like this:
<pre class="code javascript">
document.forms['byo'].eosQ.value
document.forms['byo'].aleph.value
</pre>
The OPAC name -- eosQ, or aleph, or another of the fifteen choices on the form -- is captured in a variable called <tt>vendor</tt>. Interpolating that name into a JavaScript expression at runtime is a job for <tt>eval</tt>:
<pre class="code javascript">
var1 = eval ( "document.forms['byo']." + vendor + ".value" ); 
</pre>
Is <a href="http://www.google.com/search?q=eval+is+evil">eval evil</a>? Yeah, I guess, but at times like this I drift over to the dark side. If there's a high road I'm not seeing, let me know and I'll report it.
</p>
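<p>
One candidate for that high road, sketched under assumptions: JavaScript's bracket notation accepts a runtime string as a property name, so the lookup should not need <tt>eval</tt> at all. The helper names <tt>fieldValue</tt> and <tt>fillTemplate</tt> are hypothetical, the template is shortened, and the plain object below only stands in for <tt>document.forms['byo']</tt> so that the sketch runs outside a browser.
</p>

```javascript
// Hypothetical helper: read a form field by name without eval.
// Bracket notation takes the property name as an ordinary string.
function fieldValue(form, vendor) {
  var field = form[vendor];
  return field ? field.value : '';
}

// Hypothetical helper: substitute the fill-in value for VAR1 in a template.
function fillTemplate(template, value) {
  return template.replace('VAR1', value);
}

// Stand-in for document.forms['byo'] (values are placeholders);
// in the browser, pass the real form object instead.
var byo = { eosQ: { value: 'WEBOPAC' }, aleph: { value: 'ALEPH' } };

var vendor = 'eosQ';
var var1 = fieldValue(byo, vendor);
var url = fillTemplate('/VAR1/search/AdvancedSearch.asp?selectField1=IS', var1);
```

<p>
In the browser the same idea would presumably read <tt>document.forms['byo'][vendor].value</tt>, which keeps the vendor name as data and never hands it to the interpreter as code.
</p>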
<p>
A historical note: Aleph, the first letter of the alphabet for more than three millennia, is a leading character in a book I just happened to pick up at the library last night: <a href="http://www.amazon.com/exec/obidos/tg/detail/-/0767911725/">Language Visible: Unraveling the Mystery of the Alphabet from A to Z</a>, by David Sacks. The name 'aleph' meant 'ox' to the Phoenicians, and the letterform evolved from a picture of an ox's head. 
</p>
<p>
The book is full of fascinating bits of trivia like that. What really grabbed me, though, was this discussion of the portability of alphabets:
<blockquote class="personQuote DavidSacks">
Even if two languages are totally unlike, letters often can make the transition. Because their core selection of sounds (inherited from the alphabet's earliest stages) is close to being universal, letters usually can be adapted to a different tongue through only a few changes: three or four letters revalued to new sounds, a letter or two invented, unneeded letters discarded.
<br/>...<br/>
The newly independent countries of Azerbaijan, Turkmenistan, and Uzbekistan have not altered their spoken languages, which are Turkish tongues. But the governments have moved to replace Cyrillic street signs, textbooks, tax forms, etc., with new ones printed in a modified, 29-letter Roman alphabet. Elementary schools now teach Roman letters. The massive, disruptive changeover -- inspired by westward trade ambitions and hatred of the Soviet memory -- was declared officially complete in Azerbaijan, at least, in 2001. The new alphabet is modeled on that of modern Turkey, which switched from Arabic to Roman letters in 1928, under the westernizing regime of Kemal Atat&#252;rk. 
<br/><br/>
Prior to 1940, Azerbaijan, Turkmenistan, and Uzbekistan used the Arabic alphabet, until the early Soviets imposed the Roman one in the 1920s. Thus the three regions have seen all three major alphabets in the last 80 years: Arabic, Cyrillic, and (twice) Roman. Although the languages of the three countries are unrelated to Arabic, Russian, or Latin, each alphabet has taken hold in turn.
</blockquote>
Amazing. I knew, of course, that the <a href="http://www.adath-shalom.ca/alphabet.htm">family tree of alphabets</a> is far simpler than the <a href="http://www.armenianhighland.com/images/illustration122.jpg">family tree of languages</a>. But the portability of alphabets, with respect to languages, just never occurred to me. Live and learn.
</p>

</body>
</item>

<item num="a1045">
<title>HailStorm training wheels</title>
<date>2004/07/21</date>
<body>

<p>
<blockquote>
Many folks wouldn't want to be reminded how easy it is to convert sparse input into a detailed profile that includes a phone number, a street address, a satellite photo, and driving directions. Re-entering the basic facts each time perpetuates an illusion of privacy. Yet the reality, for many of us, is that these facts are public.
<br/><br/>
Since I haven't told Google (or any other directories) to delete my records, I've implicitly given permission for Web applications to use that data. Let me now make that permission explicit. I'd be happy if a Web form made intelligent use of public information about me.
<br/><br/>
I'd be even happier if I could control the source of that data. Public information is a poorly defined concept, after all. There are online directories that still remember an address I vacated five years ago. I'd like to maintain the facts about me that I deem public. When applications need those facts, I'd like to refer them to a service that dispenses them. [Full story at <a href="http://www.infoworld.com/article/04/07/16/29OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
</p>
<p>
When I <a href="http://weblog.infoworld.com/udell/2004/07/13.html#a1038">previewed</a> this column last week, it occurred to me that <a href="http://www.foaf-project.org/">FOAF</a> is an example of a mechanism that empowers users to assert facts about themselves. I don't expect earth-shattering results from the publication of my <a href="http://udell.roninhouse.com/foaf.xml">own FOAF file</a>. But if for now it does nothing more than neatly encapsulate certain facts I'm sometimes asked to produce -- my picture, my bio -- that's useful.
</p>
<p>
In theory, it would be straightforward for business homepages to adopt a similar approach. They all do the same stuff: About, Company, Products, News, Contact. There's an obvious XML format for News -- RSS -- but not for these other things. It's easy to imagine a virtuous cycle. Companies publish their facts in a structured form. As a result, more directories list them -- and do so more correctly. As a result, more companies are incented to publish XML facts. And yet in practice, this hasn't happened.
</p>
<p>
Agreeing on a format is, of course, always a huge obstacle. But I suspect the Web design reflexes that we carry forward from the 90s are also getting in the way. It's been a very long time since I visited a company's home page and thought: "Wow, get a load of those DHTML menu effects!" Or: "Nice font!" I'm there for the information, and I'll shred the site trying to find it, grumbling the whole time. I know I'm not the only one who feels this way.
</p>
<p>
Of course I'm not wholly insensitive to aesthetics. In fact, I worship CSS wizards who can dress a skeleton of structured information in beautiful clothing. But people really hate looking at, or thinking about, that skeleton. Steve Jobs' demonstration of Safari RSS at Apple's recent developer conference was a great example. At one point, he flipped back and forth between the skeletal (RSS) and clothed (Web) views of a page. It was the least compelling moment of the keynote. Jobs himself sounded unconvinced, and the audience responded with silence.
</p>
<p>
It'd be great if business websites formed a FOAF-like "web of machine-readable home pages." But I don't expect that'll happen anytime soon. When people look at websites through X-ray glasses, they don't like what they see.
</p>

</body>
</item>


<item num="a1044">
<title>Longhorn follow-up: Quentin Clark interview</title>
<date>2004/07/20</date>
<body>

<p>
The <a href="http://www.infoworld.com/reports/29SRlonghorn.html">Longhorn cover story</a> ran this week. It includes a <a href="http://www.infoworld.com/article/04/07/16/29FElonghorn_1.html">main story</a>, an <a href="http://www.infoworld.com/article/04/07/16/29FElonghornclark_1.html">interview with Quentin Clark</a>, and an <a href="http://www.infoworld.com/article/04/07/16/29FElonghornreich_1.html">interview with Miguel de Icaza and Brendan Eich</a>. Here are some outtakes from my interview with Quentin Clark, director of program management for WinFS.
</p>
<p>
<b>On XML datatypes</b>
</p>
<blockquote class="personQuote QuentinClark">
We see being able to store structured data (like contacts), semi-structured data (XML), and unstructured data. In the case of a Word document, the XML isn't described by the doc type in WinFS, but the WinFS type defines an XML datatype, you can stick in the XML there, and we can reason over that. A JPEG, when you pull off the excess headers, is just a series of 1s and 0s you feed into an algorithm, that will never be structured. But even within a WinFS type, like a doctype, we allow for all three components to be within an instance of that type. A photo is a good example. We have a picture/photo type, things like what camera model, where the picture was taken, plus the unstructured bitstream. With respect to metadata handling -- and property promotion is only part of that -- we talk about picking the author out of a Word doc, putting it into a WinFS property. We also have property demotion, so coming through the APIs you can reprogram the title of some item, and it finds its way back into the filestream. 
<br/><br/>
Just ignore WinFS types for a second, let's say I don't need no stinking types, I'm gonna build my own.  You can make it from scalars, the XML datatype, and a binary field. We've defined the Windows doctype to have the ability to have a filestream as well as an XML datatype. That gives you a lot of power. You can walk up to WinFS, create a scope -- all documents, the whole store -- and then issue XPath queries into items that have XML datatypes, then we can go reason over those things. 
</blockquote>
<p>
<b>On sharing</b>
</p>
<blockquote class="personQuote QuentinClark">
We want to use synch as a way to enable people to share stuff with each other, and to enable offline experiences. It's the Outlook 11 idea that I'm working always locally -- we're bringing that model to all data. You'll have the ability to use any scoping mechanisms -- querying, or explicit wrangling where you drop 16 things into a list -- and say, hey, I want to share this with whatever, another machine or with another person. 
<br/><br/>
Relationships that exist within the scoping are no problem, we know how to rehydrate them on the other end. For relationships outside the boundary, there's a couple of different mechanisms. So if you got the document from me, but I didn't give you contacts, then if you do have that contact, Sarah Wiley, on your end, we'll reconstitute the relationship. If it's not a thing we can positively identify, then it would dangle. But after the PDC we changed the data model to make dangling references not really dangle any more. There was a point where we tried to work through the user experience of finding and showing danglers where we realized hmmm, that sucks, people won't know why are they even doing this.  Can we change how we store and model the data so that's not an issue? So you'll have a document, and an author, and the system knows nothing about that author because you don't.
</blockquote>
<p>
<b>On things being in more than one place</b>
</p>
<blockquote class="personQuote QuentinClark">
Consider three scenarios. First, I want to keep a list of stuff I need to do today -- a piece of mail, the notes preparing me for this call. Let's assume there's no query-based way to do this, it requires explicit wrangling. 
<br/><br/>
Second: in Outlook 11 I have search folders -- stuff from direct staff, stuff where I'm on the To line, the Cc line -- I use these every day as part of my reasoning over my life. A lot of the reasons you want to do things in that query way imply that you don't want to manually intervene. You wouldn't want to inform the system manually about the To line, although you tell the system that my staff is the following people.
<br/><br/>
The third case is about where you want things to live, physically. Where they are contained. In Outlook, I have a PST and the OST, which is the reflection of my online Exchange mailbox.  No big surprise, the 200MB Microsoft gives me is not big enough to contain all my mail, so I have PSTs to keep stuff. That's a containment thing, where do I want it to actually live? I have a removable hard drive at home, at some point I decide this photo will live there. 
<br/><br/>
So those are the three axes. The limitation of Outlook 11 is that it doesn't allow you to put an item in more than one user lassoing. We want to allow multiple lists, or folders, where you can put the same thing in both. We're removing that Outlook limitation. 
<br/><br/>
We encountered significant design challenges around user experience and expectations, and also problems around the DAG (directed acyclic graph). Consider security. I take an item, it lives in a bunch of folders, what is the security on that thing? Folder 2 has it too, then moves to folder 3. All the way back on folder 1, does the owner have any way to know what's happened? Then there's naming. If I have a doc, call it "jon's doc," created in a single folder, then I want to have it appear twice in that same folder, what is it called? If it's in a second folder, and I delete it from folder 1, then at some point I rename folders and put the doc back, calculating namespaces becomes complex. 
</blockquote>
<p>
<b>On the object/relational/XML "trinity"</b>
</p>
<blockquote class="personQuote QuentinClark">
Why do you need all three? I take it that it's obvious why you need objects: you program to them. The CLR has given us some language independence, and we've done a fairly good job building an object universe better than we had before, we're strong believers in that. As for XML, there's no argument there either. The big thing isn't turning out to be industry schemas, but the fact that you have this self-describing thing, this is what I can learn about it, and I can reason on it in a programmatic way by pulling it up into an object. 
<br/><br/>
Then there's relational, this is harder to describe. I will observe that nobody has built an XML store that has the level of scale, performance, or capabilities of today's relational stores. It's just true that the relational model has a set of design characteristics that give it performance characteristics that are just inherent. Doing things in the XML store doesn't give you the same benefit -- and that's not even accounting for the fact that there's so much data in relational stores today. 
<br/><br/>
But we're taking the start of the Yukon XML work, and bringing it into WinFS. Our vision is a marriage of these worlds, this is why they did so much work around SQL/CLR in Yukon. And we've done more since then in the WinFS part of the code. 
<br/><br/>
Yukon doesn't reach the end of the journey, WinFS the client does not, but that's where they're headed. In terms of a data platform, that's what we want. This couples with the discussion of structured, unstructured, and semi-structured data. XPath doesn't make sense with the JPEG bitstream. Having that object/relational/XML trinity over a breadth of datatypes, that's the holy grail, that's completeness. 
</blockquote>
<p>
<b>On WinFS benefits</b>
</p>
<blockquote class="personQuote QuentinClark">
First, having datatypes in Windows, so you can do things like program around a contact. The shared data is a huge benefit, though admittedly it's tricky to get it right. If you are Eudora or Act, how do we make sure you can plug in and own the contacts that are yours, while ensuring that Amazon can still query into a contact and pull out an address? 
<br/><br/>
Second, the Outlook 11 experience of being always local. So an ISV builds a specialized app for architectural firms. Thanks to WinFS synch, the user can go offline and online. Using rules, if meeting notes come in that talk about changes in plans, he gets an action item. 
<br/><br/>
Third, customized experience. If I drill into docs by Jon about WinFS, I can name that query, reuse it, write rules about it.
<br/><br/>
Fourth, finding things. How many times do you get the call: "I created this doc, help me find it." You'll never get that call again. If their field of view is too broad, they can narrow it easily. Fulltext search is there. Life will become a lot easier for end users. We had to turn off indexing in XP by default, it was too slow and chatty. When you're chasing the truth, it's hard to do. Being the truth is easier to do. When people come in and make changes, we know about it, we're built on a database.
</blockquote>
</body>
</item>


<item num="a1043">
<title>Feedster/Bloglines citation bookmarklets</title>
<date>2004/07/17</date>
<body>

<p>
Feedster's Scott Rafer wrote to point out that there is a URL syntax for assembling the conversation around a blog post:
</p>
<pre>
http://www.feedster.com/links.php?url=\
  http%3A%2F%2Fweblog.infoworld.com%2Fudell%2F2004%2F07%2F16.html%23a1041
</pre>
<p>
You need to escape the target URL, which isn't easy to do while copying and pasting it, but is easy for a bookmarklet to do. So, drag this link -- <a href="javascript:void(location='http://www.feedster.com/links.php?url='+escape(location.href));">Feedster Citations</a> -- to your toolbar, and you can have one-click access to the conversations around any blog post you're currently viewing.
</p>
<p>
The Bloglines equivalent, by the way, is:
</p>
<pre>
http://www.bloglines.com/citations?url=\
  http%3A%2F%2Fweblog.infoworld.com%2Fudell%2F2004%2F07%2F16.html%23a1041
</pre>
<p>
and here is the <a href="javascript:void(location='http://www.bloglines.com/citations?url='+escape(location.href));">Bloglines Citations</a> bookmarklet. Again, drag it to your toolbar for one-click access to Bloglines citation lookups.
</p>
<p>
Currently, this pair of queries (<a href="http://www.feedster.com/links.php?url=http%3A//weblog.infoworld.com/udell/2004/07/16.html%23a1041">Feedster</a>, <a href="http://www.bloglines.com/citations?url=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2F2004%2F07%2F16.html%23a1041">Bloglines</a>) yields the same set of items referring to Friday's <a href="http://weblog.infoworld.com/udell/2004/07/16.html#a1041">Feedster reloaded</a> item. This suggests to me that conversation tracking is becoming more deterministic. Excellent!
</p>
<p>
Is it really necessary, by the way, to escape the target URLs? If not, the use of these query mechanisms would be able to spread more virally.
</p>
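<p>
Here is one way to check, as a sketch that uses the standard <tt>URL</tt> class (available in current browsers and in Node.js, not in the browsers of 2004). Left unescaped, the target's <tt>#a1041</tt> fragment is parsed as the fragment of the outer Feedster URL, and fragments are not sent to the server. The variable names are only illustrative.
</p>

```javascript
var target = 'http://weblog.infoworld.com/udell/2004/07/16.html#a1041';
var base = 'http://www.feedster.com/links.php?url=';

// Unescaped: the parser splits the outer URL at '#'.
var raw = new URL(base + target);
var rawParam = raw.searchParams.get('url'); // the target minus its fragment
var rawHash = raw.hash;                     // '#a1041' now belongs to the outer URL

// Escaped: the whole target survives as the value of the url parameter.
var escapedQuery = encodeURIComponent(target);
var escaped = new URL(base + escapedQuery);
var escapedParam = escaped.searchParams.get('url');
```

<p>
So for targets that carry a fragment, or a query string of their own, the answer appears to be yes. Note that <tt>escape()</tt>, as used in the bookmarklets, leaves <tt>/</tt> alone but still encodes <tt>#</tt>, which is the part that matters here.
</p>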

</body>
</item>



<item num="a1042">
<title>Edwin Khodabakchian interview</title>
<date>2004/07/17</date>
<body>

<p>
<blockquote>
BPEL (business process execution language) is the XML-based language of Web services "orchestration" -- that is, a means to connect multiple Web services to create end-to-end business processes. Recently, InfoWorld Test Center Lead Analyst Jon Udell interviewed BPEL expert Edwin Khodabakchian about the future of this language. Khodabakchian is CEO of Collaxa, a pure-play BPM startup whose BPM orchestration product has supported BPEL for more than a year. Collaxa was acquired by Oracle earlier this month, and its BPEL Server product is now marketed as Oracle BPEL Process Manager. [Full story at <a href="http://www.infoworld.com/article/04/07/16/29FEbpmbpel_1.html">InfoWorld.com</a>]
</blockquote>
</p>
<p>
In this <a href="http://weblog.infoworld.com/udell/gems/khodabakchian.mp3">outtake</a> from our interview, Edwin pushes back against the notion that BPEL is overly complex. A lot of the complexity, he argues, has to do with XML Schema, not BPEL itself. He goes on to describe how alternate bindings -- based on <a href="http://ws.apache.org/wsif/">WSIF</a> and <a href="http://www.jcp.org/en/jsr/detail?id=208">JSR 208</a>, as well as <a href="http://msdn.microsoft.com/Longhorn/understanding/pillars/Indigo/default.aspx">Indigo</a> -- will extend BPEL's reach beyond SOAP Web services to the full range of legacy protocols.
</p>


</body>
</item>

<item num="a1041">
<title>Feedster reloaded</title>
<date>2004/07/16</date>
<body>

<p>
Congratulations to Scott Johnson and the rest of the <a href="http://www.feedster.com">Feedster</a> gang for the launch of Feedster version 2. There are lots of new features to digest, but the ones that most interest me are those that enhance cross-blog conversation. At <a href="http://about.feedster.com/?id=39&amp;epoch=1089217495">this URL</a>, for example, I can find a tidy summary of the reaction to <a href="http://weblog.infoworld.com/udell/2004/07/07.html#a1035">this item</a>:
</p>
<table align="center" width="80%">
<tbody><tr><td bgcolor="#eeeeee">
<p align="left">
Tim Bray has <a href="http://www.tbray.org/ongoing/When/200x/2004/07/05/SafariExt">thrown down the warning flag</a> with respect to the Dashboard-related HTML extensions in the next version of Safari. "I'd be really happy if someone explained to me how this is different from what Netscape and Microsoft did to each other so irritatingly back in 1996," he writes. <b>...</b></p><p align="right"><a href="http://weblog.infoworld.com/udell/2004/07/07.html#a1035">1 week, 1 day ago</a></p>
<p align="left">Links to this post include:</p><ul><li>From: Keep an Open Eye - <a href="http://www.theopensourcery.com/wordp1/index.php?p=40">Views on WHATWG, Dashboard</a></li><li>From: Editor's Radio Weblog - <a href="http://radio.weblogs.com/0132182/2004/07/07.html#a132">(No Title)</a></li><li>From: house of warwick - <a href="http://houseofwarwick.com/2004/07/07.html#a835">WHATWG</a></li><li>From: Spontaneously Combusting - <a href="http://dansickles.blogs.com/weblog/2004/07/web_standard_st.html">Web standard stagnation</a></li><li>From: Forwarding Address: OS X - <a href="http://saladwithsteve.com/osx/2004/07/consensus-or-at-least-broadly-shared.html">Consensus, or at least a broadly shared suspicion,...</a></li><li>From: steve News - <a href="http://trioconnect.org/steve/2005/04/19#a192">Recent News from house of warwick</a></li></ul>
</td></tr>
</tbody></table>
<p>
Excellent! Here's a suggestion, for what it's worth. The URL that produces that summary looks like this:
</p>
<pre>
http://about.feedster.com/?id=39&amp;epoch=1089217495
</pre>
<p>
It's opaque with respect to the item that it summarizes. The reason is that Feedster summaries are by day, not by item. This means, among other things, that I can't make a Feedster version of my <a href="http://weblog.infoworld.com/udell/2004/04/13.html">Technorati trackback bookmarklet</a> that I could use to generate this kind of view with a single click, from any blog post I happen to be reading. 
</p>
<p>
An example of that kind of Technorati query looks like this:
</p>
<pre>
http://www.technorati.com/cosmos/search.html?url=\
  http://weblog.infoworld.com/udell/2004/07/07.html#a1035
</pre>
<p>
This seems preferable to me. Notice, though, that the Technorati <a href="http://www.technorati.com/cosmos/search.html?url=http://weblog.infoworld.com/udell/2004/07/07.html#a1035">query</a> yields the dreaded <b>Ouch! No results found</b>. It's <a href="http://www.sifry.com/alerts/">not news</a> that the blog world's exponential growth has been challenging to keep up with. 
</p>
<p>
Of course there's a <a href="http://blog.topix.net/archives/000016.html">supercomputer</a> out there that hasn't yet applied itself to this problem. I'm not the only one who wonders when, and how, it will. 
</p>
<p><b>Update</b>:
http://www.feedster.com/links.php?url=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2F2004%2F07%2F07.html%23a1035
</p>

</body>
</item>



<item num="a1040">
<title>Network access for guests</title>
<date>2004/07/15</date>
<body>

<p>
Here's a scenario that I've come to call "the coffee-shop problem" because it pertains to a local coffee shop, though it also applies to a home office that might receive visitors. You have a single DSL or cable connection. The challenge: offer Wi-Fi to visitors without exposing your connected computer (or LAN).
</p>
<p>
I haven't yet found a low-end (sub-$100) appliance that can do this. If you're willing to spend closer to $1000, the solution I'm testing here in my home office at the moment, <a href="http://www.fortinet.com/products/fortiwifi.html">Fortinet's FortiWiFi-60</a>, solves the problem handily. You can establish firewall, anti-virus, intrusion-detection, content-filtering, and traffic-shaping policies between any pair of its WAN, LAN, DMZ, and 802.11b/802.11g interfaces. That's overkill for the coffee shop scenario, of course. And while it's entertaining for me to fiddle around with the various policies, that's very much an administrative thing, not something a non-technical user would want to do. For the coffee shop and home office scenario, I think there might be a market for a cheap appliance that would isolate a WLAN from its host LAN in a turnkey way.
</p>
<p>
For the enterprise, of course, this is a more complicated problem. In some cases, you want to isolate visitors from the local network. In other cases, you'd like to be able to collaborate with visitors and share intranet resources with them. Solutions to this problem tend to require administrative support. But that often won't correspond to the way we delegate trust in the physical world. For example, on a recent visit to a corporate campus, I signed in at the visitor center. But when the meeting moved to another building, it wasn't my visitor credentials that gave me access to that building. Rather, I piggybacked on the authorization of the employee who unlocked the door with his card. But since there was no analogous way to delegate network access (isolated or not), I spent the day out of contact with the world.
</p>
<p>
At the <a href="http://www.dartmouth.edu/~deploypki/summit04/">PKI Unlocked</a> summit at Dartmouth College, I saw an interesting approach to solving this problem. <a href="http://www.cs.dartmouth.edu/~sws/greenpass/">Greenpass</a>, one of the projects being directed by <a href="http://www.cs.dartmouth.edu/~sws/">Sean Smith</a>, is a prototype system that enables a trusted insider to delegate certificate-based access to a guest. It was set up and running in the seminar room, and it works like this:
<ol>
<li><p>On connecting to the access point, the guest is bounced to a registration page.</p></li>
<li><p>The guest uploads his or her digital certificate to Greenpass.</p></li>
<li><p>Greenpass produces an image based on the guest's public key, and displays it on the guest's laptop.</p></li>
<li><p>The guest shows the image to the delegator.</p></li>
<li><p>The delegator compares it to an image based on the key associated with the access request, and if the images match, accepts the request.</p></li>
<li><p>A <a href="http://world.std.com/~cme/html/spki.html">SPKI</a>/<a href="http://theory.lcs.mit.edu/~cis/sdsi.html">SDSI</a> certificate is issued for the guest. (Pronounced "spooky/sudsy", these technologies support a decentralized, peer-to-peer approach. The design of Groove was influenced by SPKI/SDSI.)</p></li>
<li><p>A modified RADIUS server accepts the SPKI/SDSI certificate.</p></li>
</ol>
</p>
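<p>
To make steps 3 through 5 concrete, here is a minimal sketch of the general idea of a visual key fingerprint. It is not Greenpass's actual algorithm, which I haven't described here: the function name <tt>visualFingerprint</tt>, the choice of SHA-256, the 8x8 character grid, and the placeholder key strings are all assumptions for illustration, and the sketch relies on the Node.js <tt>crypto</tt> module.
</p>

```javascript
var crypto = require('crypto');

// Hypothetical stand-in for the key-to-image step: hash the key material
// and render the first 8 digest bytes as an 8x8 grid, one bit per cell.
function visualFingerprint(publicKey) {
  var digest = crypto.createHash('sha256').update(publicKey).digest();
  return Array.from(digest.subarray(0, 8)).map(function (byte) {
    return byte.toString(2).padStart(8, '0')
      .replace(/0/g, '.')
      .replace(/1/g, '#');
  }).join('\n');
}

// Placeholder key material, not real certificates.
var guestKey = 'guest-public-key-bytes';
var requestKey = 'guest-public-key-bytes';

// The guest's laptop shows one image; the delegator sees the image derived
// from the key attached to the access request, and compares the two.
var shownByGuest = visualFingerprint(guestKey);
var seenByDelegator = visualFingerprint(requestKey);
var imagesMatch = shownByGuest === seenByDelegator;
```

<p>
The point of the design, as I understand it, is that the human step is a simple comparison of two pictures, while the binding between picture and key is left to the hash.
</p>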
<p>
There's plenty of rocket science under the covers, but the parts that people do -- compare images, vouch for guests -- are easy and natural. Nice!
</p>
<p>
<b>Update</b>: Here are some suggested solutions to the coffee-shop problem:
</p>
<p>
From Seairth Jacobs: <a href="http://www.dlink.com/products/?pid=173">D-Link's DSA-3100 Public/Private Hot Spot Gateway</a>. Seairth writes: "I admit it's not sub-$100 and it doesn't provide some of the features as the Fortinet product, but it may be a good compromise.  Also, because it doesn't have the wireless built in, it is possible to keep up with the latest and greatest (wireless features) without having to replace the entire device."
</p>
<p>
From Dave Megginson, Will Glass-Husain, and Eddy Carroll: A 3-box solution, two Linksys (or equivalent) routers connected in a Y configuration with a third that talks to the cable/DSL box. "I have a vague feeling there might be gremlins in this double network address translation," writes Will, "but can't think of a concrete reason it wouldn't work." (<a href="http://www.jepstone.net/index.cgi">Brian Jepson</a> also suggested this to me, a while ago.)
</p>




</body>
</item>

<item num="a1039">
<title>Upcoming events: July 2004</title>
<date>2004/07/14</date>
<body>

<p>
Today (July 14) I'll be attending <a href="http://www.dartmouth.edu/~deploypki/summit04/">PKI Unlocked</a>, a seminar on PKI deployment at Dartmouth College.
</p>
<p>
From July 28 - 30 I'll be at <a href="http://conferences.oreillynet.com/os2004/">OSCON 2004</a>. And from July 31 - August 1 I'll be at the <a href="http://www.vanpyz.org/conference">VanPy (Vancouver Python) Workshop</a>, where I'm filling in as keynoter for <a href="http://www.europython.org/interviews/paul_everitt_2003/view">a guy</a> who knows a lot more about Python than me (though he pretends otherwise), and sharing the stage with <a href="http://www.python.org/~guido/">the guy</a> who invented the language. Gulp.
</p>

</body>
</item>


<item num="a1038">
<title>HailStorm [CQ]</title>
<date>2004/07/13</date>
<body>

<p>
My <a href="http://weblog.infoworld.com/udell/2004/07/03.html#a1033">recent mangling</a> of Diego Doval's name in a print column was a harsh reminder that I neglect one tradition of print journalism at my peril. That tradition is a fact-checking mechanism called CQ. The idea is that an author, when writing the name of a person, company, or product, should CQ it to indicate that the spelling has been double-checked. (The acronym "CQ" is itself unCQ-able, since nobody owns the term or seems to know what it stands for.) Of course a copy editor shouldn't automatically trust an author's CQ. But it's one layer of a defense-in-depth strategy.
</p>
<p>
Here's a real-life example. In next week's column I mention a certain Microsoft initiative, now mothballed. I wasn't sure about "Hailstorm" versus "HailStorm", but found some examples (via Google) that convinced me to go with the former. Having double-checked in this way, I should have written "Hailstorm [CQ]," but didn't. My editor, who had a clear memory of the Hailstorm spelling, did CQ it that way. But in fact, the correct spelling -- thankfully caught at the last minute by an eagle-eyed copy editor -- appears to be "HailStorm."
</p>
<p>
HailStorm was originally described in a Microsoft <a href="http://www.microsoft.com/net/hailstorm.asp">whitepaper</a>, now 404. The <a href="http://www.microsoft.com/presspass/features/2001/mar01/03-19hailstorm.asp">original press release</a>, still online, uses both spellings. If you search <a href="http://www.google.com/search?q=microsoft+hailstorm">Google</a> or even <a href="http://search.microsoft.com/search/results.aspx?qu=hailstorm">Microsoft.com</a>, you'll also find examples of both spellings. About the best that can be said, as my editor pointed out, is that spellings with the cap S are more frequent.
</p>
<p>
I've long been fascinated with the way in which Google can perpetuate misspellings. Compare, for example, the count of results for <a href="http://www.google.com/search?q=embarrass">embarrass</a> (count: 401,000) and <a href="http://www.google.com/search?q=embarass">embarass</a> (count: 41,400). Obviously you shouldn't use Google as a dictionary; you should instead go <a href="http://www.m-w.com/cgi-bin/dictionary?va=embarass">here</a> or <a href="http://dictionary.reference.com/search?q=embarass">here</a>. But I'll bet a lot of people do look up "embarass" on Google, find evidence to support their misspellings, and thus perpetuate them. I've even wondered if there's a feedback loop here that will increase the ratio of incorrect to correct spellings over time.
</p>
<p>
Although you shouldn't use Google as a dictionary, note the difference between looking up the wrong and right spellings there:
</p>
<table border="1" cellspacing="0" cellpadding="6">
<tr>
<td><a href="http://www.google.com/search?q=embarass">embarass</a></td>
<td>Results <b>1</b> - <b>100</b> of about <b>41,400</b> for <b><b>embarass</b></b>.</td>
</tr>
<tr>
<td><a href="http://www.google.com/search?q=embarrass">embarrass</a></td>
<td>Results <b>1</b> - <b>100</b> of about <b>401,000</b> for <b>embarrass</b> [<a href="http://dictionary.reference.com/search?q=embarrass" title="Look up embarrass on dictionary.com">definition</a>]</td>
</tr>
</table>
<p>
In the latter case, Google refers you to an authoritative source -- in this case, dictionary.com. Of course, CQ-able facts usually can't be found in a dictionary. The authority that governs them is the person who owns the name in question, or the company that owns the name or product. At least, that's how it ought to be. But look at what really happens:
</p>
<table border="1" cellspacing="0" cellpadding="6">
<tr>
<td><a href="http://www.google.com/search?q=infoworld+%22john+udell%22">infoworld "john udell"</a></td>
<td>Results <b>1</b> - <b>100</b> of about <b>7,740</b></td>
</tr>
<tr>
<td><a href="http://www.google.com/search?q=infoworld+%22jon+udell%22">infoworld "jon udell"</a></td>
<td>Results <b>1</b> - <b>100</b> of about <b>17,900</b></td>
</tr>
</table>
<p>
I own the spelling of my name. InfoWorld, as my employer, has some ownership interest in that fact too. Microsoft, even though it has 404'd the HailStorm whitepaper, still owns that piece of its institutional history. Shouldn't these responsible parties control such facts about themselves?
</p>
<p>
HailStorm, of course, was based on a mechanism for publishing machine-readable facts. There are other ways to skin the cat. <a href="http://www.foaf-project.org/">FOAF</a>, for example, is a way for individuals to assert facts about themselves. Currently Google sees <a href="http://www.google.com/search?q=foaf+filetype%3Ardf">14,700 foaf.rdf files</a> and <a href="http://www.google.com/search?q=foaf+filetype%3Axml">416 foaf.xml files</a> -- not including <a href="http://udell.roninhouse.com/foaf.xml">mine</a>, which I just added today. I <a href="http://weblog.infoworld.com/udell/2004/01/04.html#a878">resisted FOAF</a> until now because I've worried about <a href="http://weblog.infoworld.com/udell/2004/01/06.html">asserting things which can't be asserted</a>, such as relationships. But the core concept of FOAF, as captured in the tagline "a Web of machine-readable homepages," is indisputably valid.
</p>
<p>
If you removed FOAF's "friend-of-a-friend" branding, the concept might make more sense to organizations. For example, the homepage of infoworld.com or microsoft.com might contain:
<pre>
&lt;link rel="dictionary" type="tbd" href="dictionary.xml">
</pre>
</p>
<p>
The dictionary.xml file would assert public facts: names of employees, organizational units, products. These would reflect internal records. How would an organization mark facts in its internal databases as being both correct and releasable? In my mind's eye, I see a Web form. On the form there is a button. And the button says: <input type="submit" onclick="alert('CQ!')" value="CQ"/>
</p>
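<p>
Here is a minimal sketch of how a writer's tool might consult such a dictionary. The format of dictionary.xml is still "tbd", so the example assumes its names have already been parsed into an array. The array contents and the function names <tt>cq</tt> and <tt>isCq</tt> are hypothetical.
</p>

```javascript
// Hypothetical: the canonical names an organization might publish in
// dictionary.xml, assumed here to be already parsed into an array.
const publishedNames = ['HailStorm', 'InfoWorld', 'Jon Udell'];

// Return the owner's canonical spelling for a candidate name, matching
// case-insensitively, or null when the owner has published no such name.
function cq(candidate, canonicalNames) {
  const wanted = candidate.trim().toLowerCase();
  const match = canonicalNames.find((name) => name.toLowerCase() === wanted);
  return match === undefined ? null : match;
}

// A candidate is CQ-clean only if it already equals the canonical spelling.
function isCq(candidate, canonicalNames) {
  return cq(candidate, canonicalNames) === candidate.trim();
}

console.log(cq('Hailstorm', publishedNames));   // HailStorm
console.log(isCq('Hailstorm', publishedNames)); // false
```

<p>
A case-insensitive match would catch the Hailstorm/HailStorm class of error. It would not catch "John Udell", which is a different string altogether; for that you'd need fuzzier matching, and a human to confirm the result.
</p>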

</body>
</item>
	
<item num="a1037">
<title>Web standards on the move</title>
<date>2004/07/12</date>
<body>

<p>
<blockquote>
WHATWG's home page asks rhetorically: "Shouldn't this work be done at the W3C or the IETF?" And it answers: "Many of the members of this working group are active supporters and members of the W3C and other standardization bodies. We plan to submit our work for standardization to a standards body when it has reached an appropriate level of maturity." Bingo. That's how things used to work a decade ago when Web standards, and the applications built on them, formed a virtuous cycle of co-evolution.
<br/><br/>
Another sign of forward motion came from the Mozilla Foundation, which announced last week that it will modernize the long-stagnant Netscape plug-in API in collaboration with Adobe, Apple, Macromedia, Opera, and Sun Microsystems. In other words, everyone but Microsoft. While Internet Explorer sits on the sidelines, benched by Avalon, the rest of the players are creating some excitement on the field. Go, team! [Full story at <a href="http://www.infoworld.com/article/04/07/09/28OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
Brendan Eich amplified the themes of this column when he appeared last week on the <a href="http://www.itconversations.com/shows/detail156.html">Gillmor Gang</a>. In <a target="audio" href="http://udell.infoworld.com:8002/?site=rdscon.vo.llnwd.net&amp;url=/o1/_downloads/itc/mp3/2004/The%20Gillmor%20Gang%20-%20July%209,%202004.mp3&amp;dur=01:07:52&amp;beg=00:21:37&amp;end=00:25:28">this clip</a> (21:37-25:28), Brendan talks about the tug of war between formal standards and real-world standards.
</p>
<p>
There's more history and passion wrapped up in all this than I can begin to understand, though Tim Bray's <a href="http://www.tbray.org/ongoing/When/200x/2004/07/08/SafariHTML">comments</a> at the end of last week offer some glimpses. Tim was, however, relieved to see that Safari may encapsulate its Dashboard-related extensions in a "pseudo"-namespace -- which seems entirely reasonable to me as well. And despite misgivings, he was listening to Brendan and "not hearing much to disagree with." 
</p>
<p>
The bottom line, for me, is that the browser is the most powerful engine for creating and distributing software that the world has ever seen. Its birth was a messy affair, and its adolescent growth spurt -- if that's what this is -- might not be pretty either. But I'd really like to see it reach for its full potential.
</p>


</body>
</item>

<item num="a1036">
<title>Topic: identity. Author: anonymous.</title>
<date>2004/07/08</date>
<body>

<p>
<img border="1" align="right" vspace="6" src="http://weblog.infoworld.com/udell/gems/didw.jpg"/>
The <a href="http://magazine.digitalidworld.com/Jun04/index.htm">current issue</a> of <a href="http://magazine.digitalidworld.com">Digital ID World</a> just arrived. While reading an article about <a href="http://www.corestreet.com/">CoreStreet</a>, a company whose identity technologies have <a href="http://www.infoworld.com/article/03/09/26/38OPstrategic_1.html">intrigued</a> <sup>1</sup> <a href="http://www.infoworld.com/article/04/05/21/21FEinnov8_1.html">me</a> for a while, I noticed something strangely missing from the article: a byline. 
</p>
<p>
Flipping through the magazine, I found several bylined columns and one bylined feature, but most of the features -- hefty four-to-six-page articles on a range of identity-related topics -- are anonymous.
</p>
<p>
This would be odd in any case, but for a magazine with the tagline <b>Identity is Center</b> it seems downright surreal. What's up with that?
</p>
<p>
<b>Update</b>: Here is the explanation: anything not bylined is written by Phil Becker, the magazine's founder and editor-in-chief, who is also its most prolific author. Phil worried that it would seem egomaniacal to print his name so many times. For what it's worth, I think it's pretty cool that the chief cook can also wash all those bottles.
</p>
<hr align="left" width="25%"/>
<p>
<sup>1</sup> Another identity-related bit of news, as you'll discover if you follow that link, is that older InfoWorld content (pre-2004, I believe) now requires (free) registration. 
</p>

</body>
</item>

<item num="a1035">
<title>WHATWG</title>
<date>2004/07/07</date>
<body>

<p>
Tim Bray has <a href="http://www.tbray.org/ongoing/When/200x/2004/07/05/SafariExt">thrown down the warning flag</a> with respect to the Dashboard-related HTML extensions in the next version of Safari. "I'd be really happy if someone explained to me how this is different from what Netscape and Microsoft did to each other so irritatingly back in 1996," he writes.
</p>
<p>
Well, here's how it looks to me. In <a href="http://weblogs.mozillazine.org/hyatt/archives/2004_07.html#005896">this post</a> about Dashboard, Dave Hyatt mentions that extensions are being done "in a way that is designed to be compatible with <a href="http://www.whatwg.org/specs/web-forms/2004-06-27-call-for-comments/">other browsers</a>." The linked site belongs to the <a href="http://www.whatwg.org/">Web Hypertext Application Technology Working Group</a>, just formed last month. From the WHATWG's home page:
<blockquote>
<b>Shouldn't this work be done at the W3C or IETF?</b>
<p>Many of the members of this working group are active supporters
  and members of the W3C and other standardization bodies. We plan to
  submit our work for standardization to a standards body when it has
  reached an appropriate level of maturity. The current focus is
  rapid, open development and iteration to reach that level.</p>
  <p>Several members of this working group attended <a href="http://www.w3.org/2004/04/webapps-cdf-ws/">The W3C Workshop on Web Applications and Compound Documents</a>. The <a href="http://www.w3.org/2004/04/webapps-cdf-ws/papers/opera.html">position paper submitted by Opera and Mozilla</a> represents the fundamental principles upon which the WHAT working group intends to operate. [<a href="http://www.whatwg.org/">WHATWG</a>]
</p>
</blockquote>
</p>
<p>
That document, which enumerates a whole bunch of practical ways in which browsers could support better Web applications, resonates powerfully for me. Unlike in 1996, Microsoft today sees Web applications as a dead end; Internet Explorer is frozen; the wholly proprietary Avalon is their future. Meanwhile Mozilla, Safari, and Opera think they can create forward motion on Web apps, within a cooperative framework. My $0.02: go for it.
</p>

</body>
</item>


<item num="a1034">
<title>Java and Sun's operating systems: better together?</title>
<date>2004/07/06</date>
<body>

<p>
Every now and then I find myself playing the Howard Beale (Peter Finch) role in <a href="http://us.imdb.com/title/tt0074958/">Network</a>, across from Arthur Jensen (Ned Beatty), the capitalist visionary who explains how things really work. At Digital ID World in 2002, it was Phil Becker who <a href="http://weblog.infoworld.com/udell/2002/10/14.html">channeled Arthur Jensen</a>. During last week's <a href="http://www.itconversations.com/shows/detail152.html">Gillmor Gang</a> show, it was Sun's Jonathan Schwartz. In the wake of JavaOne I'd been thinking tactically about technical aspects of Java. For Schwartz, though, it's all strategic and economic -- as in this sermonette on <a target="audio" href="http://udell.infoworld.com:8002/?site=rdscon.vo.llnwd.net&amp;url=/o1/_downloads/itc/mp3/2004/The%20Gillmor%20Gang%20-%20July%201,%202004.mp3&amp;dur=01:05:09&amp;beg=00:47:22&amp;end=00:49:10">leasing and net present value</a>. I've thought a lot about subscription businesses, and I've even <a href="http://www.oreilly.com/news/udell_0301.html">helped create one</a>, so I have a basic appreciation for the model. But I'm not qualified to evaluate Schwartz's plans for turning Sun's various assets into recurring revenue -- at least, not on financial terms. 
</p>
<p>
I can, however, make some observations about the fitness of those assets for the stated purpose. Java, in particular, has always delivered the right mix of ingredients -- at least in theory.  It's portable. It scales up to the cloud and down to the handset. It's a robust substrate for network services. And it can project a rich user interface onto any device. In practice, though, it's been a challenge to exploit all this goodness with the Java layer decoupled from its OS substrates. I can think of a couple of ways in which tighter integration could be useful:
</p>
<ul>
<li>
<p>
<b>A better <a href="http://java.sun.com/products/javawebstart/">Java Web Start</a>.</b> 
"It's a great idea," Pito Salas <a href="http://www.salas.com/weblogs/archives/000450.html">blogged</a> last week, "but disappointingly implemented." With Java applets, Sun had the first mover advantage in deploying code on demand. Lately Microsoft has been iterating toward a viable .NET solution. It's true that the <a href="http://www.google.com/microsoft?q=clickonce">ClickOnce</a> technology in the 2.0 ("Whidbey") version of the .NET Framework won't play on as many devices as Java can. But ClickOnce will reach a lot of desktops. Java Web Start needs to do better there, and sooner rather than later. 
</p></li>
<li><p>
<b>Stronger Solaris/Java synergy.</b> In Solaris 10, as Sun has been pointing out recently, a single instance of the OS will be able to be virtualized into many isolated partitions. I haven't yet seen an explanation of how Java-based workloads map to those partitions, but I presume the model will be one or more JVMs per partition. Will superior intra-JVM communication be a Solaris 10 differentiator? Will a more granular mapping of applications (rather than JVMs) to partitions be possible? In the latter case, you'd need a process-like abstraction in Java -- and in fact, one is <a href="http://www.dehora.net/journal/2004/07/the_problem_that_java_isolates_solve.html">forthcoming</a>.
</p></li>
</ul>
<p>
As Schwartz notes, Sun's plans for the Windows desktop ran afoul of tactics for which Microsoft wound up making a "two billion dollar apology." OK, but what makes Java special on Solaris servers, or on Sun's Linux desktops for that matter? Sun <a href="http://www.sun.com/smi/Press/sunflash/2002-02/sunflash.20020208.1.html">asserts</a> that Solaris is the best Java substrate, but doesn't marshal a lot of evidence. I haven't seen the same kind of argument made for Sun's version of Linux, in comparison to other Linuxes, but since the Java Desktop System doesn't run much in the way of Java software, the point's kind of moot. 
</p>
<p>
It was inevitable that Java would grow more operating-system-like over time. One example is application isolation, specified in <a href="http://jcp.org/en/jsr/detail?id=121">JSR 121</a>. Another is dynamic management of the Java stack on J2ME devices, described in <a href="http://jcp.org/en/jsr/detail?id=232">JSR 232</a> and <a href="http://sun.feedroom.com/index.jsp?fr_story=FEEDROOM75919">demonstrated by Nokia at JavaOne</a>. In <a href="http://weblog.infoworld.com/udell/gems/nokiaJavaOne2004.ram">this Real clip</a> <sup>1</sup> from the twenty-minute concept video, we see Java components deployed to a network of Nokia Communicators, and then remotely managed. This is no doubt a great thing for the world of handsets. But desktop and server operating systems have their own highly-evolved management methods, to which Java is somewhat orthogonal. If Sun's own operating systems are going to help create the new economic world order that Schwartz envisions, maybe they and Java should find ways to work more closely together.
</p>
<hr/>
<p>
<sup>1</sup> Accessing this clip, by the way, was no mean feat. Sun's video URLs are even more elusive than <a href="http://weblog.infoworld.com/udell/2004/07/02.html#a1032">Microsoft's</a>.
</p>

</body>
</item>



<item num="a1033">
<title>Diego Doval</title>
<date>2004/07/03</date>
<body>

<p>
In next week's InfoWorld column, I quote <a href="http://www.dynamicobjects.com/aboutme.html">Diego Doval</a>, CTO of <a href="http://www.clevercactus.com/">clevercactus</a>. Or rather, I meant to quote him. For reasons that escape me, I attributed Diego's remarks to <a href="http://www.diegorivera.com/">Diego Rivera</a>, the famous Mexican muralist. I have no earthly idea how I managed to transpose the living computer scientist and the dead artist. Since Diego Rivera won't be reading this, I'll direct my apology to Diego Doval. For the record:
</p>
<p>
<table border="1" cellpadding="6" cellspacing="0">
<tr><td>
<table align="left"><tr><td>
<img border="1" src="http://weblog.infoworld.com/udell/gems/diegoDoval.jpg"/>
<br/><div align="center"><a href="http://www.dynamicobjects.com/aboutme.html">Diego Doval</a></div>
</td></tr></table>
<p>I am co-founder and CTO of <a href="http://www.clevercactus.com/">clever<b>cactus</b> ltd.</a> I submitted my PhD thesis (in the area of self-organizing networks) last year to <a href="http://www.tcd.ie/">Trinity College Dublin</a>, 
Ireland. I was previously a teaching assistant at TCD's Computer Science department. 
I graduated from <a href="http://www.drexel.edu/">Drexel University</a> in Philadelphia, PA where I did research in the <a href="http://serg.mcs.drexel.edu/">Software Engineering Research Group</a>. I worked at <a href="http://www.fuego.com/">Fuego Corp.</a> before going to the US, then after graduation I was a Research Associate in the <a href="http://www.research.ibm.com/cross_disciplines/p_systems.shtml">Personal Systems Group</a> at IBM's <a href="http://www.watson.ibm.com/">TJ Watson Research Center</a> in Yorktown Heights, New York, and later at <a href="http://www.mindstech.com/">Mindstech International</a>
in Silicon Valley.</p>
</td></tr>
<tr><td>
<table align="right"><tr><td>
<img border="1" src="http://weblog.infoworld.com/udell/gems/diegoRivera.jpg"/>
<br/><div align="center"><a href="http://www.diegorivera.com/">Diego Rivera</a></div>
</td></tr></table>
DIEGO RIVERA (1886-1957), muralist painter, was one of the greatest artists in the XXth century. Born in Guanajuato Mexico, in 1892 he moved to Mexico City with his family. He studied in the San Carlos Academy and in the carving workshop of artist Jos&#233; Guadalupe Posada, whose influence was decisive.
</td></tr>
</table>
</p>

</body>
</item>


<item num="a1032">
<title>Note to MSDN: Make friends with the Lazy Web</title>
<date>2004/07/02</date>
<body>

<p>
A couple of months ago I spoke with Jeffrey Snover, who is the architect of MSH (aka Monad), Microsoft's new object-oriented command shell. At the time, I didn't get to see a demo. Yesterday, Chris Sells <a href="http://www.sellsbrothers.com/news/showTopic.aspx?ixTopic=1431">pointed</a> to the <a href="http://msdn.microsoft.com/theshow/episode043/default.asp">episode of the .NET show</a> that includes a Monad demo by Snover and Jim Truher. Sells also notes that the beta of Monad is available for XP and Server 2003, so I've registered for the download. The concept is wonderful: a Unix-like shell where the stuff that gets piped around is self-describing, either in the form of .NET objects or their XML serializations. Although it targets "the Longhorn wave," I'll be curious to see what Monad can do on current Windows OSs.
</p>
<p>
I was hoping to use Rich Persaud's <a href="http://autometa.com/rpxp/web/">AV clipping service</a> to point to some interesting parts of that Monad demo. That service, which inspired the experimental MP3 clipping service I tried <a href="http://weblog.infoworld.com/udell/2004/06/29.html#a1030">on Tuesday</a>, also enabled me to quote from the Apple WWDC keynote <a href="http://weblog.infoworld.com/udell/2004/06/30.html#a1031">on Wednesday</a>. To form URLs that quote from Real, QuickTime, and Windows Media streams, you just need the URL of the stream. Which, in the case of MSDN broadcasts, is either hard or impossible to find.
</p>
<p>
Here's the <a href="http://msdn.microsoft.com/theshow/episode043/default.asp">home page</a> for the episode of the .NET show that includes the Monad demo. Here's the <a href="http://msdn.microsoft.com/seminar/shared/asp/view.asp?url=/theshow/en/episode043/manifest.xml">URL</a> behind the "Watch it now!" button. Here's the <a href="http://msdn.microsoft.com/theshow/en/episode043/manifest.xml">XML manifest</a> embedded in that URL. And here, from deep inside that file, is a reference to the actual .WMV file:
<pre>
&lt;mediaVideo identifier="060EDE76_49D9_423B_8DA3_D6DB5039745E" 
  xlinkHref="netsow43_mbr.wmv" xlinkActuate="onLoad" xlinkRole="ecrs">
</pre>
So the URL for the movie must be <a href="http://msdn.microsoft.com/theshow/episode043/netsow43_mbr.wmv">http://msdn.microsoft.com/theshow/episode043/netsow43_mbr.wmv</a>, right? Nope. How about <a href="http://msdn.microsoft.com/theshow/netsow43_mbr.wmv">http://msdn.microsoft.com/theshow/netsow43_mbr.wmv</a>? No joy. I did a bit of spelunking in the layers of IE-and-Windows-Media-player-specific JavaScript wrapped around that filename, but came up empty-handed. My guess is that the pathname is buried in some piece of server-side code.
</p>
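<p>
Neither of those guesses is what ordinary relative-URL resolution would produce. As a sketch, here is the candidate that a generic client would derive by resolving the manifest's relative filename against the manifest's own URL. The function name <tt>resolveAgainstManifest</tt> is made up for the example, and nothing here establishes that the resulting URL ever served the video.
</p>

```javascript
// Resolve the manifest's relative xlinkHref against the manifest's own URL,
// using the standard WHATWG URL API (a global in Node.js and in browsers).
function resolveAgainstManifest(manifestUrl, relativeHref) {
  return new URL(relativeHref, manifestUrl).href;
}

// Both strings are taken from the manifest link and the mediaVideo element.
const candidate = resolveAgainstManifest(
  'http://msdn.microsoft.com/theshow/en/episode043/manifest.xml',
  'netsow43_mbr.wmv'
);
console.log(candidate);
```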
<p>
Now, MSDN does an awesome job with its webcasts. If you access them from IE -- which, unfortunately, is the only way you can access them -- you'll find that the transcripts are linked to the video with exquisite care, like so:
<pre class="code" lang="xhtml">
&lt;div id="p01:12:23">&lt;img src="/seminar/shared/images/playsync_stat.gif" 
 alt="Jump to #85" hspace="2" border="0" align="absmiddle" class="PrintNever"
 onclick="callSeek('01:12:23');" onmousedown="downPlaySync(this)" 
 onmouseout="resetPlaySync(this)" onmouseover="this.style.cursor='hand'; 
 togglePlaySync(this);" />
&lt;b>JEFFREY SNOVER:&lt;/b>  Yeah.  So now let's focus in on even a more 
sophisticated example where again you have the MSH do more work for you.  
This is the code to stop a process.  Again it's a class, again you put 
the commandlet attribute on top of it and here we have a public int, 
array of integers, called ID.  So we're going to kill processes by 
their process ID and we've got attributes on top of it.  
&lt;div class="code">&lt;pre>
[Cmdlet("stop", "ps1")]
public class StopPs1: Cmdlet
{
	[Parameter( 
		Mandatory = true, 
		PipelineInput = PipelineInput.ByMatchingProperty, 
		Position = 0)]
	[Prompt("Where's the ID dude?")]
	public int [] Id;
	public override void ProcessRecord()
	{
...
&lt;/pre>&lt;/div>
</pre>
</p>
<p>
This is incredibly well done. And yet, the entire presentation is hermetically sealed. From the outside, there's only a single IE-accessible entry point. Conspiracy theorists will doubtless find evil here. I don't. If MSDN wanted to assert total control over this content, it wouldn't offer downloads:
<blockquote>
<b>Offline Viewing Download</b>:
For those of you who want to download a copy of this episode
to your local hard drive for off-line viewing,
we provide this as a separate file (self-extracting .exe) that you can
download. We now offer two file size choices, depending on the
bandwidth of your Internet connection and a third one especially for
mobile devices.<br/><br/>
<a href="http://www.microsoft.com/downloads/details.aspx?FamilyId=0417AA7E-0F20-41E1-A0FE-9AE4CD043E0C&amp;displaylang=en">300 KB version</a> (<b>246 MB file</b>)<br/>
<a href="http://www.microsoft.com/downloads/details.aspx?FamilyId=45CFDC81-2EFA-4358-86CD-E961A7E7AED2&amp;displaylang=en">100 KB version</a> (<b>87 MB file</b>)<br/>
<a href="http://www.microsoft.com/downloads/details.aspx?FamilyId=7F5B93B1-5D47-42EE-B77D-83D94FF52030&amp;displaylang=en">Mobile devices version</a> (<b>78 MB file</b>)<br/>
</blockquote>
</p>
<p>
What is evident, though, is a cultural reluctance to work with the Web on its own terms. MSDN's predilection for publishing URLs that point to self-extracting .EXEs, rather than (in this case) to .WMV files, is really quite odd. My advice: point to the .WMVs too. You've already invested a huge amount of effort in this stuff. The content is intended to be public, and its purpose is to evangelize. So, why not trust the Web and let it help you do that? If you make the URLs directly available, here are some of the positive effects that can ensue:
<ul>
<li><p>
A blogger could point directly to one of your timecoded fragments, or use an AV clipping service to point to a newly-constructed fragment.
</p></li>
<li><p>
A Firefox user on Mac OS X could access the content. You don't want to just preach to the converted, do you?
</p></li>
<li><p>
A transcoding service could (in theory) make the video accessible in non-Windows-Media formats.
</p></li>
</ul>
If you let it, the <a href="http://www.lazyweb.org/">LazyWeb</a> will be your friend. You needn't implement any of these ideas; you just need to publish the URLs that enable others to do so.
</p>
<p>
<b>Update</b>: Ace detective <a href="http://hublog.hubmed.org/">Alf Eaton</a> took up the challenge, and has extracted the URL I was looking for. The trick is to hit the "Watch it now!" URL with a user agent that pretends to be IE but isn't. Alf did that using the debug feature of Safari. Armed with this insight, I was able to do the same thing with the command-line tool curl:
</p>
<pre>
curl -A "Mozilla/4.0 (compatible; MSIE 5.5)" \
  "http://msdn.microsoft.com/seminar/shared/asp/view.asp?url=/theshow/en/episode043/manifest.xml"
</pre>
<p>
which, as Alf discovered, yields:
</p>
<pre>
/theshow/en/episode043/netsow43_mbr.asx
</pre>
<p>
which in turn yields:
</p>
<pre>
mms://wm.microsoft.com/ms/seminar/en/Episode043/netsow43_mbr.wmv
</pre>
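<p>
The last hop, from the .asx file to the stream, is easy to automate. Here is a rough sketch that assumes the .asx is a standard Windows Media metafile whose REF elements carry the location in an HREF attribute. The file's actual contents aren't reproduced here, and the function names <tt>extractAsxRefs</tt> and <tt>mmsStreams</tt> are made up for the example.
</p>

```javascript
// Assumption: the .asx file is a Windows Media metafile whose REF elements
// carry the stream location in an HREF attribute. The file's real contents
// are not shown in the post.
function extractAsxRefs(asxText) {
  const refs = [];
  // Match href="..." or HREF='...' regardless of case.
  const pattern = /\bhref\s*=\s*(["'])(.*?)\1/gi;
  for (const match of asxText.matchAll(pattern)) {
    refs.push(match[2]);
  }
  return refs;
}

// Keep only the streaming (mms://) locations.
function mmsStreams(asxText) {
  return extractAsxRefs(asxText).filter((href) => href.toLowerCase().startsWith('mms://'));
}
```

<p>
A regular expression is a shortcut. ASX files aren't always well-formed XML, so a tolerant pattern match seemed a safer bet for a sketch than a strict parser.
</p>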
<p>
Thanks Alf! Now, where was I? Oh, yeah, I wanted to highlight a couple of things:
<ul>
<li><p>
<a href="http://autometa.com/rpxp/?winmedia/clip/video/start/1:02:36/stop/1:03:21/stream/mms://wm.microsoft.com/ms/seminar/en/Episode043/netsow43_mbr.wmv">navigating from objects to subobjects</a>
</p></li>
<li><p>
<a href="http://autometa.com/rpxp/?winmedia/clip/video/start/1:03:57/stop/1:05:17/stream/mms://wm.microsoft.com/ms/seminar/en/Episode043/netsow43_mbr.wmv">piping output to Excel</a>
</p></li>
<li><p>
<a href="http://autometa.com/rpxp/?winmedia/clip/video/start/1:15:20/stop/1:16:13/stream/mms://wm.microsoft.com/ms/seminar/en/Episode043/netsow43_mbr.wmv">data coercion in the pipeline</a>
</p></li>
<li><p>
<a href="http://autometa.com/rpxp/?winmedia/clip/video/start/1:23:24/stop/1:23:45/stream/mms://wm.microsoft.com/ms/seminar/en/Episode043/netsow43_mbr.wmv">errors as collections of first-class objects</a>
</p></li>
<li><p>
<a href="http://autometa.com/rpxp/?winmedia/clip/video/start/1:27:20/stop/1:28:48/stream/mms://wm.microsoft.com/ms/seminar/en/Episode043/netsow43_mbr.wmv">globbing and wildcarding alternate namespaces, with tab completion</a>
</p></li>
</ul>
Cool stuff.
</p>
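<p>
The clip links above all follow one pattern, so they're easy to generate. This sketch infers the pattern from those links rather than from any documented API of the clipping service. The helper names <tt>clipUrl</tt> and <tt>toSeconds</tt> are made up for the example.
</p>

```javascript
// Build a clipping-service URL in the pattern used by the links above:
//   http://autometa.com/rpxp/?PLAYER/clip/video/start/START/stop/STOP/stream/STREAM
// The pattern is inferred from those links, not from any documented API.
function clipUrl(player, start, stop, stream) {
  return 'http://autometa.com/rpxp/?' +
    [player, 'clip', 'video', 'start', start, 'stop', stop, 'stream', stream].join('/');
}

// Convert an h:mm:ss timecode to seconds, to compute a clip's length.
function toSeconds(timecode) {
  return timecode.split(':').reduce((total, part) => total * 60 + Number(part), 0);
}

const stream = 'mms://wm.microsoft.com/ms/seminar/en/Episode043/netsow43_mbr.wmv';
console.log(clipUrl('winmedia', '1:02:36', '1:03:21', stream));
console.log(toSeconds('1:03:21') - toSeconds('1:02:36')); // 45
```

<p>
The first clip, in other words, is a 45-second quotation from a show that runs well over an hour.
</p>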
	
</body>
</item>


<item num="a1031">
<title>Space, time, and data</title>
<date>2004/06/30</date>
<body>

<p>
<blockquote>
Scalable vector graphics and animation are two of the hallmark features of Macromedia's nearly ubiquitous multimedia player. Yet the company has done a poor job of creating -- or convincing third-party developers to create -- components that make it routine for people to work with spatial and temporal data. And in the recent push to legitimize Flash as a rich-client platform, the company has de-emphasized what is at the core of every Flash movie: its timeline.
<br/><br/>
It's a hard sell, admittedly. Microsoft is also having a tough time articulating the business case for the scalable vector graphics, 3-D, and animation capabilities it's building into Avalon, the next-generation Windows graphics subsystem. My advice? Stop worshipping the raw power of next year's graphics processing unit, and start showing developers concrete ways to help users deal with their four-dimensional data.  [Full story at <a href="http://www.infoworld.com/article/04/06/25/26OPstrategic_1.html">InfoWorld.com</a>] 
</blockquote>
<a target="video" href="http://autometa.com/rpxp/?quicktime/clip/video/start/1:14:30/stop/1:15:27/stream/http://stream.qtv.apple.com/events/jun/wwdc2004/wwdc_300_100_56_ref.mov"><img align="right" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/electricZebras.jpg"/></a>
I hadn't yet seen Steve Jobs' <a target="video" href="http://stream.qtv.apple.com/events/jun/wwdc2004/wwdc_300_100_56_ref.mov">WWDC keynote</a> when I wrote this column. The demos, collectively, add up to a pretty convincing shot across Longhorn's bow. But I'd level the same criticisms at Apple's use of its hot new graphics technologies. <a target="video" href="http://autometa.com/rpxp/?quicktime/clip/video/start/1:10:15/stop/1:10:39/stream/http://stream.qtv.apple.com/events/jun/wwdc2004/wwdc_300_100_56_ref.mov">Here</a> Phil Schiller applies a bump distortion to an image of a tiger, and  <a target="video" href="http://autometa.com/rpxp/?quicktime/clip/video/start/1:14:30/stop/1:15:27/stream/http://stream.qtv.apple.com/events/jun/wwdc2004/wwdc_300_100_56_ref.mov">here</a> he creates the Electric Zebras album cover. Later, Jobs casually shows off <a target="video" href="http://autometa.com/rpxp/?quicktime/clip/video/start/1:21:38/stop/1:22:08/stream/http://stream.qtv.apple.com/events/jun/wwdc2004/wwdc_300_100_56_ref.mov">liquid distortion</a> as he drags Dashboard widgets onto the desktop. Absolutely luscious eye candy. But, to what end?
</p>
<p>
We live in an age of <a href="http://www.amazon.com/exec/obidos/tg/detail/-/0679726012/">innumeracy</a>. The "chartoon" style of graphics that <a href="http://www.nigelholmes.com/">Nigel Holmes</a> invented at Time has now, to my dismay, begun to cheapen the editorial page of the New York Times. Holmes' arch-nemesis <a href="http://www.edwardtufte.com/tufte/books_vdqi">Edward Tufte</a>, whom Salon aptly describes as a <a href="http://www.salon.com/march97/tufte970310.html">data artist</a>, sets the bar for precise, intelligent, meaningful visualization of data. We don't get nearly enough of that, and it's not for lack of GPU horsepower or elegant APIs. The gating factor is that you can't bottle and sell the Tuftean sensibility. Still, we can try. I'd like to see Apple or Macromedia or Microsoft put Tufte (or someone who thinks like him) in charge of a Manhattan program to produce a new breed of display widgets and data-wrangling wizards.  
</p>
<p>
I was recently shown a stunning visualization of sales data based on the open source <a href="http://treemap.sourceforge.net/">Java Treemap Viewer</a> (<a href="http://www.cs.umd.edu/hcil/treemap-history/index.shtml">background</a>). Like the DateLens viewer I mention in this week's column, the Treemap viewer derives from pioneering work at the University of Maryland's <a href="http://www.cs.umd.edu/hcil/">Human-Computer Interaction Lab</a>. I can't show you the actual visualization I saw because it's proprietary, but here's a <a target="video" href="http://www.cs.umd.edu/hcil/treemap/applet/index.shtml">demo</a>. This technique has been around for years. Of the real-life data sets that could be productively visualized this way, though, I'll wager that few are. I've got a hunch there are a bunch of other techniques that are languishing in research labs too. The industry's challenge is to dig them up, refine them, and deliver them to developers and end users in ways that will really improve our data-driven communication.
</p>
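<p>
For readers who haven't met the technique, here's a minimal Python sketch of the classic "slice-and-dice" treemap layout: each node's rectangle is divided among its children in proportion to their sizes, and the split direction alternates at each level of the hierarchy. The sales figures, region names, and function names are invented for illustration. This is not the Java Treemap Viewer's code, just the basic idea.
</p>

```python
def size(node):
    """Total size of a node: its own value for a leaf, else the sum of its children."""
    name, payload = node
    if isinstance(payload, list):
        return sum(size(child) for child in payload)
    return payload


def layout(node, x, y, w, h, vertical=True, out=None):
    """Slice-and-dice layout: returns {leaf name: (x, y, width, height)}."""
    if out is None:
        out = {}
    name, payload = node
    if not isinstance(payload, list):
        out[name] = (x, y, w, h)
        return out
    total = size(node)
    offset = 0.0
    for child in payload:
        share = size(child) / total
        if vertical:
            # Split the width into side-by-side strips.
            layout(child, x + offset * w, y, w * share, h, False, out)
        else:
            # Split the height into stacked strips.
            layout(child, x, y + offset * h, w, h * share, True, out)
        offset += share
    return out


# Hypothetical sales data: (name, value) leaves nested in (name, [children]) groups.
sales = ("all", [
    ("east", [("east/widgets", 30), ("east/gadgets", 10)]),
    ("west", [("west/widgets", 60)]),
])

rects = layout(sales, 0, 0, 100, 100)
for leaf, rect in rects.items():
    print(leaf, rect)
```

<p>
The area of each leaf rectangle is proportional to its value, so the biggest contributors to the total are literally the biggest shapes on the screen.
</p>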
<p>
<b>Update</b>: <a href="http://www.webwerks.co.nz/weblog/">Andrew Duncan</a> wrote to remind me that I omitted another WWDC graphics demo: <a target="video" href="http://autometa.com/rpxp/?quicktime/clip/video/start/00:35:30/stop/00:37:48/stream/http://stream.qtv.apple.com/events/jun/wwdc2004/wwdc_300_100_56_ref.mov">Aran Anderson's stunning Orbit satellite simulator</a>. "If you need a non-trivial justification for all that GPU goodness," he asked, "wouldn't Orbit qualify?" That's a great point, thanks Andrew. An awesome app, indeed. As I mentioned in my column, scientific visualization has always been a voracious consumer of GPU cycles, but it has also tended to live in its own sci-viz ghetto. Now it's time for this stuff to break out into the world of mainstream business data.
</p>

</body>
</item>


<item num="a1030">
<title>It's not the J in Java Virtual Machine that matters, it's the VM</title>
<date>2004/06/29</date>
<body>

<p>
During the <a href="http://www.itconversations.com/shows/detail149.html">June 18 Gillmor Gang show</a>, I asked Hummer Winblad's <a href="http://www.humwin.com/team.html#kertzman">Mitchell Kertzman</a> about open source business models. Kertzman <a target="audio" href="http://udell.infoworld.com:8002/?site=rdscon.vo.llnwd.net&amp;amp;url=/o1/_downloads/itc/mp3/2004/The%20Gillmor%20Gang%20-%20June%2018,%202004.mp3&amp;amp;dur=01:04:22&amp;amp;beg=00:09:21&amp;amp;end=00:10:25">said</a> <sup>1</sup> that the key factor, from his perspective, is the way in which the open source stack frees commercial software companies from the burden of "dragging around an expensive platform." He also <a target="audio" href="http://udell.infoworld.com:8002/?site=rdscon.vo.llnwd.net&amp;amp;url=/o1/_downloads/itc/mp3/2004/The%20Gillmor%20Gang%20-%20June%2018,%202004.mp3&amp;amp;dur=01:04:22&amp;amp;beg=00:15:08&amp;amp;end=00:15:59">questioned the need</a> <sup>2</sup> for the JVM, citing two reasons. First, that Java's portability has become a non-issue now that there are only two platforms that matter: .NET and Linux. Second, that the rise of XML Web services has given a boost to the text-savvy scripting languages: Perl/Python/PHP, the "P" in LAMP. 
</p>
<p>
At that point something clicked in my head, and I <a target="audio" href="http://udell.infoworld.com:8002/?site=rdscon.vo.llnwd.net&amp;amp;url=/o1/_downloads/itc/mp3/2004/The%20Gillmor%20Gang%20-%20June%2018,%202004.mp3&amp;amp;dur=01:04:22&amp;amp;beg=00:21:13&amp;amp;end=00:23:35">proposed</a> <sup>3</sup> a software taxonomy based entirely on virtual machines -- the VB runtime, the CLR, the JVM, the Perl and Python VMs. Some of these are bound more tightly to operating systems than others, some are bound more tightly to programming languages than others, but they all share a set of common characteristics. The definition of a modern "software platform," I would say, is a VM and its associated class libraries. And a bunch of implications flow from that.
</p>
<p>
Here's one. In last Friday's <a href="http://weblog.infoworld.com/udell/2004/06/25.html#a1029">item</a> on automated code analysis, I forgot to mention that the growing reliance on VMs is becoming a key enabler of a new breed of tools that enhance software quality. From Greg Wilson's blog:
<blockquote class="personQuote GregWilson">
<p>One of the most important features of the "New Standard Model" of
programming is its emphasis on unit testing.  Just five years after
the first version of JUnit was written, an ever-increasing number of
programmers actually create and run tests as a matter of course.</p>
<p>But writing tests by hand is still tedious, and still requires a fair
degree of programming skill.  Enter Li and Wu's new <a href="http://www.sybex.com/sybexbooks.nsf/booklist/4320">book</a>.  Over the
course of twelve detailed (and sometimes rather intense) chapters, the
authors explain how to build a higher-level testing tool for .NET
programs using:</p>
<ul>
<li>reflection, to find and call the methods being tested;</li>
<li>CodeDOM, to generate testing code from specifications; and</li>
<li>Excel, as a user interface.</li>
</ul> [<a href="http://pyre.third-bit.com/heliumblog/archives/000049.html">Helium: Greg Wilson</a>]
</blockquote>
</p>
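<p>
To make the reflection piece concrete, here's a minimal sketch in Python rather than .NET. It uses the standard <tt>inspect</tt> module to find methods by name and call them against a table of expected results. The <tt>Calculator</tt> class and the <tt>SPEC</tt> table (a stand-in for the spreadsheet) are hypothetical, and this is not the tool the book describes, only an illustration of why a VM that can describe its own code makes such tools easy to build.
</p>

```python
import inspect


class Calculator:
    """Hypothetical class under test."""

    def add(self, a, b):
        return a + b

    def multiply(self, a, b):
        return a * b


# Stand-in for the spreadsheet: (method name, arguments, expected result).
SPEC = [
    ("add", (2, 3), 5),
    ("multiply", (4, 5), 20),
    ("add", (-1, 1), 0),
]


def run_spec(target, spec):
    """Use reflection to look up each named method, call it, and compare results."""
    methods = dict(inspect.getmembers(target, predicate=inspect.ismethod))
    results = []
    for name, args, expected in spec:
        actual = methods[name](*args)
        results.append((name, args, actual == expected))
    return results


results = run_spec(Calculator(), SPEC)
for name, args, passed in results:
    print(name, args, "ok" if passed else "FAILED")
```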
<p>
VMs still aren't completely viable on the client side, so a lot of what's becoming possible hasn't really sunk in, but that's about to change. Eclipse runs on the JVM, Chandler runs on Python, various things run on the CLR (and Mono), Longhorn apps will run on the CLR. One way or another, your platform will be a VM. Its capabilities, class libraries, OS bindings, and language bindings will matter more to you than the underlying OS or language.
</p>
<hr/>
<p>
<sup>1</sup> This is an experimental MP3 clipping service. Alternatively (i.e., if I break it) you can just go to the <a href="http://www.itconversations.com/shows/detail149.html">broadcast</a> and play 9:21 to 10:25.
</p>
<p>
<sup>2</sup> 15:08 - 15:59
</p>
<p>
<sup>3</sup> 21:13 - 23:35
</p>



</body>
</item>

<item num="a1029">
<title>Open source and the advancement of automated code analysis</title>
<date>2004/06/25</date>
<body>

<p>
<a href="http://weblog.infoworld.com/udell/gems/agitar.jpg"><img align="right" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/agitar_s.jpg"/></a>
Back in January I mentioned <a href="http://www.agitar.com/">Agitar Software</a> in a <a href="http://www.infoworld.com/article/04/01/23/04OPstrategic_1.html">column on software testing</a>. The backstory was that Agitar got in touch with me after reading my review of <a href="http://weblog.infoworld.com/udell/2003/12/03.html#a857">Compuware's DevPartner Studio</a>. I had used NLucene, the .NET port of the Java-based Lucene search engine, as a benchmark to explore that product's debugging and source-code analysis features. Agitar's development lead, Kent Mitchell, picked up on the idea. He fed Lucene's Java sources into his test automation tool, <a href="http://www.agitar.com/products/000024.html">Agitator</a>, and used Lucene to demonstrate his product.
</p>
<p>
Today Agitar's Mark de Visser pointed me to this <a href="http://www.agitar.com/openquality/">interesting experiment</a>. It's a set of test coverage reports for Agitar's own product plus some open source Java projects including Ant, Berkeley DB, Cocoon, and Lucene. What exactly these reports mean is open to interpretation, as Agitar points out. Note also that Agitar's own product is a special case, since the company has been <a href="http://www.developertesting.com/managed_developer_testing/000033.html">dogfooding</a> its own tool. While "agitation" of arbitrary code can automatically produce a bunch of tests, they're not really meant to be used without human oversight. CTO Alberto Savoia puts it this way:
<blockquote class="personQuote AlbertoSavoia">
Agitator can greatly accelerate the development and thoroughness of unit tests by automating most of the activities that don't require human understanding, intelligence, and creativity, but you still need to invest time and thought to direct the automation and to make sure the results are correct, robust, and maintainable. [<a href="http://www.developertesting.com/managed_developer_testing/000033.html">Developer Testing: Eating our own dogfood</a>]
</blockquote>
</p>
<p>
The meta-theme I find interesting here is the virtuous cycle involving open source codebases and a new breed of static and dynamic code analysis tools. Another example: <a href="http://www.coverity.com/main.html">Coverity's</a> <a href="http://linuxbugs.coverity.com/">Linux bugs database</a> (registration required, see <a href="http://www.coverity.com/files/linux_article.pdf">this Linux Magazine article by Benjamin Chelf</a> for background). 
</p>
<p>
To Eric Raymond's <a href="http://www.catb.org/~esr/writings/cathedral-bazaar/cathedral-bazaar/ar01s04.html">famous dictum</a> -- "Given enough eyeballs, all bugs are shallow" -- perhaps we should now add: "Given enough code to study, the eyeballs will be fitted with increasingly powerful spectacles."
</p>
	

</body>
</item>

<item num="a1028">
<title>OS X Keychain and Win XP Credential Manager</title>
<date>2004/06/24</date>
<body>

<p>
Somebody asked me today why Windows XP doesn't have something like Mac OS X's Keychain: a secure, systemwide store for names and passwords. And then I remembered, dimly, that it does -- sort of. When XP came out, all the <a href="http://www.microsoft.com/windowsxp/pro/evaluation/features.mspx">feature lists</a> mentioned Credential Manager, which uses the Windows Data Protection API (<a href="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsecure/html/windataprotection-dpapi.asp">DPAPI</a>) to do something that sounds just like what the OS X Keychain does. On XP, you get to the Credential Manager like so: Control Panel -> User Accounts -> Manage my Network Passwords. 
</p>
<p>
It seems bizarre that I could have forgotten all about this. But then again, perhaps not. When I looked at the Stored Usernames and Passwords list, I found nothing there except for my Passport account. No FTP sites, websites, email accounts, or SSH accounts. Nor was I able to add such accounts using the GUI. Digging a bit deeper, I learned that Windows Server 2003's <a href="http://www.microsoft.com/resources/documentation/WindowsServ/2003/standard/proddocs/en-us/Default.asp?url=/resources/documentation/WindowsServ/2003/standard/proddocs/en-us/cmdkey.asp">cmdkey</a> can be transplanted to XP, where it can be used to list and add credentials. Using cmdkey I was able to add a Web account by specifying the "generic" type -- as opposed to the default, which is the domain. But IE still paid no attention. Its credential memory is apparently unrelated to Credential Manager. Who knew? Not me, anyway.
</p>
<p>
Next I went back to double-check the OS X situation. In Keychain Access, I found FTP sites, SSH accounts, and certificates, but no websites. How come? Oh, Firefox. I haven't used Safari in ages. Firefox evidently talks neither to Credential Manager on Windows nor to Keychain on OS X. But while Windows' native browser, IE, doesn't talk to the systemwide credential store, OS X's native browser, Safari, does. When I told Safari to remember credentials for a secure website, they showed up in Keychain Access. (Apparently <a href="http://www.mozilla.org/projects/camino/">Camino</a> supports the Keychain too.)
</p>
<p>
Weird, eh? Some parting questions:
<ol>
<li><p>Does IE really not use DPAPI to store non-Passport Internet credentials, and if not, why not? </p></li>
<li><p>Will XP SP2 make any changes in this area? </p></li>
<li><p>What would it take for a cross-platform app, say Firefox, to support both Credential Manager on Windows and Keychain on OS X?</p></li>
<li><p>Do <i>any</i> existing apps do both?</p></li>
</ol>
</p>
<hr/>
<p>
<b>Update:</b> Ari Pernick <a href="http://blogs.msdn.com/webtransports/archive/2004/06/25/166317.aspx">spells out</a> the situation, which is a bit complex. Briefly, WinInet uses DPAPI for NTLM/Kerberos, but uses PStore for basic and digest authentication. He writes:
<blockquote class="personQuote AriPernick">
Pstore doesn't do as good of a job of protecting credentials as the Data Protection and Credential Management APIs do and as the warning on the API documentation suggests, it is likely to change or go away in Longhorn. In that timeframe WinInet will switch to use the better APIs for those types of credentials. As for Udell's question #1, which asks why we don't use the better APIs to store basic and digest authentication, my best guess is that the credential manager wasn't really made to hold that type of credential well (you can't input them from the GUI UI).  And to answer question #2, this hasn't changed in Windows XP SP2.
<br/><br/>
Even with the planned changes I referred to, you are still a far cry from centralized credential management that includes all web credentials. The credentials in the better store may still not show up in the GUI and forms based authentication is a completely different beast altogether. Sounds like a nice feature to integrate all of those in one GUI for a user, and maybe an IE or a security pm will hear the call and make it so, especially if the users ask for it. [<a href="http://blogs.msdn.com/webtransports/archive/2004/06/25/166317.aspx">Ari Pernick: WebTransports: Where to put the credentials?</a>]
</blockquote>
Thanks for clearing that up, Ari. I suspect that if more users thought about this issue, they'd be asking for the solution, but since they don't, they aren't. For what it's worth, I'm asking. Whether you are a home user or an enterprise user, you've got a boatload of Web credentials to manage. For something so basic, it seems nuts to have to rely on a non-integrated third-party solution -- Bruce Schneier's <a href="http://www.schneier.com/passsafe.html">Password Safe</a>, for example -- when the platform could support an integrated solution. Something this basic ought to be built in, as it is on the Mac. And "the Longhorn time frame" seems awfully remote. XP SP3, maybe?
</p>

</body>
</item>


<item num="a1027">
<title>The Google PC</title>
<date>2004/06/22</date>
<body>

<p>
<blockquote>
On the Google PC, you wouldn't need third-party add-ons to index and search your local files, e-mail, and instant messages. It would just happen. The voracious spider wouldn't stop there, though. The next piece of low-hanging fruit would be the Web pages you visit. These too would be stored, indexed, and made searchable. More ambitiously, the spider would record all your screen activity along with the underlying event streams. Even more ambitiously, it would record phone conversations, convert speech to text, and index that text. Although speech-to-text is a notoriously imperfect art, even imperfect results can support useful search. [<a href="http://www.infoworld.com/article/04/06/18/25OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
This column is a companion to another from a few weeks ago: <a href="http://weblog.infoworld.com/udell/2004/05/12.html#a999">Google's supercomputer</a>. Meanwhile I've been working on a story about Longhorn, for which I had a long and extremely interesting interview with Quentin Clark, <s>the architect of</s> director of program management for WinFS. I'd like to transcribe the whole thing to post along with the story, when it runs, but the upshot is that Microsoft is planning more and better integration between WinFS and XML -- both in terms of data definition and query -- than I'd previously heard, which is welcome news. 
</p>
<p>
It seems clear, though, that whatever can be accomplished by means of what I've come to call "managed metadata," we'll always want that Google effect to be happening in parallel. When asked about the Semantic Web and RDF at InfoWorld's 2002 CTO Forum, Sergey Brin said:
<blockquote class="personQuote SergeyBrin">
Look, putting angle brackets around things is not a technology, by itself. I'd rather make progress by having computers understand what humans write, than by forcing humans to write in ways computers can understand.
</blockquote>
From my perspective, this isn't an either/or choice. I'd rather make progress by having computers understand what people write <i>and</i> by helping people to write in ways that computers can understand. What's more, I'd like to construe "writing in ways that computers can understand" as a problem for which hybrid SQL/XML technology is a solution. When managed metadata exists, or can be acquired, purely relational query will be powerful. When metadata is implicitly present, for example in XML fragments, XPath and XQuery can leverage it. The combination of relational, XML, and free-text search is the best of all worlds. As I've mentioned before, by the way, <a href="http://archive.infoworld.com/article/03/05/23/21FEinnovidehen_1.html?s=feature">Kingsley Idehen</a> has been <a href="http://search.infoworld.com/servlet/query.html?qt=virtuoso">demonstrating this</a> for several years. 
</p>
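<p>
Here's a rough sketch of what I mean by combining the three styles, using only Python's standard library: SQLite for the relational part, ElementTree's limited XPath subset for the XML part, and a plain substring test standing in for free-text search. The table layout, column names, and sample notes are all made up, and this says nothing about how WinFS or Virtuoso actually work.
</p>

```python
import sqlite3
import xml.etree.ElementTree as ET


def note(sender, subject, body):
    """Build a small XML fragment and return it serialized as text."""
    root = ET.Element("note")
    ET.SubElement(root, "from", addr=sender)
    ET.SubElement(root, "subject").text = subject
    ET.SubElement(root, "body").text = body
    return ET.tostring(root, encoding="unicode")


db = sqlite3.connect(":memory:")
# 'kind' and 'year' are managed metadata; 'fragment' carries implicit metadata as XML.
db.execute(
    "CREATE TABLE docs (id INTEGER PRIMARY KEY, kind TEXT, year INTEGER, fragment TEXT)"
)
db.executemany(
    "INSERT INTO docs (kind, year, fragment) VALUES (?, ?, ?)",
    [
        ("email", 2004, note("alice", "WinFS schema", "Notes on XML query support")),
        ("email", 2003, note("bob", "Lunch", "Noon on Friday")),
        ("memo", 2004, note("alice", "Budget", "Query the finance database")),
    ],
)


def search(kind, year, sender, word):
    """Relational filter first, then an XPath test, then a free-text match."""
    hits = []
    rows = db.execute(
        "SELECT id, fragment FROM docs WHERE kind = ? AND year = ?", (kind, year)
    )
    for doc_id, fragment in rows:
        root = ET.fromstring(fragment)
        # XPath predicate on an attribute inside the fragment.
        if root.find(f"./from[@addr='{sender}']") is None:
            continue
        # Crude free-text search over all text in the fragment.
        text = " ".join(root.itertext()).lower()
        if word.lower() in text:
            hits.append((doc_id, root.findtext("subject")))
    return hits


hits = search("email", 2004, "alice", "query")
print(hits)
```

<p>
A real hybrid engine would index the fragments and push all three predicates into one query plan. The point of the sketch is only that the three kinds of query narrow the same result set.
</p>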

</body>
</item>

<item num="a1026">
<title>Outages</title>
<date>2004/06/22</date>
<body>

<p>
Yesterday, one of my DSL providers ran afoul of a backhoe which severed its OC3. The bad news was that a bunch of customers, me included, learned that we had no redundant path to the backbone -- at least not through this provider. (This is one reason why I maintain a separate circuit through a different provider; that one was unaffected.) The good news was that the fiber got spliced together very quickly, and the provider was really, really sorry and really, really proactive. I got calls from three people alerting me to the outage, and calls from four other people notifying me that it was cleared. In a situation like that, there's no such thing as overcommunicating.
</p>
<p>
Hence this note. If you haven't heard from me in a few days but think you should have, it's not because of that fiber cut. Apparently my home mail server, to which my InfoWorld mail is forwarded, tightened up its reverse DNS lookup policy. It could resolve the domain, but not the specific hostname/domain. That's been corrected now on our end (thanks, Kevin), and I hope the queued messages will transfer today. 
</p>

</body>
</item>


<item num="a1025">
<title>Open document formats</title>
<date>2004/06/17</date>
<body>

<p>
Last week Tim Bray <a href="http://www.tbray.org/ongoing/When/200x/2004/06/09/ScienceStreet">wrote about</a> his (and Sun's) involvement in the European Commission's investigation into the OpenOffice and Microsoft flavors of XML office documents. The upshot:
<blockquote class="personQuote TimBray">
You can find the Committee's conclusions <a href="http://europa.eu.int/ISPO/ida/jsps/index.jsp?fuseAction=showDocument&amp;parent=crossreference&amp;documentID=2592">here</a>; 
they're short, readable, and defy summarization. [<a href="http://www.tbray.org/ongoing/When/200x/2004/06/09/ScienceStreet">ongoing</a>]
</blockquote>
The conclusions are indeed concise, and the bulleted recommendations even more so. I'll quote them here, changing only &lt;ul> to &lt;ol> for ease of reference:
<blockquote>
Therefore, it is recommended that:
<ol>
<li>The OASIS Technical Committee
considers whether there is a need and opportunity for extending the
emerging OASIS Open Document Format to allow for custom-defined schemas; 
</li><li>Industry actors not currently
involved with the OASIS Open Document Format consider participating in
the standardisation process in order to encourage a wider industry
consensus around the format; 
</li><li>Submission of the emerging OASIS
Open Document Format to an official standardisation organisation such
as ISO is considered;
</li><li>Microsoft considers issuing a
public commitment to publish and provide non-discriminatory access to
future versions of its WordML specifications; 
</li><li>Microsoft should consider the merits of submitting XML formats to an international standards body of their choice; 
</li><li>Microsoft assesses the possibility of excluding non-XML formatted components from WordML documents; 
</li><li>Industry is encouraged to provide
filters that allow documents based on the WordML specifications and the
emerging OASIS Open Document Format to be read and written to other
applications whilst maintaining a maximum degree of faithfulness to
content, structure and presentation. These filters should be made
available for all products; 
</li><li>Industry is encouraged to provide
the appropriate tools and services to allow the public sector to
consider feasibility and costs of a transformation of its documents to
XML-based formats;
</li><li>The public sector is
encouraged to provide its information through several formats. Where by
choice or circumstance only a single revisable document format can be
used this should be for a format around which there is industry
consensus, as demonstrated by the format's adoption as a standard.</li></ol>
</blockquote>
</p>
<p>
The next day I received a note from somebody at Waggener-Edstrom, Microsoft's public relations firm, pointing to and summarizing <a href="http://www.microsoft.com/office/xml/juneletter.mspx">this open letter from Jean Paoli</a>. Both notes -- that is, the PR rep's and Paoli's -- stress point #1: that support for user-defined schemas, which Office 2003 alone offers, is a big deal. I agree. Neither note directly addresses points #4 <sup>1</sup>, #5, or #6. And neither cites the original report, though the <a href="http://www.microsoft.com/office/xml/">Office XML home page</a>, which the Paoli letter points to, does point to the European Commission's <a href="http://europa.eu.int/ISPO/ida/jsps/index.jsp?fuseAction=showDocument&amp;parent=news&amp;documentID=2387">wrapper page</a>. And it, in turn, points to:
<ul>
<li><a href="http://europa.eu.int/ISPO/ida/jsps/index.jsp?fuseAction=showDocument&amp;parent=news&amp;documentID=2387">the recommendations</a></li>
<li><a href="http://europa.eu.int/ISPO/ida/export/files/en/1928.pdf">the full "Valoris" report (78-page PDF)</a></li>
<li><a href="http://europa.eu.int/ISPO/ida/export/files/en/1933.pdf">Microsoft's comments</a></li>
<li><a href="http://europa.eu.int/ISPO/ida/export/files/en/1971.pdf">Sun's comments</a></li>
</ul>
</p>
<p>
I'm citing those URLs here partly for my own future reference, and partly to try to attract attention to a subject that's important, complex, and warrants a lot more discussion and commentary. Just now, with the Valoris report loaded into my browser, I clicked my <a href="http://weblog.infoworld.com/udell/2004/04/13.html">Technorati talkback</a> bookmarklet -- which in this case resolves to <a href="http://www.technorati.com/cosmos/search.html?url=http://europa.eu.int/ISPO/ida/export/files/en/1928.pdf">this lookup</a>, and found only <a href="http://217.45.146.189/archive/2004/06/14/232.aspx">this comment</a> from Stephen McGibbon. Meanwhile, Feedster comes up blank for <a href="http://www.feedster.com/search.php?q=%22valoris+report">Valoris report</a>.
</p>
<p>
Open document formats are a big deal. Here's hoping that the next time I issue those queries, more will turn up.
</p>
<hr/>
<p>
<sup>1</sup> Note, however, that the <a href="http://www.microsoft.com/office/xml/">Office XML home page</a> calls out the <a href="http://www.microsoft.com/Office/xml/faq.mspx">FAQ</a> which "has been recently updated with information regarding the perpetual nature of the program, patent grants, and more." 
</p>


</body>
</item>

<item num="a1024">
<title>When a journalist blogs</title>
<date>2004/06/15</date>
<body>

<p>
<a href="http://blogs.msdn.com/jmazner/archive/2004/06/14/155791.aspx">Jeremy Mazner</a> is asking some great questions:
<ul>
<li><p><i>
Q: Does a quick blog entry meet the same standards and go through the same background and vetting process as a "real" story?
</i></p>
<p>
A: Many (though not all) of the items I post here are as carefully written as what goes into print. None <s>are</s> is <sup>1</sup> edited by anybody but me. None are vetted by anyone at InfoWorld, but all can be vetted by everybody who chooses to comment.
</p></li>
<li>
<p><i>
Q: Is a blog entry equally as obligated to represent both sides of a controversy, or is it expected to only represent the journalist's point of view? 
</i></p>
<p>
A: For the magazine, I write features and reviews and columns. All are expected to be fair. The story types exist along a spectrum ranging from less to more personal. The blog lives at the personal end of the spectrum.
</p></li>
<li>
<p><i>
Q: Are blogs supposed to be more of a conversation -- and if so, should they always have comments enabled?
</i></p>
<p>
A: I think blogs can't help but be a conversation. As to comments, after years of doing Web forums and discussions, I'm experimenting with taking a break from flames and spam. I'd like to think that the blogosphere's less tightly coupled "discussions" -- mediated by logs and search engines -- deliver better signal-to-noise with less psychic strain. That said, I do miss direct comments, I do use them selectively, and I may try re-enabling them.
</p></li>
</ul>
</p>
<p>
Jeremy's questions were motivated by a series of questions I've been asking about Longhorn. This is part of a strategy I've been using -- since the pre-blog era, in fact, when my medium of choice was NNTP -- to deepen the stories I research for magazines. When the subject is not a secret, I find it extremely helpful to raise some issues publicly and invite a range of interested parties to react to them. A recent example was <a href="http://weblog.infoworld.com/udell/2004/01/27.html#a900">this entry</a> in support of <a href="http://weblog.infoworld.com/udell/2004/03/01.html#a930">this story</a>. 
</p>
<p>
In that spirit, I owe Jeremy a response to <a href="http://blogs.msdn.com/jmazner/archive/2004/06/14/155779.aspx">his questions</a> about my take on WinFS. He asks: "What is an 'XML-centric database' anyway?" A good example of the basic idea -- and the one I've been working with -- is Berkeley DB XML (which has also been adopted by the Chandler project). DB XML supports indexed XPath search, a powerful capability that's now being woven into both RDBMSs with XML support and "native" XML databases. An even more powerful standard is XQuery, which, though not a final recommendation, is implemented provisionally in both conventional RDBMSs and native-XML dbs.
</p>
<p>
We have standard query languages (XPath, XQuery), and standard ways of writing schemas (XSD, Relax), and applications (Office 2003) that with herculean effort have been adapted to work with these query and schema languages, and free-text search further enhancing all this goodness. Strategically, why not build directly on top of these foundations? 
</p>
<p>
Tactically, why do I want to write code like this:
<pre class="code csharp">
public class Person
  {
  [XmlAttribute()] public string Title;
  [XmlAttribute()] public string FirstName;
  [XmlAttribute()] public string MiddleName;
  [XmlAttribute()] public string LastName;
  ....
</pre>
in order to consume data like this?
<pre>
&lt;People>
  &lt;Person
    DisplayName="Woodgrove Bank"
    IMAddress="Support@woodgrovebank.com"
    UserTile=".\user_tiles\Adventure Works.jpg">
    &lt;EmailAddresses>
        &lt;EmailAddress
            Type="Work"
            Address="mortgage@woodgrovebank.com"/>
        &lt;EmailAddress
            Type="Primary"
            Address="Support@woodgrovebank.com"/>
   &lt;/EmailAddresses>
</pre>
</p>
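<p>
For contrast, here's a minimal sketch of the alternative I have in mind: query the data with a path expression and skip the class declaration. It uses Python's ElementTree, which supports only a subset of XPath. The tree is rebuilt through the API to mirror the attributes in the sample above, and the <tt>primary_email</tt> helper is hypothetical.
</p>

```python
import xml.etree.ElementTree as ET

# Rebuild a fragment shaped like the sample above.
people = ET.Element("People")
person = ET.SubElement(
    people,
    "Person",
    DisplayName="Woodgrove Bank",
    IMAddress="Support@woodgrovebank.com",
)
addresses = ET.SubElement(person, "EmailAddresses")
ET.SubElement(addresses, "EmailAddress", Type="Work", Address="mortgage@woodgrovebank.com")
ET.SubElement(addresses, "EmailAddress", Type="Primary", Address="Support@woodgrovebank.com")


def primary_email(root, display_name):
    """Find a person's primary address with one path expression, no Person class needed."""
    # Sketch only: display_name is interpolated unescaped, so it must not contain quotes.
    path = (
        f".//Person[@DisplayName='{display_name}']"
        "/EmailAddresses/EmailAddress[@Type='Primary']"
    )
    node = root.find(path)
    return node.get("Address") if node is not None else None


primary = primary_email(people, "Woodgrove Bank")
print(primary)
```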
<p>
I believe two things to be true. First, we have some great XML-oriented data management technologies. Second, the ambitious goals of WinFS cannot be met solely with those technologies. I'm trying to spell out where the line is being drawn between interop and functionality, and why, and what that will mean for users, developers, and enterprises.
</p>

<hr/>
<p>
<sup>1</sup> David Clarke, of CapeClear, points out that "this statement, precisely by virtue of its obvious lack of sub-editing ('are', not 'is'), re-inforces the very point it seeks to make!" Delightful! As David mentioned to me in email, there ought to be a word for this reflexive case.  
</p>

</body>
</item>



<item num="a1023">
<title>Thin client, rich data</title>
<date>2004/06/15</date>
<body>

<p>
<blockquote>
Current approaches to taking browsers offline typically enqueue messages that later update a server-based data model. An Alchemy application, though, always works with a genuine local data model that it stores as sets of XML fragments and navigates in a relational style. Bosworth's hunch is that a Web-style thin client, driven by a rich data model intelligently synchronized with the services cloud, could do most of what we really need -- both offline and online. Nothing prevents Java, .Net, and Flash clients from adopting the same strategy, by the way. But if Bosworth is right, the universal client that we know and love could get a new lease on life. [Full story at <a href="http://www.infoworld.com/article/04/06/11/24OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
</p>
<p>
In the story as printed, this sentence:	
<blockquote>
BEA's Alchemy relies on a server component for the same reason that Macromedia's Flex does: both companies want to sell servers.
</blockquote>
was abbreviated to this:
<blockquote>
BEA's Alchemy relies on a server component for the same reason that Macromedia's Flex does.
</blockquote>
Things get left on the cutting room floor, that's just life in the print medium, but I do want to restore (and expand on) the original point. Adam Bosworth is a guy who knows an awful lot about building client software -- Quattro, Paradox, Access, IE. But he is not now selling client software. Rather, he's selling infrastructure based on principles -- asynchronous coarse-grained XML messaging -- that he has forcefully and consistently evangelized. From our interview, here are some quotes that restate why such infrastructure must be:
</p>
<blockquote class="personQuote AdamBosworth">
<b>Clustered</b>
People who build services tend to assume they don't know who's going to use them, and how often. 
</blockquote>
<blockquote class="personQuote AdamBosworth">
<b>Metadata-driven</b>
So you can change the behavior without recompiling and redeploying the code.
</blockquote>
<blockquote class="personQuote AdamBosworth">
<b>Asynchronous</b> We're still trying to convince the industry of this, but it's a lot better if you do this asynchronously, because a lot of time the thing you're trying to talk to can't respond right away, either because it wasn't written to handle the load or because the thing you're asking it to do takes time. 
</blockquote>
<blockquote class="personQuote AdamBosworth">
<b>Intermediated</b> 
The problem is that your credit approval service got bought by BFA and they consolidated the thing, so now it's a different address with a different message, and you don't want to redeploy your app. So you want everything to go through some fabric that is essentially modifiable. Call these things intermediaries, enterprise service buses, fabrics, I don't care, but you need one of these things. We've announced one called QuickSilver, BlueTitan does a nice job with this, there's Confluent...
</blockquote>
<p>
Nothing controversial here. But there are wildly different approaches to the construction of the client-side systems that we'll attach to this infrastructure. Microsoft's Longhorn must try to extend the Windows franchise. BEA's Alchemy is free to extend the Web. This isn't an either-or deal, of course. Both strategies can succeed and co-exist. It helps, though, that the Web has found a powerful new ally.
</p>


</body>
</item>


<item num="a1022">
<title>Quis custodiet ipsos custodes?</title>
<date>2004/06/14</date>
<body>

<p>
Tim Bray <a href="http://www.tbray.org/ongoing/When/200x/2004/06/13/Sunbeams">points to</a> Sun's John Clingan who asks the important question (in English, not Latin): <a href="http://blogs.sun.com/roller/page/jclingan/20040613">Who analyzes the analysts?</a> This bit caught my eye:
<blockquote class="personQuote JohnClingan">
I remember back in ~1990 when Windows NT was being talked about taking over the world. My girlfriend at the time (now my wife) saw it on a magazine rack and said "I saw a Byte magazine cover which said, 'Is Unix Dead?'". "Uh oh, are you going to have a job next year?" Ironically, Byte magazine is dead (although byte.com is still around). Is this the enforcement of accountability for journalists and analysts? [<a href="http://blogs.sun.com/roller/page/jclingan/20040613">John Clingan</a>]
</blockquote>
Yup, in the long run it is. But things have gotten a whole lot more interactive than that. As I <a href="http://weblog.infoworld.com/udell/2004/06/11.html#a1021">mentioned on Friday</a>, Sean McCown's SQL/XML story for InfoWorld, and Michael Rys' commentary on it, combine in an interesting way. Every analyst ought to be a part-time practitioner, and every practitioner ought to be a part-time analyst.
</p>
<p>
That 1990 BYTE story, by the way, makes for an interesting re-read. Some backstory: my pals Tom Yager and Ben Smith wrote it, and all three of us objected to the sensationalistic headline and its hand-wringing subhead ("As Unix faces the stiffest competition of its long life--Windows NT--can it survive?"). These came from the editorial packagers, not from the writers. And naturally, they're all anyone remembers now. I found a copy of the article on the BYTE CD-ROM, and at this late date I don't think anyone will begrudge my posting it. So, back from the dead, here is BYTE's 1990 <a href="http://udell.roninhouse.com/archive/IsUnixDead.htm">Is Unix Dead?</a>. It contains some gems:
</p>
<p><b>Reports greatly exaggerated:</b>
<blockquote>
Despite its problems, Unix is not dead; in fact, it's surprisingly healthy.
</blockquote>
</p>
<p><b>Imagining OS X:</b>
<blockquote>
Improving Unix is much on the minds of Unix vendors. "If you have an X-based desktop with Mac-like features, the end user won't care that Unix is underneath," says Ken Arnold, an engineer at HP's Distributed Object Computing Program. As base-level machines get more powerful, they can better run the larger Unix operating systems. Then, to the end user, it is simply a matter of what off-the-shelf applications are available.
<br/><br/>
Avadis Tevanian, director of System Software at Next, agrees. He envisions a GUI that can run productivity applications side-by-side with user-made custom applications. "To get up to millions of units, you have to get rid of [the Unix shell]," he says. 
</blockquote>
</p>
<p><b>The Sun factor:</b>
<blockquote>
Solaris 2.0, a derivative of SVR4, is going to be the acid test for Sun spinoff SunSoft. It remains to be seen whether the software arm of a hardware vendor is truly willing to create a level playing field. Sun is trying to set itself up with a virtual monopoly on SPARC operating systems and, through SunSoft and Solaris 2.0, is planning to extend its reach into the realm of high-end PCs.
</blockquote>
</p>
<p><b>Novell's first Linux:</b>
<blockquote>
While NextStep will be one of the contenders for the high-end multitasking desktop, it appears that the fiercest salvo fired at NT will come from an unlikely alliance: Univel. USL, looking to get serious about marketing and distribution, and Novell, hoping to shed some of its proprietary image in the newly competitive climate, have joined forces to offer a new shrink-wrapped Unix operating system that may be available as early as this fall. Sold as SVR4.2 by USL and as UnixWare by Univel, it has a list of promises at least as long as NT's.
</blockquote>
</p>
<p>
Pretty good story, on the whole. We'll never know how many more magazines that ill-fated headline sold, but clearly, it wasn't a winning strategy.
</p>
<p>
The computing landscape back then sounds oddly familiar. In many ways things have progressed more slowly than I'd have imagined. But the analyst/practitioner ecosystem is refreshingly new. "Who analyzes the analysts?" You do.
</p>


</body>
</item>



<item num="a1021">
<title>Sean McCown, Michael Rys, and conversational journalism</title>
<date>2004/06/11</date>
<body>

<p>
Back in April, we ran a wildly ambitious story by Sean McCown. Entitled <a href="http://www.infoworld.com/article/04/04/23/17FExml_1.html">Databases Flex their XML</a>, it compared the XML features of DB2, SQL Server, Oracle, and Sybase -- and also made an excursion into Yukon territory. (My contribution was the <a href="http://www.infoworld.com/article/04/04/23/17FExmlview_1.html">speculative sidebar</a> on the future of native XML database technology.) Yesterday Microsoft's Michael Rys, a database architect and a co-author of <a href="http://safari.oreilly.com/0321180607">XQuery from the Experts</a>, blogged a <a href="http://sqljunkies.com/WebLog/mrys/archive/2004/06/10/3036.aspx">lengthy and thoughtful response</a> to Sean's analysis.
</p>
<p>
To frame his response, Michael develops a taxonomy of XML structures and storage models and says:
<blockquote class="personQuote MichaelRys">
It should be clear, that by making this distinction, the terms "shredding," "unstructured," and "structured" are confusing. XML's structure can be highly structured, semi-structured or markup-structured, but it is always structured. And either of these formats can be stored in a way to provide relational, InfoSet or textual fidelity using either relational or blob storage. [<a href="http://sqljunkies.com/WebLog/mrys/">Michael Rys</a>]
</blockquote>
</p>
<p>
That's the kind of useful clarification that Michael has been consistently delivering on his <a href="http://sqljunkies.com/WebLog/mrys/">blog</a>. I hope this thread will continue. Sean's article was -- as Michael acknowledges -- as good a comparative piece as has ever appeared in the press. But the topic is huge, and will fuel ongoing discussion. We're living through an epochal moment in the history of the industry. The hybridization of SQL and XML will deeply transform the philosophy and practice of data management in ways that I think none of us fully understands. The story will emerge from conversations between practitioner/analysts like Sean, and architects like Michael. Happily, the online realm has become a pretty good place to have those conversations.
</p>

</body>
</item>

<item num="a1020">
<title>FixYourOwnPrinter.com</title>
<date>2004/06/10</date>
<body>

<p>
<a target="movie" href="http://weblog.infoworld.com/udell/gems/FixYourOwnPrinter.swf"><img src="http://weblog.infoworld.com/udell/gems/FixYourOwnPrinter.jpg" align="right" vspace="6" hspace="6"/></a>
My decade-old LaserJet 4 recently developed a bad case of the dreaded "accordion paper jam" syndrome. It's been a workhorse. Maybe, I thought, I should just put it out to pasture. But I had a hunch that the process of getting it fixed would be interestingly different from the last time I had to do something like this. And sure enough, it was. I found several repair kits online, but zeroed in on <a href="http://www.fixyourownprinter.com">FixYourOwnPrinter.com</a> because <a href="http://www.fixyourownprinter.com/kke0.html">their kit</a> includes a video that illustrates the process.
</p>
<p>
Here's <a target="movie" href="http://weblog.infoworld.com/udell/gems/FixYourOwnPrinter.swf">45 seconds</a> from my favorite scene<sup>1</sup>, which demonstrates the right way to remove the clip from the end of a roller. I, of course, did it the wrong way. "Be careful not to lose these e-clips, they're easy to pop off," the guy said, just as my e-clip took the leap of faith. That was the only mishap, though. The printer's fixed, and I've joined the ranks of FixYourOwnPrinter.com's <a href="http://www.fixyourownprinter.com/fanmail.html">satisfied customers</a>. 
</p>
<p>
The video isn't going to win any production awards. It's handheld, and not always in focus. But it was plenty good enough to walk me through a complicated procedure that couldn't have been communicated as effectively in any other way. And because it didn't need to be better than that, it was doable for some folks whose business is printer repair, not video production. 
</p>
<hr/>
<p>
<sup>1</sup> Courtesy of <a href="http://www.blue-pacific.com/products/turbinevideo/tvwelcome.htm">Blue Pacific's Turbine Video Encoder</a>. I've been wanting to standardize on Flash as a universal no-hassle video playback format. Turbine, an encoder for Flash video, is a $39 product. And it offers a free version (which I've used here) that's unrestricted except for a subtle watermark. Looks like a nice solution.
</p>


</body>
</item>


<item num="a1019">
<title>Questions about Longhorn, part 3: Avalon's enterprise mission</title>
<date>2004/06/09</date>
<body>

<p>
<a href="http://weblog.infoworld.com/udell/gems/WinformsVsAvalon.jpg"><img vspace="6" hspace="6" align="right" src="http://weblog.infoworld.com/udell/gems/WinformsVsAvalon_s.jpg"/></a>
The slide shown at the right comes from a presentation entitled <a href="http://www.ineta.org/DesktopDefault.aspx?tabindex=2&amp;tabid=41&amp;FileID=125">Windows client roadmap</a>, given last month to the International .NET Association (<a href="http://www.ineta.org/DesktopDefault.aspx">INETA</a>). When I see slides like this, I always want to change the word "How" to "Why" -- so, in this case, the question would become "Why do I have to pick between Windows Forms and Avalon?" Similarly, MSDN's Channel 9 ran a video clip of Joe Beda, from the Avalon team, entitled <a href="http://www.microsoft.com/winme/0404/22606/Joe_Beda_prepare_300k.asx">How should developers prepare for Longhorn/Avalon?</a> that, at least for me, begs the question "Why should developers prepare for Longhorn/Avalon?"
</p>
<p>
I've been looking at decision trees like the one shown in this slide for more than a decade. It's always the same yellow-on-blue PowerPoint template, and always the same message: here's how to manage your investment in current Windows technologies while preparing to assimilate the new stuff. For platform junkies, the internal logic can be compelling. The INETA presentation shows, for example, how it'll be possible to use XAML to write WinForms apps that host combinations of WinForms and Avalon components, or to write Avalon apps that host either or both styles of component. Cool! But...huh? Listen to how Joe Beda frames the "rich vs. reach" debate:
</p>
<blockquote class="personQuote JoeBeda">
Avalon will be supplanting WinForms, but WinForms is more reach than it is rich. It's the reach versus rich thing, and in some ways there's a spectrum. If you write an ASP.NET thing and deploy via the browser, that's really reach. If you write a WinForms app, you can go down to Win98, I believe. Avalon's going to be Longhorn only.
</blockquote>
<p>
So developers are invited to classify degrees of reach -- not only with respect to the Web, but even within Windows -- and to code accordingly. What's more, they're invited to consider WinForms, the post-MFC (Microsoft Foundation Classes) GUI framework in the .NET Framework, as "reachier" than Avalon. That's true by definition since Avalon's not here yet, but bizarre given that mainstream Windows developers can't yet regard .NET as a ubiquitous foundation, even though many would like to.
</p>
<p>
Beda recommends that developers isolate business logic and data-intensive stuff from the visual stuff -- which is always smart, of course -- and goes on to sketch an incremental plan for retrofitting Avalon goodness into existing apps. He concludes:
<blockquote class="personQuote JoeBeda">
Avalon, and Longhorn in general, is Microsoft's stake in the ground, saying that we believe power on your desktop, locally sitting there doing cool stuff, is here to stay. We're investing on the desktop, we think it's a good place to be, and we hope we're going to start a wave of excitement leveraging all these new technologies that we're building.
</blockquote>
</p>
<p>
It's not every decade that the Windows presentation subsystem gets a complete overhaul. As a matter of fact, it's never happened before. Avalon will retire the hodge-podge of DLLs that began with 16-bit Windows, and were carried forward (with accretion) to XP and Server 2003. It will replace this whole edifice with a new one that aims to unify three formerly distinct modes: the document, the user interface, and audio-visual media. This is a great idea, and it's a big deal. If you're a developer writing a Windows application that needs to deliver maximum consumer appeal three or four years from now, this is a wave you won't want to miss. But if you're an enterprise that will have to buy or build such applications, deploy them, and manage them, you'll want to know things like:
<ul>
<li><p>How much fragmentation can my developers and users tolerate <i>within</i> the Windows platform, never mind across platforms?</p></li>
<li><p>Will I be able to remote the Avalon GUI using Terminal Services and Citrix?</p></li>
<li><p>Is there any way to invest in Avalon without stealing resources from the Web and mobile stuff that I still have to support?</p></li>

</ul>
</p>
<p>
Then again, why even bother to ask these questions? It's not enough to believe that the return of rich-client technology will deliver compelling business benefits. (Which, by the way, I think it will.) You'd also have to be shown that Microsoft's brand of rich-client technology will trump all the platform-neutral variations. Perhaps such a case can be made, but the concept demos shown so far don't do so convincingly. The Amazon demo at the Longhorn PDC (Professional Developers Conference) was indeed cool, but you can see similar stuff happening in <a href="http://www.ultrasaurus.com/sarahblog/archives/000140.html">Laszlo</a>, Flex, and other RIA (rich Internet application) environments today. Not, admittedly, with the same 3D effects. But if enterprises are going to head down a path that entails more Windows lock-in, Microsoft will have to combat the perception that the 3D stuff is gratuitous eye candy, and show order-of-magnitude improvements in users' ability to absorb and interact with information-rich services.
</p>

</body>
</item>

<item num="a1018">
<title>Open source and visible source</title>
<date>2004/06/08</date>
<body>

<p>
<blockquote>
Zope Corp.'s layered strategy of engagement with open source and visible-source communities is a compelling blend of the strengths of free and commercial software development. In two previous columns, <a href="http://weblog.infoworld.com/udell/2003/10/28.html#a833">Open source citizenship</a> and <a href="http://weblog.infoworld.com/udell/2003/12/08.html#a862">Giving back to open source</a>, I explored the tendency of enterprises to fork open source projects rather than join them. Pedhazur suggests that a commercial entity supporting both an open source base and a visible-source layered product can reduce the need to fork. By outsourcing code enhancements, the argument goes, an enterprise can enjoy single-throat-to-choke control without seceding from a project's community. It remains to be seen how broadly this model can apply, but in cases where it does, what's not to like? [Full story at <a href="http://www.infoworld.com/article/04/06/04/23OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
In this <a target="audio" href="http://weblog.infoworld.com/udell/gems/hadar.mp3">two-minute clip</a>, Zope Corp.'s Chairman Hadar Pedhazur describes the visible source model as a middle-ground option between the few large open source projects, whose direction an enterprise cannot easily influence, and the many smaller ones that enterprises can influence, but typically fork in order to do so.
</p>
<p>
My hunch is we'll see more of this kind of thing as open source continues to climb up the stack and encroach on the business layer. The visible-source gated community is a particularly interesting construct in light of the <a href="http://www.acm.org/ubiquity/interviews/v5i14_carr.html">Nicholas Carr argument</a> that a lot of IT is shifting from competitive advantage to cost of doing business. In an environment of growing "co-opetition," the visible-source model can pool dollars and intellectual capital in a way that drives down cost for everyone without favoring anyone. Meanwhile it's a great opportunity for the business that manages the relationship between two worlds: the open-source product with its user/developer community, and the visible-source product with its user/developer/customer community.
</p>
</body>
</item>

<item num="a1017">
<title>Questions about Longhorn, part 2: WinFS and semantics</title>
<date>2004/06/07</date>
<body>

<p>
In the <a href="http://weblog.infoworld.com/udell/2004/06/02.html#a1012">first installment</a> of this series of questions about Longhorn, I concluded that the compelling benefit of WinFS must lie in the realm of "organizing stuff" rather than just "finding stuff" -- else why not just leverage existing and well-understood relational, free-text, and XML search methods? And I posited that the signature feature of WinFS -- "relationships" -- must be powerful enough to justify the creation of a proprietary new storage model that will enable (but also require) new applications and developer skills. Admittedly my "finding versus organizing" distinction was a bit of a cheat, since finding depends sensitively on prior organization. Except when it doesn't: brute-force free-text search routinely trumps navigation and structured search. But OK, we've all got to hope that better organization, someday, will level the playing field.
</p>
<p>
Today's personal information systems are organized hierarchically. WinFS proposes that they be organized semantically. A number of observers have noted a family resemblance between RDF (Resource Description Framework) "triples" and WinFS relationships. An RDF triple, in geek-speak, is a subject-predicate-object relation. Sets of RDF triples can be (and Semantic Web people say must be) used to represent and organize knowledge. Microsoft blogger Joshua Allen explicitly connects the dots between RDF/SemWeb and WinFS:
<blockquote class="personQuote JoshuaAllen">
WinFS is going to enable numerous application scenarios that simply are not practical to implement with today's technology. WinFS is not based on RDF, of course, but they both share similar data models. And, while the scope of WinFS is local and "Semantic Web" is global, the scenarios are not that different. When you start to imagine what it would be like to extend WinFS stores to publish and synchronize data with one another, or alternately imagine a "personal semantic web," you can begin to see that the visions have some serious overlap. [<a href="http://www.netcrucible.com/blog/PermaLink.aspx?guid=69ec2c8c-7a78-4a79-acda-6087b4b3f723">Joshua Allen</a>]
</blockquote>
</p>
<p>
Although this stuff can get dangerously abstract, it's easy to state the practical benefit. If my personal information store contains items of types Person, Organization, Project, and Document, and if it knows about relationship types like Employment and Authorship, then I can easily answer questions like "Which Project X documents were written by Doug?" or "Which Project Y documents were written by employees of organization Z?"
</p>
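<p>
To make that concrete, here is a minimal sketch of the idea in JavaScript. It is not WinFS and it is not an RDF library: the item names, the relationship names (<tt>partOf</tt>, <tt>authoredBy</tt>, <tt>employedBy</tt>), and the helper functions are all made up for illustration. The store is nothing more than an array of subject-predicate-object triples, and both questions reduce to pattern matches over it.
</p>

```javascript
// Hypothetical data: every name below is invented for illustration.
var triples = [
  ['spec.doc',   'partOf',     'projectX'],
  ['spec.doc',   'authoredBy', 'doug'],
  ['notes.doc',  'partOf',     'projectX'],
  ['notes.doc',  'authoredBy', 'alice'],
  ['plan.doc',   'partOf',     'projectY'],
  ['plan.doc',   'authoredBy', 'alice'],
  ['budget.doc', 'partOf',     'projectY'],
  ['budget.doc', 'authoredBy', 'doug'],
  ['alice',      'employedBy', 'orgZ'],
  ['doug',       'employedBy', 'orgW']
];

// Return every triple that fits a [subject, predicate, object] pattern.
// A null term in the pattern acts as a wildcard.
function match(pattern) {
  return triples.filter(function (t) {
    return pattern.every(function (term, i) {
      return term === null || term === t[i];
    });
  });
}

// Subjects that stand in relationship p to object o.
function subjects(p, o) {
  return match([null, p, o]).map(function (t) { return t[0]; });
}

// True if the exact triple exists in the store.
function has(s, p, o) {
  return match([s, p, o]).length !== 0;
}

// "Which Project X documents were written by Doug?"
function docsByAuthor(project, author) {
  return subjects('partOf', project).filter(function (doc) {
    return has(doc, 'authoredBy', author);
  });
}

// "Which Project Y documents were written by employees of organization Z?"
function docsByEmployeesOf(project, org) {
  var employees = subjects('employedBy', org);
  return subjects('partOf', project).filter(function (doc) {
    return employees.some(function (person) {
      return has(doc, 'authoredBy', person);
    });
  });
}
```

<p>
With this toy data, <tt>docsByAuthor('projectX', 'doug')</tt> finds only <tt>spec.doc</tt>, and <tt>docsByEmployeesOf('projectY', 'orgZ')</tt> finds only <tt>plan.doc</tt>. The second query is the interesting one: it joins across two relationship types without either one having been designed with the other in mind.
</p>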
<p>
Not everybody buys into the triples-oriented data model. Among them is another Microsoft blogger, Dare Obasanjo, who writes:
<blockquote class="personQuote DareObasanjo">
It seems that the point being argued is that with RDF you can get more understanding of the information in the document than with just XML. Being that one could consider RDF as just a logical model layered on top of an XML document (e.g. RDF/XML) I find it hard to understand how viewing some XML document through RDF colored glasses buys one so much more understanding of the data. [<a href="http://www.25hoursaday.com/weblog/PermaLink.aspx?guid=27b4fb9a-37a6-4bbe-8a43-04f965f7a54e">Dare Obasanjo</a>]
</blockquote>
Dare aims this critique at RDF/SemWeb, not WinFS, but I'll take the liberty of extending it to both. And I'll argue that in theory, an information system based on explicit knowledge representation -- using triples, or relationships, or whatever flavor of item-linking you prefer -- is way more powerful than a system in which the same knowledge is available only implicitly. But in practice, I wonder if anybody, whether it's Tim Berners-Lee or the Longhorn architects, can mandate such an approach given the chaotic messiness of reality. My favorite Joshua Allen quote, for example, is this one -- which I also used in my <a href="http://udell.roninhouse.com/xml2003/nakedXML.html">XML 2003 keynote</a>:
<blockquote class="personQuote JoshuaAllen">
The lesson, of course, is that real-world information is chaotic. In any but the smallest "proof of concept" systems, the best that one can hope for is to be able to recognize small pockets of structure within a sea of otherwise unstructured information. [<a href="http://www.netcrucible.com/blog/2002/12/20.html#a263">Joshua Allen</a>]
</blockquote>
</p>
<p>
Maybe it depends how you construe "small pockets of structure." I've been getting decent mileage using nothing fancier than unschematized XML fragments. Microsoft, meanwhile, has taken a great leap forward in Office 2003 with support for schematized XML documents. The first glimmer of this stuff came <a href="http://weblog.infoworld.com/udell/2002/07/13.html">almost two years ago</a>. It shipped <a href="http://www.infoworld.com/article/03/10/03/39FEoffice_1.html">last fall</a>. If asked to paraphrase the Office XML strategy then, I'd have put it this way:
<blockquote>
Let's get schematized information out into the open, where any XML-aware tool can see it and touch it and work with it -- locally and globally, on Windows or any platform -- and then let's see what happens. If we play our cards right we'll broadly legitimize schematization, and we'll be able to use Windows to layer semantic value on top of it.
</blockquote>
If asked to paraphrase the WinFS strategy now, I'd put it this way:
<blockquote>
Let's put schematized information into Windows, where any CLR-aware Windows application can see it and touch it and work with it.
</blockquote>
</p>
<p>
The first strategy envisions a plurality of schemas arising from the grassroots. You won't often hear support for this strategy from Microsoft, but I heard it last fall at the Enterprise Architect Summit from Jean Paoli, who appeared (with Sun's Jon Bosak) on my panel <a href="http://weblog.infoworld.com/udell/2003/10/07.html#a821">Schemas in the wild</a>.
</p>
<p>
The second strategy envisions a canonical set of schemas woven tightly into Longhorn. Years from now it'll ship. Years later, it'll reach critical mass, developers will have mastered its APIs, and schema-aware Windows apps could start to make a "semantic" way of organizing and finding information real for lots of people. 
</p>
<p>
Why wait? Microsoft is telling us to disregard the grassroots Office XML strategy, which is here now and doesn't lock us in, in favor of the ivory-platform WinFS strategy, which is years away and does lock us in. If a compelling argument can be made for the second approach, I haven't seen it yet.
</p>
</body>
</item>


<item num="a1016">
<title>Watching people use software</title>
<date>2004/06/06</date>
<body>

<p>
<blockquote>
<table align="right" border="0" cellspacing="0" cellpadding="6">
<tr><td>
<a href="http://weblog.infoworld.com/udell/gems/searchingForEvents.jpg"><img width="250" src="http://weblog.infoworld.com/udell/gems/searchingForEvents.jpg"/></a>
</td></tr>
<tr><td>
<a href="http://weblog.infoworld.com/udell/gems/creatingHighlights.jpg"><img width="250" src="http://weblog.infoworld.com/udell/gems/creatingHighlights.jpg"/></a>
</td></tr>
</table>
Developers who possess deep but tacit knowledge of complex hardware and software environments are notoriously unable to project themselves into the beginner's mind. Observation is the only way to bridge the gap. [Full story at <a href="http://www.infoworld.com/article/04/06/04/23FEuser_1.html">InfoWorld.com</a>]
</blockquote>
This story grew out of my ongoing experimentation with capturing both live video and screen video. These technologies motivated <a href="http://weblog.infoworld.com/udell/2004/01/13.html#a885">two</a> <a href="http://weblog.infoworld.com/udell/2004/04/07.html#a968">columns</a> and a series of related blog entries (<a href="http://weblog.infoworld.com/udell/2004/01/26.html#a899">1</a>, <a href="http://weblog.infoworld.com/udell/2004/03/02.html#a931">2</a>, <a href="http://weblog.infoworld.com/udell/2004/03/04.html#a933">3</a>). When I got interested in this stuff, months ago, I figured there ought to be a market developing around it. As it turns out, that's happening. One of the products featured in this story -- TechSmith's Morae -- shipped in March. The other, UsersFirst's VisualMark, is just entering beta. Both are harbingers of what I expect will be an emerging trend: the pervasive use of live video and screen video, in combination, to observe and analyze how people really use (or fail to use) software.
</p>
<p>
The story includes an <a href="http://www.infoworld.com/article/04/06/04/23FEuser-sb_1.html">interview</a> with Chris Rockwell of <a href="http://www.lextant.com/">Lextant.com</a>, a company that specializes in user research and interaction design. I really enjoyed my interview with Chris. In this <a target="audio" href="http://weblog.infoworld.com/udell/gems/rockwell.mp3">9 minute clip</a> from our conversation, we discuss the value of raw user-experience instrumentation versus post-production highlights, the possibility of observing users throughout the lifecycle of deployed software, and the gap between users' and programmers' mental models. 
</p>
	

</body>
</item>



<item num="a1015">
<title>Optical illusions</title>
<date>2004/06/04</date>
<body>

<p>
<a href="http://toutfait.com/issues/issue_1/Articles/boat.html"><img align="right" src="http://toutfait.com/issues/issue_1/Articles/images/Cube.jpg"/></a>
<blockquote>
In 1832, the Swiss crystallographer Louis Albert Necker discovered his famously ambiguous cube, which seems to jump back and forth between two orientations. Given the same raw data -- a particular arrangement of a dozen line segments -- our brains find different ways to interpret it. ... The real integration challenge resides inside our heads. There is no single frame of reference for data. [Full story at <a href="http://www.infoworld.com/article/04/05/28/22OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
Apparently I've used this Necker analogy <a href="http://weblog.infoworld.com/udell/2003/03/18.html#a642">before</a>. But it aptly describes what we see happening this week, for example, as <a href="http://www.douglasp.com/PermaLink.aspx?guid=8843aa1c-6b0a-410f-81aa-5ba8064b6ee4">Doug Purdy</a>, <a href="http://www.neward.net/ted/weblog/index.jsp?date=20040603#1086326018156">Ted Neward</a>, <a href="http://blogs.msdn.com/dareobasanjo/archive/2004/05/28/143940.aspx">Dare Obasanjo</a>, and others bat around the implications of DataSets, doc/literal SOAP messages, and hierarchical vs. relational storage. 
</p>

</body>
</item>



<item num="a1014">
<title>ISBN Y2K+5</title>
<date>2004/06/04</date>
<body>

<p>
At the heart of <a href="http://weblog.infoworld.com/udell/LibraryLookup">LibraryLookup</a> there's a regular expression that matches a 10-digit ISBN. Wouldn't you know it, come January 1, 2005, that string of 10 digits grows to 13. Thanks to Tim Meadowcroft for the heads-up (via email, with permission). He adds:
<blockquote>
All 10 digit ISBN's can be converted to 13 digits by adding a 3 digit
standard code before them ("978" - it effectively puts all the existing
codes into a single namespace), but as the last ISBN character is a base
11 checksum digit (that's why it can be "X" but all other chars must be
digit 0-9), the last character will then change, see
<a href="http://www.isbn.org/standards/home/isbn/transition.asp">http://www.isbn.org/standards/home/isbn/transition.asp</a> for details.
</blockquote>
The ISBN numberspace is variably partitioned, sort of like class A, B, and C networks. A while ago I <a href="http://weblog.infoworld.com/udell/2003/01/07.html#a567">pointed</a> to Roger Costello's <a href="http://www.xfront.com/isbn.xsd">isbn.xsd</a>, a formidable XML schema that documents -- and validates -- a bunch of combinations of country ID and publisher ID. I'd hate to have to update that beast!
</p>
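<p>
Here is a rough sketch of the conversion Tim describes, in JavaScript. The function names are my own invention, not anything from the LibraryLookup script. The sketch assumes the usual check rules: the ISBN-10 check character uses weights 10 down to 2 modulo 11 (which is where "X" comes from), and the 13-digit form uses the EAN-13 rule of alternating weights 1 and 3 modulo 10.
</p>

```javascript
// Hypothetical helper names; not taken from the LibraryLookup source.

// Check character for the first nine digits of an ISBN-10.
// Weights run from 10 down to 2; a check value of 10 is written as 'X'.
function isbn10CheckChar(nineDigits) {
  var sum = nineDigits.split('').reduce(function (acc, ch, i) {
    return acc + Number(ch) * (10 - i);
  }, 0);
  var check = (11 - (sum % 11)) % 11;
  return check === 10 ? 'X' : String(check);
}

// Check digit for the first twelve digits of an ISBN-13 (the EAN-13 rule):
// weights alternate 1, 3, 1, 3, ... and the full total must be divisible by 10.
function isbn13CheckDigit(twelveDigits) {
  var sum = twelveDigits.split('').reduce(function (acc, ch, i) {
    return acc + Number(ch) * (i % 2 === 0 ? 1 : 3);
  }, 0);
  return String((10 - (sum % 10)) % 10);
}

// Convert an ISBN-10 (hyphens and spaces allowed) to its 13-digit form.
function isbn10to13(isbn10) {
  var clean = isbn10.replace(/[-\s]/g, '').toUpperCase();
  if (!/^\d{9}[\dX]$/.test(clean)) {
    throw new Error('not a 10-character ISBN: ' + isbn10);
  }
  var body = clean.slice(0, 9);
  if (isbn10CheckChar(body) !== clean.charAt(9)) {
    throw new Error('bad ISBN-10 check character: ' + isbn10);
  }
  // Drop the old check character, prepend "978", and recompute the check.
  var twelve = '978' + body;
  return twelve + isbn13CheckDigit(twelve);
}
```

<p>
For example, <tt>isbn10to13('0-306-40615-2')</tt> gives <tt>9780306406157</tt>: the nine body digits survive intact, but the trailing 2 becomes a 7 because the checksum is computed differently.
</p>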
<p>
I gather that the new 13-digit ISBN will be compatible with the <a href="http://www.autoid.org/Primer/ean_upc.htm">EAN / UPC</a> [European Article Numbering / Universal Product Code] system. How will the variably-partitioned EAN / UPC mesh with the variably-partitioned ISBN? Beats me.
</p>
<p>
None of the publishers I know are freaking out about this impending change, so maybe it's not a huge deal for them. Regular folks probably won't even notice, except when required to speak ISBNs or type them into search pages. Like IP addresses -- and increasingly, like phone numbers -- ISBNs are just opaque identifiers. We rely on the Domain Name System, Google, Amazon, and other services to map those identifiers to names we can deal with. 
</p>
<p>
In the digital realm this works out just fine. It's a bit shocking, though, when we reach for these mappings in the analog world and can't lay our hands on them. The classic dilemma: you call directory assistance from a cellphone, while driving, and try to remember the spoken digits long enough to dial them. My current solution: record a voicenote of the spoken number, and play it back a couple of times until it sinks into short-term memory. (I could pay for them to dial, but that would just gall me, and wouldn't plant the number in my phone.) The next step: do it as data, not voice. (Outside the valley of cellphone despair known as New Hampshire this is pretty common, I'm told.) After that: I dunno, but Ray Kurzweil figures we'll have <a href="http://udell.roninhouse.com/bytecols/1999-12-21.html">ported consciousness to new hardware</a> by then, which may solve naming and addressing once and for all. Or not.
</p>
	
</body>
</item>


<item num="a1013">
<title>Broadcatching: the RSS-ification of television news</title>
<date>2004/06/03</date>
<body>

<p>
A Webjay user named <a href="http://webjay.org/by/webjaybs">Brett Singer</a> has been conducting an interesting experiment: a <a href="http://webjay.org/by/webjaybs/newsvideo-daily">playlist of daily news clips</a>. (Like all Webjay playlists, it can be <a href="http://webjay.org/by/webjaybs/newsvideo-daily.xml">subscribed in RSS</a>.) I heard recently that TV remains the primary news source for three-fourths of Americans. Can that possibly still be true? I never watch TV news. But this new clip feed might change that, at least a little. TV has the resources to do things like <a href="rtsp://real.cbsig.net/cbsnews/2004/05/31/video620422.rm">take you to the North Pole</a> to see and hear a scientist evaluate the melting ice pack, and a military analyst discuss the implications of an ice-free northwest passage. I won't watch something like that on CBS's schedule, and I won't even watch it on TiVo's schedule (since TiVo doesn't have the granularity for named two-minute segments), but I might find two minutes to watch it on RSS's schedule. 
</p>
<p>
There's not a huge diversity of sources here -- the clips I've seen are mostly CBS, with some BBC and PBS. But that's already enough to give you a taste of what the RSS-ification of TV news will be like. It'll be a smorgasbord from which you sample, without regard for media brand, in response to the recommendations of your trusted group -- who are in turn influenced by your recommendations.
</p>
<p>
<a href="http://www.instat.com/press.asp?ID=968&amp;sku=IN0401238ME"><img hspace="6" vspace="6" align="right" src="http://www.instat.com/charts/2004/IN0401238ME_ch.gif"/></a>
Webjay's creator Lucas Gonze uses the term <a href="http://gonze.com/weblog/story/5-20-4">broadcatching</a>, which seems to have arisen at the intersection of <a href="http://www.google.com/search?q=broadcatching%20rss%20bittorrent">RSS and BitTorrent</a>. Given the <a href="http://www.instat.com/press.asp?ID=968&amp;sku=IN0401238ME">relatively slow start for personal video recorders</a>, it could take quite a while for this second-order phenomenon to catch on. If the PVR numbers that In-Stat/MDR has made up are even in the ballpark -- 40 million PVRs worldwide in 2008, extrapolated from 4.6 million this year and 1.5 million last year -- the RSS-ification of TV news can fly under the radar for at least a few years while CBS et al. absorb the impact of TiVo. And that's probably a good thing. Because if pages like <a href="http://www.cbsnews.com/sections/i_video/main500251.shtml">this</a> become pages like <a href="http://www.real.com/partners/cnn/">this</a> too soon, the collaborative thing won't get a chance to happen. 
</p>

</body>
</item>

<item num="a1012">
<title>Questions about Longhorn, part 1: WinFS</title>
<date>2004/06/02</date>
<body>

<p>
Over the next few days I want to explore a series of questions about the "pillars" of Longhorn -- WinFS, Avalon, and Indigo. Last fall, when this stuff was first announced, I reacted with an entry entitled <a href="http://weblog.infoworld.com/udell/2003/10/31.html">Replace and Defend</a>. I argued then that Longhorn reinvents quite a few wheels. Nobody can blame Microsoft for seeking new ways to keep customers locked into its Windows franchise. That's a business strategy that every rational player must pursue, in one way or another. In chapter 6 of <a href="http://www.inforules.com/">Information Rules</a>, entitled <i>Managing Lock-In</i>, Carl Shapiro and Hal Varian write:
<blockquote>
The great fortunes of the information age lie in the hands of companies that have successfully established proprietary architectures that are used by a large installed base of locked-in customers. And many of the biggest headaches of the information age are visited upon companies that are locked into information systems that are inferior, orphaned, or monopolistically supplied. 
</blockquote>
There's no question that Longhorn aims for lock-in -- it has to. But what is the nature of the bargain that's being offered? What kinds of benefits will it yield? And what kinds of headaches will accompany those benefits? 
</p>
<p>
With respect to WinFS, Longhorn's new storage system -- an object/relational engine that also doubles as a conventional file system -- the claimed benefits are:
</p>
<ul>
<li><p>Finding stuff.
Those of us who sometimes blog things just so we'll be assured of finding them later have a special appreciation of the absurdity of the current situation. Unless we use an add-on to Windows such as <a href="http://www.x1.com/">X1</a>, we can often find things on the Internet more easily and more reliably than we can find things on our own hard disks. 
</p>
</li>
<li><p>Organizing stuff.
We know that hierarchical foldering systems adapt poorly to the chaos of real life. Unix has always supported the concept of symbolic links, which give you the flexibility to construct alternate paths to the same thing. And indeed, modern versions of Windows do too. A little-known fact is that <a href="http://www.sysinternals.com/ntw2k/source/misc.shtml#junction">Junction</a>, yet another wonderful utility from the indefatigable Mark Russinovich, enables you to create and delete symbolic links on Win2K or WinXP. But symlinking isn't something any normal user would be able to do routinely, and in any case it doesn't really solve the essence of the organizational problem, which is that we want to be able to group items dynamically based on the contents of individual items, and also -- crucially -- on relationships that tie sets of items together. 
</p>
</li>
</ul>
<p>
Who wouldn't want these benefits? The way in which Microsoft proposes to deliver them, though, contains some assumptions that I'd like to start unpacking. Let's start with the first benefit: finding stuff. Here's an example of a Longhorn search scenario:
<blockquote>
For example, a user may want to use some pictures taken on a family vacation on her business Web site to promote a sale. She can tag these pictures already stored in a "\Family\Vacation\Photos" folder with a "Promote Sale" keyword when the sale begins. The application managing her Web site can then load all the pictures of this category and have them displayed as a slide show. When the sale ends, she can remove the tag from the pictures in a "WinFS" store. The website will stop showing them to the site visitors afterwards. [<a href="http://longhorn.msdn.microsoft.com/lhsdk/winfs/wfconlonghornstoragesubsystem.aspx">Longhorn SDK Documentation</a>]
</blockquote>
</p>
<p>
There's no need to wait until 2007 to see what this would be like. Just now, for example, I opened up Word 2003, wrote a short document, assigned it the keyword "Promote Sale," and saved it as XML. Here's a script to insert the document into a Berkeley DB XML database:
<pre class="code python">
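from dbxml import *

db = 'winfs.dbxml'
# Open the XML container, creating it if it doesn't exist yet
container = XmlContainer(None, db)
container.open(None, DB_CREATE)
# Wrap the saved Word XML file in a document and store it
doc = XmlDocument()
item = open('myDocument.xml').read()
doc.setContent(item)
container.putDocument(None, doc)
container.close()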
</pre>
</p>
<p>
And here's a script that finds that document in the database, based on the keyword:
<pre class="code python">
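from dbxml import *

db = 'winfs.dbxml'
# Open the existing container (no DB_CREATE this time)
container = XmlContainer(None, db)
container.open(None)
# Bind the 'o' prefix to the Office namespace so the XPath can use it
context = XmlQueryContext(0,0)
context.setNamespace('o', 'urn:schemas-microsoft-com:office:office')
# Match any o:Keywords element whose text contains the tag
xmlResults = container.queryWithXPath(None,
    "//o:Keywords[contains(.,'Promote Sale')]", context)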
</pre>
A growing number of applications -- notably, Microsoft's own latest generation of Office apps -- can store XML data in ways amenable to XPath search. The same XML data will be open to the more powerful kinds of search available in the newer XML technologies now coming online: XPath 2.0, XQuery. Meanwhile, a growing number of databases are gearing up to do this kind of search efficiently, often in combination with both relational and free-text querying. 
</p>
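<p>
The same keyword match can be sketched without a database, using only Python's standard library. This is an illustration under assumptions, not a substitute for the scripts above: the stand-in document built here is hypothetical and far simpler than a real Word 2003 file, and because ElementTree's XPath subset has no <tt>contains()</tt> function, the substring test is done in Python instead.
</p>

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:schemas-microsoft-com:office:office"


def has_keyword(root, keyword):
    """Return True if any o:Keywords element under root contains the keyword."""
    for element in root.iter("{%s}Keywords" % OFFICE_NS):
        if keyword in (element.text or ""):
            return True
    return False


# Build a tiny stand-in document (hypothetical structure, for illustration only)
doc = ET.Element("document")
props = ET.SubElement(doc, "{%s}DocumentProperties" % OFFICE_NS)
ET.SubElement(props, "{%s}Keywords" % OFFICE_NS).text = "Promote Sale"

print(has_keyword(doc, "Promote Sale"))
```

<p>
For a file saved from Word, the root would come from <tt>ET.parse('myDocument.xml').getroot()</tt> rather than from a hand-built tree.
</p>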
<p>
The power of pervasive free-text search, by the way, is something that Microsoft seems consistently to underestimate. Outlook, even in its latest incarnation, is helpless to find anything quickly. Everybody has to rely on third-party add-ons for this essential function. There's a hole in the market that you could drive a truck through, and the name on the side of that truck is Gmail, but I digress.
</p>
<p>
Here's the point of this installment. To the extent that our personal information stores contain information represented in XML, we have standard ways to search them. What's more, two powerful trends point to a brighter future for this scenario: the growing use of open XML file formats, and the steady advance of databases that can index and search XML content. WinFS embraces neither trend, and that looks to me like a looming headache. Personal information management, in Longhorn, will be a walled garden with its own notion of schema, and its own query language. To give users the benefit of finding stuff, Longhorn-style, developers will have to implement the Longhorn model. And then they'll have to find ways to unify that approach with the XML-oriented model prevailing in the world at large -- and indeed, even on pre-Longhorn Windows systems.
</p>
<p>
The justification for this headache, if there is one, must lie not in the realm of "finding stuff" but in the realm of "organizing stuff." WinFS relationships, in other words, must be capable of delivering such compelling benefits that there was no choice but to invent a proprietary storage model from the ground up. I'll explore that proposition next time.
</p>

</body>
</item>


<item num="a1011">
<title>Five guys talking</title>
<date>2004/06/01</date>
<body>

<p>
Tim Bray raises some good questions about last week's <a href="http://www.itconversations.com/shows/detail145.html">Gillmor Gang</a> episode:
<blockquote class="personQuote TimBray">
First of all, a transcript would be so much better; I don't have an hour to listen and if I did it would be in my car, and even if I tried, sitting here in my office (even though the audio is excellent) my attention is continually getting pulled away by email or instant messages or red letters in NetNewsWire or whatever. If I'm writing code or a tricky position paper or reading something material or even just thinking about a hard problem I can tune out the distractions no problem, but four guys talking? The mind wanders. [<a href="http://www.tbray.org/ongoing/When/200x/2004/05/31/SOATalk">ongoing</a>]
</blockquote>
I agree. Doug Kaye is working on providing transcripts, but it's a hard problem and a thankless chore.
Meanwhile, I've been exploring a middle-ground approach. I went through the first half of the show, in which various aspects of service-oriented architecture were batted around, and added a layer of indexing and annotation. The result: <a href="http://weblog.infoworld.com/udell/gems/itconv3.smil">this SMIL presentation for the Real player</a>. Note: Clicking an index link will seek in the audio stream and synch the annotations panel, but (at least for me) won't always actually play the audio at that location unless you click again. (Annoying. Why is that?)
</p>
<p>
<a href="http://weblog.infoworld.com/udell/gems/itconv3.smil"><img vspace="6" hspace="6" align="right" src="http://weblog.infoworld.com/udell/gems/itconv3.jpg"/></a>
Here's my thinking. Even for big media organizations with big budgets, it's a struggle to get audio transcriptions done quickly and well. But maybe, with the right set of tools, it'd be feasible to create a layer of indexing plus annotation that would contextualize and give meaningful random access to the audio stream.
</p>
<p>
Working through the process gave me a clearer sense of what tools we already have, and what tools we'd need to make it practical. A huge enabler is the ability to rely on a standard Web server, rather than a specialized streaming server. Last week, I <a href="http://weblog.infoworld.com/udell/2004/05/26.html#a1009">indexed and annotated</a> a downloadable RealVideo file. The same principle applies to a downloadable MP3, and that's the core of today's experiment. 
</p>
<p>
The challenge then becomes to isolate segments, form links to the beginning of each segment, and pair each audio segment with annotations displayed in another pane. I found the Winamp player really helpful for fine-tuning start/stop times. You can use CTRL-J to jump to a minutes:seconds location, and can use the arrow keys to jump forward or backward in 5-second increments. It gets tedious to subtract by minutes:seconds in order to arrive at durations, but there are calculators that can help with that. 
</p>
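<p>
The minutes:seconds arithmetic is easy to script. Here's a minimal sketch in Python; the function names and the sample start times are hypothetical, chosen only to show the subtraction.
</p>

```python
def to_seconds(timecode):
    """Convert 'minutes:seconds' (or plain seconds) to a number of seconds."""
    seconds = 0
    for part in timecode.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def to_timecode(seconds):
    """Format a number of seconds as 'minutes:seconds'."""
    minutes, secs = divmod(seconds, 60)
    return "%d:%02d" % (minutes, secs)


def segment_durations(start_times, end_time):
    """Return each segment's duration in seconds, given the segment
    start times and the time at which the last segment ends."""
    marks = [to_seconds(t) for t in start_times] + [to_seconds(end_time)]
    return [later - earlier for earlier, later in zip(marks, marks[1:])]


# Hypothetical start times, for illustration only
starts = ["0:00", "0:38", "1:10", "1:32"]
print(segment_durations(starts, "3:08"))
```

<p>
Each duration is just the gap between one start time and the next, so the only input that needs care is the list of start times read off the player.
</p>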
<p>
These conveniences only scratch the surface, of course. We're left with plenty of roadblocks. There was nothing to help construct the index, organize the annotations into a set of panes, or orchestrate linking from the index to the annotations. More woes: The result is specific to the Real player. It won't even work in QuickTime, which has SMIL support, never mind in Windows Media Player, which doesn't. Another possible issue: the index and annotations, encapsulated in .smil and .rt (RealText) files respectively, are (I suspect) opaque to Google, which defeats the purpose of using the annotations to make the audio partly searchable. And the elephant in the room: Real's RealText isn't HTML, and the Real player isn't a browser. We can awkwardly embed AV content in a text/graphics viewer (i.e., browser), or awkwardly embed text and graphics in an AV player, but we've never satisfactorily united the two modes.
</p>
<p>
Suppose we magically healed this longstanding breach. Suppose further that, in some hypothetical browser/player, we could even author for the combined medium -- for example, by capturing timecoded annotations in realtime, <a href="http://www.codingmonkeys.de/subethaedit/">SubEthaEdit</a>-style, or by collecting and presenting the URLs that participants visit during the event. Would the kind of hybrid presentation I'm envisioning still be a poor substitute for a complete transcript? If you had such a transcript, would the audio still be valuable, and if so, in what ways? 
</p>
<p>
I can't answer these questions yet, but it's a fascinating area to explore -- and not only from the perspective of four (actually, five) guys talking on an IT radio show. Think about the meetings you attend. Think about the note-taking that does (or doesn't) occur in those meetings. Imagine being able to efficiently review what was actually said, not just what was summarized, when making decisions. In that situation, a complete transcript -- even if one could be produced cheaply and accurately -- won't tell the whole story. Recorded speech, linked to searchable annotations, would be an amazing enhancement to routine business communication.
</p>


</body>
</item>

<item num="a1010">
<title>The artful logger</title>
<date>2004/05/27</date>
<body>

<p>
<blockquote>
I confess to a deep fascination with the seemingly mundane topic of logging. Software crashes, shopping cart abandonment, and security breaches are among the many situations in which you'll find yourself poring over logs trying to figure out what went wrong.
<br/><br/>...<br/><br/>
Logs can flood us with information, or they can tell us compelling stories. We can influence the outcome by artful and iterative refinement of the data we collect. [Full story at <a href="http://www.infoworld.com/article/04/05/21/21OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
Years ago -- it must have been more than a decade, because Win95 was then a beta product code-named Chicago -- I made a trip to Microsoft to be briefed on OS strategy. Win32 was young then, and its transplantation from NT onto the Win9x codebase was a big deal. Most of Win32 was slated to make the trip, but a few things got left behind, and the omission that most disturbed me was event logging.
</p>
<p>
The event log subsystem was left on the cutting room floor, an executive told me, because hard choices had to be made in order to bring Win95 in under its 4MB memory budget. This was not so absurd as it now sounds. Win95's competition was Windows 3.1, which could run in 4MB. (As it turned out, of course, nobody ran Win95 in less than 8MB.) But while granting the case for prudent conservation of scarce resources, I argued that it was vital to get developers of mainstream Windows apps into the habit of logging not just outright failures and errors, but also routine status information that could be used to analyze patterns of software use and guide incremental improvement of software.
</p>
<p>
Developers of server applications were then already making liberal use of the event log. If the hordes of developers coming to Win95 from Windows 3.1 weren't immediately enabled (and expected) to do the same, I argued, an opportunity to improve software quality would be lost for a generation.
</p>
<p>
So here we are in 2004, I'm running Windows XP on my desktop, and there's essentially no interesting data in the Event Viewer's Application log. What are some examples of things I'd like to see there? Off the top of my head:
<ul>
<li><p>Warnings.
If the same warning appears repeatedly (or perhaps a set of related warnings spanning several apps), it's a sign that there's a problem with the software, or with the user's understanding of the software, or both. If we don't log these warnings, though, we can't detect patterns and respond to them.
</p></li>
<li><p>Settings changes. 
As a user, how many times have you tried to remember what settings were in place when something that's broken used to work? As a developer, how many times have you tried to get users to remember what they changed? Aren't such changes important events in the life of an application, worthy of logging?
</p></li>
<li><p>Launch and exit events.
These are the most basic and obvious things to record, but we don't find them in the log. If we're going to move toward "software as a service," shouldn't we keep track of what's used and how often?
</p></li>
</ul>
</p>
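<p>
To make the list concrete, here's a rough sketch of what logging those three kinds of events might look like, using Python's standard <tt>logging</tt> module. Everything named here (the <tt>exampleapp</tt> logger, the <tt>Settings</tt> class, the <tt>autosave</tt> setting) is hypothetical, and an in-memory stream stands in for a real log destination. On Windows, <tt>logging.handlers.NTEventLogHandler</tt> can route the same records to the event log, though it requires the Win32 extensions for Python.
</p>

```python
import io
import logging


def make_app_logger(name, stream):
    """Return a logger that writes timestamped events to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # keep these events out of the root logger
    return logger


class Settings:
    """A small settings store that logs every change with old and new values."""

    def __init__(self, logger, initial=None):
        self._logger = logger
        self._values = dict(initial or {})

    def set(self, key, value):
        old = self._values.get(key)
        self._values[key] = value
        self._logger.info("setting changed: %s %r -> %r", key, old, value)


log_stream = io.StringIO()  # stands in for a real log destination
log = make_app_logger("exampleapp", log_stream)

log.info("launch")
settings = Settings(log, {"autosave": False})
settings.set("autosave", True)
log.warning("disk nearly full")
log.info("exit")

print(log_stream.getvalue())
```

<p>
The point of recording the old value alongside the new one is that the log can then answer the "what did I change?" question directly.
</p>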
<p>
Ironically there are much more detailed logs of our routine software activities on other people's machines (i.e., on Web servers) than on our own. There's no reason why this has to be so, and plenty of reasons why it shouldn't be. It's an accident of history, really. A questionable decision made during an era of resource scarcity now serves us badly in this era of abundance. 
</p>
								   
</body>
</item>

<item num="a1009">
<title>The future of conferences</title>
<date>2004/05/26</date>
<body>

<p>
<a href="http://weblog.infoworld.com/udell/gems/asterisk.smil"><img align="right" width="300" height="268" src="http://weblog.infoworld.com/udell/gems/asterisk_s.jpg"/></a>
Yesterday I had a phone conference with Hadar Pedhazur of <a href="http://www.opticality.com">Opticality Ventures</a>, during which Hadar mentioned that he's been using <a href="http://www.asterisk.org/">Asterisk</a>, a Linux-based software PBX, with great success. Although Asterisk is VoIP-capable, Hadar's using <a href="http://store.yahoo.com/asteriskpbx/wildcardx100p.html">cheap ($100) Digium cards</a> to manage and route calls among his various business-related  POTS lines. That really got my attention; I've long wanted such a capability. So I did some reading, and I also watched this <a href="http://graphics.cs.uni-sb.de/VCORE/Publications/mark_spencer/mark.smil">presentation</a> given by <a href="http://www.digium.com/">Digium's</a> founder and Asterisk's developer, Mark Spencer. 
</p>
<p>
I can't say more about Asterisk until I've had a chance to try it, but I do want to note that Mark's presentation -- a RealVideo stream synchronized to a slide show -- was extremely effective. The talk was given during Linux-Kongress 2003, at Universit&#228;t des Saarlandes, home of the <a href="http://graphics.cs.uni-sb.de/VCORE/">Virtual Courseroom Environment (VCORE)</a> project. The project page notes:
<blockquote>
Although many aspects of streaming multimedia are well understood there are many open questions concerning a real world implementation.
</blockquote>
You can say that again. Let's look at how this version delivers the content. The presentation's URL links to a <a href="http://www.w3.org/TR/REC-smil/">SMIL</a> document which contains:
<ul>
<li><p>The URL of the video, an HTTP-accessible RealVideo (.rm) file.</p></li>
<li><p>A series of pointers to JPG renderings of the slides.</p></li>
</ul>
The VCORE system has cleverly taken care of details like:
<ul>
<li><p>Determining the duration of the video and encoding that in SMIL.</p></li>
<li><p>Acquiring the JPG renderings.</p></li>
<li><p>Determining the transition points and encoding the duration of each slide accordingly.</p></li>
</ul>
</p>
<p>
This is pretty nice! I've seen this done occasionally, but it's hardly routine. Since I'm on a random access kick lately, I decided to see what it would take to add an index to the presentation. A lovely example of how to do that can be found <a href="http://cobra.gslis.utexas.edu:8080/ramgen/gracy5.smil">here</a>, courtesy of <a href="http://www.gslis.utexas.edu/~l384k9/smil/smilindex.html">UT Austin's David Gracy</a>. Note that this example dates all the way back to 1999, yet demonstrates something quite compelling that we rarely see even today.
</p>
<p>
While I was at it, I explored how to incorporate the slides as text, rather than as images. Here's the result: a <a href="http://weblog.infoworld.com/udell/gems/asterisk.smil">three-minute clip</a> that includes the first four slides of Spencer's talk, and an index that links to each of the four transitions. The SMIL wrapper, playable in RealOne, looks like this:
</p>
<pre>
&lt;smil>
&lt;head>
&lt;layout>
&lt;root-layout width="640" height="480" />
  &lt;region id="text_region" width="320" height="480" left="0" top="0" />
  &lt;region id="video_region" width="320" height="240" left="320" top="0" />
  &lt;region id="text_region2" width="320" height="240" left="320" top="240" />
&lt;/layout>
&lt;/head>
&lt;body>
&lt;par>
  &lt;textstream src="http://weblog.infoworld.com/udell/gems/asterisk_index.rt" 
    region="text_region" dur="3:08"/>
  &lt;video src="http://graphics.cs.uni-sb.de/VCORE/Publications/\
    mark_spencer/Data/mark.rm?start=0:0&amp;end=3:08" 
    region="video_region"/>
  &lt;seq dur="3:08">
  &lt;textstream src="http://weblog.infoworld.com/udell/gems/asterisk1.rt" 
    region="text_region2" dur="38"/>
  &lt;textstream src="http://weblog.infoworld.com/udell/gems/asterisk2.rt" 
    region="text_region2" dur="32"/>
  &lt;textstream src="http://weblog.infoworld.com/udell/gems/asterisk3.rt" 
    region="text_region2" dur="22"/>
  &lt;textstream src="http://weblog.infoworld.com/udell/gems/asterisk4.rt" 
    region="text_region2" dur="96"/>
  &lt;/seq>
&lt;/par>
&lt;/body>
&lt;/smil>
</pre>
<p>
Here are some things I discovered:
</p>
<ul>
<li><p>Random access over HTTP. The other day I <a href="http://weblog.infoworld.com/udell/2004/05/18.html#a1003">mentioned</a> how HTTP 1.1 enables some players (RealOne, Winamp) to randomly access audio on a vanilla Web server. The same holds true for RealOne going against RealVideo content. However, although you can jump to a random location, it takes noticeably longer than when you do the same thing with a streaming server. 
</p>
</li>
<li><p>Minutes-and-seconds notation. In the <a href="http://weblog.infoworld.com/udell/gems/asterisk_index.rt">index file</a> you can write URLs in terms of minutes-and-seconds, not the byte-range lingo that the client and server speak. These URLs aren't available outside that context, though. And while it's possible to package up the start/stop syntax into a .ram file that you can point a browser at, you can't (so far as I know) form an URL that indexes into the SMIL assembly.
</p>
</li>
<li><p>Text formatting is lame, but workable. Here's <a href="http://weblog.infoworld.com/udell/gems/asterisk4.rt">an example</a> of a slide written using Real's markup. 
</p></li>
</ul>
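<p>
The byte-range lingo mentioned above is the HTTP 1.1 <tt>Range</tt> header. As a small sketch, here's how a client could form such a request in Python. The URL and byte offsets are hypothetical, and how a player maps a minutes-and-seconds position onto a byte offset is not something this sketch attempts.
</p>

```python
import urllib.request


def range_request(url, first_byte, last_byte):
    """Build (but do not send) a GET request for an inclusive byte range."""
    if first_byte > last_byte:
        raise ValueError("first_byte must not exceed last_byte")
    headers = {"Range": "bytes=%d-%d" % (first_byte, last_byte)}
    return urllib.request.Request(url, headers=headers)


# Hypothetical URL and offsets, for illustration only
req = range_request("http://example.com/talk.mp3", 1000000, 1999999)
print(req.get_header("Range"))
```

<p>
A server that honors the header answers with status 206 (Partial Content) and only the requested bytes, which is why no specialized streaming server is needed.
</p>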
<p>
This still isn't a great solution, but it's instructive to see what can be done with late-nineties technology. Given XML slide markup and a means of capturing transition timecodes, a VCORE-like system should be able to generate this kind of indexed presentation automatically. The slide content might need to be streamlined and simplified, but you could also link out from slides to richer Web pages if needed.
</p>
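<p>
As a rough sketch of that automation, the sequence of slide panes can be generated from a list of transition times. The Python below is an illustration, not the VCORE method: the function name is made up, and the transition times are simply the ones implied by the durations in the hand-written wrapper above.
</p>

```python
import xml.etree.ElementTree as ET


def build_seq(slide_urls, transition_seconds, total_seconds,
              region="text_region2"):
    """Build a SMIL seq element holding one textstream per slide.

    transition_seconds[i] is the moment slide i first appears; the last
    slide runs until total_seconds.
    """
    seq = ET.Element("seq", dur="%d:%02d" % divmod(total_seconds, 60))
    marks = list(transition_seconds) + [total_seconds]
    for url, start, end in zip(slide_urls, marks, marks[1:]):
        ET.SubElement(seq, "textstream", src=url, region=region,
                      dur=str(end - start))
    return seq


base = "http://weblog.infoworld.com/udell/gems/asterisk%d.rt"
urls = [base % n for n in range(1, 5)]

# Transition times in seconds, inferred from the wrapper's dur values
seq = build_seq(urls, [0, 38, 70, 92], 188)
print(ET.tostring(seq, encoding="unicode"))
```

<p>
Given those four transition times and a total length of 188 seconds, the generated durations are 38, 32, 22, and 96, the same values as in the hand-written SMIL.
</p>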
<p>
Conferences would be <i>so</i> much more useful if this were the norm. As an attendee, I should expect that when I return home, I'll have slide-by-slide random access to every talk. What's more, I should expect to be able to search for slide text and jump into presentations at the found locations. Remote attendees, meanwhile, would be able to purchase this level of access, thus defraying the cost of providing it.
</p>
<p>
The deluxe solution, of course, would make all these entry points bloggable by surfacing external URLs. But just having basic indexing done comprehensively and reliably would be a huge step forward. It sucks not to be able to take that for granted.
</p>
<p>
So we schlep to conferences, make painful choices between conflicting sessions, and feel vaguely guilty when we miss lots of them anyway because we're busy schmoozing. It's great to be bathed in WiFi signal at conferences nowadays, and able to blog them in realtime. Now that we've solved outbound access from the venue, let's solve inbound access to the content.
</p>


</body>
</item>


<item num="a1008">
<title>Threat modeling</title>
<date>2004/05/25</date>
<body>

<p>
Michael Howard <a href="http://blogs.msdn.com/michael_howard/archive/2004/05/24.aspx">points</a> to a free <a href="http://www.microsoft.com/downloads/details.aspx?FamilyID=62830f95-0e61-4f87-88a6-e7c663444ac1">threat modeling tool</a> written by Frank Swiderski, author of the forthcoming book <a href="http://www.microsoft.com/MSPress/books/6892.asp">Threat Modeling</a>. The evolving formal discipline of threat modeling first came to my attention in 2000, when I read Bruce Schneier's <a href="http://www.amazon.com/exec/obidos/tg/detail/-/0471453803/">Secrets and Lies</a>. This picture, from chapter 21 of that book, is worth a thousand words:
</p>
<p align="center"><img border="1" src="http://weblog.infoworld.com/udell/gems/attackSafe.jpg"/>
</p>
<p>
One way to gauge the growing interest in threat modeling -- at
Microsoft and elsewhere -- is to compare its coverage in the two
editions of Michael Howard's <i>Writing Secure Code</i> (<a href="http://safari.oreilly.com/0735615888">1</a>, <a href="http://safari.oreilly.com/0735617228">2</a>). In the first edition, threat modeling is mentioned in a section of Chapter 2, <i>Designing Secure Systems</i>. In the second edition, it becomes a chapter in its own right. 
</p>
<p>
Swiderski's tool is a GUI-based .NET app that collects tree-structured
information about entry points, protected resources, and threats. If
you have the Visio drawing control, you can use that to add data flow
diagrams; otherwise, the tool includes a simple diagram editor. The
classifications defined by the STRIDE methodology -- Spoofing,
Tampering, Repudiation, Information Disclosure, Denial of Service, and
Elevation of Privilege -- are available as checkboxes. Likewise the
classifications defined by the DREAD methodology -- Damage Potential,
Reproducibility, Exploitability, Affected Users, Discoverability -- are
available as numeric choices (1-10). ("The concepts of STRIDE and DREAD
were conceived, built upon, and evangelized at Microsoft by Loren
Kohnfelder, Praerit Garg, Jason Garms, and Michael Howard." -- <i>Writing Secure Code</i>, 2nd Edition)
</p>
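<p>
To show the shape of the data being collected, here's a minimal sketch of a threat record in Python. It is not the tool's data model: the class, the sample ratings, and the averaging rule are assumptions for illustration. Averaging the five DREAD ratings is one common way to get a single risk number, but the tool's own scoring isn't described here.
</p>

```python
from dataclasses import dataclass, field

STRIDE = ("Spoofing", "Tampering", "Repudiation", "Information Disclosure",
          "Denial of Service", "Elevation of Privilege")
DREAD = ("Damage Potential", "Reproducibility", "Exploitability",
         "Affected Users", "Discoverability")


@dataclass
class Threat:
    name: str
    stride: set = field(default_factory=set)   # checked STRIDE categories
    dread: dict = field(default_factory=dict)  # DREAD category -> rating 1..10

    def validate(self):
        """Reject unknown STRIDE categories and missing or out-of-range ratings."""
        unknown = self.stride - set(STRIDE)
        if unknown:
            raise ValueError("unknown STRIDE categories: %s" % sorted(unknown))
        for category in DREAD:
            rating = self.dread.get(category)
            if rating is None or rating not in range(1, 11):
                raise ValueError("%s needs a rating from 1 to 10" % category)

    def dread_average(self):
        """Average the five ratings (an assumed scoring rule)."""
        self.validate()
        return sum(self.dread[c] for c in DREAD) / len(DREAD)


# Hypothetical classifications and ratings, for illustration only
threat = Threat(
    name="Penetration",
    stride={"Tampering", "Information Disclosure"},
    dread={"Damage Potential": 8, "Reproducibility": 10, "Exploitability": 7,
           "Affected Users": 10, "Discoverability": 10},
)
print(threat.dread_average())
```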
<p>
I've always been suspicious of the kinds of software tools that just
provide bookkeeping support for some methodology. Of course, a
methodology that people can actually understand and use is really just
a formalization of common sense, and I think the STRIDE/DREAD stuff
falls into that category. </p>
<p>
Here's a report on a very simple threat model, generated from the XML data captured by the tool:
</p><div style="border-style: solid; border-width: thin; padding: 10px;"><p>

</p><p class="MsoNormal"><b><span style="font-size: 20pt; font-family: Arial;">
	Threat Model: XPath Query Service</span></b></p>
<h1><a name="_Toc31961280">Threat Model Information</a></h1>
<p>

</p><p>

</p><table class="fes1" border="1" cellspacing="0" cellpadding="0" style="border: medium none ; border-collapse: collapse;">
<thead>
<tr style="page-break-inside: ;">
<td colspan="2" valign="top" style="border: 1pt solid windowtext; padding: 0in 5.4pt; background: black none repeat scroll 0% 50%; -moz-background-clip: initial; -moz-background-origin: initial; -moz-background-inline-policy: initial;">
<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;">Information</span></p>
</td>
</tr>
</thead><tbody><tr>
<td valign="top" style="border-style: none solid solid; border-color: -moz-use-text-color windowtext windowtext; border-width: medium 1pt 1pt; padding: 0in 5.4pt; background: rgb(204, 204, 204) none repeat scroll 0% 50%; -moz-background-clip: initial; -moz-background-origin: initial; -moz-background-inline-policy: initial;">
<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;">
				Owner
			</span></p>
</td>
<td valign="top" style="border-style: none solid solid none; border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-width: medium 1pt 1pt medium; padding: 0in 5.4pt; background: rgb(204, 204, 204) none repeat scroll 0% 50%; -moz-background-clip: initial; -moz-background-origin: initial; -moz-background-inline-policy: initial;">

<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;">Jon Udell</span></p>
</td>
</tr>
<tr>
<td valign="top" style="border-style: none solid solid; border-color: -moz-use-text-color windowtext windowtext; border-width: medium 1pt 1pt; padding: 0in 5.4pt;">
<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;">
				Participants
			</span></p><p>

</p></td>
<td valign="top" style="border-style: none solid solid none; border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-width: medium 1pt 1pt medium; padding: 0in 5.4pt;">
<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;"></span></p>
</td>
</tr>
<tr>

<td valign="top" style="border-style: none solid solid; border-color: -moz-use-text-color windowtext windowtext; border-width: medium 1pt 1pt; padding: 0in 5.4pt; background: rgb(204, 204, 204) none repeat scroll 0% 50%; -moz-background-clip: initial; -moz-background-origin: initial; -moz-background-inline-policy: initial;">
<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;">
				Reviewer
			</span></p>
</td>
<td valign="top" style="border-style: none solid solid none; border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-width: medium 1pt 1pt medium; padding: 0in 5.4pt; background: rgb(204, 204, 204) none repeat scroll 0% 50%; -moz-background-clip: initial; -moz-background-origin: initial; -moz-background-inline-policy: initial;">
<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;"></span></p>
</td>
</tr>
<tr><td valign="top" style="border-style: none solid solid; border-color: -moz-use-text-color windowtext windowtext; border-width: medium 1pt 1pt; padding: 0in 5.4pt;">
<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;">
				Description
			</span></p>
</td>
<td valign="top" style="border-style: none solid solid none; border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-width: medium 1pt 1pt medium; padding: 0in 5.4pt;">

<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;">Jon's XPath query service</span></p>
</td>
</tr>
</tbody></table>
<p class="MsoNormal"></p>
<h1><a name="_Toc31961293">
	Threats
</a></h1><p>

</p><p class="MsoNormal"></p>
<p>

</p><p>

</p><table class="MsoTableGrid" border="1" cellspacing="0" cellpadding="0" style="border: medium none ; border-collapse: collapse;">

<thead>
<tr style="page-break-inside: ;">
<td colspan="2" valign="top" style="border: 1pt solid windowtext; padding: 0in 5.4pt; background: black none repeat scroll 0% 50%; -moz-background-clip: initial; -moz-background-origin: initial; -moz-background-inline-policy: initial; width: 100%;">
<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial; color: white;">Threat 1</span></p>
</td>
</tr>
</thead>
<tbody><tr>
<td valign="top" style="border-style: none solid solid; border-color: -moz-use-text-color windowtext windowtext; border-width: medium 1pt 1pt; padding: 0in 5.4pt; background: rgb(204, 204, 204) none repeat scroll 0% 50%; -moz-background-clip: initial; -moz-background-origin: initial; -moz-background-inline-policy: initial; width: 30%;">
<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;">
					Name
				</span></p>
</td><td valign="top" style="border-style: none solid solid none; border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-width: medium 1pt 1pt medium; padding: 0in 5.4pt; background: rgb(204, 204, 204) none repeat scroll 0% 50%; -moz-background-clip: initial; -moz-background-origin: initial; -moz-background-inline-policy: initial; width: 70%;">
<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;">Penetration</span></p>

</td>
</tr>
<tr>
<td valign="top" style="border-style: none solid solid; border-color: -moz-use-text-color windowtext windowtext; border-width: medium 1pt 1pt; padding: 0in 5.4pt; width: 30%;">
<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;">
					Description
				</span></p>
</td>
<td valign="top" style="border-style: none solid solid none; border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-width: medium 1pt 1pt medium; padding: 0in 5.4pt; width: 70%;">
<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;"></span></p>
</td>
</tr>
<tr><td valign="top" style="border-style: none solid solid; border-color: -moz-use-text-color windowtext windowtext; border-width: medium 1pt 1pt; padding: 0in 5.4pt; background: rgb(204, 204, 204) none repeat scroll 0% 50%; -moz-background-clip: initial; -moz-background-origin: initial; -moz-background-inline-policy: initial; width: 30%;">
<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;">
					Threat Tree
				</span></p>

</td>
<td valign="top" style="border-style: none solid solid none; border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-width: medium 1pt 1pt medium; padding: 0in 5.4pt; background: rgb(204, 204, 204) none repeat scroll 0% 50%; -moz-background-clip: initial; -moz-background-origin: initial; -moz-background-inline-policy: initial; width: 70%;">
<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;"><dl><dt>1 Threat: Penetration</dt><dd>1.1  Malicious URL</dd></dl></span></p>
</td>
</tr>
</tbody></table><p class="MsoNormal"></p>
<p>

</p><table class="MsoTableGrid" border="1" cellspacing="0" cellpadding="0" style="border: medium none ; border-collapse: collapse;">
<thead>
<tr style="page-break-inside: ;">
<td colspan="2" valign="top" style="border: 1pt solid windowtext; padding: 0in 5.4pt; background: black none repeat scroll 0% 50%; -moz-background-clip: initial; -moz-background-origin: initial; -moz-background-inline-policy: initial; width: 100%;"><p>

</p><p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial; color: white;">Threat 2</span></p>

</td>
</tr>
</thead>
<tbody><tr>
<td valign="top" style="border-style: none solid solid; border-color: -moz-use-text-color windowtext windowtext; border-width: medium 1pt 1pt; padding: 0in 5.4pt; background: rgb(204, 204, 204) none repeat scroll 0% 50%; -moz-background-clip: initial; -moz-background-origin: initial; -moz-background-inline-policy: initial; width: 30%;">
<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;">
					Name
				</span></p>
</td>
<td valign="top" style="border-style: none solid solid none; border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-width: medium 1pt 1pt medium; padding: 0in 5.4pt; background: rgb(204, 204, 204) none repeat scroll 0% 50%; -moz-background-clip: initial; -moz-background-origin: initial; -moz-background-inline-policy: initial; width: 70%;">
<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;">Denial of service</span></p>
</td>
</tr><tr>
<td valign="top" style="border-style: none solid solid; border-color: -moz-use-text-color windowtext windowtext; border-width: medium 1pt 1pt; padding: 0in 5.4pt; width: 30%;">
<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;">

					Description
				</span></p>
</td>
<td valign="top" style="border-style: none solid solid none; border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-width: medium 1pt 1pt medium; padding: 0in 5.4pt; width: 70%;">
<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;"></span></p>
</td>
</tr>
<tr>
<td valign="top" style="border-style: none solid solid; border-color: -moz-use-text-color windowtext windowtext; border-width: medium 1pt 1pt; padding: 0in 5.4pt; background: rgb(204, 204, 204) none repeat scroll 0% 50%; -moz-background-clip: initial; -moz-background-origin: initial; -moz-background-inline-policy: initial; width: 30%;">
<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;">
					Threat Tree
				</span></p><p>
</p></td>
<td valign="top" style="border-style: none solid solid none; border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-width: medium 1pt 1pt medium; padding: 0in 5.4pt; background: rgb(204, 204, 204) none repeat scroll 0% 50%; -moz-background-clip: initial; -moz-background-origin: initial; -moz-background-inline-policy: initial; width: 70%;">
<p class="MsoNormal"><span style="font-size: 10pt; font-family: Arial;"><dl><dt>2 Threat: Denial of service</dt><dd>2.1  Malicious XPath</dd></dl></span></p>
</td>
</tr>
</tbody></table><p class="MsoNormal"></p><p>
</p></div>
<p>
This kind of report, and the process that leads to it, is no more than
a framework for thinking through the issues involved in securing an
application. But it is also no less than that. If the analytic
framework is easy to pick up and use, you'll do more and better
analysis. To that end I have a couple of suggestions. One's easy to
implement, one's really hard.
</p>
<p>Here's the easy one. The built-in report viewer failed when I tried
to produce my report. I took a look at the included XSLT stylesheets
and found a couple of C# methods defined using the &lt;msxsl:script&gt;
mechanism. They weren't doing anything particularly vital, just
removing whitespace and truncating strings, so I removed references to
them. That enabled me to produce the excerpt shown above using an
external XSLT processor, though still not from within the tool itself,
for reasons I haven't figured out. I'm sure there's an easy fix for
this. But relying by default on a non-standard extension like
&lt;msxsl:script&gt; isn't a great public relations move. It encourages
people to think the tool is more Microsoft-centric than it in fact is.
True, it requires the .NET Framework (v1.1) to run, but the generated
XML is entirely neutral. For example, it writes out a data flow diagram
two ways: as a Base64 encoding of a bitmapped image, and also as a
chunk of SVG that could be useful in all sorts of ways on any platform.
My recommendation: make the default XSLT transformations similarly
neutral.
</p>
<p>Now here's the tough one. The real impediment to doing this kind of
analysis is the classic problem of documentation that's not connected
to code. It's tedious and boring to enumerate entry points (e.g.
network ports, file systems) and their relationships to well-known
threats. What are the odds that I'll update my threat model if I change
the port on which my service is listening? Slim to none. Of course the
code knows what port the service is listening on. What's more, the code
can tell us a lot about potential attacks. For example, my XPath query
service, written in Python, uses the BaseHTTPServer class. (I do this
partly because it's so simple. There's no massive IIS or Apache edifice
to worry about, just a small amount of code -- which I've read and
which I understand -- that implements an HTTP responder.) Given a
database of viable threats to BaseHTTPServer, an automated analysis of
the code could fill in parts of the threat model for me. More broadly,
automated analysis of the configuration data used by app servers,
routers, firewalls, and other infrastructure software could help us
automatically populate threat models. That'd be a great way to mine
value from the XML that's now routinely used to describe these things.
I predict that source-code analysis and configuration-file analysis
will help us do more frequent and more reliable threat modeling. It'll
be a challenge. But if we know that's where we're headed, we can design
source-code metadata mechanisms and configuration-file formats
accordingly.
</p>
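<p>
To make the source-code-analysis idea concrete, here's a minimal sketch in Python. It assumes the service constructs its server with a literal (host, port) tuple passed to something named HTTPServer; the function name find_listening_ports and the sample source are hypothetical, and none of this is part of the threat modeling tool.
</p>
<pre class="code python">
import ast

def find_listening_ports(source):
    """Return the literal port numbers passed to calls named HTTPServer."""
    ports = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Accept both HTTPServer(...) and module.HTTPServer(...)
        func = node.func
        name = getattr(func, "attr", None) or getattr(func, "id", None)
        if name != "HTTPServer" or not node.args:
            continue
        address = node.args[0]
        if isinstance(address, ast.Tuple) and len(address.elts) == 2:
            port = address.elts[1]
            # Only literal integers are visible to static analysis
            if isinstance(port, ast.Constant) and isinstance(port.value, int):
                ports.append(port.value)
    return ports

# Hypothetical service source; it is parsed, never executed
sample = 'server = BaseHTTPServer.HTTPServer(("", 8000), Handler)\n'
print(find_listening_ports(sample))
</pre>
<p>
A port that comes from a variable or a configuration file would escape this scan, which is one reason configuration-file analysis matters as much as source-code analysis.
</p>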

</body>
</item>

<item num="a1007">
<title>The challenge of partial trust</title>
<date>2004/05/24</date>
<body>

<p>
Over the weekend I upgraded a kid's PC from Win98 to XP. I'd been dragging my heels because Win98 was "good enough" for games, IM, and writing school reports, but this installation had long since reached its half-life. Also, I was curious to see what a 98-to-XP upgrade would be like, never having done one. So I fired up the installer and posted the kid on guard to alert me when intervention was required.
</p>
<p>
He summoned me repeatedly, but in each case the reason was a popup ad, not anything technical. Call me naive, but the frequency and intrusiveness of these ads surprised me. Otherwise, though, the in-place upgrade went smoothly. It really is remarkable that the Win9x kernel can be uprooted, and the NT kernel inserted in its place, with so little disruption. 
</p>
<p>
In order to ensure the maximum half-life for the new system, I made myself administrator and gave a limited account to the kid. Then I set him to work verifying that his games still worked. All of them did except for Age of Empires. The installation report suggested I should reinstall it. I switched to my account, did that, and fired up the game. It worked. Then I switched back to the kid's account and fired up the game. It still failed.
</p>
<p>
So for now, the kid is the proud owner of administrative privilege. I could milk this for irony by pointing out that Age of Empires is a Microsoft product. But I'd rather take this in a different direction. Partial trust is a hard problem, period, in all operating systems and environments. So hard that we either spend inordinate amounts of time figuring out how to make partial trust work, or we punt and allow more trust than we should. Or both. 
</p>
<p>
In this particular example, had I the time and inclination to solve the problem, I'd probably fire up <a href="http://www.sysinternals.com/ntw2k/source/filemon.shtml">Sysinternals' Filemon</a> and try to find out which file or directory Age of Empires is failing to read or write. Of course the problem could lie elsewhere -- with API permissions rather than file permissions, for example. 
</p>
<p>
This isn't only a Windows issue. Across the board we need better ways to visualize trust boundaries and diagnose problems arising at these boundaries. 
</p>

</body>
</item>

<item num="a1006">
<title>Patterns, Wikis, and APIs</title>
<date>2004/05/21</date>
<body>

<p>
<img align="right" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/wardsEnterpriseAttitude.jpg"/>
It's great to see Ward Cunningham's friendly face popping up on MSDN's <a href="http://channel9.msdn.com">Channel 9</a>. In <a href="http://www.microsoft.com/winme/0405/22606/Cunningham/Idea_for_Wiki.asx">these</a> <a href="http://www.microsoft.com/winme/0405/22606/Cunningham/Teach_a_Kid_Ward.asx">segments</a>, he connects the dots between the patterns that we increasingly use to guide software architecture, and the environments in which we formulate, discuss, and apply those patterns. 
</p>
<p>
In the <a href="http://www.microsoft.com/winme/0405/22606/Cunningham/Idea_for_Wiki.asx">first clip</a>, Ward recalls how the <a href="http://www.c2.com/cgi/wiki?WelcomeVisitors">aboriginal Wiki</a> was a place for programmers to work out, in collaboration, a set of ideas about how to do object-oriented programming. In the <a href="http://www.microsoft.com/winme/0405/22606/Cunningham/Teach_a_Kid_Ward.asx">second clip</a>, he notes that what gates programming productivity isn't syntax, but rather API/library/framework surface area. "Keeping up with what's available in the libraries," he says, "is the number one information overload challenge."
</p>
<p>
It's hard, maybe impossible, to master all the existing and emerging disciplines that flow together in modern programming work, but then, we shouldn't have to:
<blockquote class="personQuote WardCunningham">
I wouldn't think to start a program from first principles. If I want to make a program, I want to find the people who know kind of how to do it, and say, come sit with me, come help me get started. Let's talk to each other about what we're doing, let me get the feel for how it's supposed to go. Once you have a program that's working, then it's just...improving it. [Channel 9: Ward Cunningham]
</blockquote>
</p>
<p>
How and where do we have the conversation in which we merge our individual understandings? Patterns are ways to frame that conversation; Wikis and other online venues are places to have it. 
</p>

</body>
</item>

<item num="a1005">
<title>Paul Goldberger vs. Keith Pleas</title>
<date>2004/05/20</date>
<body>

<p>
Seymour Hersh's <a href="http://www.newyorker.com/fact/content/?040524fa_fact">The Gray Zone</a> was this week's blockbuster New Yorker article. Hardly anybody commented on Paul Goldberger's <a href="http://www.newyorker.com/critics/skyline/?040524crsk_skyline">High-Tech Bibliophilia</a>, a review of Seattle's new library. I wouldn't either, were it not for the contrast with an earlier review entitled <a href="http://weblogs.asp.net/kpleas/archive/2004/04/27/121407.aspx">Brutal Architecture</a> and written by Keith Pleas, a software architect, author, and trainer who now works in the area of patterns and practices at Microsoft. 
</p>
<p>
The contrast between the two reviews could not be more striking. The Goldberger piece, which opens onto a two-page spread dominated by a huge photo of the library's angular exterior shell, is full of the kinds of airy proclamations that art and architecture critics love to make:
<blockquote>
...the most important new library to be built in a generation, and the most exhilarating...
</blockquote>
<blockquote>
...not so much a rejection of traditional monumentality as a reinterpretation of it...
</blockquote>
<blockquote>
...a reinvention of the idea of the public library...
</blockquote>
But there are no interior photos, and we learn nothing about how users interact with this "ennobling public space." 
</p>
<p>
Keith Pleas decided to find out for himself. Armed with his digital camera and a deep understanding of how architecture (in the software realm) can fail the test of use, he investigated the library from the inside out. Pleas conducted his tour last month, before the Goldberger review appeared, but he deconstructs an earlier <a href="http://seattletimes.nwsource.com/pacificnw/2004/0425/cover.html">Seattle Times magazine cover  story</a> with devastating effect:
</p>
<blockquote>
On page 25 of the magazine we see a picture of Koolhaas, Ramus, and good old Paul Maritz(!) standing on a tiny balcony:
<p align="center"> <img src="http://seattletimes.nwsource.com/art/pacificnw/2004/0425/cover04.jpg" border="1"/><br/><font size="2">(Seattle Times photo)</font></p>
<p align="left">What you can't see in the electronic version of the
image is that the architect and designer are holding onto the thin
metal railing at waist height. Ramus, who knows more about how the
building is actually put together, is holding on with <b><i>both</i></b>
hands. Maritz, who's probably the smartest of the three, is standing
further back. And why are they holding on? Well, here's the view <i><b>down</b></i> from where that picture was taken:</p>
<p align="center"><img width="348" height="261" src="http://www.keithpleas.com/SPLdown.jpg" border="1"/></p>
<p align="left">You can see the same waist-high railing on this, a main
passageway. You can also just see an edge of a substantial industrial
table placed against the railing. And if you have any imagination at
all, you can see how things placed on this table - which seems to just
be begging to be climbed on - will have full opportunity to demonstrate
their glide characteristics as they descend 6 stories (I forget, but I
think that's the number) to the busy floor below.</p>
<p>...</p>
<p>
Of course, the architects / designers didn't completely ignore
"life safety" issues in designing the new library. In fact, here's
an...innovative...solution to the double-issue of both tripping <i>and </i>konking your head on an angled support while you're exploring the "unity of knowledge": </p>
<p align="center"><img width="347" height="249" src="http://www.keithpleas.com/SPLcaution.jpg" border="1"/></p>
[<a href="http://weblogs.asp.net/kpleas/archive/2004/04/27/121407.aspx">Keith Pleas: Brutal Architecture</a>]
</blockquote>
<p>
Bravo, Keith! The New Yorker and its readers don't know it yet, but architecture criticism is (or should be) forever changed by what you've done here. And those of us who care about the architecture of software should be heartened to see the instinctive concern for user experience that motivates your analysis.
</p>

</body>
</item>



<item num="a1004">
<title>DomainKeys</title>
<date>2004/05/19</date>
<body>

<p>
Jeremy Zawodny <a href="http://jeremy.zawodny.com/blog/archives/002010.html">notes</a> that Yahoo's <a href="http://antispam.yahoo.com/domainkeys">DomainKeys proposal</a> is now public. Here's the <a href="http://www.ietf.org/internet-drafts/draft-delany-domainkeys-base-00.txt">Internet-Draft</a>; here's the <a href="http://www.technorati.com/cosmos/search.html?rank=&amp;url=http%3A%2F%2Fantispam.yahoo.com%2Fdomainkeys&amp;sub=Go%21">blog chatter</a> as seen by Technorati.
</p>
<p>
In the <a href="http://weblog.infoworld.com/udell/categories/infoworld/2004/04/21.html#a980">blog introduction</a> to my story on <a href="http://www.infoworld.com/article/04/04/16/16FEfutureforgery_1.html">sender authentication schemes</a>, I included some clips from an interview with Sendmail Inc.'s Eric Allman. Here's <a target="audio" href="http://weblog.infoworld.com/udell/gems/ericAllman03.mp3">another excerpt</a>, in which Eric discusses the issue of roving users. Although DomainKeys can potentially deal with this case -- by mapping its DNS <i>selectors</i> to individuals -- he notes that you're better off making an authenticated connection to your home MTA, if not through a VPN then by means of <a href="http://xml.resource.org/public/rfc/html/rfc2476.html">port 587 message submission</a>. Here's the <a href="http://people.qualcomm.com/presnick/draft-hutzler-spamops-00.html#RFC2476">Internet-Draft</a> on that topic, which Eric co-wrote and hopes will become a BCP (<a href="http://www.rfc-editor.org/categories/rfc-best.html">Best Current Practices</a>) document.
</p>
<p>
Eric concludes this segment by saying that, for the first time in a long time, he's "cautiously optimistic" about doing something effective against spam. Likewise, I'm cautiously optimistic about the long-term value of publishing keys in the DNS. The DomainKeys scheme initially maps keys to organizations, but has the flexibility to map them to individuals as well. 
</p>
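<p>
As a rough illustration of how selectors work, a verifier looks up the public key in DNS under a name built from the selector and the signing domain, using the _domainkey label. The sketch below only builds that name. The function name and the sample selector and domain are made up for illustration, and the Internet-Draft remains the authority on the details.
</p>
<pre class="code python">
def domainkey_dns_name(selector, domain):
    """Return the DNS name under which a DomainKeys public key would be published."""
    # Tolerate a fully qualified domain with a trailing dot
    return "%s._domainkey.%s" % (selector, domain.rstrip("."))

# Hypothetical selector and domain
print(domainkey_dns_name("jon", "example.com"))
</pre>
<p>
Because the selector is just a DNS label, an organization could publish one key for everybody or a separate key for each roving user.
</p>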


</body>
</item>

<item num="a1003">
<title>Random access to Web audio</title>
<date>2004/05/18</date>
<body>

<p>
Doug Kaye's <a href="http://www.itconversations.com/">ITConversations</a> has the first installment of a new online talk show called <a href="http://www.itconversations.com/shows/detail123.html">The Gillmor Gang</a>. My ongoing interest in the ability to form URLs that link into large media objects has now <a href="http://www.rds.com/doug/weblogs/personal/2004/05/15.html#a1213">infected Doug</a>, and we've been talking about how to enable that capability on his site.
</p>
<p>
And then it all came rushing back to me, like the hot kiss at the end of a wet fist.  I recalled an exchange, some months ago, with Kevin Marks, a former Apple QuickTime engineer who is now Technorati's director of engineering. Our conversation was prompted by my <a href="http://www.infoworld.com/article/03/11/26/47OPstrategic_1.html">mobile webcasting column</a>. Kevin wrote to point out that streaming is really only useful for live events, and that downloadable files are otherwise superior. But what about random access, I asked? HTTP 1.1 supports that, Kevin pointed out.
</p>
<p>
I've known for a long time that certain applications -- notably Adobe Reader -- make use of the HTTP Range header to request partial content. I'd never seen the protocol in action, though. It took me a while to find a PDF on the Web that exhibits random-access behavior -- perhaps because it's not really necessary for the vast majority of sub-1MB PDFs out there -- but eventually I found <a href="http://www.saltforum.org/saltforum/downloads/SALT1.0.pdf">this 4MB document</a> and was able to watch Adobe Reader requesting a sequence of chunks in the background, and skipping ahead when I scrolled to the end of the document to view the last page.
</p>
<p>
What about downloadable MP3s? I tried QuickTime, no joy. Windows Media Player, no joy. RealOne Player: bingo! And likewise Winamp. How did I never notice this before? Here's some of the chatter between Winamp and Doug's server:
<pre>
GET /mp3/2004/The%20Gillmor%20Gang%20-%20May%2014,%202004.mp3 HTTP/1.0
Connection: keep-alive
Host: rdscon.vo.llnwd.net
User-Agent: WinampMPEG/5.0
Accept: */*
Icy-MetaData: 1
 
HTTP/1.0 200 OK
Date: Sat, 15 May 2004 03:05:23 GMT
Server: Apache/1.3.29 (Unix)
Last-Modified: Sat, 15 May 2004 02:43:02 GMT
ETag: "216a53-1173385-40a583b6"
Accept-Ranges: bytes
Content-Length: 18297733
Content-Type: audio/mpeg
Connection: close
 
GET /mp3/2004/The%20Gillmor%20Gang%20-%20May%2014,%202004.mp3 HTTP/1.1
Connection: keep-alive
Host: rdscon.vo.llnwd.net
User-Agent: WinampMPEG/5.0
Accept: */*
Range: bytes=9902232-
 
HTTP/1.0 206 Partial Content
Date: Sat, 15 May 2004 03:05:23 GMT
Server: Apache/1.3.29 (Unix)
Last-Modified: Sat, 15 May 2004 02:43:02 GMT
Accept-Ranges: bytes
Content-Type: audio/mpeg
Content-Range: bytes 9902232-18297732/18297733
Content-Length: 8395501
Connection: close
 
GET /mp3/2004/The%20Gillmor%20Gang%20-%20May%2014,%202004.mp3 HTTP/1.1
Connection: keep-alive
Host: rdscon.vo.llnwd.net
User-Agent: WinampMPEG/5.0
Accept: */*
Range: bytes=15105006-
 
HTTP/1.0 206 Partial Content
Date: Sat, 15 May 2004 03:05:23 GMT
Server: Apache/1.3.29 (Unix)
Last-Modified: Sat, 15 May 2004 02:43:02 GMT
Accept-Ranges: bytes
Content-Type: audio/mpeg
Content-Range: bytes 15105006-18297732/18297733
Content-Length: 3192727
Age: 214
Connection: close
</pre>
In this sequence, the server reports a Content-Length of about 18MB. I scroll roughly halfway, and Winamp requests the range starting there (byte 9902232). Then I scroll farther, and it requests another range (starting at byte 15105006).
</p>
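<p>
The byte arithmetic in a Range exchange is easy to check. Here's a minimal Python sketch that builds a Range header value and unpacks a Content-Range response. The function names are mine, for illustration only; they don't come from Winamp or Apache.
</p>
<pre class="code python">
import re

def range_header(start, end=None):
    """Build an HTTP Range header value; end is inclusive, None means to the end."""
    if end is None:
        return "bytes=%d-" % start
    return "bytes=%d-%d" % (start, end)

def parse_content_range(value):
    """Split 'bytes first-last/total' into integers and append the span length."""
    match = re.match(r"bytes (\d+)-(\d+)/(\d+)$", value)
    if match is None:
        raise ValueError("unrecognized Content-Range: %r" % value)
    first, last, total = [int(group) for group in match.groups()]
    # Both ends are inclusive, so the span is last - first + 1 bytes
    return first, last, total, last - first + 1

print(range_header(9902232))
print(parse_content_range("bytes 9902232-18297732/18297733"))
</pre>
<p>
For a Content-Range of 9902232-18297732/18297733, the computed span is 8395501 bytes, which is what a server should report as the Content-Length of that partial response.
</p>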
<p>
There remains the problem of link addressability. Doug would have to invent, and hack into his server, some kind of URL parameterization -- which, in fact, he's considering doing. Of course somebody must already have thought of that, and sure enough, Ari Luotonen did in his <a href="http://ftp.ics.uci.edu/pub/ietf/http/hypermail/1995q2/0122.html">original 1995 proposal</a> for byte ranges:
<pre>
EXAMPLES OF THE BYTERANGE URL PARAMETER
 
The first 500 bytes:
   <a href="http://host/dir/foo;byterange=1-500">host/dir/foo;byterange=1-500</a>
 
The second 500 bytes:
   <a href="http://host/dir/foo;byterange=501-1000">host/dir/foo;byterange=501-1000</a>
 
Bytes from 501 until the end of file:
   <a href="http://host/dir/foo;byterange=501-">host/dir/foo;byterange=501-</a>
</pre>
</p>
<p>
According to <a href="http://www.research.att.com/~bala/papers/h0vh1.html">this comparison of HTTP 1.0 and 1.1</a>, the URL parameter idea ran afoul of HTTP 1.1's conditional GET feature, and so byte ranges migrated into the realm of HTTP headers.
</p>
<p>
To sum up, an ordinary downloadable MP3 sitting on a conventional Web server (as opposed to a streaming MP3 hosted on an Icecast or Shoutcast server) supports random access perfectly well -- but only by means of HTTP Range headers, not by means of parameterized URLs. And some (but evidently not all) MP3 players are prepared to exploit that random-access feature. 
</p>
<p>
What's missing? 
<ul>
<li><p>A Web server convention for accepting parameterized URLs like the ones Ari Luotonen proposed way back when. By "convention" I mean something like Real's <b>ramgen</b>, a virtual directory that invokes special processing. The handler for that directory would be a server extension, implemented in various ways on various servers, that would convert from parameterized-URL lingo to HTTP-Range-header lingo.</p></li>
<li><p>An audio player convention for exposing such URLs to users. I envision it as a Link button that goes active when the player is paused, and that produces the parameterized URL when clicked.</p></li>
</ul>
</p>
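<p>
The translation such a server extension would perform is small. Here's a minimal sketch, assuming the ;byterange= syntax from Luotonen's examples, where offsets count from 1, and converting to HTTP Range headers, where offsets count from 0. The function name is hypothetical, and a real extension would still have to hook into the server's request handling.
</p>
<pre class="code python">
def byterange_url_to_range(path):
    """Split 'dir/foo;byterange=501-1000' into the plain path and a Range header value."""
    base, marker, param = path.partition(";byterange=")
    if not marker:
        # No byterange parameter: nothing to translate
        return path, None
    first, _, last = param.partition("-")
    # The URL parameter counts bytes from 1; Range headers count from 0
    start = int(first) - 1
    if last:
        return base, "bytes=%d-%d" % (start, int(last) - 1)
    return base, "bytes=%d-" % start

print(byterange_url_to_range("/dir/foo;byterange=1-500"))
print(byterange_url_to_range("/dir/foo;byterange=501-"))
</pre>
<p>
The off-by-one shift is the only subtle part: the first 500 bytes are 1-500 in the URL form but 0-499 in the header form.
</p>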
<p>
I can see at least one major objection. The byte range syntax isn't human-friendly. The hours/minutes/seconds format that streaming servers support would be nicer. Knowing nothing about MP3 formats, I can't say whether it would be feasible for a sufficiently smart server extension to translate from hours/minutes/seconds to byte ranges. 
</p>
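<p>
For the simplest case the arithmetic is at least plausible. The sketch below assumes a constant-bit-rate MP3 whose bit rate is known, and it ignores any tag data at the front of the file as well as frame boundaries, so the result is only an approximate offset. A variable-bit-rate file would need some kind of seek table instead. The function names and the 64 kbps figure are illustrative assumptions, not measurements of any real file.
</p>
<pre class="code python">
def hms_to_seconds(hms):
    """Convert 'h:mm:ss', 'mm:ss', or 'ss' to a number of seconds."""
    seconds = 0
    for part in hms.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds

def cbr_byte_offset(hms, bitrate_kbps):
    """Approximate byte offset of a time position in a constant-bit-rate stream."""
    # MP3 bit rates are quoted in thousands of bits per second; 8 bits per byte
    return hms_to_seconds(hms) * bitrate_kbps * 1000 // 8

# Hypothetical example: ten minutes into a 64 kbps file
print("bytes=%d-" % cbr_byte_offset("0:10:00", 64))
</pre>
<p>
A server extension could compute such an offset from an hours/minutes/seconds parameter and then serve the file from that point, just as it does for a Range request.
</p>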

</body>
</item>

<item num="a1002">
<title>Pushmepullyou</title>
<date>2004/05/18</date>
<body>

<p>
<a href="http://www.shadesofmeaning.com/whatis.htm"><img align="right" vspace="6" hspace="6" src="http://www.shadesofmeaning.com/apr02/pushmepullyou.JPG"/></a>
<blockquote>
Recently I spoke with Dave Lewis, vice president of deliverability management and ISP relations at Digital Impact. His company's motto: "Making e-mail marketing more effective is our single-minded passion." In one of his online essays, entitled "<a href="http://directmag.com/ar/marketing_btob_e-mail_customers/index.htm" class="regularArticleU">How to Keep B-to-B E-mail From Getting Caught in Filters</a>," his first rule is "Get permission." 
<br/><br/>
I argued that RSS does away with the need for marketers to ask our permission, for us to grant it, for marketers to play by the rules when we revoke it, and for us to trust that marketers will play by the rules. With e-mail marketing, control resides with the sender and permission is a "best practice." With RSS, control resides with the recipient and permission is an inherent property of the medium.
<br/><br/>
I feel Dave's pain. E-mail direct marketers are stuck between a rock and a hard place. They believe e-mail is necessary because it's an "intrusive" medium, yet they are forced to neuter e-mail's intrusiveness by complying with the opt-in gold standard. Unfortunately, there's no middle ground. With RSS recipients can have, and increasingly will demand, control of the channel.
<br/><br/>
Dave and I agreed on one point. "You'd be crazy not to communicate with your customers in their medium of choice," he said. My preference is RSS. Trust me with control of the channel, and I'll be more likely to trust you with my business. [Full story at <a href="http://www.infoworld.com/article/04/05/14/20OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
In this column I deconstruct "push" and "pull" and determine that, when it comes to modes of electronic communication, these terms mean basically nothing. What matters is who controls the channel of communication, not how we construe the direction of flow.
</p>
<p>
I think the rhetoric of email direct marketing -- that it's an opt-in, customer-controlled medium -- should correspond to the reality. It makes email direct marketers understandably nervous when I point out that RSS has all the right characteristics -- including, nowadays, lower cost, given the expense incurred on both ends of the email pipe in order to keep the channel clear.
</p>
<p>
Obviously direct marketers will be among the last to relinquish channel control to the customer. Meanwhile, there's another species of email that's ripe for migration to RSS: institutional alerts. My bank, for example, sends me email alerts when my checking balance falls below $500. To separate those alerts from my spam filters on the one hand, and from my interpersonal email on the other hand, I had to write a filter to catch them and route them to a folder. Many (probably most) people won't go that extra mile. They'll have to pluck the bank's messages from a chaotic email stream, and will wind up missing some alerts. 
</p>
<p>
The obvious alternative is a personalized RSS feed. Does anyone have this already? I'm hoping that, before the end of this year, at least one of the institutions that currently sends me email alerts will offer an RSS option. 
</p>

</body>
</item>



<item num="a1001">
<title>Personas and plogs</title>
<date>2004/05/17</date>
<body>

<p>
<a href="http://www.engl.uvic.ca/Faculty/MBHomePage/ISShakespeare/WT/WT.TOC.html"><img align="right" hspace="6" vspace="6" src="http://www.engl.uvic.ca/Faculty/MBHomePage/ISShakespeare/WT/WT.GIF"/></a>
A couple of years ago, after I heard Alan Cooper speak about his company's ethnographic approach to interaction design, the word <a href="http://www.dictionary.com/search?q=persona">persona</a> first <a href="http://weblog.infoworld.com/udell/2002/06/13.html">appeared in this blog</a>. Last Friday, "persona" popped up in back-to-back phone interviews, and made me realize that Cooper's formulation of IT stakeholders as characters in a story has become deeply rooted and widespread. The first interview was with Microsoft's Bob Muglia who, in the course of laying out the Windows server roadmap, said this:
<blockquote class="personQuote BobMuglia">
Over the last 18 months we've focused on trying to understand the different audiences, or roles, within IT, and how they consume technology. We do this by associating <b>personas</b> with the individual roles. 
</blockquote>
The second interview was with Forrester's Harley Manning. We were discussing usability testing, and he said this:
<blockquote class="personQuote HarleyManning">
What we've been focusing on lately is behavioral segmentation and modeling, typically as represented by a <b>persona</b> -- a one-page front end with a face and a name, and a narrative description of the person's behaviors. We do that to encourage companies to design for a small number of segments about which they are very well informed. 
</blockquote>
</p>
<p>
The literary theme continued today, when Roland Piquepaille <a href="http://radio.weblogs.com/0105910/2004/05/17.html#a845">blogged</a> a <a href="http://www.cio.com/archive/051504/work.html">Michael Schrage article in CIO.com</a> that coins the term 'plog' for 'project log' -- a powerful technique that I've <a href="http://udell.roninhouse.com/bytecols/2001-05-24.html">used myself</a> and <a href="http://www.infoworld.com/article/03/03/28/13stratdev_1.html">written about</a>.
</p>
<p>
Persona is an ancient and beautiful word. Plog is a brand-new word that's even uglier (if possible) than blog. But the words don't matter. What's striking is how the art of storytelling -- our instinctive human way of making sense of the world -- has woven itself into the science of information technology.
</p>


</body>
</item>

<item num="a1000">
<title>Link-addressable streams, revisited</title>
<date>2004/05/13</date>
<body>

<p>
Peter van Dijck wrote to tell me about <a href="http://www.me-tv.org/freetools/getrmurl.php">his tool</a> for converting the URL of a Real stream, plus start/stop times, into a link to the specified segment. A while ago, I <a href="http://weblog.infoworld.com/udell/2003/12/19.html">mentioned</a> Rich Persaud's <a href="http://autometa.com/RPXP/web/">version</a> of the same idea, which works with Windows Media and QuickTime as well as Real. Using either of these, you can do what I did the other day -- namely, link to a segment within a video stream -- without hacking URLs and wrapper files. 
</p>
<p>
As helpful as these tools are, I've come to see that the hassles they alleviate are only part of the reason why we're as yet unable to weave video effectively into blog conversations. In the case of yesterday's clip, for example, there's probably a 50-50 chance that my carefully-prepared link actually worked for you. C-SPAN's streaming setup is amazingly robust, but invariably the content that's most likely to attract links occurs at times of peak load. If I really wanted to make sure you could see that 30-second clip, I might have done better to capture it and post a downloadable version. 
</p>
<p>
That, of course, would raise all sorts of questions. First of all, how? It's doable, but not easily and not (to my knowledge) with free tools. Second, in which format? Third, does fair use cover these kinds of quotations? (I think it should, and will be testing that hypothesis.)
</p>
<p>
Despite these issues, the overriding consideration may be that streams require specialized servers, whereas downloadable clips (which nowadays play progressively) do not. Downloadable clips are, of course, inherently link-addressable, and since they're short, it's not imperative to be able to point to locations within them. 
</p>
<p>
What we're left with, though, is an asymmetry. Big media organizations, for now, still have the advantage over small independents, because the big organizations are more able to deploy and manage streaming infrastructure. Bloggers can link into those streams, and/or capture and post quotes from them, but can't yet easily produce streams. What we can do easily is produce <a href="http://udell.infoworld.com:8000/?//p[contains(.//a/@href,'.mov')]">short downloadable clips</a>. 
</p>
<p>
All this could change, of course, if a hypothetical video-oriented version of <a href="http://www.audioblog.com">Audioblog.com</a> were to emerge. For $X per month, I'd be able to send streams from my iSight camera to this hypothetical service, which would support Y concurrent viewers of the stream. Hmm.
</p>

</body>
</item>


<item num="a998">
<title>The whole picture</title>
<date>2004/05/11</date>
<body>

<p>
<img align="right" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/inhope.jpg"/>
I mostly avoided the hearing today, but tuned in to C-SPAN's video stream just long enough to catch <a href="http://weblog.infoworld.com/udell/gems/inhope.ram">this brief segment</a> in which Senator James Inhofe argues that "if pictures are authorized to be disseminated among the public, then for every picture of abuse or alleged abuse of prisoners, we [should] have pictures of mass graves, pictures of children being executed, pictures of the four Americans in Baghdad who were burned and mutilated." He concludes: "Let's get the whole picture."
</p>
<p>
Absolutely. The notion of authorized dissemination is problematic, though. In the <a href="http://www.google.com/search?q=%22transparent%20society%22">transparent society</a> that we are becoming, the whole picture most certainly <i>is</i> developing. The Net is a force of nature. It superconducts information and superdistributes awareness.
</p>
<p>
Of course the military, like every enterprise, is entitled to try to control the terms on which its employees can engage with the Net. So the Seattle Times reports that Tami Silicio, who gave us another piece of the picture, <a href="http://seattletimes.nwsource.com/html/nationworld/2001909527_coffin22m.html">was fired</a> for violating the Pentagon ban on pictures of flag-draped coffins. Likewise, Seattle's other paper, the Post-Intelligencer, reported last fall that Michael Hanscom <a href="http://seattlepi.nwsource.com/business/146115_blogger30.html">was fired</a> for his pictures of G5 Macs on a loading dock at Microsoft. Fair enough. In a similar position of responsibility, I'd have to make similar choices. But let's be clear: the whole picture, by definition, cannot be authorized.
</p>

</body>
</item>

<item num="a997">
<title>Xythos Intellittach</title>
<date>2004/05/11</date>
<body>

<p>
In a <a href="http://weblog.infoworld.com/udell/2004/04/28.html#a986">recent column</a> on how we use and abuse email, I mentioned the idea of passing attachments "by reference" rather than "by value." Unfortunately I overlooked a product <a href="http://www.infoworld.com/article/04/02/27/09TCxythos_1.html">recently reviewed by InfoWorld</a> that does exactly that. The Xythos WebFile Server has a companion WebFile Client that hooks File Attach (in Notes and Outlook) and replaces attachments with secure links to an access-controlled and versioned instance of the document. Cool!
</p>
<p>
The $50K price tag, as our reviewer noted, "may keep smaller companies away." But other implementations of the idea are clearly possible. I've received a bunch of responses to the column saying: "We attach files because IT gives us no alternative." Xythos offers an alternative. I'd like to see the "Intellittach" concept turn into a broadly-adopted convention.
</p>

</body>
</item>


<item num="a996">
<title>Trademarks, acronyms, and Orwell</title>
<date>2004/05/11</date>
<body>

<p>
The other day I <a href="http://weblog.infoworld.com/udell/2004/05/07.html#a992">wondered</a> why some well-known technology acronyms -- notably UPnP -- aren't expanded on the home pages of the organizations promoting those technologies. In the case of UPnP, at least, the reason is that it isn't (any longer) an acronym:
<blockquote>
The UPnP mark is not an acronym and should not be represented as such. The mark is a single entity that happens to consist of four symbols (i.e., letters), which individually do not have any particular meaning.
[<a href="http://www.upnp-ic.org/uic/docs/UPnP_mark_tips_7-09-2003.pdf">Tips for using the UPnP Certification Mark</a>]
</blockquote>
Why the switch? Apparently it's because you can't trademark an acronym. So, for example, JDBC, like UPnP, has been uprooted and now exists as a free-floating string of "symbols (i.e., letters)". JDBC is a registered trademark, and although Sun was not able to expunge all references to <a href="http://onesearch.sun.com/search/developers/index.jsp?qt=%22java+database+connectivity%22&amp;uid=6910018">Java Database Connectivity</a> from its website, the <a href="http://java.sun.com/products/jdbc/index.jsp">JDBC home page</a> nowhere mentions the term.
</p>
<p>
I found this puzzling in light of this Q and A from the <a href="http://www.swiggartagin.com/trademark/faq1.html">trademark FAQ</a> of a Boston technology law firm:
<blockquote>
13. Can I register an acronym of my company name as a trademark?
<br/><br/>
Companies with lengthy trade names will sometimes use the acronym of their trade name as their primary service mark: e.g. Columbia Broadcasting System, National Broadcasting System, and American Broadcasting System, use the acronyms CBS, NBC and ABC, respectively, as marks for the service of providing news and entertainment services over electronic media. 
</blockquote>
NBC hasn't, to my knowledge, ceased to be the National Broadcasting System. Of course JDBC and UPnP are trademarks, while NBC and CBS are service marks, so perhaps the distinction lies there. But whatever the explanation, the pretense that JDBC and UPnP don't mean "Java Database Connectivity" and "Universal Plug and Play" is simply Orwellian. It's already way too hard to explain technology in ways people can understand. We can ill afford to drain the meaning out of our language.
</p>

</body>
</item>

<item num="a994">
<title>XBRL follow-up</title>
<date>2004/05/10</date>
<body>

<p>
Following last week's <a href="http://weblog.infoworld.com/udell/2004/05/05.html#a989">critique of XBRL</a>, I had an interesting email exchange with David vun Kannon, a manager in KPMG's financial services practice and one of the editors of the XBRL spec. The dialogue went far beyond what InfoWorld's letters column could ever accommodate, so with David's permission, I'm reproducing it here. 
</p>
<p>
<b>David vun Kannon:</b>
<blockquote class="personQuote DavidVunKannon">
I feel your analogy was inadequate and the "too complex" criticism misses
the point. XBRL isn't designed to be hand-written, and that level of
simplicity is not a virtue in the design space it targets.
<br/><br/>
As one of the designers of the XBRL specification, I sympathize with your
desire for a simple XML format for the exchange of financial and business
reporting data. But as your article's lead paragraphs point out, the world
of accounting standards is wickedly complex. The design scope of XBRL had to
address that complexity, as well as the use of financial data in all kinds
of tax and regulatory filings worldwide. Did you really expect something
simple from that target?
<br/><br/>
Here's a recipe for a "simple" financial reporting format:<br/>
	- assume a single accounting framework<br/>
	- assume the framework never changes<br/>
	- assume one currency<br/>
	- assume one language<br/>
	- mix content and presentation<br/>
	- assume businesses will change how they report to fit your design
<br/><br/>
The above recipe actually works for single application languages where there
is one dominant consumer, such as the IRS' XML format for tax filings. But
the world doesn't need a thousand different financial reporting languages.
That is the "stovepipe application" thinking that misses the forest for the
trees. That is why XBRL is trying to provide a unifying framework.
<br/><br/>
It is nice to know that your blog can get by using RSS. However, Reuters and
Dow Jones can't, and I doubt InfoWorld runs on RSS. For them, there is
NewsML. Ever read the NewsML spec? Looked at the latest version of FpML, for
describing financial derivatives? An "apples-to-apples" comparison of XML
languages would compare XBRL to these languages, because of the breadth of
the business problem they are each trying to solve.
<br/><br/>
There are thousands of companies that report financial results according to
US, international and local rules, as well as separate tax reporting. If you
wrote every blog entry in four separate languages, with an eye to satisfying
a different set of picky editorial rules for each, your blog analogy would
be more appropriate.
<br/><br/>
The companies with financial reporting needs that are similar to your blog
example will be served by software, such as Microsoft's Excel add-in now in
beta, that manage the complexity for them.
<br/><br/>
The number of developers that will have to face head-on the complexity of
the XBRL spec is low. You can write an XML Schema without delving into the
depths of the XML Schema spec. Only the writer of an XML Schema validator
has to do that. Similarly, developers at businesses can write XBRL instance
and taxonomy documents using tools. Only the developer of XBRL support
software has to go the limit with understanding the spec.
</blockquote>
</p>
<p>
<b>Jon Udell:</b>
<blockquote class="personQuote JonUdell">
> It is nice to know that your blog can get by using RSS. <br/>
> However, Reuters and Dow Jones can't, and I doubt InfoWorld<br/>
> runs on RSS.<br/>
<br/>
As a matter of fact InfoWorld does, in a variety of ways. I'm not convinced Reuters and Dow Jones couldn't either, as RSS is now modular and extensible.<br/>
<br/>
> For them, there is NewsML. Ever read the NewsML spec?<br/>
<br/>
Yep. And it's not small, I agree. I'll also agree that modular extensions to RSS that would bring it to parity with NewsML would yield complexity equal to that of NewsML. However the key difference, in my view, would be a lower activation threshold and smoother growth curve  -- i.e., the ability to start with something simple and concrete, and  evolve to the more complex and abstract.<br/>
<br/>
> Only the developer of XBRL support software has to go <br/>
> the limit with understanding the spec.<br/>
<br/>
The "tools will manage the complexity" argument is always compelling, but also always worrisome to me. Over and over again I've seen stupidly simple formats and protocols triumph over highly-engineered counterparts, especially when -- as I believe is true in the case of XBRL -- the goal is widespread, if not universal, adoption by a broad constituency. We'll probably just end up agreeing to disagree, but I've noted with great interest the evolution of XML specs, in the Web services realm, away from the monolithic and towards the granular and "composable." This seems to me a fundamentally correct way to attack complexity. And XBRL seems monolithic, not composable, hence my reaction.
</blockquote>
</p>
<p>
<b>David vun Kannon:</b>
<blockquote class="personQuote DavidVunKannon">
As background, a paper I gave at XML Europe a few years ago on the design of XBRL 1.0 is still available on the web at http://www.gca.org/papers/xmleurope2000/papers/s26-01.html. While the particulars of XBRL 1.0 have become dated, the motivating sections are still relevant. You might also want to note how much smaller/simpler the 1.0 spec is, compared to 2.1!
<br/><br/>
I've had to think about the issues you raise since 1999 and the design of
XBRL 1.0. There are many different aspects to the complexity problem,
including scope of the business problem and choice of base technologies. For
instance, I'm asked relatively frequently "Why don't you use RDF?" as if RDF
was pixie dust that could be sprinkled on a problem to make its complexity
go away. Complexity is conserved.
<br/><br/>
BTW, a link to your column has been posted to the xbrl-public Yahoo Group.
As the only public (non-member) Yahoo Group for XBRL, it attracts most of
the newbies and naysayers, and the latter are happily adding to the thread
agreeing with you. For the sake of the former, I've posted my response to
you over there as well.
<br/><br/>
I agree with your points on modularity. XBRL is designed so that the simple
is simple and the complex is possible. Believe it or not! The "Hello, World"
test for a single financial fact is
<pre class="code xml">
&lt;xbrl namespaces for XBRL, XML Schema Instance, 
and US GAAP taxonomy go here>
&lt;us:assets contextRef="c1" unitRef="u1" 
  precision="18">7&lt;/us:assets>
&lt;unit id="u1">ISO4217:USD&lt;/unit>
&lt;context id="c1">
  &lt;period>&lt;instant>20041231&lt;/instant>&lt;/period>
  &lt;entity>
    &lt;identifier scheme="http://www.duns.com/D-U-N-S">
      1234567890
    &lt;/identifier>
  &lt;/entity>
&lt;/context>
&lt;/xbrl>
</pre>
It is hard to point to any part of the above as unnecessary, though most
votes go to the precision attribute.
<br/><br/>
So the use of XBRL by a company is modular, and can expand in a gradual,
modular way. I'm not sure if the 2.1 spec is organized quite the way a
primer should be. Also, while XBRL is committed to modular expansion into
the future, the current spec and conformance suite are monolithic. During
the last version design phase, I argued for profiles that would let tools or
validators claim conformance to XBRL while not implementing the whole spec.
This didn't make the cut. But a properly written intro to XBRL would show
the natural breakdown of the parts of the spec and how different parts
(different linkbases for example) can be used independently.
<br/><br/>
I think a big influence on why the spec isn't "officially" modular is that
XBRL has succeeded most with global financial regulators, who have typically
wanted all the bells and whistles. Indeed, they want modules that aren't
finished yet, such as the Formula Linkbase I am designing now. This adoption
process has damped the "small is beautiful" psychology and grass roots
momentum dynamics that drive some specs to wildfire rates and levels of
adoption. Web pages and blogs were pioneered by individuals, not businesses
or government departments. True bandwagon dynamics for XBRL will wait until
the SEC (the 800 pound gorilla of regulation) requires using XBRL (for
external financial reporting) and until the Excel add-in becomes widely
available (for internal management reporting and financial consolidation).
<br/><br/>
So I think we agree far more than we disagree. XBRL is an undoubted
challenge to developers. Its linkbases are the first use of out-of-band
hyperlinking. For developers used to working with numbers, it is surprising
that so much of accounting is navigating a hypertext! 
<br/><br/>
XBRL was advised by Tim Bray in a recent conference keynote to take six
months off (or more) and stop inventing/using bleeding edge technology. It
hasn't happened, of course, but the market is starting to catch up the spec.
</blockquote>
</p>
<p>
David is right to point out that a government-mandated reporting format is an unlikely source of grassroots innovation. But this evocative statement -- "it is surprising that so much of accounting is navigating a hypertext" -- does make me wonder. Years ago I worked on the first incarnation of a business information product (which <a href="http://www.onesource.com/">still exists</a>) that blended financial reports with news, biographies, and other contextualizing information. It was a read-only product delivered on a write-only medium, CD-ROM. Back then there was no other choice. Now we produce some kinds of hypertext almost as naturally as we consume it. Will we be able to paint financial information on the universal canvas, mixing it with text, charts, math, and other XML brushstrokes? For the sake of our ability to step back and see the big picture, I hope so.
</p>

</body>
</item>


<item num="a993">
<title>A sea of snapshots, a heterogenous world of transforms</title>
<date>2004/05/10</date>
<body>

<p>
In my interview last week with John Shewchuk, one of the Indigo architects at Microsoft, I asked whether XML disciplines can or should model data, as well as exchange it. I like the answer John gives in <a target="audio" href="http://weblog.infoworld.com/udell/gems/johnshew.mp3">this audio clip</a>. There really isn't a primary data model, he suggests. (Note to self: Get over it!) Relational, object, and XML disciplines are just aspects of a relativistic universe of data. Very postmodern!
</p>
<p>
I edited this clip with <a href="http://audacity.sourceforge.net/">Audacity</a>, by the way. I've been using it on the Mac for a while, but only just recently noticed that it's a wxWindows-based app that runs on Windows and Linux too. Like other sound editors, it offers a bunch of effects filters. I rarely use them. I just want to capture, crop, and post. Audacity makes it pretty straightforward to find a segment in an audio track, zoom in to precisely mark its boundaries, and save the result to MP3. 
</p>

</body>
</item>

<item num="a992">
<title>UPnP, Web services, and Rendezvous</title>
<date>2004/05/07</date>
<body>

<p>
A few of us InfoWorlders spoke yesterday with one of Microsoft's Indigo architects, John Shewchuk. In the course of our conversation, Shewchuk mentioned the recent WinHEC announcement about device support for Web services protocols, reported in InfoWorld on May 5:
<blockquote class="pubQuote InfoWorld">	
Microsoft Corp., Intel Corp., Lexmark International Inc. and Ricoh Co. Ltd. on Tuesday detailed new Web services technology designed to make it easier for users to connect devices such as printers, digital cameras and digital music players over a network. The companies at Microsoft's Windows Hardware Engineering Conference (WinHEC) officially announced a Devices Profile for Web services, which describes how devices should use Web services protocols. The announcement builds on WS-Discovery, a Web services specification that Microsoft, Intel, Canon Inc. and BEA Systems Inc. introduced in February. WS-Discovery describes a way for devices to find and connect to Web services. [<a href="http://www.infoworld.com/article/04/05/05/HNwebservices_1.html">InfoWorld.com: Web services find way to devices</a>]
</blockquote>
The "Devices Profile" will be proposed to the UPnP (universal plug and play<sup>1</sup>) as the basis of the UPnP 2.0 Device Architecture.
</p>
<p>
Shewchuk sees this as a "singularity":
<blockquote class="personQuote JohnShewchuk">
There is nothing different about the Web services on a printer, than the Web services at Amazon. That's mind-blowing. And it means the same Visual Studio tool that I pick up to do my cross-enterprise application, I can now point at my printer. And the same reliable messaging protocol that makes sure my information gets to Amazon also makes sure that I don't drop packets when I'm moving from room to room on WiFi sending a print job.
</blockquote>
It's a strong argument. The odd man out in this scenario appears to be Rendezvous, as <a href="http://www.carpeaqua.com/archives/2004/02/18/wsdiscovery.php">several</a> <a href="http://varchars.com/archives/2004/02/44.html">folks</a> <a href="http://postneo.com/categories/webServices/2004/05/05.html#a3400">have</a> pointed out. Of course Canon and HP and the rest have implemented Rendezvous as well. I'm not sure what kinds of mapping and/or layering might make sense here, but ideally this won't turn out to be an either/or scenario. It'd be sweet to hit Rendezvous services in OS X, Zeroconf services on Windows, and devices, all from a SOAP-aware scripting language.
</p>
<p>
<b>Update</b>: I got to wondering about cross-platform Rendezvous, and that led me to <a href="http://www.porchdogsoft.com/products/spike/">Porchdog Software's Spike</a>, a dynamically-discoverable network clipboard for both Windows and OS X. Spike, which Just Works, is built on Porchdog's <a href="http://www.porchdogsoft.com/products/howl/">Howl</a>, an open-source SDK that brings Zeroconf/Rendezvous capabilities to Windows, Linux, and FreeBSD. Very cool.
</p>
<hr/>
<p>
<sup>1</sup> When your organization and domain name are both the same acronym, e.g. UPnP, you'd think it would make sense to expand the acronym on your home page. But I can't find the phrase "universal plug and play" -- or even any of the constituent words "universal," "plug," "play" -- on <a href="http://www.upnp.org/">this page</a>. And this isn't uncommon. <a href="http://www.svg.org/">www.svg.org</a> doesn't bother to expand SVG to Scalable Vector Graphics. The phrase "Portable Document Format" appears nowhere on <a href="http://planetpdf.com/">www.planetpdf.com</a>.
</p>
<p>
I've seen other examples of this, and I know why it happens. If you're so far inside a technology that you run an organization and website dedicated to it, you've long since lost touch with the world that might not know what that technology's acronym stands for. But while you can't cater to every newbie question, a site that aims to be an educational resource should probably answer the first and most obvious question: "What the heck does ___ stand for?"
</p>
</body>
</item>

<item num="a991">
<title>New voices</title>
<date>2004/05/06</date>
<body>

<p>
We don't yet know what the steady state of the blogosphere is going to look like. As has been <a href="http://www.theregister.co.uk/2003/10/04/blogosphere_to_reach_10_million/">snarkily reported</a>, lots of blogs die on the vine. Of course plenty don't, and there's also a steady influx of new voices. Here are three that have enriched my daily trawl for ideas and perspectives.
</p>
<p>
<b>Brendan Eich</b>, creator of JavaScript and architect of Mozilla: 
<blockquote class="personQuote BrendanEich">
The challenge for Mozilla and other open source projects is not to "react to Microsoft", any more than it is to "react to Macromedia". MS and MM are reacting to the same fields of force that govern everybody. The prize we seek is a better way to develop common kinds of networked, graphical applications. [<a href="http://weblogs.mozillazine.org/roadmap/archives/005370.html">Brendan Eich: roadmap</a>]
</blockquote>
Amen. Brendan's roadmap blog is a great way to continue the tradition of the <a href="http://www.mozilla.org/roadmap.html">Mozilla development roadmap</a>. 
</p>
<p>
<b>Martin Roberts</b>, enterprise architect:
<blockquote class="personQuote MartinRoberts">
When a Process fails where do you need to route the fault to? Normally a human - so why do most tools make this a cumbersome task? Why do these so called next generation tools find dealing with people such an alien idea? I believe the answer lies in the fact that most of these emerging tools have been built by people used to handling classes that rarely touch humans directly. They tend to be focused on the J2EE/.Net like frameworks which are low level in the inspirations and have failed to take into account the gains of the 4GL world of the early 1990's.  [<a href="http://archmusings.blogspot.com/2004_05_01_archmusings_archive.html#108383773616204711">Martin Roberts: Architecture Musings in IT</a>]
</blockquote>
I met Martin at XML 2003 and we had a fascinating hour-long conversation. The point he makes here -- that humans are the exception handlers in automated systems, and that we need to design accordingly -- is one I've made too. But my perspective doesn't include experience building enterprise apps using Oracle Workflow. Martin's does. (He currently holds forth at blogspot.com, which offers Atom only, no RSS, but you can get an RSS translation of his Atom feed <a href="http://www.2rss.com/atom2rss.php?atom=http%3A//archmusings.blogspot.com/atom.xml">here</a>, thanks again to www.2rss.com.)
</p>
<p>
<b>Evelyn Rodriguez</b>, engineer turned freelance marketer:
<blockquote class="personQuote EvelynRodriguez">
Have you ever watched a start-up make that corporate transition from the inside? It's not just that the dogs and beer bashes go, but something subtle, intangible seems to shift. The palpable energy evaporates. It's not the transparency that's at issue. Maybe not even the quarterly view of the world (most start-ups have to watch their cash closely anyway and thus balance the short-term and long-term). It's more the unspoken effect and influence of "best practices" and the pressure to conform to a more respectable and familiar culture that are the hallmark of measurable metrics of Wall Street. Who knows what <a href="http://www.sas.com/">SAS</a>'s <a href="http://www.usatoday.com/money/industries/technology/2004-04-21-sas-culture_x.htm">life-friendly practices</a> are worth? Just looks like a cost to me on a balance sheet. Giving Googlers 20% of time to goof off on pet projects? That's productive time being wasted! [<a href="http://evelynrodriguez.typepad.com/crossroads_dispatches/2004/04/google_public_c.html">Evelyn Rodriguez: Crossroads Dispatches</a>]
</blockquote>
Evelyn is another conference acquaintance of mine. I find her perspectives on entrepreneurism, marketing in the blog era, and human potential to be consistently valuable.
</p>

</body>
</item>

<item num="a990">
<title>Adobe Designer 6.0 preview</title>
<date>2004/05/06</date>
<body>

<p>
<a href="http://weblog.infoworld.com/udell/gems/designer.jpg"><img vspace="6" hspace="6" align="right" src="http://weblog.infoworld.com/udell/gems/designer_s.jpg"/></a>
A more descriptive name for Adobe Designer 6.0 might be "InfoPath for PDF." 
The concept is brilliant: exploit Microsoft's failure to make
InfoPath ubiquitous by putting interactivity and XML smarts into
Adobe's free PDF viewer, and by offering a forms builder that targets
both Adobe Reader and Acrobat. Announced last summer, in beta now, and
scheduled for release this summer, Adobe Designer is that forms
builder.
</p>
<p>Adobe says that Designer targets version 6 of the PDF players. I had
to upgrade both to the (still unreleased) version 6.02, though, in
order to use Designer-built forms. You can start a form from scratch,
or by importing a layout from sources including PDF, Word, and even
InfoPath files. Either way, you can associate the form with an XML
Schema. But while the schema defines the shape of the data collected by
the form, there's limited runtime enforcement of schema constraints in
Acrobat or Reader. </p>
<p>
Some constraints, such as field lengths, are handled automatically. But
when I wrote a regular-expression restriction into the schema,
Designer's preview didn't complain when I entered text that didn't
match the pattern. In Acrobat, I was able to save an invalid XML
instance. Bottom line: if you want real schema validation, you'll have
to do it yourself in the back-end process that receives the data.
</p>
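<p>
What "do it yourself" might look like on the receiving end, as a minimal sketch: it assumes the posted XML has already been parsed into a plain object, and the field names and patterns are hypothetical stand-ins for whatever pattern restrictions the form's schema declares. XML Schema patterns match the whole value, so the JavaScript equivalents are anchored with <tt>^</tt> and <tt>$</tt>.
</p>

```javascript
// Hypothetical field names and patterns; in practice these would mirror the
// pattern restrictions written into the form's XML Schema.
const patterns = {
  zip: /^\d{5}(-\d{4})?$/,
  phone: /^\d{3}-\d{3}-\d{4}$/,
};

// Return the names of fields that are missing or fail their pattern.
function invalidFields(record) {
  return Object.keys(patterns).filter((name) => {
    const value = record[name];
    return typeof value !== "string" || !patterns[name].test(value);
  });
}

console.log(invalidFields({ zip: "03431", phone: "603-555-0100" })); // []
console.log(invalidFields({ zip: "3431", phone: "603-555-0100" }));  // [ 'zip' ]
```

<p>
A real back end would more likely hand the posted instance to a validating XML parser along with the schema itself; the hand-rolled check above only illustrates where the enforcement has to live.
</p>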
<p>
Designer enables you to specify repeating elements, but they only
work in concert with a server that regenerates the form with space for
new data. You can't grow a region interactively, a la InfoPath. That's
a limitation of the Acrobat/Reader forms player, of course, not of
Designer. 
</p>
<p>
Despite evident weaknesses, the Designer/Reader duo offers two key
strengths: digital-paper fidelity, and a ubiquitous runtime. Using the
free Reader, I was able to fill out a Designer-built form, print a
high-fidelity copy for my records, and post its XML data to a Web
server. No matter how the future of e-forms unfolds, that's going to be
a popular scenario.
</p>
<hr/>
<p><b>Note</b>: This item appears on page 18 of InfoWorld, May 3, 2004,
in the Product Previews section. Normally I point to InfoWorld articles
on InfoWorld.com, but since we haven't yet found a home online for
Product Previews, I'm publishing (the original version of) the item
here.
</p>

</body>
</item>


<item num="a989">
<title>Attack of the killer accountants</title>
<date>2004/05/05</date>
<body>

<p>
<blockquote>
The XBRL [eXtensible Business Reporting Language] spec describes how the parts of an XBRL instance interrelate, using state-of-the-art XML technologies such as XLink and XPointer. And it talks at length about the syntax and semantics of "taxonomies" that abstractly define chunks of financial reports. No sign of any actual financial data, though. And the link to a sample page at <a href="http://www.xbrl.org/Sample/">xbrl.org</a> returned a "404 Not Found." I'm not surprised. The poor bloke whose job it was to produce that sample must have suffered a polymorphic recursive brain meltdown. [Full story at <a href="http://www.infoworld.com/article/04/04/30/18OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
Since I am not, myself, an actual financial expert (as Dave Barry might say), I worried that I might have gone overboard here. But the responses I've gotten so far allay that fear. One suggests that XBRL, if successful, will "create a master race of accountants / XML consultants." How's that for a B-movie concept!
</p>
<p>
Seriously, let's think about where the middle ground lies here. What the hammer is to the carpenter, the spreadsheet is to the accountant. In 2003, the dominant spreadsheet -- Microsoft Excel -- gained the ability not only to read and write XML, but also to guarantee fidelity to arbitrary schemas. We've yet to see the impact of that key development, but in the short run I expect we'll see a thousand flowers bloom as organizations, for their own purposes, begin to schematize their business information. In parallel, we're seeing the evolution of a global interconnected business network, implemented as a fabric of web services. The Platonic solution that XBRL envisions will, I'm guessing, more likely result from Darwinian forces now in play.
</p>
<p>
You'll schematize your own information because you can, and because it's intrinsically valuable to do so. What self-respecting accountant wouldn't want an automatic check on the validity of data? Meanwhile, your schematized information will be drawn inexorably into the interconnected business fabric. To survive in that ecosystem, you'll wind up transforming your stuff. The purpose of the transformation won't be to conform to a reporting specification, but rather to interoperate with the fabric. Proxies within the fabric will crank out the reports we need to see.
</p>
<p>
PS: Sorry about the title, but when the phrase "Attack of the killer accountants" came up blank on Google, I just had to claim it.
</p>

</body>
</item>


<item num="a988">
<title>Groove, four years later</title>
<date>2004/05/04</date>
<body>

<p>
I recently met with Groove's Jack Ozzie and Michael Helfrich. Jack is a co-founder and VP, development; Michael is VP, applied technology. The subject, of course, was the forthcoming V3 of Groove, a product I first saw in beta four years ago this spring. We had a wide-ranging discussion; here are some of the key takeaway points.
</p>
<p>
<b>Sayonara, top-to-bottom XML</b>
I don't believe that I pay a performance penalty for using XML, and depending on how you use XML, you may not believe that you do either. But don't tell that to Jack Ozzie. The original architectural pillars of Groove were COM, for software extensibility, and XML, for data extensibility. In V3 the internal XML datastore switches over to a binary record-oriented database. 
</p>
<p>
You can't argue with results: after beating his brains out for a couple of years, Jack can finally point to a noticeable speedup in an app that has historically struggled even on modern hardware. The downside? Debugging. It was great to be able to look at an internal Groove transaction and simply be able to read it, Jack says, and now he can't. Hey, you've got to break some eggs to make an omelette.
</p>
<p>
I'm sure Groove has made the right choice here. Still, it's troubling if you believe -- as I do -- that a high-performance XML database ought to be a core piece of client infrastructure. Groove's original XML database vision was probably too forward-looking. Version 1.0 was effectively done, for example, by the time the ink was dry on the XPath specification. XML storage technology didn't then support what Groove wanted to do. Arguably it still may not. We don't yet know what Chandler will be able to achieve with Berkeley DB XML. Meanwhile WinFS is turning out to be less like the XML database I imagined, and more like a record-oriented (or rather, CLR-object-oriented) database.
</p>
<p>
So how do we resolve the impedance mismatch between our desktop storage engines -- the file system, conventional databases -- and the XML content model that is increasingly the choice of desktop applications? I'm still looking for the solution to this puzzle.
</p>
<p>
<b>Groove and .NET</b>
The managed-code interface to Groove gets an overhaul in V3 but the core product itself does not rely on .NET. Not because Groove's developers wouldn't like to use .NET. They very much would, Jack says. But rather because the already steep ante -- Groove's a 10MB download -- looks even steeper when you pile on a 20MB .NET download. This isn't news, just another datapoint, but every time I hear this it tells me two things. The CLR and .NET Framework aren't yet infrastructure that a mainstream Windows developer can take for granted. But when that finally becomes true, a whole lot of pent-up developer demand for .NET services will be released. 
</p>
<p>
<b>Python?</b>
During a demo of the V3 forms builder, which gains some nice yardage on the previous version, I noted that the scripting languages supported are VBScript and JavaScript. Hmm, thought I. Is this thing an ActiveX Scripting Host? And if so, can I plug in another scripting engine that works in that environment, say Python? The answers were "Yes" and "Don't see why not, we'll get back to you." 
</p>
<p>
<b>Groove Web services</b>
The Web services stuff that I <a href="http://webservices.xml.com/pub/a/ws/2002/12/09/udell.html">explored</a> a while ago has matured, and is baked into the product. Groove V3 comes up listening for SOAP calls from localhost, and can be configured to listen for SOAP calls from remote nodes. 
</p>
<p>
The good news: in addition to using forms, you can write scripts that reach through the Web services layer to find things in Groove spaces. The bad news: you'll have to, because there's <i>still</i> no built-in search capability.
</p>
<p>
<b>Challenges</b>
Early reviews of the 3.0 beta have showered praise on the product's revamped UI, and I'll add mine: it's cleaner and better optimized for common tasks. There remain challenges. Groove is a holistic solution that shares idioms with both Windows and the web in ways that seem familiar, but sometimes aren't. In the Groove "Files Tool," for example, you're shown what looks like a file in a folder, but is actually an encrypted and synchronized Groove object. Double-clicking the file opens it into its default editor, which may (or may not) reveal the fact that the file has been decrypted to your local temporary directory for viewing and editing. Quitting can result in a two-step tango. First the editor asks if you want to save. Then Groove, detecting changes, asks again: "Do you want to save?" It's the classic dilemma of every document manager that <s>hooks File Open and File Save</s> gets in between apps and storage in order to add value. In Groove's case, that value is considerable: automatic secure synchronization, and change notification, across all instances of a shared space. But until and unless a more intimate relationship can be forged between Groove's secure/transacted/synchronized storage and the OS-level storage APIs that applications expect to see, there's just no way to make this seamless.
</p>
<p>
Groove's use of web metaphors raises other challenges. For example, navigation from tool to tool within a Groove space uses browser-like back/forward controls. But in a space that includes an embedded browser, you end up with two separate sets of back/forward controls. Groove's hyperlinking is also similar-yet-different. Depending on the Groove tool you're in, you may be able to form a link to a record, a view, or the tool itself. For which audience is the link relevant? It depends. Members of the space that contains the tool can jump to it from a link pasted into, say, a chat window or discussion. The same link pasted into another space may or may not be accessible to everybody, depending on who's also joined to the target space. Can the link point into Groove from the outside, say from an intranet web page? In theory yes, though I'd be surprised if anyone has ever done it. 
</p>
<p>
Groove's transacted/synchronized storage model envisions a species of applications that don't yet exist outside of Groove. Likewise its hyperlinking model envisions collaborative scenarios that don't yet flourish outside of Groove. Such applications and scenarios would present thorny usability challenges even if there were no legacy to consider, because the total experience is so different from what we're conditioned to expect. Of course there <i>is</i> a legacy. Reconciling it with Groove is incredibly hard, but there's been steady progress all along, and V3 is another big push forward. 
</p>

</body>
</item>


<item num="a987">
<title>XML databases move to the middle</title>
<date>2004/04/30</date>
<body>


<p>
<blockquote>
It's true that you can use native XML databases to manage the growing number of business documents created by the new generation of XML-savvy end-user applications. It's handy, for example, to search an insurance database for incident reports that match some structured pattern of in-line metadata. But hybrid SQL/XML databases can do that too, and they can also join the structured XML content with relational columns -- a powerful combination. So XML databases are migrating into a niche that SQL/XML can't and won't occupy. They're becoming the high-performance pumps that push XML traffic around on the emerging services web. [<a href="http://www.infoworld.com/reports/17SRxml.html">InfoWorld.com</a>]
</blockquote>
This short piece is a companion to Sean McCown's excellent <a href="http://www.infoworld.com/reports/17SRxml.html">cover story</a>, which surveys the XML features of leading relational databases: Oracle, DB2, SQL Server, and Sybase.
</p>
<p>
I've followed the odyssey of Sonic XML Server, n&#233;e eXcelon, n&#233;e ObjectStore, for quite a long time. I wouldn't have predicted that XML databases would become the context engines of the services web, but I guess it's not too surprising. More surprising, I have to admit, is the extent to which the SQL discipline is merging with the XML discipline in the conventional database engines. "It's possible that developers will want to stay within an XML abstraction for all their data sources," said Oracle's Sandeepan Banerjee when I interviewed him for last summer's <a href="http://weblog.infoworld.com/udell/categories/infoworld/2003/07/30.html#a760">story on SQL/XML hybridization</a>. Wow. I still can't believe that an Oracle guy said that! 
</p>



</body>
</item>



<item num="a986">
<title>Jack of all trades, master of none</title>
<date>2004/04/28</date>
<body>

<p>
<blockquote>
E-mail is the jack of all trades, but the master of none. There are better ways to transfer files, hold discussions, deliver notifications, broadcast newsletters, schedule meetings, work collaboratively, and manage personal information. But even though e-mail isn't the best tool for any of these tasks, it provides a single interface to all of them. Here's a challenge: Let's improve the various functions performed by e-mail without multiplying the interfaces people must learn in order to use those functions. [Full story at <a href="http://www.infoworld.com/article/04/04/23/17OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
A favorite example of mine is RSS. It's an inherently opt-in, spam-free channel of communication that can replace certain of email's most broken functions: broadcast newsletters, notifications. But, as <a href="http://www.newsgator.com/">NewsGator</a> shows us, RSS can still look and feel like email to the user.
</p>
<p>
I also mentioned the old idea of passing attachments "by reference" rather than "by value" -- that is, emailing links to uploaded attachments, rather than including the attachments themselves. Several people responded to that, including two whose emails I'm quoting here with permission. For John Heery, the issue is IT control:
</p>
<blockquote class="personQuote JohnHeery">
I thought your solution for having an e-mail client that could pass a file "by reference" was a great one, and one that several of us at work use with a two step process.  We drop the file on a shared drive, and then just send a link.
<br/><br/> 
However, your assessment of why people use e-mail to transfer files may be accurate for Infoworld, but I seriously doubt it is on the mark for most companies.  As IT locks down systems in an ever increasing game of black ops, e-mail is just about all we poor users have left.  My laptop doesn't have a floppy or CD-RW, so I can't write onto removeable media.  The USB port is my current option, until IT discovers I bought a jump drive to move files around.  FTP isn't an option, in fact, the FTP ability of IE 6 has been disabled on my machine.  My co-workers and I couldn't stop laughing as you rattled off WebDAV, scp, and Radio UserLand.  These may be great little secrets for IT people, but at least at our company they aren't made available.  We can't even determine what we set as our default browser webpage.
<br/><br/>  
Lotus Notes is our mail client, and it's forced to do the file transfer.  For a while, several of us received training in Lotus Application Development and developed some great database tools for our groups.  IT has removed that ability.  They only support work they developed, and even if you agree to forego support, development by users is not an option.  In the ever expanding cold war with IT, my fellow Engineers and Technicians have now retreated to the MS-Office applications.  Converting our former Lotus Notes apps to Access with VBA has given us power to develop flexible tools...for the time being.  Last week we discovered our ODBC connection between Access and Notes had been disabled.  Another battle in the war.
<br/><br/>  
I've used a desktop computer since 1984 when I was required to own one for Engineering school.  It's insulting to be told by the new MSCE qualified IT kid that if I'm given the ability to change screen resolution on my laptop, I'll just get into trouble down the road.  Please.  I run the same OS on my home machine as an Administrator and never have any problems.  The problem isn't the variety of tools, nor is it the users.  It's the availability. [John Heery]
</blockquote>
<p>
Several other correspondents said the same thing: they'd love to implement the idea, but lack the means to do so. It's ironic but inevitable that the PC, which was originally the information worker's secret weapon in the "ever expanding cold war with IT," has become the raised-floor sanctuary guarded by the priesthood. I can definitely see both sides of that argument. But in this day and age, when anybody can sign up for a free blog site that requires only a vanilla browser to use, I guess I'd ask this of IT: Do you want users to route around you by sharing files insecurely in free services, or would you rather admit that link-addressable filespace on the public web is as essential a tool of modern work as an email address is?
</p>
<p>
Jon Hoover also liked the idea, is in a position to do something about it, and wonders how to bend Exchange to this purpose:
</p>
<blockquote class="personQuote JonHoover">
Just a comment on your "E-mail's many hats" article, which I enjoyed reading. Recently, an "administrative assistant turned graphics and marketing person" in our organization was found to be sending SEVERAL 100-350 MB attachments to users out on the Internet -- via email, of course. This became apparent very quickly as our Pentium 2 333 MHz Exchange 5.5 server choked down that much data, our T1 flooded, and our mail store approached the Exchange file size limit it has been flirting with for quite some time. I instituted a limit policy that very day, which had always been in the back of my mind (for example, what happens when a virus is created expressly for the purpose of filling Exchange mail stores by sending huge attachments to an entire Global Address List when it detects it is on a LAN connected to such an Exchange server -- sending small attachments to other users not directly connected to the server).
<br/><br/>
The problem, of course, was that everyone was sending large files. The limit I instituted for outgoing is now 3.5 MB, incoming at 10 MB. These are, in my opinion, very large limits, but complaints quickly grew. I created a samba share on our network which users could drop files into, making them immediately available through a symlink to our public web server. The URL could then be emailed instead of the actual file.
<br/><br/>
Now, how big of a next step is it to create a form in Exchange which can automatically copy a file into the share, and insert the URL (or URLs) into the email message. Taking it a step further, can the form accept directories to send, zipping them first and copying them to the share? Can it add a password to the zip archive and place it into the body of the message?
<br/><br/> 
Thoughts? I may just have to find a guy in house to put this to task, the more I think about it. [Jon Hoover]
</blockquote>
<p>
I've no experience with Exchange development, but I told Jon I'd float his query here in case somebody has a solution they'd like to share. For a first level of security, the URL contained in the email message could look like this:
<pre>
https://user:password@domain.com/~user/proposal.pdf
</pre>
</p>
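<p>
As a rough illustration of the pass-by-reference step Jon describes (copy the file to a web-visible share, then mail the link), here is a minimal Python sketch. The function name, the share directory, and the base URL are all hypothetical, and it makes no attempt to integrate with Exchange, zip directories, or handle passwords.
</p>

```python
import shutil
import tempfile
from pathlib import Path
from urllib.parse import quote


def publish_by_reference(source, share_dir, base_url):
    """Copy `source` into `share_dir` and return the link to email instead.

    Hypothetical helper: `share_dir` stands in for the Samba share and
    `base_url` for the public web server that exposes it.
    """
    source = Path(source)
    share_dir = Path(share_dir)
    share_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, share_dir / source.name)
    # Percent-encode the file name so spaces and non-ASCII survive in the URL.
    return base_url.rstrip("/") + "/" + quote(source.name)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        attachment = Path(tmp) / "big proposal.pdf"
        attachment.write_bytes(b"placeholder contents")
        link = publish_by_reference(attachment, Path(tmp) / "share",
                                    "https://domain.com/~user")
        print(link)
```

<p>
The returned string is what would go into the message body in place of the attachment. Access control on the share is a separate question that this sketch leaves open.
</p>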
<p>
By the way, I notice that Chad Dickerson is <a href="http://weblog.infoworld.com/dickerson/2004/04/27.html#10.29.40">hiring a developer</a> for InfoWorld. Cool! I'm sure there are lots of other priorities, but maybe we can also task the person to make our own email infrastructure smarter.
</p>

</body>
</item>

<item num="a985">
<title>i18n again</title>
<date>2004/04/27</date>
<body>

<p>
Sam Ruby pinpoints the glitch:
<blockquote class="personQuote SamRuby">
<p>Let's take a closer look into Jon's
<a href="http://weblog.infoworld.com/udell/rss.xml">RSS
feed</a>:</p>
<pre class="code">&lt;title&gt;Active r&amp;amp;#233;sum&amp;amp;#233;s&lt;/title&gt;
</pre>
Arguably, the InfoWorld process <b>did</b> parse the RSS feed,
once. [<a href="http://www.intertwingly.net/blog/1772.html">Sam Ruby</a>]
</blockquote>
I'll be damned. I had forgotten that Radio UserLand's RSS writer runs the title through an encoding routine. That's where the extra level of escaping came from. I had removed the call to the encoder for the body content in my version of the RSS writer, but not for the title. Now it's removed there too, which I <i>think</i> is correct for my situation, but we'll see. 
</p>
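<p>
To make the extra level of escaping concrete, here is a small Python sketch. It is not Radio UserLand's code: the standard library's <tt>escape</tt> function merely stands in for the encoding routine, and the variable and function names are illustrative.
</p>

<![CDATA[
```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# What the author typed: numeric character references for e-acute.
typed = "Active r&#233;sum&#233;s"

# Stand-in for the RSS writer's encoding routine (an assumption, not
# Radio UserLand's actual code): it escapes each ampersand once more.
encoded = escape(typed)


def parsed_title(title_markup):
    """Wrap the markup in a <title> element, parse it, and return the text."""
    return ET.fromstring("<title>" + title_markup + "</title>").text


if __name__ == "__main__":
    print(encoded)                # the doubly escaped form seen in the feed
    print(parsed_title(typed))    # one parse resolves the references
    print(parsed_title(encoded))  # one parse leaves the references as literal text
```
]]>

<p>
Parsing the doubly escaped title once yields the literal reference text rather than the accented letters, which is consistent with Sam's observation that the feed was parsed, once.
</p>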
<p>
Thanks Sam, and apologies to the InfoWorld crew -- it was my fault after all. Clearly Sam's right: we could, indeed, learn a lot from those 13th century artisans. And I guess <a href="http://dubinko.info/blog/2004_04_01_archive.html#108300793527506560">Micah Dubinko</a> would agree.
</p>

</body>
</item>

<item num="a984">
<title>Weinberger's rant</title>
<date>2004/04/27</date>
<body>

<p>
C-SPAN captured David Weinberger's excellent rant yesterday at the <a href="http://www.fieldworksonline.com/techpoliflyer.html">Technology and Politics Summit</a> in DC. The <a href="rtsp://video.c-span.org/project/c04/c04042604_tech.rm">stream</a> is overloaded at the moment, but I captured a clip (<a href="http://weblog.infoworld.com/udell/gems/weinberger.wmv">WinMedia</a>, <a href="http://weblog.infoworld.com/udell/gems/weinberger.mov">QuickTime</a>). 
</p>
<p>
I wish I could say it was easy to do this kind of videoblogging, but it's just not true. What I meant to be a quick, spontaneous thing turned into a chore. It's frustrating, really -- we're so close, yet so far, in terms of being able to sling video clips as easily as we sling text, still images, and even audio. 
</p>

</body>
</item>


<item num="a983">
<title>Radical software customization</title>
<date>2004/04/27</date>
<body>

<p>
The always-interesting Sean McGrath has a great column this week about software customization. He says, in part:
<blockquote class="personQuote SeanMcGrath">
In order to stay sane, most programmers concentrate on the part of the problem they are working on today. As a consequence, their view of what pieces of the functions under development need to be parameterized and which do not, tends to be a quite low level. Indeed, most of the items programmers will choose to parameterize will amount to double dutch to the business analysts. [<a href="http://www.itworld.com/nl/ebiz_ent/04272004/">Sean McGrath: The mysteries of flexible software</a>] 
</blockquote>
In the companion <a href="http://seanmcgrath.blogspot.com/archives/2004_04_25_seanmcgrath_archive.html#108305574138645334">blog entry</a> Sean gives the example of a Jython script that he used, instead of an XML configuration file, to parameterize a piece of software. It illustrates, by example, one of the points I tried to make in my recent <a href="http://www.itconversations.com/transcripts/117/transcript117-1.html">IT Conversations</a> interview with Doug Kaye. Dynamic languages are a great way to record data when a solution is fluid and requirements are evolving. And, come to think of it, when aren't those things true?
</p>
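<p>
For readers who haven't seen configuration expressed as code, here is a minimal sketch of the general idea in plain Python. It is not Sean's Jython script; every name in it (<tt>CONFIG_SOURCE</tt>, <tt>load_config</tt>, <tt>accept</tt>) is hypothetical. The point is only that a script can hold derived values and small rules that a static XML file would need extra machinery to express.
</p>

```python
# Hypothetical configuration written as code rather than as an XML file.
# Every name here is illustrative; this is not Sean McGrath's script.
CONFIG_SOURCE = '''
allowed_regions = ["EU", "US"]
retries = 3
timeout_seconds = retries * 10   # a derived value, not a literal

def accept(record):
    """A rule expressed directly as logic."""
    return record["region"] in allowed_regions
'''


def load_config(source):
    """Execute configuration source and return the names it defines."""
    namespace = {}
    exec(source, namespace)
    # Drop interpreter-added entries such as __builtins__.
    return {k: v for k, v in namespace.items() if not k.startswith("__")}


if __name__ == "__main__":
    config = load_config(CONFIG_SOURCE)
    print(config["timeout_seconds"])
    print(config["accept"]({"region": "EU"}))
```

<p>
Because <tt>exec</tt> runs whatever it is given, this approach only suits configuration that comes from a trusted source.
</p>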
<p>
Closely related to this theme are the tools and frameworks for capturing and manipulating business rules. A while back I wrote a column on the subject, and James Owen -- a seasoned user of the various rules engines -- wrote to me about it. After a bit of back and forth I recruited him to review this class of product for InfoWorld, and he's produced a series of articles: <a href="http://www.infoworld.com/article/03/09/12/36TCjrules_1.html">JRules</a>, <a href="http://www.infoworld.com/article/04/01/16/03TCblaze_1.html">Blaze Advisor</a>, <a href="http://www.infoworld.com/article/04/03/12/11TCopsj_1.html">Jess and OPSJ</a>. 
</p>
<p>
I'm also quite curious to see what Microsoft will make of Ward Cunningham's ideas and techniques. I interviewed Ward in <a href="http://weblog.infoworld.com/udell/2003/02/13.html">Refactoring the business</a> and, in my <a href="http://weblog.infoworld.com/udell/2003/08/04.html">blog companion</a> to our feature on <a href="http://www.infoworld.com/article/03/08/01/30FEtestmain_1.html">test-driven development</a>, he talks about the <a href="http://fit.c2.com/wiki.cgi?WhatsWhat">FIT framework</a> that he's used to push testable business logic into spreadsheets that business analysts can make and use.
</p>
<p>
We can all agree that software must be customizable. But when programmers alone decide how users can do things, you often end up with a scenario like <a href="http://weblog.infoworld.com/udell/2004/03/02.html">Aunt Tillie's OS X adventure</a>: a dashboard packed with incomprehensible dials and knobs. If the dashboard was built with a dynamic language, the programmer can at least rearrange the controls more quickly and more easily. But the rules engines that James Owen has been writing about, and the FIT framework that Ward Cunningham has created, point toward a radically altered relationship between software makers and software users. It can't happen too soon.
</p>

</body>
</item>

<item num="a982">
<title>13th century standards</title>
<date>2004/04/26</date>
<body>

<p>
<a href="http://www.duke.edu/religion/graphic/graphic.html"><img width="305" height="226" align="right" hspace="6" vspace="6" src="http://www.duke.edu/religion/chartres.jpg"/></a>
Traveling in France in 2001, I visited Chartres Cathedral and was lucky enough to show up in time for <a href="http://www.artagogo.com/commentary/miller/miller.htm">Malcolm Miller's</a> lecture. Seemingly unchanged from the last time I'd seen him, in 1978, Miller again made the architecture and stained glass come alive in his inimitable way. This time, though, I heard something I hadn't the first time -- about standards. When the construction project drew in artisans from the 13th-century French countryside, the first order of business was to agree on standard weights and measures. I wonder what those negotiations were like!
</p>
<p>
It all seemed kind of quaint until, a couple of days later, I found myself in an Internet cafe struggling with a French keyboard. The @ symbol was the showstopper. I finally abandoned typing and, feeling ridiculous, copied the symbol from a web page and pasted it into the email message I was composing.
</p>
<p>
What reminded me of all this was the title of <a href="http://weblog.infoworld.com/udell/2004/04/22.html#a981">last Thursday's entry</a>: "Active r&#233;sum&#233;s." To be honest, I took the lazy route at first and wrote it as "Active resumes" because I knew that using a LATIN SMALL LETTER E WITH ACUTE would likely cause some problems. But then, mindful of Sam Ruby's recent <a href="http://intertwingly.net/stories/2004/04/14/i18n.html">admonition</a> to test international characters "in every nook and cranny you can find," I went with the correct spelling. 
</p>
<p>
Since I write in XML, my input strategy was to use numeric references, which meant typing this string of characters: "r&amp;#233;sum&amp;#233;s" -- and that's exactly what showed up on the InfoWorld home page when the item was excerpted there. Evidently the process that creates those excerpts is reading, but not parsing, RSS feeds. 
</p>
<p>
The item itself displayed correctly, but other subtleties emerged. For example, Technorati and Feedster produce hits when searching for the wrong spelling (<a target="search" href="http://www.technorati.com/cosmos/search.html?rank=&amp;url=Active+resumes">T</a>, <a target="search" href="http://www.feedster.com/search.php?hl=en&amp;ie=UTF-8&amp;q=Active+resumes">F</a>) but not when searching for the right one (<a target="search" href="http://www.technorati.com/cosmos/search.html?rank=&amp;url=Active+r%C3%A9sum%C3%A9s">T</a>, <a target="search" href="http://www.feedster.com/search.php?hl=en&amp;ie=UTF-8&amp;q=Active+r%C3%A9sum%C3%A9s">F</a>). (<b>Update:</b> Hmm. Technorati does find <a target="search" href="http://www.technorati.com/cosmos/search.html?url=active+r%C3%A9sum%C3%A9">active r&#233;sum&#233;</a>, though. So does <a target="search" href="http://www.google.com/search?hl=en&amp;ie=UTF-8&amp;oe=UTF-8&amp;q=%22active+r%C3%A9sum%C3%A9%22">Google</a>, but it finds a lot more instances of <a target="search" href="http://www.google.com/search?hl=en&amp;ie=UTF-8&amp;oe=UTF-8&amp;q=%22active+resume%22">active resume</a>.)
</p>
<p> I discovered that my own XPath search does <a target="search" href="http://udell.infoworld.com:8000/?/blog/item/title[contains(.,%20'r%C3%A9sum%C3%A9s')]">find the entry</a>, though entering the search term presents a bit of a challenge. Copying an instance of 'r&#233;sum&#233;' into the search form works, as does the extra-geeky method of writing the URL-encoded version ('r%C3%A9sum%C3%A9s') directly into the URL. But the resulting display was wrong, until I switched the browser's text encoding to UTF-8. I guess I should have my search server emit the appropriate UTF-8 header.
</p>
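<p>
Here is a short Python sketch of the two pieces involved: percent-encoding the UTF-8 bytes of the search term, and declaring UTF-8 in a response header. The header tuple shows only the value a server would send; how a particular search server sets its headers is not shown. The last line illustrates one plausible cause of the wrong display, namely UTF-8 bytes being read as Latin-1.
</p>

```python
from urllib.parse import quote, unquote

# The search term, written with escapes so this listing stays ASCII-only.
term = "r\u00e9sum\u00e9s"

# Percent-encode the UTF-8 bytes of the term, as in the URL-encoded form.
encoded = quote(term)

# A response header that tells the browser which encoding to expect.
content_type = ("Content-Type", "text/html; charset=utf-8")

# What the same bytes look like if they are decoded as Latin-1 instead.
mojibake = term.encode("utf-8").decode("latin-1")

if __name__ == "__main__":
    print(encoded)
    print(unquote(encoded) == term)
    print(": ".join(content_type))
    print(ascii(mojibake))
```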
<p>
Sam's essay points to a <a href="http://www.joelonsoftware.com/articles/Unicode.html">Joel Spolsky article</a> that is the single most lucid treatise I've seen on the subject of internationalization. We've come a long way with Unicode, but there's still some distance to go. Chartres Cathedral still stands, so apparently those 13th-century carpenters and stonemasons got things sorted out reasonably well. I trust we will too. 
</p>

</body>
</item>

<item num="a981">
<title>Active r&#233;sum&#233;s</title>
<date>2004/04/22</date>
<body>

<p>
Today's New York Times includes a <a href="http://www.nytimes.com/2004/04/22/technology/circuits/22diar.html">brief article</a> on music blogging. The story links to <a href="http://www.webjay.com">Webjay</a> and quotes <a href="http://www.gonze.com/weblog">Lucas Gonze</a> and <a href="http://www.pmbrowser.info/hublog/">Alf Eaton</a>. I've written three recent entries about this phenomenon: <a href="http://weblog.infoworld.com/udell/2004/03/15.html#a945">The media-player fireswamp</a>, <a href="http://weblog.infoworld.com/udell/2004/03/30.html#a959">Blogs + playlists = collaborative listening</a>, and <a href="http://weblog.infoworld.com/udell/2004/04/14.html#a972">Networks of shared experience</a>. My fascination with the topic may seem like a diversion from my usual themes, and in a way it is, but I think the issues transcend music, copyright, and the RIAA.
</p>
<p>
Alf Eaton writes today:
<blockquote class="personQuote AlfEaton">
I think the MP3 blogs (which are essentially annotated playlists) might well be taking the middle ground in the P2P vs music industry wars - I hope that the record industry will begin to see the value in what these grassroots enthusiasts are doing to promote their music. On the other hand, a large part of making these playlists under current laws involves turning your back on the major labels and concentrating on the music libre, the 'free music', the stuff that wants to be shared. Those artists that make their tracks freely available online are the ones that will benefit most from the collaborative filtering and recommendation networks that are being set up. [<a href="http://www.pmbrowser.info/hublog/archives/000802.html">Hublog</a>]
</blockquote>
Let's extend that remark: Any professional whose work is visible on the Net will become part of the conversation that establishes reputation and creates opportunity. The blog is an <i>active r&#233;sum&#233;</i> that enables you to participate -- by proxy -- in that conversation.
</p>
<p>
What an active r&#233;sum&#233; should include will vary by profession and according to personal inclination. For a musician, a couple of complete tracks from each CD. For a home renovator, photos and write-ups of some completed projects -- and for extra credit, video walkthroughs. For a programmer, links to those of your applications, tools, or specifications that touch the public domain.</p>
<p>
Here's the bottom line. What Alf calls "collaborative filtering and recommendation networks" will rival -- and my guess is, largely supplant -- conventional marketing and promotion. But if those networks can't find you, they won't be able to help you.
</p>

</body>
</item>

<item num="a980">
<title>Ending email forgery</title>
<date>2004/04/21</date>
<body>

<p>
<blockquote>
In our July 18 feature, <a href="http://www.infoworld.com/article/03/07/18/28FEspam_1.html">Canning Spam</a> we mentioned an Internet draft proposal from Hadmut Danisch, called <a href="http://www.ietf.org/internet-drafts/draft-danisch-dns-rr-smtp-03.txt">RMX</a> (Reverse Mail eXchange). <b>It was an elaboration of an earlier proposal by Paul Vixie, architect of BIND (Berkeley Internet Name Domain), who in turn attributes the idea to Jim Miller of JCM Consulting.</b> The idea is elegantly simple. In addition to publishing the MX (Mail Exchange) DNS records that identify inbound mail hosts, an organization also publishes reverse MX records that identify outbound hosts. A receiving server queries the DNS to find out if the sending host is so authorized. The name yahoo.com is easy to forge, but the IP addresses of Yahoo's outbound servers are not.
<br/><br/>
The devil's always in the details, of course. It's remarkably difficult to define exactly what "sender" means in today's complex e-mail environment. Three current proposals -- pobox.com's <a href="http://spf.pobox.com">SPF</a> (originally Sender Permitted From, now Sender Policy Framework), Microsoft's <a href="http://www.microsoft.com/mscorp/twc/privacy/spam_callerID.mspx">Caller ID for E-Mail</a>, and Yahoo's DomainKeys (unpublished) -- take differing approaches. [Full story at <a href="http://www.infoworld.com/article/04/04/16/16FEfutureforgery_1.html">InfoWorld.com</a>]
</blockquote>
As part of this week's cover story on <a href="http://www.infoworld.com/infoworld/article/04/04/16/16FEfuturemail_1.html">email's future</a>, my piece explores the current crop of sender authorization proposals. The boldfaced sentence didn't appear in the printed article. I resurrect it here to help set the record straight. In <a href="https://lists.lab.net/archive/nanog-exploder/Week-of-Mon-20030825/000203.html">this mailing list message</a>, Paul Vixie, responding to a posting that mentions the RMX/SPF idea, says: "Fine idea. Thank Jim Miller for it when you see him."  Jim and I have never met, but I did track him down in order to establish that he's the sole proprietor of JCM Consulting. So thanks, Jim! Even though your sentence wound up on the cutting room floor, I've put it back where Google can find it.
</p>
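<p>
The core check that these proposals share can be sketched in a few lines of Python. This is an illustration of the idea only: a dictionary stands in for the DNS lookup, the domain and addresses come from ranges reserved for documentation, and none of the record formats defined by RMX, SPF, Caller ID, or DomainKeys is modeled.
</p>

```python
import ipaddress

# Hypothetical stand-in for DNS: the outbound networks each domain publishes.
# The real proposals define their own record formats and lookups; the
# addresses below come from documentation-only ranges.
PUBLISHED_OUTBOUND = {
    "example.com": ["192.0.2.0/28", "198.51.100.25/32"],
}


def sender_is_authorized(domain, connecting_ip, published=PUBLISHED_OUTBOUND):
    """Return True if connecting_ip is inside a network the domain published."""
    address = ipaddress.ip_address(connecting_ip)
    networks = published.get(domain, [])
    return any(address in ipaddress.ip_network(net) for net in networks)


if __name__ == "__main__":
    # The claimed domain is easy to forge; the connecting address is not.
    print(sender_is_authorized("example.com", "192.0.2.5"))
    print(sender_is_authorized("example.com", "203.0.113.9"))
```

<p>
In this sketch a domain that publishes nothing is simply treated as unauthorized. A real policy would have to decide how to handle that case, which is one of the details on which the proposals differ.
</p>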
<p>
Here are some clips from my interview with Eric Allman. First, Eric <a target="_new" href="http://weblog.infoworld.com/udell/gems/ericAllman01.mp3">explains</a> why Sendmail Inc. is implementing DomainKeys in preference to the other schemes. Then, Eric and I <a target="_new" href="http://weblog.infoworld.com/udell/gems/ericAllman02.mp3">discuss crypto and the end-to-end principle</a>, relative to DomainKeys.
</p>

</body>
</item>

<item num="a979">
<title>Middleware dark matter</title>
<date>2004/04/20</date>
<body>

<p>
Steve Vinoski, middleware architect at IONA and a prolific columnist, has been blogging for a couple of months at <a href="http://www.iona.com/blogs/vinoski/">Middleware Matters</a>. Back in 2002, his IEEE Internet Computing column used the title that I stole for this blog entry: <a href="http://www.iona.com/hyplan/vinoski/pdfs/IEEE-Middleware_Dark_Matter.pdf">Middleware Dark Matter</a>. The reference is to Clay Shirky's excellent meme "PCs are the dark matter of the Internet," which helped the peer-to-peer movement define itself circa 2000. Vinoski wrote:
<blockquote class="personQuote SteveVinoski">
We can apply a similar analogy to middleware because the mass of the middleware universe is much greater than the systems -- such as message-oriented middleware (MOM), enterprise application integration (EAI), and application servers based on Corba or J2EE -- that we usually think of when we speak of middleware. We tend to forget or ignore the vast numbers of systems based on other approaches. We can't see them, and we don't talk about them, but they're out there solving real-world integration problems -- and profoundly influencing the middleware space. These systems are the dark matter of the middleware universe. [<a href="http://www.iona.com/hyplan/vinoski/pdfs/IEEE-Middleware_Dark_Matter.pdf">Steve Vinoski</a>]
</blockquote>
</p>
<p>
Absolutely true. When I read this, though, I couldn't help but imagine the same column having been written, for another audience, like so:
<blockquote class="personQuote SteveVinoski">
The mass of the middleware universe is much greater than the systems -- based on Perl, Python, CGI, FTP file transfer, Unix shell, Visual Basic  -- that we usually think of when we speak of middleware. We tend to forget or ignore the vast numbers of systems based on other approaches such as message-oriented middleware (MOM), enterprise application integration (EAI), and application servers based on Corba or J2EE. We can't see them, and we don't talk about them, but they're out there solving real-world integration problems -- and profoundly influencing the middleware space. These systems are the dark matter of the middleware universe. 
</blockquote>
</p>
<p>
Both of these passages make perfect sense to me. Though driven apart by a deep cultural schism, the two integration styles are utterly co-dependent.
</p>


</body>
</item>


<item num="a978">
<title>Betty Dylan</title>
<date>2004/04/20</date>
<body>

<p>
<a href="http://www.bettydylan.com"><img align="right" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/bettyDylan.jpg"/></a>
A brief special announcement for folks living near the intersection of New Hampshire, Vermont, and Massachusetts. The Nashville duo called <a href="http://www.bettydylan.com/">Betty Dylan</a>, whose signature tune <a target="_new" href="http://www.bettydylan.com/mp3/Amtrash/AmericanTrash.mp3">American Trash</a> has been percolating through the <a href="http://webjay.org/related/judell/test">Webjay playlists</a>, will be returning to Keene, NH, on Thursday, April 22. Where: <a href="http://www.someplacesdifferent.com/eflane-directions.htm">E.F. Lane hotel</a> on Main Street. When: Happy hour, 5PM. I'll be there!
</p>
<p>
And now back to your regularly scheduled program...
</p>

</body>
</item>

<item num="a977">
<title>Proxy power</title>
<date>2004/04/19</date>
<body>

<p>
<blockquote>
One of these years, my bank will upgrade to a new system that's built around Web services. They'll probably offer a basic "rich Internet application" -- for Windows, Java, or Flash -- that connects to those services. When the bank announces the upgrade, it will stress the richer user experience and choice of interchangeable clients.
<br/><br/>
Those will be crucial benefits indeed. What won't be said, because it's harder to explain, is that the system will also have become radically extensible. Suppose I want to trigger an alert when a transfer exceeds some limit or when a duplicate amount appears. Today, if the system doesn't implement these rules, I'm stuck. In a services-oriented environment, though, I needn't depend on either the bank or my client software. If neither delivers the features I want, I'll inject an intermediary that does. Local proxies are geeky curiosities today, but someday we'll wonder how we lived without them. [Full story at <a href="http://www.infoworld.com/article/04/04/16/16OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
As mentioned in this week's column, I've been experimenting with a local Web proxy that XHTML-izes and transforms Web pages on the fly. Here's an example:
</p>
<p>
<a target="_new" href="http://weblog.infoworld.com/udell/gems/plainProxy.jpg"><img width="368" height="227" src="http://weblog.infoworld.com/udell/gems/plainProxy.jpg"/></a>
</p>
<p>
In this screenshot, Firefox is pulling this week's InfoWorld column through a proxy based on the one included in the <a href="http://www.twistedmatrix.com/products/twisted">Twisted</a> framework for Python. Inside the proxy, I'm using <a href="http://www.egenix.com/files/python/mxTidy.html">mxTidy</a> to convert the text of the page to XHTML. Then I'm using libxml2's XPath search to find just the paragraph elements with the class attribute <i>ArticleBody</i>, and rewriting the page to include only those elements.
</p>
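<p>
The filtering step can be sketched in a few lines. This is not the code running inside my proxy: it swaps in the standard library's ElementTree for libxml2, skips the Tidy pass, and runs against a made-up page (<tt>SAMPLE_XHTML</tt> and <tt>keep_article_body</tt> are names invented for the illustration). It only shows the idea of keeping the <i>ArticleBody</i> paragraphs and discarding everything else.
</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical, already-tidied XHTML standing in for a real article page.
SAMPLE_XHTML = """<html><body>
<p class="Ad">Buy now!</p>
<p class="ArticleBody">First paragraph of the column.</p>
<div><p class="ArticleBody">Second paragraph.</p></div>
<p class="Footer">Copyright notice</p>
</body></html>"""


def keep_article_body(xhtml):
    """Return a new page containing only the p elements of class ArticleBody."""
    root = ET.fromstring(xhtml)
    # ElementTree's limited XPath is enough for an attribute-equality match.
    wanted = root.findall(".//p[@class='ArticleBody']")
    page = ET.Element("html")
    body = ET.SubElement(page, "body")
    for paragraph in wanted:
        paragraph.tail = "\n"
        body.append(paragraph)
    return ET.tostring(page, encoding="unicode")


filtered = keep_article_body(SAMPLE_XHTML)
print(filtered)
```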
<p>
It's kind of a parlor trick, I'll admit. But realtime XML transformation of Web pages could have applications that go way beyond ad blocking. Suppose I store all my XML-convertible Web content in an XML database. (Some stuff can't be XHTML-ized, but it turns out a lot can.) It's just text, after all, and I bet a year's worth of content is a drop in the bucket compared to a typical MP3 collection. 
</p>
<p>
Given such a database, the on-the-fly filter could do some clever correlation. Suppose that for the pages I read -- and maybe also for each link in those pages -- the filter extracts URLs, queries the database for elements that mention those URLs, and rewrites the current page with links to the query output. Voila! Instant context. 
</p>
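<p>
Here's a rough sketch of that correlation, under generous assumptions: the "database" is just a Python dictionary of stored XHTML fragments, the pages are already well-formed, and the names (<tt>STORED</tt>, <tt>extract_urls</tt>, <tt>related_items</tt>) are invented for the example. A real XML database would answer the same question with a query instead of a loop.
</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the XML database: title -> stored XHTML fragment.
STORED = {
    "Note on proxies": '<div><a href="http://example.org/a">a</a></div>',
    "Unrelated note": '<div><a href="http://example.org/z">z</a></div>',
}


def extract_urls(xhtml):
    """Collect the href of every a element in a well-formed XHTML string."""
    root = ET.fromstring(xhtml)
    return {a.get("href") for a in root.iter("a") if a.get("href")}


def related_items(page_xhtml, stored=STORED):
    """Map each URL on the page to the stored items that also link to it."""
    context = {}
    for url in sorted(extract_urls(page_xhtml)):
        hits = [title for title, doc in stored.items() if url in extract_urls(doc)]
        if hits:
            context[url] = hits
    return context


page = (
    '<html><body><a href="http://example.org/a">one</a> '
    '<a href="http://example.org/b">two</a></body></html>'
)
context = related_items(page)
print(context)
```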
<p>
I don't yet know if this will be practical, and in fact my XML.com column is late this month because I haven't figured that out yet. But it's an exciting idea. We have a surplus of storage and processing power on the desktop, but never enough useful context. When more of our data flows are XML, local proxies will really shine. Even now, though, they can do more than you might think.
</p>

</body>
</item>


<item num="a976">
<title>Always-on identification</title>
<date>2004/04/18</date>
<body>

<p>
<a href="http://www.mvtec.com/halcon/applications/surveillance/"><img vspace="6" hspace="6" align="right" src="http://www.mvtec.com/halcon/applications/surveillance/ubahn.gif"/></a>
David Weinberger's recent essay, <a href="http://www.hyperorg.com/backissues/joho-apr15-04.html">There's No "I" in "Identity"</a>, advances a notion of real-world identity that's so different from mine I had to sort out why. David writes:
<blockquote class="personQuote DavidWeinberger">
In the real world, we don't identify everyone. We only identify those about whom we have doubts that we have to resolve for some purpose. Identifying is not the default in the real world. Nor, IMO, should it be online. [<a href="http://www.hyperorg.com/backissues/joho-apr15-04.html">JOHO</a>]
</blockquote>
Compare this with the following memorable quote from Bruce Schneier's <a href="">Secrets and Lies</a>:
<blockquote class="personQuote BruceSchneier">
Authentication is about the continuity of relationships, knowing who to trust and who not to trust, making sense of a complex world. Even nonhumans need authentication: smells, sound, touch. Arguably, life itself is an authenticating molecular pit of enzymes, antibodies, and so on.
</blockquote>
I remember this quote because I included it in my <a href="http://udell.roninhouse.com/bytecols/2000-10-18.html">review</a> of the book, which I continue to think is not only Schneier's best book, but also the best book I've ever read on the topic of security.
</p>
<p>
Distinguishing between self and other is what every living organism does, all the time. So is identifying others. Humans are hardwired to recognize faces, voices, gaits. We do it always and automatically. Perhaps so automatically that we don't notice, for the most part, that we are doing it. When my teenage daughter comes downstairs there's rarely any ambiguity about who she is. (Though there can be; sometimes it's one of her friends.) But at 100 yards, watching someone who might be her walking up the street, identification becomes a foreground task. Is that her gait? Her hairstyle? Her clothing? Once these questions are asked, it becomes imperative to answer them. 
</p>
<p>
Suppose she has just returned from shopping downtown, where she made a cash purchase in a store. We might be inclined to call this an anonymous transaction. There was no need for identification, so none presumably occurred. Except that's not really true. If she paid with a twenty-dollar bill and forgot to pick up her change, odds are she can return to the store and collect it. The store clerk's lizard-brain will authenticate her face, her voice, her gait. Or, what's becoming increasingly likely, the store's surveillance camera will. 
</p>
<p>
Of sci-fi's three "killer B's" (Gregory Benford, David Brin, Greg Bear), the one most often cited in discussions of identity and privacy is David Brin, whose book <a href="http://www.amazon.com/exec/obidos/ASIN/0738201448/">The Transparent Society</a> I can't recommend too often. But I think it was Gregory Benford who, in his 2000 <a href="http://www.wired.com/news/technology/0,1282,37610,00.html">keynote talk</a> at the O'Reilly Open Source Convention, said that we "shed data trails" as we move through the real world, just as surely as we do when we move through cyberspace.
</p>
<p>
With cameras proliferating in meatspace and blogs pervading cyberspace, it's getting harder and harder to distinguish between "real" and "virtual" data trails. Does it matter? David and I agree that it doesn't. We're 180 degrees apart on the default case, though. I think identification defaults to always-on.
</p>

</body>
</item>

<item num="a975">
<title>The participant/narrator: owning the role</title>
<date>2004/04/16</date>
<body>

<p>
The XML-Deviant column at O'Reilly's XML.com (<a href="http://www.xml.com/pub/at/17">index</a>, <a href="http://www.xml.com/feeds/columns/?x-col=17">rss</a>), which began in January 2000, would have been called a blog had the term been more current then. Written first by <a href="http://www.xml.com/pub/au/15">Leigh Dodds</a> and now by <a href="http://www.xml.com/pub/au/92">Kendall Grant Clark</a>, the concept was a brilliant one. Recruit literate developers who participate in key mailing lists (Dodds: xml-dev, Grant Clark: W3C Technical Architecture Group), and have them publish reports that summarize and comment on weekly activity. 
</p>
<p>
This is a potent form of communication. For people who lack the time to closely monitor activity in some area, these bulletins are a way to keep a finger on the pulse. For the participant/narrator, they're a way to build personal brand and -- perhaps -- influence the agenda.
</p>
<p>
It's been clear to me for a long time that the participant/narrator, armed with easy-to-use Web publishing technology (aka blog tools), will be a key player on every professional and civic team. A couple of years ago I sketched out how blog narrative can work as a <a href="http://udell.roninhouse.com/bytecols/2001-05-24.html">professional project management tool</a>. Just today, I learned of a great example from the realm of civics. Not co-incidentally, it involves another XML.com regular, <a href="http://www.xml.com/pub/au/82">Simon St. Laurent</a>. 
</p>
<p>
Simon lives in Varna, NY, which is between Ithaca and the town of Dryden, whose Democratic Committee he now chairs. Today's Ithaca Journal fills in the backstory:
<blockquote>
St. Laurent can be seen, notebook and digital camera in tow, at Planning Board and Conservation Advisory Council gatherings, as well as at special meetings on fire departments, speeding and comprehensive plans. So I admit, my curiosity was piqued. What could motivate this seemingly normal man to submit himself to hours of political talk and legalese? Talk that even elicits occasional groans from those delivering it. Turns out, it's all in the name of a blog -- <a href="http://simonstl.com/dryden/">http://simonstl.com/dryden/</a>.
<br/><br/>
"I volunteered with the local Democratic party in the last elections and made some calls for them. People would ask me questions and I'd have partial answers and they'd have partial answers. It seemed like an opportunity to learn more about what was going on and to help the person on the other end of the phone."
<br/><br/>
So on Nov. 6, St. Laurent launched his Dryden site. Six months later, he hasn't missed a posting. [<a href="http://www.theithacajournal.com/news/stories/20040416/localnews/240404.html">Ithaca Journal</a>]
</blockquote>
</p>
<p>
Now that the hype about political blogs has died down, it's clear that this is the real deal: a grassroots effort to connect a political process to itself, to its constituency, and to the outside world. No fanfare, just steady and reliable information flow.
</p>
<p>
Every team can benefit from this approach. By <a href="http://archive.scripting.com/2002/04/03#fromMyInstantOutline">narrating the work</a>, as Dave Winer once put it, we clarify the work. There can be more than one narrator, but it makes sense to have one team member own the primary role just as other members own other roles.
</p>

</body>
</item>


<item num="a974">
<title>SafariBox</title>
<date>2004/04/16</date>
<body>

<p>
<table border="0" align="right" cellpadding="10" cellspacing="0">
<tr><td>
<script src="http://safari.oreilly.com/safaribox.asp?v=s&amp;t=0&amp;q=javascript&amp;j=1">
</script>
</td></tr>
</table>
The new device in the right-hand column of my template is a SafariBox -- it's like the GoogleBox, but for <a href="http://safari.oreilly.com">Safari Books Online</a>. Disclosure: I <a href="http://www.oreilly.com/news/udell_0301.html">helped design</a> Safari and sometimes still advise the project, though rarely nowadays. I'm using the SafariBox here because I enjoy being reminded about books, and -- as with the GoogleBox -- because I enjoy making serendipitous search-driven connections.
</p>
<p>
To receive HTML from the SafariBox, use a URL like this:
<pre class="code url">
http://safari.oreilly.com/safaribox.asp?
  v=s&amp;t=0&amp;q=javascript
</pre>
</p>
<p>
<a href="http://safari.oreilly.com/safaribox.asp?v=s&amp;t=0&amp;q=javascript">Try it.</a>
</p>
<p>
If your blog software can make HTTP calls at page-construction time, you can use this version to dynamically generate SafariBox content into your statically served pages. I'm doing that in a Radio UserLand macro, for example.
</p>
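<p>
My macro is written for Radio UserLand, but the page-construction-time approach looks roughly like this in Python. Treat it as a sketch: the function names are invented, the parameter values are simply the ones shown in the URL above (I'm not documenting what <tt>v</tt> and <tt>t</tt> mean), and decoding the response as ISO-8859-1 is an assumption.
</p>

```python
from urllib.parse import urlencode
from urllib.request import urlopen

SAFARIBOX = "http://safari.oreilly.com/safaribox.asp"


def safaribox_url(query, as_javascript=False):
    """Build a SafariBox URL; j=1 asks for JavaScript instead of HTML."""
    # v=s and t=0 are copied from the example URL in the post.
    params = [("v", "s"), ("t", "0"), ("q", query)]
    if as_javascript:
        params.append(("j", "1"))
    return SAFARIBOX + "?" + urlencode(params)


def fetch_safaribox_html(query, timeout=10):
    """Fetch the HTML fragment at page-construction time (needs network access)."""
    with urlopen(safaribox_url(query), timeout=timeout) as response:
        # Assumption: the service answers in ISO-8859-1.
        return response.read().decode("iso-8859-1")


print(safaribox_url("javascript"))
```

<p>
The blog tool would call <tt>fetch_safaribox_html</tt> once per page build and paste the result into the template, so readers get static HTML with no script required.
</p>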
<p>
Alternatively, you can let client-side JavaScript handle things at page-load time. It's the same strategy that, as I mentioned <a href="http://weblog.infoworld.com/udell/2004/04/13.html#a971">the other day</a>, could enable a Technorati trackback counter. 
</p>
<p>
To receive JavaScript that writes the SafariBox HTML, tack on j=1 and wrap the URL in a SCRIPT tag, like this:
<pre class="code html">
&lt;script 
  src="http://safari.oreilly.com/safaribox.asp?
  v=s&amp;t=0&amp;q=javascript&amp;j=1">
&lt;/script>
</pre>
</p>
<p>
That's the method I'm using in this post. There's a character encoding/decoding glitch, by the way. If you see "Top?5" instead of "Top 5", the question mark was originally, I think, a Unicode non-breaking space -- U+00A0. My Radio UserLand macro sees it as \xA0, and converts it to a space. Firefox renders it as a question mark when using the UTF-8 encoding, but as a space when you switch to ISO-8859-1. MSIE, though, seems to render it as a space using either encoding. Go figure.
</p>
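<p>
If my guess about the non-breaking space is right, the glitch is easy to reproduce. The sketch below assumes the service sends the single byte \xA0: read as ISO-8859-1 that byte is U+00A0, but on its own it isn't valid UTF-8, so a UTF-8 decoder substitutes a replacement character, which is presumably what shows up as the question mark.
</p>

```python
# The bytes as I believe they arrive: "Top", a lone 0xA0 byte, then "5".
raw = b"Top\xa05"

# Read as ISO-8859-1, 0xA0 is U+00A0, the non-breaking space.
as_latin1 = raw.decode("iso-8859-1")

# Read as UTF-8, a lone 0xA0 is invalid and gets replaced with U+FFFD.
as_utf8 = raw.decode("utf-8", errors="replace")

# What my macro does, in effect: turn the non-breaking space into a plain one.
normalized = as_latin1.replace("\u00a0", " ")

print(repr(as_latin1), repr(as_utf8), repr(normalized))
```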


</body>
</item>


<item num="a973">
<title>Donkey adoptions</title>
<date>2004/04/15</date>
<body>

<p>
If there isn't a place on the web that collects ad-targeting misfires, there should be. And here's an entry for it:
</p>
<p>
<a href="http://weblog.infoworld.com/udell/gems/donkeyAdoptions.jpg"><img width="401" height="294" src="http://weblog.infoworld.com/udell/gems/donkeyAdoptions.jpg"/></a>
</p>
<p>
This is the article: <a href="http://www.linux-mag.com/2004-01/europe_01.html"> GNU/Linux is changing the face of the New Europe</a>. Could Google have been thinking about <a href="http://packages.debian.org/stable/net/donkey">donkey</a>, a password calculator? Was there an <a href="http://www.enchantedlearning.com/subjects/mammals/classification/Ungulates.shtml">ungulate</a>-based connection: gnu &lt;-> donkey, via Linux adoption?
</p>
<p>
Anyway, it's even funnier than Amazon's <a href="http://weblog.infoworld.com/udell/2002/11/08.html#a505">Customers who shopped for this book also wear clean underwear</a>. Keep 'em coming!
</p>

</body>
</item>


<item num="a972">
<title>Networks of shared experience</title>
<date>2004/04/14</date>
<body>

<p>
Jefferson Provost has written a thoughtful essay on music sharing as viral marketing. He writes, in part:
<blockquote class="personQuote JeffersonProvost">
The big issue here is how serious music fans decide what music to buy. I'm talking about the people who maintain large CD collections and spend a lot of money on music -- the customers that the music industry should be holding close to their hearts. These people not only spend a lot of money themselves, but they influence their less musically-inclined friends. These people tend to have idiosynchratic tastes, and are picky to the point of snobbishness. They don't buy music based on music industry mass-marketing. They buy it based on hearing it and liking it, and the way they hear new music is by sharing it with friends. Radio used to play a part, too, but consolidation has turned music radio into a steaming pile of crap, so what's left? Networks of like-minded friends sharing music are what's left. [<a href="http://jefferson.blogs.com/jp/2004/04/music_sharing_i.html">Jefferson Provost</a>]
</blockquote>
Over the past weeks, I've been watching -- and participating in -- a <a href="http://weblog.infoworld.com/udell/2004/03/30.html">fascinating experiment</a> that aims to recreate the process of collaborative discovery that was Napster's greatest achievement. Ever heard of <a href="http://music.mp3lizard.com/heavyconfetti/">HeavyConfetti</a>? Me neither, but I'm listening to <a href="http://www.heavyconfetti.com/tito.html">Tito</a> now on <a href="http://webjay.org/by/norelpref/titoacousticguitarcd">Webjay</a>. The MP3 versions of these smooth Pat Metheny-inspired acoustic guitar tracks are licensed under Creative Commons. HeavyConfetti's <a href="http://www.heavyconfetti.com/hcstore.html">e-commerce backend</a>, somewhat puzzlingly, turns out to be: "email me and we can work something out." Maybe they should sign up with <a href="http://www.magnatune.com/">Magnatune</a>, which has worked out a friendly but less casual purchasing model:
</p>
<p align="center">
<font face="Verdana, Arial, utopia, sans-serif" size="2" color="#666666">        How much do you want to pay? <br/>
        <select name="amount" size="1"><option value="5">$5</option><option value="6">$6</option><option value="7">$7</option><option selected="selected" value="8">$8 (recommended)</option><option value="9">$9</option><option value="10">$10</option><option value="11">$11</option><option value="12">$12</option><option value="13">$13</option><option value="14">$14</option><option value="15">$15</option><option value="16">$16</option><option value="17">$17</option><option value="18">$18</option></select>
<div align="center"><font size="1">(50% goes directly to the artist, so please be generous)</font></div>
</font>
</p>
<p>
But smoothing out the payment process matters only when there are people who want to pay. Let's look at some of the evolving ways to arrive at that state of mind. S&#233;bastien Paquet recently posted a <a href="http://radio.weblogs.com/0110772/2004/04/05.html#a1516">blueprint</a> for a blog-based music recommendation network. Alf Eaton responded with an <a href="http://www.pmbrowser.info/hublog/archives/000777.html">implementation</a> that connects the dots between reading weblogs that talk about and link to freely-available MP3s, aggregating those weblogs, converting an aggregated page to a playlist, and -- directly from the player -- inserting a track into a Webjay playlist. 
</p>
<p>
Here's what the process looks like. From <a href="http://home01.wxs.nl/~verka067/Songs.html">this page</a> of African griot tunes by <a href="http://www.listenall.com/dembo_jobarteh.html">Dembo Jobarteh</a>, I used Alf's <a href="javascript:location.href='http://www.pmbrowser.info/playlists/playlist.cgi?url='+escape(location.href)+'&amp;format=.smil'">SMIL bookmarklet</a> (drag to your toolbar) to synthesize a playlist of the MP3s linked on the page, and launch the player:
</p>
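<p>
I haven't seen the code behind Alf's service, so what follows is only a guess at the core step: scan a page for links that end in .mp3 and emit a playlist. To keep it short it writes a plain M3U list rather than SMIL, and every name in it (<tt>Mp3LinkCollector</tt>, <tt>playlist_for</tt>) is invented for the illustration.
</p>

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class Mp3LinkCollector(HTMLParser):
    """Collect absolute URLs of links whose target ends in .mp3."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.tracks = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".mp3"):
            # Resolve relative links against the page's own address.
            self.tracks.append(urljoin(self.base_url, href))


def playlist_for(html, base_url):
    """Return an M3U playlist (one URL per line) for the MP3s linked in html."""
    collector = Mp3LinkCollector(base_url)
    collector.feed(html)
    return "\n".join(["#EXTM3U"] + collector.tracks) + "\n"


sample = (
    '<a href="songs/one.mp3">One</a> '
    '<a href="about.html">About</a> '
    '<a href="http://example.org/two.MP3">Two</a>'
)
print(playlist_for(sample, "http://example.com/music/"))
```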
<p>
<img src="http://weblog.infoworld.com/udell/gems/musicBlog01.jpg"/>
</p>
<p>
While the tune <i>Allah la ke</i> is playing, I click <i>Recommend this tune using <u>Webjay</u></i>, and here's the result:
</p>
<p>
<a href="http://weblog.infoworld.com/udell/gems/musicBlog02.jpg"><img width="373" height="316" src="http://weblog.infoworld.com/udell/gems/musicBlog02.jpg"/></a>
</p>
<p>
One more click adds the tune to my <a href="http://webjay.org/by/judell/griot">African griot playlist</a>. Slick, huh?
</p>
<p>
I'm not much of an audiophile, to be honest, and there are lots of other people who will get more deeply into music-blogging and playlist-sharing than I'm likely to. But the process at work here is deeply fascinating to me, and generalizes to other realms. Every kind of digital experience can thrive in the virtuous cycle of the blogosphere: use it, capture part of it, link to it, write about it, search for it, read about it, aggregate it, rinse, lather, repeat.
</p>
<p>
Consider another kind of digital experience: software. The ability to try before I buy is great, but it's so much more powerful to tap into the shared experience of a knowledgeable user of the software I might want to buy. That's why <a href="http://weblog.infoworld.com/udell/2004/04/07.html#a968">Paul Everitt's spontaneous demo</a> seemed like such a revelation. 
</p>
<p>
I once read an interview with Michael Kinsley, right after he stepped down from the editorship of Slate. What had he expected Web publishing to be, the interviewer asked, and where had the medium fallen short? His answer was immediate and precise. He'd thought of the Web as a medium for shared experience. So, for example, music reviews and film reviews would quote from, rather than merely describe, songs and movies. That mostly hasn't happened yet, for both legal and technical reasons, but I see signs of a breakthrough. In the long run it has to happen. We crave access not only to intellectual products, but also to other people's experience and understanding of those products. When we focus on sharing experience -- which is sometimes, but not necessarily, the same as sharing product -- we'll unleash powerful economic forces. 
</p>

</body>
</item>

<item num="a971">
<title>Technorati trackbacks</title>
<date>2004/04/13</date>
<body>

<p>
BoingBoing's <a href="http://www.boingboing.net/2004/04/12/boing_boing_add_tech.html">other blogs commenting on this post</a> feature, added yesterday, has provoked a <a href="http://www.technorati.com/cosmos/search.html?rank=&amp;sub=mtcosmos&amp;url=http://www.boingboing.net/2004/04/12/boing_boing_add_tech.html">flurry of responses</a>. Co-incidentally, I had just made myself a Technorati Trackback bookmarklet: drag this link -- <a href="javascript:void(location='http://www.technorati.com/cosmos/search.html?url='+location.href);">TT</a> -- to your toolbar, then click while visiting a blog article to see Technorati's roundup of posts commenting on the article.
</p>
<p>
As Cory Doctorow mentioned in an email I was cc'd on, the "other blogs commenting" feature ideally should display the count of inbound links, or, in case there are none, vanish. Here's a picture of a trial implementation:
</p>
<p>
<img border="1" src="http://weblog.infoworld.com/udell/gems/technoratiComments.jpg"/>
</p>
<p>
There are two moving parts. First, a service that asks Technorati for the count, and returns some JavaScript. Here's the guts of my trial implementation:
<pre class="code python">
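import re
from urllib.parse import quote
from urllib.request import urlopen

def technorati_counter(url):
    # Fetch Technorati's page for this permalink and scrape the inbound-link count.
    tsearch = 'http://www.technorati.com/cosmos/search.html?url=' + quote(url, safe='')
    tpage = urlopen(tsearch).read().decode('utf-8', 'replace')
    m = re.search(r'from &lt;span class="greentext">(\d+)&lt;/span>', tpage)
    if m is None:
        return ''  # no count found: emit nothing, so the feature vanishes
    # Keep the JavaScript string literal on a single line.
    return ("document.write('Technorati comments: "
            "&lt;a href=\"%s\">%d&lt;/a>');" % (tsearch, int(m.group(1))))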
</pre>
</p>
<p>
Second, a template modification to call this service for each published item, passing the permalink of the item, and including the results in a &lt;script> tag.
</p>
<p>
This is the classic pattern for pages that are dynamically generated but statically served, as is true for most blogs. You make client-side JavaScript add up-to-the-minute information at page-load time. 
</p>
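<p>
The template side amounts to emitting one &lt;script> element per item. Here's a sketch of that step; the service address is a placeholder (I'm not publishing mine), and <tt>counter_script_tag</tt> is a name made up for the example.
</p>

```python
from html import escape
from urllib.parse import urlencode

# Hypothetical address of the counter service; the post does not give one.
COUNTER_SERVICE = "http://example.org/technoratiCount"


def counter_script_tag(permalink):
    """Return the script element a template would emit for one published item."""
    # URL-encode the permalink so its own '#' and '/' survive as a query value.
    src = COUNTER_SERVICE + "?" + urlencode({"url": permalink})
    return '<script src="%s"></script>' % escape(src, quote=True)


tag = counter_script_tag("http://weblog.infoworld.com/udell/2004/04/13.html#a971")
print(tag)
```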
<p>
Although I can host the service that queries Technorati, I'd rather not, and in any case most bloggers can't. So I've disabled the feature for now, but I'd love to see Technorati offer a callable counter. 
</p>

</body>
</item>

<item num="a970">
<title>In praise of margins</title>
<date>2004/04/12</date>
<body>

<p>
<img align="right" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/marginalia.jpg"/>
<blockquote>
The fuzzy intersection of official and unofficial data has never been a comfort zone for information technologists. In <a href="http://www.pliant.org/Beyond-Formalisms.pdf">chapter 4</a> of Klaus Kaasgaard's <a href="http://www.amazon.com/exec/obidos/ASIN/8716134958/">Software Design and Usability</a>, Xerox's Palo Alto Research Center (PARC) alumnus Austin Henderson says that "one of the most brilliant inventions of the paper bureaucracy was the idea of the margin." There was always space for unofficial data, which traveled with the official data, and everybody knew about the relationship between the two. [Full story at <a href="http://www.infoworld.com/article/04/04/09/15OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
This column muses on the use of DNS TXT records to implement the latest round of SMTP sender authorization schemes. Everybody feels guilty about not using some new formally-defined DNS resource record type, but everybody also knows that would be a non-starter. So instead we're scribbling in the margins of the DNS, and luckily, DNS <i>has</i> margins available for scribbling.
</p>
<p>
It strikes me that all of my recent experimentation -- with XHTML microcontent, semantically-oriented CSS, and structured search -- has a similar flavor. I've been looking for ways to scribble in the margins of the Web. Not because it's the right thing to do, but because it's perhaps the only feasible way forward.
</p>
</body>

</item>





<item num="a969">
<title>What website is Aunt Tillie really on?</title>
<date>2004/04/08</date>
<body>

<p>
Last Friday I visited CoreStreet, a company whose ingenious approaches to large-scale credential validation and physical security I mentioned in my <a href="http://www.infoworld.com/article/03/09/26/38OPstrategic_1.html">Permissions on the edge</a> column last fall. While I was there, CoreStreet's president, Phil Libin, who blogs at <a href="http://www.vastlyimportant.com">vastlyimportant.com</a>, showed me a neat gizmo intended to help Aunt Tillie understand where she's really going on the web. Consider this screenshot:
</p>
<p>
<a target="spoofstick" href="http://weblog.infoworld.com/udell/gems/spoofstick.jpg"><img width="325" height="336" src="http://weblog.infoworld.com/udell/gems/spoofstick.jpg"/></a>
</p>
<p>
In the lower right browser window, I'm on CSPAN's Booknotes.org site, where -- <a href="http://www.sklar.com/blog/index.php?/archives/31_Media_convergence_Jon_Udell_style.html">David Sklar reminded me</a> -- you can watch Brian Lamb's interviews with authors. In the upper left window, I'm watching the George Soros program. Note the extra toolbar in that window, which says: <b>You're on <font color="green">virage.com</font></b>. That's CoreStreet's <a href="http://www.corestreet.com/spoofstick/">Spoofstick</a> in action. In this case, CSPAN's relationship with media partner <a href="http://www.virage.com/">Virage</a> is made plain in the pop-up window, even though the URL-line is hidden. But when bad guys are running the show, it's all too easy for Aunt Tillie to wind up in the wrong neighborhood without realizing it. 
</p>
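<p>
I don't know how Spoofstick is implemented, but the underlying idea, showing the host the browser will really contact rather than whatever the URL appears to say, fits in a few lines. The function names and the example URL below are invented; the URL uses the old trick of putting a trusted-looking name in the userinfo part, ahead of the @ sign.
</p>

```python
from urllib.parse import urlsplit


def real_host(url):
    """Return the host a browser would actually contact for url."""
    # urlsplit sets aside any user:password@ prefix, a classic spoofing trick.
    return urlsplit(url).hostname


def banner(url):
    """Format the host the way the toolbar in the screenshot does."""
    return "You're on " + real_host(url)


print(banner("http://www.example-bank.com@evil.example.net/login"))
```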
<p>
Spoofstick is a beta extension for Firefox, with IE support "right around the corner." (Didn't things used to be the other way around?) It fits right in with one of the themes I've been developing lately: we need to standardize on the UI conventions that contextualize secure interaction on the web. 
</p>
<p>
I don't think Spoofstick is a final solution, and neither do the CoreStreet folks. In this particular case, for example, what's Aunt Tillie to make of the fact that she's been transported by CSPAN to Virage? Is that OK or not? How's she supposed to evaluate all this?
</p>
<p>
In the case of a benign third-party relationship like this one, you could argue Spoofstick raises more questions than it answers. Nor would it surprise me if somebody discovers a way to spoof Spoofstick. But the principle at work here is sound. The information superhighway needs a standard system of roadsigns that Aunt Tillie can trust. The SSL lock was and is helpful, but we need to do more. Spoofstick suggests an important next step.
</p>


</body>
</item>

<item num="a968">
<title>Software cinema verit&#233;</title>
<date>2004/04/07</date>
<body>

<p>
<blockquote>
A growing number of vendors now use Flash videos to augment the obligatory lists of customers, features, and benefits that they publish on their marketing pages. It's a strategy I highly recommend. What hadn't occurred to me, until it happened this week, was that users might do this for you! [Full story at <a href="http://www.infoworld.com/article/04/04/02/14OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
Here's Paul Everitt, whose spontaneous act of software demonstration motivated the column:
<blockquote class="personQuote PaulEveritt">
It's funny how these things happen. I put very little consideration into making that narrated demo. I had posted something about XSLT and said it was easier than advertised. In a weblog comment, someone asked for evidence to back up my assertion. I offered to make a recording, he took me up on the offer, and I spent 15 minutes with no post-production to respond to him. [<a href="http://radio.weblogs.com/0116506/2004/04/04.html">Zope Dispatches</a>]
</blockquote>
Exactly. "15 minutes with no post-production" is doable on a whim. When the activation threshold is low enough, things can happen that otherwise wouldn't.
</p>
<p>
The fact that Flash has become the de facto standard for such videos is interesting in light of Microsoft's "quietly announced" <sup>1</sup> <a href="http://channel9.msdn.com/">Channel 9</a>. I'm hardly the first to point out that Channel 9's Windows-Media-only format excludes crucial audiences. Joe Wilcox says so <a href="http://www.microsoftmonitor.com/archives/002659.html">here</a>, and Robert Scoble responds <a href="http://radio.weblogs.com/0001011/2004/04/07.html#a7181">here</a>.
</p>
<p>
Joe's point is spot on. Although Larry O'Brien says he <a href="http://www.knowing.net/2004/04/06.aspx#a735">won't watch the videos</a>, they're actually the only part of Channel 9 that I have <a href="http://channel9.msdn.com/rss.aspx?ForumID=14&amp;Mode=0">tuned into</a>. Of the first batch of videos, the one that I found most important (albeit not as entertaining as Bill Hill's <a href="http://www.microsoft.com/winme/0404/22606/Bill_Hill1_300k.asx">Homo Sapiens 1.0</a>) was Michael Howard's <a href="http://www.microsoft.com/winme/0404/22606/Michael_Howard_College_300k.asx">observation</a> <sup>2</sup> about how the computer science curriculum gives short shrift to security. His book, <a href="http://www.amazon.com/exec/obidos/ASIN/073561722">Writing Secure Code</a>, is a remarkably candid, Cluetrain-like piece of work. In this passage, for example, he draws attention to past ActiveX-related screwups in Microsoft products:
</p>
<p>
<img border="1" src="http://weblog.infoworld.com/udell/gems/writingSecureCode.jpg"/>
</p>
<p>
That honesty, coupled with the book's exhaustive analysis and recommendations, makes Howard the best and most credible voice inside Microsoft on an issue that desperately cries out for credibility. But because of the format lock-in, he winds up preaching to the choir. A further irony was that Channel 9 asked me to accept a signed ActiveX control! The people who really ought to see and hear Michael Howard never will.
</p>
<p>
As for Robert Scoble's response, I dunno. "When we came up with the idea of Channel9," he writes, "we didn't just get unlimited resources to do everything perfect." Well OK, but <a href="http://www.apple.com/quicktime/upgrade/">QuickTime Pro</a> is $30, and <a href="http://www.wildform.com/flix/flix_pro.php">Flix Pro</a> is $149. Using these, I was able to produce QuickTime and Flash versions of the Michael Howard clip. The quality's not great, partly because I couldn't figure out how to download the .WMV files behind the .ASX wrapper,  so I resorted to a <a href="http://www.techsmith.com/products/studio/">Camtasia Studio</a> screen capture. And I'm not sure Microsoft would appreciate my posting alternate versions in any case, so I won't. But, though I'm far from an expert on video formats, it doesn't look like a budgetary or logistical issue to me.
</p>
<p>
No other company comes close to the transparency that Microsoft is achieving with its blog activity and now Channel 9. I've applauded such efforts and will continue to do so. But I'll applaud Channel 9 more loudly when its message can reach the unconverted.
</p>
<hr align="left" width="25%"/>
<p>
<sup>1</sup> 
What's up with that "quiet" meme? 
<blockquote>
<a href="http://news.com.com/2100-7343-5185841.html">news.com</a>: "Microsoft quietly launched a new site on Tuesday that combines blogs, discussion forums and other technology to improve communications with developers."
</blockquote>
<blockquote>
<a href="http://www.infoworld.com/article/04/04/06/HNmschannel9_1.html">infoworld.com</a>: "Microsoft  has quietly expanded its Microsoft Developer Network with a Web site that combines a host of social networking technologies in a move to improve communications with outside software developers."
</blockquote>
Are these only accidentally similar? Or did one derive from the other? Or was there an aboriginal source? Perhaps meme archaeologists can figure it out.
</p>
<p>
<sup>2</sup> Note that I had to dig these direct links to the videos out of the RSS feed. They're not directly available on the surface of <a href="http://channel9.msdn.com/ShowForum.aspx?ForumID=14">this page</a>. This is typical of MSDN Web designs that use video snippets, and I think it's un-Weblike and blogger-unfriendly. 
</p>
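<p>
A minimal sketch of the digging described in that footnote, assuming the feed has already been fetched as text. The function name <tt>extractAsxLinks</tt> and the sample string are hypothetical; only the two .asx URLs come from the post above. It is a plain pattern match over the text, not a real RSS parser.
</p>

```javascript
// Hypothetical helper: pull direct .asx video links out of raw feed text.
function extractAsxLinks(feedText) {
  // Match http(s) URLs ending in .asx. A URL stops at whitespace, quotes,
  // or angle brackets (written here as \u003c and \u003e escapes).
  const pattern = /https?:\/\/[^\s"'\u003c\u003e]+\.asx/gi;
  const matches = feedText.match(pattern) || [];
  // De-duplicate while keeping first-seen order.
  return [...new Set(matches)];
}

// Made-up sample text standing in for the body of a feed.
const sampleFeedText =
  'Watch http://www.microsoft.com/winme/0404/22606/Bill_Hill1_300k.asx ' +
  'or "http://www.microsoft.com/winme/0404/22606/Michael_Howard_College_300k.asx" ' +
  'and again http://www.microsoft.com/winme/0404/22606/Bill_Hill1_300k.asx';

console.log(extractAsxLinks(sampleFeedText));
```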

</body>
</item>

<item num="a967">
<title>Customer demand for a ubiquitous InfoPath runtime</title>
<date>2004/04/06</date>
<body>

<p>
The last time I asked Microsoft why there's no plan to make the InfoPath runtime ubiquitous, the answer I got was: "We don't hear customers asking for it." Well, I do. Here's a typical rant from one customer who, because his company has a relationship with Microsoft that he doesn't want to jeopardize, asked me to anonymize his comments:
<blockquote>
I believe a primary requirement of a forms application is to make it
possible for the form to be completed by a wide audience of people from
whom I wish to gather data.  A key driver, at least in the world of my
customers, is to be able to distribute the form widely to people who
aren't necessarily connected to the network and get them to fill it in
and return it.  I don't want to authenticate these people in my network.
They won't install software on their computers just to fill out my form.
They don't want to learn a new application.
<br/><br/>
It seems InfoPath has completely ignored the question of how the form
will actually be filled in by the responder.  There is no free viewer as
there is with Adobe Acrobat.  There is no ability to save the form
template as an ASP.NET web form.  It appears that Microsoft expects
everyone to purchase a full copy of InfoPath--the complete form design
application--just so they can fill out a form.  They can't possibly
believe the product will gain any traction with this licensing and
deployment model, can they? [1] What are they thinking? [2]
<br/><br/>
So my main question is, is there any way to deploy InfoPath forms
without putting full InfoPath on every desktop?  [3] Do you know whether
Microsoft understands this issue and are planning anything to address
it?  [4] The two applications that are widely available on everyone's
desktop are a web browser and Adobe Acrobat, and it seems like it would
be a good idea for InfoPath to support forms deployment via one of those
means.  Am I missing something here? [5]
</blockquote>
</p>
<p>
My answers were "I don't know" [1], "I don't know" [2], "No" [3], "Apparently they don't see a problem and aren't planning to do anything" [4], and "We're in the same boat: I don't get it either." [5]
</p>

</body>
</item>


<item num="a966">
<title>RSS and TiVo</title>
<date>2004/04/06</date>
<body>

<p>
<table cellpadding="6" align="right">
<tr><td><img src="http://weblog.infoworld.com/udell/images/xml.gif"/></td></tr>
<tr><td><img src="http://weblog.infoworld.com/udell/gems/tivo.gif"/></td></tr>
</table>
Yesterday's <a href="http://weblog.infoworld.com/udell/2004/04/04.html#a964">item</a> provoked a flurry of responses. Steven J. Vaughan-Nichols, who wrote the Washington Post story I dissected, points out that the nature of his assignment precluded broader coverage, and that he'd otherwise gladly have included <a href="http://www.bloglines.com">bloglines</a>. There's been lots of chatter about bloglines lately -- Chad Dickerson <a href="http://weblog.infoworld.com/dickerson/2004/04/05.html#11.54.22">mentions it today</a> -- so I was interested to hear from Martin Thornell about another web-based product, <a href="http://reader.rocketinfo.com">Rocket RSS reader</a>. Doubtless there are others too. An implementation of one of these licensed for behind-the-firewall use, as Chad suggests, would be handy. As a matter of fact, that's how I use Radio UserLand's reader. It's nominally a desktop product, but I run it as a server and authenticate to it over SSL.
</p>
<p>
Vaughan-Nichols' critique of .NET's performance raised hackles with several readers, including Mark Levison:
<blockquote class="personQuote MarkLevison">
I'm doing smart client (no touch deployment) .NET development at the moment.  I find that we've no trouble getting excellent performance out of our app.  When we do have problems it is usually algorithmic. Jon, what .NET client side apps have you tried? SharpReader? RSS Bandit? NewsGator? Are any of these slow?  Let's test claims like this before repeating them. [<a href="http://dotnetjunkies.com/WebLog/mlevison/archive/2004/04/05/10796.aspx">dotnetjunkies</a>]
</blockquote>
I've used all of the above. It's always problematic to define what's meant by speed in cases like this. Application load time? GUI responsiveness? Data transfer? Every .NET app I've used loads slowly -- particularly when it's the first .NET app in use, but even otherwise. GUI responsiveness varies from sluggish to snappy, which I attribute to differing degrees of experience with the Framework and with the managed environment that supports it. Data transfer that isn't gated by your network pipe is mainly an algorithmic thing that depends on caching, not the runtime.
</p>
<p>
When I said .NET performance is "a real issue that will dog client-side .NET in the same way, and for the same reasons, that it has dogged client-side Java," I did not mean that I believe, as Vaughan-Nichols does, that use of .NET automatically means sluggish performance. In fact I don't think that. But the perception does exist, as it has existed for Java, despite evidence to the contrary (e.g., Eclipse), because there is also evidence to support it. Modern managed runtimes are a huge and necessary step forward, but the desktop is an unforgiving environment in which to deploy apps that depend on them. That's been a challenge for Java, and it's a challenge for .NET too.
</p>
<p>
Meanwhile, Russ Lipton brings me back to my original point:
<blockquote class="personQuote RussLipton">
Jon Udell reminds me yet again how pathetically inept we are at explaining technology so that normal human beings can make sense of it. As a result, normal human beings intelligently dislike the technologies that fascinate some of us. [<a href="http://www.coffeehouse-at-end-of-days.com/2004/04/driving_aunt_ti.html">Coffeehouse at the End-of-Days</a>]
</blockquote>
Exactly. Normal people don't, however, dislike their TiVos. After a long period of foot-dragging I finally joined the TiVo cult and am fascinated most of all to watch my family, none of whom are very technical, integrate this Linux appliance into the fabric of their lives. Comparisons of the Linux desktop to the Microsoft desktop immediately fade to insignificance. If typical members of either of those tribes had written the TiVo software, my kids would be asking me what to do about the "disk is 97% full" message. But they don't, because TiVo spares them such nonsense. They only need to think about getting stuff and using stuff, and not much explanation is needed. All of our "real" apps, RSS readers included, should work like that.
</p>
</body>
</item>


<item num="a964">
<title>Introducing Aunt Tillie to RSS</title>
<date>2004/04/04</date>
<body>

<p>
This morning a story on RSS newsreaders appeared in the Personal Tech section of my local paper. The title was <i>A simple program to 'refresh' the news</i>; the byline was <i>The Washington Post</i>. I'm keenly interested in how the story of RSS is being told to <a href="http://weblog.infoworld.com/udell/2004/03/02.html">Aunt Tillie</a>, so I deconstructed this one with some care. 
</p>
<p>
The first order of business was to find the article online so I could quote from it, and cite the URL in this posting. I went to washingtonpost.com, registered, and searched for the phrase "inefficient bundle of code"; we'll get to why I used that search in a moment. 
</p>
<p>
The Washington Post is evidently even more restrictive than the New York Times. This two-week-old story is already parked behind the costwall, where you're asked to buy it for $2.95. No thanks. I did, however, learn that the original title was <i>Refining Paperless News</i>, and that the author was <a href="http://www.google.com/search?q=%22steven+j.+vaughan-nichols%22">Steven J. Vaughan-Nichols</a>. 
</p>
<p>
When I'm looking for costwalled New York Times stories, I've noticed that you can often find them for free elsewhere. Sure enough, a Google search for <a href="http://www.google.com/search?q=%22inefficient+bundle+of+code%22">"inefficient bundle of code"</a> landed me <a href="http://www.washingtonpost.com/wp-dyn/articles/A55027-2004Mar13.html">here</a>. 
</p>
<p>
A couple of points in the article caught my eye. Exhibit A:
<blockquote class="personQuote StevenJVaughan-Nichols">
RSSReader (Win 98 or newer, free at <a href="http://www.rssreader.com">www.rssreader.com</a>) leaves out FeedDemon's price tag, but also its performance. It was easily the slowest newsreader we tried -- partially because it runs on Microsoft's .Net Framework, an <b>inefficient bundle of code</b> [emphasis mine] that lets developers add Web functions to their software. [<a href="http://www.washingtonpost.com/wp-dyn/articles/A55027-2004Mar13.html">Refining Paperless News (TechNews.com)</a>]
</blockquote>
When I think of the many ways one could introduce Aunt Tillie to the .NET Framework, "inefficient bundle of code that lets developers add Web functions to their software" seems an odd choice. If Aunt Tillie knew that Steven J. Vaughan-Nichols writes the <a href="http://www.linux-mag.com/depts/shutdown.html">endpage for Linux Magazine</a> and edits the <a href="http://www.eweek.com/category2/0,1738,1237915,00.asp">Linux and Open Source Topic Center</a> for eWeek.com, it might help her to contextualize this remark. 
</p>
<p>
I don't, by the way, entirely disagree with Vaughan-Nichols. Although I think he overplays the ".NET is slow" card here -- using it three times -- this is a real issue that will dog client-side .NET in the same way, and for the same reasons, that it has dogged client-side Java. But that's way more software-industry inside baseball than Aunt Tillie needs here, if the point of the article is to introduce her to the fundamental concepts and benefits of RSS, and acquaint her with the kinds of tools available for reading feeds.
</p>
<p>
Exhibit B:
<blockquote class="personQuote StevenJVaughan-Nichols">
Unfortunately, you can't just click that button to subscribe. You must right-click it -- on a Mac, hold down the Ctrl key as you click -- to copy the link's address, then paste it into your newsreader. [<a href="http://www.washingtonpost.com/wp-dyn/articles/A55027-2004Mar13.html">Refining Paperless News (TechNews.com)</a>]
</blockquote>
Spot on. This is a huge roadblock for Aunt Tillie, as I've said repeatedly. We gotta fix this.
</p>
<p>
Exhibit C:
<blockquote class="personQuote StevenJVaughan-Nichols">
ADC Software's NewzCrawler (Win 95 or newer, $25 at <a href="http://www.newzcrawler.com">www.newzcrawler.com</a>)
is perhaps the most flexible newsreader around. Beyond RSS, this fast,
easily customizable program also collects and presents newsfeeds
delivered with a newer protocol called Atom and postings from Usenet
newsgroups. 
</blockquote>
Delivering Usenet postings is a clear benefit. It means you get more and different content than you'd get from RSS. What about Atom? Does this "newer protocol" also deliver more and different content than you'd get with RSS, or from Usenet? Clearly my own biases are showing here, but my answer is a resounding no. I've long argued that the last thing Aunt Tillie needs, just as she's becoming aware of the concept of syndication, is to get smacked in the face with our RSS-vs-Atom dirty laundry. 
</p>
<p>
One final observation. The article focused entirely on a single species of RSS newsreader: the standalone GUI program. If Aunt Tillie happens to be reading email in Outlook, she ought to have been made aware of the <a href="http://www.newsgator.com/">Newsgator</a> option. An even more glaring omission was <a href="http://www.bloglines.com/">bloglines.com</a>. Nowadays when RSS newbies ask me which reader to use, I point them to bloglines; it's the perfect quickstart. I tell folks they can deal with selecting, installing, and learning to use a "real" newsreader after they've gotten a taste of what RSS newsreading is all about. I <i>don't</i> tell them the reasons why, for certain <a href="http://www.intertwingly.net/blog/1716.html">advanced</a> <a href="http://jeremy.zawodny.com/blog/archives/001829.html">users</a> of RSS, bloglines winds up being the "real" solution. That's too much information for an elevator pitch. However, in an article of this length, which mentions Atom and harps on the performance of the .NET runtime, I think Aunt Tillie should have been told that Web-based readers exist, require no installation, can be used from anywhere, and are always synchronized.
</p>
<p>
My point here isn't to pillory Steven J. Vaughan-Nichols, whose work I've known and respected for a long time. All of us who belong to the geek tribe -- myself included -- tend to focus on our issues, not the issues that will matter most to Aunt Tillie. But we're the gatekeepers of this story. As syndication goes mainstream, we're the ones who'll be asked to explain it to Aunt Tillie. Here's hoping we can all put the geek stuff in its place and tell her what she really needs to know.
</p>

</body>
</item>


<item num="a963">
<title>Should GMail be exhibited in the Museum of Jurassic Technology?</title>
<date>2004/04/03</date>
<body>

<p>
<a href="http://www.mjt.org/"><img align="right" vspace="6" hspace="6" src="http://www.mjt.org/images/hometrn3.gif"/></a>
There is a place in Los Angeles I've never visited, but would love to: <a href="http://www.mjt.org/">The Museum of Jurassic Technology</a>. It is the subject of Lawrence Wechsler's delightful 1995 book, <a href="http://www.amazon.com/exec/obidos/tg/detail/-/0679764895">Mr. Wilson's Cabinet Of Wonder: Pronged Ants, Horned Humans, Mice on Toast, and Other Marvels of Jurassic Technology</a>. One Amazon reviewer called the museum "a straight-faced, Andy Kaufman-esque joke, blending exhibits that look too nutty to be true, but are true, with outright hoaxes."
</p>
<p>
Sometimes the jokes are pretty broad:
<blockquote class="personQuote LawrenceWechsler">
The very first display you encounter is an exhibit entitled "Protective Auditory Mimicry." Together, encased under glass, are displayed a luminous iridescent beetle and next to it a similarly tiny iridescent pebble. The wall placard to the side asserts that over the eons this beetle has adapted to make precisely the same sound when threatened that this pebble makes at rest. [transcript of 1996 NPR <a href="http://www.soundportraits.org/on-air/museum_of_jurassic_technology/">radio documentary</a> by Lawrence Wechsler]
</blockquote>
But mostly, the museum's curator David Wilson is a lot subtler than that. Driven to investigate the meticulously researched and lovingly displayed curiosities that Wilson presents, Wechsler found some to be true, some false, and some a mixture of the two.
</p>
<p>
I was reminded of all this on Thursday when, for hours, nobody seemed to know whether Google's GMail announcement was real, or was an April Fools' Day prank. It wasn't only the date of the announcement, but also its tongue-in-cheek tone -- "Search is Number Two Online Activity -- Email is Number One; "Heck, Yeah," Say Google Founders" -- that led many to conclude it must be a prank. Even one of the savviest observers on the scene, Doc Searls, was momentarily taken in. And when I posted <a href="http://weblog.infoworld.com/udell/2004/04/01.html">my response</a> to Doc's initial posting, suggesting that Google had executed a brilliant double head fake, I wasn't yet 100% certain that this was no hoax -- <i>even though I had read John Markoff's <a href="http://www.nytimes.com/2004/04/01/technology/01google.html">story</a>, datelined March 31, in the dead-trees version of the Times on the morning of April 1</i>. Indeed, it was Doc's near-instantaneous correction, after receiving a call from a Google insider, that finally settled the matter for me -- and, I'm sure, for many others. It's interesting to consider why. I trust Doc Searls as much as I trust John Markoff, and it was Doc's site, not the Times' site, that first reported a Google source both acknowledging and dispelling the possibility of a hoax.
</p>	
<p>
Here are the remaining questions. Did Google intentionally leverage the reality-bending April 1 tradition -- to which it has <a href="http://www.google.com/technology/pigeonrank.html">famously</a> <a href="http://www.google.com/jobs/lunar_job.html">contributed</a> -- in order to crank up the buzz surrounding the announcement? (<b>Update</b>: Doc says yes, based on <a href="http://www.shellen.com/jason/archives/2004_04_01_default.asp#108093573678962271">this posting from Google employee Jason Shellen</a>.) If so, was the strategy a brilliant PR coup, as I suggested on Thursday, or a colossal blunder, as Doc <a href="http://doc.weblogs.com/2004/04/01#excuseMeWhileITakeThisChainOffMyNeck">concluded</a> on Thursday? In retrospect, I'm inclined to think Doc's right. But either way, the period of confusion on Thursday was a very weird time. The sensory apparatus that tells me what's going on in the world is a complex machine whose gears -- weblogs, newspapers, Google -- were grinding.
</p>
<p>
The Museum of Jurassic Technology wraps the frame of conceptual art around the experiences it delivers. But how do we frame what happened on Thursday? I'm reminded of a story a graduate school professor once told me. He was stationed in London, covering the art scene for Time Magazine, and went to Hyde Park to report on a work of performance art that was scheduled to happen there at a certain time. A bunch of people were milling around, waiting for the event to begin. Much later the artist finally arrived, surveyed the crowd, and asked: "Where do I sign?"
</p>
<p>
<b>Update</b>: Doc just wrote his <a href="http://doc.weblogs.com/2004/04/03#thisIsABadThing">post-mortem</a>, in which he says: "I've long since lost my PR edge." No, Doc, I don't think so. I've changed my mind since Thursday, and I think your gut reaction was the right one. 
</p>
<p>
<b>Update</b>: Bryan Field-Elliot <a href="http://netmeme.org/blog/archives/000110.html#000110">thinks</a> the whole thing was a feint to distract attention from Gmail's <a href="http://gmail.google.com/gmail/help/privacy.html">privacy policy</a>.
</p>

</body>
</item>

<item num="a962">
<title>An example of helpful guidance</title>
<date>2004/04/02</date>
<body>

<p>
A reader took me to task for suggesting, in <a href="http://weblog.infoworld.com/udell/2004/03/30.html#a958">this week's column</a>,
that we need to do a better job of spelling out the user-interface
implications of Internet standards. Robb Beal agreed with me, though,
and today I found another example of the kind of guidance that his <a href="http://www.usercreations.com/weblog/gems/Aggregator%20client%20HTTP%20tests.html">functional annotations</a> provide.
</p>
<p>
Last July, I mentioned <a href="http://www.danisch.de/work/security/antispam.html">RMX (Reverse Mail eXchange)</a> in an <a href="http://www.infoworld.com/article/03/07/18/28FEspam_1.html">article on anti-spam technologies</a>. Since then there's been a lot of activity on this front. Now I'm looking into <a href="http://spf.pobox.com/">SPF</a> (proposed by pobox.com), <a href="http://www.microsoft.com/mscorp/twc/privacy/spam_callerid.mspx">Caller ID for Email</a> (proposed by Microsoft), and <a href="http://slashdot.org/articles/03/12/06/147258.shtml">Domain Keys</a> (proposed by Yahoo, not yet published). 
</p>
<p>
The various strategies for weaving authorization and email policy into
the Domain Name System are quite fascinating. But I was also struck by
this passage I found in the Caller ID spec:
</p>
<blockquote>
Common historical practice in mail reading software
regarding the mail originator and resent headers has been to present
only the contents of the From: header to the users; the other related
headers (Sender:, Resent-From:, Resent-Sender:) have not been shown.
This behavior SHOULD change. Messages with combinations of identities
in the originator headers SHOULD be rendered differently than messages
in which the identities are the same. Specifically, it is RECOMMENDED
that if the purported responsible addresses of a message is not the
same as the address that would be rendered as the From: address that
both these addresses be exhibited to the user. For example, the message
in the example -3.2.3 might be presented by e-mail client software as
being
<br/><br/>
From bob@forwarderexample.com on behalf of adam@example.com
<br/><br/>
or 
<br/><br/>
From adam@example.com via bob@forwarderexample.com
<br/><br/>
instead of the historical 
<br/><br/>
From adam@example.com
</blockquote>
<p>
Exactly! Nobody should care what's jammed into the DNS TXT records used
to authorize an SMTP sender, but everybody should care about dodgy
identity trails. While acknowledging that such matters are "properly a
role of mail filtering and e-mail client software," the spec
nonetheless ventures "some suggestions regarding how that might work."
Applause.
</p>
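<p>
The rendering rule quoted above can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions: the function names are hypothetical, headers are passed in as a plain object, and the header priority order is a simplification of however the spec actually selects the purported responsible address.
</p>

```javascript
// Assumed, simplified priority for picking the purported responsible address.
// The spec's real selection rules are more detailed than this list.
const ORIGINATOR_PRIORITY = ["Resent-Sender", "Resent-From", "Sender", "From"];

// Hypothetical helper: return the first originator header that is present.
function purportedResponsibleAddress(headers) {
  for (const name of ORIGINATOR_PRIORITY) {
    if (headers[name]) return headers[name];
  }
  return null;
}

// Hypothetical helper: show both identities when they differ,
// and the historical single From line when they are the same.
function renderOriginator(headers) {
  const from = headers["From"];
  const responsible = purportedResponsibleAddress(headers);
  if (responsible === null || responsible === from) {
    return "From " + from;
  }
  return "From " + responsible + " on behalf of " + from;
}

// The forwarded message from the quoted example.
console.log(renderOriginator({
  "From": "adam@example.com",
  "Sender": "bob@forwarderexample.com"
}));
```

<p>
A real mail client would also have to parse address syntax and compare addresses case-insensitively; the sketch compares raw strings to keep the rule itself visible.
</p>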

</body>
</item>


<item num="a961">
<title>Macromedia Flex</title>
<date>2004/04/01</date>
<body>
<p>
<a href="http://doc.weblogs.com/2004/04/01#fMail"><img src="http://weblog.infoworld.com/udell/gems/gmailDoc.JPG"/></a>
</p>
<p>
Or, maybe, look at the brilliant marketing strategist who was out-Cluetrained by a brilliant marketing strategy :-)
</p>
<p>
<b>Update:</b>
Doc recants:
<blockquote>
Just when I think I've given all the PR advice a former PR guy who's still a journalist can give, here's one more: If you're gonna shake the Earth with an unexpected announcement, don't pick the one day out of 365 when everybody's yanking everybody else's chain, okay?
</blockquote>
Why not? Worked like a charm! 
</p>
</body>
</item>


<item num="a960">
<title>Macromedia Flex</title>
<date>2004/03/31</date>
<body>

<p>
<a href="http://weblog.infoworld.com/udell/gems/flex.jpg"><img width="233" height="204" align="right" vspace="6" src="http://weblog.infoworld.com/udell/gems/flex.jpg"/></a>
<blockquote>
The Flex strategy first began to crystallize two years ago when Macromedia rolled out the Flash 6 player, Flash MX development tools, and ColdFusion MX server. The possibilities were exciting, and the back-end environment was comfortably based on Java and Web services. But the client-side discipline was alien to the corporate programmer.
<br/><br/>
One obstacle was the ActionScript 1.0 language, which lacked the strong typing and formal class model that a Java programmer would expect. The solution to this problem arrived last fall when Flash MX 2004 introduced Flash Player 7 and support for ActionScript 2.0. Yet the Flash IDE was still built around the concept of making a movie, not coding an application. Flex presents a development model that will make immediate sense to an enterprise developer. 
[Full story at <a href="http://www.infoworld.com/article/04/03/29/13TCflex_1.html">InfoWorld.com</a>]
</blockquote>
The sample Flex app that appears in the story is the <a target="mxml" href="http://www.markme.com/cc/archives/003901.cfm">RSS reader</a> that Macromedia's Christophe Coenraets wrote. I guess RSS readers are now the official benchmark for next-generation markup-driven development. Here's the same thing done in <a target="xaml" href="http://www.joemarini.com/tutorials/tutorialpages/xamlblogexplorer.php">XAML</a>.
</p>
<p>
It's interesting to consider these two admirably compact implementations side-by-side. Some points of comparison:
</p>
<table cellpadding="4" cellspacing="0">
<tr><td align="center"><b>MXML</b></td><td align="center"><b>XAML</b></td></tr>
<tr><td colspan="2"/></tr>
<tr><td style="color: green">Here today</td><td style="color: red">2006? 2007?</td></tr>
<tr><td style="color: green">Runs anywhere Flash Player 7 runs</td><td style="color: red">Runs only on Longhorn</td></tr>
<tr><td style="color: red">Server required</td><td style="color: green">Server not required</td></tr>
<tr><td style="color: green">Uses ActionScript 2.0</td><td style="color: green">Uses .NET languages</td></tr>
<tr><td style="color: red">XPath support: no</td><td style="color: green">XPath support: yes</td></tr>
<tr><td style="color: green">CSS support: yes</td><td style="color: red">CSS support: no</td></tr>
</table>
<p>
This mixed pattern of green (good) and red (bad) pretty much sums up my conclusion. I want all the green stuff in one column. Actually, I want all the green stuff in multiple columns: Flash, Mozilla, .NET. Heck, if I want to write a tool for Groove 3.0, I should be able to use the same XML-based UI definitions, objects, and events as I can use everywhere else. At this level of abstraction, all this stuff is too similar to justify the differences. 
</p>
<p>
We had a great thing going for about 10 years: the universal HTML/JavaScript client. And while it's still a great thing, there are good reasons to advance the state of the art. But can we please, please not lose the standardization that's served us so well? 
</p>
</body>
</item>

<item num="a959">
<title>Blogs + playlists = collaborative listening</title>
<date>2004/03/30</date>
<body>

<p>
<a href="http://www.webjay.org/"><img align="right" vspace="6" hspace="6" src="http://www.webjay.org/img/webjay-heart.gif"/></a>
Something wonderful died with Napster: the collaborative discovery and sharing of a wide diversity of music. Lucas Gonze is on a crusade to bring that experience back, legally. On his site, <a href="http://www.webjay.org/">webjay.org</a>, users share playlists -- i.e., lists of URLs that point to MP3s that are posted on artists' websites, or that are otherwise authorized for distribution on the Web. 
</p>
<p>
My first (and so far only) Webjay <a href="http://webjay.org/by/judell/test">playlist</a> began as a couple of tunes by <a href="http://www.bettydylan.com">Betty Dylan</a>, a Nashville-based duo who played my hometown recently and won me over with their energy and charm. Hunting around for more Betty Dylan tunes, I ran into some other Bettys -- Betty Roche, Betty Sue -- so I included them too.
</p>
<p>
Yesterday I noticed that the Betty Roche tune had migrated into one of Lucas' playlists, <a href="http://webjay.org/by/lucas_gonze/streakofleanstreakoffat">Streak of lean, streak of fat</a>, and the Betty Dylan tunes had found their way into another of Lucas' lists, <a href="http://webjay.org/by/lucas_gonze/thenotdylansnotbowies">The Betty Destroyer</a>. 
</p>
<p>
In a recent blog essay, Lucas talks about the collaborative filtering dynamic he hopes to encourage:
<blockquote class="personQuote LucasGonze">
There's one song in <a href="http://webjay.org/by/lucas_gonze/organism">Treebot</a> from Tofuhut, Yusef Lateef's <a href="http://tofuhut.racknine.net/Yusef%20Lateef/Yusef%20Lateef%20-%20Strange%20Lullaby.mp3">Strange Lullaby</a>.  There's also one song from <a href="http://www.largeheartedboy.com/blog/archives/002128.html">LargeHeartedBoy</a>, Julie Doiron's mind-blowingly beautiful <a href="http://www.epitonic.com/files/reg/songs/mp3/Julie_Doiron-Pour_Toujours.mp3">Pour Toujours</a>, and that song had gone through three generations of filtering.  In fact, <i>every</i> song in Treebot made it through multiple cullings, and that's why it's a good playlist.
<br/><br/>
It took Tofuhut to introduce "Strange Lullaby" into the ecosystem, and if he didn't have both taste and writing ability his recommendation wouldn't have made it through.  But it always takes more than one person to do collaborative filtering.  I want to make the path from obsessive record collectors to the average iPod as short as possible, and that's what Webjay does for him. [<a href="http://gonze.com/weblog/story/3-18-4">Lucas Gonze, 3/18/04</a>]
</blockquote>
And elsewhere:
<blockquote class="personQuote LucasGonze">
Here's the business problem: I want to help music businesses sell products, then make my money on affiliate revenues.  That way everybody's incentives are lined up in the same direction.  The listeners are looking for the best music, I'm trying to find the music they'll like the most.  Music businesses are looking for listeners charged up to buy, I'm trying to get the listeners charged up.
<br/><br/>
So how do I do it?  An Amazon search for a song title?  Amazon's product database isn't big enough (hard to believe, I know) and the lookup algorithms aren't smart enough -- I need a relevance match, not a keyword search.  ISRC identifiers?  Good luck getting them for online music, much less matching them to vendors.  So help me out here, Music Industry: given a product and a buyer, how do I find a seller? [<a href="http://gonze.com/weblog/story/3-23-4">Lucas Gonze, 3/23/04</a>]
</blockquote>
</p>
<p>
There are a bunch of things that frustrate me about playlists. Competing formats: m3u, smil. Inconsistent behavior: if you want your tunes (and associated images) to render as you expect, you're looking at an insane test matrix. Crappy metadata: missing or incomplete, and often hard to find. Despite all these irritations I find myself returning to Webjay for the same reasons I write this blog and read others. What I know, I want to share with others. What others know, I want to know too. 
</p>
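<p>
To make the format and metadata complaints concrete, here is a minimal sketch that writes a list of tracks as an extended M3U playlist. The function name <tt>toM3u</tt>, the track objects, and the example.com URLs are all hypothetical. When a title is missing, the sketch can only fall back to the bare URL, which is the crappy-metadata problem in miniature.
</p>

```javascript
// Hypothetical helper: serialize tracks as an extended M3U playlist.
function toM3u(tracks) {
  const lines = ["#EXTM3U"];
  for (const track of tracks) {
    // -1 marks the duration as unknown; use the URL when no title is known.
    const title = track.title || track.url;
    lines.push("#EXTINF:-1," + title);
    lines.push(track.url);
  }
  return lines.join("\n") + "\n";
}

// Made-up tracks; the second one has no title metadata.
const sampleTracks = [
  { title: "Example Tune", url: "http://example.com/music/example-tune.mp3" },
  { url: "http://example.com/music/untitled.mp3" }
];

console.log(toM3u(sampleTracks));
```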
<p>
If it's easy to buy music online, I sometimes will. But first it has to be easy to find, listen to, talk about, and share tunes. The intersection of blogs and playlists isn't yet nearly as smooth an experience as it should be, but the ideas that motivate webjay.org are exactly right.
</p>

</body>
</item>

<item num="a958">
<title>Human interface guidelines for the Internet</title>
<date>2004/03/30</date>
<body>

<p>
<blockquote>
Apple, of course, wrote the book on human interface guidelines by visualizing and documenting a range of interaction scenarios in meticulous detail. Today we have a variety of platform-specific guidelines -- for Windows, for GNOME, for Flash MX. But we lack general guidelines for how Internet applications should behave on all platforms. E-mail programs don't agree on how threading, foldering, and filtering should work. Web browsers don't agree on how drop-down search boxes should work. RSS readers don't agree on how the orange XML icon should work. Media players don't agree on how playlists should work.
<br/><br/>
We need HCI (human/computer interface) guidelines more than ever. And we need them not only for Windows, OS X, GNOME, and Flash, but for the uber-platform that subsumes them all. We need human interface guidelines for the Internet. [Full story at <a href="http://www.infoworld.com/article/04/03/26/13OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
The impetus for this column came from <a href="http://weblog.infoworld.com/udell/2004/03/23.html#a952">this posting on S/MIME signatures</a>, which argued that confusion about whether or how to trust a signature is a problem of UI, not cryptography. <a href="http://www.usercreations.com/weblog/">Robb Beal</a> violently agreed. He wrote:
<blockquote class="personQuote RobbBeal">
Yes! Every technical spec that has user-facing implications should have a corresponding functional spec.
<br/><br/>
See my <a href="http://www.usercreations.com/weblog/gems/Aggregator%20client%20HTTP%20tests.html">functional annotation</a> of Mark Pilgrim's HTTP tests for an example.
</blockquote>
</p>
<p>
Robb pointed me to some <a href="http://www.usercreations.com/weblog/2003/09/08.html#a316">other</a> <a href="http://www.usercreations.com/weblog/2004/03/18.html#a546">examples</a> as well. Why isn't this done more? Robb thinks it's because developers tend to want platform vendors to do this work for them. But even on a given platform, essential guidance about user interaction is often lacking.
</p>
<p>
Scanning the responses to my posting on S/MIME signatures, I realize some people took it as a condemnation of S/MIME. Not so. I was trying to illustrate how interactive context affects the implementation of a protocol, and how the nature of that context can be (but rarely is) specified. 
</p>
<p>
I had suggested, for example, that a mail client displaying a signed message should always display the address in the From: header (not just a friendly name), should display a standard signature icon, and should link the icon to a certificate viewer. Outlook 2000 breaks the first guideline. Darrell Dykstra wrote to point out that Outlook 2002 and 2003 comply with all three guidelines, which is great. Except, of course, they aren't guidelines written down anywhere, and that's my point. 
</p>
<p>
The other day, NPR's Day to Day ran a segment on <a target="audio" href="http://www.npr.org/features/feature.php?wfId=1788632">phishing</a>. In <a target="audio" href="http://weblog.infoworld.com/udell/gems/phishing.mp3">this clip</a>, John Dimsdale interviews David Jevans, chairman of the <a href="http://www.antiphishing.org/">Anti-Phishing Working Group</a>, who says:
<blockquote class="personQuote DavidJevans">
Typically it's the average consumer, who's quite Internet-savvy, and they get an email in that looks exactly like it came from their bank, with very compelling information -- it will have the logos, it will really try to fake the website.
</blockquote> 
We have a technical solution: Aunt Tillie could evaluate the site's SSL cert or the email cert of the phisher. But there isn't a snowball's chance in hell that she will. For that, and for the countless other ways that we fail to contextualize protocols in standard and familiar ways, we should be ashamed.
</p>
<p><b>Update:</b> John Patrick on phishing:
<blockquote class="personQuote JohnPatrick">
The moral of the story is to be increasingly careful. Anti-virus and anti-spam are not enough. Anti-spyware is not enough. Hardware and software firewalls are not enough. All of these are essential but the other ingredient is common sense. Look at your email carefully. Even if the "from" address is one you recognize, look also at the context.
<br/><br/>
...
<br/><br/>
Digital ID's are essential to add authentication to email and software downloads. We need to be able to establish that we are who we say we are and to be sure that others (people, links, software) are who they say they are. You can read more about this in the patrickWeb <a href="http://patrickweb.com/weblog/categories/pki/privacy.html">Privacy and Trust series</a>. [<a href="http://patrickweb.com/weblog/categories/pki/phishing3.html">John Patrick: Phishing Update</a>]
</blockquote>
</p>
</body>
</item>



<item num="a957">
<title>The social enterprise</title>
<date>2004/03/29</date>
<body>

<p>
<blockquote>
We are social animals for whom networked software is creating a new kind of habitat. Social software can be defined as whatever supports or amplifies our social behavior as we colonize the virtual realm. The category includes familiar things such as groupware and knowledge management, and extends to the new breed of relationship power tools that have brought the venture capitalists out of hibernation. [Full story at <a href="http://www.infoworld.com/article/04/03/26/13FEsocial_1.html">InfoWorld.com</a>]
</blockquote>
This story touched on too many themes for the allotted space, but I thought it important to try to paint the bigger picture. 
</p>
<p>
There's also an <a href="http://www.infoworld.com/article/04/03/26/13FEsocialint_1.html">interview</a> with Valdis Krebs and Gerry Falkowski. Valdis wrote me over the weekend with a correction to this bit:
<blockquote>
<p><b>IW: Does it cut the other way, too?</b></p>
<p>VK: We wouldn't take a job that we knew would lead to a resource action.</p>
<p><b>IW: Resource action?</b></p>
<p>VK: Layoff.</p>
</blockquote>
Valdis thought it was Gerry, not himself, speaking at this point. I just checked, and he's right. Here's the <a href="http://weblog.infoworld.com/udell/gems/resourceAction.mp3">clip</a>.
</p>

</body>
</item>



<item num="a956">
<title>Outsourcing anecdotes come in different flavors</title>
<date>2004/03/29</date>
<body>

<p>
The pro-outsourcing arguments advanced by economist Daniel Drezner, writing in Foreign Affairs, break no new ground. I was struck, though, by this comment about anecdotal evidence:
<blockquote class="personQuote DanielDrezner">
When forced to choose between statistical evidence showing that trade is good for the economy and anecdotal evidence of job losses due to import competition, Americans go with the anecdotes. [<a href="http://www.foreignaffairs.org/20040501faessay83301/daniel-w-drezner/the-outsourcing-bogeyman.html?mode=print">ForeignAffairs.org</a>, via <a href="http://weblog.siliconvalley.com/column/dangillmor/archives/010194.shtml#010194">Dan Gillmor</a>]
</blockquote>
I just want to point out that anecdotes come in all flavors. Here's one that you probably haven't heard. Last week, an Indian who runs an outsourcing business in Texas wrote to tell me that somebody threw stones through his office window. 
</p>
<p>
He says he can't prove this attack was motivated by anti-outsourcing sentiment, but thinks so based on the fact that his website was also recently defaced with messages like "*&amp;*&amp;&amp;** you have taken our jobs!" 
</p>
<p>
Sigh.
</p>
<p>
Most of the reactions to my <a href="http://weblog.infoworld.com/udell/2004/03/08.html#a939">recent column</a> on outsourcing, in which I interviewed MAPICS CEO Dick Cook, were favorable. To the minority of critics, I wrote back and asked: "How do you propose to deal with the situation?" No answers yet. 
</p>
<p>
Meanwhile, an issue that was never abstract to me has become even more concrete. CNET's Builder.com will be outsourcing some of its "content production" to an editorial firm in India. The CNET spokesperson cited in <a href="http://trends.newsforge.com/trends/04/03/18/2240229.shtml">news coverage of this story</a> is senior editor Rex Baldazo, who worked for me at BYTE years ago. 
</p>
<p>
I keep coming back to <a href="http://weblog.infoworld.com/udell/2004/03/09.html#a940">the exchange between Daniel Pink and Shirley Turner</a>. "We've done it before," says Pink, "going from farm to factory, from factory to knowledge work, and from knowledge work to whatever's next." To which Turner responds: "I'd like to know where you go from knowledge."
</p>
<p>
Not, let's hope, to rock-throwing.
</p>
</body>
</item>


<item num="a955">
<title>Refrigerator magnet mystery: solved</title>
<date>2004/03/26</date>
<body>

<p>
<a href="http://digme.typepad.com/">H&#229;kon Styri</a> figured out the answer to yesterday's puzzle. The page in question -- <a href="http://weblog.infoworld.com/udell/2002/09/23.html">The analog hole</a> -- <i>does</i> mention magnets. The text is hidden in the Strategic Developer widget on that page. As H&#229;kon points out, that's very confusing. Indeed, it calls into question the common practice of decorating web pages with all sorts of auxiliary info-widgets.
</p>
<p>
The problem isn't just with Marc Barrot's nifty expanding <a href="http://www.activerenderer.com/">activeRenderer</a> widget, which I use in a couple of places in my standard template. Even visible text that's unrelated to the primary item on a page will cause problems. For example, you can't do an effective fulltext search of my blog for anyone whose name appears in one of my blogrolls.
</p>
<p>
In theory a CSS attribute could say: "Don't index this element." Parsing it would likely be more work than search engines are currently willing or able to do. But it's a <a href="http://search.yahoo.com/search?p=yahoo+dumps+googl">competitive market again</a>, and there's going to be a struggle to differentiate premium search from commodity search. "Don't index this element," in and of itself, isn't a feature to write home about. But if an Internet-scale engine could deliver the kinds of <a href="http://udell.infoworld.com:8000/?/blog/item/title[contains(.,%20'Dynamic')]">structured search</a> I've implemented locally on this site, that would be a serious advantage. I wonder who'll <a href="http://www.google.com">get</a> <a href="http://search.yahoo.com">there</a> <a href="http://search.msn.com/">first</a>?
</p>
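<p>
To make the idea concrete, here's a minimal sketch of what honoring "Don't index this element" might look like inside an indexer. Everything in it is an assumption for illustration: the <tt>noindex</tt> class name, the shape of the page nodes, and the <tt>indexableText</tt> function are hypothetical, not something any search engine is known to support.
</p>
<pre class="code javascript">
// Hypothetical page model: each node has a tag, optional classes, text, and children.
// The "noindex" class name is an assumption, not an existing standard.
function indexableText(node) {
  var classes = node.classes || [];
  if (classes.indexOf('noindex') !== -1) {
    return [];                       // skip this element and everything inside it
  }
  var parts = node.text ? [node.text] : [];
  (node.children || []).forEach(function (child) {
    parts = parts.concat(indexableText(child));
  });
  return parts;
}

var page = {
  tag: 'body',
  children: [
    { tag: 'div', classes: ['item'], text: 'The analog hole' },
    { tag: 'div', classes: ['widget', 'noindex'],
      children: [{ tag: 'p', text: "My refrigerator magnets still don't receive weather reports" }] },
    { tag: 'div', classes: ['blogroll', 'noindex'], text: 'Names in the blogroll' }
  ]
};

// Only the primary item's text survives.
var words = indexableText(page).join(' ');
</pre>
<p>
In this sketch the widget text and the blogroll never reach the index, so a fulltext search for either would no longer land on an unrelated page.
</p>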

</body>
</item>

<item num="a954">
<title>The refrigerator magnet mystery</title>
<date>2004/03/25</date>
<body>

<p>
<img align="right" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/greenFridge.jpg"/>
My referral log shows four visits today as a result of this query: <a href="http://www.comcast.net/qry/websearch?query=research%20with%20circle%20hole%20magnets&amp;safe=on&amp;as_qdr=all&amp;lr=&amp;base=20&amp;num=10">research with circle hole magnets</a>. It presents so many interesting questions to divert me from writing the column that's due today!
</p>
<ol>
<p><li>What was really intended? Maybe <a href="http://www.google.com/search?q=magnets%20%22circular%20holes%22">magnets "circular holes"</a>?
</li></p>
<p><li>
Does anybody out there get the difference between <a href="http://www.google.com/search?q=magnets%20circular%20holes">magnets circular holes</a> and <a href="http://www.google.com/search?q=magnets%20%22circular%20holes%22">magnets "circular holes"</a>?
</li></p>
<p><li>
There were <i>four</i> clickthroughs to my page -- <a href="http://weblog.infoworld.com/udell/2002/09/23.html">The analog hole</a> -- from this query? Wasn't it evident, after the first, that <i>it doesn't even mention magnets</i>?
</li></p>
<p><li>
Why is the <a href="http://216.239.51.104/search?hl=en&amp;q=cache:pgLPgbE9AF0J:http://weblog.infoworld.com/udell/2002/09/23.html+research+with+circle+hole+magnets">cached version</a> of the found page empty?
</li></p>
<p><li>
Why is the found content:
<blockquote>
... My refrigerator magnets still don't receive weather reports, but when they do, we'll ... CEO and Research Chair of the Burton Group, Jamie is a longtime industry ...
</blockquote>
actually from another item, <a href="http://weblog.infoworld.com/udell/2003/03/07.html">Playing the Internet scales</a>?
</li></p>
<p><li>
Why does the "refrigerator magnets" blurb appear in the summary of <i>other</i> pages too? 
<blockquote>
<a href="http://weblog.infoworld.com/udell/2003/06/27.html">Jon <b>Udell</b>: My conversation with Mr. Safe</a><br/><font size="-1"> <b>...</b> My <b>refrigerator</b> <b>magnets</b> still don't receive weather reports, but when<br/>
they do, we'll need something like PreCache to make them work. <b>...</b> 
<br/><font color="#008000">weblog.infoworld.com/udell/2003/06/27.html -  101k - </font><a class="fl" href="http://216.239.51.104/search?q=cache:7gtIVYll95gJ:weblog.infoworld.com/udell/2003/06/27.html+udell+refrigerator+magnets&amp;hl=en&amp;ie=UTF-8">Cached</a> - <a class="fl" href="/search?hl=en&amp;lr=&amp;ie=UTF-8&amp;oe=UTF-8&amp;c2coff=1&amp;safe=off&amp;q=related:weblog.infoworld.com/udell/2003/06/27.html">Similar pages</a></font>
</blockquote>
</li></p>
</ol>
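<p>
The two links in question 2 differ only in the double quotes, which ask the engine for an exact phrase rather than three independent keywords. A minimal sketch of how the two URLs are built, where <tt>searchUrl</tt> is a hypothetical helper written just for this illustration:
</p>
<pre class="code javascript">
// Build a Google search URL. encodeURIComponent turns spaces into %20
// and double quotes into %22.
function searchUrl(query) {
  return 'http://www.google.com/search?q=' + encodeURIComponent(query);
}

var loose = searchUrl('magnets circular holes');      // three independent keywords
var phrase = searchUrl('magnets "circular holes"');   // one keyword plus an exact phrase
</pre>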
<p>
I can't answer questions 1 through 4, but I've got a hunch about what's happening with 5 and 6. Try this query: <a href="http://search.atomz.com/search/?sp-a=sp10022a3d&amp;sp-f=ISO-8859-1&amp;sp-q=refrigerator&amp;xsubmit=search">refrigerator magnets</a>. The Atomz search engine, which I formerly used to search my blog locally (but never discontinued when I switched to InfoWorld's UltraSeek engine) appears to have suffered some kind of aphasia. When you search for "refrigerator magnets" it finds hundreds of articles, and uses the same summary for each. Doesn't happen with any other query I try, only "refrigerator magnets" (or "refrigerator" or "magnets"). Cool, huh?
</p>
<p>
Now, did Google find this refrigerator magnetized page? If it did, how exactly did the "refrigerator magnet" summarization glitch infect Google? By linking to that wacky Atomz query, from this posting, will I make <i>all</i> my Google summaries be about refrigerator magnets? 
</p>
<p>
Alas, my deadline looms, so the answers to these pressing questions will have to wait. 
</p>

</body>
</item>


<item num="a953">
<title>The Firefox opportunity</title>
<date>2004/03/24</date>
<body>

<p>
<blockquote>
The future of "great Windows applications," we're told, lies with Longhorn's next-generation presentation subsystem, Avalon, which will reboot software development sometime in the latter half of this decade. Of course, even Microsoft can't wait until then. Consider InfoPath. It's a great Windows application and a rich Internet client that had to ship in 2003. Its foundation is none other than Internet Explorer -- or rather, the suite of components and Internet standards on which Internet Explorer depends. Could InfoPath have been built on a Mozilla foundation instead? You bet. And the result wouldn't just be a great Windows application. It would be a great application, period. [Full story at <a href="http://www.infoworld.com/article/04/03/19/12OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
After I wrote this column, I checked out an interesting new application that I wish had been built on a Mozilla foundation: <a href="http://www.onfolio.com/">Onfolio</a>. You can't fault Onfolio's creator, J.J. Allaire, for targeting the overwhelming majority platform: IE/Win. Of course as a .NET app, Onfolio targets a minority within that majority. We live in interesting times! 
</p>
<p>
It's easy to imagine a Mozilla-based organizer that does the kinds of things Onfolio does, but on all platforms. The UI would be handled by XUL, JavaScript, and components; storage would be handled (as in Chandler) by Berkeley DB XML. What's missing from this equation is the enormous value that I'm told (and that I believe) the .NET Framework is delivering to Onfolio. Of course Mozilla can use the .NET Framework on Windows too. The wild card in the deck is Mono, which could in theory deliver similar value on other platforms. One of these days, I guess somebody's going to dive in and test those technical (and legal!) waters.
</p>
<p>
In response to the Firefox column, Peter Traeg asked:
<blockquote class="personQuote PeterTraeg">
You mentioned in your article that you have been using Firefox to build apps 
that fetch, transform, and search XML documents. Do you have some links you could share with information on how to do this?
</blockquote>
Yup. <a href="http://webservices.xml.com/pub/a/ws/2003/06/10/xpathsearch.html">Here</a> and <a href="http://www.xml.com/pub/a/2003/10/08/udell.html">here</a>.
</p>
<p>
For those who don't want to switch (or aren't allowed to), another reader recommends a way to have, in IE, the kinds of enhancements that have been showing up in Mozilla:
<blockquote>
There is a product called <a href="http://www.myie2.com/html_en/home.htm">My IE 2</a> that adds not only a very flexible Tabbed Browsing interface to IE 6, but ad blocking, pop-up blocking and a whole lot more...It is an amazing piece of freeware and it does all of this and much more, including mouse gestures, in a very small, 700+ Kilobyte file that basically sits atop IE 6 and makes your browsing experience a much, much better one.
</blockquote>
I haven't tried this yet, so I can't recommend it, but it seems noteworthy.
</p>
<p>
Finally, I've noticed growing interest in Firefox search plugins. Oracle guru Steve Muench is <a href="http://radio.weblogs.com/0118231/2004/03/24.html#a253">rolling his own</a>, and Flash guru Mike Chambers has made <a href="http://www.markme.com/mesh/archives/004528.cfm">a bunch of installable ones</a> for searching Macromedia resources.
</p>

</body>
</item>


<item num="a952">
<title>How to forge an S/MIME signature</title>
<date>2004/03/23</date>
<body>

<p>
The other day I received an email message from jon_udell@infoworld.com, accompanied by a valid S/MIME digital signature. But the message wasn't from me. It was from David Wall (see <a href="http://weblog.infoworld.com/udell/2004/03/19.html#a948">earlier post</a>), and here's what it said:
</p>
<blockquote class="personQuote DavidWall">
As mentioned here is a spoofed email that appears to come from you and is digitally signed.  Note that I signed up using another person's email address, another person's SSN, another person's phone number, chose your name as the password for the key, etc.  In other words, these "precautions" Thawte demands don't provide any real security any more than checking IDs will stop terrorism.  Only the honest will comply.  
<br/><br/> 
And what's worse, the person who really has the SSN that I provided won't be able to get her own certificate now because I've locked it up, yet Thawte doesn't know who I am to resolve matters.
</blockquote>
<p>
Ouch! This withering critique of S/MIME deserves a closer look. I was at first perplexed because I've tested S/MIME forgery myself, and have verified that when the From: header doesn't match the certified address, S/MIME-aware mailers tell you that the signature is invalid. So let's look at how David's trick works.
</p>
<p>
I began by retracing David's steps, because it's been a very long time since I originally signed up with Thawte -- a process which, as reader Matt Dirks notes, begins <a href="http://www.thawte.com/email/">here</a>. (Another reader, Dennis Wurster, pointed me to <a href="http://www.joar.com/certificates/">this overview</a> of the signup process.)
</p>
<p>
Like David, I was able to use a random 10-digit number to satisfy Thawte's requirement for a "national ID." He's right: that's lame. The freemail cert does one thing, and one thing only: it binds a public key to an email address with minimal assurance. Thawte, like other certification authorities, will sell you certificates that offer more robust assurance. Only then, arguably, should official credentials -- SSN, driver's license, passport -- play a role in the process. I'd love to hear from Thawte (or another CA offering free S/MIME certs) on this point.
</p>
<p>
Here's the information I gave Thawte when I created my new account:
<pre>
        Surname: Gates
      Forenames: Bill
    Nationality: American
USA National ID: xxx-xxx-xxxxx
      Thawte ID: JUDELL@MYREALBOX.COM
  Date of Birth: 1955/05/12
</pre>
And here is a spoofed message from Bill Gates with a valid digital signature backed by a certificate containing these data:
<div>
<img vspace="10" alt="spoofed S/MIME, Outlook" src="http://weblog.infoworld.com/udell/gems/gotchaOutlook1.jpg"/>
</div>
</p>
<p>
Cool, huh? It probably wouldn't occur to <a href="http://weblog.infoworld.com/udell/2004/03/02.html#a931">Aunt Tillie</a> to click on the signature icon. If she did, here's what she would see:
<div>
<img vspace="10" alt="spoofed S/MIME, Outlook: revealed" src="http://weblog.infoworld.com/udell/gems/gotchaOutlook2.jpg"/>
</div>
The signature is valid because the email address in the From: header <i>does</i> match the certified email address. But Aunt Tillie can't see the mismatch between the address and the friendly name. The forger, relying on the fact that Outlook's "friendly" display hides the actual email address, misdirects Aunt Tillie. She is tricked into believing that the signature binds to billg@microsoft.com rather than to judell@myrealbox.com.
</p>
<p>
In another context, that bit of misdirection doesn't work so well. Here's the same message in OS X Mail:
<div>
<img border="1" vspace="10" alt="spoofed S/MIME, OS X" src="http://weblog.infoworld.com/udell/gems/gotchaOSX.jpg"/>
</div>
In this case, even poor old Aunt Tillie might wrinkle her brow and suspect foul play. Unfortunately for her, OS X Mail's ability to inspect the certificate is far weaker than Outlook's. Clicking on the signature icon does nothing. And there is zero chance she'll find her way to the Keychain Access app, figure out which of a bunch of similarly-named Thawte certs corresponds to this message, and inspect it.
</p>
<p>
David Wall and I draw different conclusions from all this. Mine follows from last week's posting, <a href="http://weblog.infoworld.com/udell/2004/03/17.html#a946">standards versus conventions</a>: we can't neglect the subtle user-interface details. For example, <a href="http://www.ietf.org/rfc/rfc2312.txt">RFC2312</a> says:
<blockquote>
Receiving agents MUST check that the address in the From header of a mail message matches an Internet mail address in the signer's certificate.
</blockquote>
Clearly that's necessary, but not sufficient. I can imagine some additional rules:
<ul>
<li><p>A receiving agent that displays a signed message MUST display the address in the From header along with the friendly name.</p></li>
<li><p>A receiving agent that displays a signed message MUST display one of the standard signature icons: {URL}</p></li>
<li><p>The signature icon MUST link to a certificate viewer.</p></li>
</ul>
</p>
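<p>
Here's a rough sketch of why the RFC's check is necessary but not sufficient, and of what the first additional rule would change. The message and certificate objects, their field names, and both functions are hypothetical. The case-insensitive comparison is also an assumption on my part, not something the RFC text quoted above spells out.
</p>
<pre class="code javascript">
// Hypothetical message and certificate shapes, for illustration only.
// The RFC's rule: the From address must match an address in the signer's certificate.
function fromMatchesCert(message, cert) {
  var from = message.fromAddress.toLowerCase();   // assumption: compare case-insensitively
  return cert.emailAddresses.some(function (addr) {
    return addr.toLowerCase() === from;
  });
}

// First proposed rule: always show the address along with the friendly name.
function displaySender(message) {
  return message.fromName + ' (' + message.fromAddress + ')';
}

var spoof = { fromName: 'Bill Gates', fromAddress: 'judell@myrealbox.com' };
var cert = { emailAddresses: ['JUDELL@MYREALBOX.COM'] };

var valid = fromMatchesCert(spoof, cert);   // the RFC check passes for the spoof
var shown = displaySender(spoof);           // the address is now visible to the reader
</pre>
<p>
The check passes because the forger really does control the certified address. Only the display rule gives Aunt Tillie a chance to notice that the name and the address don't belong together.
</p>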
<p>
Historically, of course, we don't spell these things out. When I suggested that perhaps we should, Marcus Ramberg suggested that I've been "taking a deep hit of the crack pipe":
<blockquote class="personQuote MarcusRamberg">
For one, standards like this would conflict with UI standards on the respective operating systems the apps run on, and anyways, the point of making a standard is so entities can interact with each other. How applications should interact with users is a topic for UI Design 101. [<a href="http://thefeed.no/marcus/archives/000621.html">Marcus Ramberg</a>]
</blockquote> 
I disagree. Security is a game of social engineering as well as cryptography. And social engineering is inseparably linked to UI conventions. I'm not saying that RFC2312 is the place to spell out the details, but I'm pretty sure we need to do it somewhere.
</p>

</body>
</item>


<item num="a951">
<title>Let your customers sell your software</title>
<date>2004/03/23</date>
<body>

<p>
Paul Everitt's <a href="http://radio.weblogs.com/0116506/2004/03/23.html">Zope Dispatches</a> blog today features a <a href="http://zea.zope-europe.org/~paul/oxygen/oxygen.html">narrated screen video</a> that demonstrates <a href="http://www.oxygenxml.com/">oXygen</a>, Paul's weapon of choice for wrangling XML and XSLT. I invite everyone -- and in particular the marketing folks at SyncRO Soft, Ltd (oXygen's maker) -- to compare what's happening on the oXygen site with what's happening on Paul's blog.
</p>
<p>
The oXygen site has all the familiar paraphernalia: a <a href="http://www.oxygenxml.com/features/">features and benefits list</a>, a <a href="http://www.oxygenxml.com/customers.html">customers list</a>, a bunch of <a href="http://www.oxygenxml.com/doc/index.html">articles and documentation</a>. Yawn. OK, I should look into that, someday...
</p>
<p>
Meanwhile Paul, who's "merely" a user of oXygen, shows me and tells me what the tool does, and why he values it. The customers that the oXygen site lists are just names and websites that otherwise mean nothing to me. Paul, on the other hand, is someone I know. And even if I didn't know him personally, I could get a sense of the guy by absorbing the identity he's projected into his blog over time. So his recommendation feels personal.
</p>
<p>
Reading his commentary on the screen video he made, I hear the voice of experience and the ring of truth:
</p>
<blockquote class="personQuote PaulEveritt">
FWIW, Komodo is a nice XML environment as well. It has the one feature I miss the most in oxygen, which is an XSLT debugger. This is just wildly useful in Komodo: set a breakpoint in an XSLT file, and watch as the result document is rendered, stepwise. Still, oxygen makes a nicer XML environment, as it is really geared towards XML semantics (such as enforcing the XSLT schema and learning structure).
</blockquote>
<p>
The fact that Paul's assessment of oXygen includes a comparison with Komodo (and an implicit criticism of oXygen) makes his final recommendation all the more credible. As does the fact that an oXygen user liked the product enough to spend time and effort demonstrating it to all interested parties on his blog.
</p>
<p>
Very, very cool. It reinforces my hunch that the combination of easy-to-create blogs and easy-to-create narrated screen videos could put users in charge of software marketing, education, and training.
</p>

</body>
</item>


<item num="a950">
<title>Blog/print synergy: my strategies</title>
<date>2004/03/22</date>
<body>

<p>
For almost a decade I've used the Web -- and most recently my blog -- to research, develop, and enhance the articles I write for magazines. When I ran into <a href="http://weblog.siliconvalley.com/column/dangillmor/">Dan Gillmor</a> at SXSW we discussed some of my strategies, and Dan asked me to write them up. Seems worth doing, so here goes. Much of this concerns the IT trade pub ecosystem specifically, but I think the principles will generalize. The basic pattern is simple: a story gestates in blogspace, appears in print and online, and then matures in blogspace.
</p>
<p><b>Pre-publication phase: Announce story on blog, publish draft outline, solicit feedback.</b> 
The  <a href="http://weblog.infoworld.com/udell/2004/01/27.html#a900">preview</a> of my <a href="http://www.infoworld.com/reports/09SRmsnet.html">.NET cover story</a> was a good example of the role the blog can play in the pre-publication phase of a story. Among the purposes served by that posting:
<ul>
<li><p><b>Validate the idea.</b> There's a lot of complaining, lately, about the "echo chamber" effect in the blogosphere. But in this case, the blog is a way of breaking out of another kind of echo chamber: the editorial ivory tower. Every magazine has some version of the editorial meeting, a session in which ideas are pitched and vetted. The external feedback loop that governs this process is highly attenuated, though. If an idea was incomplete, or poorly focused, you'll hear about it from readers -- but only after the article hits print. Since readers are stakeholders in this process, I figure I should involve them up front. This makes particularly good sense in the realm of IT trade journalism, where we writers serve as proxies for the readers. I enjoy privileged access to vendors, but with that privilege comes a responsibility to ask and answer the questions that matter to readers. By operating transparently, in blogspace, I invite my reader-stakeholders to keep me on track. 
</p></li>
<li><p><b>Gather expertise.</b> I start with topics to which I bring a certain amount of expertise. Then I leverage what I know (and who I know) to find what I don't know (and who I don't know). Of course in the trade magazine business, there is a whole profession dedicated to helping me do that. When a story appears on the editorial calendar, I'm swamped with phone calls and emails from PR folk who want to supply me with analysts, executives, domain experts, and customers. This isn't necessarily a bad thing. I sometimes accept these opportunities, and in some cases, I learn from them. It's dangerous, though, to be led down the path of least resistance. So I rely on the blog to find other people who have important things to tell me. As you can imagine, this makes PR folk really nervous. It's their job to try to control my story. It's my job to route around that control, and the blog is a tremendously powerful tool for doing that.
</p></li>
<li><p><b>Focus the PR energy.</b> 
The journalism/PR game is made more antagonistic than it needs to be when there's insufficient data in play. For example, I neglected to blog proactively about the <a href="http://www.infoworld.com/article/04/01/23/04FEforms_1.html">e-forms story</a> we ran in January. As a result, the PR people were forced to rely on our <a href="http://www.infoworld.com/advertise/adv_edt_cal.html">editorial calendar</a>, which described the story as something like "Life of a document." They concluded, not irrationally, that it was going to be a story about document management. And then many took a further leap of faith and figured that, given the impending Sarbanes-Oxley deadline, I should write a story about document management systems that help companies comply with that legislation. I can't tell you how many calls and emails I got inquiring about my "Sarbanes-Oxley story." But this was really my fault. Had I spelled out my intention -- which was to compare the Acrobat, InfoPath, and XForms approaches to e-forms -- I'd have spared a bunch of people from making phone calls and writing emails that were as fruitless for them as they were annoying to me. And I'd have encouraged the folks who really should have been contacting me to do so.
</p></li>
<li><p><b>Dialogue with vendors.</b>
In the IT trades, readers aren't the only stakeholders. Vendors are stakeholders too. They're creating products and services that, over the years, have grown steadily more complex, and more difficult to understand fully and explain well. They rely on trade pubs to help get the story out, but the pubs have less and less space for detailed explanation and analysis. There's much more to say than InfoWorld (or any other trade pub) has room to print. By narrating my evolving views in my blog, I invite everyone -- including vendors, who are of course the best experts on their own stuff -- to help me refine those views. That give and take yields valuable insight and -- when it can take the form of cross-blog conversations (i.e., isn't secret, as many things aren't) -- valuable content. A tip of the hat here to Microsoft, by the way, whose developers are miles ahead of their counterparts at Sun, IBM, Apple, and elsewhere when it comes to engaging with the blog medium.
</p></li>
<li><p><b>Promote the story.</b>
I hadn't thought about this until recently, but blogging the run-up to a print story can help create buzz. That may matter less to a controlled-circulation magazine like InfoWorld than it would to a newsstand pub, but it's still an interesting notion. When you've got a major story on an evergreen topic -- one that isn't going to break news or reach shocking conclusions -- opening up the process a bit may be a useful marketing strategy. That's what movie-makers do, after all, and the magazine game is a species of show business.
</p></li>
</ul>
</p>
<p>
<b>Post-publication phase: analysis, feedback, enhancement.</b> 
Since the advent of the Web, magazine sites have used the "TalkBack" device to enable readers (and authors) to comment on stories. This was a great way to work around the severely-bottlenecked "letter-to-the-editor" medium. In the blog era, there's another way to skin this cat: aggregate what readers (and authors) say on their blogs about the published article.
</p>
<p>
I think we'll see more of this TrackBack-like approach as time goes on. In fact, InfoWorld.com takes a step in that direction, following a suggestion of mine. Blog entries that reference InfoWorld.com stories, found by way of Feedster and Technorati, are collected into a database. Then a selected few are shown on every page, in a box labeled "Top Site Referrals." I find this label confusing, and would rather see something like "Bloggers talk back." But that wouldn't work well either because, currently, the items appear sitewide, not per-article.
</p>
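<p>
To make the sitewide versus per-article distinction concrete, here's a minimal sketch of the grouping step. It assumes the collected referrals are simple records with an <tt>article</tt> path and a <tt>blog</tt> URL; those field names, the sample data, and the <tt>groupByArticle</tt> helper are all hypothetical, not a description of how InfoWorld.com actually stores them.
</p>

```javascript
// Hypothetical referral records, shaped the way a Feedster or Technorati
// crawl might report them. The field names are assumptions.
const referrals = [
  { article: "/article/04/03/12/11OPstrategic_1.html", blog: "http://blog-one.example/entry1" },
  { article: "/article/04/03/12/11OPstrategic_1.html", blog: "http://blog-two.example/entry7" },
  { article: "/article/02/11/11/021114opwebserv_1.html", blog: "http://blog-one.example/entry2" },
];

// Group referrals by the article they cite, so that each article page
// could show only the blog entries that talk about that article.
function groupByArticle(records) {
  const byArticle = new Map();
  for (const record of records) {
    if (!byArticle.has(record.article)) {
      byArticle.set(record.article, []);
    }
    byArticle.get(record.article).push(record);
  }
  return byArticle;
}

const perArticle = groupByArticle(referrals);
console.log(perArticle.get("/article/04/03/12/11OPstrategic_1.html").length);
```

<p>
With the records keyed this way, a per-article "Bloggers talk back" box is just a lookup on the current page's path.
</p>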
<p>
InfoWorld.com doesn't have the resources to collect all the substantive blog postings (and letters to the editor) that relate to each published article, and use them to advance the story in a coherent way. But as the author of a few of those articles, I have the bandwidth -- and the motivation -- to do exactly that. Here are some of the ways the blog can add depth to a printed story. 
</p>
<ul>
<li><p>
<b>Respond to readers.</b> 
I used to mention my published stories on the blog immediately. Lately, though, I've decided to let InfoWorld's RSS feeds announce the stories, and hold my posting until I've had a chance to collect and process email and blog feedback. Last week's <a href="http://www.infoworld.com/article/04/03/12/11OPstrategic_1.html">column on email identity</a> is a case in point. It posted to the Web on Friday the 12th, but it wasn't until a week later -- last Friday -- that I'd gathered enough feedback to support a <a href="http://weblog.infoworld.com/udell/2004/03/19.html#a948">substantive follow-up</a>. 
</p></li>
<li><p>
<b>Publish interview out-takes.</b> 
I've used the blog to expand on published interviews with various people including <a href="http://weblog.infoworld.com/udell/2003/02/13.html">Ward Cunningham</a> and <a href="http://weblog.infoworld.com/udell/2004/03/08.html#a939">Dick Cook</a>. I was going to add Jean Paoli to this list, but when I went back and looked, that entire interview ran as <a href="http://www.infoworld.com/article/02/11/11/021114opwebserv_1.html">a column</a>. Interestingly, Phil Wainewright saw where this was going even before I did. He originally wrote:
<blockquote class="personQuote PhilWainewright">
This is cutting-edge journalism, by the way -- neither a finished article nor a weblog entry but something in-between that would never have happened without the influence of weblogging or the convenience of online publishing -- an analytical journalist publishing his interview notes accompanied by his reflections on them.  [<a href="http://www.looselycoupled.com/blog/2002_11_10_lc.htm">Loosely Coupled</a>]
</blockquote>
Then, when he realized that in this case the interview-plus-reflections appeared in the column, not the blog, he added:
<blockquote class="personQuote PhilWainewright">
Jon subsequently <a href="http://weblog.infoworld.com/udell/2002/11/15.html#a508">noted</a> that the article was his weekly column, so I shouldn't really have implied that it was less than a finished piece. But I almost wish that I <i>had</i> been right, because the idea of supplementing traditional published formats with new ones appeals to me.
</blockquote>
Me too. Phil was right, just not about that particular example. His comment helped crystallize the approach I've taken with subsequent interviews, and plan to continue.
</p></li>
<li><p>
<b>Publish demos and examples.</b> 
My item on <a href="http://weblog.infoworld.com/udell/2004/03/10.html#a941">secure use of private keys</a>, which featured screen videos of advanced private-key security configuration in OS X and Windows, was a companion to the <a href="http://weblog.infoworld.com/udell/2004/03/19.html#a948">column on email identity</a>. It's time-consuming to do this kind of thing, but with more practice using the capture/edit tools, and some refinement of my presentation skills, I hope to be able to make it happen more routinely. Clearly you can't do this in print, but it makes a powerful complement to the printed article. 
</p></li>
</ul>
<p>
The rhetoric swirling around blogs and journalism often takes an adversarial tone. One of the reasons for that, I think, is the relationship of the two cultures to their primary sources. Bloggers feel obliged to cite them; journalists often don't. A startling example of this was the <a href="http://weblog.infoworld.com/udell/2003/09/26.html">Dan Geer incident</a>, which revolved around a PDF report on the Web. Every blogger who commented on the matter linked to that report. No conventional journalist did. 
</p>
<p>
I won't always report everything that someone said to me, or cite every information source I've consulted, because I'm trying to tell stories here, and I want to keep the narrative lively. But using the blog to open a window onto my primary sources before, during, and after the publication of an article helps me -- and the various stakeholders -- in all sorts of ways.
</p>

</body>
</item>

<item num="a949">
<title>More on OS X certs</title>
<date>2004/03/22</date>
<body>

<p>
I mentioned <a href="http://weblog.infoworld.com/udell/2004/03/19.html">the other day</a> that OS X Mail and Outlook handled a DoD email certificate differently: OS X Mail trusted the cert, and Outlook didn't. The obvious explanation -- that OS X has the DoD root certificates pre-installed, whereas Windows doesn't -- somehow never occurred to me. But according to Daniel Dulay, that is indeed the case:
</p>
<blockquote class="personQuote DanielDulay">
<p>
I have worked in the computer security field in the past, and I have experience with deploying PKI in enterprises. I also have had a little exposure to the DoD smart card, the Common Access Card or CAC card. I'd like to comment on your story about receiving an email signed by DoD user and your description of Mail.app as "questionable" for having trusted this digital signature.
</p>
<p>
First, a kludgy little trick I learned in OS X. Do you know how to read the certificate authorities that Apple has shipped with Panther? The certs are stored in /System/Library/Keychains/X509Anchors and /System/Library/Keychains/X509Certificates, and you may use Keychain Access to read these files. In Keychain Access go to File -> Add Keychain... and point to one of these files. I should add the caveat that I have always made a copy of these files first because I don't know how robust Keychain Access is or if this functionality is supported by Apple. (Another way to access these files is the command line certtool utility. See "man certtool" for some surprisingly detailed documentation.)
</p>
<p>
So if you open up the cert authorities, then you will find that the DoD certs are already installed on your system! This is why Mail.app trusted the digital signature from the DoD. Your windows box probably did not have the DoD cert installed (I know win2k does not, but I am not sure about XP).
</p>
<p>
Why are these certificates already there? Because Panther is supposed to have CAC card support built in! I have not seen it for myself, but you can find some tools under /usr/libexec/SmartCardServices. Panther is supposed to support smart card logins, and I assume that a smart card's certificates can be used with Mail.app or Safari. There is a detail-free article on Apple's web site, http://docs.info.apple.com/article.html?artnum=152235, and I would love to find out more.
</p>
</blockquote>
<p>
Interesting! I checked and sure enough, OS X trusts a bunch of DoD root certification authorities. Who would have thunk it? Thanks, Daniel. 
</p>

</body>
</item>


<item num="a948">
<title>Making email identity work</title>
<date>2004/03/19</date>
<body>

<p>
<blockquote>
I've watched with bemusement as Bill Gates has been making the rounds lately -- the World Economic Forum, the RSA Conference -- to announce that Microsoft is "innovating on many different fronts" to eradicate spam. Really? The hashcash scheme, which requires the sender to spend CPU cycles, dates back to about 1992 or so. And "caller ID for e-mail" derives from RMX (Reverse MX), a more recent proposal to bind senders to authorized relays via DNS records.
<br/><br/>
The truth is we've had plenty of innovation over the years. What we've lacked is follow-through. Consider S/MIME digital signatures. It's very likely that your e-mail client supports them. But it's overwhelmingly unlikely that you've ever digitally signed an e-mail message. [Full story at <a href="http://www.infoworld.com/article/04/03/12/11OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
This column provoked some really interesting and useful responses. First, a <i>mea culpa</i>. When Google found <a href="http://www.macdevcenter.com/pub/a/mac/2003/01/20/mail.html">this elaborate recipe</a> for acquiring a digital certificate in OS X, I assumed the procedure was necessary, and followed it. Not so. The latest version of Safari can, in fact, request a cert, retrieve it, and install it directly into the OS X keychain.
</p>
<p>
There's no excuse for not having checked that myself. Typically I do. I've probably installed more digital certificates, in more browsers, on more operating systems, than anybody. But sadly I was willing to believe that the painful procedure outlined in that O'ReillyNet article was necessary because, well, that's the universal experience of S/MIME. Everything's ten times harder than it should be.
</p>
<p>
Whenever I write something about digital signatures, a handful of folks are inspired to send me signed messages, and since this happens so rarely, I always learn something new. One such message came from a DoD employee, who wishes to remain anonymous. His was the first cert I've ever received from a DoD certification authority. Outlook and OS X Mail, as it turns out, have inverse policies for dealing with this case. Outlook refused to trust the cert until I explicitly approved the issuing DoD CA. OS X Mail, questionably in my view, trusted it implicitly.
</p>
<p>
Anyway, the DoD guy had written to me to find out how to require per-message passwords, an advanced feature I describe in the column. In his office they use smartcards. When he hits Send in Outlook, he's challenged once for the smartcard PIN. Subsequent access to the signing key requires no further interaction. He's concerned about walking away from the machine and leaving signing enabled. For that, at least, there's a solution: yank the card when you walk away. But I'd add another concern: that a piece of rogue software could show up even while he's sitting there, and silently impersonate him. Clearly you're not going to yank the card during a session. So per-message confirmation of access to the private key -- which I've now also learned <a href="http://weblog.infoworld.com/udell/2004/03/10.html#a941">how to do in OS X Mail</a> -- seems like a good idea to me.
</p>
<p>
But according to this fellow, what I have been considering a feature of Outlook is actually thought, by Microsoft, to be a bug! 
<blockquote>
If I am understanding the document at the following URL properly, Microsoft considers it a BUG if you get asked for your password before sending each digitally signed message (using Windows XP) and they have a BUG FIX so it will STOP asking you each time.  This seems BACKWARDS to me from a security standpoint! 
<br/><br/>http://support.microsoft.com/?kbid=821574
</blockquote>
Go figure.
</p>
<p>
Finally, I received this thoughtful response from David Wall, chief software architect of <a href="http://www.yozons.com/">Yozons Inc.</a>, which I quote with permission:
<blockquote class="personQuote DavidWall">
You are to be commended for fighting through the free email certificate
acquisition and installation process.  And to think you just have to do it
again next year.  Or when you get another computer.  Or you want to send
email from your office, laptop and home computer using the same email
address.  Or when you change your email address, and you realize there's no
way to invalidate the certificate for the old email address.
<br/><br/>
And if just you and the rest of the world would actually do this complicated
process, S/MIME would finally become useful for email, provided all those
desktops were secure enough to keep hackers and virus writers from stealing
your keys.  Also, if you encrypt on your desktop using a recipient's public
key, you'll likely be violating corporate policies because the company will
not be able to meaningfully audit or archive the encrypted message.
<br/><br/>
But do you suppose free email certificates wouldn't be free today if people
actually wanted them?  They are free because nobody will pay for them, and
even at the cost of nada, few actually do.  I think this points out that
people as a whole just can't work with PKI's complexity, portability and
constant renewal hassles.  Have you ever tried to validate a digitally
signed email from a few years ago?  Do you really have the certificates that
went with old message today?  And even if you're one of the rare folks who
actually keeps all of these thousands of certificates -- one per email
address per year does add up quickly -- because they expire, you will get
signature failures and have to note that the error was related to expiration
and not because it was tampered with or the cert was revoked.
<br/><br/>
T-Mobile is our most recent large business customer.  There are working
alternatives today, like our Yozons business private network, and unlike
S/MIME, they can also produce legally recognized electronic signatures as
well as keep messages secure, provide full tracking and auditing, and
there's no fuss about installing, revoking or otherwise keeping digital
certificates current and secure.
</blockquote>
</p>
<p>
The S/MIME problems that David Wall cites are quite real. And since we have so far failed to tame these problems on the public network, we are -- quite rationally -- retreating to various kinds of private networks. Yozons' solution is one example. Groove is another. Aggressively-whitelisted email services are yet another. It's far more practical to establish trust within private networks than on the public network, and there are very good reasons to do so.
</p>
<p>
But private networks are islands. We ultimately need a workable trust solution for the global public network. That's clearly a daunting challenge. PKI is only a first draft of the solution. It's possible that we'll need to rip it up and start over. It's also possible, though, that we can refine and improve it. But not if current implementations don't evolve in response to use.
</p>

</body>
</item>


<item num="a947">
<title>REST for the rest of us</title>
<date>2004/03/18</date>
<body>

<p>
<blockquote> 
The word used again and again lately to describe distributed information systems is "composition". The Unix idea of piecing together solutions from reusable parts has morphed into XML-based, service-oriented architecture. This time around, though, it's all happening on the Web, in an environment where everybody can compose simple and popular tunes. When technologists forget that, I hope users will administer the <a href="http://www.google.com/search?q=%22dope+slap%22">dope slap</a> we deserve. [Full story at <a href="http://www.xml.com/pub/a/2004/03/17/udell.html">XML.com</a>]
</blockquote>
I wrote this column on the plane home from SXSW. <a href="http://www.metagrrrl.com/">Dinah Sanders</a>, product manager for the <a href="http://www.innovativeinterfaces.com/">Innovative Interfaces</a> OPAC system, invited me to sit in on a panel there, along with <a href="http://www.mamamusings.net/">Liz Lawley</a>, <a href="http://www.pixelcharmer.com/fieldnotes/">Tanya Raybourn</a>, and Sun's corporate librarian Cynthia Hill. Reactions to the panel came from <a href="http://www.hyperorg.com/blogger/mtarchive/002502.html">David Weinberger</a> and <a href="http://www.theshiftedlibrarian.com/2004/03/15.html#a5360">Jenny Levine</a>. 
</p>
<p>
I really enjoyed meeting and hearing from my fellow panelists. They're on the forefront of reinventing the institution of the library and the profession of librarianship. My message to them, and to people in every other profession, is: <a href="http://www.infoworld.com/article/02/12/17/021219opwebserv_1.html">expect spontaneous integration</a>. My message to IT propellerheads: don't disappoint that expectation. Larry Wall got it right: hard things should be possible, and easy things should be easy.
</p>

</body>
</item>


<item num="a946">
<title>Standards versus conventions</title>
<date>2004/03/17</date>
<body>

<p>
<a href="http://www.pmbrowser.info/hublog/">Alf Eaton</a> looked inside a Magnatune MP3 file to see what metadata is really contained there, and concluded that the media players indeed can show all of it: artist, title, date, and in the comment field, the text "magnatune.com." I checked and Alf's right: additional info about licensing and purchasing doesn't seem to be present. 
</p>
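<p>
For readers who want to peek at this kind of metadata themselves, here's a minimal sketch that decodes the older ID3v1 tag, a fixed 128-byte block at the end of the file whose comment field is the sort of place a string like "magnatune.com" would live. It assumes an ID3v1 tag is present; Magnatune's files may well carry ID3v2 tags instead, which are laid out differently. The <tt>parseId3v1</tt> name is mine, and the function expects the file's bytes as a Uint8Array.
</p>

```javascript
// Decode an ID3v1 tag from the last 128 bytes of an MP3 file.
// Layout: "TAG" (3 bytes), title (30), artist (30), album (30),
// year (4), comment (30), genre (1).
function parseId3v1(bytes) {
  if (bytes.length < 128) return null;
  const tag = bytes.subarray(bytes.length - 128);

  // Read a fixed-width text field, stopping at the first NUL pad byte.
  const text = (start, length) => {
    let s = "";
    for (let i = start; i < start + length; i++) {
      if (tag[i] === 0) break;
      s += String.fromCharCode(tag[i]);
    }
    return s.trim(); // some taggers pad with spaces instead of NULs
  };

  if (text(0, 3) !== "TAG") return null; // no ID3v1 tag present
  return {
    title: text(3, 30),
    artist: text(33, 30),
    album: text(63, 30),
    year: text(93, 4),
    comment: text(97, 30),
  };
}
```

<p>
A null result only means there's no ID3v1 block; it says nothing about whether an ID3v2 tag sits at the front of the file.
</p>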
<p>
Let's assume that the <a href="http://www.id3.org/id3v2.4.0-structure.txt">ID3</a> spec spelled out, in precise detail, how a company like Magnatune would embed its licensing and purchasing hooks into an MP3 -- in some more specific way than just dumping extra text into the comment field. From the perspective of the spec writers, it's case closed. Black and white. Either you conform to the spec or you don't. Done deal.
</p>
<p>
Except here's what these specs never talk about. In QuickTime/Mac, to access this metadata, I use the Get Movie Properties function ("Movie Properties" for a music track?), and then look inside Annotations. In RealOne/Mac, it's Window->Clip Info. In iTunes, File->Get Info. (In MediaPlayer/Mac, it's...never mind, can't seem to get that one to work at all.) At least the platform convention, Apple+I-key, invokes these differently-presented "get info" functions in a standard way.
</p>
<p>
Meanwhile over on Windows, another set of behaviors. QuickTime: Get Movie Properties->Annotations (Control-I). Real: File->Clip Properties->View Clip Info (Control-I). WinMedia: File->Properties->Content. WinMedia seems to lack an accelerator key. Arguably it's not needed, since WinMedia runs the metadata as a CNN-style crawl. But then, arguably, it is needed, because a License or Buy option would require a context for interaction, like a dialog box. 
</p>
<p>
So here's the point, and I see the same thing in other metadata standardization efforts such as the RSS/Atom fiasco. Technologists focus on formats and APIs, because that's what we know. How users will interact with the formats and APIs is left as an exercise for the implementer. But of course that's where the rubber meets the road. So syndication still lacks a well-known mechanism for one-click subscribe. Online music lacks a well-known mechanism for one-click licensing or purchasing. 
</p>
<p>
This is a crucial kind of standardization that tends to fall through the cracks. The IETF, W3C, and OASIS don't deal with such matters. Who could, and who should?
</p>

</body>
</item>

<item num="a945">
<title>The media-player fireswamp</title>
<date>2004/03/15</date>
<body>

<p>
By way of <a href="http://www.lifewithalacrity.com/">Christopher Allen</a>, I got to meet <a href="http://blogs.magnatune.com/buckman/">John Buckman</a> here at SXSW. John founded <a href="http://www.lyris.com/">Lyris</a>, a company whose hosted email list services I have used on behalf of clients. Although I prefer RSS to email as a direct marketing tool, the latter isn't going away anytime soon. So it's been a pleasure to rely on Lyris, a service that runs with impeccable integrity. John's new venture is <a href="http://www.magnatune.com">Magnatune</a>, an online record label I discovered a few months back whose endearing motto is "We are not evil." Equally endearing is this snippet from Magnatune's purchase  page:
</p>
<blockquote>
<font face="Verdana, Arial, utopia, sans-serif" size="2" color="#666666">        How much do you want to pay? <br/>
        <select name="amount" size="1"><option value="5">$5</option><option value="6">$6</option><option value="7">$7</option><option selected="selected" value="8">$8 (recommended)</option><option value="9">$9</option><option value="10">$10</option><option value="11">$11</option><option value="12">$12</option><option value="13">$13</option><option value="14">$14</option><option value="15">$15</option><option value="16">$16</option><option value="17">$17</option><option value="18">$18</option></select>
        <br/>
        <font size="1">(50% goes directly to the artist, so please be 
        generous)</font>
        </font>
</blockquote>
<p>
Interestingly, when given a choice (and the assurance that artists will be properly rewarded), users sometimes choose to <a href="http://www.magnatune.com/info/stats/highest_valued_this_month">pay more</a> than the suggested amount. 
</p>
<p>
I got to wondering how Magnatune and <a href="http://webjay.org/">Webjay</a> might work together. Webjay is Lucas Gonze's idea, a site whose tagline is "Listener-created radio." Nothing prevents me from extracting an MP3 URL from a Magnatune playlist and including it in a playlist that I publish on Webjay. But is this fair to Magnatune? The interstitial ads that Magnatune uses are included in their playlists, but not embedded in the individual MP3s. 
</p>
<p>
I asked Lucas and John (via email) to consider what would be the fair and right way to contextualize a Magnatune MP3 in a Webjay playlist. And just as I sent that message, Chris Allen -- who works with Magnatune (along with a bunch of other interesting ventures) -- sat down with me, here in the hallway at SXSW, to clarify his take on the matter. The 128kbps MP3 streams served up by Magnatune are made available under a Creative Commons Attribution-NonCommercial-ShareAlike license. So according to Chris, it's perfectly kosher to include them in playlists that you publish. The idea, says Chris, is that the <a href="http://www.id3.org/">ID3</a> tags embedded in the MP3s are sufficient to let listeners find out about Magnatune and its purchasing and licensing options. 
</p>
<p>
So I did the experiment, and it was a complete failure. None of the players on my PowerBook -- not iTunes, not RealPlayer, not QuickTime -- presented this metadata. That's hardly surprising. The various media players are, collectively, a train wreck. Publishing Web content that works in a standard and reliable way, in any browser, is a walk in the park compared to publishing AV content that works in a standard and reliable way in any media player.
</p>
<p>
We can't blame the problem on the record labels. It's the computer industry that gave us this fragmented and broken media platform. Now, suddenly, there's an explosion of content that can legally be ripped, mixed, burned, and blogged. The RIAA isn't the problem here. We need to find our way out of the QuickTime/Real/WinMedia/Flash fireswamp. 
</p>

</body>
</item>

<item num="a944">
<title>A nation of polarized readers</title>
<date>2004/03/13</date>

<body>

<p>
An <a href="http://www.nytimes.com/2004/03/13/arts/13BOOK.html">article</a> in today's New York Times features this <a target="_new" href="http://www.nytimes.com/imagepages/2004/03/13/arts/13BOOKCA01ready.html">Amazon-derived network map</a> by social network analyst Valdis Krebs. It's another fascinating illustration of an idea that Krebs mentioned when I <a href="http://webservices.xml.com/pub/a/ws/2002/06/04/udell.html">interviewed him</a> for the O'Reilly Network in mid-2002:
</p>
<blockquote>
<p>
Given good pictures of social networks, what will we use them for? Valdis
Krebs has lots of practical ideas. For example, consider Amazon's
related-book feature. If you follow these links a few steps out, says
Krebs, clusters emerge, and sometimes those clusters represent disjoint
interests connected only through one book. He offers Thomas Petzinger's <a href="http://www.amazon.com/exec/obidos/ASIN/0684863103/"><i>The New
Pioneers</i></a> as an example. It connected two different groups -- one
reading books on business and strategy, the other reading books on
complexity science and chaos theory. Now there are a number of books that
broker that connection, but Petzinger's was one of the first popular books
to do so, according to Krebs.
</p>
<p>
The general principle at work here, Krebs says, was articulated in Ron
Burt's <a href="http://www.amazon.com/exec/obidos/ASIN/0674843711/"><i>Structural Holes: The Social Structure of Competition</i></a>. It states that networks
with "holes" -- that is, unbrokered connections -- present the most
opportunity. A successful actor is one with ties to many points in the
network who can uniquely fill one or more of those holes. To that end,
Krebs -- who is writing a book on his experiences with social networks and
business organizations -- plans to mine Amazon, map out the communities of
interest relevant to his themes, and tune his presentation to optimally
broker among them. [<a href="http://webservices.xml.com/pub/a/ws/2002/06/04/udell.html">WebServices.XML.com: Seeing and Tuning Social Networks</a>]
</p> 
</blockquote>
<p>
In today's Times story, Krebs identifies <a href="http://www.amazon.com/exec/obidos/tg/detail/-/0743204735/">Bush at War</a> and <a href="http://www.amazon.com/exec/obidos/tg/detail/-/1400050219">Sleeping with the Devil</a>
as the current political books that are being read by conservatives and
liberals alike. Will publishers begin to apply this strategy
consciously, as Krebs suggests might be possible? Filling the
"structural holes" in networks, and creating large audiences from sets
of smaller ones, is a fascinating idea -- though I'm sure it's easier
said than done.
</p>
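<p>
The "structural holes" idea can be sketched in a few lines. Assume we already have a list of "customers also bought" links and a label for which cluster each book belongs to; a bridging book is then one whose neighbors fall into more than one cluster. The book IDs, the cluster labels, and the <tt>findBridges</tt> helper below are all made up for illustration. This is not Krebs's method or real Amazon data.
</p>

```javascript
// Hypothetical cluster labels: which reading community each book sits in.
// A null label means the book hasn't been assigned to either cluster.
const clusterOf = { A1: "left", A2: "left", B1: "right", B2: "right", X: null };

// Hypothetical "also bought" links between pairs of books.
const links = [
  ["A1", "A2"],
  ["B1", "B2"],
  ["A1", "X"],
  ["B1", "X"],
];

// Return the books whose neighbors span more than one cluster.
function findBridges(links, clusterOf) {
  const neighborClusters = new Map();
  const note = (book, other) => {
    const cluster = clusterOf[other];
    if (!cluster) return; // unlabeled neighbors tell us nothing
    if (!neighborClusters.has(book)) neighborClusters.set(book, new Set());
    neighborClusters.get(book).add(cluster);
  };
  for (const [a, b] of links) {
    note(a, b); // links are undirected, so record both directions
    note(b, a);
  }
  return [...neighborClusters]
    .filter(([, clusters]) => clusters.size > 1)
    .map(([book]) => book);
}

console.log(findBridges(links, clusterOf));
```

<p>
In this toy graph only X touches both communities. The hard part, of course, is writing the book that deserves those links.
</p>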
<p>Valdis Krebs will also appear in InfoWorld's March 29 issue, by the
way. For a feature on social software, I spoke with him and his
business partner Gerry Falkowski about their use of social network
analysis inside large enterprises such as IBM.
</p>

</body>
</item>

<item num="a943">
<title>Automated security scanning with Google</title>
<date>2004/03/12</date>
<body>

<p>
The other day <a href="http://www.masternewmedia.org/">Robin Good</a> posted a link, via <a href="http://www.elearnspace.org/blog/">George Siemens</a>, to a <a href="http://www.theregister.co.uk/content/55/36142.html">Register article</a>
by Scott Granneman. The article illustrates Google queries that find
passwords, web-accessible databases, and financial data. Nobody should
be surprised by what these queries reveal, but I'm sure a lot of folks
will be. </p>
<blockquote class="personQuote ScottGranneman">
A couple of websites have even sprung up dedicated to listing
words and phrases that reveal sensitive information and
vulnerabilities. My favorite of these, <a target="_blank" href="http://johnny.ihackstuff.com/index.php?module=prodreviews">Googledorks</a>,
is a treasure trove of ideas for the budding attacker. As a protective
countermeasure, all security pros should visit this site and try out
some of the suggestions on the sites that they oversee or with whom
they consult. With a little elbow grease, some Perl, and the <a target="_blank" href="http://www.google.com/apis/">Google Web API</a>,
you could write scripts that would automate the process and generate
some nice reports that you could show to your clients. [<a href="http://www.theregister.co.uk/content/55/36142.html">The Register: The Perils of Googling</a>]
</blockquote>
<p>
Indeed. What <i>does</i> surprise me is that there isn't a well-known tool for doing this. It would be the 21st-century equivalent of <a href="http://www.fish.com/satan/">SATAN</a>, the first security scanner I pointed at my website back in the mid 1990s. Or more recently, <a href="http://www.nessus.org/">Nessus</a>.
</p>
<p>
Perhaps such a tool is well-known, but not yet to the good guys? It
would be really useful. The mechanism, as Granneman points out, is
trivial, but assembling the database of vulnerabilities isn't. If a
credible project has formed around this idea, I'd like to know about
it.
</p>
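<p>
To show how trivial the mechanism is, here's a minimal sketch that scopes a list of risky search patterns to one site and turns each into a search URL. The two patterns are placeholders I made up to show the shape of the database; a useful tool would need a large, curated list, and it would also have to actually run the queries and report the hits, which this sketch doesn't attempt. The <tt>buildScanQueries</tt> name is hypothetical.
</p>

```javascript
// Hypothetical patterns. A real vulnerability database would be far larger.
const patterns = [
  'intitle:"index of" passwd',
  'filetype:sql "insert into"',
];

// Scope each pattern to a single site and build the matching search URL.
function buildScanQueries(site, patterns) {
  return patterns.map((pattern) => {
    const query = `site:${site} ${pattern}`;
    return {
      query,
      url: "http://www.google.com/search?q=" + encodeURIComponent(query),
    };
  });
}

for (const { query, url } of buildScanQueries("example.com", patterns)) {
  console.log(query + "\n  " + url);
}
```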

</body>
</item>

<item num="a942">
<title>More Firefox search plugins</title>
<date>2004/03/11</date>

<body>
<p>
<script type="text/javascript" src="http://weblog.infoworld.com/udell/gems/mycroft.js"></script>
I've added a few more search engines to Firefox, and I'm parking them here so I can easily transfer them to my other machines. 
<ul>
<li><p><a href="javascript:addEngine('safari', 'gif', 'Tech')">Safari Books Online</a></p></li>
<li><p><a href="javascript:addEngine('infoworld', 'gif', 'Tech')">InfoWorld</a></p></li>
<li><p><a href="javascript:addEngine('jonblog', 'gif', 'Tech')">Jon's Radio</a></p></li>
</ul>
</p>
<p>
Here's the procedure to create these plugins, by the way:
</p>
<ol>
<li><p>Capture the image. To do this I fetch the favicon.ico file from the site's root, and use ImageMagick to convert it to a GIF.</p></li>
<li><p>Write the control file. For example:
</p>
<pre class="code xml">
&lt;search 
   name="Feedster"
   method="GET"
   action="www.feedster.com/search.php?btnG=Search&amp;sort=date"&gt;
&lt;input name="q" user&gt;
&lt;/search&gt;
</pre>
The action is the query URL minus the query parameter, in this case "q"
-- it goes separately as part of the &lt;input&gt; tag. When a site
uses POST instead of GET, you'll need to dig a bit deeper to come up
with the query string. I used to use the <a href="http://livehttpheaders.mozdev.org/">LiveHTTPHeaders</a> extension. Even better, though, is Chris Pederick's wonderful <a href="http://chrispederick.myacen.com/work/firefox/webdeveloper/">Web Developer Extension</a> which does all kinds of handy things, including converting between GETs and POSTs.
</li>
</ol>
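<p>
Step 2 is mechanical enough to script. Here's a minimal sketch, assuming you've already captured a sample query URL from the site and know which parameter carries the search terms. The <tt>buildControlFile</tt> helper is hypothetical; it only handles the GET case and emits just the fields shown in the example above.
</p>

```javascript
// Build a search plugin control file from a sample query URL.
// queryParam names the parameter that carries the user's search terms;
// it is pulled out of the action and emitted as the <input> tag instead.
function buildControlFile(name, queryUrl, queryParam) {
  const [base, queryString = ""] = queryUrl.split("?");
  const kept = queryString
    .split("&")
    .filter((pair) => pair !== "" && pair.split("=")[0] !== queryParam);
  const action = kept.length ? base + "?" + kept.join("&") : base;
  return [
    "<search",
    `   name="${name}"`,
    '   method="GET"',
    `   action="${action}">`,
    `<input name="${queryParam}" user>`,
    "</search>",
  ].join("\n");
}

console.log(
  buildControlFile(
    "Feedster",
    "www.feedster.com/search.php?q=udell&btnG=Search&sort=date",
    "q"
  )
);
```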
<p>
To add a plugin, just drop a pair of these files -- the image and the
control file -- into Firefox's searchplugins directory. The additional
step I'm illustrating here -- one-click installation of the plugin --
depends on a snippet of JavaScript:
</p>
<pre class="code javascript">
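// One-click install: name is the plugin's base filename, ext is the icon's
// file extension, and cat is the category to file the engine under.
function addEngine(name, ext, cat)
{
  var base = "http://weblog.infoworld.com/udell/gems/";
  // Only Mozilla-family browsers expose window.sidebar.addSearchEngine.
  if ((typeof window.sidebar == "object") &amp;&amp;
      (typeof window.sidebar.addSearchEngine == "function"))
  {
    window.sidebar.addSearchEngine(
      base + name + ".src",      // the control file
      base + name + "." + ext,   // the icon
      name,
      cat);
  }
  else
  {
    alert("Netscape 6 or Mozilla is needed to install a search plugin");
  }
}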
</pre>
<p>
Note that there's a <a href="http://mycroft.mozdev.org/index.html">registry</a>
of these plugins. And I should probably register the Safari plugin there. But I'm sure this blog isn't searched often enough to warrant registering a "Jon's Radio" Firefox plugin. For such cases, it's nice to know that a more decentralized, ad-hoc solution is available.
</p>
<p><b>Update:</b> One reader wondered where the search plugin dropdown list is hiding. In <a href="http://weblog.infoworld.com/udell/gems/firefoxSearchPlugins.jpg">plain sight</a>. Though I'll agree it's easier to miss than Safari's equivalent, which remembers search history. Hmm. Would it make sense to offer both functions? One handle to drop down the list of engines, and another to drop down the recent searches for that engine. Nah. Too cluttered, probably.
</p>
</body>
</item>

<item num="a941">
<title>Secure use of private keys in OS X Mail and Outlook</title>
<date>2004/03/10</date>
<body>

<p>
I finally got around to installing a digital certificate on OS X, so I can sign email messages in Panther's Mail app as I always do in Outlook on Windows. The <a href="http://www.macdevcenter.com/lpt/a/4541">recipe</a> for acquiring and installing the cert is, unfortunately, guaranteed to scare away <a href="http://weblog.infoworld.com/udell/2004/03/02.html#a931">Aunt Tillie</a>. But if you've gotten that far, you might want to consider an extra step to secure the use of your private key.
</p>
<p>
In Outlook, I've set things up so that messages are always signed. What's more, I have to type a password to unlock my private key each time I use it to sign a message. If the signature is going to be meaningful, I want to be sure -- and I want you to be sure -- that some piece of rogue software hasn't coerced Outlook into using cached credentials. I also find the extra confirmation step helpful, in the same way that a real signature can be. Even though it becomes an automatic reflex, it's not a completely unconscious act. And I don't send so many emails in a day that I can't afford a few seconds to consider the consequences of my words.
</p>
<p>
Achieving this effect in Outlook is wildly obscure. Once the cert is installed, I haven't found a way to up the security to require a per-use password. It's only when requesting the cert that you're given that option. Here's a <a target="movie" href="http://weblog.infoworld.com/udell/gems/digid3.html">movie</a> that shows how it works when requesting an Outlook S/MIME cert from Thawte.
</p>
<p>
The analogous procedure in OS X is nicer. Here's <a target="movie" href="http://weblog.infoworld.com/udell/gems/digid1.html">a movie</a> showing how to twiddle the settings on your private key, in Keychain Access, in order to require the keychain password (not, as in Outlook, a per-key password) when signing. And <a target="movie" href="http://weblog.infoworld.com/udell/gems/digid2.html">this movie</a> shows the result: you have to type the keychain password in order to send a signed message.
</p>
<p>
I used the trial version of <a href="http://www.qarbon.com">Qarbon</a> to make these movies. Based on the comments I see <a href="http://www.markme.com/jd/archives/004470.cfm">here</a>, it seems that <a href="http://www.macromedia.com/software/robodemo/">Macromedia's RoboDemo</a> should be the next screen video tool I try.
</p>

</body>
</item>


<item num="a940">
<title>Beyond knowledge?</title>
<date>2004/03/09</date>
<body>

<p>
<table align="right" cellpadding="0" cellspacing="4">
<tr><td>
<a href="http://www.wired.com/wired/archive/12.02/india_pr.html"><img src="http://www.wired.com/wired/archive/12.02/images/FF_94_1.jpg"/></a>
<div align="center" class="realsmall">Aparna Jairam</div>
</td></tr>
<tr><td>
<a href="http://www.njleg.state.nj.us/members/turner.asp"><img src="http://www.njleg.state.nj.us/members/memberphotos/turner.jpg"/></a>
<div align="center" class="realsmall">Shirley Turner</div>
</td></tr>
</table>
The February issue of Wired features an <a href="http://www.wired.com/wired/archive/12.02/india_pr.html">article on offshoring</a> by Daniel Pink, author of <a href="http://allconsuming.net/item.cgi?isbn=0446678791">Free Agent Nation</a>. Wired's story, entitled <i>The New Face of the Silicon Age</i>, might instead have been called <i>Free Agent World</i>. Here's a stunning exchange between Pink and New Jersey state senator Shirley Turner:
</p>
<blockquote>
I toss a slur across her desk. I call her a protectionist.
<br/><br/>
"Oh, and I'm proud of it," she responds. "I wear that badge with honor. I am a protectionist. I want to protect America. I want to protect jobs for Americans."
<br/><br/>
"But isn't part of this country's vitality its ability to make these kinds of changes?" I counter. "We've done it before - going from farm to factory, from factory to knowledge work, and from knowledge work to whatever's next."
<br/><br/>
She looks at me. Then she says, "I'd like to know where you go from knowledge."  [<a href="http://www.wired.com/wired/archive/12.02/india_pr.html">Wired: Kiss Your Cubicle Goodbye</a>]
</blockquote>
<p>
Where indeed? I think protectionism is the wrong approach. And I think <a href="http://weblog.infoworld.com/udell/2004/03/08.html#a939">Dick Cook's ideas</a> are right. But let's not kid ourselves. What's at stake here isn't just call-center jobs, or <a href="http://tbray.org/ongoing/When/200x/2004/02/23/NumbingCoding">mind-numbing</a> code-writing jobs, or <a href="http://www.nytimes.com/2004/03/04/opinion/04FRIE.html">accounting jobs</a>. Creativity, innovation and hard work are the levers that move the global economy, and anybody, anywhere, will be able to grasp those levers. 
</p>

</body>
</item>

<item num="a939">
<title>The accident of geography</title>
<date>2004/03/08</date>
<body>

<p>
<blockquote>
When I was in kindergarten, my family lived in New Delhi. It was a magical year in which I made permanent memories of the sights, sounds, and smells of India. A decade ago I returned to India for a tour of its software industrial parks. That visit changed me in another way. I met programmers and tech journalists who were my equal or better in every way, but whom you'll likely never hear of unless they're profiled in an article such as this week's cover story. Their faces and their voices became permanent memories, too. For me, the offshoring debate isn't abstract. I know that it turns on a mere accident of geography. [Full story at <a href="http://www.infoworld.com/article/04/03/05/10OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
This week's column is more about China than India. I interviewed MAPICS CEO Dick Cook, who's been on trade missions to China, knows the situation better than anyone I've met, and has thought deeply about how the US can and should deal with it.
</p>
<p>
Dick said a lot more in our interview than I had room for in the column. Here are some outtakes:
</p>
<p>
<b>On jobs data:</b> <i>Everybody in this political season is jumping on offshoring but although you can find anecdotal information, it's hard to find real data. I've looked hard, but neither the Bureau of Labor Statistics nor anybody else can give me concrete evidence that this world movement of jobs is netting down as much as everybody perceives.</i>
</p>
<p>
<b>On jobless recovery:</b> <i>It doesn't necessarily mean we're moving jobs offshore, it means we're working more efficiently. Me and MAPICS (an ERP package) are probably one reason for that. People pay a lot of money for our software to be a cause of that. In this new world, the customer places an order online. The order department doesn't need to add people to handle more orders. Customers can check order status online. I have 175 customers in the furniture industry, there is a 6- to 8-week lead time. Three weeks prior to delivery, every customer calls the manufacturer or the store to ask when it's going to ship. By creating tools, we allow the manufacturer to automatically send a shipping notice 2 or 3 days before when the stats tell you would be the day the customer would call.</i>
</p>
<p>
<b>On China:</b> <i>In 2008, they're going to surprise everyone. The government and the people all realize they'll be on televisions in every home in the world for 17 days, during the Olympics. And they intend to present themselves, not as the largest developing country in the world, but as the largest developed country. We met with the Olympic planners. There are two goals. First, if you can believe it, is to be the green Olympics -- shutting down coal-fired power, building a big dam for hydro. Second, to be the digital Olympics. They're laying fiber everywhere, and they fully anticipate you'll use an ID card as your main security device and to charge meals.</i>
</p>

</body>
</item>



<item num="a937">
<title>Why no 'use strict' in Python? Answer: PyChecker</title>
<date>2004/03/06</date>
<body>

<p>
The unanimous response to my question "Why no 'use strict' in Python?" was: <a href="http://pychecker.sourceforge.net/">PyChecker</a>. Thanks to everyone who pointed me to this excellent tool. 
</p>
<p>
The first person to respond to my query was David Ascher, architect of <a href="http://www.activestate.com">ActiveState's</a> <a href="http://www.activestate.com/Products/Komodo/">Komodo</a>. Why, I asked David, isn't PyChecker included with the standard Python kit, and accessible by way of a command-line switch? David's response (via email, quoted with permission<sup>1</sup>):
</p>
<blockquote class="personQuote DavidAscher">
<p>
I suspect that it goes something like this:
</p>
<ol>
<li>the parsing infrastructure was developed with two goals in mind:
correctness and speed, and maintaining extra data that you'd need for
doing linting wasn't high enough priority early on.</li>
<li>the "right" way to do it is to use the new compiler system</li>
<li>since pychecker "works", the incentive to do it right is only one
that appeals to those people in pursuit of beauty for its own sake.
</li>
<li> Those guys are busy.</li>
</ol>
</blockquote>
<p>
Another noteworthy comment on this subject comes from Ted Leung:
<blockquote cite="http://www.sauria.com/blog/2004/03/05#846" class="personQuote TedLeung">
I don't know the history behind various Python features, so I can't comment on strict. What I can comment on is that strict is nice, but a type inferencer for Python would be better (as I've <a href="http://www.sauria.com/blog/2003/05/07#191">posted</a> before). One of the reasons that I'm excited to be going to PyCon this year is Michael Salib, an undergraduate at MIT has written <a href="http://web.mit.edu/msalib/www/urop/">Starkiller</a>, a type inference engine for Python. [<a href="http://www.sauria.com/blog/2004/03/05#846">Ted Leung on the air</a>]
</blockquote>
</p>
<p>
Fair enough. Based on the email I've been receiving, though, it's clear that I'm not the only Python programmer who's been unaware of PyChecker. Evidence suggests that it might deserve to be elevated to a command-line-accessible option.
</p>
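<p>
For a sense of what a check like this involves, here is a minimal sketch that flags names which are read but never bound anywhere in a file. It is not how PyChecker works internally. It assumes a current Python 3 interpreter and the standard <tt>ast</tt> module, the function name <tt>undefined_names</tt> is made up for illustration, and it ignores scoping entirely, so it will miss plenty of real cases.
</p>
<pre class="code python">
import ast
import builtins

def undefined_names(source):
    """Return sorted (line, name) pairs for names read but never bound."""
    tree = ast.parse(source)
    bound = set(dir(builtins))   # print, len, etc. count as defined
    used = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                bound.add(node.id)                # assignment target
            elif isinstance(node.ctx, ast.Load):
                used.append((node.lineno, node.id))
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            bound.add(node.name)                  # def / class statements
        elif isinstance(node, ast.arg):
            bound.add(node.arg)                   # function parameters
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                bound.add((alias.asname or alias.name).split('.')[0])
    # Every binding has been collected by now, so source order doesn't matter.
    return sorted((line, name) for line, name in used if name not in bound)
</pre>
<p>
Fed a Python 3 rendering of the <tt>loose.py</tt> example (with <tt>print</tt> as a function), the sketch reports the misspelled <tt>aNyme</tt> on line 2 without running the program.
</p>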
<hr/>
<p>
<sup>1</sup>
Emails from Ross Mayfield, CEO of Socialtext, include a .sig that ends with:
<pre>
this email is: [ ] bloggable [ x ] ask first [ ] private
</pre>
Great idea! I've added this to my own .sig. 
</p>

</body>
</item>

<item num="a936">
<title>Why no 'use strict' in Python?</title>
<date>2004/03/05</date>
<body>

<p>
Yesterday I had the opportunity to speak with Anders Hejlsberg, father of both Turbo Pascal and C#. Of course I had to scratch my dynamic language itch, so we talked some about that. The upshot is that Anders believes compile-time type checking is valuable, but also thinks we can (and probably should) use type inferencing to make static type checking feel more dynamic. 
</p>
<p>
During our conversation, he reminded me of an issue that I've been meaning to ask the Python folks to comment on. To illustrate it, consider exhibits A, B, and C.
</p>
<p>
Exhibit A. This Python program produces no compile-time error when the misspelled variable aNyme is referenced. It produces an error at runtime.
</p>
<pre class="code python">
$ cat loose.py
aName = 'abc';
print '[' + aNyme + ']';
$
$ python -c "compile(open('loose.py').read(),'loose.py','exec')"
$
$ python loose.py
Traceback (most recent call last):
  File "loose.py", line 2, in ?
    print '[' + aNyme + ']';
NameError: name 'aNyme' is not defined
</pre>
<p>
Exhibit B. This Perl program produces no compile-time or run-time error.
</p>
<pre class="code perl">
$ cat loose.pl
my $aName = 'abc';
print "[" . $aNyme . "]\n";
$
$ perl -c loose.pl
loose.pl syntax OK
$
$ perl loose.pl
[]
</pre>
<p>
Exhibit C. This Perl program produces a compile-time error.
</p>
<pre class="code perl">
$ cat strict.pl
use strict;
my $aName = 'abc';
print "[" . $aNyme . "]\n";
$
$ perl -c strict.pl
Global symbol "$aNyme" requires explicit package name at strict.pl line 3.
strict.pl had compilation errors.
$
$ perl strict.pl
Global symbol "$aNyme" requires explicit package name at strict.pl line 3.
Execution of strict.pl aborted due to compilation errors.
</pre>
<p>
A few others out there have made this observation, for example:
<blockquote>
I find python confusing on the other hand. e.g. sysmsg = sysmsg.replace('&amp;', ' ')<br/>
what if you wrote "sysmgs = sysmsg.replace('&amp;',' ')"<br/>
there is a small typo! In perl "use strict;" would find that for you, but python has no equivalent yet. [<a href="http://www.linuxjournal.com/comments.php?op=showreply&amp;pid=3361&amp;sid=3882">anonymous comment at LinuxJournal.com</a>]
</blockquote>
In my use of Perl, I've sometimes had to relax the constraints imposed by "use strict" -- for example, with "no strict 'vars'" when I'm dynamically conjuring variable names. But on the whole, I never felt (though I'm sure some do) that "use strict" seriously compromised Perl's essential dynamism. 
</p>
<p>
Are there reasons why Python can't, or shouldn't, support something like "use strict"?
</p>

</body>
</item>


<item num="a935">
<title>No-Touch Deployment versus ClickOnce</title>
<date>2004/03/05</date>
<body>

<p>
Mark Levison, one of the developers I interviewed for the .NET story, thinks that Microsoft has undersold the benefits of No-Touch Deployment (NTD), the current solution for running rich .NET clients from the Web. Having done the gruntwork required to understand and use NTD, Mark's not so sure that developers ought to write off this technology and wait for Whidbey's ClickOnce.
</p>
<blockquote cite="http://dotnetjunkies.com/WebLog/mlevison/archive/2004/03/04/8417.aspx" class="personQuote MarkLevison">
I think David [Treadwell] misses much of the point. The caching features of No-Touch Deployment (NTD) work well enough. Click Once will be useful, but there are many other issues dealing with NTD apps. My impression is that MS didn't dog-food enough NTD. I think there are a few key areas that need work. [<a href="http://dotnetjunkies.com/WebLog/mlevison/archive/2004/03/04/8417.aspx">Mark Levison</a>]
</blockquote>
<p>
Mostly, Mark's asking for documentation and tutorials that will enable other developers to use NTD effectively, and spare them much of the painful R&amp;D he had to go through. In his posting, he ticks off a very specific and well thought-out list of suggested items to cover in tutorials. 
</p>


</body>
</item>


<item num="a934">
<title>Structured change detection</title>
<date>2004/03/04</date>
<body>

<p>
<blockquote>
Consider two versions of a Word document saved as XML. There are "structured diff" tools that can map the changes at an intermediate level, in terms of XML elements. For example, IBM's AlphaWorks site offers the <a href="http://www.alphaworks.ibm.com/tech/xmldiffmerge">XML Diff and Merge Tool for Java</a>, while Microsoft's GotDotNet site offers <a href="http://apps.gotdotnet.com/xmltools/xmldiff">XML Diff and Patch for .Net</a>. Both of these free tools can track element-level change. To get a sense of what's possible, check out <a href="http://www.deltaxml.com/svg">Monsell EDM's online demo of its Delta XML</a> technology. The demo compares two subtly different versions of a complex graphic -- the standard SVG (Scalable Vector Graphics) "tiger" benchmark -- and animates the differences between the two. It's stunningly cool.
<br/><br/>
As XML becomes the standard way to represent prose, graphics, and other content, we should expect such change visualization to become routine. What about code? It has sections, subsections, and paragraphs, too. XML isn't -- and probably shouldn't be -- the primary way we read and write code. But the underlying abstract syntax tree has structure that can -- and arguably should -- help us see and comprehend the code's evolution. [Full story at <a href="http://www.infoworld.com/article/04/02/27/09OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
Ordinarily readers call me on stuff like this, but for once I get a chance to beat them to the punch. This column certainly should have mentioned that <a href="http://subversion.tigris.org/">Subversion</a>, the open source project that aims to replace CVS, reached its 1.0 release last week. It looks really good, and I'm investing some time in learning how to deploy and use it.
</p>
<p>
Subversion's support for copying and renaming files and directories aims to reduce one of CVS's worst points of friction. Since I work with lots of XML data -- including just about everything I write -- I'm also eager to try plugging in some structured diff programs.
</p>
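<p>
To make "element-level change" concrete, here is a minimal sketch of a structured comparison, assuming Python 3 and its standard <tt>xml.etree.ElementTree</tt> module. It is far cruder than the tools mentioned above: the function name <tt>element_changes</tt> is hypothetical, children are paired up strictly by position, and insertions or moves are only reported as a changed child count.
</p>
<pre class="code python">
import xml.etree.ElementTree as ET

def element_changes(old, new, path=None):
    """Return (path, description) pairs for differences between two elements."""
    path = path or "/" + old.tag
    if old.tag != new.tag:
        return [(path, "tag changed to " + new.tag)]
    changes = []
    if old.attrib != new.attrib:
        changes.append((path, "attributes changed"))
    if (old.text or "").strip() != (new.text or "").strip():
        changes.append((path, "text changed"))
    old_kids, new_kids = list(old), list(new)
    if len(old_kids) != len(new_kids):
        changes.append((path, "child count changed"))
    # Pair children by position; the index counts all siblings, not just
    # those that share a tag name.
    for position, (a, b) in enumerate(zip(old_kids, new_kids), start=1):
        child_path = "%s/%s[%d]" % (path, a.tag, position)
        changes.extend(element_changes(a, b, child_path))
    return changes
</pre>
<p>
Parse two versions of a document with <tt>ET.fromstring</tt> or <tt>ET.parse</tt>, pass the two root elements in, and the result names the elements that changed rather than the lines that changed.
</p>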

</body>
</item>

<item num="a933">
<title>Screen video tips</title>
<date>2004/03/04</date>
<body>

<p>
Several folks wrote with questions and comments about the OS X screen video I posted the other day. I mentioned that Media Encoder was the capture tool, but didn't specify how I got from Windows Media to Flash. For that, I used <a href="http://www.techsmith.com/products/studio/default.asp">Camtasia Studio</a>. I've heard good things about <a href="http://www.qarbon.com">Qarbon</a> but haven't had a chance to try it yet. Chris Ryland, from Em Software, wrote to recommend <a href="http://www.ambrosiasw.com/utilities/snapzprox/">SnapzPro X 2</a> specifically for OS X (and QuickTime).
</p>
<p>
Also,  <a href="http://cheerleader.yoz.com/">Yoz Grahame</a> wrote to alert me to a cool VNC hack, <a href="http://www.unixuser.org/~euske/vnc2swf/">vnc2swf</a>, a VNC viewer that records Flash movies. Getting it running, on either Fedora or OS X, failed my 5-minute rule. (I.e., if it takes more than 5 minutes, it's not a good use of my time.) But the example movies prove that it can work. And it's interesting to watch the author of vnc2swf, Yusuke Shinyama, driving various applications in a mixture of Japanese and English.
</p>
<p>
By the way, have you ever wondered what happens if you point a VNC viewer on one box (say, a Mac) at another box (say, Windows), then launch a VNC viewer on the second box and point it back at the first? Here's what:
<img border="1" width="300" height="200" vspace="6" hspace="6" alt="hall of mirrors" src="http://weblog.infoworld.com/udell/gems/recursiveVNC.gif"/>
</p>
<p><b>Update:</b>
<a href="http://www.livingskies.com/">Karl Fast</a> reports that he's seen a demo of a (still unreleased) new screen recording tool from <a href="http://www.usersfirst.com/">Users First</a> (great name!). The product is geared for usability analysis:
<blockquote class="personQuote KarlFast">
It is a client-server system. You have a CD for the client machine (Windows). It automatically runs off the CD. No installation required. This is a huge plus for capturing real work environments.
<br/><br/>
The recording program runs on MacOS X. It finds the client machine over the network. It can record an audio stream and multiple video streams. So one stream would be the screen video, but you can also capture users facial reactions and an audio stream, all synchronized. 
<br/><br/>
You get pixel-perfect capture (it uses VNC), over the network, without having to install anything on the client.
<br/><br/>
There is more, but like I said, it's slick. Finally something really geared towards the usability-engineering/ information-architecture/interaction-design/user-experience crowd.
</blockquote>
Great idea! Part of my recent keen interest in screen videos is exactly for this reason. Conventional usability testing is a prohibitively expensive process. Cheaper and more convenient ways to let developers look over users' shoulders could have a huge impact on software usability.
</p>

</body>
</item>

<item num="a932">
<title>Component builders and solution builders</title>
<date>2004/03/03</date>
<body>

<p>
<blockquote>
Despite lots of second-guessing, there is no consensus that the CLR is inherently unfriendly to dynamic languages. The JVM didn't bend over backwards for such languages either, and yet Jython is a great success thanks to the heroic efforts of its inventor, Jim Hugunin. Now Hugunin has turned his attention to .NET, and reports promising results with a prototype Python implementation for .NET called IronPython.
<br/><br/>
Such projects always seem to spring from an inspired individual or small team. In fact, Microsoft has such a team. It created JScript.NET, the most dynamic of Microsoft's .NET languages. But JScript.NET is the unloved stepsister of C# and VB.NET.
<br/><br/>
Dynamic languages are rooted in a culture that is simply not indigenous to Redmond. That may change, but for the time being, the future of dynamic languages in .NET lies with non-Microsoft innovators. [Full story at <a href="http://www.infoworld.com/article/04/02/27/09FEmsnetdynamic_1.html">InfoWorld.com</a>]
</blockquote>
The day this story posted, <a href="http://www.thinkingin.net/2004/02/27.aspx#a630">Larry O'Brien</a> pointed me to Jim Waldo's essay, <a href="http://www.artima.com/weblogs/viewpost.jsp?thread=36525">To type or not to type</a>, which says in part:
<blockquote>
<p>
When we argue over whether or not a programming language should have types, we are not discussing a matter of fact. Instead, we are participating in what [linguistic philosopher John L.] Austin would call <i>confessional language</i>; what we are really doing is saying something about ourselves.
</p>
<p>
In particular, I think that those who advocate typed languages are (generally) participating in different kinds of programming exercises then those who advocate untyped languages. In particular, people who argue for strongly typed languages tend to be involved in projects that are
 </p>
<ul>
<li>large, with lots of interacting components;</li>
<li>require multiple people to work together; </li>
<li>will take a long time to develop (weeks or months, not hours); and </li>
<li>will live for a long time, changing over that time. </li>
</ul>
 <p>
On the other hand, people who like untyped languages tend to be involved in projects that 
</p>
<ul>
<li>require lots of prototyping;</li>
<li>are done by one person, or a small group of people; </li>
<li>tend to be small or short term; and </li>
<li>often are used for a short period of time, or are not altered through their lifetime.</li>
</ul>
</blockquote>
</p>
<p>
I've been thinking about this for a couple of days, because it's true that my own programming work is better characterized by the second list of attributes than by the first. Does this mean my passion for dynamic languages merely reflects my own orientation?
</p>
<p>
This led to another question: what is a large system? I tend to regard any application -- even a dozens-of-modules, millions-of-lines-of-code application -- as a good-sized component that participates in the large system we call the Web.
</p>
<p>
It's no accident that Perl was the original language of choice for programming that large system. Perl's dynamic nature was just what we needed in an environment that was itself dynamic, producing new services that could interact in unpredictable ways to yield  emergent outcomes. 
</p>
<p>
Nowadays Python is my first choice. Its approach to typing -- strong <i>and</i> dynamic -- is part of the reason why. But for programming the Web, I'll reach first for any of the dynamic languages before I'll reach for C# or Java. I've only recently been able to explain why. It's about the data. No one programming language's (or VM's) type system can (or should) span the Web. That's why XML has become one of the primary ways we invent, absorb, and interconnect data models. Dynamic languages offer complementary affordances.
</p>
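<p>
A small illustration of "strong <i>and</i> dynamic", assuming a current Python 3 interpreter (the helper name <tt>combine</tt> is made up for the example): a name can be rebound to values of different types, but the values themselves are never silently coerced.
</p>
<pre class="code python">
value = "abc"      # a name can be bound to a string...
value = 42         # ...and later rebound to an integer: typing is dynamic

def combine(text, number):
    """Try to add a str and an int; Python refuses to coerce either one."""
    try:
        return text + number
    except TypeError as error:
        # Strong typing: the mismatch is an error, not a silent conversion.
        return "TypeError: " + str(error)

print(combine("abc", 1))
</pre>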
<p>
Ten years ago I wrote my most widely cited BYTE cover story, called <a href="http://www-cad.eecs.berkeley.edu/~newton/Presentations/WebArchTutorialPrint/sld025.htm">Componentware</a>. I said then that software development was becoming a two-tiered system. There would be a relatively small number of component builders, working in compiled (today: JITed) languages such as C and C++ (today: C#, Java) to produce reusable components (today: services). And there would be a relatively large number of solution builders, working in scripting (today: dynamic) languages such as Visual Basic (today: Perl, Python, Ruby) to produce applications. 
</p>
<p>
That components-and-glue metaphor still describes the software world today -- if anything, much more powerfully. The object orientation and static typing features of the JVM and the CLR are tools of the component builder's trade. And the dynamic features of what we still often call scripting languages are tools of the solution builder's trade. This isn't an either/or situation, though. Software development works best when the membrane that divides the component builder from the solution builder is flexible and porous, because the two activities are not as distinct as we suppose. This, I think, is why Sean McGrath calls Jython "Java's strategic weapon for the 21st century." And it's why I continue to want first-class dynamic language implementations for the CLR. The two tribes that Jim Waldo identifies are, roughly, the component builders and the solution builders. Dynamic languages are not only the solution builders' best tool. They're also the best way for the two tribes to collaborate on programming the planetary web of data.
</p>

</body>
</item>

<item num="a931">
<title>Aunt Tillie's OS X Adventure</title>
<date>2004/03/02</date>
<body>

<p>
<a href="http://weblog.infoworld.com/udell/gems/tillie.jpg"><img align="right" width="250" height="238" src="http://weblog.infoworld.com/udell/gems/tillie.jpg"/></a>
In a pair of <a href="http://www.catb.org/~esr/writings/cups-horror.html">recent</a> <a href="http://www.catb.org/~esr/writings/luxury-part-deux.html">essays</a>, Eric Raymond tears into the open source community -- rightly so -- for its failure to deliver software that Aunt Tillie can use. He's spot on. One of his comments got me wondering, though:
<blockquote cite="http://www.catb.org/~esr/writings/cups-horror.html" class="personQuote EricRaymond">
If the designers were half-smart about UI issues (like, say, Windows programmers) they'd probe the local network neighborhood and omit the impossible entries. If they were really smart (like, say, Mac programmers) they'd leave the impossible choices in but gray them out, signifying that if your system were configured a bit differently you really could print on a Windows machine, assuming you were unfortunate enough to own one. [<a href="http://www.catb.org/~esr/writings/cups-horror.html">Eric Raymond: An Open-Source Horror Story</a>]
</blockquote>
As it happens, I'd never tried printing to a Windows XP queue on my home network from my Mac, and I wondered how well those Mac programmers Eric talks about handled that case. So here, for your Flash viewing pleasure, is <a target="tillie" href="http://weblog.infoworld.com/udell/gems/tillie.html">Aunt Tillie's OS X Adventure</a>. 
</p>
<p>
Actually this was a kill-two-birds-with-one-stone experiment. I've been wanting to be able to record screen videos on OS X, just like I do on Windows using Media Encoder 9, but I didn't have the software to do it. Or thought I didn't. Then I remembered <a href="http://www.realvnc.com/">VNC</a>. I pointed a VNC viewer on Windows XP at a VNC server on OS X, and ran Media Encoder on the viewer. It works.
</p>
<p>
The upshot, for you fast-forward types, is that Aunt Tillie didn't have a picnic on OS X either. Raymond wrote:	
<blockquote cite="http://www.catb.org/~esr/writings/cups-horror.html" class="personQuote EricRaymond">
 Clicking on the menu, I am presented with the following alternatives:
<pre>
Networked CUPS (IPP)
Networked Unix (LPD)
Networked Windows (SMB)
Networked Novell (NCP)
Networked JetDirect
</pre>
Here is our first intimation of trouble. If I were Aunt Tillie the
archetypal nontechnical user, I am at this point thinking "What in the
holy fleeping frack does that mean?"
</blockquote>
</p>
<p>
Rather to my surprise, I found an oddly similar set of choices on the Mac:
<pre>
  AppleTalk
  IP Printing
  Open Directory
  Rendezvous
  USB
x Windows Printing
</pre>
Windows Printing was the default, but no other choice was dimmed. That was the least of Aunt Tillie's worries though. In the finale she has to choose between HP LaserJet 4 Plus, v2013.111, and HP LaserJet 4 series, CUPS+Gimp-Print v4.2.5. The latter was the correct choice, by the way.
</p>
<p>
I'm sure that on OS 9, talking to a PostScript printer, Aunt Tillie would never have needed to know about the dreaded CUPS (Common Unix Printing System) which provoked Eric Raymond's rant. Even so, I don't think her OS X misadventure blunts the force of that rant. Aunt Tillie has always been the problem. Her life may be a bit easier on Windows and on Mac OS, but it is far from comfortable. There's room for order-of-magnitude improvement. Will open source folk ever conclude that Aunt Tillie represents a hard engineering problem, and decide to wrap their collective heads around it? Stranger things have happened.
</p>

</body>
</item>


<item num="a930">
<title>.NET report card</title>
<date>2004/03/01</date>
<body>

<p>
<blockquote cite="http://www.infoworld.com/reports/09SRmsnet.html">
Every couple of years Microsoft wraps a marketing label around all the major initiatives in the company. In 2000, the label was .NET; in 2003, Longhorn. As developers and IT managers ponder what the "Longhorn wave" might mean to them, InfoWorld decided to assess the current .NET wave. Its goals were many and ambitious. At the core of .NET, the Common Language Runtime (CLR) and its associated Framework (class library) would usher Microsoft developers into the world of managed code, whose benefits were already well-known to their Java counterparts. In parallel, Web services would become the pivotal integration technology, and XML the lingua franca of data representation. These were, and still are, the central themes. Don Box, architect of Longhorn's Indigo communication subsystem, put it plainly on his weblog: "We're betting that the future is managed code and XML." [Full story at <a href="http://www.infoworld.com/reports/09SRmsnet.html">InfoWorld.com</a>]
</blockquote>
This story, which began <a href="http://www.infoworld.com/article/04/02/27/09FEmsnetdynamic_1.html">thirty weblog items ago</a>, is (at least for me) a compelling demonstration of weblog/journalism synergy. I first tried this approach in 1996, for a <a href="http://www.byte.com/art/9608/sec6/sec6.htm">BYTE cover story</a>. In the pre-blog era, NNTP newsgroups were the venue, but it's the same principle. When you're dealing with an evergreen topic, and you're not worried about getting scooped by the competition, why not go ahead and outline your ideas in advance? The ensuing conversation will clarify them, and put you in touch with people who can share interest and expertise that you otherwise wouldn't have been able to find.
</p>
<p>
Back in '96 it was Dave Korsmeyer who popped up on the radar screen, to tell me about an interesting use of Java for distributed data visualization at NASA's Ames Research Center. I just heard from Dave recently. Now he's Chief of the Computational Sciences Division at the Ames Research Center, and his team has built several software tools to support the current Mars mission, including <a href="http://infotech.arc.nasa.gov/news/story.php?sid=90">MERCIP</a>, which Dave describes as "distributed web information application using XML as its messaging protocol."
</p>
<p>
In similar fashion, a number of folks popped up on the radar for this .NET story. I'd like to thank everyone who took the time to think about and discuss various issues. And I wonder what they'll be up to 8 years hence!
</p>

</body>
</item>

<item num="a929">
<title>The 1060 REST microkernel and XML app server</title>
<date>2004/02/26</date>
<body>

<p>
<span class="minireview">1060 NetKernel</span> 
Suhail Ahmed alerted me, via email, to a really interesting project called <a href="http://1060research-server-1.co.uk/docs/latest/docxter/doc_intro_whatitis.html">NetKernel</a>, from <a href="http://www.1060research.com/">1060 Research</a>. The docs describe it as "a commercial open-source realisation of the HP Dexter project." Here's the skinny:
<blockquote cite="http://1060research-server-1.co.uk/docs/latest/docxter/doc_intro_whatitis.html">
Today's Web-servers and Application Servers have a relatively flat interface which creates a hard boundary between Web and non-Web. This boundary defines the zone of URI addressable resources.
<br/><br/>
What if the REST interface (URI address space) didn't end at the edge of your external interface?
<br/><br/>
NetKernel uses REST-like service interfaces for all software components. The services are fully encapsulated in modules which export a public URI address space. A module may import other module's address spaces, in this way service libraries may be combined into applications. [<a href="http://1060research-server-1.co.uk/docs/latest/docxter/doc_intro_whatitis.html">NetKernel Essentials</a>]
</blockquote>
What if, indeed? I downloaded the 20MB NetKernel JAR file, installed the system, and took it for a spin. Fascinating concept. As advertised, it offers a suite of XML services -- including XSLT, and the Saxon implementation of XQuery -- in a composable architecture based on URIs. These include the familiar http: and file: plus NetKernel's own active: which is a URI scheme for NetKernel processes scheduled by the "REST microkernel."
</p>
<p>
You compose primitive URI-based services like so:
<pre class="code xml">
Here's an example of a DPML [Declarative Processing Markup Language] 
instruction to perform an XSLT transform:
 
&lt;idoc>
  &lt;seq>
    &lt;instr>xslt&lt;/instr>
    &lt;operand>document.xml&lt;/operand>
    &lt;operator>transform.xsl&lt;/operator>
    &lt;target>this:response&lt;/target>
  &lt;/seq>
&lt;/idoc>
 
Which the DPML runtime compiles to the active URI 
 
&lt;code>active:xslt+operand@document.xml+operator@transform.xsl &lt;/code>
</pre>
Since all the supported XML processing technologies use the active: resolver, you could use active: URIs as the operand and/or operator, and you could source the resource described by this active: URI into another processing step, say an XSLT transform or an XQuery query.
</p>
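<p>
To make the shape of that active: URI concrete, here's a minimal Python sketch. The <tt>active_uri</tt> helper is hypothetical (it is not part of NetKernel); it only mirrors the pattern visible in the example above, instruction first, then name@value pairs joined by plus signs, and it makes no attempt at whatever escaping a nested active: URI would need.
</p>

```python
def active_uri(instr, **args):
    """Join an instruction name and its named arguments into an active: URI.

    Hypothetical helper: it only mirrors the shape of the documented example
    and does not escape nested active: URIs used as arguments.
    """
    parts = [instr] + [f"{name}@{value}" for name, value in args.items()]
    return "active:" + "+".join(parts)


uri = active_uri("xslt", operand="document.xml", operator="transform.xsl")
print(uri)  # active:xslt+operand@document.xml+operator@transform.xsl
```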
<p>
I'd never heard the phrase "REST microkernel" before, but I had an immediate expectation of what that would mean. An hour's experimentation with the system met that expectation. Wildly interesting stuff. Thanks for the pointer, Suhail!
</p>
</body>
</item>

<item num="a927">
<title>Christopher Allen, Rip Van Winkle</title>
<date>2004/02/25</date>
<body>

<p>
I met Christopher Allen about a decade ago, when he ran Consensus Development, a company that made a commercial SSL toolkit. (Prior to that, he was involved in the startup of VeriSign, and in the development of the SSL reference implementation for Netscape.) I hadn't heard from him in a long time, and his recent essay, <a href="http://www.lifewithalacrity.com/2004/02/security_crypto.html">Security and Cryptography: The Bad Business of Fear</a>, explains why. When he sold his company to Certicom in 1999, he signed a <s>5</s> 3-year non-compete agreement. When it expired, he re-entered the security industry, expecting to find it much changed:
<blockquote cite="http://www.lifewithalacrity.com/2004/02/security_crypto.html" personQuote="ChristopherAllen">
Internet time had still been moving fast back in 1999 and I wasn't sure how many generations had gone by in the security industry. One, two, more?
</blockquote>
</p>
<p>
Actually, none, as it turns out. 
</p>
<blockquote cite="http://www.lifewithalacrity.com/2004/02/security_crypto.html" personQuote="ChristopherAllen">
Walking the floors of RSA last year, in the immense exhibit hall at the San Jose Convention Center, I did feel a sense of energy. The floor was still packed, and the carefully cut kiosks and the garish banners bespoke the millions put into the show by the exhibitors. The constant chatter was a deafening white noise, and whenever I veered too near a booth, there was a salesman very eager to tell me about his company's latest and greatest.
<br/><br/>
But, to a certain extent, that energy felt to me like a facade. There was nothing new; instead all the exhibitors were showing off the same technology that they were displaying five years ago. There was a bit of glitz and some extra chrome, perhaps a carefully redesigned product name, but beyond that there was a weird feeling of deja vu.
<br/><br/>
There were the same old tools that we've been using to deter hackers since the advent of the Morris Worm way back in 1989: products to detect intruders and safeguard your machines against them; firewalls; and VPNs. Maybe we've gotten a little better at figuring out expert rules, maybe we've improved our user interfaces, but these are slow, gradual upgrades, not quantum leaps.
</blockquote>
<p>
To put it another way, we have been optimizing existing algorithms, not inventing new ones. The rest of this remarkable essay suggests what some of those new approaches might be. He considers the idea of insurance as a form of business risk management, something that Bruce Schneier has also been discussing lately. He notes that data security is not the same thing as data reliability: the latter is what we really want. And he suggests, finally, that alongside these approaches driven by fear, we need to develop new methods motivated by opportunity. 
</p>
<blockquote cite="http://www.lifewithalacrity.com/2004/02/security_crypto.html" personQuote="ChristopherAllen">
The possibilities are only limited by our imagination, if we can just think beyond current possibilities.
<br/><br/>
We have already seen the first wave of security technology; now we need to initiate a second, for I believe with the next wave the best is yet to come.
</blockquote>
<p>
Well said. And welcome back, Chris!
</p>


</body>
</item>


<item num="a927">
<title>WS-WorldPeace</title>
<date>2004/02/23</date>
<body>

<p>
<blockquote cite="http://www.infoworld.com/article/04/02/20/08OPstrategic_1.html">
Here's one popular definition of insanity: "Do the same thing, expect a different result." Now consider the following partial list of proposed standards for Web services: WS-Addressing, WS-AtomicTransaction, WS-Attachments, WS-Context, WS-Coordination, WS-Eventing, WS-Federation, WS-Reliability, WS-ReliableMessaging, WS-Routing, WS-SecureConversation, WS-Security, WS-SecurityPolicy, WS-Transaction, and WS-Trust. [Full story at <a href="http://www.infoworld.com/article/04/02/20/08OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
The original title of this column was <i>WS-WorldPeace</i>, so I've used that title here because I still like it better. But this is the same column as the one in the current print edition of InfoWorld entitled <i>Web services alphabet soup</i>. In the column, I interview Microsoft's John Shewchuk on the question of why this round of small, modular specifications is arguably not a replay of past sins, and how Indigo intends to help developers get a handle on "composable complexity."
</p>

</body>
</item>

<item num="a926">
<title>Lightweight XML search servers, part 2</title>
<date>2004/02/23</date>
<body>

<p>
<blockquote cite="http://www.xml.com/pub/a/2004/02/18/udell.html" class="personQuote JonUdell">
In <a href="http://www.xml.com/pub/a/2004/01/21/udell.html">last month's installment</a> I showed a simple search service that uses libxslt to reduce a file of XML content (my weblog writing) to just the elements matching an XPath expression. This month's challenge was to scale up to a database-backed implementation using Berkeley DB XML. [Full story at <a href="http://www.xml.com/pub/a/2004/02/18/udell.html">XML.com</a>]
</blockquote>
After looking at my implementation, John Merrells, the creator of DB XML, wrote to ask why I was using the libxml2 XPath feature to search within documents returned by DB XML XPath queries. Didn't I know that DB XML offered a document-level XPath query function, as well as a database-level one? Heh. Actually, I hadn't known. 
</p>
<p>
There's some sort of object lesson here. Lately I've grown extremely fond of the libxml2/Python combination. When I need to process XML, that's how I want to do it. But having developed this habit, it also becomes necessary to break it from time to time. Materializing the libxml2/Python combination, on a given platform, can absorb time and energy that may be better spent elsewhere, and it can even lead to compromises.
</p>
<p>
Case in point: my original implementation of this service used Jython to talk to the DB XML Java API. This was actually a great combination. It married Python's flexibility to a more robust and complete DB XML API than is available from the C flavor of Python. However, it lacked my new old friend, libxml2. So I wound up using an older version of DB XML (1.2, rather than the latest 1.2.1) in order to be able to use C Python. Which, as it now turns out, was unnecessary, since DB XML supports both database-level and document-level querying.
</p>
<p>
It's amazing how one wrong or missing piece of information can wind up dictating a major architectural choice. And how one unexamined habit can make us vulnerable to that outcome. 
</p>

</body>
</item>

<item num="a925">
<title>Different strokes</title>
<date>2004/02/22</date>
<body>

<p>
Here's what Brent Simmons had to say about yesterday's item on news scanning and news reading:
<blockquote class="personQuote BrentSimmons">
1. NetNewsWire's Combined View works with channels, all-new-headlines -- and groups.
<br/><br/>
For instance, I have 145 subscriptions organized into 8 groups. When you view a group in the Combined View, you see all the unread headlines for that group. Like most people, I organize my groups by topic.
<br/><br/>
So I have a page for Macintosh news, a page for weblogs, a page for books, a page for baseball, etc.
<br/><br/>
2. Generalizing about NetNewsWire based on Steve Gillmor's use of it isn't fair.
<br/><br/>
For instance, I personally find Radio's batches-of-100-in-a-web-page to be awkward. With Radio I can't scan fast enough and I can't keep up. I developed NetNewsWire so I could keep up with more feeds with less effort. But everybody's different: different presentations work for different people. That doesn't mean that Radio's approach is better or worse than NetNewsWire's.
</blockquote>
I agree. In trying to illustrate a point about scanning versus reading, I'm afraid I fanned the flames of the newsreader-style versus browser-style debate. In fact, the two modes can be complementary. I just bought the full version of NetNewsWire, which exploits that synergy as Brent describes. So does FeedDemon, which <a href="http://peteresch.blogspot.com/2004_02_01_peteresch_archive.html#107742056132825446">this posting</a> prompted me to re-explore.
</p>
<p>
It's true that different folks will prefer different strategies for grouping and processing their feeds. But no matter which strategy you prefer, you need to harmonize two modes: scanning, and reading. And no matter which strategy you prefer, the same methods can be used to achieve that harmony. On the publishing side: untruncated feeds, containing HTML (ideally, but not necessarily, XHTML) markup, with a first element that can work standalone. This is often naturally the case, since a lead paragraph's job is to hook the reader.
</p>
<p>
On the consumption side: feedreaders that XHTML-ize content (in case it isn't already XHTML), use the first markup element to optimize scanning modes, and provide the full content for reading. Peter Eschenbrenner suggests this is already possible with FeedDemon:
<blockquote cite="http://peteresch.blogspot.com/2004_02_01_peteresch_archive.html#107742056132825446" personQuote="Peter Eschenbrenner">
You might want to check out <a href="http://www.bradsoft.com/feeddemon/">FeedDemon</a> by Nick Bradbury. While it comes with default style sheets, users are able to create their own XSL for efficiently processing the information. So, if you wanted to view just the first paragraph, you could create your own style or ask someone in the community to create one.<br/><br/>Radek, an active community member, has created styles that hint at what can be achieved with this combination, from rating your feeds in a <a href="http://republika.pl/fdstyles/Ratings.html">database</a>, to creating powerful <a href="http://republika.pl/fdstyles/MindManager.html">MindMaps</a>.
[<a href="http://peteresch.blogspot.com/2004_02_01_peteresch_archive.html#107742056132825446">Peter Eschenbrenner: Note to Self</a>]
</blockquote>
</p>
<p>
Interesting! So far as I can tell, though, FeedDemon's XML data model corresponds -- as you'd expect -- to that of RSS. Which means the content is opaque. So while you can use XSLT to hack alternate presentations of channel-level and item-level dates and titles, XSLT can't see into the content. For example, the default stylesheet includes:
<pre class="code xslt">
&lt;div class="newsitemcontent">
  &lt;xsl:value-of select="description"/>
&lt;/div>
</pre>
But if it unescaped and XHTML-ized the markup within the description, FeedDemon (or any RSS reader) could expose the content of items to the same kinds of XML manipulation that we routinely apply to the RSS metadata wrapper. (For all I know, there may even be a way to do this now in FeedDemon, by breaking into the XML pipeline and inserting an HTML Tidy step.) Selecting initial elements, in order to normalize and improve summary views, is one reason to do that. The structured search technique I've been exploring is another. I mentioned yesterday that these opportunities have nothing to do with the RSS/Atom debate. As I should also have mentioned yesterday, they have nothing to do with the newsreader/browser debate either.
</p>
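<p>
Here's a rough sketch, in Python, of what that unescape-and-select step might look like. It uses only the standard library's ElementTree rather than any feedreader's actual pipeline, the sample item is invented, and it assumes the description is already well-formed and wrapped in a single container element; anything messier would need a tidy step first.
</p>

```python
import xml.etree.ElementTree as ET

# Build a tiny RSS item whose description carries escaped markup, the way
# most feeds deliver it. The sample content is made up for illustration.
body = ET.Element("div")
ET.SubElement(body, "p").text = "The lead paragraph."
ET.SubElement(body, "p").text = "The rest of the story."
item = ET.Element("item")
ET.SubElement(item, "title").text = "Sample item"
ET.SubElement(item, "description").text = ET.tostring(body, encoding="unicode")
rss_item = ET.tostring(item, encoding="unicode")  # description markup is escaped here


def first_element(item_xml):
    """Return the first child element of an item's description, or None."""
    # findtext hands back the description with its entities already unescaped.
    description = ET.fromstring(item_xml).findtext("description", default="")
    try:
        content = ET.fromstring(description)  # only works for well-formed markup
    except ET.ParseError:
        return None  # a real reader would run a tidy step here instead
    return next(iter(content), None)


lead = first_element(rss_item)
print(lead.text)  # The lead paragraph.
```

<p>
Once the content is a tree rather than an opaque string, picking the lead element is one line, and the same tree is available to XPath-style queries.
</p>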
<p>
Bottom line: blog content needs to become a first class citizen in the XML world. And as it turns out, it's more feasible than I thought to make that so. Most people won't be producing well-formed content anytime soon. But tools that produce and consume RSS can compensate, with the help of things like HTML Tidy, and there are compelling reasons to do so.
</p>

</body>
</item>

<item num="a924">
<title>Heads, decks, and leads: revisited</title>
<date>2004/02/21</date>
<body>

<p>
In his essay <a href="http://www.masternewmedia.org/2004/02/19/the_birth_of_the_newsmaster.htm">Birth of the NewsMaster</a>, Robin Good writes:
<blockquote cite="http://www.masternewmedia.org/2004/02/19/the_birth_of_the_newsmaster.htm" class="personQuote RobinGood">
I have seen and heard of people subscribing to hundreds if not to thousands of feeds inside their RSS aggregators.
<br/>
Is that manageable?
Do these people get better and more information than everyone else?
<br/>
It is not.
They don't. 
</blockquote>
Information architecture is one of my abiding passions. Designing an information display that can be efficiently scanned is something I've thought a whole lot about. So I'm particularly keen to understand why some people report being overwhelmed by too much RSS input, while others say they're able to process lots of it effectively. 
</p>
<p>
Yesterday, for example, <a href="http://www.eweek.com/article2/0,4149,1439309,00.asp">Steve Gillmor</a> told me that he's feeling overwhelmed by thousands of unread items in NetNewsWire. Yet I never feel that way. I suspect that's because I'm reading in batches of 100 (in the Radio UserLand feedreader). I scan each batch quickly. Although <a href="http://www.nelson.monkey.org/~nelson/weblog/culture/blogs/fullrss.html">opinions differ</a> as to whether or not a feed should be truncated, my stance (which I'm reversing today) has been that truncation is a useful way to achieve the effect you get when scanning the left column of the Wall Street Journal's front page. Of the 100 items, I'll typically only want to read several. I open them into new Mozilla tabs, then go back and read them. Everybody's different, but for me -- and given how newspapers work, I suspect for many others too -- it's useful to separate the acts of scanning and reading. When I'm done with the batch, I click once to delete all 100 items.
</p>
<p>
As a user of NetNewsWire Lite, I don't have access to the Combined View that enables items to be processed in batch rather than individually. The <a href="http://ranchero.com/images/nnw/hpss/combinedView103.jpg">example screenshot</a> suggests that there is still a per-channel interaction required; however, I suspect that when Combined View is used in conjunction with Show Aggregated New Items, you can see -- and process -- everything at once. (If I've got that wrong, I'm sure <a href="http://inessential.com/">Brent</a> will clarify.) 
</p>
<p>
If Steve and I have the same batch-processing capability, why do we feel so differently about the overload problem? Maybe because it's not the same. If I'm right about NNW's Combined View / Show Aggregated New Items, the difference may boil down to this: my aggregated view delivers batches of 100, whereas Steve's delivers either small per-channel batches, or very large all-channel batches. So, in other words, I'm seeing what roughly corresponds to a Wall Street Journal news summary, whereas Steve is seeing what roughly corresponds to a 5x or 10x bigger version of that page. (If I've got that wrong, I'm sure Steve will clarify.) 
</p>
<p>
Either way, the content is an awkward mixture of truncated and full items. Both modes are useful, but they serve different purposes and they mix badly. Truncation is necessary for the Wall Street Journal effect, though where and how to truncate is a tricky question that I've just now changed my mind about. And of course you need the full view at some point, so you can actually read stuff. 
</p>
<p>
Currently I provide two versions of my feed: truncated and full. And the truncated feed is intelligently truncated. Using a callback that Dave Winer added to Radio UserLand a couple of years ago, I select the first HTML paragraph (&lt;p>) element. Knowing that this will happen, I put some thought into what that element will contain when I'm writing an item. In effect, the first paragraph element is the lead, or blurb. Sometimes it's just a plain paragraph. But sometimes it will contain an image, or a quotation, when these are appropriate and useful hooks. This <a href="http://udell.infoworld.com:8001/?//body/p[1][contains(ancestor::item/@channel,%20'full-length')%20and%20contains(ancestor::item/date,%20'2004/01')]">query</a>, which shows the first paragraphs from all my January items, illustrates some of the variation. The fact that I can issue this query against my untruncated feed shows that my truncated feed is really not necessary. What is necessary, or at any rate useful, is the extra bit of preparation, i.e. thinking about what goes into that first HTML paragraph. 
</p>
<p>
Unfortunately the effect of all my careful preparation has mostly been wasted so far. When you process large batches of feeds, some of which use intelligent truncation, some of which use dumb truncation (i.e., just grab the first 250 characters and slap on an ellipsis), and some of which use no truncation, the result is kind of a mess.
</p>
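<p>
The difference between the two kinds of truncation is easy to see in a few lines of Python. This is only an illustrative sketch using the standard library; the sample body is invented, the function names are mine, and the 250-character limit is just the figure mentioned above.
</p>

```python
import xml.etree.ElementTree as ET

# Sample item body, built in code; the content is invented for illustration.
body = ET.Element("body")
ET.SubElement(body, "p").text = "A short lead that works on its own."
ET.SubElement(body, "p").text = "Supporting detail. " * 30


def dumb_truncate(element, limit=250):
    """Grab the first `limit` characters of the text and slap on an ellipsis."""
    text = "".join(element.itertext())
    if len(text) > limit:
        return text[:limit] + "..."
    return text


def lead_paragraph(element):
    """Return the text of the first p child: the 'intelligent' truncation."""
    first = element.find("p")
    return "" if first is None else "".join(first.itertext())


print(dumb_truncate(body))   # cuts off mid-sentence, ignoring the markup
print(lead_paragraph(body))  # A short lead that works on its own.
```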
<p>
All along, I've had the idea that feedreaders should be able to smooth out these differences. If you wanted a Wall Street Journal view across all your feeds, you could get one. And if you wanted a full-content view across all your feeds, you could get that too.
</p>
<p>
Playing around with my queryable feed database today, I realized we're within shouting distance of making that happen. And I'm reversing my former stance on truncation. Here is a <a href="http://udell.infoworld.com:8001/?//body/*[1][ancestor::item/date%20=%20'2004/02/21']|//content/body[count(./*)=0][ancestor::item/date='2004/02/21']">Wall Street Journal view</a> of all of my feeds so far today. And here is a <a href="http://udell.infoworld.com:8001/?//body[ancestor::item/date = '2004/02/21']">full-content view</a> of all of my feeds so far today. It includes this long item I'm now writing, which shows how a mixture of truncated and untruncated content is optimal for neither scanning nor reading.
</p>
<p>
Here are my conclusions:
<ul>
<li><p>Nobody needs to truncate feeds in order to enable front-page views (although some will still want to in order to drive traffic to websites).</p></li>
<li><p>Everybody's content should be HTML (if not XHTML).</p></li>
<li><p>Authors should think of the first HTML element (normally a paragraph, but could be a list or a blockquote or something else) as special: the lead, or deck, that will appear in a front-page view.</p></li>
<li><p>Feedreaders should XHTML-ize what they read.</p></li>
<li><p>Feedreaders should then offer a front-page view (e.g., just the first HTML element found in each item) as well as a full-content view.</p></li>
</ul>
</p>
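<p>
As a sketch of the last two points, here is how a feedreader might derive both views from the same untruncated, well-formed content. It's written in Python with the standard library's ElementTree, not with the XPath service linked above; the toy feed, the element names (item, date, body), and the function names are assumptions made for the example.
</p>

```python
import xml.etree.ElementTree as ET

# A toy feed database built in code; dates and text are invented.
feed = ET.Element("feed")
for date, paragraphs in [
    ("2004/02/21", ["Lead one.", "More of item one."]),
    ("2004/02/21", ["Lead two."]),
    ("2004/02/20", ["Yesterday's lead.", "Yesterday's detail."]),
]:
    item = ET.SubElement(feed, "item")
    ET.SubElement(item, "date").text = date
    body = ET.SubElement(item, "body")
    for text in paragraphs:
        ET.SubElement(body, "p").text = text


def bodies_for(root, date):
    """Return the body element of every item published on `date`."""
    return [item.find("body") for item in root.findall(f".//item[date='{date}']")]


def front_page_view(root, date):
    """First child element of each body: the lead, whatever its tag."""
    return [body[0] for body in bodies_for(root, date) if len(body)]


def full_content_view(root, date):
    """Every child element of each body."""
    return [list(body) for body in bodies_for(root, date)]


for lead in front_page_view(feed, "2004/02/21"):
    print(lead.text)
```

<p>
Both views come from one source; the only difference is how much of each body gets selected.
</p>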
<p>
By the way, in case it isn't obvious, the RSS/Atom controversy is irrelevant to this discussion. In both environments, the same principles could be applied in exactly the same ways, for exactly the same reasons.
</p>

</body>
</item>


<item num="a923">
<title>Under the radar</title>
<date>2004/02/20</date>
<body>

<p>
Dare Obasanjo complains about being shut out of Steve Saxon's feed:
<blockquote class="personQuote DareObasanjo">
This afternoon I found out that <a href="http://ruxp.net/">Steve Saxon</a>, the author of the excellent article <a href="http://msdn.microsoft.com/library/en-us/dnexxml/html/xml03172003.asp">XPath Querying Over Objects with ObjectXPathNavigator</a>, had a Blogger.com blog that only provided an <a href="http://www.ruxp.net/atom.xml">ATOM feed.</a> Being that I use <a href="http://www.rssbandit.org/">RSS Bandit</a> as my aggregator of choice I cannot subscribe to his feed nor can I use <a href="http://www.lights.com/weblogs/rss.html">a large percentage of the existing news aggregators</a> to read Steve's feed. [<a href="http://www.25hoursaday.com/weblog/PermaLink.aspx?guid=49b611e5-3788-4921-8b55-00fc08de7e9e">Dare Obasanjo</a>]
</blockquote>
Of course, you can read Steve's Atom-only feed in an RSS-only newsreader such as RSS Bandit. Look:
</p>
<p>
<a href="http://weblog.infoworld.com/udell/gems/atom2rss.gif"><img width="419" height="305" src="http://weblog.infoworld.com/udell/gems/atom2rss.gif"/></a>
</p>
<p>
How? Just search Google for <a href="http://www.google.com/search?q=atom2rss">atom2rss</a>. There are a bunch of translators floating around. The one I picked comes from the folks at <a href="http://www.2rss.com">2rss.com</a>. Here is their translator: <a href="http://www.2rss.com/software.php?page=atom2rss">http://www.2rss.com/software.php?page=atom2rss</a>. And they are kindly making this service available for free. You can go to the site, plug in an Atom URL, generate the corresponding RSS URL, and subscribe to that.
</p>
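<p>
To give a sense of how small such a translation can be, here's a bare-bones sketch in Python. It is not the 2rss.com translator, just an illustration with the standard library. It assumes the Atom 1.0 namespace (feeds built on earlier drafts use a different namespace URI), it copies only titles, links, and content, and the sample feed is invented.
</p>

```python
import xml.etree.ElementTree as ET

# Assumption: the Atom 1.0 namespace. Feeds built on earlier drafts use a
# different namespace URI, so treat this constant as something to adjust.
ATOM = "{http://www.w3.org/2005/Atom}"


def atom_to_rss(feed):
    """Map an Atom feed element onto a bare-bones RSS 2.0 tree."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed.findtext(ATOM + "title", default="")
    for entry in feed.findall(ATOM + "entry"):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry.findtext(ATOM + "title", default="")
        link = entry.find(ATOM + "link")
        if link is not None:
            ET.SubElement(item, "link").text = link.get("href", "")
        ET.SubElement(item, "description").text = entry.findtext(ATOM + "content", default="")
    return rss


# A one-entry sample feed, built in code; its contents are invented.
feed = ET.Element(ATOM + "feed")
ET.SubElement(feed, ATOM + "title").text = "Sample feed"
entry = ET.SubElement(feed, ATOM + "entry")
ET.SubElement(entry, ATOM + "title").text = "Hello"
ET.SubElement(entry, ATOM + "link", href="http://example.com/hello")
ET.SubElement(entry, ATOM + "content").text = "Entry text."

rss = atom_to_rss(feed)
print(ET.tostring(rss, encoding="unicode"))
```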
<p>
Sheesh. The fact that we are now going to have a war over formats that are separated by a trivial XML transformation is almost as depressing as February in New England. This cheered me up, though:
<blockquote cite="http://blogs.geekdojo.net/pdbartlett/archive/2004/02/19/1146.aspx" class="personQuote PaulBartlett">
I'd like to give a big "hats off" to all the work Jon Udell is doing to make the "semantic web" a reality by looking at ways to extract information from existing (X)HTML content, and also in proposing new ways of adding semantic information to the markup for new content. If you don't already read his blog then I can't recommend it highly enough. Anyone who's involved in the reading or writing of technical blogs must surely have at least a passing interest in this sort of stuff. (BTW, I'm surprised that his work does not seem to have received much attention from the "big" blogging sites and/or engines. Unless I've missed it, of course...) [<a href="http://blogs.geekdojo.net/pdbartlett/archive/2004/02/19/1146.aspx">Paul's Imaginary Friend</a>]
</blockquote>
I've wondered about this too. My focus is on the syndication payload, not the syndication wrapper, and for my purposes it's completely irrelevant whether the wrapper is RSS, Atom, or Bob's Your Uncle. Come to think of it, maybe it's a good thing that the big sites and engines aren't focused on this stuff yet. Where there's syntax, there's the potential for another format war. What's really needed, though, is a quiet space for experimentation and organic evolution. 
</p>

</body>
</item>


<item num="a922">
<title>Using the Yahoo! search plugin in Mozilla</title>
<date>2004/02/19</date>
<body>

<p>
Somebody was looking over my shoulder the other day as I was using the dropdown list of search plugins in Firefox (nee Firebird nee Phoenix nee Mozilla), and was surprised to see it. Which reminded me that in IE and Safari, the built-in search isn't extensible. 
</p>
<p>
Now that we're all comparing Google and Yahoo!, it's really handy to be able to query one engine, then repeat the query in another engine frictionlessly. Here's what that looks like:
</p>
<p>
<a href="http://weblog.infoworld.com/udell/gems/firefoxSearchPlugins.jpg"><img width="391" height="288" src="http://weblog.infoworld.com/udell/gems/firefoxSearchPlugins.jpg"/></a>
</p>
<p>
If you're using a Mozilla variant and haven't set up a Yahoo! plugin yet, it's installable from the <a href="http://mycroft.mozdev.org/quick/yahoo.html">mycroft page</a>. Very convenient.
</p>

</body>
</item>


<item num="a921">
<title>Google News coverage of Yahoo! dumping Google</title>
<date>2004/02/18</date>
<body>

<p>
I wondered whether today's biggest tech news story -- Yahoo! dumping Google for its own search engine -- would show up first in the Sci/Tech category at Google. Sure enough, it does:
</p>
<p>
<img border="1" vspace="6" src="http://weblog.infoworld.com/udell/gems/yahooDumpsGoogle.jpg"/>
</p>
<p>
Cool.
</p>
<p>
Meanwhile, as everyone begins to dissect the capabilities of the new Yahoo! search engine, <a href="http://www.theshiftedlibrarian.com/2004/02/18.html#a5227">The Shifted Librarian</a> notes that RSS feeds associated with found sites are highlighted, and can be added to the My Yahoo! feedreader. Except for Google's Blogger-created blogs, which don't bother to provide RSS feeds.
</p>
<p>
Uncool.
</p>
<p>
<b>Update:</b> Heh. I just rechecked, and now there's no sign of the Yahoo! story at news.google.com. Did I get today's lucky screenshot?
</p>
<p>
<b>Further update:</b> Now the Yahoo! story is back :-)
</p>
<p>
<b>Still further update:</b> And now it's gone again. I'm getting dizzy...
</p>
<p>
<img border="1" vspace="6" src="http://weblog.infoworld.com/udell/gems/yahooDumpsGoogle2.jpg"/>
</p>


</body>
</item>


<item num="a920">
<title>LibraryLookup for Talis Prism</title>
<date>2004/02/18</date>
<body>

<p>
<a href="http://www.timhodson.com/">Tim Hodson's</a> LibraryLookup bookmarklet broke when his library upgraded its OPAC. So he fixed it:
<blockquote class="personQuote TimHodson">
Have used your lookup many times, until our library service started to use a new OPAC! Talis Information systems have released a new OPAC which is called Talis Prism. For a while I thought my lovely lookup would never work again, but I have recently discovered (by changing their post form variables to gets with the marvellous firefox browser and a web developers toolbar) that a get version of their page works just as well.
</blockquote>
Thanks, Tim! I love to see users hacking their library systems this way. I've taken the URL pattern that Tim figured out and added it to the <a href="http://weblog.infoworld.com/udell/stories/2002/12/11/librarylookupGenerator.html">build your own bookmarklet</a> service; Talis Prism now becomes the twelfth supported OPAC. I can no longer keep up with the static lists that I originally compiled in order to seed this project. But I'm always on the lookout for new patterns -- like the one Tim has provided -- that enable users to generate their own bookmarklets for some previously unsupported class of OPAC system.
</p>
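<p>
The trick Tim describes, replaying a POST form's variables as a GET URL, boils down to appending the encoded fields to the form's action as a query string. Here's a minimal Python sketch of that idea. The catalog address and field names are hypothetical placeholders, not the actual Talis Prism pattern, and whether a given OPAC accepts the GET form is something you'd have to test, as Tim did.
</p>

```python
from urllib.parse import urlencode


def get_url(action, fields):
    """Turn a form's action URL and its fields into an equivalent GET URL."""
    return action + "?" + urlencode(fields)


# Hypothetical catalog address and field names, for illustration only.
url = get_url(
    "http://catalog.example.org/search",
    {"searchType": "isbn", "term": "0596002815"},
)
print(url)
```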
<p>
If your game is enterprise software, you might regard all this library stuff as an odd diversion of mine. But ask yourself: can users of your ERP and CRM systems hack their own integration? If not, why not?
</p>

</body>
</item>

<item num="a919">
<title>Real world semantics</title>
<date>2004/02/18</date>
<body>

<p>
At ETech (which I unfortunately could not attend) there was a presentation entitled <a href="http://tantek.com/presentations/2004etech/realworldsemanticspres.html">real world semantics</a> that is close in spirit to my own recent experimentation. The presenters were Technorati's <a href="http://epeus.blogspot.com/">Kevin Marks</a> and <a href="http://tantek.com/log/2004/02.html">Tantek Celik</a>, who fought the good fight to bring quality CSS support to Microsoft's now-abandoned MSIE/Mac. Phrases they use to define real world semantics: "emerging semantic (x)html", "adoption by 'real people'", "beyond academics and theoretical discussions." Exactly.
</p>
<p>
Meanwhile, over on <a href="http://www.openlinksw.com/blog/~kidehen/">Kingsley Idehen's blog</a>, you can see another implementation of the kind of xhtml-aware search technology I've been playing with here. The <a href="http://www.openlinksw.com/blog/search.vspx?blogid=127">advanced search</a> feature uses the Virtuoso engine to perform not only XPath search, as I'm doing, but also XQuery search. Here's one of the provided examples:
<pre class="code xquery">
for $i in node()//a return &lt;p>{ string($i/@href) }&lt;/p>
</pre>
This query, which finds links and produces a series of paragraphs containing the referenced URLs, shows how XQuery can combine the search capability of XPath with the transformative and generative power of XSLT.
</p>
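<p>
For readers who don't have an XQuery engine handy, the same find-and-generate pattern can be approximated in Python. This is only a sketch with the standard library's ElementTree, not Virtuoso's implementation; the sample fragment and the <tt>link_paragraphs</tt> name are made up for the example.
</p>

```python
import xml.etree.ElementTree as ET

# Sample XHTML fragment built in code; the links are placeholders.
doc = ET.Element("div")
para = ET.SubElement(doc, "p")
ET.SubElement(para, "a", href="http://example.com/one").text = "one"
ET.SubElement(para, "a", href="http://example.com/two").text = "two"


def link_paragraphs(root):
    """For each a element, produce a new p element holding its href."""
    results = []
    for anchor in root.iter("a"):
        p = ET.Element("p")
        p.text = anchor.get("href", "")
        results.append(p)
    return results


for p in link_paragraphs(doc):
    print(ET.tostring(p, encoding="unicode"))
```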
<p>
Although random XHTML can be mined more fruitfully than you might suspect, I'm on the lookout for ways to naturally, and virally, enrich its semantic carrying capacity. The Celik/Marks presentation points to several such efforts, including <a href="http://geourl.org/">GeoURL</a>, which I use in my blog's header to announce my location (&lt;META name="ICBM" content="42.93564,-72.27239">), and <a href="http://gmpg.org/xfn/">XFN</a>, the XHTML Friends Network, which proposes using the REL attribute of links (&lt;a href="..." rel="acquaintance">) to indicate relationships. This is the sort of thing that will make the search techniques Kingsley and I are demonstrating come alive. My hunch is that lots of XFN-like strategies will emerge, if we can close the feedback loop and connect the effort required to adopt such a strategy to an immediate reward.
</p>
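<p>
Here's a small Python sketch of what mining those two conventions might look like once a page is well-formed. It uses the standard library's ElementTree and assumes lower-case element and attribute names, as XHTML requires. The coordinates are the ones from the META example above; the link and its rel values are invented, and the function names are mine.
</p>

```python
import xml.etree.ElementTree as ET

# Sample page built in code. The coordinates come from the META example
# in the text; the link target and rel values are invented.
html = ET.Element("html")
head = ET.SubElement(html, "head")
ET.SubElement(head, "meta", name="ICBM", content="42.93564,-72.27239")
body = ET.SubElement(html, "body")
ET.SubElement(body, "a", href="http://example.com/", rel="friend met").text = "A friend"


def location(root):
    """Return (latitude, longitude) from an ICBM meta element, or None."""
    meta = root.find(".//meta[@name='ICBM']")
    if meta is None:
        return None
    lat, lon = (float(part) for part in meta.get("content", "").split(","))
    return lat, lon


def relationships(root):
    """Map each link's href to the list of values in its rel attribute."""
    return {a.get("href"): a.get("rel").split() for a in root.iter("a") if a.get("rel")}


print(location(html))       # (42.93564, -72.27239)
print(relationships(html))  # {'http://example.com/': ['friend', 'met']}
```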


</body>
</item>


<item num="a918">
<title>Gender, personality, and social software</title>
<date>2004/02/17</date>
<body>

<p>
<blockquote class="personQuote JonUdell">
"I feel like I'm at a Microsoft monastery here," wrote Rory Blyth from the most recent Professional Developers Conference. "I think I've seen about 2.5 females ... it's like they're an endangered species." The observation holds equally true for open source conferences.
<br/>...<br/>
If we expect social software to help rewrite the productivity equation, social skills and protocols become critical parts of the game. How can social software succeed if, in its development, half the population is so poorly represented?  [Full story at <a href="http://www.infoworld.com/article/04/02/13/07OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
This column touches on two third-rail issues: personality and gender. The <a href="http://www.wired.com/wired/archive/9.12/aspergers_pr.html">Wired article on Asperger's syndrome</a> cited in the column was incorrectly dated, by the way. My error: it was of course published in 2001, not 1991. That slipped past me and my editors, but my friend <a href="http://radio.weblogs.com/0105977/">Larry Welkowitz</a>, a psychologist and AS specialist, caught it. 
</p>
<p>
I'm not a social scientist or a psychologist, and I was reluctant to touch either of these controversies. (As you might imagine, the column provoked some internal discussion at InfoWorld.) In the end I decided to go ahead precisely because both subjects make me uncomfortable.
</p>
<p>
The larger of the two issues, in my mind, is that of gender. Nobody seems to have any real answers, but here are some perspectives on gender and computing:
</p>
<blockquote cite="http://www.nsf.gov/sbe/srs/databrf/sdb97326.htm">
The percentage decline in computer science was much larger among women (51 percent) than among men (28 percent) from 1985 to 1995. [<a href="http://www.nsf.gov/sbe/srs/databrf/sdb97326.htm">National Science Foundation</a>]
</blockquote>
<blockquote cite="http://www.mines.edu/fs_home/bmoskal/scholprog/Reports_Sp_01/Makoski.pdf" class="personQuote HeatherMakoski">
Programming assignments are many times devoid of meaning and importance to people's lives, which tends to appeal more to boys. Girls, on the other hand, will be more attracted to technology, if it has some meaning or positive purpose in a real-world context. [<a href="http://www.mines.edu/fs_home/bmoskal/scholprog/Reports_Sp_01/Makoski.pdf">Heather Makoski: Underrepresentation of Women in Science, Engineering, and Mathematics</a>]
</blockquote>
<blockquote cite="http://www.mills.edu/ACAD_INFO/MCS/SPERTUS/Gender/pap/node1.html" class="personQuote EllenSpertus">
Girls and women are choosing, consciously or subconsciously, not to go into or stay in computer science. While one cannot rule out the possibility of some innate neurological or psychological differences that would make women less (or more) likely to excel in computer science, I found that the cultural biases against women's pursuing such careers are so large that, even if inherent differences exist, they would not explain the entire gap. [<a href="http://www.mills.edu/ACAD_INFO/MCS/SPERTUS/Gender/pap/node1.html">Ellen Spertus: Why are There so Few Female Computer Scientists?</a>]
</blockquote>
<blockquote cite="http://www.unix-girl.com/blog/archives/000380.html" class="personQuote KasiaTrapszo">
On my way to work this morning I was listening to NPR, as I usually do, and heard a segment on the declining numbers of female students entering the computer science major. I'm sure they are correct in their observation that the numbers are indeed declining, I'm not going to argue that. I am however finding myself disagreeing with their reasoning behind this decline. One thing in particular that I felt was an erronous conclusion.. the amount of time young boys spend playing video games as opposed to young girls. 
I do agree that most video games are geared towards boys, I don't agree that this has anything to do with the probability of a child's future interest in computer science. [<a href="http://www.unix-girl.com/blog/archives/000380.html">kasia in a nutshell</a>]
</blockquote>
<blockquote cite="http://www.enderton.com/maria/honors/honors-double.pdf" class="personQuote MariaEnderton">
Additionally, for many females, computers are more meaningful and compelling if they are able to link them with other fields and are able to keep computer science's social context in mind. Margolis and Fisher (2002) call this appeal "computing with a purpose." However, computer science curricula has traditionally been oriented on the basis of the fascinations of male students, and the aspects of computers that females find interesting may not be emphasized. This lack of emphasis on certain characteristics may discourage women, allowing them to feel computers "aren't for them." [<a href="http://www.enderton.com/maria/honors/honors-double.pdf">Maria Enderton: Honors Thesis, Women in Computer Science</a>]
</blockquote>
<blockquote cite="http://www.digitalsqueeze.com/drupal/node/view/127">
It's funny that Jon has written an article on social networking addressing the geek male perspective, when I've been thinking quite a bit lately that some of the best minds regarding efforts behind Social Networking are actually female. They just get the importance of relationships much better than guys. [<a href="http://www.digitalsqueeze.com/drupal/node/view/127">Digital Squeeze</a>]
</blockquote>
<p>
I'd be interested in <a href="http://radiocomments.userland.com/comments?u=100887&amp;p=918&amp;link=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2F2004%2F01%2F27.html%23a918">comments</a> on these issues.
</p>
<p>
On a related note, I'm working on a story about enterprise social software. What that label means is, of course, open to discussion. If you're developing and/or using what you think of as enterprise social software, and want to talk about it, feel free to ping me.
</p>

</body>
</item>


<item num="a917">
<title>It was forty years ago today</title>
<date>2004/02/15</date>
<body>

<p>
Actually, it was in 1960, four years before the Beatles showed up on Ed Sullivan, that <a href="http://www.cs.dartmouth.edu/~doug/">Doug McIlroy</a> published <a href="http://portal.acm.org/citation.cfm?id=367223&amp;dl=ACM&amp;coll=portal">Macro instruction extensions of compiler languages</a>, which appears to be a seminal paper in the literature of metaprogramming. I mention this because a number of folks have responded to last week's item, <a href="http://weblog.infoworld.com/udell/2004/02/11.html#a915">Programs that write programs</a>, pointing out that Lisp programmers have been there, done that:
<blockquote>
"your note about code generation, and the referenced discussions - bits of which i'd already read elsewhere, left me with a really eerie feeling, that i might not be living in the same dimension with you folks. you see, there's a practice of code generation which extends back decades: lisp. code generations is a lisp programmer's bread and butter."
</blockquote>
<blockquote>
"In the lisp world, they call these macros.  The idea is pretty widely known, though not too many languages implement them.  Perl 6, and by extension the Parrot interpreter, will include macros, and they will thus be available to  any language that gets implemented on top of Parrot (which currently includes Ruby, Python, (maybe) PHP, and, of course, Perl)."
</blockquote>
</p>
<p>
Points taken. As it happens, I did at one time program in a variant of Lisp. From that experience I learned the value of incremental development, dynamic data structures (lists, dictionaries), code generation, and other techniques that later became available to me in languages like Perl and Python. But macros weren't part of the Lisp I used, so I didn't make that connection. 
</p>
<p>
The cultural anthropology of programming languages is a fascinating subject. Recently, for example, I asked an accomplished developer with deep roots in the Microsoft programming culture to cite his favorite productivity aids in the .NET Framework. Regular expressions made his short list. That floored me, since regexes are just part of the atmosphere that Unix and open source programmers have always breathed. But a lot of Microsoft programmers didn't grow up breathing that atmosphere.
</p>
<p>
Of course it goes both ways. I'm likely to try out new technologies on Windows first, because the Windows culture groks packaging and installation -- even of open source software! -- better than the Unix culture does.
</p>
<p>
There's some truth to the oft-heard claim that there are no new software technologies. If we spent the next decade just cross-fertilizing what we already have, it would probably be a decade well spent.
</p>

</body>
</item>

<item num="a916">
<title>OCLC refines its ISBN-clustering service</title>
<date>2004/02/13</date>
<body>

<p>
Python hacker and OCLC chief scientist Thom Hickey has updated me on the <a href="http://www.oclc.org/research/projects/xisbn/">xISBN</a> project:
<blockquote class="personQuote ThomHickey">
Just thought I'd let you know that we've put up a new version of the ISBN database.  We've done a lot of work to pull works with variant titles together (which helps with <a href="http://labs.oclc.org/xisbn/0066620694">The Innovator's Dilemma</a>) and made the retrievals consistent, so that any ISBN in a group retrieves that same ISBN group (which also helps with I's D).  We've learned a lot about how ISBNs are used (and misused).
</blockquote>
Thanks for the update, Thom. Sure enough, my original examples now work as advertised. Here's what Thom was referring to:
<blockquote class="personQuote JonUdell">
There are a few caveats here. First, the one-to-many algorithm doesn't seem to be fully bi-directional. In the example above, we'd like to get from 0066620694, a paperback, to 0875845851, a hardcover. But although we can get from <a href="http://labs.oclc.org/xisbn/0875845851">0875845851 to 0066620694</a>, we can't get from <a href="http://labs.oclc.org/xisbn/0066620694">0066620694 to 0875845851</a>. [Jon's Radio: <a href="http://weblog.infoworld.com/udell/2003/11/13.html">Multi-ISBN LibraryLookup</a>]
</blockquote>
Those two links didn't use to yield the same set of ISBNs. Now they do. Cool!
</p>
<p>
My adaptation of LibraryLookup for xISBN, by the way, is <a href="http://weblog.infoworld.com/udell/2003/11/13.html">here</a>. An improved xISBN service makes it more interesting, but the real bottleneck will be the OPAC systems themselves. The LibraryLookup idea -- which gets a nice mention in this month's <a href="http://www.technologyreview.com/articles/wo_stenger021304.asp">Technology Review</a> (thanks, <a href="http://www.raelity.org/">Rael</a>!) -- works by splicing two Web contexts together. From a page at Amazon or B&amp;N or AllConsuming, you go to a page on your library's Innovative or Polaris OPAC system. Now, with xISBN, you can present the OPAC with a list of ISBNs. Unfortunately, OPACs have no idea what to do with a list of ISBNs. The multi-window solution <a href="http://weblog.infoworld.com/udell/gems/multiIsbnLookupGenerator.html">I tried</a> is kind of lame. 
</p>
<p>
I'd love to see the various OPACs take note of xISBN. We can imagine all sorts of fancy integrations involving Web services or WSRP, but the simplest thing, really, would be for OPACs to silently expand an ISBN to an ISBN group, search accordingly, and return a combined result. 
</p>
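<p>
A minimal sketch of that "silently expand, search, and combine" idea might look like the following. The group table and the toy catalog are hypothetical stand-ins, not the xISBN service or any real OPAC interface; only the two ISBNs come from the example above.
</p>

```python
# Hypothetical data: one ISBN group, as an xISBN-like service might return it,
# and a toy catalog mapping ISBNs to holdings. Neither reflects a real API.
ISBN_GROUPS = [
    {"0066620694", "0875845851"},  # paperback and hardcover of one work
]

CATALOG = {
    "0875845851": ["Main branch, hardcover"],
}


def expand_isbn(isbn, groups=ISBN_GROUPS):
    """Return every ISBN in the same group as isbn, or just isbn if ungrouped."""
    for group in groups:
        if isbn in group:
            return sorted(group)
    return [isbn]


def search_catalog(isbn, catalog=CATALOG, groups=ISBN_GROUPS):
    """Search for every ISBN in the group and merge the holdings into one list."""
    results = []
    for candidate in expand_isbn(isbn, groups):
        results.extend(catalog.get(candidate, []))
    return results


# The library holds only the hardcover, but a paperback lookup still finds it.
print(search_catalog("0066620694"))
```

<p>
The point of the design is that the patron never sees the list of ISBNs; the expansion happens inside the catalog's own search.
</p>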
<p>
I'll be on a panel at SXSW Interactive in March, entitled <a href="http://www.sxsw.com/interactive/panels/index.php?action=details&amp;con=ia&amp;panelname=Streetwise+Librarians+and+the+Revolution+in+Public+Information">Streetwise Librarians and the Revolution in Public Information</a>, which should be a great venue in which to explore these kinds of issues.
</p>

</body>
</item>

<item num="a915">
<title>Programs that write programs</title>
<date>2004/02/11</date>
<body>

<p>
Following pointers from Ned Batchelder's recent excursion into <a href="http://www.nedbatchelder.com/blog/200402.html#e20040211T060922">code generation</a> led me to another nice example of the power of dynamic languages. In order to streamline his use of C++, Ned wrote a little tool called <a href="http://www.nedbatchelder.com/code/cog/index.html">cog</a> which enables him to embed, in C++ programs, Python fragments that generate verbose and/or repetitive C++ constructs. He adds:
<blockquote cite="Ned Batchelder">
For more about code generation in general, try:
<ul>
<li>
<a href="http://www.codegeneration.net/tiki-read_article.php?articleId=9" class="offsite">Dave Thomas interviewed about code generation</a>. Dave Thomas is one of the <a href="http://www.pragprog.com" class="offsite">Pragmatic Programmers</a>, and I find I agree with him almost universally. He forbids putting the output of code generators under source control, I encourage it.  We agree that the output should never be edited.
</li>
<li>The <a href="http://c2.com/cgi/wiki?CodeGeneration" class="offsite">Code Generation</a> page on the c2 wiki. As will happen with a wiki, this fractures off in many directions, with many different viewpoints, both for and against code generation.
</li>
</ul>
[<a href="http://www.nedbatchelder.com/blog/200402.html#e20040210T222100">Ned Batchelder</a>]
</blockquote>
</p>
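<p>
To make the idea concrete, here is a small sketch of a Python fragment that emits repetitive C++ declarations. It does not use cog itself, and the function names are hypothetical; it only illustrates the kind of generator such a tool lets you embed.
</p>

```python
# Hypothetical list of function names; cog itself is not used here.
FUNCTION_NAMES = ["DoSomething", "DoAnotherThing", "DoLastThing"]


def generate_declarations(names):
    """Return one C++ function declaration per name, one per line."""
    return "\n".join("void %s();" % name for name in names)


# Prints three C++ declarations that would otherwise be typed by hand.
print(generate_declarations(FUNCTION_NAMES))
```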
<p>
In the interview Ned cites, Dave Thomas gives an example of a Ruby feature that I've heard of, but never had occasion to use. In a class definition you can write Ruby code to define a type. That means, as Thomas puts it, that "you can effectively extend the language at runtime from within." Statements like that have a tendency to alienate people. It can sound like the drug-induced fantasy of some idealistic tree-hugging Birkenstock-wearer who isn't living in the real world of Enterprise Software Development. But Thomas backs it up with a great example. In this case, he used Ruby's dynamic extensibility to wrap a database schema in classes that can either persist objects to the database, or create schema documentation, depending on how the methods that dynamically define those classes are defined.
</p>
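<p>
A rough Python analog of that technique, not Thomas's Ruby code, is sketched below. It builds a class at runtime from a hypothetical schema description; the table name, columns, and the <tt>describe</tt> method are all assumptions made for illustration, and nothing here touches a real database.
</p>

```python
# Hypothetical schema: table name mapped to its column names.
SCHEMA = {"customer": ["id", "name", "email"]}


def make_record_class(table, columns):
    """Build a class at runtime whose constructor accepts the table's columns."""

    def __init__(self, **values):
        # Missing columns default to None.
        for column in columns:
            setattr(self, column, values.get(column))

    def describe(cls):
        # Stand-in for "create schema documentation".
        return "%s(%s)" % (table, ", ".join(columns))

    return type(
        table.capitalize(),
        (object,),
        {"__init__": __init__, "describe": classmethod(describe)},
    )


Customer = make_record_class("customer", SCHEMA["customer"])
alice = Customer(id=1, name="Alice")
print(Customer.describe(), alice.name, alice.email)
```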
<p>
Once upon a time <a href="http://training.perl.com/">Tom Christiansen</a> gave me a great quote, which he attributes to <a href="http://www.research.att.com/info/andrew/">Andrew Hume</a>: "Programs that write programs are the happiest programs in the world." Templating and code generation are examples of this happy strategy. We've always known that dynamic languages are a great way to create "little languages" for specific tasks. But we don't yet fully appreciate that <i>all</i> programming is a continuous process of language invention. And we don't (yet) evaluate programming-language productivity on those terms. Dave Thomas:
<blockquote cite="Dave Thomas">
I'm betting that languages such as Java and C++ will in the long term be seen as a curious branch in the evolution of computing. I'm hoping that somewhere out there some bright spark is coming up with a way of letting us write applications expressively and dynamically. Once this happens, the need for these kinds of code generators will diminish.
<br/><br/>
For example, I rarely (if ever) write a code generator that generates Ruby code: there's just no need, as Ruby is dynamic enough to let me do what I want without leaving the language.
</blockquote>
We are linguistic animals endowed with a protean ability to generate language. Naturally we'll want that same generative power in our programming languages. 
</p>

</body>
</item>
	

<item num="a913">
<title>Multi-valued CSS class attributes</title>
<date>2004/02/09</date>
<body>

<p>
A reader named Jemisa wrote last week with this proposal:
<blockquote cite="Jemisa">
Just another proposition for 
<pre>
&lt;pre class="code" lang="python">
...
&lt;/pre>
</pre>
Why not use
<pre>
&lt;pre class="code python">
...
&lt;/pre>
</pre>
I know it's less "semantic" than your experimental attribute, but it might be useful (to style python code with different color than perl code for example)
</blockquote>
This is a great idea which I at first completely failed to understand. My objections were twofold. On the CSS front, I assumed you'd want to be able to style 'code' things independently of 'python' things. And on the XPath search front, I assumed you'd want to be able to search for them independently.
</p>
<p>
But then <a href="http://jim.roepcke.com/">Jim Roepcke</a> sent me the exact same proposal, and set me straight on the first point:
<blockquote cite="Jim Roepcke">
You can specify more than one class name in the class attribute.... you could say:
<pre>
&lt;pre class="code python">...&lt;/pre>
</pre>
I don't know if your XPath stuff can handle that, but it's a valid way to specify the class of an element. In terms of CSS, both the code and python classes will be applied to the element.
</blockquote>
OK. Now I finally get it. I can use XHTML like this:
<pre class="code xhtml">
&lt;pre class="code python">
import re
import sys
...
&lt;/pre>
 
&lt;blockquote class="personQuote StefanoMazzocchi">
"..."
&lt;/blockquote>
</pre>
along with CSS like this:

<pre class="code css">
blockquote.personQuote {
font-style: italic;
}
blockquote.StefanoMazzocchi:after {
content: 
  url(http://www.oio.de/public/web/stefano-mazzocchi.jpg)
}
pre.code {
border-style: solid;
border-width: thin;
padding: 10px;
}
pre.python:after {
font-weight: bold;
content: 
  url(http://www.python.org/pics/PythonPoweredSmall.gif);
}
</pre>
</p>
<p>
This looks really promising. On the XHTML front, legal and even elegant. No extra namespace baggage. It's easy to write this way by hand, and tools that give people control over CSS styles could easily support this method if they wanted to.
</p>
<p>
On the search front, it's less than ideal. But the difference boils down to substring matching versus string equality. On a small corpus of XML content -- i.e., your own blog -- this won't matter, as I've already demonstrated to my own satisfaction. And on a larger corpus -- like the one I'm assembling now -- I'm presuming a database with indexing that can make a contains() query roughly as efficient as an equals() query. 
</p>
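<p>
The difference between the two kinds of matching can be sketched in a few lines of Python. The sample markup is hypothetical (the third element exists only to show a false positive), and the helper names are my own; the substring test is meant to be comparable to an XPath contains() on the class attribute.
</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, modeled on the markup shown above.
SAMPLE = """<div>
  <pre class="code python">import re</pre>
  <pre class="code css">pre.code { padding: 10px; }</pre>
  <pre class="code pythonic">not really python</pre>
</div>"""


def has_class(element, name):
    """True if name is one of the space-separated tokens in the class attribute."""
    return name in element.get("class", "").split()


def has_class_substring(element, name):
    """Looser test, comparable to XPath contains(@class, name)."""
    return name in element.get("class", "")


root = ET.fromstring(SAMPLE)
pres = root.findall(".//pre")
print(len([p for p in pres if has_class(p, "python")]))            # token match
print(len([p for p in pres if has_class_substring(p, "python")]))  # substring match
```

<p>
The substring test also picks up the "pythonic" element, which is the sense in which this approach is less than ideal for search.
</p>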
<p>
Thanks, Jemisa and Jim! At the moment this looks like the winning strategy. I can't dive into this for a few days yet, but I'll watch for further feedback in case there's something <i>else</i> I've missed.
</p>

</body>
</item>


<item num="a912">
<title>Device independence</title>
<date>2004/02/06</date>
<body>

<p>
<blockquote cite="InfoWorld">
For a team of collaborators, Groove synchronizes both the sets of applications available in a given context (or "shared space") and the data written by those applications. If you drop your laptop on the floor you can effortlessly recover everything into a fresh instance of Groove on a new machine.
<br/><br/>
Of course this works only for native Groove apps. Browser history and bookmarks, Outlook settings, and a million other things are handled in a million other ways -- or not handled at all -- because desktop operating systems aren't Groove. A general solution would require OSs that work like Groove, and applications that send messages rather than write files. Well, come to think of it, why not? [Full story at <a href="http://www.infoworld.com/article/04/02/06/06OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
</p>

</body>
</item>


<item num="a911">
<title>Things that shouldn't have to be said</title>
<date>2004/02/06</date>
<body>

<p>
<blockquote cite="Doc Searls">
But sometimes arguments cross a line beyond which everybody gets hurt, including the Net. I see that happening here. Even though I'm no technologist, it's clear to me that the Net has been improved, radically and fundamentally, by RSS and other standards like it (even if they come, as Mark claims RSS does, in 9 incompatible versions). [<a href="http://doc.weblogs.com/2004/02/06#peaceOut">Doc Searls</a>]
</blockquote>
Must we <i>still</i>, at this late date, reiterate and underscore Doc's point? Apparently, we must. Sigh.
</p>
<p>
Oh, and by the way, what's up with this?
<img src="http://weblog.infoworld.com/udell/gems/xml11.gif" vspace="10" border="1"/>
</p>

</body>
</item>



<item num="a910">
<title>Notes from an XQuery practitioner</title>
<date>2004/02/06</date>
<body>

<p>
A Hungarian developer, <a href="http://fb2.hu/x10/">Fejes Balazs</a>, alerted me to a <a href="http://fb2.hu/x10/Articles/XQueryforfun.html">couple</a> <a href="http://fb2.hu/x10/Articles/XQueryInWorkshop.html">of</a> his articles on XQuery -- the first a general introduction, and the second a walkthrough of XQuery transformation in BEA's WebLogic Workshop. Both are nicely done.
</p>
<p>
Given that so much more can be done with XPath and XSLT than is widely appreciated, I've been focused mainly on broadening awareness of what's possible. But I've been studying XQuery in parallel, and it recently struck me that one of the reasons XQuery is going to be important matches one of the reasons that dynamic programming languages are important: both let you play with data.
</p>
<p>
That phrase -- "play with data" -- comes to me by way of Jonathan Robie, whom I met at XML 2003. Jonathan, co-author of <a href="http://safari.oreilly.com/0321180607">XQuery from the Experts</a> and co-editor of the XPath and XQuery specs, believes (as do I) that data is a substance, like clay, that you have to pound on, roll out, squeeze, mold, and generally get your hands dirty with, in order to discover its possibilities. 
</p>
<p>
Over the years I've come to see that the ability to treat data like clay is a primary benefit of languages such as Perl and Python. If you had to finalize your data structures up front you'd never get anywhere, because they're emergent. 
</p>
<p>
Now listen to Jonathan Robie on the subject of types in XQuery:
<blockquote cite="Jonathan Robie">
The type system of XQuery is one of the most eclectic, unusual, and useful aspects of the language. XML documents contain a wide range of type information, from very loosely typed information without even a DTD, to rigidly structured data corresponding to relational data or objects. A language designed for processing XML must be able to deal with this fact gracefully; it must avoid imposing assumptions on what is allowed that conflict with what is actually found in the data, allow data to be managed without forcing the programmer to cast values frequently, and allow the programmer to focus on the documents being processed and the task to be performed rather than the quirks of the type system. [<a href="http://safari.oreilly.com/JVXSL.asp?x=1&amp;mode=section&amp;sortKey=rank&amp;sortOrder=desc&amp;view=book&amp;xmlid=0-321-18060-7/ch01lev1sec14&amp;open=false&amp;g=&amp;srchText=types+in+xquery&amp;code=&amp;h=&amp;m=&amp;l=1&amp;catid=&amp;s=1&amp;b=1&amp;f=1&amp;t=1&amp;c=1&amp;u=1&amp;r=&amp;o=1&amp;page=0">XQuery From the Experts, Chapter 1: A Guided Tour</a>]
</blockquote>
I really like the sound of that.
</p>


</body>
</item>


<item num="a909">
<title>Experimental attributes</title>
<date>2004/02/05</date>
<body>

<p>
There have been a number of thoughtful responses to my <a href="http://weblog.infoworld.com/udell/2004/02/03.html#908">confession</a>, the other day, about cheating on Web standards. Several folks recommended this approach:
<pre class="code" lang="xhtml">
&lt;blockquote 
   cite="http://www.betaversion.org/~stefano/linotype/news/35/"
   title="Stefano Mazzocchi">
...
&lt;/blockquote>
</pre>
Jim White also made this intriguing proposal:
<pre class="code" lang="xhtml">
&lt;blockquote cite="urn:name:Stefano%20Mazzocchi">
..
&lt;/blockquote>
</pre>
Jim pointed me to the <a href="http://www.iana.org/assignments/urn-namespaces">IANA registry of URN namespaces</a>, noting that while 'name' is not among those registered namespaces, and the one you can find there -- <a href="http://www.faqs.org/rfcs/rfc3043.html">RFC3043, Personal Internet Name (PIN): A URN Namespace for People and Organizations</a> -- isn't quite right either, these are examples of valid ways to extend an attribute that takes a URI as its value.
</p>
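<p>
If a search tool wanted to recover the person's name from a cite value written this way, the decoding step is small. This is only a sketch: the 'urn:name:' prefix is Jim's unregistered example, and the function name is hypothetical.
</p>

```python
from urllib.parse import unquote


def person_from_cite(cite):
    """Return the person named by a 'urn:name:' cite value, or None otherwise."""
    prefix = "urn:name:"
    if cite.startswith(prefix):
        # Percent-decoding turns %20 back into a space.
        return unquote(cite[len(prefix):])
    return None


print(person_from_cite("urn:name:Stefano%20Mazzocchi"))
```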
<p>
Of course that still left the other problem: 
</p>
<pre>
&lt;pre class="code" lang="python">
..
&lt;/pre>
</pre>
<p>
I think that for the next phase of this experiment, I should just bite the bullet and start writing nonstandard attributes -- such as 'lang' in this case -- into another namespace. For an author, as Jim points out, there's not a lot of extra friction or overhead. It could be as little as two extra characters:
<pre class="code" lang="xhtml">
&lt;pre class="code" e:lang="python">
..
&lt;/pre>
</pre>
The 'e' would be for 'experimental' -- mapped to what URI I don't yet know. As Jim rightly points out, the burden to process these experimental attributes would fall mainly on developers of authoring and search tools, not on users. Since I've got a couple of my own XML-aware search tools running now, I'll give this a whirl and see how it goes. Thanks to everybody who commented on this matter. I will continue to be interested to hear from people with ideas about how to strike the right balance.
</p>

</body>
</item>

<item num="a908">
<title>Confession time</title>
<date>2004/02/03</date>
<body>

<p>
It's time for a confession. I've been acting as though all this cool XPath search stuff I've been demonstrating for the past few weeks were based on plain vanilla XHTML. Well, it's not (quite) true. In general my point has been to illustrate two things:
<ol>
<li><p>That the XHTML equivalent of ordinary HTML content includes metadata (links, tables, images) that can be usefully exposed as XML.</p></li>
<li><p>That legal ways of enlarging the namespace used within HTML -- in particular, CSS class attributes -- can enhance this approach.</p></li>
</ol>
</p>
<p>
But in truth, as some have noticed, I've been cheating on XHTML a bit. Here's one cheat: in order to support this query -- <a href="http://udell.infoworld.com:8000?//pre[@class='code' and @lang='python']">Python snippets</a> -- I've been writing HTML like this:
<pre class="code" lang="xhtml">
&lt;pre class="code" lang="python">
...
&lt;/pre>
</pre>
The class="code" bit is OK, but there is no lang attribute defined for this purpose on the &lt;pre> element; I just made it up to support queries of this form. So far nobody has noticed or complained, but it's not right.
</p>
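<p>
For readers who want to try this kind of query locally, here is a sketch using Python's standard ElementTree module rather than my search service. ElementTree supports only a subset of XPath, so the two conditions are written as chained predicates instead of being joined with 'and'. The sample document is hypothetical.
</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document using the made-up lang attribute.
SAMPLE = """<body>
  <pre class="code" lang="python">import sys</pre>
  <pre class="code" lang="xhtml">markup here</pre>
  <pre>plain text</pre>
</body>"""

root = ET.fromstring(SAMPLE)

# Chained predicates: both attribute tests must hold for an element to match.
snippets = root.findall(".//pre[@class='code'][@lang='python']")
print([s.text for s in snippets])
```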
<p>
Here's another cheat. In order to support this query -- <a href="http://udell.infoworld.com:8000?//blockquote[contains(@cite, 'Stefano')]">quotes from Stefano</a> -- I've been writing HTML like this:
<pre class="code" lang="xhtml">
&lt;blockquote cite="Stefano Mazzocchi">
...
&lt;/blockquote>
</pre>
This isn't right either. The value of the cite attribute is supposed to be a URI, not somebody's name. In this case, a few people have noticed and complained. I'm willing to switch to the correct usage of cite, and since my content is in an XML database I can fix it backwards as well as forwards. But here's the thing: I still want to be able to search quotes by person, not by URI. And I'd like there to be a standard way for other people to write quotes that they, or I, can search by person.
</p>
<p>
More generally, there are zillions of such use cases which I don't think we can know in advance of discovering them. So I can't imagine proposing any specific extensions to XHTML that would accommodate such discovery. I can think of two general approaches, though. One might go like this:
<pre class="code" lang="xhtml">
&lt;blockquote 
  cite="http://www.betaversion.org/~stefano/linotype/news/35/" 
  X-who="Stefano Mazzocchi">
...
&lt;/blockquote>
</pre>
In other words, agree to allow a class of experimental attributes in a manner analogous to the experimental X- headers of SMTP.
</p>
<p>
Another might go like this:
<pre class="code" lang="xhtml">
&lt;blockquote xmlns:exp="http://XHTML-Experimental"
  cite="http://www.betaversion.org/~stefano/linotype/news/35/" 
  exp:who="Stefano Mazzocchi">
...
&lt;/blockquote>
</pre>
In other words, use another namespace for attributes carrying extra data intended to facilitate search and reuse.
</p>
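<p>
A search tool could read such a namespaced attribute without much ceremony. The sketch below uses Python's ElementTree, which exposes a namespaced attribute under a key of the form {namespace-URI}name. The second blockquote and its example.org URL are hypothetical, added only so the filter has something to exclude.
</p>

```python
import xml.etree.ElementTree as ET

EXP_NS = "http://XHTML-Experimental"

# The first blockquote mirrors the markup above; the second is hypothetical.
SAMPLE = """<body xmlns:exp="http://XHTML-Experimental">
  <blockquote cite="http://www.betaversion.org/~stefano/linotype/news/35/"
              exp:who="Stefano Mazzocchi">...</blockquote>
  <blockquote cite="http://example.org/" exp:who="Someone Else">...</blockquote>
</body>"""


def quotes_by(root, person):
    """Return blockquote elements whose exp:who attribute equals person."""
    key = "{%s}who" % EXP_NS
    return [q for q in root.iter("blockquote") if q.get(key) == person]


root = ET.fromstring(SAMPLE)
for quote in quotes_by(root, "Stefano Mazzocchi"):
    print(quote.get("cite"))
```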
<p>
I've hesitated to even raise this issue because, in my experience, it's the kind of thing that can just get bogged down in endless discussion and debate. So I've gone ahead and cheated a bit on XHTML in the service of what I think is a proper ambition: to find some workable middle ground between the unstructured real web that exists all around us and the structured Semantic Web that exists only in our imagination. Or rather, to suggest how the latter can emerge from the former. But to those of you who've wondered: yes, I do feel guilty about cheating, and I'd like to come clean. Are there ways to enlarge the carrying capacity of the HTML namespace without doing violence to the spec? And without inventing mechanisms too complex for writers of ordinary everyday documents, or too far removed from existing writing tools?
</p>

</body>
</item>

<item num="a907">
<title>Content-aware search</title>
<date>2004/02/02</date>
<body>

<p>
<blockquote cite="InfoWorld">
At InfoWorld's 2002 CTO Forum, Google co-founder Sergey Brin threw cold water on the idea of instrumenting content for intelligent search. "I'd rather make progress by having computers understand what humans write," he said, "than by forcing humans to write in ways that computers can understand." Brin's pragmatic stance sharply opposes the idealistic view of the Web's inventor, Tim Berners-Lee, who continues to evangelize his vision of a Semantic Web full of carefully encoded content that we can precisely search and fluidly recombine. My own humble contribution to this debate is a prototype search engine, now running on my Weblog, that tries to steer a middle course between the Scylla of simple fulltext search and the Charybdis of unwieldy tagging schemes and brittle ontologies. [Full story at <a href="http://www.infoworld.com/article/04/01/30/05OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
I keep trying out phrases to capture what I'm aiming for. One is 'dynamic categories,' another is 'interoperable content.' Probably neither will stick, because these only describe how to do something, not why. The why, of course, is productivity. 
</p>
<p>
The NY Times has an article today by Steve Lohr, entitled <a href="http://www.nytimes.com/2004/02/02/technology/02neco.html?pagewanted=all">Technology and Worker Efficiency</a>, in which <a href="http://www.google.com/search?q=%22john+seely+brown%22">John Seely Brown</a> makes the case for productivity very well. Here's a quote that sums up nicely what I also think is happening, and why I am optimistic:
</p>
<blockquote cite="New York Times">
<p>
John Seely Brown, former director of the Xerox Palo Alto Research Center, says he believes that recent changes in software technology could allow big gains in productivity and innovation. The opportunity, he says, is to move beyond the limitations of centralized systems for automating business operations, like enterprise resource systems. "Those systems are prisons," said Mr. Brown, who is scheduled to speak at today's conference.
</p>
<p>
The software plumbing of computing, Mr. Brown explains, is evolving, and so is Internet-based software for individual workers. Software systems built on Web standards, he said, can be used as pick-and-place building blocks, instead of the more formal hierarchical systems of the past.
</p>
<p>
Mr. Brown also points to the rapid development of what he calls "social software" like instant messaging, Weblogs, wikis (multi-user Weblogs) and peer-to-peer tools - all of which make it easier for workers to communicate and collaborate online, almost instantaneously.
</p>
<p>
The combined result, Mr. Brown said, is information technology that can amplify social interaction and enhance workers' understanding of what is happening around them. The benefit, he added, could be to increase their ability to "collectively improvise and innovate."
</p>
<p>
That is a key to productivity and peak performance, according to Mr. Brown. Business, he said, is a lot like soccer. In soccer, there are some set plays, but the best teams also display a wealth of effective improvisation based on the players' deep knowledge of one another. "It's the same in the best corporations or start-ups," he said. [<a href="http://www.nytimes.com/2004/02/02/technology/02neco.html?pagewanted=all">New York Times: Technology and Worker Efficiency, by Steve Lohr</a>]
</p>
</blockquote>
<p>
Kevin Werbach got the original sound bite on this: "Web services, Weblogs and WiFi are the new WWW." It was becoming clear in 2002, and is clearer now, that this is a recipe for the kinds of productivity gains that move the needle on the economic dial. However, it's frustratingly hard to be concrete about that squishy intersection between knowledge and collaboration.
</p>

</body>
</item>



<item num="a906">
<title>Exonerated feeds</title>
<date>2004/02/02</date>
<body>

<p>
Apologies to those of you whose feeds I incorrectly named in yesterday's (now updated) entry about RSS feed caching (or, rather, non-caching). I've revised the list. It does appear, though, that there is still a healthy percentage of my 200+ feeds that are not being cached. 
</p>
<p>
It strikes me that the normal methods of checking whether a feed is or is not cached are way, way too geeky for ordinary users. Here's a thought: could/should the <a href="http://www.feedvalidator.org/">feed validator</a> also report whether a feed is using one or another of the caching techniques, and warn if not?
</p>
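<p>
The check itself would be simple. As a sketch, assuming the validator already has the feed's HTTP response headers in a dictionary, it only needs to look for the two validators that conditional GET relies on. The function name and message wording are hypothetical, not anything the feed validator actually does.
</p>

```python
def caching_report(headers):
    """Summarize which cache validators appear in a dict of response headers."""
    # Header names are case-insensitive, so compare in lowercase.
    names = {name.lower() for name in headers}
    found = [h for h in ("etag", "last-modified") if h in names]
    if found:
        return "supports conditional GET via: " + ", ".join(found)
    return "warning: no ETag or Last-Modified header found"


print(caching_report({"ETag": '"abc123"', "Content-Type": "text/xml"}))
print(caching_report({"Content-Type": "text/xml"}))
```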

</body>
</item>



<item num="a905">
<title>RSS self-defense</title>
<date>2004/02/01</date>
<body>

<p>
Now that I'm accumulating my inbound feeds as XHTML, in order to database and search them, I find myself in the aggregator business, where I never planned to be. The tools I'm using to XHTML-ize my feeds are Mark Pilgrim's incredibly useful <a href="http://diveintomark.org/projects/feed_parser/">ultra-liberal feed parser</a> and the equally useful <a href="http://tidy.sourceforge.net/">HTML Tidy</a>, invented by <a href="http://www.w3.org/People/Raggett/">Dave Raggett</a>, and maintained by folks like <a href="http://www.google.com/search?q=%22charlie+reitzel%22">Charlie Reitzel</a>, one of CMS Watch's <a href="http://www.cmswatch.com/Features/PeopleWatch/FeaturedPeople/?feature_id=99">Twenty Leaders to Watch in 2004</a> (along with yours truly). 
</p>
<p>
Today I finally got around to using the <a href="http://www.google.com/search?q=etag+rss">ETag</a> and <a href="http://www.google.com/search?q=conditional+get+rss+if-modified-since">conditional GET (If-Modified-Since)</a> features of Mark Pilgrim's feed parser. (Apologies to my subscribees who, until now, have been treated impolitely by my indexer.) Of the <a href="http://weblog.infoworld.com/udell/gems/mySubscriptions.opml">200+ feeds</a> to which I subscribe, <s>fifty</s> 35 seem not to support either of these two bandwidth-saving techniques, which means they're probably getting battered unnecessarily by feedreaders. The victims are:
</p>
<pre class="realsmall">
<a href="http://fieldmethods.net/backend.php">http://fieldmethods.net/backend.php</a>
<a href="http://groups.yahoo.com/group/syndication/messages?rss=1&amp;viscount=15">http://groups.yahoo.com/group/syndication/messages?rss=1&amp;viscount=15</a>
<a href="http://matt.griffith.com/weblog/rss.xml">http://matt.griffith.com/weblog/rss.xml</a>
<a href="http://nhpr.org/view_rss">http://nhpr.org/view_rss</a>
<a href="http://royo.is-a-geek.com/siteFeeder/GetFeed.aspx?FeedId=43">http://royo.is-a-geek.com/siteFeeder/GetFeed.aspx?FeedId=43</a>
<a href="http://safari.oreilly.com/NewOnSafari.asp">http://safari.oreilly.com/NewOnSafari.asp</a>
<a href="http://today.java.net/pub/q/29?cs_rid=47">http://today.java.net/pub/q/29?cs_rid=47</a>
<a href="http://today.java.net/pub/q/weblogs_rss?x-ver=1.0">http://today.java.net/pub/q/weblogs_rss?x-ver=1.0</a>
<a href="http://usefulinc.com/edd/blog/rss">http://usefulinc.com/edd/blog/rss</a>
<a href="http://w3future.com/weblog/rss.xml">http://w3future.com/weblog/rss.xml</a>
<a href="http://w3future.com/weblog/staplerFeeds/dubinko.xml">http://w3future.com/weblog/staplerFeeds/dubinko.xml</a>
<a href="http://www.burtongroup.com/weblogs/jamielewis/rss.xml">http://www.burtongroup.com/weblogs/jamielewis/rss.xml</a>
<a href="http://www.eighty-twenty.net/blog?flav=rss">http://www.eighty-twenty.net/blog?flav=rss</a>
<a href="http://www.eod.com/devil/rss10.xml">http://www.eod.com/devil/rss10.xml</a>
<a href="http://www.fuzzyblog.com/rss.php?version=2.0">http://www.fuzzyblog.com/rss.php?version=2.0</a>
<a href="http://www.g2bgroup.com/blog/rss.xml">http://www.g2bgroup.com/blog/rss.xml</a>
<a href="http://www.gonze.com/index.cgi?flav=rss">http://www.gonze.com/index.cgi?flav=rss</a>
<a href="http://www.gotdotnet.com/team/dbox/rssex.aspx">http://www.gotdotnet.com/team/dbox/rssex.aspx</a>
<a href="http://www.gotdotnet.com/team/tewald/rss.aspx?version=0.91">http://www.gotdotnet.com/team/tewald/rss.aspx?version=0.91</a>
<a href="http://www.intertwingly.net/wiki/pie/RecentChanges?action=rss_rc">http://www.intertwingly.net/wiki/pie/RecentChanges?action=rss_rc</a>
<a href="http://www.lucidus.net/blog/rss.cfm">http://www.lucidus.net/blog/rss.cfm</a>
<a href="http://www.markbaker.ca/2002/09/Blog/index.rss">http://www.markbaker.ca/2002/09/Blog/index.rss</a>
<a href="http://www.mobilewhack.com/index.rss">http://www.mobilewhack.com/index.rss</a>
<a href="http://www.neward.net/ted/weblog/rss.jsp">http://www.neward.net/ted/weblog/rss.jsp</a>
<a href="http://www.newsisfree.com/HPE/xml/newchannels.xml">http://www.newsisfree.com/HPE/xml/newchannels.xml</a>
<a href="http://www.openlinksw.com/blog/~kidehen/gems/rss.xml">http://www.openlinksw.com/blog/~kidehen/gems/rss.xml</a>
<a href="http://www.oreillynet.com/cs/xml/query/q/295?x-ver=1.0">http://www.oreillynet.com/cs/xml/query/q/295?x-ver=1.0</a>
<a href="http://www.pepysdiary.com/syndication/rss.php">http://www.pepysdiary.com/syndication/rss.php</a>
<a href="http://www.photo-mark.com/cgi-bin/rss2.cgi?set_id=16">http://www.photo-mark.com/cgi-bin/rss2.cgi?set_id=16</a>
<a href="http://www.pipetree.com/qmacro/xml">http://www.pipetree.com/qmacro/xml</a>
<a href="http://www.testing.com/cgi-bin/blog/index.rss">http://www.testing.com/cgi-bin/blog/index.rss</a>
<a href="http://www.voidstar.com/module.php?mod=blog&amp;op=feed&amp;name=jbond">http://www.voidstar.com/module.php?mod=blog&amp;op=feed&amp;name=jbond</a>
<a href="http://www.xmldatabases.org/WK/blog?t=rss20">http://www.xmldatabases.org/WK/blog?t=rss20</a>
<a href="http://www.xmlhack.com/rss.php">http://www.xmlhack.com/rss.php</a>
<a href="http://www.zope.org/SiteIndex/news.rss">http://www.zope.org/SiteIndex/news.rss</a>
</pre>
<p><b>Update</b>:
This list is 15 shorter than it was last night. Greg Reinacker wrote to point out that <a href="http://www.rassoc.com/gregr/weblog/rss.aspx">his feed</a> does emit the ETag header. I checked, and my original list included feeds that lacked either one of the two ways to tell the client a feed hasn't changed. But so long as one is in effect, you're OK. Now the list should include only feeds that support neither method, and that as a result cannot return the HTTP '304 Not Modified' response that lets a feedreader skip an unnecessary fetch of an unchanged feed.
</p>
<p>
Here's a brief summary of the two methods. First, a site that supports ETag (but not Last-Modified), namely Greg's:
</p>
<pre>
1. First fetch of Greg's feed:
 
GET /gregr/weblog/rss.aspx HTTP/1.1
 
2. Etag response:
 
HTTP/1.x 200 OK
Date: Mon, 02 Feb 2004 14:17:01 GMT
Server: Microsoft-IIS/6.0
Etag: "632104748500000000"
 
3. Second fetch of Greg's feed:
 
GET /gregr/weblog/rss.aspx HTTP/1.1
If-None-Match: "632104748500000000"
 
4. 304 response:
 
HTTP/1.x 304 Not Modified
</pre>
<p>
Now here's a site that supports Last-Modified (but not ETag):
</p>
<pre>
1. First fetch of David's feed
  
GET /index.xml HTTP/1.1
Host: www.davidgalbraith.org
  
2. Last-Modified response
  
HTTP/1.x 200 OK
Server: Zeus/4.2
Last-Modified: Mon, 02 Feb 2004 02:02:55 GMT
  
3. Second fetch of David's feed
  
GET /index.xml HTTP/1.1
If-Modified-Since: Mon, 02 Feb 2004 02:02:55 GMT
  
4. 304 response
  
HTTP/1.x 304 Not Modified
</pre>
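<p>
The client side of both exchanges can be sketched in a few lines of Python. This is a minimal illustration, not the feed parser's actual code: the function names <tt>conditional_headers</tt> and <tt>needs_reparse</tt> are hypothetical, and the sketch assumes the client has saved the ETag and Last-Modified values from the previous 200 response.
</p>

```python
def conditional_headers(etag=None, last_modified=None):
    """Build the validator headers for a repeat fetch of a feed.

    Both arguments are values saved from an earlier 200 response
    (hypothetical storage; how they are persisted is up to the client).
    """
    headers = {}
    if etag:
        # Echo the ETag back exactly as the server sent it, quotes included.
        headers["If-None-Match"] = etag
    if last_modified:
        # Echo the date back verbatim rather than reformatting it.
        headers["If-Modified-Since"] = last_modified
    return headers


def needs_reparse(status):
    """A 304 means the cached copy is still good; anything else is new content."""
    return status != 304
```

<p>
The resulting dictionary can be passed as the <tt>headers</tt> argument of <tt>urllib.request.Request</tt>. If neither value was saved, the dictionary is empty and the request degrades to an ordinary unconditional GET.
</p>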
<p>
And finally, here's a site from the list above, supporting neither method:
</p>
<pre>
1. First request:
  
GET /syndication/rss.php HTTP/1.1
Host: www.pepysdiary.com
  
2. Response includes neither Etag nor Last-Modified
  
HTTP/1.x 200 OK
Server: Apache/1.3.19 (Unix) PHP/4.0.4pl1
Transfer-Encoding: chunked
Content-Type: text/html
  
3. Second request:
  
GET /syndication/rss.php HTTP/1.1
Host: www.pepysdiary.com
  
4. Unchanged feed sent again:
  
HTTP/1.x 200 OK
Server: Apache/1.3.19 (Unix) PHP/4.0.4pl1
Transfer-Encoding: chunked
Content-Type: text/html
</pre>
<p>
If you're curious about which of these cases applies to your feed, one way to check is to use Mozilla's <a href="http://livehttpheaders.mozdev.org/">LiveHTTPHeaders</a> extension, which is in fact how I took these snapshots.
</p>
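<p>
For those who would rather script the check, here is a rough Python sketch of the classification I did by eye. The function names <tt>caching_support</tt> and <tt>report</tt> are made up for illustration, and the input is assumed to be a plain dictionary of response headers like the ones captured above. Note that the presence of a validator header only suggests that the server will answer a conditional request with a 304; the sketch does not verify that.
</p>

```python
def caching_support(response_headers):
    """Return the names of the cache validators present in a response."""
    # HTTP header names are case-insensitive, so normalize before comparing.
    present = {name.lower() for name in response_headers}
    found = []
    if "etag" in present:
        found.append("ETag")
    if "last-modified" in present:
        found.append("Last-Modified")
    return found


def report(url, response_headers):
    """Produce a one-line message about a feed's caching support."""
    found = caching_support(response_headers)
    if found:
        return f"{url}: supports {' and '.join(found)}"
    return f"{url}: warning, no ETag or Last-Modified, so every poll refetches the feed"
```

<p>
Fed the headers from the three snapshots above, this would report ETag for Greg's feed, Last-Modified for David's, and a warning for the Pepys feed.
</p>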

</body>
</item>


<item num="a904">
<title>Paul Venezia's masterful Linux 2.6 review</title>
<date>2004/02/01</date>
<body>

<p>
Hats off to Paul Venezia for his exhaustive analysis of the Linux 2.6 kernel in this week's InfoWorld:
</p>
<blockquote cite="InfoWorld">
Will the new Linux really perform in the same league as the big boys? To find out, I put the v2.6.0 kernel through several real-world performance tests, comparing its file server, database server, and Web server performance with a recent v2.4 series kernel, v2.4.23. [<a href="http://www.infoworld.com/infoworld/article/04/01/30/05FElinux_1.html">InfoWorld: Linux v2.6 scales the enterprise, Paul Venezia</a>]
</blockquote>
<p>
Paul's not kidding: he went to the mat on this one. In a <a href="http://www.infoworld.com/article/04/01/30/05FElinuxdev_1.html">sidebar</a> on the kernel development process, Paul notes that he twice went to the Linux Kernel Mailing List with what seemed to be -- and in fact were -- bugs. Here's <a href="http://testing.lkml.org/slashdot.php?mid=429770">the first LKML thread</a>, and here's <a href="http://testing.lkml.org/slashdot.php?mid=430810">the second</a>. Nice going!
</p>


</body>
</item>


<item num="a903">
<title>Analyzing blog content</title>
<date>2004/01/31</date>
<body>

<p>
Suppose that we bloggers, collectively, wanted to migrate toward HTML coding and CSS styling conventions that would make our content more interoperable. Since none of us is starting from a clean slate, we'd need to analyze current practice. Well, now we can. Here, for example, is a concordance of use cases for HTML elements with class attributes, drawn from the database I'm building:
</p>

<div style="border-style: solid; border-width: thin; padding: 10px; margin: 2em 6em">
<p><b>&lt;a class="Troll"></b>
<ol>
<li>OLDaily: <a href="http://www.csmonitor.com/2004/0127/p11s01-legn.html">Theory in Chaos</a></li>
</ol></p>
<p><b>&lt;a class="listLinkLrg"></b>
<ol>
<li>Kingsley Idehen's Blog: <a href="http://www.openlinksw.com:80/blog/~kidehen/?id=442">Enterprise Databases get a grip on XML</a></li>
</ol></p>
<p><b>&lt;a class="nodelink"></b>
<ol>
<li>Erik Benson: <a href="http://erikbenson.com/index.cgi?node=Pat%20Coa">Pat Coa</a></li>
</ol></p>
<p><b>&lt;a class="offlink"></b>
<ol>
<li>Erik Benson: <a href="http://erikbenson.com/index.cgi?node=Pat%20Coa">Pat Coa</a></li>
</ol></p>
<p><b>&lt;a class="regularArticleU"></b>
<ol>
<li>Jeroen Bekkers' Groove Weblog: <a href="http://radio.weblogs.com/0104207/2003/07/15.html#a780">Groove and Weblogs</a></li>
<li>Kingsley Idehen's Blog: <a href="http://www.openlinksw.com:80/blog/~kidehen/?id=442">Enterprise Databases get a grip on XML</a></li>
</ol></p>
<p><b>&lt;a class="weblogItemTitle"></b>
<ol>
<li>Seb's Open Research: <a href="http://radio.weblogs.com/0110772/2004/01/29.html#a1427">Mario dans Le Devoir</a></li>
</ol></p>
<p><b>&lt;blockquote class="posts"></b>
<ol>
<li>McGee's Musings: <a href="http://www.mcgeesmusings.net/2004/01/28.html#a3921">Russell Ackoff resources on systems thinking</a></li>
</ol></p>
<p><b>&lt;div class="Section1"></b>
<ol>
<li>Clemens Vasters: Indigo'ed: <a href="http://staff.newtelligence.net/clemensv/PermaLink.aspx?guid=c65cb06d-1d7b-4038-9121-3905799cb148">Back to Business</a></li>
</ol></p>
<p><b>&lt;div class="active1"></b>
<ol>
<li>s l a m: <a href="http://radio.weblogs.com/0104487/2003/03/19.html#a569">Countering The Bush Doctrine</a></li>
</ol></p>
<p><b>&lt;div class="blogtitle"></b>
<ol>
<li>McGee's Musings: <a href="http://www.mcgeesmusings.net/2004/01/28.html#a3921">Russell Ackoff resources on systems thinking</a></li>
</ol></p>
<p><b>&lt;div class="caption"></b>
<ol>
<li>Joi Ito's Web: <a href="http://joi.ito.com/archives/2004/01/28/with_bloggers_inside_davos_secrets_are_out_iht_article.html">With bloggers inside, Davos secrets are out - IHT article</a></li>
<li>Windley's Enterprise Computing Weblog: <a href="http://www.windley.com/2004/01/14.html#a992">Toysight</a></li>
</ol></p>
<p><b>&lt;div class="comment"></b>
<ol>
<li>Organic BPEL: <a href="http://weblog.infoworld.com/udell/">Avalon is NOT representing the convergence between the Web and GUI!</a></li>
</ol></p>
<p><b>&lt;div class="date"></b>
<ol>
<li>Comments for Jon's Radio: <a href="http://radiocomments.userland.com/comments?u=100887&amp;p=900&amp;link=http://weblog.infoworld.com/udell/2004/01/27.html#900">None</a></li>
</ol></p>
<p><b>&lt;div class="inlineimage"></b>
<ol>
<li>Joi Ito's Web: <a href="http://joi.ito.com/archives/2004/01/28/with_bloggers_inside_davos_secrets_are_out_iht_article.html">With bloggers inside, Davos secrets are out - IHT article</a></li>
<li>Windley's Enterprise Computing Weblog: <a href="http://www.windley.com/2004/01/14.html#a992">Toysight</a></li>
</ol></p>
<p><b>&lt;div class="node"></b>
<ol>
<li>s l a m: <a href="http://radio.weblogs.com/0104487/2003/03/19.html#a569">Countering The Bush Doctrine</a></li>
</ol></p>
<p><b>&lt;div class="personquote"></b>
<ol>
<li>Joi Ito's Web: <a href="http://joi.ito.com/archives/2004/01/28/with_bloggers_inside_davos_secrets_are_out_iht_article.html">With bloggers inside, Davos secrets are out - IHT article</a></li>
</ol></p>
<p><b>&lt;div class="posts"></b>
<ol>
<li>McGee's Musings: <a href="http://www.mcgeesmusings.net/2004/01/28.html#a3921">Russell Ackoff resources on systems thinking</a></li>
</ol></p>
<p><b>&lt;li class="MsoNormal"></b>
<ol>
<li>Hillel Cooperman: <a href="None">None</a></li>
<li>Rob Howard's Blog: <a href="http://weblogs.asp.net/rhoward/archive/2003/11/18/38446.aspx">Continued...</a></li>
<li>cbrumme's WebLog: <a href="http://blogs.msdn.com/cbrumme/archive/2003/05/17/51445.aspx">Memory Model</a></li>
</ol></p>
<p><b>&lt;p class="ArticleBody"></b>
<ol>
<li>Telematique, water and fire.: <a href="http://www.telematica.com/blog/2003/12/17.html#a247">Server vendors launch management initiative</a></li>
</ol></p>
<p><b>&lt;p class="MsoNormal"></b>
<ol>
<li>Luann Udell / Durable Goods: <a href="http://www.durable-goods.com/blog/2004/01/09.html#a17">Myth #3 about Artists</a></li>
<li>Clemens Vasters: Indigo'ed: <a href="http://staff.newtelligence.net/clemensv/PermaLink.aspx?guid=c65cb06d-1d7b-4038-9121-3905799cb148">Back to Business</a></li>
<li>Rob Howard's Blog: <a href="http://weblogs.asp.net/rhoward/archive/2003/11/18/38298.aspx">Last post on the topic -- at least for now!</a></li>
<li>cbrumme's WebLog: <a href="http://blogs.msdn.com/cbrumme/archive/2003/05/17/51445.aspx">Memory Model</a></li>
</ol></p>
<p><b>&lt;p class="blogtitle"></b>
<ol>
<li>McGee's Musings: <a href="http://www.mcgeesmusings.net/2004/01/28.html#a3921">Russell Ackoff resources on systems thinking</a></li>
</ol></p>
<p><b>&lt;p class="code"></b>
<ol>
<li>Duncan Wilcox's weblog: <a href="http://duncan.focuseek.com/2003/01/tagsoup/">Tag Soup</a></li>
</ol></p>
<p><b>&lt;p class="editorial"></b>
<ol>
<li>MobileWhack: <a href="http://www.mobilewhack.com/handset/sonyericsson/z600/z600_accessories.html">Z600 Accessories, Accessories, Accessories</a></li>
</ol></p>
<p><b>&lt;p class="imagelink"></b>
<ol>
<li>Kevin Lynch: <a href="http://www.klynch.com/archives/000043.html">Intel Centrino</a></li>
</ol></p>
<p><b>&lt;p class="posts"></b>
<ol>
<li>McGee's Musings: <a href="http://www.mcgeesmusings.net/2004/01/28.html#a3921">Russell Ackoff resources on systems thinking</a></li>
</ol></p>
<p><b>&lt;p class="q"></b>
<ol>
<li>Duncan Wilcox's weblog: <a href="http://duncan.focuseek.com/2002/11/trustingcorporations/">Trusting Corporations</a></li>
</ol></p>
<p><b>&lt;p class="text"></b>
<ol>
<li>Hillel Cooperman: <a href="None">None</a></li>
</ol></p>
<p><b>&lt;p class="times"></b>
<ol>
<li>Telematique, water and fire.: <a href="http://www.telematica.com/blog/2004/01/12.html#a256">Metro AG and their RFID Plan</a></li>
</ol></p>
<p><b>&lt;span class="artText"></b>
<ol>
<li>Kingsley Idehen's Blog: <a href="http://www.openlinksw.com:80/blog/~kidehen/?id=442">Enterprise Databases get a grip on XML</a></li>
</ol></p>
<p><b>&lt;span class="bodytext"></b>
<ol>
<li>Seb's Open Research: <a href="http://radio.weblogs.com/0110772/2004/01/28.html#a1423">Kottke: Guidelines for learning</a></li>
</ol></p>
<p><b>&lt;span class="byline"></b>
<ol>
<li>McGee's Musings: <a href="http://www.mcgeesmusings.net/2004/01/28.html#a3921">Russell Ackoff resources on systems thinking</a></li>
</ol></p>
<p><b>&lt;span class="closed"></b>
<ol>
<li>s l a m: <a href="http://radio.weblogs.com/0104487/2003/03/19.html#a569">Countering The Bush Doctrine</a></li>
</ol></p>
<p><b>&lt;span class="imagelink"></b>
<ol>
<li>Kevin Lynch: <a href="http://www.klynch.com/archives/000058.html">Adam Bosworth on Service Architecture</a></li>
</ol></p>
<p><b>&lt;span class="nxml-attribute-local-name"></b>
<ol>
<li>darcusblog: <a href="http://netapps.muohio.edu/movabletype/archives/darcusb/darcusb/000120.html">Names (again)</a></li>
</ol></p>
<p><b>&lt;span class="nxml-attribute-value"></b>
<ol>
<li>darcusblog: <a href="http://netapps.muohio.edu/movabletype/archives/darcusb/darcusb/000120.html">Names (again)</a></li>
</ol></p>
<p><b>&lt;span class="nxml-attribute-value-delimiter"></b>
<ol>
<li>darcusblog: <a href="http://netapps.muohio.edu/movabletype/archives/darcusb/darcusb/000120.html">Names (again)</a></li>
</ol></p>
<p><b>&lt;span class="nxml-element-local-name"></b>
<ol>
<li>darcusblog: <a href="http://netapps.muohio.edu/movabletype/archives/darcusb/darcusb/000120.html">Names (again)</a></li>
</ol></p>
<p><b>&lt;span class="nxml-tag-delimiter"></b>
<ol>
<li>darcusblog: <a href="http://netapps.muohio.edu/movabletype/archives/darcusb/darcusb/000120.html">Names (again)</a></li>
</ol></p>
<p><b>&lt;span class="nxml-tag-slash"></b>
<ol>
<li>darcusblog: <a href="http://netapps.muohio.edu/movabletype/archives/darcusb/darcusb/000120.html">Names (again)</a></li>
</ol></p>
<p><b>&lt;span class="nxml-text"></b>
<ol>
<li>darcusblog: <a href="http://netapps.muohio.edu/movabletype/archives/darcusb/darcusb/000120.html">Names (again)</a></li>
</ol></p>
<p><b>&lt;span class="o"></b>
<ol>
<li>ongoing: <a href="http://www.tbray.org/ongoing/When/200x/2004/01/19/HeresGenx">Genx</a></li>
</ol></p>
<p><b>&lt;span class="ofp"></b>
<ol>
<li>Seb's Open Research: <a href="http://radio.weblogs.com/0110772/2004/01/25.html#a1414">None</a></li>
</ol></p>
<p><b>&lt;span class="rss:item"></b>
<ol>
<li>Blogging Alone: <a href="http://radio.weblogs.com/0104704/2004/01/03.html#a1252">None</a></li>
</ol></p>
<p><b>&lt;span class="storyHead"></b>
<ol>
<li>Jeroen Bekkers' Groove Weblog: <a href="http://radio.weblogs.com/0104207/2003/06/11.html#a760">Disruptive in no small measure</a></li>
</ol></p>
<p><b>&lt;span class="text"></b>
<ol>
<li>s l a m: <a href="http://radio.weblogs.com/0104487/2003/03/19.html#a569">Countering The Bush Doctrine</a></li>
</ol></p>
<p><b>&lt;span class="title"></b>
<ol>
<li>Blogging Alone: <a href="http://radio.weblogs.com/0104704/2004/01/03.html#a1252">None</a></li>
</ol></p>
<p><b>&lt;span class="topstoryhead"></b>
<ol>
<li>Dive into BC4J: <a href="http://radio.weblogs.com/0118231/2004/01/15.html#a219">BC4J Mentioned in the Latest Article in the OTN Architecture Series</a></li>
</ol></p>
<p><b>&lt;ul class="noindent"></b>
<ol>
<li>Corante: Social Software: <a href="http://www.corante.com/many/20030901.shtml#51897">Friendster notes</a></li>
<li>Web Voice: <a href="http://webvoice.blogspot.com/archives/2004_01_01_webvoice_archive.html#107452126963625953">And now for something different</a></li>
<li>Dan Gillmor's eJournal: <a href="http://weblog.siliconvalley.com/column/dangillmor/archives/001733.shtml">Electronic Voting: An Insecure Mess, but Full Speed Ahead</a></li>
</ol></p>
</div>

<p>
With only a few days' worth of accumulated content, I wouldn't dare to venture any recommendations about these use cases. But as the picture develops over time, we might start to see opportunities for convergence.
</p>
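<p>
The idea behind the concordance can be sketched with Python's standard <tt>xml.etree.ElementTree</tt> module. This is an illustration only, not the code that produced the listing above: the function names and the two hand-built sample items are hypothetical, and the sketch assumes each item's body has already been converted to well-formed XML.
</p>

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def class_concordance(items):
    """Map (tag, class) pairs to the titles of the items that use them."""
    usage = defaultdict(list)
    for title, root in items:
        seen = set()
        for element in root.iter():
            css_class = element.get("class")
            if css_class is None:
                continue
            key = (element.tag, css_class)
            if key not in seen:
                # List each item only once per use case.
                seen.add(key)
                usage[key].append(title)
    return dict(usage)


def sample_items():
    """Two hand-built bodies standing in for XHTML-ized feed entries."""
    first = ET.Element("body")
    caption = ET.SubElement(first, "div", {"class": "caption"})
    ET.SubElement(caption, "p", {"class": "MsoNormal"})
    second = ET.Element("body")
    ET.SubElement(second, "p", {"class": "MsoNormal"})
    ET.SubElement(second, "p", {"class": "MsoNormal"})
    return [("Item A", first), ("Item B", second)]
```

<p>
Sorting the keys of the resulting dictionary gives the element-then-class ordering used in the listing above.
</p>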
<p><b>Update</b>:
I've been hoping for some external validation of this approach, and Giulio Piancastelli provides it today. As part of a much longer posting with lots of detailed technical analysis of RDF-oriented techniques, he writes:
</p>
<blockquote cite="Giulio Piancastelli">
<p>
What Jon is searching for, I think, is a good
balance between the cost of providing metadata and the benefits gained
by working on the provided metadata, while trying not to entirely move
away from the web world as we know it. In fact, this is probably the
most important characteristic of Jon's experiment: he is working with
what he is able to find right now, that is lots of HTML documents,
which can be converted to be well-formed XML quite easily, and then
searched by means of XPath. While these are ubiquitous technologies,
it's difficult to find RDF files spreaded around as such: proving that
the RDF world is query-enabled, stating that the right place where to
put metadata are RDF files because you can probably get higher quality
and more complete results is useless if there are little or no data to
query.</p>
<p>From my personal perspective, I see those two worlds, one working
with XML and XPath, the other messing around with RDF and RDQL, still
very far from each other. Jon's experiment is helping us to become
conscious of the fact we <em>already</em> are on a metadata path as
far as web content is concerned: XML and XPath are probably the first
steps in this journey, leading us to a more semantic web augmented with
technologies which nowadays seems not to be successful, but that will
hopefully prove to be useful when more complex needs arise. We can only
hope the <a shape="rect" title="The forest and the trees" href="http://weblog.infoworld.com/udell/2004/01/25.html#a896">virtuous cycle</a> will start to spin soon.</p>
[<a href="http://www.mycgiserver.com/~gpiancastelli/archives.jsp?post=0063">Through the blogging-glass</a>]
</blockquote>
<p>
Amen. Thanks, Giulio!
</p>

</body>
</item>

<item num="a902">
<title>More fun with queries</title>
<date>2004/01/30</date>
<body>

<p>
I should probably get a life, but instead I can't stop myself from writing more new queries against my growing database of well-formed blog content. Here are some queries that find the following things in the last few days' worth of my inbound RSS feeds:
</p>
<p>
<a href="http://udell.infoworld.com:8001?//p//a[contains(./@href, 'apple.com')]">paragraphs containing links to apple.com</a>
</p>
<p>
<a href="http://udell.infoworld.com:8001?//p[contains(.//a/@href, 'apple.com') and contains(., 'XSLT')]">paragraphs that contain links to apple.com and mention 'XSLT'</a>
</p>
<p>
<a href="http://udell.infoworld.com:8001?//p[contains (., 'Orkut') and (ancestor::item/date = '2004/01/30')]">paragraphs in items posted today that mention 'Orkut'</a>
</p>
<p>
<a href="http://udell.infoworld.com:8001?//item[contains(./@channel, 'Joi Ito') or contains(./@channel, 'Joho')]//body[contains (., 'Orkut') and contains(ancestor::item/date, '2004/01')]">January items, posted by Joi Ito or David Weinberger, that mention 'Orkut'</a>
</p>
<p>
<a href="http://udell.infoworld.com:8001?//table//td[contains(., 'zipcode')]/ancestor::body">items containing tables with cells that mention 'zipcode'</a>
</p>
<p>
<a href="http://udell.infoworld.com:8001?//a[contains(./@href, 'amazon.com') and contains(./img/@src, 'amazon.com')]">links to amazon.com that also contain images from amazon.com</a>
</p>
<p>
Either I am crazy, or this is way cool. Or both.
</p>
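<p>
For readers without an XPath engine handy, here is a rough Python approximation of the second query, using only the standard <tt>xml.etree.ElementTree</tt> module (whose built-in XPath subset lacks <tt>contains()</tt>). The function names and the hand-built sample item are hypothetical. One difference worth flagging: in XPath 1.0, <tt>contains(.//a/@href, ...)</tt> tests only the first link in each paragraph, whereas this sketch tests every link.
</p>

```python
import xml.etree.ElementTree as ET


def sample_item():
    """Build one tiny item by hand; a stand-in for XHTML-ized feed content."""
    item = ET.Element("item", {"channel": "Example Blog"})
    body = ET.SubElement(item, "body")
    p1 = ET.SubElement(body, "p")
    p1.text = "Notes on XSLT from "
    a1 = ET.SubElement(p1, "a", {"href": "http://www.apple.com/xslt"})
    a1.text = "Apple"
    p2 = ET.SubElement(body, "p")
    p2.text = "Nothing to see here."
    return item


def paragraphs_linking_to(root, domain, word=None):
    """Roughly //p[contains(.//a/@href, domain) and contains(., word)]."""
    hits = []
    for p in root.iter("p"):
        hrefs = [a.get("href", "") for a in p.iter("a")]
        links_match = any(domain in href for href in hrefs)
        # The string value of an element is all of its descendant text.
        text = "".join(p.itertext())
        if links_match and (word is None or word in text):
            hits.append(p)
    return hits
```

<p>
Calling <tt>paragraphs_linking_to(sample_item(), "apple.com", "XSLT")</tt> returns the first paragraph only.
</p>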

</body>
</item>

<item num="a901">
<title>Structured search, phase two</title>
<date>2004/01/29</date>
<body>

<p>
The next phase of my structured search project is coming to life. For the new version I'm parsing all 200+ of the RSS feeds to which I subscribe, XHTML-izing the content, storing it in Berkeley DB XML, and exposing it to the same kinds of searches I've been applying to my own content. Here's a taste of the kinds of queries that are now possible:
</p>
<p>
<a href="http://udell.infoworld.com:8001/?//item[contains(./@channel, 'Dare Obasanjo')]//blockquote">quotes from Dare Obasanjo</a>
</p>
<p>
<a href="http://udell.infoworld.com:8001/?//item[contains(./@channel, 'ongoing')]//body//a">links from Tim Bray</a>
</p>
<p>
<a href="http://udell.infoworld.com:8001/?//item[contains(./@channel, 'inessential.com')]//body//a[contains(./@href, 'infoworld.com')]">links from Brent Simmons to InfoWorld.com</a>
</p>
<p>
<a href="http://udell.infoworld.com:8001/?//item[contains(./@channel, 'AKMA')]//p[contains(.//a/@href,'amazon.com')]">books mentioned by AKMA</a>
</p>
<p>
<a href="http://udell.infoworld.com:8001/?//item[contains(./@channel, 'Michael Rys')]//a[contains(./@href,'amazon.com') and contains(. , 'XQuery')]">books, with XQuery in the title, mentioned by Michael Rys</a>
</p>
<p>
The paint's not dry on this thing yet. I have yet to normalize the dates, and I'm still getting the hang of DB XML, but here are some things that become immediately obvious:
<ul>
<li><p>Feeds that deliver only partial content are at a disadvantage.</p></li>
<li><p>HTML Tidy is able to coerce a surprisingly large number of the feeds I take from HTML to XHTML.</p></li>
<li><p>Once coerced, they're addressable in terms of the elements you find in HTML: links, images, tables, quotes.</p></li>
</ul>
</p>
<p>
Until now, I've thought the major roadblock standing in the way of more richly structured content was the lack of easy-to-use XML writing tools. But maybe I've been wrong about that. If it's going to be practical to XHTML-ize what current HTML writing tools produce, maybe we can make a whole lot more progress than I thought by working toward CSS styling standards that will also provide hooks for more powerful searching.
</p>
<p>
At the very least, this will be a nice laboratory in which to experiment with a growing pool of XML content, using a variety of XML-capable databases. My hope, of course, is to offer a service that's as useful to you -- the writers of the blogs I'm reading, aggregating and searching -- as it is to me. And ideally, useful to you in ways that invite you to think about how to make what you write even more useful to all of us. We'll see how it goes. 
</p>

</body>
</item>

<item num="a900">
<title>.NET status check</title>
<date>2004/01/27</date>
<body>

<p>
There's been some pushback recently, in the .NET blogging community, about Microsoft's habit of living in the future. For example:
<blockquote cite="Michael Earls">
It is abundantly frustrating to be keeping up with you guys right now. We out here in the real world do not use Longhorn, do not have access to Longhorn (not in a way we can trust for production), and we cannot even begin to test out these great new technologies until version 1.0 (or 2.0 for those that wish to stay sane).  I know there's probably not a whole lot you can do, but this is a plea to you from someone "in the field".  My job is to work on the architecture team as well as implement solutions for a large-scale commercial website using .NET.  I use this stuff all day every day, but I use the  1.1 release bits.
<br/><br/>
Here's my point, enough with the "this Whidbey, Longhorn, XAML is so cool you should stop whatever it is you are doing and use it". Small problem, we can't. Please help us by remembering that we're still using the release bits, not the latest technology. [<a href="http://www.cerkit.com/cerkitBlog/PermaLink.aspx?guid=9ededd3b-a7a3-401a-9a74-63e048c5e68e">Michael Earls</a>]
</blockquote>
In the spirit of Michael's plea, I'm working on an upcoming article in which I'll compare what was promised for the .NET platform (er, framework), two and three years ago, with the reality today. Examples of the kinds of issues I want to consider:
</p>
<ol>
<li>
<p>
Easier deployment. The "end of DLL hell" was one of the early .NET battle cries. CLR metadata, enabling side-by-side execution, was going to make that problem go away. Well, has it? I hear a lot about ClickOnce deployment in Longhorn, but does the existing stuff work as advertised?
</p>
</li>
<li>
<p>
Unified programming model. It was obvious that wrapping years of crufty Win32 and COM APIs into clean and shiny .NET Framework classes, and then transitioning apps and services to that framework, wasn't going to happen overnight. But how much progress has been made to date?
</p>
</li>
<li>
<p>
Programming language neutrality. Here's a statement, from an early Jeff Richter <a href="http://msdn.microsoft.com/msdnmag/issues/0900/Framework/default.aspx">article about .NET</a>, that provoked oohs and ahhs at the time: "It is possible to create a class in C++ that derives from a class implemented in Visual Basic." Well, does anybody do this now? Is it useful? Meanwhile, the dynamic language support we were going to get, for the likes of Perl and Python, hasn't arrived. Why not?
</p>
</li>
<li>
<p>
Security. As security bulletin MS02-026 ("Unchecked buffer in ASP.NET Worker Process") made clear, not everything labeled ".NET" is managed. Still, there is a lot of .NET-based server code running now. Can we articulate the real benefits of .NET's evidence-based approach to code access security? And what have been the tradeoffs? For example, I've noticed that while .NET's machine.config adds a new layer of complexity to an environment, nothing is subtracted. You've still got Active Directory issues, NTFS issues, IIS metabase issues. How do we consolidate and simplify all this stuff?
</p>
</li>
<li>
<p>
XML web services. I'd say many of the original goals were met here. Of course the goalposts moved too. .NET Web Services, circa 2000, looked more like CORBA-with-angle-brackets than like service oriented architecture. But while Longhorn's Indigo aims for the latter target, it's worth considering how well the deployed bits are succeeding on their original terms.
</p>
</li>
<li>
<p>
XML universal canvas. I hoped the XML features of Office 2003 were going to deliver on this promise. But here, the jury's still out.
</p>
</li>
<li>
<p>
WebForms/WinForms. This is a tricky one. The original .NET roadmap charted two parallel courses for client-side developers, one for the rich client and one for the thin client. Or as we say lately: "rich versus reach." There wasn't a write-once strategy for combining the two -- and indeed, in Longhorn, there still isn't -- but it's probably useful to consider how the side-by-side strategy has played out.
</p>
</li>
<li>
<p>
Software as a service. Not much progress there, as Bill Gates acknowledged in a <a href="http://www.microsoft.com/billgates/speeches/2002/07-24netstrategy.asp">July 2002</a> speech in which he also lamented the failure of "building block services" -- what was envisioned as Hailstorm -- to emerge. What are the roadblocks here? Plenty of business and technical issues to consider.
</p>
</li>
<li>
<p>
Device neutrality. The Tablet PC has turned out to be a good platform for .NET apps. Phones and PDAs, less so, for reasons that will be interesting to explore.
</p>
</li>
<li>
<p>
User interface / personal information management. A bunch of important themes were sounded in the <a href="http://www.microsoft.com/billgates/speeches/2000/06-22f2k.asp">2000 .NET rollout speech</a>. Pub/sub notification. Attention management. Smart tags. Today, I'd argue, I'm getting a lot of these effects from blog culture and RSS. Going forward, Longhorn is the focus of the UI/PIM vision articulated for .NET. But living here in the present, as we do, it's worth considering which aspects of current .NET technology are making a difference on this front.
</p>
</li>
</ol>
<p>
Over the next week or so, I'd like to have conversations with people on all sides of these (and perhaps other, related) issues. I'll be speaking with various folks privately, but here's a <a href="http://radiocomments.userland.com/comments?u=100887&amp;p=900&amp;link=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2F2004%2F01%2F27.html%23a900">comment link</a> (<a href="http://weblog.infoworld.com/udell/gems/900.xml">rss</a>) for those who want to register opinions and/or provide feedback.
</p>

</body>
</item>


<item num="a899">
<title>Mindreef's SOAPscope 3.0</title>
<date>2004/01/26</date>
<body>

<p>
<a href="http://weblog.infoworld.com/udell/gems/mindreef.swf"><img align="right" vspace="6" hspace="6" alt="camtasia" src="http://weblog.infoworld.com/udell/gems/camtasia.gif"/></a>
Here's a <a href="http://weblog.infoworld.com/udell/gems/mindreef.swf">four-minute Flash movie</a> containing three segments from an online demo of the latest version of Mindreef's SOAPscope. The presenter is <a href="http://www.mindreef.com/company/team.html#frank">Frank Grossman</a>; a few others (including me) chime in occasionally. The segments are:
</p>
<ol>
<li><p>How SOAPscope integrates with the WS-I (Web Services Interoperability Organization) <a href="http://www-106.ibm.com/developerworks/webservices/library/ws-wsitest/?Open&amp;ca=daw-ws-news">test tools</a>.</p></li>
<li><p>How to invoke a WSDL service -- in this case, Microsoft's <a href="http://www.mindreef.net/soapscope/wsdldemo?referer=xmethods&amp;url=http://terraservice.net/TerraService.asmx?WSDL">TerraService</a> -- using SOAPscope to visualize inputs and outputs as pseudocode, and optionally modify and replay messages. You can <a href="http://www.mindreef.net/main/wsdlinvokeform?wsdlId=218&amp;service=0&amp;port=0&amp;operation=13">try this yourself</a> at XMethods.net, but the earlier version 2.0 of SOAPscope that's running there isn't as clever about converting enumerated types in the schema into picklists on the invocation form. </p></li>
<li><p>How SOAPscope 3.0 integrates with Visual Studio.NET.</p></li>
</ol>
<p>
Thanks to the Mindreef guys for playing along with this experiment, and to TechSmith for letting me test-drive <a href="http://www.techsmith.com/products/studio/">Camtasia Studio</a>. If folks think these off-the-cuff videos are useful, I'll try to do more of them. I'm involved in a lot of online demos, and showcasing them in this way is probably win/win both for the companies who present to me and for the readers of this blog. 
</p>
<p>
<b>Update</b>: Just as I was noticing a playback problem, Frank Grossman wrote to report the same thing. Camtasia uses a secondary .SWF file, launched from <a href="http://weblog.infoworld.com/udell/gems/mindreef.html">this HTML</a>, to control playback. Evidently, the idea is to make sure the movie plays at the correct screen size. But what I found, as did Frank, is that progressive playback of the video works only the first time through; on subsequent playbacks it fails. So now I'm pointing directly at the <a href="http://weblog.infoworld.com/udell/gems/mindreef.swf">primary .SWF file</a> which, if you're running at greater than 1024x768 (the resolution of the demo), should work fine. If you're running at 1024x768, though, you'll want to use F11 to maximize the Flash player. 
</p>
	
</body>
</item>

<item num="a898">
<title>The art and science of software testing</title>
<date>2004/01/26</date>
<body>

<p>
<blockquote cite="InfoWorld">
Test-driven development does require a lot of time and effort, which means something's got to give. One Java developer, Sue Spielman, sent <a href="http://weblogs.java.net/pub/wlg/532">a Dear John letter to her debugger</a> by way of her Weblog. "It seems over the last year or two we are spending less and less time with each other," she wrote. "How should I tell you this? My time is now spent with my test cases." 
<br/><br/>
Clearly that's a better use of time, but when up to half of the output of a full-blown TDD-style project can be test code, we're going to want to find ways to automate and streamline the effort. Agitar Software's forthcoming Java analyzer, Agitator, which was demonstrated to me recently and is due out this quarter, takes on that challenge. [Full story at <a href="http://www.infoworld.com/article/04/01/23/04OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
</p>

</body>
</item>

<item num="a897">
<title>Next-generation e-forms</title>
<date>2004/01/26</date>
<body>

<p>
<blockquote cite="InfoWorld">
E-forms, a technology that's been around for a long time, is now a hotbed of activity. Microsoft's XML-oriented InfoPath, which shipped with Office 2003 in October, is now deployed and in use. Adobe plans to ship a beta version of its PDF-and-XML-oriented forms designer in the first quarter of this year. And e-forms veterans such as PureEdge and Cardiff, whose offerings are built on an XML core, are lining up behind XForms, the e-forms standard that became an official W3C recommendation in October 2003. [Full story at <a href="http://www.infoworld.com/article/04/01/23/04FEforms_1.html">InfoWorld.com</a>]
</blockquote>
</p>

</body>
</item>


<item num="a896">
<title>The forest and the trees</title>
<date>2004/01/25</date>
<body>

<p>
<blockquote cite="Evan Lenz">
<p>The genius of Jon Udell's work is not sheer technical
innovation (not that TransQuery amounted to anything like that either)
but rather the ability to make sense of how such technologies can be
used in simple but powerful ways over compelling content.</p>
<p>And not getting lost in the trees.</p> [<a href="http://evan.pcseattle.org/archives/000122.html#000122">Evan Lenz</a>]
</blockquote>
I greatly appreciate Evan's kind words. Ironically, I've been asking myself the same questions about my current project that Evan asks himself, in his posting, about his earlier (and masterfully done) <a href="http://24.18.215.221:8080/xsltdb/?*xsl=demo/about.xsl">TransQuery</a> <a href="http://www.xmlportfolio.com/transquery/">project</a>: why doesn't it provoke the reaction I think it should? Not because my stuff is technically innovative, which it isn't. But rather because it shows how ubiquitous but underexploited technologies (XPath, XSLT, XHTML) can make our everyday information more useful.
</p>
<p>
<a href="http://safari.oreilly.com/0321180607"><img align="right" vspace="6" hspace="6" src="http://safari.oreilly.com/images/0321180607/0321180607_s.jpg"/></a>
Coincidentally, I'm now reading <a href="http://safari.oreilly.com/0321180607">XQuery from the Experts</a>, and am having a curiously mixed reaction to the book. The geek in me is irresistibly drawn to this Swiss-army-knife query language that so ambitiously straddles the realms of typed and untyped, hierarchical and relational, declarative and procedural. And I can't wait to use the corpus of XHTML blog content that I'm assembling to explore XQuery implementations, along with the XPath/XSLT techniques I've used so far.
</p>
<p>
On the other hand: so what? If I can't paint a picture of the forest that people can relate to, then planting a few more trees won't help. The notion of <a href="http://weblog.infoworld.com/udell/2004/01/15.html#a887">dynamic</a> <a href="http://weblog.infoworld.com/udell/2004/01/22.html#a894">categories</a> comes closest to answering the "so what?" question. But not close enough. When you work publicly, in blogspace, as I have been doing, reaction to your work is exquisitely measurable. And when I take the pulse of that reaction it's clear that I'm miles away from proving three points:
<ol>
<li><p>Ordinary Web content is already full of metadata,</p></li>
<li><p>which can enable powerful queries,</p></li>
<li><p>which, in turn, can motivate us to enrich the metadata.</p></li>
</ol>
As I begin to explore XQuery, I'll try to keep these guiding principles front and center. And if I wander off into the weeds, please feel free to administer a <a href="http://www.google.com/search?q=%22dope+slap%22">virtual dope slap</a>.
</p>

</body>
</item>


<item num="a895">
<title>Open source lock-in</title>
<date>2004/01/23</date>
<body>

<p>
<blockquote cite="InfoWorld">
With the release of MySQL 4.0, the licensing policy of the wildly popular open source database underwent a subtle change. The code libraries that client programs use to access the native MySQL API, formerly licensed under the LGPL (Lesser General Public License), were converted to the GPL. The LGPL was designed to exempt "nonfree" programs that link against open source libraries from the GPL's strong requirement to release source code. The purpose of the LGPL, according to the Free Software Foundation, is "to encourage the widest possible use of a certain library, so that it becomes a de-facto standard." And indeed, MySQL has become the database pillar of the so-called LAMP platform, whose acronym expands to Linux, Apache, MySQL, and the trio of Perl, Python, and PHP. [Full story at <a href="http://www.infoworld.com/article/04/01/16/03OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
Here's an interesting bit of backstory. As originally filed, my use of the terms LGPL and GPL in the lead paragraph was backwards. Not because I don't know the difference, but because it's so darned easy to get yourself mixed up when talking about this stuff. The error got past my own proofreading, and got by several editorial checks as well, but was fortunately caught before it went to print. I'm tempted to say that the complexity of open source licensing can make your eyes bleed, and that's true, but I guess it applies to all software licensing. Oracle, for example, is apparently now offering <a href="http://www.computerworld.com/databasetopics/data/software/story/0,10801,83053,00.html">licensing seminars</a> where you go to learn, not how to use Oracle, but how to pay for it. 
</p>
<p>
This week's column is only partly about licensing, though. It's also a cautionary tale about getting locked into database-specific access technologies. I referred to a posting by Kingsley Idehen which says, in part:
<blockquote cite="Kingsley Idehen">
I have been an ardent ODBC supporter since its inception simply because data is timelessly important, and ODBC provides a critical solution for separating application logic from data repositories.  There is a lot of SQL data driving mission critical business applications globally, and failure to comprehend ODBC's value proposition ultimately results in loss of control over Data, which is the foundation from which Information and Knowledge are derived.
<br/><br/>
You should never find yourself locked into any database vendor, programming language vendor, operating system vendor, or business application vendor, simply because you want to exploit your own data. [<a href="http://www.openlinksw.com/blog/~kidehen/?id=446">Kingsley Idehen</a>]
</blockquote>
</p>
<p>
As several readers correctly pointed out, there are different kinds of database-access lock-in. Technologies such as ODBC and JDBC are ways to avoid the kind of lock-in I was talking about in the column; we might call that "transport" or "access" lock-in. Of course, abstraction at this level doesn't help at all with another kind of lock-in; let's call that "SQL dialect lock-in." One reader, Jim Penny, argued forcefully in email (quoted with permission) that the former is far less worrisome than the latter:
<blockquote cite="Jim Penny">
The transport layer is so easy to replace that lockin at this level is hardly an issue. It should be, at most, a replacement of two or three routines and a re-linkage. In any Unix program that cares about database independence, it should be as easy as selecting some Makefile or autoconf options. (ODBC is a single transport layer, and locking in to it is hardly different in degree of lockin to locking in to any other transport layer, JDBC, libpq, etc.)
<br/><br/>
The common subset problem is much larger. It is really hard to write efficient, working SQL for multiple backends, especially if you don't know in advance what set of databases need to be supported. Nobody really attempts to be 100% compliant with the standard(s), everyone has extensions, missing features, and quirks.  I have in mind things like subselect support, 'in' support, representation of booleans and dates, cast format, blob support, and varchar support. In the more exotic realm, triggers, prepared statements, and procedural support vary wildly. 
<br/><br/>
If transport layer independence were truly important, more people would be using something like <a href="http://sqlrelay.sourceforge.net/sqlrelay/">SQL Relay</a>. SQL Relay offers other advantages, as well. [Jim Penny]
</blockquote>
</p>
<p>
Jim makes excellent points. (And SQL Relay, of which I'd not heard, sounds interesting.) I agree that SQL-dialect lock-in is even more pernicious than transport data-access lock-in. Although many useful apps can write to SQL's common subset, including (fortunately for me) some I've worked on, many others can't, and that's a huge problem. That said, I don't think transport neutrality is a non-issue. I'm as tempted as the next developer to think that if some simple hack will get me from transport A to transport B, then A and B are effectively the same. But really, they're not. What Microsoft understood very well about ODBC, as Kingsley has been saying for years, is that transport-neutral data access <i>from the desktop</i> would be a tremendous enabler if, and only if, it were always and automatically available.
</p>
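<p>
To make the two kinds of lock-in concrete, here is a minimal sketch that uses Python's DB-API as a stand-in for ODBC or JDBC. That substitution is an analogy of mine, not something from the column, and the <tt>items</tt> table, its columns, and the <tt>recent_titles</tt> function are hypothetical. The function accepts any DB-API connection, so the transport is swappable, but the SQL text inside it still commits to one dialect.
</p>

```python
import sqlite3


def recent_titles(conn, limit):
    """Return the newest titles using any DB-API 2.0 connection object."""
    cur = conn.cursor()
    # Transport-neutral so far, but the statement below is not dialect-neutral:
    # LIMIT is spelled differently elsewhere (TOP, FETCH FIRST), and other
    # DB-API drivers expect other placeholder styles, such as %s instead of ?.
    cur.execute(
        "SELECT title FROM items ORDER BY posted DESC LIMIT ?", (limit,)
    )
    return [row[0] for row in cur.fetchall()]


# Hypothetical sample data in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (title TEXT, posted TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [
        ("Open source lock-in", "2004-01-23"),
        ("XPath query tips", "2004-01-19"),
        ("Dynamic categories", "2004-01-15"),
    ],
)
titles = recent_titles(conn, 2)
print(titles)
```

<p>
Swapping the connection for one from another driver is the easy part. Rewriting the query string for that driver's backend is where the common-subset problem that Jim describes shows up.
</p>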

</body>
</item>


<item num="a894">
<title>How dynamic categories work</title>
<date>2004/01/22</date>
<body>

<p>
<blockquote cite="Jon Udell">
In the spirit of the lightweight browser-based solution, I decided to create an equally lightweight server-based version based on Python and libxml2/libxslt. (I'm also working on a slightly heftier, but more powerful variation based on Berkeley DB XML; we'll explore that one next time.) [<a href="http://www.xml.com/pub/a/2004/01/21/udell.html">O'Reilly Network</a>]
</blockquote>
</p>
<p>
This article spells out, in more detail than I've gone into here, an approach to <a href="http://weblog.infoworld.com/udell/2004/01/15.html#a887">dynamic categories</a>. During yesterday's roundtable at <a href="http://myst-technology.com/mysmartchannels/public/blog/15397">RSS WinterFest</a>, I mentioned one use of this technique: querying for <a href="http://udell.infoworld.com:8000/?//p[ancestor::item/date[contains(.,%20'2003/12')]%20and%20contains(.//a/@href,%20'.mov')]">items, by date, that include QuickTime movies</a>. Kevin Marks, who's now director of engineering for Technorati, pointed out, correctly, that there's nothing special about searching by date. What is special is a search that combines the sort of standard metadata captured by any content management system with what we might call "inline metadata" that emerges from the content itself.
</p>
<p>
A clearer example, because it involves only inline metadata, is this dynamic category for <a href="http://udell.infoworld.com:8000/?//p[contains(.//a/@href,'amazon.com')%20or%20contains(.//a/@href,'allconsuming')%20or%20contains(.//a/@href,'safari.oreilly.com')]">items related to books</a>. It's a content-aware query that returns paragraphs (along with links, images, and other markup) containing URLs to amazon.com or allconsuming.com, the two book sites I commonly refer to. As a matter of fact, when I wrote the query I forgot about a third book site I commonly refer to: Safari. When I amended the query accordingly, a few more items appeared in the category. Note also that, because the query is content-aware, it can return more context (for example, <a href="http://udell.infoworld.com:8000/?//body[contains(ancestor::item/date,%20'2003')%20and%20contains(.//a/@href,%20'amazon')%20]">entire items</a>), or less context (for example, <a href="http://udell.infoworld.com:8000/?//a[contains(./@href,'amazon.com')%20or%20contains(./@href,'allconsuming')%20or%20contains(./@href,'safari.oreilly.com')]">just links</a>), by adjusting its scope.
</p>
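<p>
The books category can also be sketched outside of XPath. What follows is a minimal illustration in Python, not the search script itself. It assumes items shaped like the ones in this blog's XML, and the helper names (<tt>make_item</tt>, <tt>book_paragraphs</tt>) and the sample links are made up for the example. One difference worth noting: in XPath 1.0, <tt>contains()</tt> applied to a node-set looks only at the first node, so the query above tests just the first link in each paragraph, whereas this sketch checks every link.
</p>

```python
import xml.etree.ElementTree as ET

# Substrings that mark a link as pointing at a book site.
BOOK_SITES = ("amazon.com", "allconsuming", "safari.oreilly.com")


def make_item(num, date, paragraphs):
    """Build a small item element; each paragraph is a list of link targets."""
    item = ET.Element("item", {"num": num})
    ET.SubElement(item, "date").text = date
    body = ET.SubElement(item, "body")
    for hrefs in paragraphs:
        p = ET.SubElement(body, "p")
        for href in hrefs:
            ET.SubElement(p, "a", {"href": href}).text = "link"
    return item


def book_paragraphs(root):
    """Return (item num, paragraph) pairs whose links point at a book site."""
    hits = []
    for item in root.iter("item"):
        for p in item.iter("p"):
            hrefs = [a.get("href", "") for a in p.iter("a")]
            if any(site in href for href in hrefs for site in BOOK_SITES):
                hits.append((item.get("num"), p))
    return hits


# Hypothetical sample data: three items, two of which mention books.
blog = ET.Element("blog")
blog.append(make_item("a1", "2004/01/10", [
    ["http://www.amazon.com/exec/obidos/ASIN/0226104036"],
    ["http://example.com/"],
]))
blog.append(make_item("a2", "2004/01/11", [["http://example.org/"]]))
blog.append(make_item("a3", "2004/01/12", [
    ["http://safari.oreilly.com/0321180607"],
]))

matches = book_paragraphs(blog)
print([num for num, _ in matches])
```

<p>
Widening or narrowing the scope, as the XPath variations do, would mean collecting the whole <tt>item</tt> or just the matching <tt>a</tt> elements instead of the paragraph.
</p>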
<p>
Now, since the mountain will not come to Mohammed, Mohammed will go to the mountain. By that I mean: if the majority of blogs to which I subscribe won't provide me with XHTML content to search, then I will endeavor to XHTML-ize the feeds that they do supply. The reason: to extend these dynamic categories across the whole set of blogs I read. Here's a preview of a books query against the last few days' worth of my inbound feeds:
</p>
<div style="border-style: solid; border-width: thin; padding: 10px; margin: 2em 6em">
<p><b><a href="http://evan.pcseattle.org/archives/000117.html">Writing is hard</a> (Evan @ PCSeattle.org: 2003-10-23T17:06:10-08:00)
</b></p><div><p><b>I:</b> That's all well and good. "Being true to yourself". Sounds like you've been listening to that <a href="http://www.amazon.com/exec/obidos/ASIN/1559943491">William Zinsser tape</a> that O'Reilly sent you.</p></div><hr width="20%" align="left"/><p></p><p><b><a href="http://evan.pcseattle.org/archives/000072.html">Computer Control</a> (Evan @ PCSeattle.org: 2003-02-10T01:28:51-08:00)

</b></p><div><p>And I read this in the context of being introduced to the "simplicity movement" with the help of <a href="http://www.amazon.com/exec/obidos/ASIN/0609809016">Living Simply with Children</a> and <a href="http://www.amazon.com/exec/obidos/ASIN/0140286780">Your Money or Your Life</a>, both of which I've only begun to read. And both of which are causing a stirring in my soul, as well as Lisa's.</p></div><hr width="20%" align="left"/><p></p><p><b><a href="http://joi.ito.com/archives/2004/01/18/writing_style_and_blogging.html">Writing style and blogging</a> (Joi Ito's Web: Sun, 18 Jan 2004 17:56:47 +0900)
</b></p><div><p>My favorite reference is the <a href="http://www.amazon.com/exec/obidos/ASIN/0226104036">Chicago Manual of Style</a>.</p></div><hr width="20%" align="left"/><p></p><p><b><a href="http://joi.ito.com/archives/2004/01/17/inequality_and_the_role_of_fitness_in_power_laws.html">Inequality and the role of "fitness" in power laws</a> (Joi Ito's Web: Sat, 17 Jan 2004 23:41:26 +0900)
</b></p><div><p>In <i><a href="http://www.amazon.com/exec/obidos/ASIN/0452284392">Linked</a></i>

Albert-Laszlo talks a lot about power laws and makes a few interesting
points. First of all, power laws on the web make two assumptions, that
the network is growing and that people tend to link to sites that have
the most links. Laszlo cites work by Paul Krapivsky and Sid Redner from
Boston University, working with Francois Leyvraz from Mexico,</p></div><hr width="20%" align="left"/><p></p><p><b><a href="http://weblog.infoworld.com/udell/2004/01/15.html%23a887">Dynamic categories</a> (Jon's Radio (full-length descriptions): 2004-01-15T09:42:26-05:00)
</b></p><div><p><a href="http://142.167.72.34:8000/?//p%5Bcontains%28.//a/@href,%27amazon.com%27%29%20or%20contains%28.//a/@href,%27allconsuming%27%29%5D">books</a>: //p[contains(.//a/@href,'amazon.com') or contains(.//a/@href,'allconsuming')]</p></div><hr width="20%" align="left"/><p></p><p><b><a href="http://www.25hoursaday.com/weblog/PermaLink.aspx%3Fguid%3D1fefe7fd-a07d-4706-96e1-f5d4cc9b413d">It's All About Your Point of "View"</a> (Dare Obasanjo aka Carnage4Life: Mon, 19 Jan 2004 06:30:09 GMT)
</b></p><div><p dir="ltr">Once an XML representation of the relevant
information users are interested has been designed (i.e. the XML schema
for books, reviews and wishlists that could be exposed by sites like <a href="http://www.amazon.com/"><strong><font size="1" color="#345877">Amazon</font></strong></a> or <a href="http://www.bn.com/"><strong><font size="1" color="#345877">Barnes &amp; Nobles</font></strong></a>) the next technical problem to be solved is uniform access mechanisms... Then there's deployment, adoption and evangelism...</p></div><hr width="20%" align="left"/><p></p><p><b><a href="http://www.25hoursaday.com/weblog/PermaLink.aspx%3Fguid%3D1fefe7fd-a07d-4706-96e1-f5d4cc9b413d">It's All About Your Point of "View"</a> (Dare Obasanjo aka Carnage4Life: Mon, 19 Jan 2004 06:30:09 GMT)

</b></p><div><p>A few days ago I got a response to this post from Michael Brundage, author of <a href="http://www.amazon.com/exec/obidos/tg/detail/-/0321165810/qid=1074493016/">XQuery : The XML Query Language</a> and
a lead developer of the XML&lt;-&gt;relational database technologies
the WebData XML team at Microsoft produces, on a possible solution to
this problem that doesn't require lots of disparate parties to agree on
schemas, data model or web service endpoints. Michael <a href="http://www.25hoursaday.com/weblog/CommentView.aspx?guid=dee93b81-0ace-4643-8595-c443689d5fb3">wrote</a></p></div><hr width="20%" align="left"/><p></p><p><b><a href="http://www.25hoursaday.com/weblog/PermaLink.aspx%3Fguid%3Dce0409d2-359c-4deb-8414-d20e346b571f">The Dork Watch Up Close</a> (Dare Obasanjo aka Carnage4Life: Thu, 15 Jan 2004 07:00:36 GMT)
</b></p><div><p>Today I picked up my rash and purely impulsive Christmas buy, a <a href="http://www.amazon.com/exec/obidos/ASIN/B000153MWW/ref=pd_sxp_elt_l1/104-8866171-0247936"><font color="#003399">Fossil Wrist.NET Smart watch</font></a>. It was
probably sub-consciously induced by the new kid who came to our school
(around 1977) with a calculator on his watch. No matter that it was
impossible to press any of the buttons to do even the most simple sums
and that this was tremendously useless, the fact that it was on a
watch with a calculator built in made it ultra cool and an instant
friend maker.</p></div><hr width="20%" align="left"/><p></p><p><b><a href="http://www.25hoursaday.com/weblog/PermaLink.aspx%3Fguid%3Ddee93b81-0ace-4643-8595-c443689d5fb3">XML For You and Me, Your Mama and Your Cousin Too</a> (Dare Obasanjo aka Carnage4Life: Tue, 06 Jan 2004 16:17:31 GMT)

</b></p><div><p dir="ltr">Once an XML representation of the relevant
information users are interested has been designed (i.e. the XML schema
for books, reviews and wishlists that could be exposed by sites like <a href="http://www.amazon.com/">Amazon</a> or <a href="http://www.bn.com/">Barnes &amp; Nobles</a>) the next technical problem to be solved is uniform access mechanisms. The eternal <a href="http://effbot.org/zone/rest-vs-rpc.htm">REST vs. SOAP vs. XML-RPC</a> that has plagued a number of online discussions. Then there's deployment, adoption and evangelism.</p></div><hr width="20%" align="left"/><p></p><p><b><a href="http://www.pipetree.com/qmacro/2003/07/18%23gpg">Google Pocket Guide out now</a> (DJ's Weblog: None)
</b></p><div><p><a href="http://www.amazon.com/exec/obidos/tg/detail/-/0596005504/"><img align="right" border="0" title="Google Pocket Guide cover" src="http://www.pipetree.com/%7Edj/2003/07/googlepg_tiny.jpg"/></a> I don't think I mentioned it directly here (perhaps partly a cause <em>and</em> effect of the recent blogging hiatus) but the <a title="Google Pocket Guide" href="http://www.oreilly.com/catalog/googlepg">Google Pocket Guide</a> has recently been released. Hurrah! It's a book I worked on with <a title="Rael Dornfest" href="http://www.raelity.org">Rael</a> and <a title="Tara Calishain" href="http://www.oreillynet.com/cs/catalog/view/au/873">Tara</a> (nice work, you two!). Talking to people at <a title="O'Reilly and Associates" href="http://www.oreilly.com">O'Reilly</a> last week at <a href="http://conferences.oreilly.com/os2003">OSCON</a>, it seems the guide is selling well. Hurrah again!</p>
</div>
</div>
<p>
I haven't yet normalized the dates of the items, and there are some conversion artifacts to deal with, but you get the idea.
</p>
 
</body>
</item>

<item num="a893">
<title>A penny for your ERP thoughts</title>
<date>2004/01/22</date>
<body>

<p>
Well, actually there's no penny involved. But InfoWorld would really like those of you who work with ERP systems to share your experiences with them in <a href="http://www.surveymonkey.com/Users/19050150/Surveys/67797350872/339BABA0-3396-43C8-83F7-C1C432DB9688.asp?U=67797350872&amp;DO_NOT_COPY_THIS_LINK">our ERP survey</a>.
</p>

</body>
</item>


<item num="a892">
<title>One-click subscription, continued: the lesser of two evils?</title>
<date>2004/01/21</date>
<body>

<p>
There's been some ongoing discussion of one-click RSS subscriptions over at <a href="http://inessential.com/?comments=1&amp;postid=2792">Brent's</a> and <a href="http://www.25hoursaday.com/weblog/CommentView.aspx?guid=6781c45c-01d6-406b-8b0c-ad860c859a0b">Dare's</a> sites. Some things I've found out:
</p>
<ul>
<li><p>Here's a <a href="http://www.25hoursaday.com/draft-obasanjo-feed-URI-scheme-02.html">spec</a> for the feed: URI scheme.</p></li>
<li><p>It's possible to hook the feed: URI scheme, system-wide, on Mac OS X and Windows. </p></li>
<li><p>Although the older version of NetNewsWire Lite I was using didn't support feed: URLs, the newer versions of Lite and Pro do, and I've tested Lite's feed: behavior successfully in both Mozilla and Safari.</p></li>
<li><p>RSS Bandit is supposed to also support feed: URLs, but I couldn't get it to work with a freshly-downloaded copy. Dare's checking. (<b>Update</b>: <a href="http://www.25hoursaday.com/weblog/PermaLink.aspx?guid=01b75d53-ce97-4109-a79c-f01645a89f04">Here's the drill</a>. Works for me now in both Mozilla and IE.)</p></li>
<li><p>I was surprised to note that <a href="http://www.methodize.org/quicksub/">quicksub</a> produces feed: URLs that look like this -- feed:http://weblog.infoworld.com/udell/rss.xml -- rather than like this -- feed://weblog.infoworld.com/udell/rss.xml. However, I note that Dare's spec permits either form; when the embedded scheme is omitted, as in the second form, http: is inferred.
</p></li>
</ul>
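<p>
Here is a rough sketch of how a feedreader might turn either form of feed: URL into the address it actually fetches. It covers only the two forms shown above, with http: inferred when no scheme is embedded. The function name <tt>feed_to_http</tt> is made up, and the spec may well define cases this doesn't handle.
</p>

```python
def feed_to_http(uri):
    """Map a feed: URI to a fetchable URL (a sketch, not the full spec)."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "feed":
        raise ValueError("not a feed: URI: %r" % uri)
    if rest.startswith("//"):
        # Form two, feed://host/path: no embedded scheme, so infer http:
        return "http:" + rest
    # Form one, feed:http://host/path: the embedded URL is used as-is.
    return rest


long_form = feed_to_http("feed:http://weblog.infoworld.com/udell/rss.xml")
short_form = feed_to_http("feed://weblog.infoworld.com/udell/rss.xml")
print(long_form)
print(short_form)
```

<p>
Both calls yield the same http: address, which is the point: a publisher can emit either form and a reader that hooks the scheme ends up in the same place.
</p>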
<p>
If this is all mumbo-jumbo to you, here's the take-away. Subscribing to a feed in a browser-based feedreader, such as Radio UserLand or Bloglines, is a no-brainer, because everything works on the click-to-load-a-page model. But it hasn't been a no-brainer to get from the orange XML icon on a web page into a GUI feedreader such as NetNewsWire or RSS Bandit. The feed: URI scheme aims to solve that problem. 
</p>
<p>
It is, however, controversial. As Joe Gregorio <a href="http://bitworking.org/news/Atom_Auto_Sub_How_To">points out</a>, the recently-finalized manifesto of the W3C Technical Architecture Group frowns on inventing new URI schemes unnecessarily:
<blockquote cite="W3C TAG">
Authors of specifications SHOULD NOT introduce a new URI scheme when an existing scheme provides the desired properties of identifiers and their relation to resources. [<a href="http://www.w3.org/TR/webarch/#pr-new-scheme-expensive">Architecture of the World Wide Web</a>]
</blockquote>
As usual, I can see both sides. I agree with the TAG that URI schemes should not multiply like rabbits. Yet the alternative -- using mime-types -- seems fraught with peril. The complexity of <a href="http://bitworking.org/news/Atom_Auto_Sub_How_To">Joe's explanation</a> makes me think of all the things that work poorly for me, in various browsers on various systems, when mime-types aren't correctly handled, as they so often are not. I'd hate for syndication newcomers to land in another WindowsMedia/QuickTime/RealVideo <a href="http://funwavs.com/wavfile.php?quote=4173&amp;sound=15">fireswamp</a>. I'm not certain that the feed: URI scheme is necessary, but at the moment, I'm inclined to think that it's the lesser of two evils. 
</p>

</body>
</item>

<item num="a891">
<title>XPath query tips</title>
<date>2004/01/19</date>
<body>

<p>
My new <a href="http://142.167.72.34:8000/?//blockquote[@cite='InfoWorld']">query page</a> invites you to try writing your own queries, and a few adventurous souls have been doing just that. As I've mentioned before, I'm no world-class expert on this subject, but as I build up a corpus of searchable data on the one hand, and a set of canned and modifiable queries on the other, I'm learning. Indeed, one of my goals for the query page is to serve as a tutorial and playground, a place where folks (me included) can get ideas about what kinds of XHTML elements they might include in their own content, and how those elements could interact with XPath queries.
</p>
<p>
In the spirit of exploration and learning, here's a first installment of the tutorial. First, some background. The XPath expressions used in this search engine are embedded in an XSLT stylesheet. The stylesheet includes two XSLT templates. Here's the one that counts the number of results:
<pre class="code" lang="xslt">
&lt;xsl:template match="/">
&lt;div>Results:
&lt;xsl:value-of select="count(__QUERY__)"/>
&lt;/div>
&lt;xsl:apply-templates />
&lt;br clear="all"/>
&lt;p>Entries searched: &lt;xsl:value-of 
       select="count(//item)" />&lt;/p>
&lt;p>Date of oldest entry searched: &lt;xsl:value-of 
       select="//item[position()=last()]/date" />&lt;/p>
&lt;p>Date of newest entry searched: &lt;xsl:value-of 
       select="//item[position()=1]/date" />&lt;/p>
&lt;/xsl:template>
</pre>
<p>
And here's the one that reduces the whole XML file to just matching elements:
</p>
<pre class="code" lang="xslt">
&lt;xsl:template match="__QUERY__" >
&lt;p>&lt;b>
&lt;a>
&lt;xsl:attribute name="href">
http://weblog.infoworld.com/udell/&lt;xsl:value-of 
    select="ancestor::item/date" />.html#&lt;xsl:value-of 
    select="ancestor::item/@num"/>
&lt;/xsl:attribute>
&lt;xsl:value-of select="ancestor::item/title" />
&lt;/a> (&lt;xsl:value-of select="ancestor::item/date" />)
&lt;/b>
&lt;div>
&lt;xsl:copy-of select="."/>
&lt;xsl:if test="local-name(.)='blockquote' and @cite != ''">
Source: &lt;xsl:value-of select="@cite"/>
&lt;/xsl:if>
&lt;/div>
&lt;hr align="left" width="20%" />
&lt;/p>
&lt;/xsl:template>
</pre>
</p>
<p>
In my forthcoming O'Reilly Network column I publish the script that implements the search engine, and discuss it in detail. But from the perspective of writing queries, here's what you need to know. First, the search script replaces __QUERY__, in both XSLT templates, with the text of an XPath <i>pattern</i> -- either a canned one, or one that you supply. Second, the XML file matched against the pattern has this simple structure:
<pre class="code" lang="xml">
&lt;item num="a883">
&lt;title>Server-based XPath search&lt;/title>
&lt;date>2004/01/10&lt;/date>
&lt;body>
&lt;p>
...arbitrary XHTML content...
&lt;/p>
&lt;/body>
&lt;/item>
</pre>
</p>
<p>
Third, the pattern is used, in the XSLT transformation, in two different ways. The counting template uses it in an XSLT select (&lt;xsl:value-of select="count(__QUERY__)"/>), but the data-reduction template uses it in an XSLT match (&lt;xsl:template match="__QUERY__" >).
</p>
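<p>
The substitution step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the script from the column: the two fragments stand in for the relevant attributes of the templates shown earlier, and the function name <tt>fill_in</tt> is hypothetical. A real script would also have to XML-escape characters such as the less-than sign and the ampersand before dropping a pattern into the stylesheet; this sketch only guards against double quotes.
</p>

```python
# Placeholder fragments standing in for the two templates in the stylesheet.
COUNT_FRAGMENT = 'select="count(__QUERY__)"'
MATCH_FRAGMENT = 'match="__QUERY__"'


def fill_in(fragment, pattern):
    """Substitute the supplied pattern for every __QUERY__ placeholder."""
    if '"' in pattern:
        # The pattern lands inside a double-quoted attribute, so a raw
        # double quote would end the attribute early.
        raise ValueError("use single quotes inside the pattern")
    return fragment.replace("__QUERY__", pattern)


pattern = "//blockquote[@cite='InfoWorld']"
count_attr = fill_in(COUNT_FRAGMENT, pattern)
match_attr = fill_in(MATCH_FRAGMENT, pattern)
print(count_attr)
print(match_attr)
```

<p>
The same text ends up in a select attribute and in a match attribute, so it has to be legal in both places.
</p>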
<p>
When I first wrote this entry, I used the term <i>expression</i> rather than <i>pattern</i> -- but really, the latter is correct. What's the difference between the two? Writing for MSDN Magazine, Aaron Skonnard explains:
<blockquote cite="Aaron Skonnard">
Select does indeed expect an XPath expression, which is used to select a nodeset for further processing. 
<br/>...<br/>
The match attribute, on the other hand, takes what's called a pattern. A pattern looks like an XPath expression because it shares the same syntax, but it's treated differently by the XSLT processor. A pattern is used for matching nodes in the tree against the specified criteria. [<a href="http://msdn.microsoft.com/webservices/building/columns/default.aspx?pull=/msdnmag/issues/03/08/xmlfiles/default.aspx">MSDN Magazine</a>]
</blockquote>
</p>
<p>
The XPath syntax you can use in a <a href="http://www.w3c.org/TR/xslt#patterns">match pattern</a> is more restrictive than the syntax you can use in a <a href="http://www.w3c.org/TR/xslt#dt-expression">select expression</a>. Since my XSLT stylesheet uses the syntax you supply in both contexts, it is limited to the more restrictive flavor -- that is, it must be a pattern, not a full-blown expression.
</p>
<p>
Watching my search logs, I notice that the most common error is to supply something like this:
<pre class="code" lang="xpath">
count(//blockquote)
</pre>
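This fails because a match pattern has to be a location path, which can begin with a function call only if that function is id() or key(). A function such as count() can appear inside a predicate, but it can't wrap the whole pattern.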
</p>
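<p>
A search page could catch this mistake before handing the query to the XSLT processor. The check below is a crude heuristic, not a pattern parser, and the name <tt>looks_like_pattern</tt> is made up. It only rejects queries that open with a function call other than id() or key(), while letting node tests such as text() through.
</p>

```python
import re

# Names that may legally be followed by "(" at the very start of a pattern:
# the id() and key() functions, plus the node tests.
ALLOWED_LEADING_CALLS = {
    "id", "key", "node", "text", "comment", "processing-instruction",
}


def looks_like_pattern(query):
    """Return False when the query opens with a disallowed function call."""
    leading_call = re.match(r"\s*([A-Za-z][\w.-]*)\s*\(", query)
    return leading_call is None or leading_call.group(1) in ALLOWED_LEADING_CALLS


for query in (
    "count(//blockquote)",
    "//blockquote[@cite='InfoWorld']",
    "//p[count(.//a) > 1]",
):
    print(query, looks_like_pattern(query))
```

<p>
Anything the heuristic accepts can still be rejected by the processor; the check just gives the most common slip a friendlier error message.
</p>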
<p>
Why restrict the XPath syntax to only what's valid for the match attribute of an XSLT template? Because that's what my little search engine does. It matches and displays a subset of the elements contained in my blog.
</p>

</body>
</item>

<item num="a890">
<title>What RSS users want: consistent one-click subscription</title>
<date>2004/01/19</date>
<body>

<p>
Saturday's Scripting News <a href="http://archive.scripting.com/2004/01/17#When:7:58:40AM">asked</a> an important question: <a href="http://blogs.law.harvard.edu/crimson1/2004/01/13#a1027">What do users want from RSS?</a> The context of the question is the upcoming <a href="http://myst-technology.com/mysmartchannels/public/blog/15397">RSS Winterfest</a>. Dave Winer adds:
<blockquote cite="Dave Winer">
I thought we should try to put the focus on people who use the technology, to let them set the agenda for the developers. 
</blockquote>
Amen. Over the weekend I received a draft of the RSS Winterfest agenda along with a request for feedback. Here's mine: focus on users. In an <a href="http://weblog.infoworld.com/udell/2003/10/08.html#a823">October posting from BloggerCon</a> I present video testimony from several of them who make it painfully clear that the most basic publishing and subscribing tasks aren't yet nearly simple enough.
</p>
<p>
Here's more testimony from the comments attached to Dave's posting:
<blockquote cite="Ingrid Jones">
One message: MAKE IT SIMPLE. I've given up on trying to get RSS. My latest attempt was with Friendster: I pasted in the "coffee cup" and ended up with a string of text in my sidebar. I was lost and gave up. I'm fed up with trying to get RSS. I don't want to understand RSS. I'm not interested in learning it. I just want ONE button to press that gives me RSS. Like Technorati gives me a simple list. I don't want to look under its hood to learn the mechanics of how it works. RSS sales folk speak a different language from customers. RSS designers/instruction writers need to work in tandem with people who write plain English and talk the language of the customer. [Ingrid Jones]
</blockquote>
<blockquote cite="Derek Scruggs">
Like others, I'd say one-click subscription is a must-have. Not only does this make it easier for users, it makes it easier to sell RSS to web site owners as a replacement/enhancement for email newsletters. Managing newsletters is a huge PITA - spam filters and the general unreliability of SMTP for large scale broadcasts has led to a Rube Goldberg nightmare. (NOTE: I'm not talking about spam, so no flames please. I'm talking about opt-in.) There are a LOT of companies that would jump on RSS if enough end users adopted it and it did away with the need for cumbersome email delivery technologies. [Derek Scruggs]
</blockquote>
<blockquote cite="Christoph Jaggi">
For average users RSS is just too cumbersome. What is needed to make is simpler to subscribe is something analog to the mailto tag. The user would just click on the XML or RSS icon, the RSS reader would pop up and would ask the user if he wants to add this feed to his subscription list. A simple click on OK would add the feed and the reader would confirm it and quit. The user would be back on the web site right where he was before. [Christoph Jaggi]
</blockquote>
</p>
<p>
Clearly the current approach -- linking the orange XML icon to an XML file, whose address must be captured and pasted into a feedreader -- isn't working for many users (or would-be users). There has been lots of discussion about creating a standard one-click subscription method. Dare Obasanjo reviews some of the issues <a href="http://www.25hoursaday.com/weblog/default.aspx?date=2003-12-06">here</a>. Phil Ringnalda reviews some current solutions <a href="http://philringnalda.com/blog/2003/08/quicksub_and_syndication_subscription_service.php">here</a>. On purely technical grounds, I'm frankly not sure which of three approaches -- a feed:// URI scheme, a MIME type, or a local HTTP listener -- is the "right" one. Dare writes:
<blockquote cite="Dare Obasanjo">
With all these varying approaches, it means that any website that wants to provide a link that allows one click subscription to an RSS feed needs to support almost a dozen different techniques and thus create a dozen different hyperlinks on their site. This isn't an exaggeration, this is exactly what <a href="http://www.feedster.com/">Feedster</a> does when one wants to subscribe to the results of a search. If memory serves correctly, Feedster uses the <a href="http://www.methodize.org/quicksub/">QuickSub javascript module</a> to present these dozen links in a drop down list. [<a href="http://www.25hoursaday.com/weblog/default.aspx?date=2003-12-06">Dare Obasanjo</a>]
</blockquote>
I checked, and that's exactly what Feedster is doing. Yes, it's preposterous. Nevertheless, I've decided to try the same method myself until the market converges on a single approach. Which convergence, it seems to me, can't happen until users of feedreaders reach some critical mass. Which, in turn, won't happen if feed publishers and feed readers continue to violate users' expectations of how something as fundamental as subscribing to a feed should work. 
</p>

</body>
</item>



<item num="a889">
<title>More on screen videos and dynamic categories</title>
<date>2004/01/18</date>
<body>

<p>
A couple of follow-ups to things mentioned here lately. First, thanks to the folks at <a href="http://www.techsmith.com/">TechSmith</a>, I'm trying out a copy of <a href="http://www.techsmith.com/products/studio/default.asp">Camtasia Studio</a>. I've used it to update the <a href="http://weblog.infoworld.com/udell/LibraryLookup/">LibraryLookup</a> page to include Flash versions of the Windows Media screen videos I made. You can do a lot more with Camtasia Studio than just convert formats, of course. It's a complete solution for capturing screen videos, and editing video and audio clips on a timeline. 
</p>
<div style="border-style: solid; border-width: thin; padding: 6px;"><b>Update</b>: 
John Dowdell writes:
<blockquote cite="John Dowdell">
I can appreciate the desire to turn efficient screen narratives into a commodity which escapes any pricing, but I haven't seen anyone donate their days to the world like that yet. [<a href="http://www.markme.com/jd/archives/004198.cfm">JD on MX</a>]
</blockquote>
An excellent point. To reiterate, I would like to draw a sharp distinction between amateur and professional screen videos. The ones I've done are strictly amateur. What I'm proposing is that the software industry as a whole would benefit if users could easily capture and publish amateur screen videos. These would lack the production values you'd expect in a professionally-done demo or training video, but would suffice for user-to-user communication (teaching one another how stuff works) and for user-to-developer communication (illustrating what <i>doesn't</i> work). These folks don't need and won't buy professional screen-video production tools. Conversely the producers of demos and training videos do need and will buy such tools. 
</div>
<p>
In other news, I've been having a ball writing queries for the XPath search page. For example: <a href="http://142.167.72.34:8000/?/blog/item/title[contains(../date, '2003/12')]">titles of December 2003 items</a>; <a href="http://142.167.72.34:8000/?//body[../date[contains(.,%20'2003/12')]%20and%20contains(.//a/@href,%20'.mov')]">December 2003 items containing QuickTime movies</a>.  
</p>
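<p>
Since these links are just XPath expressions tacked onto the search page's address, they're easy to generate. Here's a minimal sketch in Python 3, assuming the server treats everything after the <tt>?</tt> as a percent-encoded XPath expression. The <tt>query_url</tt> helper and the set of characters it leaves unescaped are illustrative choices of mine, not part of the search page.
</p>
<pre class="code" lang="python">
from urllib.parse import quote, unquote

BASE = "http://142.167.72.34:8000/"  # address used by the links in this post

def query_url(xpath, base=BASE):
    """Return base + '?' + the XPath expression, percent-encoded."""
    # Leave the punctuation XPath relies on readable; spaces become %20.
    return base + "?" + quote(xpath, safe="/[]()@,'.=")

url = query_url("/blog/item/title[contains(../date, '2003/12')]")
print(url)

# Decoding the query string gives back the original expression.
print(unquote(url.split("?", 1)[1]))
</pre>
<p>
Leaving brackets, slashes, and quotes unescaped keeps the link legible; a stricter encoder would escape them too, and the server should decode either form the same way.
</p>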
<p>
I've cleaned up my earlier entries well enough to include them, but the complete archive -- 832 entries in a 2.4MB XML file -- is more than libxslt can deal with. A version of this solution based on Berkeley DB XML is waiting in the wings, though, and I hope to deploy it sometime next week.
</p>
	
</body>
</item>

<item num="a888">
<title>Spontaneous screen videos</title>
<date>2004/01/16</date>
<body>

<p>
I can't post the Outlook/SpamBayes video mentioned in <a href="http://weblog.infoworld.com/udell/2004/01/13.html#a885">this week's column</a> because it reveals too much information about other people. So instead I've posted some experimental videos of LibraryLookup in action:
</p>
<ul>
<li><p><a href="http://weblog.infoworld.com/udell/gems/libraryLookup1.wmv">LibraryLookup 1. Basic: Using an existing bookmarklet.</a></p></li>
<li><p><a href="http://weblog.infoworld.com/udell/gems/libraryLookup2.wmv">LibraryLookup 2. Advanced: Creating your own bookmarklet.</a></p></li>
</ul>
<p>
<a href="http://weblog.infoworld.com/udell/gems/wmEncoderSettings.jpg"><img vspace="6" hspace="6" align="right" src="http://weblog.infoworld.com/udell/gems/wmEncoderSettings.gif"/></a>
I captured these videos with <a href="http://www.microsoft.com/windows/windowsmedia/9series/encoder/default.aspx">Windows Media Encoder 9</a>, using the settings shown in this screenshot. Each is a bit over 2.5MB, runs about 3 minutes, is stored on a plain Web server, and plays with minimal delay in Windows Media Player 9 on my DSL-connected system.
</p>
<p>
From the perspective of a Windows-based producer creating a software video for Windows-based consumers, things could hardly be easier. After recording the videos I dropped them into my Radio UserLand upload folder and, minutes later, they were available for progressively-downloadable viewing. The key characteristics of this solution:
<ul>
<li><p>Free.</p></li>
<li><p>Easy.</p></li>
</ul>
</p>
<p>
I'm well aware that there are many potential producers and consumers who are not Windows-based. Indeed, I'm often among those groups myself. Dealing with that problem sacrifices free, or easy, or both. There are free media converters, but it's not easy to acquire them, use them, and deliver multiple formats in a sane way. There are commercial solutions that make things easy -- by targeting Flash, for example, to avoid the WinMedia/Real/QuickTime <a href="http://funwavs.com/wavfile.php?quote=4173&amp;sound=15">fireswamp</a> -- but of course they're not free. I mentioned <a href="http://www.qarbon.com/products/vc/">Qarbon</a>, about which several correspondents have said nice things. I've also been referred to <a href="http://www.techsmith.com/products/studio/default.asp">Camtasia Studio</a>, and I'm sure there are others. At some point I'd like to experiment with one or more of these professional tools, but for now, let's stay focused on free and easy. 
</p>
<p>
There's a well-understood need for professional screen video tools. Less obvious is the need for <i>non-professional</i> tools. As is painfully clear from the videos I've made, I'm no <a href="http://search1.npr.org/search97cgi/s97_cgi?CleanQuery=xeni+jardin&amp;ResultTemplate=allow_re_sort.hts&amp;SortSpec=Date+Desc+Score+Desc&amp;ViewTemplate=docview.hts&amp;collection=ALL02&amp;Action=FilterSearch&amp;filter=topic_filter.NEW.hts&amp;QueryText=&amp;x=0&amp;y=0">Xeni Jardin</a> when it comes to narrating technology. But that's exactly the point. Sure, I'd like to improve my voice skills, and if I were creating a training video I'd have to do just that, or hire a voice talent. But imagine, instead, that I'm simply a user of the software, and that I want to show (and tell) other users how it works.
</p>
<p>
Software users could help one another, and the industry, if they could narrate their experiences with software, and publish those videos as easily as they can now publish blogs. A huge amount of the knowledge that's transferred about software use takes place among users, in the medium of text -- on forums and now also on blogs. Think how much more effective that knowledge transfer would be if users could capture and upload screen videos.
</p>
<p>
Of course this would make life uncomfortable for software developers who don't want to think about the annoyances that drive users crazy. For example, my wife recently switched from Eudora to Outlook Express. She wanted to Bcc: a message, but couldn't find the button. The reason is that Outlook Express (like Outlook, as a matter of fact) hides it. To get to the place where you can Bcc:, you have to click the To: button. Her response: "How idiotic is that?" And indeed, it's wildly counter-intuitive. 
</p>
<p>
Every application exhibits such annoyances. Developers are blind to them, because they're necessarily focused on making features work. And even experienced users become blind to them, because once you've learned the trick ("click To: or Cc: when you want to Bcc:") the knowledge becomes tacit. To combat this awareness problem, software companies bring in new users, capture their interactions with software on video, and make developers watch the videos. I've been on the receiving end of that treatment; it's painful. 
</p>
<p>
While this focus-group process always improves software, it's expensive. And it doesn't enable users of the shipping product, in the field, to contribute their reactions. Windows Media Encoder, coupled with blog technology, does enable anyone to spontaneously capture and post a screen video that can teach other users about an application, give the developer compelling feedback, or both. I haven't seen this happen yet in blogspace, but I think it can and should. A couple of things (which may or may not already exist) that would help:
</p>
<ul>
<li><p>A free/easy way to Flash-ify a .WMV file</p></li>
<li><p>A free/easy way to capture and Flash-ify a Mac OS X screen video</p></li>
</ul>
<p>
One final point for developers: try narrating a video of your own software sometime. It's humbling. When I made these LibraryLookup videos, for example, I was forcibly reminded of all the ways in which LibraryLookup sucks. The canned lists of bookmarklets aren't complete or always accurate. There's no a priori way to know which OPAC your library uses, and therefore which library to search. Nor does the bookmarklet generator adequately solve the problem of identifying your library's OPAC. As I explained the process in the videos, these flaws became painfully apparent because, when we demonstrate and explain, we're forced to experience software from another person's point of view. As a result, I realized that rather than demonstrating workarounds for things like the OPAC-identification issue, I ought to come up with a better solution. 
</p>

</body>
</item>

<item num="a887">
<title>Dynamic categories</title>
<date>2004/01/15</date>
<body>

<p>
A while back I stopped assigning the items I post here to categories. It wasn't because I couldn't be bothered to do the categorization. Quite the contrary, I'm really interested in achieving that result, and more than willing to put some effort into it. But, although I'm generally a huge proponent of the publishing technique I call <i>static serving of dynamically-generated pages</i>, it increasingly seemed like the wrong way to deal with categories.
</p>
<p>
Lately it's becoming clear how the XPath search technology I've been working with will enable a fully dynamic approach to categories. For example, after posting yesterday's item, it struck me that two labels I'd have wanted to attach to that item were: <b>books</b>, and <b>AV clips</b>. So I added these two queries to the list of canned queries on the search page:
</p>
<p>
<a href="http://142.167.72.34:8000/?//p[contains(.//a/@href,'amazon.com')%20or%20contains(.//a/@href,'allconsuming')]">books</a>: //p[contains(.//a/@href,'amazon.com') or contains(.//a/@href,'allconsuming')]
</p>
<p>
<a href="http://142.167.72.34:8000/?//p[contains(.//a/@href,'.mp3')%20or%20contains(.//a/@href,'.wav')%20or%20contains(.//a/@href,'.mov')%20or%20contains(.//a/@href,'.ram')]">AV clips</a>: //p[contains(.//a/@href,'.mp3') or contains(.//a/@href,'.wav') or contains(.//a/@href,'.mov') or contains(.//a/@href,'.ram')]
</p>
<p>
Each of these queries finds yesterday's item (and this one too, actually). Each also forms a result page that could serve as a category page. There are a bunch of other queries that haven't been written down yet, but that implicitly categorize the same item in other ways. For example: <a href="http://142.167.72.34:8000/?//blockquote[@cite='Doc%20Searls']">Doc Searls quotations</a>. Or <a href="http://142.167.72.34:8000/?//p[contains(.//a/@href,'1585420824')%20and%20contains(.//a/@href,'amazon.com')]">Jeremy Rifkin's <i>The Age of Access</i></a>. Query. Gotta love it.
</p>
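<p>
To make the idea concrete: a category is nothing more than a named test that's run against the entries at query time. The sketch below is a rough Python equivalent, not the search page's actual code. It uses the standard library's ElementTree, whose XPath subset has no <tt>contains()</tt>, so the substring tests are written in Python instead. It also reports item numbers rather than matching paragraphs, and it checks every link in an item, whereas <tt>contains()</tt> applied to a node-set only looks at the first node. The sample items, <tt>make_item</tt>, <tt>links_contain</tt>, <tt>CATEGORIES</tt>, and <tt>categorize</tt> are all made up for illustration.
</p>
<pre class="code" lang="python">
import xml.etree.ElementTree as ET

def make_item(blog, num, hrefs, cites=()):
    """Append a tiny item: one paragraph of links plus optional quotations."""
    item = ET.SubElement(blog, "item", num=num)
    body = ET.SubElement(item, "body")
    p = ET.SubElement(body, "p")
    for href in hrefs:
        ET.SubElement(p, "a", href=href)
    for cite in cites:
        ET.SubElement(body, "blockquote", cite=cite)
    return item

def links_contain(item, fragments):
    """True if any link in the item has an href containing any fragment."""
    return any(fragment in a.get("href", "")
               for a in item.iter("a") for fragment in fragments)

# Hypothetical category table: label -> substrings to look for in hrefs.
CATEGORIES = {
    "books": ("amazon.com", "allconsuming"),
    "AV clips": (".mp3", ".wav", ".mov", ".ram"),
}

def categorize(blog):
    """Map each category label to the item numbers it matches right now."""
    return {label: [item.get("num") for item in blog.iter("item")
                    if links_contain(item, fragments)]
            for label, fragments in CATEGORIES.items()}

blog = ET.Element("blog")
make_item(blog, "a886",
          ["http://www.amazon.com/exec/obidos/asin/1585420824/",
           "http://weblog.infoworld.com/udell/gems/docAppleCES.mp3"],
          cites=["Doc Searls"])
make_item(blog, "a885", ["http://www.dreamfactory.com/"], cites=["InfoWorld"])

print(categorize(blog))

# Attribute-equality tests are within ElementTree's XPath subset.
quoted = [item.get("num") for item in blog.iter("item")
          if item.find(".//blockquote[@cite='Doc Searls']") is not None]
print(quoted)
</pre>
<p>
Adding a category here means adding a row to the table. Nothing has to be republished, which is the appeal of doing this dynamically.
</p>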
<p>
I also added some instrumentation to the search page that reports the number of entries searched (213, as of this one), and the date of the earliest entry searched (April 2003). Here are some next steps: 
<ul>
<li><p><b>XHTML-ize the 500+ earlier entries.</b> That's done, pending some cleanup, thanks to <a href="http://tidy.sourceforge.net/">HTMLTidy</a>.</p>
<p class="tip">Just say no to WYSIWYG editors, such as the MS DHTML edit control, that insist on mangling your content.</p></li>
<li><p><b>Expand the roster of queries.</b> The earlier entries contain implicit metadata that, once exposed to search, will suggest additional query possibilities.</p></li>
<li><p><b>RSS-ify queries.</b> This handy technique, already practiced by <a href="http://www.technorati.com/">Technorati</a>, <a href="http://www.feedster.com/">Feedster</a>, and others, could be quite interesting in this context. If I refine a query so that it reshapes a category, you'd be notified. Could be annoying too, if I fiddle around too much, but we'll see how it goes.</p></li>
<li><p><b>Upgrade the search server.</b> I'm currently running with an almost perversely minimal setup. The next incarnation, which is in the pipeline, uses Berkeley DB XML instead of a bare XSLT processor.</p></li>
</ul>
I'm still deciding whether to stick with Python's mini-httpd (BaseHTTPServer), or switch to something else. But here's a larger issue to consider. Most bloggers don't have the ability to maintain any non-standard server-side infrastructure. So if this approach is going to scale, it can't require that. I've been thinking about this for a while. It ties back to RSS. Any feed that includes <a href="http://weblog.infoworld.com/udell/2003/04/14.html">well-formed XHTML content</a> can deliver that content to a search service. So Technorati, or Feedster, or another service that's already in the business of aggregating and searching feeds could also offer XPath (or ultimately XQuery) services. I would <i>love</i> to see that happen.
</p>
	
</body>
</item>


<item num="a886">
<title>Turning consumers into producers</title>
<date>2004/01/14</date>
<body>

<p>
<a href="http://weblog.infoworld.com/udell/gems/docAppleCES.mp3"><img vspace="6" hspace="6" align="right" src="http://weblog.infoworld.com/udell/gems/doc.jpg"/></a>
The final scan of my RSS feeds, last night, pulled in <a href="http://doc.weblogs.com/2004/01/13#winterfestWarmth">an item</a> from Doc Searls who said that he was on live radio at <a href="http://thelinuxshow.com/">The Linux Show</a>. (Doc's item also mentions the <a href="http://myst-technology.com/mysmartchannels/public/blog/15397">RSS Winterfest Webcast</a> a week from today. I'll be there; the complete list of participants is <a href="http://myst-technology.com/mysmartchannels/public/blog/17849">here</a>.) When I clicked through to The Linux Show's stream, I heard Doc say some things about the recent Macworld and CES shows that really hit home -- so much so that I wanted to hear them again. I reached for the RealPlayer's slider, but it was unresponsive. Doc wasn't kidding, he really was coming live from that radio show. His announcement of that fact made it from his computer to his blog to my aggregator in time for me to catch the live stream. Just another one of the daily miracles that I can't yet bring myself to take for granted.
</p>
<p>
<a href="http://www.amazon.com/exec/obidos/asin/1585420824/"><img hspace="8" align="right" src="http://images.amazon.com/images/P/1585420824.01.MZZZZZZZ.jpg"/></a>
Here's <a href="http://weblog.infoworld.com/udell/gems/docAppleCES.mp3">the bit</a> that caught my ear:
<blockquote cite="Doc Searls">
The most significant announcement was GarageBand. What Apple has started doing is providing the means by which consumers become producers. And in doing so, he [Jobs] is hacking the industry. He's hacking the entertainment industry, and he's hacking the consumer electronics industry.
</blockquote>
Exactly right. Why should you care, if you're reading this blog for insight into enterprise information technology? A book I've <a href="http://weblog.infoworld.com/udell/2003/02/12.html#a604">mentioned</a> <a href="http://webservices.xml.com/pub/a/ws/2003/02/11/udell.html">before</a>, Jeremy Rifkin's <a href="http://www.amazon.com/exec/obidos/asin/1585420824/">The Age of Access</a>, helps connect the dots. In our April 2003 story, <a href="http://www.infoworld.com/infoworld/article/03/04/18/16dyndev_1.html">Leveraging a global advantage</a>, I used the meme -- expounded on by Rifkin in this book, <a href="http://www.inc.com/magazine/19950301/2182.html">David Friedman in Inc. Magazine</a>, and others -- that "every business will be like show business." For the purposes of our story, that meant a fluid ad-hoc approach to assembling the teams and resources needed to develop enterprise software. 
</p>
<p>
But Rifkin's book takes a broader view:
<blockquote cite="Jeremy Rifkin">
A final point needs to be made about the Hollywood organizational model that is too often glossed over or missed altogether in discussions of management strategies. It's no mere coincidence that other industries try to model the way the entertainment industry is organized. The cultural industries -- including the recording industry, the arts, television, and radio -- commodify, package, and market experiences as opposed to physical products or services. Their stock and trade is selling short-term access to simulated worlds and altered states of consciousness. The fact is, they are an ideal organizational model for a global economy that is metamorphosing from commodifying goods and services to commodifying cultural experience itself. [Jeremy Rifkin, The Age of Access]
</blockquote>
There's much more in this fascinating book, whose basic premise is that the defining principle of capitalism is now no longer ownership of property bought and sold in markets, but rather access to services leased within networks of suppliers and users. That we are now organizing our IT infrastructure as a loose federation of services is, I would argue, another non-coincidence.
</p>
<p>
Pay close attention to the pivotal word "experience" -- as in, for example, "the user experience." It's the clue to understanding why Steve Jobs and John Mayer onstage at Macworld, mixing tracks in GarageBand, have more to do with IT's mission than you might think. The quality of experience that we deliver, through software and services, will depend on our ability to negotiate protocols and relationships in a fluid, rapidly-evolving environment. In short: to jam.
</p>

</body>
</item>

<item num="a885">
<title>Moving pictures</title>
<date>2004/01/13</date>
<body>

<p>	
<span class="minireview">Windows Media Encoder 9</span>
<blockquote cite="InfoWorld">
I wanted to demonstrate the SpamBayes plug-in for the school, and I realized I ought to try the screen-capture feature of the free Windows Media Encoder 9. The results were stunning. I set up a new session, pointed it at Outlook's main window, and began encoding. Then I talked through a demonstration of SpamBayes' configuration manager, its Delete and Recover toolbar buttons, and my techniques for integrating SpamBayes with Outlook's filtering and foldering. Along the way I pointed with the cursor to items of interest, opened and closed dialog boxes, and drove the Outlook interface as I normally do.
<br/><br/>
The resulting six-minute video had the same format as my Outlook window, which happened to be about 750-by-620. The file came in at just under 3MB. I FTP'd it to my Website and, because I'd chosen the progressive-download option, playback was immediate. It was also perfectly readable and audible. Elapsed time from the moment I thought of trying this to the end of playback: about 25 minutes. Next time it'll take 10. Why don't more people do this? Because it wasn't this easy before. Now, it is. [Full story at <a href="http://www.infoworld.com/article/04/01/09/02OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
I wanted to post that video here, but I'm afraid I can't because it reveals too much of the contents of my inbox. However, I'll definitely be using this technique in the future. One killer application, if you sit in on a lot of WebEx demos as I do, is the ability to record them, play them back, and publish excerpts from them.
</p>
<p>
For example, yesterday I sat in on two fascinating demos. The first was with Bill Appleton, creator of SuperCard, whose new product, <a href="http://www.dreamfactory.com/">DreamFactory</a> (see <a href="http://www.infoworld.com/article/04/01/12/HNdream_1.html">Paul Krill's InfoWorld article</a> yesterday), offers a really exciting way to compose graphical interfaces that wield Web services. The second was with Mark de Visser and Kent Mitchell of <a href="http://www.agitar.com/">Agitar</a>, whose new product, Agitator, takes a dramatically innovative approach to the automation of software testing. As the first WebEx was ending, it struck me that I might have been able to record it using Windows Media Encoder. So I experimented during the second WebEx and sure enough, it worked -- apart from my fumbling of the audio, that is. 
</p>
<p>
In order to use such material, I'd obviously need to clear it with the presenters, but I expect that for briefings not under non-disclosure, a number of folks would be willing to let me post AV excerpts. I can't wait to try this!
</p>
<p><b>Updates:</b> Ray Ozzie says that they're getting great results at Groove using <a href="http://www.qarbon.com/products/vc/">Qarbon</a> to capture software demos for Flash playback. In other news, the Agitar team has a <a href="http://www.developertesting.com/">blog</a>.
</p>

</body>
</item>

<item num="a884">
<title>Don Box and Tennessee Williams</title>
<date>2004/01/12</date>
<body>

<p>
<a href="http://search.barnesandnoble.com/booksearch/isbnInquiry.asp?isbn=0306808056&amp;itm=2"><img align="right" src="http://images.barnesandnoble.com/images/1180000/1181334.gif" vspace="6" hspace="6"/></a>
<blockquote cite="Don Box">
<div>
      <p style="">
        <span>Jon </span>
        <span>Udell</span>
        <span> has </span>
        <a title="" href="#a883">
          <span>joined the club</span>
        </a>
        <span> of those wanting to expose remote </span>
        <span>XQuery</span>
        <span> over the Internet. </span>
      </p>
      <p style="">
        <span>I have a feeling that Jon may not have read my </span>
        <a href="http://www.gotdotnet.com/team/dbox/default.aspx?key=2004-01-06T10:24:26Z">
          <span>security concerns</span>
        </a>
        <span> over exposing raw </span>
        <span>XQuery</span>
        <span> (and </span>
        <span>XPath</span>
        <span>) over a public access point </span>
      </p>
      <p style="">
        <span>The reason I have this feeling is because it looks like </span>
        <a href="http://142.167.72.34:8000/?//blockquote%5b@cite=%27InfoWorld%27%5d">
          <span>Jon's engine</span>
        </a>
        <span> has already melted down from too many //* queries (it's 11:36 PST and the site is effectively wedged). </span>
      </p>
      <p style="">
        <span>When
I was on the site earlier today, I did notice that Jon's engine was
putting an upper-bound on the size of the result set. Unfortunately, it
looks as if it is not putting an upper bound on the amount of compute
resources a given query can consume. </span>
      </p>
      <p style="">
        <span>When
I tried a //*-style query earlier this afternoon, the HTTP
infrastructure between my house and Jon's server wouldn't let a single
HTTP request go that long without returning. </span>
      </p>
      <p style="">
        <span>If it was my one query that sent Jon's server over the edge, I'm very sorry.</span>
      </p>
[<a href="http://www.gotdotnet.com/team/dbox/default.aspx?key=2004-01-12T07:41:25Z">Don Box's spoutlet: On the Kindness of Strangers</a>]
    </div> 
</blockquote>
Not to worry, Don. I'm aware of the concern, and part of this experiment is about exploring its implications. In fact, the queries that are timing out don't seem to be expensive at all. One possibility was my single-threaded use of Python's minimal BaseHTTPServer class.
</p>
<p>
So I switched from:
</p>
<pre class="code" lang="python">
class myHTTPServer (BaseHTTPServer.HTTPServer):
</pre>
<p>
To:
</p>
<pre class="code" lang="python">
class myHTTPServer (SocketServer.ThreadingMixIn,
                      BaseHTTPServer.HTTPServer):
</pre>
<p>
However, I think the problem may have been even more basic than that: failing to set Content-length when reporting that a query has exceeded the max result-set size. We'll see how it goes now. 
</p>
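<p>
Here's a minimal sketch that puts the two changes together: a threaded server, and a handler that always sets Content-Length. It's written for Python 3, where the BaseHTTPServer and SocketServer modules are named <tt>http.server</tt> and <tt>socketserver</tt>. The handler body is a placeholder rather than my actual query code, and <tt>ThreadedHTTPServer</tt>, <tt>QueryHandler</tt>, and <tt>fetch_once</tt> are names invented for the example.
</p>
<pre class="code" lang="python">
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from socketserver import ThreadingMixIn

class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    """HTTPServer that handles each request in its own thread."""
    daemon_threads = True  # worker threads won't block interpreter exit

class QueryHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Placeholder body; a real handler would run the XPath query here.
        body = ("query received: " + self.path).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        # Tell the client exactly how many bytes to expect.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet

def fetch_once():
    """Start the server on a free port, make one request, shut down."""
    server = ThreadedHTTPServer(("127.0.0.1", 0), QueryHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1",
                                      server.server_address[1], timeout=5)
    conn.request("GET", "/?//title")
    response = conn.getresponse()
    result = response.getheader("Content-Length"), response.read()
    conn.close()
    server.shutdown()
    server.server_close()
    return result

length, body = fetch_once()
print(length, body)
</pre>
<p>
The same rule applies to error responses, such as the "too many results" message: whatever gets written to the client, its length goes in the header first.
</p>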
<p>
As an aside, I've added a canned query that finds <a href="http://142.167.72.34:8000/?//p[@style='']">blog items written using InfoPath</a>, based on its unique HTML coding signature :-)
</p>
<p>
The general question of how to constrain an engine's use of resources when exposed to arbitrary queries is, of course, extremely interesting. 
</p>
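<p>
One modest way to think about it is a budget that the search loop checks as it goes. The sketch below is only that: a cooperative check in plain Python over an ElementTree document, under the assumption that the search is a loop I control. It can't interrupt a single expensive step inside a real XPath engine, which would need a limit enforced from outside, for example at the process level. <tt>BudgetExceeded</tt>, <tt>bounded_search</tt>, and the default limits are invented for the example.
</p>
<pre class="code" lang="python">
import time
import xml.etree.ElementTree as ET

class BudgetExceeded(Exception):
    """Raised when a search runs past its result or time allowance."""

def bounded_search(root, predicate, max_results=50, max_seconds=1.0,
                   clock=time.monotonic):
    """Collect matching elements, giving up once either budget is spent."""
    deadline = clock() + max_seconds
    results = []
    for element in root.iter():
        # The clock is only checked between elements, so one slow
        # predicate call can still overrun the deadline.
        if clock() > deadline:
            raise BudgetExceeded("time budget exhausted")
        if predicate(element):
            results.append(element)
            if len(results) > max_results:
                raise BudgetExceeded("too many results")
    return results

# A tiny stand-in document: ten empty paragraphs.
root = ET.Element("body")
for _ in range(10):
    ET.SubElement(root, "p")

print(len(bounded_search(root, lambda e: e.tag == "p")))
</pre>
<p>
The <tt>clock</tt> parameter exists so the time limit can be exercised with a fake clock instead of a genuinely slow query.
</p>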

</body>
</item>

<item num="a883">
<title>Server-based XPath search</title>
<date>2004/01/10</date>
<body>

<p>
The <i>xpath search</i> link on the left navbar has, for the past few months, led to a <a href="http://weblog.infoworld.com/udell/gems/blogsearch.html">browser-based implementation</a> which was cool, in that it worked locally in either MSIE or Mozilla, but cumbersome since it required you to first download an ever-growing pile of XML content. So, as part of my next couple of O'Reilly Network columns, I'm experimenting with a few different lightweight server-based solutions. 
</p>
<p>
The first of these, currently wired to the <i>xpath search</i> link, is a <a href="http://142.167.72.34:8000/?//blockquote[@cite='InfoWorld']">minimal solution using Python's BaseHTTPServer and libxslt</a>. This implementation goes against a file containing the XHTML entries I've accumulated over the past 5 or so months, which amounts to about 0.8MB currently. It transforms that file with the same stylesheet used in the client-side solution. This seems to work snappily for queries that test for equality of attributes. Queries that only use <i>contains()</i> clauses take noticeably longer. Ordinarily I wouldn't find that surprising. However, if you compare with the client-side solution, you'll see that there, even <i>contains()</i> queries, using MSIE (the MSXML processor) or Mozilla (the Transformiix processor), are instantaneous. I'd have thought libxslt would give similar results on a similar quantity of data, but evidently not.
</p>
<p>
The next version, which I'm still refining, uses Berkeley DB XML. So far, it looks like it delivers great performance on all queries -- <i>if</i> I split the entries out into individual records. 
</p>
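<p>
Splitting the entries out is the easy part. Here's a rough sketch using Python 3's ElementTree, assuming a <tt>blog</tt> root whose <tt>item</tt> children each carry a <tt>num</tt> attribute, as the entries in my archive do. The sample data and the <tt>split_items</tt> name are made up, and loading the resulting records into a database is deliberately left out.
</p>
<pre class="code" lang="python">
import xml.etree.ElementTree as ET

def split_items(blog):
    """Return {item number: serialized item}, one small document per entry."""
    return {item.get("num"): ET.tostring(item, encoding="unicode")
            for item in blog.findall("item")}

# Build a two-entry stand-in for the archive.
blog = ET.Element("blog")
for num, title in [("a883", "Server-based XPath search"),
                   ("a882", "Databases get a grip on XML")]:
    item = ET.SubElement(blog, "item", num=num)
    ET.SubElement(item, "title").text = title

records = split_items(blog)
print(sorted(records))
</pre>
<p>
Each value is a standalone XML document, so a query only has to touch the records an index points it to rather than the whole archive.
</p>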
<p>
The point of this exercise is to continue to explore and reveal the structural possibilities inherent in simple XHTML/CSS content. It also makes a nice interactive XPath demo. Note that currently if you write an invalid query you just get a general error. I'll try to improve that with more specific feedback.
</p>

</body>
</item> 

<item num="a882">
<title>Databases get a grip on XML</title>
<date>2004/01/08</date>
<body>

<p>
<blockquote cite="InfoWorld">
The next iteration of the SQL standard was supposed to arrive in 2003. But SQL standardization has always been a glacially slow process, so nobody should be surprised that SQL:2003 -- now known as SQL:200n -- isn't ready yet. Even so, 2003 was a year in which XML-oriented data management, one of the areas addressed by the forthcoming standard, showed up on more and more developers' radar screens. [Full story at <a href="http://www.infoworld.com/article/03/12/31/01FEtoydata_1.html">InfoWorld.com</a> (part of <a href="http://www.infoworld.com/reports/01SRtoy04.html">2003 Technology of the Year</a>)]
</blockquote>
Although I thought XML support in databases was a hot 2003 topic, Edd Dumbill felt otherwise:
</p>
<p>
<blockquote cite="Edd Dumbill">
Though there's a reasonable amount of interest in the W3C XML Query language, there's not much to say about XML and databases. It doesn't seem to me that the integration of XML with relational databases has taken off in the way we once thought it might. [<a href="http://usefulinc.com/edd/blog/contents/2004/01/08-xmlconf/read">Edd Dumbill: The changing face of XML</a>]
</blockquote>
</p>
<p>
I may be guilty of a bit of wishful thinking. And yet, when I consider what Oracle, OpenLink Software, Sleepycat, and others are up to, I can't help but feel that we've turned the corner -- though the road ahead is still, admittedly, very long. My <a href="http://weblog.infoworld.com/udell/categories/infoworld/2003/07/30.html#a760">July feature on SQL/XML hybridization</a> spelled out the argument in more detail.
</p>

</body>
</item> 

<item num="a881">
<title>Dynamic languages and enterprise apps</title>
<date>2004/01/07</date>
<body>
<p>
<blockquote cite="InfoWorld">
We hoped 2003 would bring a rapprochement between the dominant enterprise VMs, Java and .Net, and the dynamic-language VMs that are still in many ways well-kept secrets. That mostly didn't happen. At the JavaOne 2003 technical keynote in June there was a nod in the direction of JSR (Java Specification Request) 223, which would enable languages such as PHP to be used in the Java Web tier. But the stewards of the enterprise VMs still aren't pushing to integrate them with the popular and productive dynamic-language VMs.
<br/><br/>
Jython, the Java/Python hybrid, has a growing cult following, but isn't on Sun's radar screen. Microsoft has yet to deliver on its early promises to make dynamic languages first-class citizens of the CLR. Here's hoping that the many VMs that flourished in 2003 will work better together in 2004. [Full story at <a href="http://www.infoworld.com/article/03/12/31/01FEtoydev_1.html?s=feature">InfoWorld.com</a> (part of <a href="http://www.infoworld.com/reports/01SRtoy04.html">2003 Technology of the Year</a>)]
</blockquote>
The ever-quotable Sean McGrath has said, of Jython:
<blockquote cite="Sean McGrath">
<a href="http://www.jython.org">Jython</a>, lest you do not know of it, is the most compelling weapon the Java platform has for its survival into the 21st century. [<a href="http://seanmcgrath.blogspot.com/2003_07_27_seanmcgrath_archive.html#105971971904416520">Sean McGrath</a>]
</blockquote>
Hyperbole? Maybe not. This weekend, I was working with the Java API to Sleepycat's Berkeley DB XML, and it felt like one of those bad dreams in which you're slogging through molasses toward an ever-receding goal. I switched to Jython and quickly got the job done. And it was <i>the same job</i> (indexing and searching content) using the <i>same engine</i> (Berkeley DB XML).
</p>
<p>
Of course the even better solution was native Python bound to DB XML, a combination that is not so easy to materialize. When I finally got that working, things really started to cook. 
</p>
<p>
Somebody asked me yesterday why platform vendors like Microsoft and Sun are never at the forefront of dynamic-language innovation. I don't know why that's so, but it does seem to be true. 
</p>
</body>
</item> 

<item num="a880">
<title>The sigh heard round the world</title>
<date>2004/01/06</date>
<body>

<p>
Edd Dumbill sighed when he read <a href="http://weblog.infoworld.com/udell/2004/01/04.html#a878">my comments</a> on FOAF and social networking:
<blockquote cite="FOAF IRC">
&lt;danbri&gt;http://weblog.infoworld.com/udell/2004/01/04.html#a878<br/>
* edd sighs<br/>
&lt;danbri&gt; its because foaf is associated in ppls minds w/ the 'social networking' sites<br/>
&lt;danbri&gt; and the 'f' in 'foaf' can't help<br/>
</blockquote>
</p>
<p>
I wanted to know more about that sigh, so I wrote to Edd, and heard back from both him and Dan Brickley. Edd wrote:
<blockquote cite="Edd Dumbill">
For me at least FOAF's point is as the personal homepage technology of
the semantic web.  Like we all made homepages back in 1995.  In fact the
links of significance in FOAF are the rdfs:seeAlso, not the foaf:knows
bits: the dumb seeAlso is the parallel to the dumb &lt;a href=&quot;&quot;&gt;. (Except
it turns out we can hang more information on a seeAlso.)
<br/><br/>
I don't agree with your assertion that Google's enough: there are many
circumstances in which that isn't true.  FOAF and other techs are useful
in a lot of scenarios where we can and want to be more precise about
making a link about our involvement with projects and people.
</blockquote>
</p>
<p>
And Dan wrote:
</p>
<blockquote cite="Dan Brickley">
I guess my take is that since this 'social software' bandwagon came
along, commentators have largely lumped FOAF in with it, on the 
assumption (which to be fair the spec doesn't do enough to counter) 
that FOAF is all (and only) about explicitly representing typed relations 
amongst people. You _can_ do that with FOAF, and foaf:knows is an 
intentionally gentle starter in that direction, but you can also 
take things from a softer, less explicit angle too. Hmm I scribbled
about this before somewhere [rummages], ah yep <a href="http://lists.w3.org/Archives/Public/www-archive/2003Nov/0010.html">in reply to Shirky's piece:</a>
<blockquote>
[[SW [Semantic Web] technology, specifically RDF, makes it *possible* to goof up 
in various ways, but it also allows for subtler treatments, which 
is where (hopefully) FOAF is headed through its focus on 
describing the photos, events, collaborations etc that are the 
evidence friendship leaves in the world, rather than crudely 
taxonomising classes of friend.
]]
</blockquote>
FOAF is a playground where we can try out different takes on the 
explicit/articulated vs soft / evidence-based approaches. My own 
bias is towards the latter, but folks who find value in formally 
taxonomising their relationships can exchange their data in RDF/XML 
with FOAF + extensions, and nothing much should break (except perhaps 
a few hearts; not everything that can be written down should be ;)
</blockquote>
<p>
What struck me later about this interaction was its miraculous subtlety. I wrote something that made Edd sigh, I overheard his sigh, and we had a discussion about what provoked it. Now let's look at how this happened. My original comments were posted on this weblog. Edd and Dan may or may not subscribe to my blog, but given their central involvement in FOAF it was virtually certain that the item would come to their attention. Their reaction to it, on the FOAF chat channel, was logged on a public page. I became aware of it when somebody followed the link to my item from that page, which created an entry in my referrer log. A truly remarkable chain of events. This kind of thing happens every day, but I continue to find it astonishing.
</p>
<p>
It seems we all agree on the need to simultaneously explore what Dan calls &quot;explicit/articulated vs. soft / evidence-based approaches.&quot; My ability to &quot;overhear&quot; Edd's sigh is an example of the kind of softness that's already inherent in the systems we're evolving. My ability to search this blog for items in which I quote <a href="http://142.167.72.34:8000/?//blockquote[@cite='Edd%20Dumbill']">Edd</a> and <a href="http://142.167.72.34:8000/?//blockquote[@cite='Dan%20Brickley']">Dan</a> (experimental server-based version, not guaranteed to stay up, but see also the <a href="http://weblog.infoworld.com/udell/gems/blogsearch.html">client-based version</a>) is one way to be more explicit. I'm all for more ways to be explicit, but we've got to weave the stuff into our ordinary and natural activities. And if it's really going to scale, &quot;ordinary and natural&quot; has to mean something very different from what it means to Dan Brickley, Edd Dumbill, and me.
</p>
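<p>
For the curious, the search links above follow a simple pattern: an XPath expression appended to the server's address, with only the space in the name percent-encoded. Here is a minimal sketch of that pattern. The function name is mine, not part of the search service, and the server address is the experimental one, so it may not respond.
</p>

```javascript
// Address of the experimental search server linked in the post (may be offline).
const SEARCH_BASE = "http://142.167.72.34:8000/";

// Hypothetical helper: build a URL that asks for every blockquote
// whose cite attribute matches the given name.
function citeSearchUrl(name) {
  // Only the name is percent-encoded, which mirrors the links in the post.
  // Caveat: a name containing an apostrophe (Tim O'Reilly) would break the
  // single-quoted XPath literal and needs different quoting.
  return SEARCH_BASE + "?//blockquote[@cite='" + encodeURIComponent(name) + "']";
}

console.log(citeSearchUrl("Edd Dumbill"));
console.log(citeSearchUrl("Dan Brickley"));
```

<p>
The point of the sketch is only that the query lives entirely in the URL, so a search like this can be linked, bookmarked, and passed around like any other address.
</p>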

</body>
</item> 

<item num="a879">
<title>Pipelines and monads</title>
<date>2004/01/05</date>
<body>

<p>
More pushback on <a href="http://weblog.infoworld.com/udell/2004/01/01.html#a876">last week's column</a>, this time from Stefano Mazzocchi:
<blockquote cite="Stefano Mazzocchi">
I've been one of the first to see the value of pipelines for XML processing and wrote Cocoon to make it happen, so I think I know a little about XML processing pipelines, but there is something that the people advocating web services miss entirely:
<ul><li>protocol interoperability is hard but can be achieved</li>
<li>data interoperability is harder but some standardization (real or de-facto) creates power-law clusters where things can work because some information can be taken for granted because implicit in that communication context</li>
</ul>
<p>but there is a third piece of the puzzle that almost everybody forgets:</p>
<ul><li>metadata interoperability</li>
</ul>
I think that point-2-point web services will work, as they do today and have been doing for a long while. Which communication protocol  and programming language they use to make it work doesn't matter at all, it's all just marketing. The rest will have a complexity similar to that of the RDF/RDFSchema/OWL stack (and if you look at <a href="http://xml.coverpages.org/bpel4ws.html">BPEL4WS</a> you start to understand what I'm talking about). [<a href="http://www.betaversion.org/~stefano/linotype/news/35/">Stefano's Linotype</a>]
</blockquote>
</p>
<p>
The crux of Stefano's objection, I think, is here:
<blockquote cite="Stefano Mazzocchi">
Marketing, protocol and syntax sugar aside, web services are RPC. 
</blockquote>
I disagree. It's true that Web services got off to a shaky start. At a conference a couple of years ago, a panel of experts solemnly declared that the &quot;Web&quot; in &quot;Web services&quot; was really a misnomer, and that Web services really had nothing to do with the Web. But since then the pendulum has been swinging back, and for good reason. Much to everyone's surprise, including mine, the linked-web-of-documents approach works rather well. Not just one-to-one and one-to-many, but also many-to-many. Adam Bosworth's <a href="http://weblog.infoworld.com/udell/2003/12/22.html#a873">XML 2003 keynote</a> was, for me, the most powerful affirmation yet that Web services can and should leverage the Web's scalable spontaneity. That's the vision firmly planted in my mind when I talk about Web services.
</p>
<p>
Meanwhile, this just in from Christian Morgensen:
<blockquote cite="Christian Morgensen">
<p>
You should check out Microsoft's new command line - after twenty years they're finally upgrading command.com in Longhorn. The Longhorn shell will do command pipelining using XML streams. They're calling it MSH - codenamed &quot;Monad&quot;.
</p>
<p>
It should give developers the power and flexibility of a Unix command line, with less of the awk/sed rewriting glue needed, because the data isn't just text in columns, but is structured and meaningful. The writeups focus on the fact that it hooks into the dotNet runtime, but the clever thing is really the communication mechanism between the objects, which is structured text. The Indigo runtime will short-circuit the serialization/deserialization to actual text, but the basic concept looks sound. It will make things quite interesting when it hits -- there will probably be some culture clash as windows developers learn to work without a GUI.
</p>
<p>
Check out the writeup on the new shell <a href="http://tfl09.blogspot.com/2003_11_01_tfl09_archive.html#106769921834716276">here</a>, and the PDC slides are <a href="http://www.gotdotnet.com/team/PDC/4118/ARC334.ppt">here</a>.
</p>
</blockquote>
Thanks Christian! I had seen that writeup in November, but lost track of it. This is just the sort of approach to harmonizing graphical and command-line interfaces that I've long envisioned.
</p>
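<p>
To make Christian's point about "text in columns" versus structured data concrete, here is a rough sketch. It is plain JavaScript, not Monad syntax, and the sample listing and field names are invented for illustration.
</p>

```javascript
// Invented sample data: three lines of column-formatted text, as a
// traditional command might print them (owner, size in bytes, file name).
const textOutput = [
  "alice   4096  notes.txt",
  "bob    10240  photo.jpg",
  "carol    512  todo.txt",
];

// Text pipeline: each stage has to know the column layout. This is the
// awk/sed-style glue that the structured approach aims to reduce.
const bigFilesFromText = textOutput
  .map((line) => line.trim().split(/\s+/))
  .filter((cols) => Number(cols[1]) >= 1024)
  .map((cols) => cols[2]);

// Structured pipeline: stages exchange records with named fields,
// so no stage needs to re-parse anything.
const records = [
  { owner: "alice", size: 4096, name: "notes.txt" },
  { owner: "bob", size: 10240, name: "photo.jpg" },
  { owner: "carol", size: 512, name: "todo.txt" },
];

const bigFilesFromRecords = records
  .filter((file) => file.size >= 1024)
  .map((file) => file.name);

console.log(bigFilesFromText);
console.log(bigFilesFromRecords);
```

<p>
Both pipelines select the same two files. The difference is that the first breaks if a column moves or a file name contains a space, while the second depends only on field names.
</p>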

</body>
</item> 

<item num="a878">
<title>The Heisenberg uncertainty of social networks</title>
<date>2004/01/04</date>
<body>

<p>
One of my New Year's resolutions is to open up this weblog to a wider audience. So on first mention of something obscurely technical, I'll try to define it. Today's obscurely technical topic: FOAF.
</p>
<dl class="definition">
<dt class="term">FOAF</dt>
<dd>
<p>
The FOAF (friend-of-a-friend) project, according to its <a href="http://www.foaf-project.org/">homepage</a>, &quot;is about creating a Web of machine-readable homepages describing people, the links between them and the things they create and do.&quot; 
</p>
<p>
Here is a trivial but valid FOAF file:
</p>
<pre class="code" lang="rdf">
&lt;?xml version=&quot;1.0&quot; encoding=&quot;utf-8&quot;?&gt;
&lt;rdf:RDF xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;
         xmlns:foaf=&quot;http://xmlns.com/foaf/0.1/&quot;&gt;
&lt;foaf:Person&gt;
&lt;foaf:weblog rdf:resource=&quot;http://weblog.infoworld.com/udell/&quot; /&gt;
&lt;foaf:name&gt;Jon Udell&lt;/foaf:name&gt;
  &lt;foaf:knows&gt;
    &lt;foaf:Person/&gt;
  &lt;/foaf:knows&gt;
&lt;/foaf:Person&gt;
&lt;/rdf:RDF&gt;
</pre>
<p>
My FOAF &lt;Person&gt; record could say more about me: geographic location, interests, projects, and so on. But its main purpose is to list a bunch of other &lt;Person&gt; records -- those of my friends and associates -- and thereby create a web that can be traversed by software.
</p>
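<p>
What "traversed by software" might look like, as a minimal sketch: the object below is an invented in-memory stand-in for a handful of FOAF files, with made-up people, and the function walks the <tt>knows</tt> links breadth-first. Real FOAF data would first have to be fetched and parsed from RDF/XML, which this sketch skips.
</p>

```javascript
// Invented sample data standing in for parsed FOAF files.
// Each key is a made-up identifier; "knows" lists other identifiers.
const people = {
  alice: { name: "Alice Example", knows: ["bob", "carol"] },
  bob: { name: "Bob Example", knows: ["dave"] },
  carol: { name: "Carol Example", knows: ["alice"] },
  // "dave" is mentioned by bob but has no record of his own.
};

// Breadth-first walk over the knows links, starting from one person.
// Returns every identifier reachable from the start, in discovery order.
function reachable(start, graph) {
  const seen = new Set([start]);
  const queue = [start];
  while (queue.length !== 0) {
    const id = queue.shift();
    const person = graph[id];
    if (!person) {
      continue; // mentioned in someone else's record only; nothing to follow
    }
    for (const friend of person.knows) {
      if (!seen.has(friend)) {
        seen.add(friend);
        queue.push(friend);
      }
    }
  }
  return [...seen];
}

console.log(reachable("alice", people));
```

<p>
Note that the walk reaches <tt>dave</tt> even though he publishes nothing himself: being listed in somebody else's record is enough to land in the web.
</p>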
</dd>
</dl>
<p>
I don't currently maintain a FOAF file, for reasons that David Weinberger solidly nails in this posting:
</p>
<blockquote cite="David Weinberger">
It's like thinking that the invitation list for your wedding actually reflects your circle of friends and relatives. No, you had to invite Barry-the-Boozer because he's your cousin and you couldn't invite Marsha because then you'd have to invite her husband Larry-the-Ass-Grabber and her daughter Erin-the-Snot-Flinger.
<br/><br/>
If you want to get at the real social networks, you're going to have to figure them out from the paths that actual feet have worn into the actual social carpet. [<a href="http://www.corante.com/many/archives/2004/01/04/does_social_software_matter.php">David Weinberger: Corante: Many-to-Many</a>]
</blockquote>
Exactly right. In a connected world we have all sorts of ways to measure relationships, but that doesn't mean we can or should try to declare them. Bill de hÓra says why not:
<blockquote cite="Bill de hÓra">
Hand crafted logical ontologies are not sufficient precisely because they want to be certain. They don't drift with your interests over time, they're rigid, they're deterministic, they can only see around so many corners. In short they age badly, and they evolve badly. [<a href="http://www.dehora.net/journal/archives/000333.html">Bill de hÓra</a>]
</blockquote>
<p>
I realized long ago, for example, that maintaining a blogroll by hand was going to be a losing proposition, and switched to a system that simply echoes the list of feeds to which I'm currently subscribed. Commenting on David Weinberger's posting, Julian Bond echoes that idea:
<blockquote cite="Julian Bond">
I think that systems that exploit the other activities of people to build maps of social networks will always be more accurate and have less inherent bullshit than systems where the participants are consciously building networks. This may mean that systems like Spoke and Plaxo that derive metadata from email use have more relevant data than systems like Tribes, Friendster and Linkedin. [<a href="http://www.corante.com/many/archives/2004/01/04/does_social_software_matter.php#1060">Julian Bond</a>]
</blockquote>
</p>
<p>
The impetus for all this recent discussion was, in part, the recent announcement by Six Apart that its TypePad blogging service has been <a href="http://www.sixapart.com/log/2004/01/format_offering.shtml">automatically generating FOAF files</a> from users' blogrolls. The announcement also mentions that a new service, <a href="http://beta.plink.org/">Plink</a>, is available for browsing the newly-enlarged web of &lt;Person&gt; records. As <a href="http://www.blackbeltjones.com/work/mt/archives/000805.html">Matt Jones</a> found out, you don't need to maintain a FOAF file to be Plink'd, you just need to be mentioned in somebody else's FOAF file. 
</p>
<p>
Given the Web and the many agents dedicated to exploring its interconnectedness -- Google, blog searchers and mappers -- this approach seems to me at best redundant. So when LinkedIn asked me to explicitly define my relationship with someone, by choosing from a list of options, I <a href="http://weblog.infoworld.com/udell/2003/12/16.html#a870">declined</a>. A Google query mentioning our two names would have been the best way to define our past, current, and future relationship.
</p>
<p>
I'm sympathetic to the FOAF cause, but I hope for a more general approach. The reality is that every document published to the Web can help to define a relationship -- by linking to, quoting from, or more subtly supporting or refuting another document. Of these actions, linking is the only one that's always unambiguously machine-readable. We can do much better. Six Apart has the right idea in leveraging TypePad's blogroll editor to build FOAF files under the covers. We need to extend that to all of our content production: email, blogs, everything.
</p>
<p>
To that end, I'd like to see <a href="http://www.gotdotnet.com/team/dbox/default.aspx?key=2004-01-04T02:05:29Z">Don Box's Word-to-XHTML project</a> (which may or may not supplant an <a href="http://www.microsoft.com/downloads/details.aspx?familyid=d5dcf263-8e19-4054-b599-70371b6cc2b4&amp;displaylang=en">earlier but not very useful translator posted to MSDN</a>) turn into more than a solo skunkworks effort.
</p>

</body>
</item> 

<item num="a877">
<title>Hacking matter</title>
<date>2004/01/02</date>
<body>

<p>
<a href="http://www.amazon.com/exec/obidos/asin/046504428X"><img alt="Hacking Matter" vspace="4" hspace="4" border="1" align="right" src="http://images.amazon.com/images/P/046504428X.01.MZZZZZZZ.jpg"/></a>
One of my holiday books was Wil McCarthy's <a href="http://www.amazon.com/exec/obidos/asin/046504428X">Hacking Matter</a>, an engaging treatise on the theory and possible uses of man-made atoms. So I was delighted to see that McCarthy will be speaking on the subject at the <a href="http://conferences.oreillynet.com/cs/et2004/view/e_sess/4625">Emerging Technology</a> conference in February.
</p>
<p>
An abbreviated version of the book is available as <a href="http://www.wired.com/wired/archive/9.10/atoms_pr.html">Ultimate Alchemy</a>, a 2001 article in Wired that spells out the relationship between quantum dots and programmable matter. A quantum dot, which can be engineered in a number of ways, is a three-dimensional electron trap.
<blockquote cite="Wil McCarthy">
The electrons trapped in a quantum dot will arrange themselves as though they were part of an atom, even though there's no atomic nucleus for them to surround. Which atom they resemble depends on the number of excess electrons trapped inside. What's more, the electrons in two adjacent quantum dots will interact just as they would in two real atoms placed at the equivalent distance, meaning the two dots can share electrons between them - they can form connections equivalent to chemical bonds. Not virtual or simulated bonds, but real ones. [<a href="http://www.wired.com/wired/archive/9.10/atoms_pr.html">Wil McCarthy: Ultimate Alchemy</a>]
</blockquote>
</p>
<p>
What could you do with materials made of this stuff? Of the zillions of possible applications, the book emphasizes smart houses and smart vehicles that tune themselves in realtime to manage energy more efficiently. Walls that adjust their transmissivity -- from reflective to opaque to transparent -- in response both to the sun and to the needs of the inhabitants. Cars whose bodies dynamically adjust their storage of electrical and kinetic energy. Of course there are infotech- and biotech-related applications too, but as 2003 should have made clear to everyone, we're way overdue for smarter ways to manage energy.
</p>

</body>
</item> 

<item num="a876">
<title>A tale of two cultures</title>
<date>2004/01/01</date>
<body>
<p>
<blockquote cite="InfoWorld">
It's clear that the future of the Unix-style pipeline lies with Web services. When the XML messages flowing through that pipeline are also XML documents that users interact with directly, we'll really start to cook with gas. But a GUI doesn't just present documents, it also enables us to interact with them. From Mozilla's XUL (XML User Interface Language) to Macromedia's Flex to Microsoft's XAML, we're trending toward XML dialects that define those interactions. Where this might lead is not so clear, but the recently published WSRP (Web Services for Remote Portals) specification may provide a clue. WSRP, like the Java portal systems it abstracts, delivers markup fragments that are nominally HTML, but could potentially be XUL, Flex, or XAML. It's scary to think about combinations of these, so I'm praying for convergence. But I like the trend. XML messages in the pipeline, XML documents carrying data to users, XML definitions of application behavior. If we're going to blend the two cultures, this is the right set of ingredients. [Full story at <a href="http://www.infoworld.com/article/03/12/31/01OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
My recent stuff has provoked some diametrically opposed reactions. Responding to this column, Dan Kegel wrote:
<blockquote cite="Dan Kegel">
Jon, you've been drinking too much XML / web services kool-aid. Only clueless analysts and those who wish they could program, but can't, think there's anything novel about &quot;web services&quot;. Anything you can do with XML can be done more simply without it; the standards documents associated with XML and &quot;web services&quot; are absolutely mind-numbing. In the meantime, real programmers are getting real work done, and ignoring the analysts.
</blockquote>
</p>
<p>
Meanwhile, in response to the <a href="http://weblog.infoworld.com/udell/categories/infoworld/2003/12/15.html#a868">previous week's column</a>, the Kalsey Consulting Group takes me to task from the other direction:
<blockquote cite="Kalsey Consulting Group">
Jon Udell says that intranets should abandon Web services like SOAP and REST in favor of screen scraping XHTML. Hogwash. [<a href="http://kalsey.com/2004/01/xhtml_services/">Measure Twice Weblog</a>]
</blockquote>
Given the &quot;two cultures&quot; theme of last week's column, it's probably fitting that my message can be seen in such different ways. And I'll admit that my eclecticism can seem almost perverse. Last month, I was contacted by someone about a nomination to be a Microsoft MVP, and by someone else about joining a Mozilla advisory board. I don't know if either of these things will pan out, but I'd love it if both did. In his <a href="http://safari.oreilly.com/?XmlId=1-56592-537-8/foreword-18">foreword</a> to my 1999 book, Tim O'Reilly wrote something I'll always cherish:
<blockquote cite="Tim O'Reilly">
All too often, people wear their technology affiliations on their sleeve (or perhaps on their T-shirts), much as people did with chariot racing in ancient Rome. Whether you use NT or Linux, whether you program in Perl or Java or Visual Basic -- these are marks of difference and the basis for suspicion. Jon stands above this fragmented world like a giant. He has only one software religion: what works.
</blockquote>
I can't think of a better theme for the new year: keep focusing on what works.
</p>
</body>
</item> 

<item num="a875">
<title>The wardriver and the cop</title>
<date>2003/12/29</date>
<body>

<p>
<a href="http://doc.weblogs.com/2003/12/16">Doc Searls</a> often writes about how his modus operandi for acquiring Internet access while traveling is to cruise residential neighborhoods running <a href="http://www.macstumbler.com/">MacStumbler</a>, which finds wireless access points and speaks to you in different voices depending on whether the AP has WEP turned on or off. Jeremy Zawodny recently <a href="http://jeremy.zawodny.com/blog/archives/001255.html">wrote about</a> his wardriving adventures in and around Toledo, Ohio, while visiting his family for the holiday. So it was probably inevitable that I would find myself parked in front of the junior high school in a small town in Michigan a few nights ago, explaining my odd behavior to a local cop.
</p>
<p>
The last time we visited my wife's family there, several years ago, schools were the only source of broadband access. Since two of my sisters-in-law are teachers, I was able to visit their classrooms and jack in. This time, I found a dozen access points in the immediate vicinity of my in-laws' house. None, predictably, was WEP-enabled. The strongest signal came from the WaveLAN at the junior high, so I parked there to synch mail and RSS.
</p>
<p>
It was a surreal experience to have available, in such a situation, all the essential tools of my professional life: a cellphone; high-speed Internet access; even videoconferencing if I'd needed it. And to be honest, it wasn't an entirely comfortable experience. So I half expected the flashing blue lights that came up behind me. Out-of-state plates, motor running, headlights off, and a blue glow in the cockpit: I presented a very odd picture indeed.
</p>
<p>
After he ran a check on my license, the cop was really nice about the whole thing. I wondered if he'd tell me to leave, but he didn't; he only asked me to park closer to the curb. We talked about how the school might want to lock down its AP. As it happens his wife works for the school -- it is a <i>small</i> town -- so I guess that message was delivered.
</p>
<p>
I suppose this scene has played out differently in other places. After all, the script hasn't been written yet. Few small-town cops would guess that the driver of a vehicle suspiciously parked outside the junior high would be, of all things, checking his email. No policy exists for this situation. I imagine my cop wondered later, as I did, what such a policy might be.
</p>
<p>
Is it conceivable that a small town might designate a well-known AP for public access, including drive-through use? The public library would be the obvious candidate. (Ideally its public-access AP would be isolated from the library's internal network.) During regular hours, visitors could bring laptops inside, but after hours they could park outside in a well-lit and easily-monitored area. Then there'd be no incentive to cruise schools and residential neighborhoods scanning for APs. 
</p>
<p>
A strange concept, admittedly. But I can think of stranger things. One is wardriving. Another would be policies that might emerge to prevent it. 
</p>

</body>
</item> 

<item num="a874">
<title>The social life of XML</title>
<date>2003/12/25</date>
<body>
<p>
<blockquote cite="Jon Udell">
<img src="http://udell.roninhouse.com/xml2003/devcon2001.jpg" vspace="6" hspace="6" align="right"/>
I recently found a picture of the panelists at the XML DevCon 2001
session entitled &quot;The Importance of XML.&quot; My body language told the
story: I wasn't a happy camper. Of course I agreed with all the reasons
the panel thought XML was important: for web services, for interprocess
communication, and for business process automation. But I also thought
XML was important for a whole different set of reasons that weren't on
the conference's agenda. I thought XML was important for end-user
applications, for human communication, and for personal productivity. I
believed then, and I believe more strongly today, that it's a bad idea
to separate those two ways of using XML. [<a href="http://www.xml.com/pub/a/2003/12/23/udell.html">XML.com</a>]
</blockquote>
This is an edited-down version of the talk I gave at XML 2003. It omits
the XPath-search-in-the-browser demonstrations, which readers of my
O'Reilly Network column have already seen.
</p>
<p>
The other day, Don Box wondered -- in reference to <a href="http://weblog.infoworld.com/udell/2003/12/22.html">last week's InfoWorld column</a> -- about the notion <a href="http://www.gotdotnet.com/team/dbox/default.aspx?key=2003-12-23T02:25:41Z">
&quot;that there are somehow two classes of XML - documents and something else&quot;</a>.
As should be clear from the text of my keynote talk, it's hard for me,
personally, to make a distinction between documents and databases. But
the river of XML is fed by two tributaries -- people who came from
publishing, and people who came from IT-driven data management -- and
that's the duality Don may be picking up on.
</p>
<p>
In fact, there's another kind of duality on my radar screen at the moment. Dare Obasanjo <a href="http://www.25hoursaday.com/weblog/PermaLink.aspx?guid=c5e678a4-7e74-4f13-bb68-57da7e3b4f30">reported</a>
that in our conversation at XML 2003 I said I had expected WinFS would
turn out to be an XML store, not a CLR store. That's absolutely true,
and I continue to believe it would be the right way to ensure that
information created by users of future Windows systems will have the
right kinds of social opportunities.
</p>

</body>
</item> 

<item num="a873">
<title>XML for the rest of us</title>
<date>2003/12/22</date>
<body>

<p>
<a target="_new" href="http://weblog.infoworld.com/udell/gems/bosworth_02.mov"><img vspace="6" hspace="6" align="right" alt="adam bosworth" src="http://weblog.infoworld.com/udell/gems/bosworth_02.gif"/></a>
<blockquote cite="InfoWorld">
&quot;The relational database is designed to serve up rows and columns,&quot; said BEA's Adam Bosworth in his keynote talk. &quot;But our model of the world is documents. It's 'Tell me everything I want to know about this person or this clinical trial.' And those things are not flat, they're complex. Now we have the way to get not only the hospital records and prescriptions but also the doctor's write-ups.&quot;
<br/><br/>
The doctors and bankers will get that, just as the highway patrolmen already do. XML documents, flowing through XML plumbing, can now deliver very real and tangible benefits. For the publishing geeks who started it all, it's a moment to savor. [Full story at <a href="http://www.infoworld.com/article/03/12/19/50OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
By the way, Adam Bosworth said a great many other interesting things in his XML 2003 talk. For those of you not inclined to <a target="_new" href="http://weblog.infoworld.com/udell/gems/bosworth_02.mov">watch this QuickTime clip</a> -- and in particular for the search crawlers -- I would like to enter the following quote into the public record.
</p>
<blockquote cite="Adam Bosworth">
<p>
The reason people get scared of queries is that it's hard to say 'You can send me this kind of query, but not that kind of query.' And therefore it's  hard to have control, and people end up building other systems. It's not clear that you always want query. Sometimes people can't handle arbitrary statements. But we <i>never</i> have queries. I don't have a way to walk up to Salesforce and Siebel and say tell me everything I know about the customer -- in the same way. I don't even have a way to say tell me everything about the customers who meet the following criteria. I don't have a way to walk up to Amazon and Barnes and Noble and in a consistent way say 'Find me all the books reviewed by this person,' or even, 'Find me the reviews for this book.' I can do that for both, but not in the same way. We don't have an information model. We don't have a query model. And for that, if you remember the dream we started with, we should be ashamed.
</p>
<p>
I think we can fix this. I think we can take us back to a world that's a simple world. I think we can go back to a world where there are just XML messages flowing back and forth between...resources. I think we can do this by understanding that there are certain amounts of the Web services standard that quite honestly should just be ignored for a while, because it's not clear that we'll ever need them, and it's certainly clear that we don't need them now. These are the complex multi-part coordination standards that describe how you can hop, walk, and quack like a duck all at the same time, when all we want to know is how to get from point A to point B.
</p>
</blockquote>
<p>
Three things jump out at me from that passage. First, the emphasis on XML query. My instincts have been leading me in that direction for a while now, and much of my own R&amp;D in 2003 was driven by a realization that XPath is now a ubiquitous technology with huge untapped potential. Now, of course, XQuery is coming on like a freight train. At XML 2003 I got to meet two of the authors of <a href="http://safari.oreilly.com/0321180607">XQuery from the Experts</a> -- Michael Rys and Jonathan Robie. The humble XPath examples I demonstrated in my own talk barely scratch the surface of what's now possible, but Robie -- a co-creator of XQuery's predecessor, <a href="http://www.almaden.ibm.com/cs/people/chamberlin/quilt.html">Quilt</a>, and an editor of both the XQuery and XPath specs -- told me he thought that was fine. As with other media, we agreed, it's necessary to immerse ourselves in data, play with it, and discover its possibilities. Simple forms of play that yield immediate gratification lay the foundation for more advanced games.
</p>
<p>
The second notable point was Bosworth's use of the term &quot;resources,&quot; which carried extra weight for those who have followed his <a href="http://www.adambosworth.net/archives/000017.html">public meditations on REST</a>.
</p>
<p>
The third point was of course the controversial stance on complex coordination  languages -- BPEL4WS and friends, though he didn't name them. Clearly <a href="http://www.collaxa.com/news.blog.html">Collaxa's Edwin Khodabakchian</a> would take issue with that point. It would be great to get both of them on a panel in 2004 to hash this out. Of course since both are bloggers, maybe a more loosely-coupled conversation can happen instead, or in addition.
</p>


</body>
</item> 

<item num="a872">
<title>Rich Persaud's AV clipping service</title>
<date>2003/12/19</date>
<body>

<p>
Rich Persaud <a href="http://dotpeople.com/archives/000024.html">noticed</a> my frustration with link-addressable AV content and has adapted his <a href="http://autometa.com/RPXP/">RPXP</a> tool for <a href="http://autometa.com/RPXP/web/">through-the-web use</a>. So for example, <a href="http://rpxp.com/?realplayer/clip/video/start/55:30/stop/61:55/stream/rtsp://cyber.law.harvard.edu/BloggerCon%202003/BloggerCon%20Day%202%20-%20Aggregators.rm">this URL</a> links to the five-minute RealVideo clip from BloggerCon that I quoted back in October. 
</p>
<p>
Cool! Of course, these are still treacherous waters. For example, success with that previous URL varies according to platform (Windows, Mac) and browser (Mozilla, IE, Safari). Additional complications arise because QuickTime and Real want to fight over the SMIL MIME type.
</p>
<p>
Here's more pain. I thought I'd illustrate the Windows Media feature by clipping from the <a href="http://msdn.microsoft.com/msdntv/episode.aspx?xml=episodes/en/20031218XAMLDB/manifest.xml">Don Box / Chris Anderson XAML Christmas special</a> on MSDNTV. To do that, I needed the URL of the stream. Good luck digging it out of MSDNTV's page, though. I persevered and found it by inspecting HTTP headers, so already 99.9% of the population gets left behind. Then I used Rich's service to form <a href="http://rpxp.com/?winmedia/clip/video/start/2:30/stop/3:30/stream/http://msdn.microsoft.com/msdntv/episodes/en/20031218xamldb/ChrisA-DonB02_300.asx">this URL</a>, which I hoped would clip a minute from the stream. But no, the stream URL buried in this presentation isn't an mms://...asf, but rather an http://...asx, which I'll bet Rich's service could handle but currently seems not to.
</p>
<p>
Note too that since the line between streaming and progressive downloading has become quite blurry of late, it's possible to confuse yourself and others. For example, having once watched the Windows Media video that Graham Glass <a href="http://radio.weblogs.com/0109134/2003/12/19.html">mentioned</a> on his weblog today, I can use <a href="http://rpxp.com/?winmedia/clip/audio/start/2:00/stop/2:30/stream/http://pc.watch.impress.co.jp/docs/2003/1218/sony_06.wmv">this link</a> to clip just thirty seconds, from two minutes in, which is probably as much of Sony's dancing robots as a person can handle first thing in the morning. But although that clip appears to work for me, it's only because the requested clip is cached. It won't work for you until the movie is also cached on your side.
</p>
<p>
Anyway, Rich is clearly on the right track here and deserves thanks and encouragement. Unless somebody beats me to it (hint, hint), I'll whip up a bit of JavaScript so that you just need to plug in your platform (WinMedia/Real/QuickTime) and your start/stop times in order to generate a clipping-service URL. Then we're cooking with gas -- modulo the platform nightmare, that is. 
</p>
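<p>
A minimal sketch of what that bit of JavaScript might look like, assuming the URL pattern visible in the links above is all the service needs. The function name <tt>clipUrl</tt> and its parameters are made up for illustration, and only the <tt>realplayer</tt> and <tt>winmedia</tt> keywords appear in those links, so any other player keyword would be a guess.
</p>

```javascript
// Sketch only: assembles a clipping-service URL in the pattern seen in the
// links above. clipUrl and its parameter names are hypothetical.
//   player    - "realplayer" or "winmedia" (the only keywords seen so far)
//   media     - "video" or "audio"
//   start     - clip start, e.g. "2:00"
//   stop      - clip end, e.g. "2:30"
//   streamUrl - the URL of the underlying stream
function clipUrl(player, media, start, stop, streamUrl) {
  return 'http://rpxp.com/?' + player +
    '/clip/' + media +
    '/start/' + start +
    '/stop/' + stop +
    '/stream/' + streamUrl;
}

// Rebuilds the dancing-robots link from the pieces.
console.log(clipUrl('winmedia', 'audio', '2:00', '2:30',
  'http://pc.watch.impress.co.jp/docs/2003/1218/sony_06.wmv'));
```

<p>
Hooking this up to a form with a platform picker and two time fields is the easy part. Nothing here checks that the stream URL is one the service can actually handle.
</p>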
<p>
Coincidentally, I had a conversation with Macromedia yesterday about the just-announced <a href="http://www.macromedia.com/macromedia/proom/pr/2003/vitalstream.html">Flash video streaming service</a>, which embeds the <a href="http://www.macromedia.com/software/flashcom/">Flash Communication Server</a> in a content distribution network. The current pitch seems geared toward using video for brand marketing (see the <a href="http://www.macromedia.com/software/flash/flashpro/video/gallery/">new gallery</a>), but painless hassle-free AV would certainly also be a huge enabler for the more spontaneous kinds of citation and collaboration that I've been trying to achieve.
</p>

</body>
</item> 

<item num="a871">
<title>Cygwin sshd</title>
<date>2003/12/17</date>
<body>

<p>
<span class="minireview">Cygwin openssh</span>
Today I needed to set up an openssh server on a Windows box. Why? In this case, for two reasons. I wanted to use scp to ship files securely to the box. And I wanted to be able to tweak some configuration files remotely. 
</p>
<p>
There are a bunch of options for getting a Win32 sshd going. They include: build from source; use a standalone binary package; go with the openssh that's part of the <a href="http://www.cygwin.com/">Cygwin</a> system. I went with Cygwin, because its Win32 setup program and package installer have, in recent years, become extremely powerful, flexible, and easy to use.
</p>
<p>
I grabbed the default kit plus the openssh package, and then followed the instructions <a href="http://tech.erdelynet.com/cygwin-sshd.html">here</a>. As smooth as this stuff has gotten, there's always still some kind of glitch, almost invariably permissions-related. And sure enough, the sshd service wouldn't start. I rechecked the instructions and found the culprit:
<blockquote>
<tt>chown system:system /var/log/sshd.log /var/empty /etc/ssh_h*</tt>
</blockquote>
There were two options. Either let sshd log in as SYSTEM, or change ownership on those files to sshd_server, the account used by cygwin sshd. I did the latter. 
</p>
<p>
It's amazing how these kinds of permissions glitches are so common -- on all platforms -- and yet so hard to pin down and untangle. Google showed me that a bunch of other people had run into the snag I encountered. Recommendations included using verbose NTFS auditing, or the <a href="http://www.sysinternals.com/ntw2k/source/filemon.shtml">Filemon</a> utility, to debug the problem. Fair enough, but when you are in installation mode, why can't your OS -- any OS -- be smarter about correlating failed permissions with the software you just installed?
</p>
<p>
Anyway, that's not Cygwin's fault. It's a great resource that just keeps on getting better. 
</p>

</body>
</item> 

<item num="a870">
<title>The LinkedIn dilemma</title>
<date>2003/12/16</date>
<body>

<p>
Here's what stopped me from writing an endorsement for somebody on <a href="https://www.linkedin.com/">LinkedIn</a> today: the requirement to define our relationship as one of these choices:
<blockquote>
<select name="relationship" id="relationship_create_endorsement">
<option value="" selected="">Choose...</option>
<option value="E">You managed R. directly</option>
<option value="W">You were senior to R., but did not manage directly</option>
<option value="R">You reported directly to R.</option>
<option value="X">R. was senior to you, but you did not report directly</option>
<option value="S">You worked with R. in the same group</option>
<option value="D">You worked with R. in different groups</option>
<option value="C">You worked with R. but were at different companies</option>
<option value="L">You were a client of R.'s</option>
<option value="J">R. was a client of yours</option>
</select>
</blockquote>
The same kind of thing stopped me from joining the identity-badge party at the Digital ID conference recently. I'm bugged by forms that invite or require me to specify the unspecifiable. Particularly when Google already knows the subtle truth of the matter. For example, the signup for <a href="http://www.ntag.com/">nTag</a> asked me to state my interests. But I already do that all the time. Everything I write is a statement of my interest in something. Should it be my job to fit those interests into the Procrustean bed of somebody else's form? 
</p>
<p>
Ditto for LinkedIn. The sum of my relationship with &quot;R.&quot; is: 1) he wrote some cool software that I tried and wrote about, and 2) we had an exchange, more recently, in the comments area of a website. And guess what? When I google for &quot;R.'s&quot; last name and mine, the first two hits correspond exactly to those two points. If there were a freeform input box, I'd have simply entered the query.
</p>
<p>
Now clearly I lead a much more public life than most, and I create a much more complete document trail for Google to follow. But is that a difference in degree, or a difference in kind? I suspect the former. And if that's true, then I'm skeptical as to the benefit of a parochial reputation system such as LinkedIn, which requires extra effort to join, to feed with metadata, and to use. If we have (or are rapidly evolving) a global reputation system that can absorb and contextualize our routine communication, then parochial systems will need to deliver huge amounts of extra value.
</p>
<p>
I'm well aware that not everybody can or should spill vast quantities of words onto the Web. On the other hand, I think it will make less and less sense to operate in stealth mode. Most &quot;knowledge workers&quot; will want to plug into the public conversation at certain points -- to promote our activities, to discuss and collaborate, to seek information. So you drop a stone in the pond, and ripples go out, and reflections ripple back. Do we need more than this, really? 
</p>
<p>
I understand the impulse to codify social protocols in software. I'm not at all sure we can do it in ways that preserve the necessary fluidity and fuzziness. But there's VC money in them thar hills, so I guess we're going to do the experiment and find out.
</p>

</body>
</item> 

<item num="a869">
<title>Sun's identity pitch</title>
<date>2003/12/16</date>
<body>

<p>
<a href="http://weblog.infoworld.com/udell/gems/schwartz_01.ram"><img align="right" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/schwartz_01.jpg"/></a>
At SunNetwork 2003 back in September, Jonathan Schwartz made the case that the Java card is the most strategic piece of Sun's whole technology stack. Actually, I'd say per-employee pricing is the real strategic innovation. But I've always hoped to see movement on the identity card front, so <a href="http://weblog.infoworld.com/udell/gems/schwartz_01.ram">this clip</a>, in which Schwartz stresses something I've been harping on for years, got my attention:
<blockquote cite="Jonathan Schwartz">
Java card support will be built into the desktop that we offer. It is the fundamental way we will help people to understand that if there were a menu item in your mail app that said, 'Show only mail from people that have been strongly authenticated,' then spam would disappear. 'Show me only content that has been strongly authenticated,' viruses would disappear.
</blockquote>
</p>
<p>
I'm with you, Jonathan. Now as a longtime advocate of this view, I've gotten plenty of useful pushback. And it's true, there are problems. PCs don't come with card readers. It's unclear how the governments and banks and airlines and other entities who currently issue cards will evolve the identity infrastructures this solution implies, how those infrastructures will cooperate, and how revocation can be managed in a scalable way.
</p>
<p>
That said, I worry less nowadays about card-reader deployment. Maybe because I figure that we'll just authenticate to our phones, and let them talk Bluetooth to PCs and other devices.
</p>
<p>
I also worry less about how we'll relate identity cards (or devices, like phones) to identity infrastructures. Look at how ordinary credit cards are now used at airline kiosks. There's no multifactor authentication involved in printing your boarding pass. But multifactor authentication is part of the larger system. Your government-issued biometric, aka driver's license with photo, will also be checked. It's all a question of context.
</p>
<p>
I'm not even too worried about how we handle revocation, now that I've seen what <a href="http://www.infoworld.com/article/03/09/26/38OPstrategic_1.html">Corestreet</a> has in mind.
</p>
<p>
All in all, I'm fairly optimistic about the scenario Schwartz paints. The whole talk, by the way, is <a href="rtsp://webcast-east.sun.com/archives/GSN-1133/day1schwartz_300.rm">here</a>. It lays out the new server and client strategies. I do wonder how all this adds up to a &quot;Java system.&quot; There are roles for J2EE, J2SE, and J2ME. But the server suite is based on Solaris, with Java APIs that you might rather generalize as Web services APIs. The desktop is based on Linux/GNOME/Mozilla/StarOffice, and while there is indeed a Java client software renaissance underway, it looks to me as though IBM (with Eclipse and SWT) is more of an instigator there than Sun. But the cards, and more importantly the phones, that part I get. So maybe J2ME really is Sun's ticket.
</p>

</body>
</item> 

<item num="a868">
<title>Mining the intranet</title>
<date>2003/12/15</date>
<body>

<p>
<blockquote cite="InfoWorld">
Of course sites such as Amazon and Google have reasons to create formal APIs and gate access to them. But on an enterprise intranet the threat is disuse, not overuse. You're publishing information that you want people to find, exploit, and recombine. When it's appropriate to use SOAP and WSDL -- for example, when queries require fancy authorization or complex inputs -- then do so. But when a simpler strategy will suffice, don't be ashamed to use it. Between the primordial tag soup of HTML and the formal realm of Web services lies a large and fertile middle ground: XHTML. Information that you publish in XHTML can be directly consumed by browsers, and it's much friendlier to spiders than ill-formed HTML. If you hope people will mine your intranet, make the job as easy as it can be. [Full story at <a href="http://www.infoworld.com/article/03/12/12/49OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
I sometimes worry that I harp too much on these kinds of simple home truths. But Mike Champion's <a href="http://weblogs.java.net/pub/wlg/806">review</a> of my XML 2003 keynote was a nice bit of validation:
</p>
<p>
<blockquote cite="Mike Champion">
Jon Udell gave a keynote speech on Tuesday that pierced the jaded, slightly cynical shell I've acquired after about 8 years in the XML world.  He didn't talk about &quot;maybe someday...&quot; or &quot;if only ...&quot;, he showed what a little imagination can do with the widely deployed XHTML, CSS, and XPath technologies today. 
<br/><br/>
...
<br/><br/>
So why did this pierce my cynical shell? Most would agree that we need more metadata on the Web for it to live up to its full potential -- that's the very premise of the Semantic Web effort in which Tim Berners-Lee has invested much of the W3C's resources (and credibility). On the other hand, the historical difficulty of getting real people to put metadata in their content is believed by many to doom such efforts to failure. (Cory Doctorow's <a href="http://www.well.com/~doctorow/metacrap.htm">essay</a> is the most colorful and cogent, if widely reviled, statement of this position). Udell's insight is that we can leverage the technology we have, salted by human vanity, to get usable metadata without technological breakthroughs or unrealistic demands on humans.
</blockquote>
</p>
<p>
As Dorothea Salo <a href="http://weblog.infoworld.com/udell/2003/12/11.html#a866">recently pointed out</a>, this isn't only my insight. I'm just one of the people who keep on noticing, and drawing attention to, ways we can make more out of what we already have. 
</p>

</body>
</item> 

<item num="a867">
<title>Mobile webcasting redux</title>
<date>2003/12/12</date>
<body>

<p>
The XML 2003 conference was the first I've attended with an iSight camera, and with a plan to use it. Part one of the plan was to try bouncing a live stream off my home server, just as a test, but I was too busy to try that. Part two was to use video quotes in the blog entries I posted. This worked fairly well, though in the future I'll want a more time-efficient tool for capturing clips than QuickTime Pro. 
</p>
<p>
As an aside, I had a bit of fun the morning of my keynote. At 7:30AM I looked out my hotel window and saw the lights coming on at Citizens Bank across the street. Watching the bankers boot up their PCs, I got to thinking about how XML is probably transforming their infrastructure, but not yet their desktops. So I took a two-minute movie and made it my first slide. Later, while demonstrating an XPath search of my slideset, my canned query -- which had originally been intended to find a reference to Apple's Knowledge Navigator video on one of the slides -- also (of course) found my opening clip. Which immediately began to play, right there in the dynamically-generated search results on the current slide. OK, OK, I'm easily amused. But it was an unexpectedly cool bit of behavior.
</p>
<p>
With regard to my ongoing media-deep-linking issue, I haven't seen evidence of either streaming or downloadable content at the conference. So, there's nothing to deep-link into. However I did see <a href="http://www.emediacommunications.biz/blog/archives/000041.html">this entry</a> from Larry Bouthillier, with pointers to some documentation that's been extremely hard for me to find. So, for future reference, here's an incantation for Windows Media:
<blockquote cite="Larry Bouthillier">
<pre class="code" lang="xml">
&lt;!-- This is an .asx file with starttime, duration and title elements --&gt; 
&lt;asx version=&quot;3.0&quot;&gt;  
&lt;title&gt;My Video Title&lt;/title&gt;  
  &lt;entry&gt;  
  &lt;ref href=&quot;mms://myserver.com/path_to_movie/myfile.wma&quot; /&gt;  
  &lt;starttime value=&quot;00:05:00&quot; /&gt;  
  &lt;duration value=&quot;00:03:00&quot; /&gt; 
  &lt;/entry&gt; 
&lt;/asx&gt;
</pre>
[<a href="http://www.streamingmedia.com/r/printerfriendly.asp?id=8483">streamingmedia.com : business - technology - content</a>]
</blockquote>
</p>
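<p>
Note that the .asx incantation wants a start time and a duration, not a start and a stop. A small sketch of the arithmetic, assuming the hh:mm:ss format shown in the example is what the player expects. The helper names (<tt>toSeconds</tt>, <tt>toClock</tt>, <tt>asxTimes</tt>) are invented for illustration.
</p>

```javascript
// Sketch only: converts start and stop offsets ("m:ss" or "h:mm:ss") into
// the hh:mm:ss strings used for starttime and duration in the .asx example.
// All three function names are hypothetical.
function toSeconds(clock) {
  // "1:02:03" becomes 3723, "5:00" becomes 300
  return clock.split(':').reduce(function (total, part) {
    return total * 60 + parseInt(part, 10);
  }, 0);
}

function toClock(seconds) {
  // Left-pad single digits with a zero.
  function pad(n) { return (n > 9 ? '' : '0') + n; }
  var h = Math.floor(seconds / 3600);
  var m = Math.floor((seconds % 3600) / 60);
  var s = seconds % 60;
  return pad(h) + ':' + pad(m) + ':' + pad(s);
}

function asxTimes(start, stop) {
  var begin = toSeconds(start);
  var end = toSeconds(stop);
  if (begin > end) {
    throw new Error('stop must not precede start');
  }
  return { starttime: toClock(begin), duration: toClock(end - begin) };
}

console.log(asxTimes('5:00', '8:00'));
```

<p>
For a clip that runs from 5:00 to 8:00, this yields a starttime of 00:05:00 and a duration of 00:03:00, the same values as in the example above.
</p>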
<p>
<a href="http://www.streamingmedia.com/r/printerfriendly.asp?id=8489">Elsewhere</a>, Larry notes with regard to QuickTime Media Links:
<blockquote cite="Larry Bouthillier">
Notably absent are the <font face="Courier New">starttime</font> and <font face="Courier New">endtime</font> options, a disappointing omission. A full list of the options and values supported in the .qtl file's embed tag is available from <a target="new" href="http://developer.apple.com/documentation/QuickTime/REF/whatsnewqt5/Max.2c.htm#pgfId=93766">Apple's QuickTime Media Link Documentation</a>
</blockquote>
</p>
<p>
Part three of the experiment was to use the iSight for personal notetaking. This was incredibly useful. At one point, for example, I interviewed IBM's Richard Thompson about WSRP, so I turned on the camera and let it take notes for me. Ditto for a couple of sessions. I'll probably wipe most of this stuff next week, but it's nice to know it's available in case I want to check something. Of course, were random-access streams just available to conference-goers, that'd be even better.
</p>
<p>
Suppose I had streamed sessions live, by bouncing them off my home server? It's not clear that <a href="http://www.idealliance.org/">IDEAlliance</a> would be thrilled by that. And what should I (or can I) do with the 100MB QuickTime movie of Adam Bosworth's keynote that's parked here on my disk? I'm sure the two-minute clip I posted would be considered fair use. I doubt that the whole 45-minute movie would be. Interesting times!
</p>

</body>
</item> 

<item num="a866">
<title>Gender and style</title>
<date>2003/12/11</date>
<body>

<p>
I rarely quote another blog's entry in its entirety, but this one needed to appear whole. It's from Dorothea Salo, reacting to Edd Dumbill's report, on XML.com, about something I said in my Tuesday keynote:
</p>
<blockquote cite="Dorothea Salo">
<p>I am a peasant. I am only a peasant. I know this.</p> <p>But, damn, it's irritating to see things I've known from experience for five years get trotted out like the greatest new thing ever.</p> <p>Yeah, yeah, I'm just jealous because I'm not <a href="http://www.xmlconference.org/">in Philadelphia</a>. (The Philly conference everybody is being all nostalgic about? In '99? I was at that one. My first professional conference ever. Still got your conference proceedings? I helped typeset that.)</p> <p>But <a href="http://www.xml.com/pub/a/2003/12/09/xml2003plenary.html">this caused me</a> to emit such an uproar that my husband ran in to see what was wrong with me:</p> <blockquote><p>The other problem in preserving context, aside from the tools, is of course persuading people to create metadata in the first place. Udell suggested that a way of doing this might be through using style as a back door. Many people are willing to spend a long time on getting the look of a document right, but not be willing to spend that time on metadata creation. Udell suggested that by providing metadata-significant styles, authoring tools creators could encourage more preservation of context in communication through the carrot of creating beautiful documents.</p></blockquote> <p>Well. Um. Okay. Oh, to hell with it --</p> <p><strong style="font-size: xx-large; font-weight: bold;">GREAT BIG EFFING DUH!!!!!!!</strong></p> <p>If they can't see it, they won't do it. I've known this since the last Philly conference (even <a href="http://www.yarinareth.net/caveatlector/archive/week_2003_06_29.html#e001885">mentioned it</a> here), and I am only a peasant. Why'd it take the eggheads so long to figure it out? Gee, maybe because they spurn peasants right back into the mud that spawned them?</p> <p>Eh, well. At least now that Jon Udell has said it, somebody will pay attention to it.</p> <p>I'm hesitant to turn this into a gender issue, but honestly, I do wonder. 
How many eggheads are male, and how many peasants female? And how much of the disdain for WYSIWYG (and the until-now utter failure to figure out that visual distinctiveness of text ranges is an authoring/editing aid) comes from its association with those ditzy blonde secretaries? You know, women?</p> <p>Bah. Grow up, you guys. Start talking to some peasants. If this is the state of the art in the field, y'all could learn a lot from a publishing-production peasant.</p>  [<a href="http://www.yarinareth.net/caveatlector/archive/week_2003_12_07.html#e002429">Caveat Lector: Decembri 07, 2003 - Decembri 13, 2003 Archives</a>]
</blockquote>
<p>
Yikes! Where to start? Well, I guess <a href="http://safari.oreilly.com/1565925378/ch09-6415">here</a>:
<blockquote cite="Jon Udell">
Even though the book's source is &quot;only&quot; HTML/CSS, it is also XML, structured so that it's easy to pick out chapter headings, listings, and figures. This way of combining HTML, CSS, and XML is a transitional strategy. I hope it won't be needed once browsers that render XML directly (subject to CSS or XSL styles) have become widespread and standard, along with tools that help us write XML. But even in &quot;Internet time&quot; these developments sometimes take longer than we'd like. [<a href="">Practical Internet Groupware, 9.2.4, XML and HTML can fruitfully coexist</a>]
</blockquote>
That's from my 1999 book. CSS as a style/structure bridge is a strategy that I, too, have been using for five years. I'm way less accomplished on the style side of the equation than Dorothea, though perhaps more accomplished on the structure side. Every now and then, I notice that the idea never has caught on, that my prediction about Internet time was depressingly accurate, and that even at this late date it's worth mentioning again. It feels awkward to do so, frankly, because it is such old news. And yet, sure enough, there were folks who came up to me after the talk and said &quot;Great idea!&quot; 
</p>
<p>
Now to the larger point. The gender issue has been percolating around in blogspace for a while now. One entry that particularly stuck with me is this one from the Longhorn PDC:
<blockquote cite="Rory Blyth">
Everything seems pretty normal and nice, save for one thing: Where's all the women? For reals, y'all. I feel like I'm at a Microsoft monastery here. I think I've seen about 2.5 females, and they were part of the hired help. It's like they're an endangered species. [<a href="http://neopoleon.com/blog/posts/1303.aspx">Rory Blyth</a>]
</blockquote>
It reminded me of the Cairo/Win95 PDC ten years ago in Anaheim. Microsoft rented Disneyland for the event. Imagine Disneyland at night, just me and approximately 5000 post-adolescent males wandering around the spookily-lit attractions. It was spectacularly weird.
</p>
<p>
I've been giving this matter some serious thought. For example, if I collect the names of all the people I've quoted in the last few hundred entries, the maleness of the list fits the well-known pattern.
</p>
<p class="realsmall">
Aaron Cohen, Adam Curry, Alf Eaton, Allie Rogers, Andy Clark, Annrai O'Toole, Benjamin J. J. Voight, Bernard Teo, Bill Gates, Bill de hÓra, Bob Clary, Bob DuCharme, Brendan Eich, Brian Marick, Chad Dickerson, Charles Petzold, Chris Anderson, Chris Brumme, Chuck Myers, Claus Dahl, Craig Franklin, Dan Brickley, Dan Bricklin, Dan Gaters, Danny Ayers, Dare Obasanjo, Dave Megginson, Dave Winer, Don Box, Dorothea Salo, Doug Glenn, Douwe Osinga, Edward Tenner, Edwin Khodabakchian, Evan Williams, Gavin Weightman, Gerald Bauer, Glenn Vanderburg, Gordon Weakliem, Hal Roberts, Hiawatha Bray, Ian Hixie, J. Scott Anderson, James Farmer, Jay Rosen, Jeff Angus, Jemaleddin Cole, Jenny Levine, Jesse James Garrett, Jim Mooney, Jim O'Halloran, Joe Hewitt, John Markoff, Jon Udell, Karl Best, Karsten Self, Ken Manheimer, Kevin Werbach, Kimbro Staken, Kingsley Idehen, Kirk Holbrook, Larry O'Brien, Len Bullard, Les Orchard, Matt Griffith, Micah Alpern, Michael Kinsley, Mike Deem, Mitch Kapor, Nancy McGough, Ned Batchelder, PJ Connolly, Paul Everitt, Paul Graham, Paul Philp, Pete Cole, Peter Wayner, Phil Wainewright, Philip Brittan, Ralph Loader, Ray Kurzweil, Ray Ozzie, Rob Howard, Robert Ivanc, Robert L. Vaessen, Robert Scoble, Russell Beattie, Sam Ruby, Samuel Pepys, Sandeepan Banerjee, Scott Reynen, Sean McGrath, Stefano Mazzocchi, Steve Crocker, Steve Lawrence, Sue Spielman, Ted Leung, Ted Neward, Tiernan Ray, Tim Bray, Tim Oren, Tom Yager, Tonico Strasser, Trace Reed, Ward Cunningham
</p>
<p>
Were it not for my recent foray into the world of libraries, the female names would be even fewer, and there are already precious few of them. This can't be good.
</p>
<p>
As it happens, the XML conference I just left was more balanced than most -- which isn't saying much. Still, I did get to meet <a href="http://www.textuality.com/Lauren.html">Lauren Wood</a>, <a href="http://today.java.net/pub/au/63">Eve Maler</a>, and Sharon Adler (see the <a href="http://www.w3.org/TR/xsl/">XSL 1.0 spec</a>). Sharon, who works for IBM Research, was particularly interested to follow up with me on the main theme of the talk, which was how and why we ought to use documents to contextualize human relationships. Which, by the way, is not something I present as a stunning flash of egghead brilliance, but simply as an overlooked home truth. Now, is the fact that Sharon finds this theme to be important -- and underappreciated at IBM Research -- related to the fact that Sharon is female? Seems very likely to me. 
</p>
<p>
So, for the record, I think this gender issue we all keep tiptoeing around is quite real, and affects technological choices and strategies far more deeply than many of us XY types would dare imagine.
</p>

</body>
</item> 

<item num="a865">
<title>Adam Bosworth on navigating the linked web of data</title>
<date>2003/12/10</date>
<body>

<p>
<a href="http://udell.roninhouse.com/movies/bosworth_01.mov"><img align="right" vspace="6" hspace="6" src="http://udell.roninhouse.com/movies/bosworth_01.jpg"/></a>
Here's <a href="http://udell.roninhouse.com/movies/bosworth_01.mov">a brief clip</a> from Adam Bosworth's terrific keynote, in which he talks about the synchronizing data browser that he's been dropping hints about on his weblog, and in which he also pokes some friendly fun at Jean Paoli's French accent. 
</p>
<p>
It's always interesting to see which memes emerge from conferences. I'm delighted that the ideas in the air at this one resonate very strongly with the theme of my own keynote: how and why to make the ways we humans search, navigate, and interact with linked sets of documents a central aspect of the evolving fabric of XML services. Dave Megginson's &quot;strange creations,&quot; Adobe's demonstration yesterday of its forthcoming designer (which I think of as &quot;InfoPath for digital paper&quot;), Mike Champion's level-headed assessment of how Amazon and eBay are exploiting plain XML-over-HTTP, the WSRP interop session I'll attend later, and of course Adam's talk -- all in all I'm feeling pretty good about where things are headed.
</p>

</body>
</item> 

<item num="a864">
<title>Mike Champion on Web services reference architecture</title>
<date>2003/12/10</date>
<body>


<p>
Mike Champion is Software AG's representative to the W3C's <a href="http://www.w3.org/2002/ws/arch/">Web Services Architecture working group</a>. I'm in his talk right now on how &quot;the disparate pieces of the Web services technology space - messaging, description, choreography, security, management, etc. - fit together in a reference architecture.&quot; In <a target="_new" href="http://udell.roninhouse.com/movies/champion.mov">this clip</a>, Mike talks about four overall impressions. First, that it's been harder than expected to arrive at an understanding of basic terms and concepts. Second, that the relationship between the services Web and the plain old Web has proved more controversial than anyone would have guessed. Third, that while the semantic Web is regarded by many as &quot;pie in the sky,&quot; the need to broker some kind of semantic understanding across business processes is front and center. Fourth, that hammering out consensus in an environment where vendors hotly contest their interests has been...challenging.
</p>
<p>
He concludes that the group has not, as yet, made satisfactory progress toward defining a &quot;canonical stack&quot; -- such efforts are &quot;somewhere between challenging and hopeless. The acronym SOAP, for example, is no longer an acronym for anything.&quot; But there are two reference architectures that correspond to different expansions of SOAP. The original, Simple Object Access Protocol, implies distributed objects made easy for programmers. The other, Service Oriented Architecture Protocol, is more abstract, not as amenable to automated tools, but &quot;is a more powerful idea. RESTful applications are special cases of that architecture.&quot; I like how he sums this up: REST and RPC aren't (or shouldn't be) ideologies, they're tools in an engineering toolkit to be used as and when appropriate. Agreed.
</p>
<p>
As it reaches the end of its chartered lifetime, the group has no plan to re-charter in its current form. It acknowledges, he says, that the real focus of Web services standardization now lies with OASIS and the WS-I. 
</p>
<p><b>Update:</b>
Note to self: using the EMBED tag is a bad idea. It gets confusing when multiple video streams start playing in the aggregator, or on the Web page. 
</p>

</body>
</item> 

<item num="a863">
<title>Dave Megginson's strange creations</title>
<date>2003/12/09</date>
<body>

<p>
According to <a href="http://www.bestkungfu.com/archive/?id=324">this report</a>, my talk at XML 2003 this morning was &quot;trippier than expected.&quot; I like that! However, mine couldn't have been the trippiest. That honor goes to <a href="http://www.xmlconference.org/xmlusa/2003/bios_mr.asp#megginson">Dave Megginson</a> for his talk entitled &quot;Strange Creations: Prototyping XML Data on the Desktop.&quot; With tongue firmly in cheek, he explored a variety of experimental ways to view and interact with XML data. The one that brought the house down worked like a text-based adventure game. &quot;You are in a dark room called PubXS document. You can go north, south, east, or west.&quot;
</p>
<p>
<a target="_new" href="http://udell.roninhouse.com/movies/megginson.mov"><img align="right" vspace="6" hspace="6" src="http://udell.roninhouse.com/movies/megginson.jpg"/></a>
Here's a <a target="_new" href="http://udell.roninhouse.com/movies/megginson.mov">QuickTime clip</a> for your enjoyment. I wasn't in position to get a clear shot of the screen, but hopefully the spirited narrative will carry the day.
</p>
<p>
Although it's not an interface most of us would choose, the adventure game really does work on real data, and it scores points for discoverability. Anyone could use it to navigate -- albeit slowly -- through a collection of linked XML documents. 
</p>
<p>
After showing a tree-control-oriented navigator, Dave concluded with a table-oriented viewer that absorbs structure and flattens it for viewing and navigation. He thinks this approach may be optimal in many cases -- which is good news for Excel 2003, which does a great job of pulling XML data into spreadsheets for viewing and navigation.
</p>
<p>
Here's the quote I want Google to find:
<blockquote cite="Dave Megginson">
I want to see a lot of machine-processable and linked data online, because that's what wins in the end. It doesn't matter if it's RDF or topic maps or PubXS [ed: the system he was demonstrating]. 
</blockquote>
Amen. As Dave pointed out, the formats that tend to succeed are those optimized more for humans than for machines -- he mentioned HTML and RSS as examples. An adventure game is a fanciful way of making a serious point: people are the creators of that linked and machine-processable data, and it has to be fun, easy, and rewarding to create it.
</p>

</body>
</item> 

<item num="a862">
<title>Giving back to open source</title>
<date>2003/12/08</date>
<body>

<p>
<blockquote cite="InfoWorld">
Jonathan Bollers, vice president and chief engineer at Science Applications International Corp. (SAIC), says that SAIC forks open source projects for in-house development &quot;almost without exception.&quot; The problem is that although there is often a desire to give back, it's &quot;a tedious process fraught with more heartache than benefits.&quot; The bureaucratic hurdles include security considerations, export controls, and a host of other issues that Bollers sums up as &quot;releasability remediation.&quot; [Full story at <a href="http://www.infoworld.com/article/03/12/05/48OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
</p>
<p>
Jonathan Bollers proposes that defense contracts might be structured to make such remediation economical for contractors, and suggests that benefits would flow to private-sector entrepreneurs. I think it's a great idea.
</p>
<p>
Of the many readers who reacted to the <a href="http://www.infoworld.com/article/03/10/24/42OPstrategic_1.html">earlier column</a> on the open-source give-back dilemma, Bollers was the only one willing to go on record. A similar thing has happened when I've written about Microsoft recently: a lot of folks agree with what I'm saying, but most prefer to remain anonymous. It's funny how both OSS and MS, each in their own ways, raise political issues that people want to talk about but are scared to talk about.
</p>

</body>
</item> 

<item num="a861">
<title>Point/counterpoint: Web services for collaboration</title>
<date>2003/12/08</date>
<body>

<p>
<blockquote cite="PJ Connolly">
P.J.: Despite what some may think, I'm about as platform-neutral as they come. But here's the problem: There's still no agreement on how presence shall be presented as a Web service. On one side are the proponents of XMPP (Extensible Messaging and Presence Protocol), an XML-based outgrowth of the Jabber project, which doesn't seem to be supported by anyone bigger than Novell. On the other, I see IBM and Microsoft agreeing on something for the first time since OS/2 1.0 was released: that SIP (Session Initiation Protocol)/SIMPLE (SIP Implementation for Messaging and Presence Leverage Enhancements) is the way to go. So, I'm curious, Jon: What side are you on?
</blockquote>
<blockquote cite="Jon Udell">
Jon: Both, for different reasons, but it doesn't matter for the purposes of this discussion. I know several developers who are using Jabber as a SOAP transport, and I'm told that the new breed of SIP-oriented IP PBXs offers SOAP interfaces. It's not a question of whether Web services will turbocharge the next generation of collaboration, but how. And there are two big answers. First, Web services will provide a general means of access to the messaging substrates. Second, Web services will help us unify metadata (message headers, aka context) and content (message bodies, aka documents) under a common data-management discipline: XML. [Full story at <a href="http://www.infoworld.com/article/03/12/05/48FEcollabpcp_1.html">InfoWorld.com</a>]
</blockquote>
</p>
<p>
This was my first appearance in the InfoWorld Point/Counterpoint series. I was looking for an opening to deliver Dan Aykroyd's immortal line: <a href="http://members.fortunecity.com/wavjunky/swl-j/jane.wav">PJ, you ignorant slut</a>. But the opportunity never arose. Until now :-)
</p>

</body>
</item> 

<item num="a860">
<title>Searching along the path of least resistance</title>
<date>2003/12/05</date>
<body>

<p>
<img hspace="6" vspace="6" align="right" alt="mycroft" src="http://weblog.infoworld.com/udell/gems/mycroft.gif"/>
Yesterday's experiment reminded me that I've been meaning to spend more time with search engines other than Google. So I visited the <a href="http://mycroft.mozdev.org/top30.html">Mycroft download page</a> and picked up Mozilla plugins for AllTheWeb and Teoma. It's funny how arbitrary factors can influence your behavior. Mozilla's search box orders the plugins alphabetically. AllTheWeb is thus on the path of least resistance, and I've been using it often. Of course, it's really impressive.
</p>
<p>
It would be interesting to add an AllTheWeb box to my template, as a complement to the Google box. If somebody's already done that for Radio, let me know, otherwise I'll do it myself when I can.
</p>
<p>
<script type="text/javascript" src="http://weblog.infoworld.com/udell/gems/mycroft.js"/>
I haven't gotten around to registering a plugin for InfoWorld, or for this blog, but I'm parking these here so I can find them later, and you're welcome to use them too if you wish. 
</p>
<p>
InfoWorld, Mycroft plugin: <a href="javascript:addEngine('infoworld', 'gif', 'Tech')">install</a>
</p>
<p>
Jon's Radio, Mycroft plugin: <a href="javascript:addEngine('jonblog', 'gif', 'Tech')">install</a>
</p>

</body>
</item> 

<item num="a859">
<title>Measuring web mindshare</title>
<date>2003/12/04</date>
<body>

<p>
<a href="http://www.oreilly.com/catalog/spiderhks/"><img align="right" hspace="6" vspace="6" alt="spidering hacks" src="http://weblog.infoworld.com/udell/gems/spiderHacks.jpg"/></a>
My old web mindshare calculator has been <a href="http://www.oreillynet.com/pub/a/javascript/excerpt/spiderhacks_chap01/index1.html">updated</a> for the <a href="http://www.oreilly.com/catalog/spiderhks/">Spidering Hacks</a> book. The original version from 1999 used AltaVista to measure what I called the web mindshare -- that is, the number of indexed inbound links -- for a collection of sites in a Yahoo! category. The new version is updated to use Google. Cool! That project was one of the first things that really got me thinking about what Web services would inevitably become. Here's how I described it in my book:
<blockquote cite="Jon Udell">
In effect, every web site is a scriptable component, and the Web as a whole is a vast library of such components. You can invoke these individually from any scripting language that can issue HTTP requests and interpret the responses.
<br/><br/>
What's more, you can join components to achieve novel effects. For example, I've used Yahoo! and AltaVista in combination to measure the &quot;mindshare&quot; of web sites in specific categories. To do that, I wrote a Perl script that uses Yahoo!'s namespace API to unroll the subdirectories under a node of the Yahoo! directory tree, yielding a consolidated list of URLs belonging to some category, such as /Science/Nanotechnology/. Then the script feeds that list of URLs, one at a time, to AltaVista, using its CGI API to ask, for each site, how many other pages in the AltaVista index refer to that site. The ranked list of these citation counts measures what I call the web mindshare of the sites.
<br/><br/>
Yahoo! wasn't designed to produce an unrolled list of sites in a category, but its web API can be made to do it. Likewise, AltaVista wasn't designed to count references to each of the sites in such a list, but its web API can be made to do it. These two macrocomponents, driven remotely by a 100-line Perl script (see http://www.byte.com/features/1999/03/udellmindshare.html), can be joined to create a new application that measures web mindshare. [<a href="http://secure.safaribooksonline.com/1565925378/ch08-4998">Practical Internet Groupware, Chapter 8, Organizing Search Results</a>]
</blockquote>
</p>
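<p>
The heart of that pipeline is small enough to sketch. This is a minimal illustration in JavaScript rather than the original Perl: the names <tt>rankByMindshare</tt> and <tt>lookup</tt> are invented for the example, the lookup function is a stand-in for a real search-engine query so the sketch runs offline, and the three sample counts are borrowed from the Google figures reported in this post.
<pre class="code javascript">
// Rank a list of sites by inbound-link count, highest first.
// countInboundLinks is a caller-supplied function (hypothetical here):
// given a URL, it returns how many indexed pages link to that site.
function rankByMindshare(urls, countInboundLinks) {
  return urls
    .map(function (url) {
      return { url: url, links: countInboundLinks(url) };
    })
    .sort(function (a, b) {
      return b.links - a.links;
    });
}

// Stand-in for a real search-engine query, so the sketch runs offline.
var sampleCounts = {
  'http://www.byte.com/': 22200,
  'http://www.eweek.com/': 13700,
  'http://www.computerworld.com/': 21500
};
function lookup(url) {
  return sampleCounts[url] || 0;
}

// Print one line per site: the count, a tab, then the URL.
var report = rankByMindshare(Object.keys(sampleCounts), lookup);
report.forEach(function (row) {
  console.log(row.links + '\t' + row.url);
});
</pre>
Swapping in a different lookup function is all it should take to point the same ranking at another search engine.
</p>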
<p>
So I tried out the new version, and here are the top 15 of the 45 sites I got back:
</p>
<table cellpadding="2" cellspacing="2">
<tr><td align="right">28200</td><td> http://www.computeractive.co.uk/</td></tr>
<tr><td align="right">22200</td><td> http://www.byte.com/</td></tr>
<tr><td align="right">21500</td><td> http://www.computerworld.com/</td></tr>
<tr><td align="right">13700</td><td> http://www.eweek.com/</td></tr>
<tr><td align="right">12300</td><td> http://www.cbronline.com/</td></tr>
<tr><td align="right">7110</td><td> http://www.ugeek.com</td></tr>
<tr><td align="right">6480</td><td> http://www.chip-online.com/</td></tr>
<tr><td align="right">5790</td><td> http://www.cmpnet.com</td></tr>
<tr><td align="right">4610</td><td> http://www.fcw.com/</td></tr>
<tr><td align="right">2410</td><td> http://www.digitmag.co.uk/</td></tr>
<tr><td align="right">2040</td><td> http://www.currents.net/</td></tr>
<tr><td align="right">1950</td><td> http://www.advisor.com/</td></tr>
<tr><td align="right">1400</td><td> http://www.esj.com/</td></tr>
<tr><td align="right">1370</td><td> http://www.onmagazine.com/</td></tr>
<tr><td align="right">1100</td><td> http://www.techworthy.com/</td></tr>
</table>
<p>
Except that didn't seem right. It wasn't just that InfoWorld didn't show up anywhere in the top 45. The last time I ran my version, in Jan 2001, it <a href="http://udell.roninhouse.com/mindshare-report.html">found almost 500 sites</a> in the category. Then I saw why. The new version doesn't unroll the subcategories like the old one did. So I dusted that one off, ran it, and sure enough there were all the usual suspects, plus some I don't remember from the last time I tried this. And what a parade of names! <a href="http://www.guuui.com/">GUUUI</a>, The Interaction Designer's Coffee Break. <a href="http://www.phonelosers.org/">Phone Losers of America</a>. <a href="http://www.thinplanet.com/">Thin Planet</a>, Serving the Thin Client Industry. <a href="http://www.juiced.gs/">Juiced.GS</a>, The Magazine for Apple IIgs Users. Who knew? 
 </p>
<p>
Then I got to wondering about how different search engines would rank the same set of sites. So I repeated the experiment with Google and AllTheWeb. Here's the top layer of results:
</p>
<table cellpadding="8" cellspacing="0">
<tr><td align="right">AltaVista</td><td align="right">Google</td><td align="right">AllTheWeb</td></tr>
<tr>
<td valign="top">
<table cellpadding="2" cellspacing="0">
<tr><td align="right"><font size="-1"><a href="http://www.vnunet.com/">vnunet.com</a></font></td><td align="right">494316</td><td align="right"><b>1</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.internet.com/">internet.com</a></font></td><td align="right">416244</td><td align="right"><b>2</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.cmpnet.com">CMPnet</a></font></td><td align="right">183093</td><td align="right"><b>3</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.techrepublic.com/">TechRepublic</a></font></td><td align="right">118343</td><td align="right"><b>4</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.informationweek.com/">Information Week</a></font></td><td align="right">96498</td><td align="right"><b>5</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.infoworld.com/">InfoWorld</a></font></td><td align="right">90923</td><td align="right"><b>6</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.nwfusion.com/">Network World Fusion</a></font></td><td align="right">79022</td><td align="right"><b>7</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.apacheweek.com/">Apache Week</a></font></td><td align="right">74731</td><td align="right"><b>8</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.networkcomputing.com/">Network Computing</a></font></td><td align="right">73787</td><td align="right"><b>9</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.internetweek.com/">InternetWeek</a></font></td><td align="right">72833</td><td align="right"><b>10</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.computeractive.co.uk/">Computeractive Online</a></font></td><td align="right">70949</td><td align="right"><b>11</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.computerworld.com/">Computerworld</a></font></td><td align="right">68756</td><td align="right"><b>12</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.pcworld.com/">PC World</a></font></td><td align="right">67778</td><td align="right"><b>13</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.networkmagazine.com/">Network Magazine</a></font></td><td align="right">56451</td><td align="right"><b>14</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.ecommercetimes.com">E-Commerce Times</a></font></td><td align="right">53707</td><td align="right"><b>15</b></td></tr>
</table>
</td>
<td valign="top">
<table cellpadding="2" cellspacing="0">
<tr><td align="right"><font size="-1"><a href="http://www.internet.com/">internet.com</a></font></td><td align="right">257000</td><td align="right"><b>1</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.techrepublic.com/">TechRepublic</a></font></td><td align="right">194000</td><td align="right"><b>2</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.datamation.com/">Datamation</a></font></td><td align="right">36600</td><td align="right"><b>3</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.vnunet.com/">vnunet.com</a></font></td><td align="right">29000</td><td align="right"><b>4</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.computeractive.co.uk/">Computeractive Online</a></font></td><td align="right">28200</td><td align="right"><b>5</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.informationweek.com/">Information Week</a></font></td><td align="right">25700</td><td align="right"><b>6</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.networkcomputing.com/">Network Computing</a></font></td><td align="right">25600</td><td align="right"><b>7</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.byte.com/">Byte.com</a></font></td><td align="right">22200</td><td align="right"><b>8</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.computerworld.com/">Computerworld</a></font></td><td align="right">21500</td><td align="right"><b>9</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.internetweek.com/">InternetWeek</a></font></td><td align="right">20900</td><td align="right"><b>10</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.infoworld.com/">InfoWorld</a></font></td><td align="right">19800</td><td align="right"><b>11</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.computergaming.com/">Computer Gaming World</a></font></td><td align="right">19000</td><td align="right"><b>12</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://msdn.microsoft.com/msdnmag/">MSDN Magazine</a></font></td><td align="right">18300</td><td align="right"><b>13</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.sdmagazine.com/">Software Development Online</a></font></td><td align="right">17500</td><td align="right"><b>14</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.cuj.com/">C/C++ Users Journal</a></font></td><td align="right">16800</td><td align="right"><b>15</b></td></tr>
</table>
</td>
<td valign="top">
<table cellpadding="2" cellspacing="0">
<tr><td align="right"><font size="-1"><a href="http://www.internet.com/">internet.com</a></font></td><td align="right">1252762</td><td align="right"><b>1</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.cmpnet.com">CMPnet</a></font></td><td align="right">363708</td><td align="right"><b>2</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.pcworld.com/">PC World</a></font></td><td align="right">338478</td><td align="right"><b>3</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.informationweek.com/">Information Week</a></font></td><td align="right">290698</td><td align="right"><b>4</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.infoworld.com/">InfoWorld</a></font></td><td align="right">289626</td><td align="right"><b>5</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.networkcomputing.com/">Network Computing</a></font></td><td align="right">263746</td><td align="right"><b>6</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.pcmag.com/">PC Magazine</a></font></td><td align="right">251954</td><td align="right"><b>7</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.internetweek.com/">InternetWeek</a></font></td><td align="right">244718</td><td align="right"><b>8</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.business2.com/">Business 2.0</a></font></td><td align="right">241808</td><td align="right"><b>9</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.eweek.com/">eWeek</a></font></td><td align="right">218071</td><td align="right"><b>10</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.computerworld.com/">Computerworld</a></font></td><td align="right">216081</td><td align="right"><b>11</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.egmmag.com/">Electronic Gaming Monthly</a></font></td><td align="right">194335</td><td align="right"><b>12</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.computergaming.com/">Computer Gaming World</a></font></td><td align="right">188707</td><td align="right"><b>13</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.playstationmagazine.com/">Official U.S. PlayStation Magazine</a></font></td><td align="right">185146</td><td align="right"><b>14</b></td></tr>
<tr><td align="right"><font size="-1"><a href="http://www.vnunet.com/">vnunet.com</a></font></td><td align="right">184610</td><td align="right"><b>15</b></td></tr>
</table>
</td></tr>
</table>
<p>
You can see the complete results <a href="http://weblog.infoworld.com/udell/misc/mindshare2003.html">here</a>, in four columns: AltaVista (as of Jan 2001), plus AltaVista, Google, and AllTheWeb as of today. This massive report not only gives your browser's table formatter a good workout, it raises interesting questions. The results differ pretty dramatically, in both magnitudes and rankings. I won't even pretend I can analyze why, though maybe insiders at the search engine companies could. Meanwhile, the best strategy might be to throw all the results into a pot and stir them up into a merged ranking. Everything's a service that can support <a href="http://www.infoworld.com/article/03/01/03/030106apapps_1.html">recombinant growth</a>. We're so fixated on Google nowadays, we tend to overlook the possibility of yoking multiple services together. It was easy to do that four years ago, and it's just as easy today.
</p>
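<p>
One simple way to stir the pot is a rank-sum merge: add up each site's position in every list, and sort by the total. That method is my assumption about what a merged ranking could look like, not something the report does, and the function name <tt>mergeRankings</tt> is invented for the example. The sketch uses only the seven sites that appear in all three top-15 tables above, since a site missing from a list has no known position there.
<pre class="code javascript">
// Rank positions copied from the three top-15 tables above,
// in the order [AltaVista, Google, AllTheWeb].
var ranks = {
  'internet.com': [2, 1, 1],
  'vnunet.com': [1, 4, 15],
  'Information Week': [5, 6, 4],
  'InfoWorld': [6, 11, 5],
  'Network Computing': [9, 7, 6],
  'InternetWeek': [10, 10, 8],
  'Computerworld': [12, 9, 11]
};

// Merge by summing each site's rank positions; a lower total is better.
function mergeRankings(ranksBySite) {
  return Object.keys(ranksBySite)
    .map(function (site) {
      var total = ranksBySite[site].reduce(function (sum, rank) {
        return sum + rank;
      }, 0);
      return { site: site, total: total };
    })
    .sort(function (a, b) {
      return a.total - b.total;
    });
}

var merged = mergeRankings(ranks);
merged.forEach(function (row) {
  console.log(row.total + '\t' + row.site);
});
</pre>
Summing positions rather than raw counts sidesteps the fact that the three engines report wildly different magnitudes for the same site.
</p>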

</body>
</item> 

<item num="a858">
<title>Putting services into buckets is a hopeless exercise</title>
<date>2003/12/03</date>
<body>

<p>
<a href="http://weblog.infoworld.com/udell/gems/werbachFCC.ram"><img align="right" vspace="6" hspace="6" alt="fcc voip hearing" src="http://weblog.infoworld.com/udell/gems/werbachFCC.jpg"/></a>
<blockquote cite="Kevin Werbach">
So the question being asked today is where does VoIP fit? Is it in the unregulated data portion of the circle, or the regulated telephony part of the circle? I'd like to suggest that's the wrong question to ask.
<br/><br/>
...
<br/><br/>
In this world, putting services into buckets is a hopeless exercise. The critical issues concern interconnection among networks and openness of platforms. [<a href="rtsp://video.c-span.org/fdrive/15days/e120103_fcc.rm">Kevin Werbach, presentation to FCC VoIP hearing</a>]
</blockquote>
Kevin <a href="http://werbach.com/blog/2003/12/02.html#a1330">points out</a> that the stream I couldn't catch live the other day is <a href="rtsp://video.c-span.org/fdrive/15days/e120103_fcc.rm">archived</a> at C-SPAN. His own presentation, which I <a href="http://weblog.infoworld.com/udell/gems/werbachFCC.ram">quote here</a>, was excellent.
</p>
<p>
I don't happen to have four hours to spare right now, so I can't watch the whole thing. It'd be great if the blogging community got into the habit of quoting and linking to highlights when events like this are made available as archived streams. That way I could make better use of the content in the limited time I have available to spend with it.
</p>
<p>
<b>Update</b>: Case in point: David Isenberg <a href="http://www.isen.com/blog/archives/2003_12_01_archive.html#107046695018197191">notes</a> that Michael Powell's published prepared remarks differ from his actual remarks. And Jeff Pulver <a href="http://192.246.69.231/jeff/personal/archives/000281.html">points out</a> that Reuters quoted from <i>his</i> prepared remarks, posted on his blog, not from what he actually said. Fascinating.
</p>

</body>
</item> 

<item num="a857">
<title>DevPartner 7.1</title>
<date>2003/12/03</date>
<body>

<p>
<a href="http://weblog.infoworld.com/udell/gems/devpartner.JPG"><img border="1" align="right" vspace="6" hspace="6" alt="devpartner" src="http://weblog.infoworld.com/udell/gems/devpartner.gif"/></a>
<blockquote cite="InfoWorld">
Compuware's DevPartner suite of debugging and analysis tools has a long and illustrious history. The first incarnation of its runtime error detection module, BoundsChecker, was released in 1989 by NuMega Technologies, a company that Compuware acquired in 1997. You might think that BoundsChecker's ability to detect assignments to null pointers (among other sins) would be a historical relic in the brave new world of .NET managed code. Not so. The transition to managed code will probably take a decade, during which time Windows programmers will be struggling with the complexities of a hybrid managed/unmanaged environment -- both in the Windows OS itself, and in the componentized applications and services they layer on top of it. Instrumenting these very different programming environments, so that developers can analyze, profile, and more effectively debug programs straddling the unmanaged and managed worlds, is a big challenge that Compuware's latest offering, DevPartner Studio 7.1, tackles fearlessly. [Full story at <a href="http://www.infoworld.com/article/03/11/26/47TCware_1.html">InfoWorld.com</a>]
</blockquote>
The point about the long transition to managed code is one of the things that prompted the <a href="http://weblog.infoworld.com/udell/2003/11/18.html#a849">Lizard brain surgery</a> column. When I talked with the DevPartner folks a year ago, they were feeling bullish about a rapid migration to .NET. When I talked to them more recently, things had settled into a much more gradual pattern, which should surprise no one.
</p>
<p>
At a conference recently, The Hartford's James McGovern made a compelling point. He's not worried about creating new apps and services; we've got wizards that crank them out like nobody's business. What keeps McGovern up nights is the difficulty of ever actually retiring any code. 
</p>
<p>
This review also brings to mind a wonderful piece written by Sue Spielman, called <a href="http://weblogs.java.net/pub/wlg/532">Dear John...er...I mean Debugger</a>. She writes:
<blockquote cite="Sue Spielman">
The time has come to reevaluate the time we spend together. We've spent hours and hours frolicking at breakpoints, contemplating the meaning of the stack, and chatting into the wee hours of the morning. We've danced, stepped into, and stepped over who knows how many methods and lines of code. As I look back, there is no development tool that could ever take your place in my heart. However, it seems over the last year or two we are spending less and less time with each other. How should I tell you this? My time is now spent with my test cases. 
</blockquote>
I don't think the handwriting is on the wall, yet, for conventional debuggers. But as things move in that direction, I expect tool suites like Compuware's DevPartner will adapt. The technologies of instrumentation and analysis are potent, and are useful in many different ways.
</p>


</body>
</item> 

<item num="a856">
<title>Web services and natural-born cyborgs</title>
<date>2003/12/02</date>
<body>

<p>
<blockquote cite="InfoWorld">
&quot;While a business process is running,&quot; an IBM white paper on BPEL4WS dryly notes, &quot;it might be necessary to undo one of the steps that have already been successfully completed.&quot; Translation: Things can get screwed up, and then they need to be fixed. If there's anything revolutionary about Web services, it's the notion that we'll be able to deal with the inevitable screwups in more realistic and more effective ways. Compensation can't simply mean what it does to a programmer: chaining back through a nested series of automatic exception handlers. We have to accept that it's often people who both throw and handle the exceptions, and we have to build software systems that gracefully include them. [Full story at <a href="http://www.infoworld.com/article/03/11/26/47FEwsdesk_1.html">InfoWorld.com</a>]
</blockquote>
This short article, part of InfoWorld's <a href="http://www.infoworld.com/reports/47SRwebservices.html">special report</a> on Web services, touches on some things I'll say more about in my talk at <a href="http://www.xmlconference.org/xmlusa/">XML 2003</a> next week. 
</p>
<p>
<a href="http://allconsuming.net/item.cgi?isbn=0195148665"><img align="right" vspace="6" hspace="6" alt="natural-born cyborgs" src="http://weblog.infoworld.com/udell/gems/cyborg.jpg"/></a>
One of the books I read over the holiday, Andy Clark's <a href="http://allconsuming.net/item.cgi?isbn=0195148665">Natural-Born Cyborgs: Minds, Technologies, and the Future of Human Intelligence</a>, has helped me to think about what I want to say in that talk. Parts of the book went by in a blur, echoing themes I've read about in too many other books in this genre: agents, swarming, virtual reality, wearable computers. But the central theme really grabbed me, and I'll summarize it like this. When we say, jokingly, that Google and our weblogs have become extensions of our brains, it's not really a joke. Ever since the advent of language, and especially since the advent of print, our minds have operated in a hybrid mode, depending on a complex interaction between information processes running inside the skull and information processes running outside it. This is now our natural state, argues Clark, and it is not really meaningful or useful to try to precisely define the inside/outside boundary. He writes:
<blockquote cite="Andy Clark">
The goal is to provide rich environments in which to <i>grow</i> better brains. The more seriously we take the notion of brain-environment engagement as crucial, the less sense it makes to wonder about the relative <i>size</i> of each of the two contributions. What really matters is the complex reciprocal dance in which the brain tailors its activity to a technological and sociocultural environment, which -- in concert with other brains -- it simultaneously alters and amends. Human intelligence owes just about everything to this looping process of mutual accommodation.
</blockquote>
</p>
<p>
And elsewhere:
<blockquote cite="Andy Clark">
The biological organism is just one part of the chameleon circuitry of thought and reason, much of which now runs and flows outside the head and through our social, technological, and cultural scaffoldings.
</blockquote>
</p>
<p>
So what's all this got to do with XML? If you buy the notion that we are projecting ourselves into networked information systems, then we can't only focus on how processes and data interact in these increasingly XML-based systems. The quality and transparency of our direct interaction with XML processes and data -- and with one another as mediated by those processes and data -- has to be a central concern too.
</p>

</body>
</item> 

<item num="a855">
<title>Link-addressable streams</title>
<date>2003/12/01</date>
<body>
<p>
<blockquote cite="InfoWorld">
To broadcast a stream, you point QuickTime Broadcaster at DSS, export an SDP (session description protocol) file from QuickTime Broadcaster, and place it in the streaming server's Movies directory. To view the stream, launch QuickTime Player and load a URL like rtsp://dss_host/broadcast.sdp. Once I got this working I moved the server to a second DSL circuit in my lab. I configured the firewall to allow TCP ports 80 and 544, and UDP ports 5432 and 5434. And amazingly, it all worked. From my TiBook on one Internet-connected private LAN, I was streaming video to a server on a different Internet-connected private LAN. That server's broadcast was available -- at an eight-second delay -- to any QuickTime Player on any platform anywhere on the Internet. What's more, the TiBook could be sending the stream from any Wi-Fi-equipped meeting or conference, anywhere on the Internet. [<a href="http://www.infoworld.com/article/03/11/26/47OPstrategic_1.html">InfoWorld: Mobile Webcasting: November 26, 2003: Jon Udell</a>]
</blockquote>
After this column was done, I repeated the experiment using Real's Helix server. I tried the <a href="https://www.helixcommunity.org/">open source version 10.1</a> first, but couldn't quite get it to work. (Version 10.1 just came out a few weeks ago, and online chatter suggests others ran into the same problem I did.) So for now, I'm running the <a href="http://www.realnetworks.com/products/server/">commercial version</a> on a 30-day eval. Apart from live broadcasting, I wanted to explore an idea I've been really interested in lately: quoting from streams by linking into them with URLs that include start/stop times.
</p>
<p>
The Helix server is handy for this experiment because it supports Real, QuickTime, and Windows media players. So I converted the Knowledge Navigator video into each of those streaming formats, and started hunting around for URL syntaxes. My search hasn't been exhaustive yet, but so far I haven't gotten any farther than before. I can use file.rm?start=mm:ss&amp;end=mm:ss for the Real player, and it seems to help some browsers if you encapsulate that in a .ram metadata file, which the server will generate, i.e. /ramgen/file.rm?start=...
</p>
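<p>
If you wanted to generate such links rather than type them, a helper along these lines would do. It's only a sketch of the Real syntax described above: the function names, the host helix.example.com, and the file name navigator.rm are placeholders I made up, and it assumes whole-second offsets.
<pre class="code javascript">
// Format a whole number of seconds as mm:ss, the form used in the Real URLs.
function toClock(seconds) {
  var mm = String(Math.floor(seconds / 60)).padStart(2, '0');
  var ss = String(seconds % 60).padStart(2, '0');
  return mm + ':' + ss;
}

// Build a link to a clip. The /ramgen/ prefix asks the server to wrap the
// stream in a .ram metafile. baseUrl is a placeholder for your own server.
function clipUrl(baseUrl, file, startSeconds, endSeconds) {
  var query = [
    'start=' + toClock(startSeconds),
    'end=' + toClock(endSeconds)
  ].join('\x26'); // \x26 is the ampersand, escaped to keep this listing markup-safe
  return baseUrl + '/ramgen/' + file + '?' + query;
}

// Hypothetical host and file name: a clip running from 1:30 to 2:15.
console.log(clipUrl('http://helix.example.com', 'navigator.rm', 90, 135));
</pre>
Taking offsets in seconds keeps the arithmetic simple when you're trimming a quote by ear and nudging the start or end a little at a time.
</p>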
<p>
I'll be darned if I can come up with an equivalent method for an rtsp: URL that plays a QuickTime stream, or for an mms: URL that plays a Windows Media stream. I found some documentation for QuickTime starttime/endtime parameters but, while these are supposed to work with links as well as in EMBED tags, I can't so far get that to happen. I've also found documentation for the <a href="http://www.w3.org/TR/smil20/extended-media-object.html">SMIL clipBegin/clipEnd</a> syntax, but can't seem to get that to work with any of the current players. And in any case, it would seem to require writing a metafile.
</p>
<p>
Granted I'm new to all this, but the whole streaming situation does seem like a bit of a train wreck. My son made a Lego animation over the weekend, I converted it to all three streaming formats, and then I had to write up a whole page of instructions so that friends and family could view it. Even then, although I could view the streams in all three players on both Windows and Mac, the success rate reported back to me was only about 50%. It's like your worst cross-browser nightmare on steroids.
</p>
<p>
I also observe that when I use Google to research my questions, it comes back with far fewer hits than I'd have imagined. The impression I get is that there just aren't a whole lot of people using this stuff. I plan to persevere, though. The opportunities are just too interesting to ignore.
</p>
<p>
<b>Update</b>: 
David Isenberg is <a href="http://www.isen.com/blog/">blogging</a> the FCC's VoIP meeting today. I'd like to be a fly on that wall. And in theory I can be. It's being <a href="http://www.fcc.gov/realaudio/#dec1">streamed live</a>. But in practice, streams are in short supply. We know there's WiFi in the room. David mentions that Jeff Pulver is using it to get his cellphone to work. It's possible that using the technique described in my column, some additional AV streams could find their way out of that room. That'd be a nice demonstration of the power of decentralized IP networking.
<img align="left" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/noMoreStreams.jpg"/>
</p>
 </body>
</item> 

<item num="a854">
<title>Data models and network effects</title>
<date>2003/11/25</date>
<body>

<p>
Dare Obasanjo asks an important question:
<blockquote cite="Dare Obasanjo">
For example, should one use SQL to query relational databases and XPath/XQuery for XML or should SQL be the universal query language used by all with any additions needed for XML querying being grafted on to it in most likely a proprietary manner? [<a href="http://www.25hoursaday.com/weblog/PermaLink.aspx?guid=d9510fb8-f1d9-4ecc-afb4-769e668c7e41">Dare Obasanjo</a>]
</blockquote>
For a long time, I thought that object, relational, and XML databases were different tools for different jobs, and that we'd use different query languages to work with them. Recently, I've been impressed by how the major RDBMS systems, most notably Oracle, are weaving these disciplines together under the rubric of SQL:200n. Admittedly, that standard is proceeding as slowly as all SQL standards have. But Oracle's latest stuff does demonstrate a unified and standards-based approach. I was particularly struck by this comment from Oracle's Sandeepan Banerjee, which I've mentioned before:
</p>
<p>
<blockquote cite="Sandeepan Banerjee">
It's possible that developers will want to stay within an XML abstraction for all their data sources.
</blockquote>
I don't know that will happen, or that it's the right thing. But I don't know that it won't happen, or that it's wrong, either. 
</p>
<p>
I've always argued against the notion that there is one true programming language, and I respect -- more than many do -- developers' willingness and ability to master a variety of special-purpose languages. Of course, in a hybrid SQL/XML environment it's not like there's one query language. You write XPath expressions in SQL or XQuery contexts, just as you write regular expressions in Python or JavaScript contexts.
</p>
<p>
We've established XSD for SOAP payloads on the one hand, and for interactive documents on the other. That's a powerful convergence. Evolving it in a standards-based way isn't, for me, about protecting developers from culture shock. It's about smooth interop between the next-gen Windows filesystem and the larger ecosystem it will play in.
</p>

</body>
</item> 

<item num="a853">
<title>Preserving the Internet's neutral core</title>
<date>2003/11/24</date>
<body>

<p>
<blockquote cite="InfoWorld">
The Internet, it's often said, treats censorship as damage and routes around it. Even before ICANN weighed in on Site Finder, ISPs and network administrators had begun to route around it. Ironically, one major network that reportedly opted out was the tightly controlled Chinese Internet backbone, prompting some observers to foresee an escalating arms race of blockages. We doubt that will happen. But recent events remind us that although the Internet was built to survive a nuclear attack, it may need some help resisting political and economic assaults on its policy-neutral core. [<a href="http://www.infoworld.com/article/03/11/21/46FEtroublefuture_1.html?s=feature">InfoWorld: Preserving the Internet's policy-neutral core: November 21, 2003</a>]
</blockquote>
</p>
<p>
This brief commentary of mine is attached to a <a href="http://www.infoworld.com/article/03/11/21/46FEtrouble_3.html">longer article</a> by David Margulius, who interviewed a bunch of people including Vint Cerf, Steve Crocker, Fred Baker, and Stratton Sclavos. Margulius gives Crocker the last word:
<blockquote cite="Steve Crocker">
One thing's for sure: The Site Finder episode got a lot more people involved in these questions than had been.
</blockquote>
</p>

</body>
</item> 

<item num="a852">
<title>A tale of two Cairos</title>
<date>2003/11/22</date>
<body>

<p>
<blockquote cite="InfoWorld">
Microsoft's 2003 Professional Developers Conference (PDC) reminded some observers of the same event in 1993, when the hot topics were the Win32 APIs, a rough draft of Windows 95 code-named Chicago, and a preview of a futuristic object-file-system-based NT successor code-named Cairo. The hot topics this year were the WinFX managed APIs, a rough draft of a future version of NT code-named Longhorn, and ... Cairo. Now called WinFS, this vision of metadata-enriched storage and query-driven retrieval was, and is, compelling. Making it real wasn't then, and isn't now, simply a matter of engineering the right data structures and APIs. [Full story at <a href="http://www.infoworld.com/article/03/11/21/46OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
</p>
<p>
Coincidentally Diego Doval recently posted a <a href="http://www.dynamicobjects.com/d2r/archives/002430.html">meditation on past and future Cairos</a>. It's especially poignant for me because he illustrates his point with quotes from the BYTE.com archives -- including a couple of my articles.
</p>
<p>
As Dan Bricklin noted the other day, digital continuity is fragile:
</p>
<blockquote cite="Dan Bricklin">
I just got a heads up from Larry Magid, who noticed <a href="http://news.com.com/2100-7345-5097537.html?tag=nl" target="_top" title="http://news.com.com/2100-7345-5097537.ht"><u>on News.com</u></a> that <b>Microsoft demonstrated Longhorn's &quot;backward compatibility...running VisiCalc</b>, the 20-year-old spreadsheet program.&quot; That's really nice that they are continuing the tradition of compatibility and showing it with VisiCalc. Software compatibility is something I have discussed about an older version of Windows back in 2000 in my &quot;<a href="http://www.bricklin.com/pcevolution.htm" target="_top" title="http://www.bricklin.com/pcevolution.htm"><u>The Evolving Personal Computer</u></a>&quot; essay. Of course, as I pointed out in &quot;<a href="http://www.bricklin.com/robfuture.htm" target="_top" title="http://www.bricklin.com/robfuture.htm"><u>Copy Protection Robs The Future</u></a>&quot;, <b>the only reason I have a copy that can still work is that someone kept a &quot;bootleg&quot; uncopyprotected copy around</b>. The original disks may not have worked on a Longhorn machine. Just copying the files from the original 5 1/4&quot; floppy to a 3 1/2&quot; one that would fit in today's machines certainly would result in a non-working copy, because of copy protection. We will regret &quot;Digital Restriction/Rights Management&quot; in the future. [<a href="http://www.danbricklin.com/log/">Dan Bricklin Log</a>]
</blockquote>
<p>
Now that my BYTE days are long past, I guess I can admit that a similar act of subversive stewardship preserved the BYTE.com archives. I created them lovingly, and when CMP bought BYTE magazine from McGraw-Hill in 1998, I begged to be allowed to help CMP rework the content -- which existed as neutral markup that I poured through a script-driven template -- for the CMP site. Nobody was willing. Instead, somebody spidered BYTE.com the week before we got shut down, and what appeared on TechWeb was a mangled version of the content. 
</p>
<p>
It was broken for years. Then, finally, somebody asked what could be done to fix it. I wasn't supposed to have a clean copy of the original content, but I did, and was delighted to contribute it.
</p>
<p>
<a href="http://udell.roninhouse.com/bytecols/2001-11-30.html">This column from 2001</a> was inspired by the <a href="http://www.archive.org/web/web.php">Wayback Machine</a>, part of Brewster Kahle's Internet Archive project. Ironically the only linkable version is posted on my personal site -- as, thankfully, my contract permitted me to do -- because the original <a href="http://weblog.infoworld.com/udell/2002/11/29.html#a522">vanished behind a for-pay firewall</a>. I was angry about that at the time, but the damage seems to have been routed around. Diego Doval, for example, cited one of those rehosted columns in his posting. More importantly, though, the original 1994-1998 BYTE.com archives are (for now) intact and accessible. For that I'm grateful.
</p>
<p>
<a href="http://weblog.infoworld.com/udell/gems/bc03.ram"><img alt="doc searls quizzes len apcar about nytimes archive policy" vspace="6" align="right" width="325" height="219" src="http://weblog.infoworld.com/udell/gems/timesCostWall.jpg"/></a>
When it comes to digital archiving, of course, there are bigger fish to fry than BYTE.com. Here's <a href="http://weblog.infoworld.com/udell/gems/bc03.ram">Doc Searls at BloggerCon</a>, making the case for keeping the New York Times web archive accessible. Jay Rosen commented afterward:
</p>
<blockquote cite="Jay Rosen">
<p>There was one almost poignant moment during the question and answer period. Someone stood up and asked will the New York Times <a href="http://query.nytimes.com/search/advanced?query=&amp;date_select=past30days&amp;srchst=nyt&amp;srcht=s&amp;srchot=s&amp;">open its archive to free linking?</a> (The original url's expire after seven days for most articles, then you have to pay.) This appeared to catch Apcar off guard. Perhaps he had not fully understood the ethical universe he had traveled to, the Open Source Society, where naturally you link to everyone who enriches your account, building the social capital of the Web a tiny bit at a time. You take pains to make yourself linkable, too-- that's just good citizenship.</p> <p>What the crowd was really saying, however, cut deeper: Don't you understand? We want to link to you, <a href="http://www.nytimes.com/">mighty New York Times</a>, and give everything you publish more and more Web life. For this, <a href="http://blogs.law.harvard.edu/bloggerCon/ruleOfLinks">the Rule of Links</a>, is the way of our tribe, said conference host <a href="http://www.scripting.com/">Dave Winer</a>, who wrote the rule. But because of your foolish and short-sighted archive policy, our efforts die after a week. Why, why are you causing all this needless link death?</p> <p>This wasn't entirely fair to Apcar, who isn't a corporate head. He seemed puzzled by it.</p> [<a href="http://journalism.nyu.edu/pubzone/weblogs/pressthink/2003/10/05/apcar_weblogs.html">PressThink: Times Web Editor Goes to Harvard in Search of Something</a>]
</blockquote>
<p>
A final thought for the weekend, prompted by Mark Jones' <a href="http://weblog.infoworld.com/techwatch/archives/000124.html">excursion into videoblogging</a>. Sometimes a digital videocam will be a tool of the blogger's trade. But increasingly, there will be quotable video content already online. If it's published in a linkable form (i.e., as a stream not a file), you can &quot;simply&quot; link into it. As I've noted before, though, <a href="http://weblog.infoworld.com/udell/2003/10/08.html#a823">that isn't as simple as it ought to be</a> -- even with a Real stream that explicitly supports URLs with start/stop parameters.
</p>
<p>
I'm now exploring the Real, QuickTime, and Windows Media technologies -- both clients and streaming servers. I really want to understand how we might make it easier both to publish and to link into (i.e., quote from) audio and video. 
</p>

</body>
</item> 

<item num="a851">
<title>Working with Bayesian categorizers</title>
<date>2003/11/20</date>
<body>

<p>
<blockquote cite="O'Reilly Network">
 There's been some discussion in the blog world about using a Bayesian categorizer to enable a person to discriminate along various interest/non-interest axes. I took a run at this recently and, although my experiments haven't been wildly successful, I want to report them because I think the idea may have merit. [Full story: <a href="http://www.xml.com/pub/a/2003/11/19/udell.html">O'Reilly Network: Working with Bayesian Categorizers</a>]
</blockquote>
This month's O'Reilly Network column was a struggle because categorization itself is a struggle. I remain convinced that the automated classifiers that are doing such a good job beating back the tide of spam will also turn out to be more generally useful. But finding the right synergy between an automated assistant and a human overseer is a subtle and tricky thing.
</p>
<p>
<b>Update:</b> Interesting comment from Larry O'Brien:
<blockquote cite="Larry O'Brien">
Jon appears to be doing something dangerously more ambitious, which is creating a Bayesian <em>categorizer</em> that assigns Jon-meaningful categories (email, collaboration, family, etc.) to items. I say &quot;dangerously more ambitious&quot; because Jon's approach would seem to require a lot of supervision, while the genius of Bayesian spam-filtering is that pressing a button marked &quot;Delete as spam&quot; is no more onerous than deleting the spam in the first place. Similarly, a Bayesian RSS aggregator that just attempted to categorize &quot;Will this item be read, will this item be clicked-through, will this item be deleted without pause?&quot; requires no more supervision than what is natural to the task of RSS browsing. [<a href="http://www.thinkingin.net/2003/11/20.aspx#a526">Knowing .NET</a>]
</blockquote>
Agreed, this is speculative at best. For what it's worth, there's a twofold notion at work here. First, from the perspective of a blog author who already categorizes content (as many do), the question is: can effort that's already being invested pay more dividends? An automated review of things that have already been categorized can help you sharpen your sense of the structure you are building. A prediction about how to categorize a newly-written item can be interesting and helpful too. As I worked through the exercise, I could (at times) imagine the software to be acting like a person you'd bounce an idea off of. &quot;I can see why you chose that category,&quot; we can imagine it saying, &quot;but for what it's worth, it has a lot in common with these items in this other category.&quot; 
</p>
<p>
The second and even more speculative idea would be to create subscribable filters. Consider the set of items that I write myself, and categorize under, say, web_services. Some other set of items out there in the blogosphere, written by other folks, will tend to cluster with mine. Could we say that those other items have some affinity for &quot;Jon's take on Web services&quot;? And if so, by subscribing to my text-frequency database for that category could you use it to create one view of your own inbound feeds, or to suggest ones you're not reading? This part of the experiment failed badly, I'll freely admit. When I used my database to categorize items drawn from elsewhere, the results weren't promising. However, the sample size for the experiment was very small. It's conceivable to me that something could come of this approach, though I wouldn't bet money on it.
</p>
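<p>
To make the idea of a per-category text-frequency database concrete, here is a minimal naive Bayes sketch. It is not the code from the column, and it is not how SpamBayes or the other toolkits work internally: the function names, the simple tokenizer, and the add-one smoothing are all assumptions chosen for brevity.
</p>

```javascript
// Split text into lowercase word tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9_]+/g) || [];
}

// Build word-frequency tables from items that are already categorized.
// `itemsByCategory` maps a category name to an array of item texts.
function train(itemsByCategory) {
  var model = {
    categories: Object.create(null),
    vocabulary: Object.create(null),
    totalItems: 0
  };
  Object.keys(itemsByCategory).forEach(function (category) {
    var entry = { itemCount: 0, wordCount: 0, counts: Object.create(null) };
    itemsByCategory[category].forEach(function (text) {
      entry.itemCount += 1;
      model.totalItems += 1;
      tokenize(text).forEach(function (word) {
        entry.counts[word] = (entry.counts[word] || 0) + 1;
        entry.wordCount += 1;
        model.vocabulary[word] = true;
      });
    });
    model.categories[category] = entry;
  });
  return model;
}

// Return the category with the highest log-probability for `text`,
// using add-one (Laplace) smoothing so unseen words do not zero out a score.
function classify(model, text) {
  var vocabularySize = Object.keys(model.vocabulary).length;
  var words = tokenize(text);
  var best = null;
  var bestScore = -Infinity;
  Object.keys(model.categories).forEach(function (category) {
    var entry = model.categories[category];
    var score = Math.log(entry.itemCount / model.totalItems);
    words.forEach(function (word) {
      var count = entry.counts[word] || 0;
      score += Math.log((count + 1) / (entry.wordCount + vocabularySize));
    });
    if (score > bestScore) {
      bestScore = score;
      best = category;
    }
  });
  return best;
}
```

<p>
In this sketch the object returned by <code>train</code> plays the role of the subscribable database: someone else could load it and call <code>classify</code> on items from their own inbound feeds. With a sample as small as the one described above, its predictions would be just as shaky.
</p>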
<p>
A final note: Patrick Phalen wrote to remind me of another toolkit: the Python-based <a href="http://nltk.sourceforge.net/">Natural Language Toolkit (NLTK)</a>. In fact I did try it. NLTK is much more sophisticated than the other two kits I wrote about -- it's evidently used as a foundation for all kinds of natural language research -- but for some reason I didn't get a quick ramp-up with it. That said, for what I was attempting the toolkits weren't really the bottleneck. The time-consuming thing was setting up an environment in which items could be identified by descriptive titles, viewed in HTML, and shuffled around easily in drag-and-drop fashion. The category-per-directory approach is probably about the best you can do in that regard, and you could easily adapt NLTK or another kit to that approach.
</p>

</body>
</item> 

<item num="a850">
<title>Marconi's magic box</title>
<date>2003/11/19</date>
<body>

<p>
<a href="http://allconsuming.net/item.cgi?isbn=0007130058"><img align="right" vspace="6" hspace="6" width="90" height="140" src="http://weblog.infoworld.com/udell/gems/marconiMagicBox.jpg"/></a>
When I need an alternate place to work, I head for the local college library. The armchairs are comfortable, the WiFi is fast. As a bonus, I get to raid the new books shelf. Today's catch, <a href="http://allconsuming.net/item.cgi?isbn=0007130058">Signor Marconi's Magic Box</a>, put my twenty-first-century smugness into perspective:
</p>
<p>
<blockquote cite="Gavin Weightman">
Marconi was one of the greatest amateur inventors of all time. It is remarkable testimony to the fragility of reputation that a man who could command such respect in his lifetime should now be relegated to comparative obscurity, and that the names of scores of his contemporaries who made radio work have no resonance at all for a generation addicted to the most modern form of wireless telegraphy: text messaging on a mobile phone.
<br/><br/>
That Queen Victoria received text messages sent by wireless from the royal yacht to her home on the Isle of Wight more than a century ago will come as a surprise to those who imagine the technology of the mobile phone is almost brand new.
</blockquote>
Guilty as charged. Now that I think about it, though, I'm sure I can guess the content of the first of those messages. <i>&quot;I am here. Are you there?&quot;</i>
</p>

</body>
</item> 

<item num="a849">
<title>Lizard brain surgery</title>
<date>2003/11/18</date>
<body>
	
<p>
<blockquote cite="InfoWorld">
This isn't just a Windows phenomenon, of course. Every major software system has, at its core, what Dave Winer likes to call a lizard brain. The roots of Linux reach down through many layers to its lizard brain. Mac OS X archeologists can explore not only that same deep history but also a parallel 15-year NextStep legacy. Every software architect longs for a chance to reorganize -- or as they like to say, &quot;refactor&quot; -- to a simpler and stronger foundation for new layered abstractions. Few organizations have the resources to maintain and evolve a working system while mercilessly refactoring to produce its successor. Microsoft is among the lucky few. We'll see, in a couple of years, how well Longhorn has exploited that rare opportunity.
[Full story at <a href="http://www.infoworld.com/article/03/11/14/45OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
The original title of this column was &quot;Lizard brain surgery.&quot; It was problematic because of the Mozilla echo, but I doubt that's why the editors changed it. More likely they thought it too offbeat for the print publication.
</p>
<p>
Editors of print pubs are always struggling with headlines, and making hard decisions. There aren't good ways to test those decisions though, because the feedback loop in print publishing is so attenuated. The Web can inject more science into that difficult art. When headline A is chosen over B and C, save B and C, and use them in rotation on the website. Track the efficacy of A, B, and C. True, the Web audience is not the print audience. But it would be interesting data. Does anyone do this?
</p>
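<p>
One way the rotate-and-track idea might look, purely as a sketch: <code>createHeadlineTest</code> and its methods are hypothetical names, and a real site would presumably record impressions and clicks on the server rather than in memory.
</p>

```javascript
// Keep a tally of impressions and clicks for each candidate headline.
function createHeadlineTest(headlines) {
  var stats = headlines.map(function (text) {
    return { text: text, shown: 0, clicked: 0 };
  });
  var next = 0;
  return {
    // Serve headlines in strict rotation so each gets the same share of impressions.
    // Returns the index of the headline to display.
    serve: function () {
      var index = next;
      stats[index].shown += 1;
      next = (next + 1) % stats.length;
      return index;
    },
    // Call when a reader follows the headline that was served.
    recordClick: function (index) {
      stats[index].clicked += 1;
    },
    // Click-through rate per headline; zero when a headline has not been shown yet.
    report: function () {
      return stats.map(function (s) {
        return {
          text: s.text,
          shown: s.shown,
          clicked: s.clicked,
          rate: s.shown === 0 ? 0 : s.clicked / s.shown
        };
      });
    }
  };
}
```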
<p>
<b>Update</b>: A funny twist. I put <i>lizard brain surgery</i> into my GoogleBox and the first of the only two hits it came up with was Kevin McKean's <a href="http://www.infoworld.com/article/03/11/14/45OPeditor_1.html">editorial</a> from this week. So the headline actually printed was either <i>Lizard brain surgery</i> or <i>What's holding software back?</i>. I guess I'll find out which when my copy of the print magazine lands in my mailbox in a couple of hours.
</p>

</body>
</item> 

<item num="a848">
<title>Great books spam</title>
<date>2003/11/16</date>
<body>
<p>
<img alt="20,000 leagues" align="right" src="http://weblog.infoworld.com/udell/gems/20Kleagues.jpg"/>
A really amusing new kind of spam appeared in my MaybeSpam folder this weekend. Here's an example:
</p>
<blockquote><i>
Here's your chance for Cash Freedom
<br/><br/> 
http://90ui89av.com/boris5hco/index.htm
<br/><br/> 
[... many blank lines to hide the payload ...]
<br/><br/>  
surrounds him and by his own body. I lift my arm and let it fall. My<br/>  
as the Babylonians and the Assyrians. These peoples were of<br/>  
The night passed thus, without disturbing the ordinary repose of the crew.
</i></blockquote>
<p>
SpamBayes scored the message at 89%, just shy of the 90% that would have landed it in CertainSpam instead of MaybeSpam. The three lines at the end, of course, are what created a bit of uncertainty. One of them seemed oddly familiar:
<blockquote><i>
The night passed thus, without disturbing the ordinary repose of the crew.
</i></blockquote>
 So I double-quoted it, Googled it, and found it in <a href="http://jv.gilead.org.il/martin/20000_1-21.html">20,000 Leagues under the Sea</a>, which I had just reread the other night to re-implant the original memory of the book that The League of Extraordinary Gentlemen had partly overwritten.
</p>
<p>
Then I tracked down the other lines:
<ul>
<li><p>&quot;surrounds him and by his own body. I lift my arm and let it fall. My&quot;, <a href="http://www.literaturepage.com/read/warandpeace-1689.html">War and Peace</a></p></li>
<li><p>&quot;as the Babylonians and the Assyrians. These peoples were of&quot;, <a href="http://www.worldwideschool.org/library/books/sci/history/AHistoryofScienceVolumeI/chap8.html">A History of Science Volume 1</a>
</p></li>
</ul>
</p>
<p>
And these from the next message:
</p>
<ul>
<li><p>&quot;that I had leisure to look at the people in the launch again.&quot;, <a href="http://www.pagebypagebooks.com/H_G_Herbert_George_Wells/The_Island_of_Doctor_Moreau/VI_THE_EVIL_LOOKING_BOATMEN_p1.html">The Island of Doctor Moreau</a></p></li>
<li><p>&quot;for the aged? And is it not a palpable, unquestionable good if a&quot;,  <a href="http://www.underthesun.cc/Classics/Tolstoy/warandpeac/warandpeac95.html">War and Peace</a></p></li>
<li><p>&quot;bear on this important subject, we cannot fail to be struck by &quot;, <a href="http://www.bemyastrologer.com/artofwarchapter11.html">The Art of War</a></p></li>
</ul>
<p>
I love the fact that these texts are online, and that Google (and now, in some cases, Amazon) can pinpoint the references!
</p>
<p>
How to dispose of the spam? It's an interesting conundrum. If I click Delete As Spam, I'll boost the scores on these messages to 100%, and future ones won't clutter up my MaybeSpam folder, which I like to keep to at most a handful of messages a day. But do I really want to consign Jules Verne and Sun Tzu and Leo Tolstoy to the CertainSpam folder? Seems like sacrilege, in a way. And will classifying the messages as CertainSpam dilute the efficacy of SpamBayes' frequency database? 
</p>
<p>
To tell you the truth, I'm confident that SpamBayes can deal with this new challenge. I'm reluctant to let it, for now, because (until the novelty wears off) it's kind of fun to see these fragments of great literature fly by.
</p>
<p>
In the TiVo era, the only TV commercials that survive may be the ones that people actually choose to watch. For spammers, too, survival may depend on delivering entertainment value.
</p>


</body>
</item> 

<item num="a847">
<title>Notice to customers using e-mail filtering SPAM software</title>
<date>2003/11/15</date>
<body>
<blockquote cite="Verizon">
<p><img src="http://weblog.infoworld.com/udell/gems/verizon.gif"/></p>
<p>
Important account information communicated through e-mail may be affected by any e-mail filtering &quot;SPAM&quot; software you have installed on your computer.
</p>
<p>
We use your e-mail address to confirm your registration, online order status or payment confirmation, notify you when your bill is available for viewing online, respond to inquiries and to keep you updated about news and information relevant to your account(s).
</p>
<p>
To ensure that you receive important e-mails, do one of the following:
</p>
<ol>
<li><p>Add the &quot;verizon.com&quot; and &quot;vznotes.com&quot; domains to your e-mail &quot;safe list&quot;.</p></li>
<li><p>If your settings do not allow you to add e-mail addresses to a &quot;safe list,&quot; use the Help section or contact your e-mail/internet provider's Customer Support to research your configuration options.</p></li>
<li><p>Disable your e-mail filtering &quot;SPAM&quot; software.</p></li>
<li><p><i>Bypass email entirely, and subscribe to the secure, personalized RSS feed at this address: https://jonudell:xPqR1$7@feeds.verizon.com/jonudell.rss</i></p></li>
</ol>
</blockquote>
<p>
OK, I made up #4, but the rest is genuine. Since AT&amp;T's <a href="http://weblog.infoworld.com/udell/2003/11/01.html#a837">Sonya Bigby-McCloud</a> never got back to me about the online bill presentment glitch, I'll offer the $20 Amazon gift certificate that she could have claimed to the first utility company that figures out this email route-around, and taps into my preferred spam-free channel for receiving notifications.
</p>	
</body>
</item> 

<item num="a846">
<title>Multi-ISBN LibraryLookup</title>
<date>2003/11/13</date>
<body>
<p>
If you've followed the saga of <a href="http://weblog.infoworld.com/udell/stories/2002/12/11/librarylookup.html">LibraryLookup</a>, you may recall that there's a fly in the ointment: a book usually has more than one ISBN (<a href="http://weblog.infoworld.com/udell/2002/12/18.html#a548">1</a>, <a href="http://weblog.infoworld.com/udell/2002/12/18.html#a548">2</a>). So, for example, if you're viewing the <a href="http://www.amazon.com/exec/obidos/ASIN/0066620694">paperback edition</a> of <i>The Innovator's Dilemma</i> on Amazon, a <a href="http://ksclib.keene.edu/search/i=0060521996">search for its ISBN at my local library</a> comes up empty-handed. The library's catalog can't relate the paperback's ISBN, 0066620694, to the hardcover's ISBN, 0875845851 -- a book that is <a href="http://ksclib.keene.edu/search/i=0875845851">actually on its shelf</a>.
</p>
<p>
The strategy I've been recommending is to use Amazon's &quot;All editions&quot; link to find the hardcover edition, then run LibraryLookup from that page. But that's kind of lame. What if the library really does carry the paperback? Or an audiocassette or CD? There ought to be a way to expand a single ISBN to a cluster of related ISBNs. In the case of <i>The Innovator's Dilemma</i>, that cluster would be:
</p>
<p>
0875845851  0585368228  0066620694  1565114159  1578511682 
</p>
<p>
Wouldn't it be nice if something like this worked:
</p>
<p align="center">
<a href="http://labs.oclc.org/xisbn/0875845851">http://labs.oclc.org/xisbn/0875845851</a>
</p>
<p>
Hey, it does! The <a href="http://www.oclc.org/">OCLC</a>'s chief scientist, <a href="http://www.oclc.org/research/staff/hickeyt.htm">Thom Hickey</a>, just alerted me to this excellent news. The service at http://labs.oclc.org/xisbn/ implements a <a href="http://www.oclc.org/research/projects/frbr/algorithm.htm">grouping algorithm</a>:
</p>
<blockquote cite="OCLC">
Much of the early research investigated how best to divide a particular 'work' into its component 'expressions'. Unfortunately, this and <a href="http://www.oclc.org/research/projects/frbr/clinker/default.htm">other FRBR research</a> has shown that the information in existing bibliographic records is, in general, insufficient to reliably divide a work into expressions, so this line of investigation has been abandoned for now.
<br/><br/>
Our research then focused on the seemingly simpler problem of collecting bibliographic records into groups corresponding to different works (such as Shakespeare's Hamlet). An algorithm was developed, based primarily on author and titles found in bibliographic records, to find works in the WorldCat database with a high degree of reliability. [<a href="http://www.oclc.org/research/projects/frbr/algorithm.htm">FRBR Work-Set Algorithm</a>]
</blockquote>
<p>
Hickey and his associate Jeff Young put up a <a href="http://alcme.oclc.org/bookmarks/servlet/OAIHandler?verb=ListRecords&amp;metadataPrefix=oai_dc">page</a> to explore ways of using this one-to-many mapping in a LibraryLookup-like bookmarklet. It produces a bookmarklet that issues a query URL with multiple ISBNs.
</p>
<p>
Since my library's OPAC doesn't respond to multi-ISBN queries, though, I tried another approach: multiple individual queries. One way to handle these would be to have a server-based application parse the results and look for indications of success or failure. (Since the results are only Web pages, not well-formed XML responses, that would entail some crufty pattern recognition.) Another way, which I've implemented, is to open up a new window for each query. As an experiment, here's a version of the Build-Your-Own-Bookmarklet page that creates bookmarklets that use that method:
</p>
<p align="center">
<a href="http://weblog.infoworld.com/udell/gems/multiIsbnLookupGenerator.html">experimental LibraryBookmarklet builder for multi-ISBN lookup</a>
</p>
<p>
There are a few caveats here. First, the one-to-many algorithm doesn't seem to be fully bi-directional. In the example above, we'd like to get from 0066620694, a paperback, to 0875845851, a hardcover. But although we can get from <a href="http://labs.oclc.org/xisbn/0875845851">0875845851 to 0066620694</a>, we can't get from <a href="http://labs.oclc.org/xisbn/0066620694">0066620694 to 0875845851</a>. I'm not sure what's up with that.
</p>
<p>
Second, the mapper can return a dozen or more ISBNs. Dealing with that many windows is more like a denial of service than a service, so I've capped the number at five, for now.
</p>
<p>
Third, although it works OK in Mozilla (<b>update:</b> with popup-window blocking turned off), it seems to work erratically in MSIE. Also, you might need to tweak a security setting in MSIE. Check that <i>Tools -&gt; Internet Options -&gt; Security -&gt; Internet -&gt; Custom Level -&gt; Miscellaneous -&gt; Access data sources across domains</i> (whew!) is set to <i>Prompt</i>. The reason for this is that the bookmarklet's JavaScript code is fetching the XML returned from the OCLC service, which actually looks like this:
</p>
<pre>
&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; ?&gt;
&lt;idlist&gt;
  &lt;isbn&gt;0875845851&lt;/isbn&gt;
  &lt;isbn&gt;0585368228&lt;/isbn&gt;
  &lt;isbn&gt;0066620694&lt;/isbn&gt;
  &lt;isbn&gt;1565114159&lt;/isbn&gt;
  &lt;isbn&gt;1578511682&lt;/isbn&gt;
&lt;/idlist&gt;
</pre>
<p>
and then processing it locally. 
</p>
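<p>
The local processing can be sketched roughly as follows. This is not the contents of xisbn.js: the function names are hypothetical, and the XML is assumed to be already parsed into a DOM document (for example, the <code>responseXML</code> of an XMLHttpRequest). The <code>__ISBN__</code> placeholder and the cap of five come from the description above.
</p>

```javascript
// Collect the text of every isbn element in a parsed idlist document.
function isbnsFromIdlist(xmlDocument) {
  var nodes = xmlDocument.getElementsByTagName('isbn');
  return Array.prototype.map.call(nodes, function (node) {
    return node.textContent.trim();
  });
}

// Turn the first `limit` ISBNs into catalog query URLs by filling in the template.
function lookupUrls(isbns, template, limit) {
  return isbns.slice(0, limit).map(function (isbn) {
    return template.replace('__ISBN__', isbn);
  });
}

// In a browser, open one window per query (popup blocking must be off).
function openLookups(urls) {
  urls.forEach(function (url) {
    window.open(url);
  });
}
```

<p>
With the Keene template, <code>lookupUrls(isbns, 'http://ksclib.keene.edu/search/i=__ISBN__', 5)</code> would yield at most five catalog queries, one per related ISBN.
</p>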
<p>
If you want to try this without creating and installing a bookmarklet for your library, here is the multi-ISBN bookmarklet for my own library: <a href="javascript:var%20re=/([\/-]|is[bs]n=)(\d{7,9}[\dX])/i;if(re.test(location.href)==true){var%20isbn=RegExp.$2;(function(){var element=document.createElement('script'); element.setAttribute('src','http://weblog.infoworld.com/udell/gems/xisbn.js'); document.body.appendChild(element);}());load(isbn,'http://ksclib.keene.edu/search/i=__ISBN__');}">Keene Multi-ISBN</a>. And if you're curious, the supporting script that it loads is <a href="http://weblog.infoworld.com/udell/gems/xisbn.js">here</a>.
</p>
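<p>
The regular expression at the front of that bookmarklet is what finds an ISBN in the address of the current page. Pulled out into a standalone function (<code>isbnFromUrl</code> is a hypothetical name, not something the bookmarklet defines), it behaves like this:
</p>

```javascript
// Same pattern the bookmarklet uses: a slash, a hyphen, or isbn=/issn=,
// followed by seven to nine digits and a final digit or X.
function isbnFromUrl(url) {
  var match = /([\/-]|is[bs]n=)(\d{7,9}[\dX])/i.exec(url);
  return match ? match[2] : null;
}

// The Amazon paperback page mentioned above.
console.log(isbnFromUrl('http://www.amazon.com/exec/obidos/ASIN/0066620694'));
```

<p>
Note that the pattern accepts runs of eight to ten characters and does no checksum validation, so it is a convenient heuristic, not a strict ISBN test.
</p>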
<p>
It actually sort of works, which says a lot for the flexibility of Web and XML technologies. But the big news here is OCLC's decision to expose its grouping algorithm as a public XML Web service. I hope this signals the beginning of a closer relationship between WorldCat and the Web. It would be great if library OPACs could do these one-to-many lookups for themselves.
</p>

</body>
</item> 

<item num="a845">
<title>Dynamic documents revisited</title>
<date>2003/11/12</date>
<body>
<p>
Back in July, when Tim Bray pointed to a best-practices document that the W3C TAG (Technical Architecture Group) gathers periodically in exotic locations to discuss, I was exploring a way of dynamically building views of XML documents in the browser. In a <a href="http://weblog.infoworld.com/udell/2003/07/02.html#a684">couple</a> of <a href="http://weblog.infoworld.com/udell/2003/07/02.html#a685">postings</a> (and with some help from Bob Clary) I was able to post a Web page that turned the TAG's Web page into a dynamic outline that can selectively display elements styled as:
<ul>
<li>practices</li>
<li>principles</li>
<li>constraints</li>
<li>stories</li>
<li>acronyms</li>
<li>ednotes</li>
</ul>
In addition, it can selectively display:
<ul>
<li>internal links</li>
<li>external links</li>
<li>paragraphs containing arbitrary text</li>
</ul>
</p>
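<p>
The link and text filters in that second list are simple to express. Here is a rough sketch, not the viewer's actual code: plain objects stand in for DOM nodes so it can run outside a browser, the sample data is invented, and the function names are hypothetical. It assumes an internal link is one whose <tt>href</tt> begins with <tt>#</tt>.
</p>

```javascript
// Plain objects stand in for DOM nodes; the sample content is invented.
const nodes = [
  { tag: 'a', href: '#uri-benefits', text: 'benefits of URIs' },
  { tag: 'a', href: 'http://www.w3.org/', text: 'W3C' },
  { tag: 'p', text: 'Good practice: Thoughtful URI creation' },
  { tag: 'p', text: 'An unrelated paragraph' },
];

// Assumption: a link whose href starts with "#" points inside the same document.
function isInternalLink(node) {
  if (node.tag !== 'a') return false;
  return node.href.startsWith('#');
}

function isExternalLink(node) {
  if (node.tag !== 'a') return false;
  return !node.href.startsWith('#');
}

// Build a filter for paragraphs containing arbitrary text (case-insensitive).
function paragraphContaining(text) {
  const needle = text.toLowerCase();
  return (node) => {
    if (node.tag !== 'p') return false;
    return node.text.toLowerCase().includes(needle);
  };
}

console.log(nodes.filter(isInternalLink).map((n) => n.href));
console.log(nodes.filter(isExternalLink).map((n) => n.href));
console.log(nodes.filter(paragraphContaining('uri')).map((n) => n.text));
```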
<p>
Here's a <a href="http://weblog.infoworld.com/udell/gems/tagPractices.html">new version of that dynamic viewer</a> for the <a href="http://www.w3.org/2001/tag/2003/webarch-20031111/">latest revision</a> of the TAG document, which Tim <a href="http://www.tbray.org/ongoing/When/200x/2003/11/12/Webarch03-11">wrote about today</a>. As I review the principles and practices, which my handy-dandy viewer dynamically assembles into tidy lists for me, I'm still not sure which of them accounts for my ability to do this neat trick. I suppose <a href="http://www.w3.org/2001/tag/2003/webarch-20031111/#thoughtful-uris">Good practice: Thoughtful URI creation</a> comes closest, except that in this case, it's really more like <u>Thoughtful content creation</u>. Maybe that's covered by the separate note <a href="http://www.w3.org/2001/tag/issues.html#formattingProperties-19">formattingProperties-19: Reuse existing formatting properties/names, coordinate new ones</a>?
</p>
<p>
I'm also still not quite sure of XHTML's place in the world. Section 4.6.7, Media Types for XML, advises:
<blockquote cite="W3C TAG">
Good practice: In general, server managers SHOULD NOT assign Internet Media Types beginning with <code>text/</code> to XML representations.
</blockquote>
The TAG document has a doctype of XHTML 1.0 Strict, and comes over the wire as text/html, but I'm treating it as XML so I can query it with XPath and transform it. I'm not losing sleep over it, but is this a sneaky exploit or a principled technique? 
</p>
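<p>
For the curious, here is a small sketch of the kind of XPath query this makes possible. It assumes the document marks its practices with a <tt>class</tt> attribute containing the token <tt>practice</tt>; that class name and the helper <tt>xpathForClass</tt> are illustrative guesses, not taken from the TAG document or from my viewer.
</p>

```javascript
// Build an XPath expression that selects elements carrying a given class token.
// The concat/normalize-space idiom matches whole tokens, so "practice" does not
// also match "practices".
function xpathForClass(className) {
  return "//*[contains(concat(' ', normalize-space(@class), ' '), ' " + className + " ')]";
}

// Assumption: "practice" is a hypothetical class name used for illustration.
const expr = xpathForClass('practice');
console.log(expr);

// In a browser the expression could be handed to the DOM XPath API, e.g.:
//   document.evaluate(expr, document, null,
//                     XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
```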
</body>
</item> 

<item num="a844">
<title>Skepticism, cynicism, optimism</title>
<date>2003/11/11</date>
<body>
<p>
A bunch of people want this upcoming U.S. presidential election to be one that we look back on, years hence, as the historic intersection between the Internet and democracy. Maybe, maybe not. At the same time, a quieter drama is unfolding. Although not precisely quadrennial, Microsoft product cycles have a comparable rhythm. I've seen too many of them. Yet I think I've managed to stay on the right side of the line that divides <a href="http://blogs.gotdotnet.com/johnmont/PermaLink.aspx/6d95e318-6fcd-470e-8478-14ee10856817">skepticism from cynicism</a>. And when I hear a cacophony of voices like this, I can even find cause for optimism:
</p>
<blockquote cite="Chris Anderson">
This is a developer preview. We want your feedback, we are listening to your feedback. The pushback on XUL, CSS, etc, is being listened to. I can't say that we will implement every suggestion that is given, but the entire purpose of this early preview of our technology is to get feedback from the development community. Tell me how to avoid having two languages - CSS for style and XAML for UI, tell me how to make CSS easier to tool, tell me how to make it perform and scale to tens of thousands of elements with nested style sheets... We are listening. [<a href="http://www.simplegeek.com/default.aspx?date=2003-11-02T00:00:00">SimpleGeek</a>]
</blockquote>
<p>
This from Chris Anderson, an architect of Longhorn's Avalon presentation subsystem. To which Gerald Bauer, architect of <a href="http://luxor-xul.sourceforge.net/">Luxor</a>, replied in the <a href="http://www.simplegeek.com/commentview.aspx/b7e02709-0112-4977-9b73-1aa9d471a570">comments section</a>:
</p>
<blockquote cite="Gerald Bauer">
I guess you're kidding. Again, Microsoft's complete disregard for web standards is outrageous and Microsoft's attempt to try to roll back history by refusing to cooperate to build a rich internet for everyone basically amounts to a declaration of war to the Free World.
</blockquote>
<p>
Perhaps the most interesting comment is this one, from <a href="http://www.shamiro.ch">Benjamin J. J. Voight</a>, reacting to Bauer:
</p>
<blockquote cite="Benjamin J. J. Voight">
Would you at least consider providing constuctive critic? I can only imagine how you must feel (And I guess you've all the right to be upset!), but I would very much like to see the barriers be disregarded for a few moments. This new willingness to listen may come as a surprise, but from all I can tell it's honest. If you'd been at PDC you'd know.
<br/><br/>
I've had several enlightening discussions with engineers and architects at PDC, about OSS and about standards. The result was, that either they just did not know (which in my eyes is very valid, they can still learn), or they had a good reason not to use a certain standard. Of course standard by itself is not always the same, since standards itself are relative to the environment.
<br/><br/>
...
<br/><br/>
In my private, very limited, sphere I've only joined MS, on the promise, that MS would indeed be listening more to what other stakeholders have to say (OSS was my life untill I started here). I do see this promise being kept (minus a few execs...), but it requires, that the stakeholders actually start talking (Which some are willing to do just now, some might need more time, which is fine as far as I'm concerned).
</blockquote>
<p>
I very much share Bauer's concerns. At the same time, I'm encouraged to hear all these different voices. At the end of the day, Microsoft's fiercest competitor is itself. The business requires significant numbers of Windows and Office installations to be upgraded, every few years, and it's hard to manufacture compelling reasons to do that. Is the Microsoft product cycle becoming a real conversation? I'm skeptical, just as I'm skeptical that the U.S. presidential election is becoming a real conversation. But in both cases it's (I hope) a healthy skepticism that doesn't require cynicism or preclude optimism.
</p>

</body>
</item> 

<item num="a843">
<title>Mining message metadata</title>
<date>2003/11/10</date>
<body>
<p>
<blockquote cite="InfoWorld">
Point-to-point integration is out; event-driven communication across a common message bus is in. When you build a system this way, message queues are the first and best way to take the pulse of its real-time state...<br/><br/>
Martinez makes a crucial distinction between message data and message metadata. In the realm of Web services, it's the difference between SOAP bodies and SOAP headers. The bodies eventually land in an operational data store, the headers often don't. Yet the headers define the context of the message: who (or what) is sending it and why. For example, a clinical service might be invoked by a monitoring application, or by a compliance officer logging into a portal to research an FDA report. &quot;It's the same message payload,&quot; Martinez says, &quot;but contexts are very different.&quot; [Full story at <a href="http://www.infoworld.com/article/03/11/07/44OPstrategic_1.html">InfoWorld.com</a>]<br/>
</blockquote>
I can't quite put my finger on it, but there's a connection between this week's column -- based on discussions with Blue Titan's Frank Martinez -- and the latest round of wrangling over the semantic web, bracketed by essays from <a href="http://www.shirky.com/writings/semantic_syllogism.html">Clay Shirky</a> and <a href="http://www.tbray.org/ongoing/When/200x/2003/11/09/SemWebFirstStep">Tim Bray</a>. 
</p>
<p>
Martinez's insight is that in a Web services network, the packets (XML payloads) tend to accrete metadata that can usefully be mined. Relative to the SemWeb discussion, I'd add that this contextual metadata arises naturally, without extra effort, when a business process has been automated -- or, to be more realistic, semi-automated. When Jack routes a purchase order to Jill through the BizTalk pipeline, the context is explicitly encoded in the transaction. 
</p>
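<p>
A toy illustration of what mining that metadata might look like: each message keeps its context (header) apart from its payload (body), and a tally runs over the headers alone. The field names, sample values, and <tt>countByHeader</tt> are all hypothetical, and plain objects stand in for SOAP envelopes.
</p>

```javascript
// Each message separates context (header) from payload (body),
// mirroring the SOAP header/body split. All values are invented.
const messages = [
  { header: { sender: 'monitoring-app', purpose: 'alerting' },
    body: { service: 'clinical-report', id: 17 } },
  { header: { sender: 'compliance-portal', purpose: 'fda-research' },
    body: { service: 'clinical-report', id: 17 } },
  { header: { sender: 'monitoring-app', purpose: 'alerting' },
    body: { service: 'clinical-report', id: 18 } },
];

// Tally messages by one header field, ignoring the bodies entirely.
function countByHeader(msgs, field) {
  const counts = {};
  for (const msg of msgs) {
    const key = msg.header[field];
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

console.log(countByHeader(messages, 'sender'));
console.log(countByHeader(messages, 'purpose'));
```

<p>
The first two messages carry the same payload, yet the headers tell two different stories, which is exactly the distinction Martinez draws.
</p>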
<p>
What happens if Jack detaches the purchase order from the BizTalk pipeline, as an InfoPath document, and routes it to Jill via email? Now the context is only implicitly encoded in the transaction. The trick is going to be figuring out how to make the implicit context explicit, without interfering with the natural flow of the transaction.
</p>
<p>
As developers of WinFS begin to blog about it, the outlines of Microsoft's approach start to emerge:
</p>
<blockquote cite="Mike Deem">
Joe is exactly right to point out that asking the user to add meta data has met with very limited success. I think WinFS addresses this in two ways: 1) the shell will make it very easy to &quot;paint&quot; meta-data on files just by dragging and dropping, something that users do today to organize their files; and 2) the fact that using the meta data is so easy and powerful (again via the shell's dynamic views) makes the effort to add the meta data more worth while. [<a href="http://anopinion.net/posts/156.aspx">Mike Deem</a>]
</blockquote>
<p>
That sounds reasonable. Of course, the trend even within Microsoft Office is away from micromanaging storage by &quot;dragging and dropping.&quot; Witness the search folders in Outlook 2003, which are intended to create virtual views along multiple dimensions so you don't have to manually build containment structures. The Outlook 2003 product manager, in fact, told me that he managed the whole product cycle in an undifferentiated inbox, creating no folders and moving no messages.
</p>
<p>
My hunch is that as desktop software interacts more often with well-defined services, the context implicit in those interactions will tend to become more available, and will be easier to make explicit. The key is that the context must arise from normal use of software. And as &quot;normal use&quot; comes to mean &quot;participating in a Web of services,&quot; it can.
</p>

</body>
</item> 

<item num="a842">
<title>Citation and influence: science versus the blogosphere</title>
<date>2003/11/06</date>
<body>
<p>
If you're interested in natural language processing, the <a href="http://www.fieldmethods.net/">fieldmethods.net</a> blog offers up interesting nuggets. Here was yesterday's entry:
<blockquote cite="fieldmethods.net">
<p>You can look at the data from <a href="http://www.sciencewatch.com/sept-oct2003/sw_sept-oct2003_page2.htm">Twenty Years of Citation Superstars</a> in two ways: </p><ul><li>There's nobody in computer science, let alone NLP.</li><li>All those biologists are really going to need some NLP to read articles for them.</li></ul><p>We're only partly joking.</p> [<a href="http://www.fieldmethods.net/">fieldmethods.net - the natural language processing portal:  Do you read much, or just cite a lot about it? </a>]
</blockquote>
</p>
<p>
The citation accounting that tracks the flow of influence in scientific disciplines looks a lot like the citation accounting that goes on in the blogosphere. But in truth, for many if not most scientific disciplines, that resemblance is superficial. Consider the &quot;citation superstars&quot; report mentioned in the above item from fieldmethods. The source is the Thomson ISI (Institute for Scientific Information) <a href="http://www.isinet.com/isi/products/citation/wos/">Web of Science</a>, one of a family of information products that includes <a href="http://www.isinet.com/products/rsg/products/esi">ISI Essential Science Indicators</a> and its editorial (i.e. marketing) companion, <a href="http://www.in-cites.com/">in-cites</a>.
</p>
<p>
Each month, in-cites highlights influential scientists, journals, institutions, and papers. Here, for example, are the <a href="http://www.in-cites.com/papers/2003menu.html">highlighted papers for 2003</a>. This one caught my eye:
</p>
<blockquote cite="in-cites">
Dr. Rob Morgan and Dr. Shelby Hunt talk about their highly cited paper, &quot;The commitment-trust theory of relationship marketing,&quot; (<i>J. Marketing</i> 58[3]: 20-38, July 1994). According to the <a href="http://www.isinet.com/products/rsg/products/esi"><i>ISI Essential Science Indicators</i></a><sup><img height="8" width="10" src="http://www.in-cites.com/images/mark-sm.gif" border="0"/></sup> Web product, this paper, which has been cited a total of 394 times to date, is currently ranked at #1 among papers published in the past decade in the field of Economics &amp; Business. Overall, Dr. Morgan's record includes seven papers cited a total of 538 times to date and Dr. Hunt's record includes 21 papers cited 641 times to date. [<a href="http://www.in-cites.com/papers/Morgan_n_Hunt.html">read</a>] [<a href="http://www.in-cites.com/papers/2003menu.html">in-cites - 2003 Papers Menu</a>]
</blockquote>
<p>
Given my interest in collaboration and social networking, I was curious to take a look at this influential paper. But how? It took a fair amount of persistent noodling around on the ISI site, and parallel use of Google, to correlate <i>J. Marketing</i> with the <a href="http://www.marketingpower.com/">American Marketing Association</a> -- which was fruitless, since the <a href="http://www.marketingpower.com/live/content.php?Item_ID=17640&amp;Category_ID=5304">available content</a> is only abstracts, and even those only go back to 2002.
</p>
<p>
Surely, I thought, Google can <a href="http://www.google.com/search?q=%22rob+morgan%22+%22shelby+hunt%22">tell us more</a> about Drs. Morgan and Hunt. Interestingly, the first two Google hits are the pages I've already cited: their profile on in-cites, and the interview linked from the profile. And amusingly, the third hit is <a href="http://www.cba.ua.edu/col/fsnews.html">this faculty and staff news page</a> at the University of Alabama's Culverhouse College of Commerce and Business Administration. Evidently at one time, though no longer recorded even in Google's cache, that page included the fragment:
</p>
<p><i>
...Dr. <b>Rob Morgan</b>, Co-Author of the Most Often Cited Business Research Article of ...
</i></p>
<p>
Odd, isn't it? I've written before about how <a href="http://weblog.infoworld.com/udell/2003/09/26.html">citation of accessible primary sources</a> is a core value of the blogging culture. In certain scientific disciplines, that's becoming true as well. In response to my recent item on Apple's Knowledge Navigator, Claus Dahl, who blogs at <a href="http://www.classy.dk/">Notes from Classy's Kitchen</a>, wrote to say:
</p>
<blockquote cite="Claus Dahl">
For scientific citations, NEC's CiteSeer is way ahead of Google at http://citeseer.nj.nec.com/ and actually not too far from the scenario you referenced (Dr. Flemson/Fleming).
</blockquote>
<p>
That's true for the computer-science-related disciplines tracked by CiteSeer. Likewise, the <a href="http://arxiv.org/">arXiv.org e-print archive</a> is a phenomenal resource for physics, math, computer science, and related fields. As CiteSeer's Steve Lawrence notes:
<blockquote cite="Steve Lawrence">
Articles freely available online are more highly cited. For greater impact and faster scientific progress, authors and publishers should aim to make research easy to access. [<a href="http://www.neci.nec.com/~lawrence/papers/online-nature01/">CiteSeer: Online or invisible?</a>]
</blockquote>
Beyond the computer-science-related disciplines, though, it's unclear to me how much scientific content is becoming freely available online, and therefore able to benefit from the powerful knowledge-transmission and reputation-building forces at work in the blogosphere. 
</p>

</body>
</item> 

<item num="a841">
<title>Conserving the RESTful ecosystem</title>
<date>2003/11/05</date>
<body>
<p>
Michael Pate has <a href="http://www.libraryplanet.com/2003/11/polaris">a note</a> about a LibraryLookup tweak for version 3.0 of the Polaris system. Thanks for the heads-up, Michael. I've added it to the <a href="http://weblog.infoworld.com/udell/stories/2002/12/11/librarylookupGenerator.html">LibraryLookup generator</a> page.
</p>
<p>
It's been almost a year since I started the LibraryLookup project, and it seems to be cruising along nicely. Rarely does an opportunity arise to demonstrate an idea -- in this case, the protean utility of the RESTful Web -- in a way that's easy for people to latch onto, and that delivers immediate value. 
</p>
<p>
Yesterday I spent an hour on the phone with a Microsoft architect, reviewing the whys and wherefores of Longhorn's Avalon presentation system. At one point, he referred me to an Avalon-enhanced Amazon.com demo that was shown at the PDC, and asked: &quot;How could you not want that experience?&quot; Point taken. It seems obvious that everybody would prefer a more dynamic, more interactive application. 
</p>
<p>
Here's a counterpoint. A while back, a librarian wrote to me asking how she could integrate her OPAC with LibraryLookup. I investigated and found that her vendor's implementation was based on a Java applet, and there was no way to link into it. As I mentioned to Eric Rudder and Don Box at a meeting in Redmond, this librarian later posted to a mailing list that her OPAC couldn't support LibraryLookup because it was built on the &quot;wrong kind&quot; of software, where &quot;wrong&quot; meant -- though she wouldn't have called it this -- non-RESTful. For her, the richer experience of that Java applet was a poor tradeoff, since it precluded LibraryLookup's lightweight style of integration. 
</p>
<p>
It's too early to judge whether Avalon will involve us in such tradeoffs, and if so how to mitigate them. But it's certainly something worth having a conversation about. We're dealing with a whole ecosystem here, and we need to think in those terms.
</p>

</body>
</item> 

<item num="a839">
<title>Personal service-oriented architecture</title>
<date>2003/11/04</date>
<body>
<p>
<blockquote cite="InfoWorld">
There's an important lesson here I hope desktop applications will learn, courtesy of the emerging paradigm of SOA (service-oriented architecture). In the realm of SOA, events are represented in an open XML format and flow through a transparent pipeline that's open to inspection and subject to intermediation...
<br/><br/>
Ironically, the graphical desktop popularized the event-driven model that's being writ large in the Web services network. Now we need to come full circle. Local event streams need to be open in the same ways as network event streams are and for the same reasons. [<a href="http://www.infoworld.com/article/03/10/31/43OPstrategic_1.html">InfoWorld: Strategic Developer: October 31, 2003</a>]
</blockquote>
When I mentioned Apple's Knowledge Navigator video in a <a href="http://weblog.infoworld.com/udell/2003/10/23.html#a831">blog posting</a> recently, it attracted an unusual amount of attention. Evidently many people long for the kind of human/computer interaction so vividly imagined in that video. This week's InfoWorld column asks the question: How can today's technologies deliver some of the kinds of intelligent assistance that we crave? My conclusion was that the principles of service-oriented architecture can apply on the desktop as well as in the cloud. If local applications exchange XML messages with one another, as well as with the services cloud, then the same techniques of observation and intermediation can apply in both realms.
</p>
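<p>
Here is a minimal sketch of what observation and intermediation could mean for a local event stream. It is an illustration under stated assumptions, not a description of any shipping system: plain objects stand in for XML messages, and <tt>createPipeline</tt>, <tt>observer</tt>, and <tt>tagSource</tt> are hypothetical names.
</p>

```javascript
// Build a pipeline from a list of intermediaries. Each stage receives the
// event and returns it (possibly annotated) for the next stage.
function createPipeline(intermediaries) {
  return (event) => intermediaries.reduce((current, stage) => stage(current), event);
}

// An observer that records what flows past without changing it.
const seen = [];
const observer = (event) => {
  seen.push(event.type);
  return event;
};

// An intermediary that annotates events with the application that raised them.
const tagSource = (event) => ({ ...event, source: 'desktop-mail-client' });

// Hypothetical usage with an invented event.
const deliver = createPipeline([observer, tagSource]);
const delivered = deliver({ type: 'message-opened', subject: 'Q4 budget' });
console.log(seen, delivered);
```

<p>
Because every stage sees the same open representation, adding another observer means appending one function to the list, whether the events originate on the desktop or in the network.
</p>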
<p>
It's intriguing to note, in this vein, that Longhorn's communication subsystem, Indigo, aims to make standards-based XML messaging work efficiently across a broad range of topologies. In an article just published on MSDN, Don Box writes:
<blockquote cite="Don Box">
Indigo makes service-oriented programming viable in a broad spectrum of mainstream applications. By taking advantage of various facilities of both the CLR and Windows, Indigo can be used in performance-sensitive situations such as single-host and even single-process integration. This scale-invariance makes Indigo-based services accessible to a broader range of deployment options than current technologies. [<a href="http://msdn.microsoft.com/Longhorn/understanding/mag/default.aspx?pull=/msdnmag/issues/04/01/Indigo/default.aspx">A Guide to Developing and Running Connected Systems with Indigo</a>]
</blockquote>
</p>
<p>
<a href="http://en.wikipedia.org/wiki/Black_helicopter_conspiracy_theory"><img align="right" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/blackHelicopter.jpeg"/></a>
Excellent! Five years ago, I got excited about the idea that a <a href="http://udell.roninhouse.com/dhttp/dhttp.html">local Web server</a> could enable a peer-to-peer style of computing in which user-facing applications and machine-to-machine services were made of the same stuff and therefore highly synergistic. I'm still jazzed about a unified service-oriented model, and look forward to seeing Indigo help make it a reality. So, in case you were wondering, I am not a complete <a href="http://en.wikipedia.org/wiki/Black_helicopter_conspiracy_theory">black helicopter conspiracy theorist</a> on Longhorn!
</p>

</body>
</item> 

<item num="a838">
<title>Windows types and universal types</title>
<date>2003/11/03</date>
<body>
<p>
Dare Obasanjo differentiates the <a href="http://longhorn.msdn.microsoft.com/lhsdk/winfs/daconwhatiswinfsdatamodel.aspx">WinFS data model</a> from the structures definable in W3C XML Schema as follows:
<blockquote cite="Dare Obasanjo">
The WinFS schema language and W3C XML Schema do two fundamentally different things; W3C XML Schema is a way for describing the structure and contents of an XML document while a WinFS schema describes types and relationships of items stored in WinFS. [<a href="http://www.25hoursaday.com/weblog/PermaLink.aspx?guid=d0b398f5-9343-4f91-8e1c-340c5a6669e2">Dare Obasanjo</a>]
</blockquote>
As I see it, there are two major classes of XSD-governed XML documents. Both are well-supported by Microsoft and by other players. First, in the realm of Web services, we have SOAP messages wrapped in WSDL interfaces. Over the past two years, thanks in large part to Microsoft efforts which I've lauded, there's been a shift from an RPC-oriented style of SOAP messaging to the document-oriented style that better suits the coarse-grained, asynchronous, message-driven architectural pattern. XSD is how we define the datatypes and structures in those messages.</p>
<p>
Meanwhile, at the intersection of users and data, we have XML documents that people can read, exchange, and interact with. Here too, we've agreed on XSD as a good way to define the types and structures found in these documents. I've long applauded, and forcefully evangelized, the XML Schema support that became available last month in Office 2003.
</p>
<p>
Finally, I've drawn attention to a remarkable synergy that InfoPath, most notably, makes possible. The message payloads exchanged on the Web services network, and the documents read and written by people, can be the same texts, governed by the same datatype and structure definitions. And those texts are universal in scope, not tied to any platform or framework. True, InfoPath is a Windows-only creature, but since it's built on open standards, InfoPath-like software can exist on other platforms and can interoperate with InfoPath.
</p>
<p>
We have yet to even scratch the surface of what's possible given these circumstances. And now here comes WinFS with its own proprietary schema language. In recent years, it's been popular to layer innovation on top of base standards. So XSLT, XQuery, and SQL200n all rely on XPath, as WSDL relies on XSD. Yet no base standards beyond XML itself were of use to WinFS? It puzzles me. The things defined in WinFS don't seem exotic or mysterious. &quot;A WinFS Contact type,&quot; the docs say, &quot;has the Item super type. Person, Group, and Organization are some of its subtypes.&quot; If XSD can't model such things, we're in real trouble.
</p>
<p>
Of course WinFS does much more than model datatypes and structures. It's a highly sophisticated storage system that supports relational, object, and XML access styles, and that treats relationships among items as first-class objects in themselves (a potent feature I first encountered in the object database world years ago). Great stuff! But the terminology of the Longhorn docs is revealing. Person, Contact, and Organization items are referred to as &quot;Windows types,&quot; presumably because their schemata appear as classes in Longhorn's managed API. But to me these are universal types, not Windows types. I had expected them to be defined using XML Schema, and to be able to interoperate directly with SOAP payloads and XML documents on any platform.
</p>
<p>
To use XML Schema, Dare suggests, WinFS would have to do one of two things:
<blockquote cite="Dare Obasanjo">
<ol> 
<li>Support a subset of W3C XML Schema</li> 
<li>Extend W3C XML Schema to add support for WinFS concepts</li>
</ol>
</blockquote>
It seems to me there is lots of precedent for a third approach: use XSD for base datatype/structure definition, and advance a new layered standard (if needed) for relationship definition.
</p>
<p>
Dare concludes:
<blockquote cite="Dare Obasanjo">
Ideally, even though WinFS has its own schema language it makes sense that it should be able to import or export WinFS items as XML described using an W3C XML Schema since this is the most popular way to transfer structured and semi-structured data in our highly connected world. This is functionality is something I've brought up with the WinFS architects which they have stated will be investigated.
</blockquote>
Good. That would be a minimal expectation. It's troubling, though, that the architects must be consulted to find out whether Longhorn's &quot;Windows types&quot; will be transferable to standards-based software. Note that such software now prominently includes Office 2003 along with other Microsoft and non-Microsoft apps and services. &quot;Connected&quot; is one of Longhorn's key marketing terms. Yet I am not alone in observing that Longhorn's approach is, in various ways, oddly self-contained.
</p>
</body>
</item> 

<item num="a837">
<title>Dear Sonya</title>
<date>2003/11/01</date>
<body>
<blockquote cite="AT&amp;T">
<p>Dear Jon Udell,</p>
<ul>
<li><b>Visit</b> our website at www.att.com/nopaper
</li>
<li><b>Click</b> the &quot;Sign Up Now&quot; button
</li>
<li><b>Log in</b> with your User ID and Password for quick ordering
</li>
<li><b>Complete</b> and submit the form
</li>
</ul>
<p>
That's all there is to it. Online billing saves you time and gives you more options. And when you enroll by November 15, 2003, as an added benefit, you'll receive a $20 Amazon.com Certificate.
</p>
<p>
Sincerely, <br/>
Sonya Bigby-McCloud<br/>
Director, AT&amp;T Online Marketing
</p>
</blockquote>
<p>
Dear Sonya,
</p>
<p>
Thanks very much for the offer. As it happens, I already have an online bill presentment system. I would like very much to use it to pay AT&amp;T, rather than create yet another online identity. My own system, which is hosted for my bank by <a href="http://www.s1.com/">S1</a>, and through which I already pay AT&amp;T's paper bills electronically, says it can also present AT&amp;T bills electronically, and it invites me to sign up for that feature. But when I try, the system reports:
</p>
<p><tt>
&quot;The account number you typed does not match the typical account number format for this payee.&quot;
</tt></p>
<p>
It's puzzling. The account number evidently means something to someone, because my payments are being debited from my account, and my long-distance service keeps working. 
</p>
<p>
When I try to call AT&amp;T to sort this out, I end up having surreal adventures like the one <a href="http://weblog.infoworld.com/udell/2003/06/23.html#a731">I reported</a> in June. Maybe you, S1, and CheckFree can put your heads together and fix this? As a token of my gratitude if you do, please use my $20 Amazon Certificate to buy yourself a Christmas present.
</p>
<p>
Sincerely,
</p>
<p>
Jon Udell
</p>
</body>
</item> 

<item num="a836">
<title>Replace and defend</title>
<date>2003/10/31</date>
<body>
<p>
Reading the Longhorn SDK docs is a disorienting experience. Everything's familiar but different. Consider these three examples: 
</p>
<p>
Example 1: The new CSS:
</p>
<blockquote cite="Longhorn SDK">
<p>
<b>Why the need for Adaptive-flow Format?</b> The advent of the World Wide Web posed several challenges for designers. Print media never required the adapting of layout to different media shapes. However, when pages were viewed using the World Wide Web, it was impossible to predict the size of the window in which the document would be viewed or the personal preferences that would be selected by the viewer. As a consequence, a document that looked great in one format might display poorly on another. While HTML and Cascading Style Sheets (CSS) made strides toward remedying this situation, the results were often less than ideal because the specifications were applied without regard to the readability of the page.</p> 
<p>
...
</p>
<pre>
&lt;AdaptiveMetricsContext
  ID=&quot;root&quot;
  xmlns=&quot;http://schemas.microsoft.com/2003/xaml/&quot;
  FontFamily=&quot;Arial&quot;
  ColumnPreference=&quot;Medium&quot;&gt;
&lt;TextPanel Background=&quot;white&quot;&gt;
  &lt;Section&gt;
    &lt;Heading&gt;Adaptive-flow-format Example&lt;/Heading&gt;
      &lt;Paragraph&gt;This example shows the advanced capabilities of... 
</pre>
<p>
[<a href="http://longhorn.msdn.microsoft.com/lhsdk/docservices/overviews/edocs_adaptive.aspx">Introduction to Adaptive-flow Format Documents</a>]
</p>
</blockquote>
<p>
Example 2: The new SVG:
</p>
<blockquote cite="Longhorn SDK">
<p>
<b>Vector drawing for &quot;Avalon&quot;</b>
&quot;Avalon&quot; offers several layers of access to graphics and rendering services. At the top layer, Microsoft Windows Vector Graphics (WVG) provides a number of advantages common to XML-based graphics markup. WVG is straightforward to use with the rest of the &quot;Avalon&quot; object model, it is readily reusable, and it is familiar to users of Scalable Vector Graphics (SVG). Objects are available as markup elements, with properties exposed either as attributes on those elements or as complex properties.
</p>
<p>
...
</p>
<pre>
&lt;Canvas ID=&quot;root&quot;
xmlns=&quot;http://schemas.microsoft.com/2003/xaml&quot;
Background=&quot;White&quot;&gt;
  &lt;Path Data=&quot;M 100,200 C 100,25 400,350 400,175 H 280&quot; 
        Stroke=&quot;DarkGoldenRod&quot; 
        StrokeThickness=&quot;3&quot;/&gt;
&lt;/Canvas&gt;
</pre>
<p>
[<a href="http://longhorn.msdn.microsoft.com/lhsdk/graphicsmm/overviews/wvg1.aspx">Vector Drawing for &quot;Avalon&quot;</a>]
</p>
</blockquote>
<p>
Example 3: The new XSD:
</p>
<blockquote>
<p>
<b>&quot;WinFS&quot; Schema Definition Language</b> 
&quot;WinFS&quot; introduces a schema definition language to describe &quot;WinFS&quot; types. This language is an XML vocabulary. &quot;WinFS&quot; includes a set of schemas that define a set of Item types and NestedElement types. These are called Windows types.
</p>
<p>
Note that the &quot;WinFS&quot; schema definition language is not XSD (XML Schema Definition Language). The &quot;WinFS&quot; data model is distinct from other data models, including Entity-Relationship (ER), Object-Relationship (OR), and various XSD and common language runtime type models. Therefore, to support the rich constructs of the &quot;WinFS&quot; data model, &quot;WinFS&quot; introduces its own schema definition language to define &quot;WinFS&quot; types and relationships.
</p>
<p>
...
</p>
<pre>
&lt;Type Name=&quot;Person&quot; MajorVersion=&quot;1&quot; MinorVersion=&quot;0&quot; 
      ExtendsType=&quot;Core.Contact&quot; ExtendsVersion=&quot;1&quot;&gt;
  &lt;Field Name=&quot;BirthDate&quot; Type=&quot;WinFSTypes.datetime&quot; 
         Nullable=&quot;true&quot; TypeMajorVersion=&quot;1&quot;&gt;&lt;/Field&gt;
  &lt;Field Name=&quot;PersonalNames&quot; Type=&quot;Contact.FullName&quot; 
         Nullable=&quot;true&quot; MultiValued=&quot;true&quot; 
         TypeMajorVersion=&quot;1&quot;&gt;&lt;/Field&gt;
</pre>
<p>
[<a href="http://longhorn.msdn.microsoft.com/lhsdk/winfs/conWinFSImplementationModel.aspx">&quot;WinFS&quot; Schema Definition Language</a>]
</p>
</blockquote>
<p>
Joe Hewitt sums it up nicely:
<blockquote cite="Joe Hewitt">
I think the bottom-line of XAML is that it is equally useful for creating both desktop applications, web pages, and printable documents. This means that Microsoft may be attempting to simultaneously obsolete HTML, CSS, DOM, XUL, SVG, SMIL, Flash, PDF. At this point, the SDK documentation is too incomplete to firmly judge how well XAML compares with these formats, but I hope this lights a fire under the collective butt of the W3C, Macromedia, and Adobe. 2006 is going to be a fun year. [<a href="http://www.joehewitt.com/">joehewitt.com</a>]
</blockquote>
Yeah, &quot;embrace and extend&quot; was so much fun, I can hardly wait for &quot;replace and defend.&quot; Seriously, if the suite of standards now targeted for elimination from Microsoft's actively-developed portfolio were a technological dead end, ripe for disruption, then we should all thank Microsoft for pulling the trigger. If, on the other hand, these standards are fundamentally sound, then it's a time for what <a href="http://allconsuming.net/item.cgi?isbn=0060521996">Clayton Christensen</a> calls sustaining rather than disruptive advances. I believe the ecosystem needs sustaining more than disruption. Like Joe, I hope Microsoft's bold move will mobilize the sustainers.
</p>
<p>
<b>Update:</b> I'm delighted to see that my former BYTE colleague John Montgomery, who is now a Microsoft group product manager and developer platform evangelist, and who helped Microsoft work through a number of standards issues in the formative era of Web services, has launched a <a href="http://blogs.gotdotnet.com/johnmont/">blog</a>. Excellent! Today, John <a href="http://blogs.gotdotnet.com/johnmont/PermaLink.aspx/fc49c430-98ea-438d-b342-0ba3acf6e80c/PermaLink.aspx/fc49c430-98ea-438d-b342-0ba3acf6e80c">notes</a> this posting and promises to return with input from Longhorn architects. I very much look forward to a fuller discussion of these issues. 
</p>

</body>
</item> 

<item num="a835">
<title>The Forbes forum on dynamic mid-sized companies</title>
<date>2003/10/30</date>
<body>
<p> <a href="http://www.forbeshighlander.com"><img src="http://weblog.infoworld.com/udell/gems/highlander.jpg" hspace="6" vspace="6" align="right" alt="forbes forum" height="225" width="313"/></a> I spoke about blogging and corporate identity yesterday at the 6th annual Forbes forum for dynamic mid-sized companies, in New York. This was a pleasant change of pace for me. Although some tech CEOs and VCs were in attendance, the gathering was very diverse and included folks like Fetzer Vineyards' Paul Dolan, Hasbro's Alan Hassenfeld, and Green Mountain Coffee Roasters' Bob Stiller. </p> <p> I was on a panel with <a href="http://www.g2bgroup.com/people.htm#phil">Phil Gomes</a> and <a href="http://www.google.com/search?q=john+ellis+tech+central+station">John Ellis</a>. To a roomful of people mostly unfamiliar with blogging, I stressed that it's a way to communicate passion and energy in a direct and authentic way. I think that message resonated pretty well. The room was full of passionate entrepreneurs, and they're excited about issues -- for example, customer service -- that you don't hear much about at the kinds of tech conferences I usually attend. </p> <p> As it happens, one of the most passionate statements did come from a tech CEO: Rick Belluzzo. Formerly president and COO of Microsoft, he joined Quantum last year and has been shepherding the downsized company through a painful business transition. Recalling a Latin American tour in support of the WinXP launch, during which he hobnobbed with presidents of countries as well as high-profile customers, Belluzzo said: &quot;At the end of the week, I didn't know if I had an impact. Did I help sell more copies of Windows XP product or improve the business?&quot; He added jokingly: &quot;At Microsoft I had a chief of staff. 
What does that mean?&quot; </p> <p> Nobody was sure exactly what defines a &quot;mid-sized&quot; company, but it was clear that everyone there valued the ability to make direct contributions every day and produce tangible results. Tim Forbes, COO of Forbes Magazine, amplified that theme. Forbes Magazine, of course, lacks the scale of the media titans such as Time-Warner and McGraw-Hill with whom it competes. As a result, a project that promises to make a few million dollars &quot;will get senior management attention,&quot; Forbes said. </p> <p> At lunch I got to quiz Avocet CEO David Tait about the air-taxi revolution that <a href="http://www.avocetprojet.com/">Avocet</a>, <a href="http://www.eclipseaviation.com/">Eclipse</a>, and others want to spark. I mentioned this concept in an item entitled <a href="http://radio.weblogs.com/0100887/2002/07/16.html">peer-to-peer air travel</a>, which includes a link to James Fallows' seminal <a href="http://www.theatlantic.com/issues/2001/06/fallows.htm">Atlantic Monthly article</a> on the subject. Got two million bucks burning a hole in your pocket? You can <a href="http://www.avocetprojet.com/preorder.php">order a ProJet now</a>! But hurry, the low serial numbers are going fast: Joe Montana <a href="http://www.avocetprojet.com/pressrelease5.php">just got #16</a>. </p> <p> The air-taxi idea resonates with me not only because I lack convenient access to hub airports, but because the grid-like architecture of the system in which these planes will operate reminds me of the network of peer-to-peer services that's transforming the software landscape. We'll see in five years whether that analogy holds, I guess. But the concept sure is appealing. Initially the system will only be able to compete with conventional air travel at the margin. But if it can survive and grow to critical mass, it could become massively disruptive. 
To see why, consider that for trips under 500 miles -- the majority of airline flights -- your average speed is around 60 miles per hour. Sure, some of the trip happens in the air at 300 to 500 mph, but getting to and from the hubs, parking, dealing with security, waiting in runway traffic jams, and all the rest of the nightmarish indignities that characterize modern air travel conspire to make flying not much faster than driving. It's a system ripe for disruption. Bring it on! </p>
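<p>
The arithmetic behind that 60 mph figure is easy to sketch. The function name and the trip numbers below are my own illustrative assumptions (a hypothetical 400-mile trip with 5.5 hours of ground overhead), not measured data:
</p>

```javascript
// Door-to-door average speed for a short-haul airline trip.
// All inputs are illustrative assumptions, not measurements.
function doorToDoorMph(tripMiles, cruiseMph, groundHours) {
  const airHours = tripMiles / cruiseMph;        // time actually spent flying
  return tripMiles / (airHours + groundHours);   // overall average speed
}

// Hypothetical trip: 400 miles flown at 400 mph, plus 5.5 hours spent
// driving to and from hubs, parking, clearing security, and waiting.
const mph = doorToDoorMph(400, 400, 5.5);
console.log(mph.toFixed(1)); // prints "61.5"
```

<p>
With these assumed inputs, one hour in the air is swamped by the time on the ground, which is the whole point: shaving the overhead matters far more than flying faster.
</p>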
</body>
</item> 

<item num="a834">
<title>Avalon isn't about Web/GUI convergence</title>
<date>2003/10/28</date>
<body>
<p>
Edwin Khodabakchian echoes what seems to be a common -- but I think incorrect -- perception that XAML, the <a href="http://www.mozilla.org/projects/xul/">XUL</a>-like layout language revealed this week to be a building block of Longhorn's Avalon presentation subsystem, heralds some kind of Web/GUI convergence:
</p>
<blockquote cite="Edwin Khodabakchian">
We had prototypes and concepts at Netscape that were very close to what Microsoft is starting to promote with Avalon. On the positive side, it is great to see HTML, XML, CSS and SVG become the foundation of UI development within windows. [<a href="http://www.collaxa.com/radio/2003/10/27.jsp#a479">Organic BPEL</a>]
</blockquote>
<p>
My understanding, based on a demo I saw last week, is that although XAML is indeed an XML dialect, it has nothing to do with HTML, CSS, or SVG. It's true that the Avalon presentation engine is Web-like, or to be more precise ASP.NET-like in its separation of layout markup and &quot;code-behind.&quot; But it builds no bridges to pre-Avalon clients. The foundation of Avalon's vector-based UI, for example, is Direct3D. I asked whether SVG -- an obviously relevant Web standard -- would be a preferred (or at least a supported) interface to Direct3D, and was told that it would not. 
</p>
<p>
Here, from the newly-hatched Longhorn Developer Center, is another statement which implies a convergence that I don't see in the cards:
<blockquote cite="Charles Petzold">
Avalon and XAML represent a departure from Windows-based application programming of the past. In many ways, designing your application's UI will be easier than it used to be and deploying it will be a snap. With a lightweight XAML markup for UI definition, Longhorn-based applications are the obvious next step in the convergence of the Web and desktop programming models, combining the best of both approaches. [<a href="http://msdn.microsoft.com/longhorn/default.aspx?pull=/msdnmag/issues/04/01/Avalon/default.aspx">Longhorn Developer Center: Code Name Avalon: Create Real Apps Using New Code and Markup Model</a>]
</blockquote>
To my way of thinking, you don't have &quot;the best of both approaches&quot; unless you have a ubiquitous client. As <a href="http://radio.weblogs.com/0113297/2003/10/27.html#a257">Jeremy Allaire</a> pointed out the other day, Flash is making a serious effort along these lines, and has -- in Laszlo and the forthcoming Royale -- its own XML-based layout techniques. I've also mentioned Mozilla's cross-platform technique, XUL. Now Microsoft is pitching a Windows-only UI renderer that targets 2006-era desktops and notebooks, while allowing MSIE to stagnate. I can see how and why they arrived at this strategy, but it doesn't seem to be the kind of Web/GUI convergence I'm looking for.
</p>

</body>
</item> 

<item num="a833">
<title>Open source citizenship</title>
<date>2003/10/28</date>
<body>
<p>
<blockquote cite="InfoWorld">
On the world stage, both failures and successes can loom larger than in the corporate cubicle. Developers who plug into the reputation-driven meritocracy of open source -- while advancing the goals of your business -- are a force to be reckoned with. [<a href="http://www.infoworld.com/article/03/10/24/42OPstrategic_1.html">InfoWorld: Open source citizenship: October 24, 2003</a>]
</blockquote>
This column was based on the observation that corporate IT shops are apparently more likely to fork an open source project for internal development and use than to join and contribute to the project. Some correspondents were puzzled by my comments on licensing, so I'll try to clarify. The open source licensing regime, as Tim O'Reilly has often pointed out, has as its basis the distribution of source code. As we move to a service-oriented software ecosystem, that basis will necessarily erode. If a GPL'd module is copied, modified, and then deployed behind a firewall to power a service that's world-accessible and free (as in 'free beer'), then am I as a user of that service free (as in 'free speech') to modify and share it? In one sense yes: I can wrap the service in a novel service of my own creation -- if the provider's terms of service (a different layer of licensing) allow me to. In another sense no: the internal modifications that make the service more interesting/powerful/useful than the GPL'd original are not available to me for modification and sharing.
</p>
<p>
I'm not suggesting that a different licensing regime could, or even should, prevent such a scenario. But I am saying that some habits that evolved decades ago will need to be rethought in a service-oriented ecosystem. Another example, which I mentioned in a <a href="http://www.infoworld.com/article/03/01/31/05stratdev_1.html">column on open services</a> a while back, touches on the way tests are bundled with open source projects. Here again there is a presumption of source distribution. But when a user of a service never acquires its source, and invokes the service from a different programming language than the one in which it was written, it may make more sense to deploy tests as auxiliary SOAP/WSDL constructs.
</p>
<p>
Back to this week's column, here's what I think is really the most salient issue:
<blockquote cite="InfoWorld">
Like the Internet itself, the modern enterprise now relies on the fruits of the most successful open source projects. But the commoditization of operating systems, compilers, and servers only scratches the surface of what's possible. All sorts of infrastructure software can benefit from the open source model. Business software, not all of which is necessarily proprietary, is ripe for commoditization too.
</blockquote>
If we're going to get substantial commoditization in the business layer, based on an open source development model, it won't be the result of licensing innovation. Rather, it will happen when captive developers are allowed to come out and play, to explore the boundary that separates proprietary intellectual property from sharable infrastructure, and to work together on commoditizing that sharable infrastructure. 
</p>
</body>
</item> 

<item num="a832">
<title>Clash of the titans: Amazon vs. Google</title>
<date>2003/10/24</date>
<body>

<p>
<a href="http://weblog.infoworld.com/udell/gems/amazonSearchInside.jpg"><img width="278" height="194" alt="amazon search inside" align="right" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/amazonSearchInside.jpg"/></a>
Ned Batchelder alerted me this morning to Amazon's new search feature:
<blockquote cite="Ned Batchelder">
Now Amazon lets you search the full text of its books. This is astounding, not only because of the further differences it highlights between Amazon and traditional bookstores, but because of the effort it must have taken to accomplish. The text seems to be from scans of pages, subjected to an OCR process. And not just the bulk of popular books, either. They've got all sorts of wild and wooly volumes available this way. I don't know how truly useful it will be, since full text searching can be extremely noisy, even before the OCR noise is factored in. [<a href="http://www.nedbatchelder.com/blog/200310.html#e20031023T223042">Ned Batchelder: October 2003</a>]
</blockquote>
I wondered about the OCR strategy too. In this day and age, surely any publisher could provide electronic copy to an indexer. But then I drilled down and discovered something quite remarkable. I own a copy of <a href="http://allconsuming.net/item.cgi?isbn=0743215362">Tesla: Man Out of Time</a>. The other day, I was mentioning to someone that, according to that book, some of Nikola Tesla's writings are <i>still</i> classified. This <a href="http://www.amazon.com/gp/reader/0743215362?v=search-inside&amp;keywords=tesla+classified+files">query</a> finds the passage I was remembering. Awesome! Now the physical book I bought from Amazon is more valuable to me. Its printed index has been augmented by a vastly more capable online index. This extremely useful capability is, by the way, also available to owners of books in the <a href="http://safari.oreilly.com">Safari Books Online</a> service, though it correlates results only to chapter and section, not to page. Little-known fact: you need not be a Safari subscriber to use Safari as an augmented index to books you own.
</p>
<p>
Whether or not you own one of the books now searchable on Amazon, you can now view a scan of any page in the book that matches a query. Clearly that could be abused. Searching for 'Tesla' in the Tesla book finds almost every page, for example. So Amazon requires you to log in in order to view those pages; presumably they'll monitor activity and shut down people who try to read whole books this way.
</p>
<p>
When I designed Safari the notion of a fulltext-searchable book catalog was paramount. So was the notion of a browseable catalog that exposes introductory chunks of every section of every chapter to the public Web. This was a conscious strategy to create &quot;Google surface area&quot; -- and for a while, it worked. When you searched for a term found in an O'Reilly book, the Safari page for that book often showed up. But as time went on, Google seemed less willing to take the linkbait. Currently, it <a href="http://www.google.com/search?q=site%3Asafari.oreilly.com+safari">appears to be finding</a> only a few hundred of the over 1800 O'Reilly / Addison Wesley / New Riders / Prentice Hall / Que / Sams / Peachpit books in the Safari service, and then only the home pages of those books, not any of the tens of thousands of preview pages. Of course Google has no particular incentive to do an exhaustive job searching online catalogs. For businesses that are so incented, like Amazon, local search is the only way to guarantee coverage. I'll be fascinated to see whether and how such local search services federate -- with or without Google's cooperation.
</p>
</body>
</item> 

<item num="a831">
<title>Apple's Knowledge Navigator revisited</title>
<date>2003/10/23</date>
<body>
<p>
<img alt="knowledge navigator" align="right" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/knownav.jpg"/>
During my session at BloggerCon I <a href="http://weblog.infoworld.com/udell/gems/bc03.ram">referred to</a> Apple's famous Knowledge Navigator concept video. I first saw that video in 1988. Today I tracked down a <a href="http://www.bu.edu/jlengel/kn65kfs.mov">copy</a> and watched it again. It stands the test of time rather well! Certain elements of that vision are now routine -- for example, Google found me the video and WiFi delivered it to a PowerBook which, when equipped with its iSight camera, bears a family resemblance to the Dynabook-like talking computer featured in the video. Other aspects are still far out of reach, especially the conversational interface based on deep understanding of natural language. 
</p>
<p>
Clearly natural language is taking a lot longer than the pioneers expected. Back in 1953 researchers thought it was going to be a five-year project. No one in 2003 is so optimistic. In other respects, though, important elements of the Knowledge Navigator vision seem within reach. At one point, the fictional Professor Bradford tries to recall a paper he read five years before, in which a Dr. Flemson, as he misremembers the name, disagreed with the direction of a colleague's research on deforestation. &quot;John Fleming, of Uppsala University,&quot; the computer replies. &quot;He published in the Journal of Earth Science in July 2006.&quot; Google's not quite there yet, of course, but its helpful modification of failed queries is a step in the right direction.
</p>
<p>
The next bit is more fanciful. &quot;Fleming challenged Jill's prediction about the amount of carbon dioxide released due to deforestation,&quot; says Prof. Bradford. &quot;I'd like to recheck his figures.&quot;  &quot;Here's the rate of deforestation he predicted,&quot; says the computer, displaying a chart. &quot;Mm hmm,&quot; says Bradford, &quot;and what really happened?&quot; The computer overlays the actual data, showing significant variance from Fleming's prediction. It's a stretch, but we can at least imagine how to pull something like this off today. Fleming's data would be in XML; the software would infer a schema from it; a query to a Web service would yield the actual reported data; transformation would correlate the two data sets for display on a common surface. 
</p>
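<p>
Here is a minimal sketch of just the last step, the correlation of predicted and actual figures. It assumes the XML has already been parsed and the Web service already queried, so that both data sets arrive as arrays of year/value records. The function name, field names, and numbers are all hypothetical placeholders:
</p>

```javascript
// Join a predicted series and an actual series on year, as a stand-in
// for the transformation step. Both data sets are invented placeholders.
function correlate(predicted, actual) {
  // Index the actual figures by year for quick lookup.
  const actualByYear = new Map(actual.map((row) => [row.year, row.value]));
  return predicted
    .filter((row) => actualByYear.has(row.year)) // keep years present in both
    .map((row) => ({
      year: row.year,
      predicted: row.value,
      actual: actualByYear.get(row.year),
      variance: actualByYear.get(row.year) - row.value,
    }));
}

// Hypothetical inputs; a real system would get these from parsed XML
// and from a Web service response.
const predicted = [
  { year: 2004, value: 10 },
  { year: 2005, value: 12 },
  { year: 2006, value: 14 },
];
const actual = [
  { year: 2004, value: 11 },
  { year: 2005, value: 15 },
];
console.log(correlate(predicted, actual));
```

<p>
The join itself is trivial. The hard parts, which this sketch skips entirely, are finding the two data sets and inferring that their fields mean the same thing.
</p>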
<p>
Presence, attention management, and multimodal communication are woven into the piece in ways that we can clearly imagine if not yet achieve. &quot;Contact Jill,&quot; says Prof. Bradford at one point. Moments later the computer announces that Jill is available, and brings her onscreen. While they collaboratively create some data visualizations, other calls are held in the background and then announced when the call ends. I feel as if we ought to be further down this road than we are. A universal canvas on which we can blend data from different sources is going to require clever data preparation and serious transformation magic. The obstacles that keep data and voice/video networks apart seem more political and economic than technical.
</p>
<p>
Apple's vision, in any case, was and is spot on. I wonder how much closer to reality it will be in another fifteen years.
</p>
</body>
</item> 

<item num="a830">
<title>Ben Bederson's DateLens</title>
<date>2003/10/21</date>
<body>
<p>
<a href="http://weblog.infoworld.com/udell/gems/datelens.jpg"><img align="right" vspace="6" hspace="6" width="253" height="257" src="http://weblog.infoworld.com/udell/gems/datelens.jpg" alt="datelens"/></a>
Responding to this week's <a href="http://weblog.infoworld.com/udell/2003/10/20.html#a827">column</a>, <a href="http://www.vanderburg.org/cgi-bin/glv/blosxom">Glenn Vanderburg</a> notes correctly that I should have credited <a href="http://www.cs.umd.edu/~bederson/">Ben Bederson's</a> pioneering work on fisheye distortion in GUIs. I checked, and Samuel Wan <a href="http://63.144.246.231/information/archives/000092.html">does cite</a> Bederson on his blog. Glenn writes:
<blockquote cite="Glenn Vanderburg">
Ben has been working for over ten years now to increase the richness of our GUIs through the use of scale -- in some cases zooming, and in other cases fisheye distortion.  Check out the <a href="http://www.cs.umd.edu/hcil/datelens/">DateLens</a> link on his page, and *be* *sure* to watch the demo video.
</blockquote>
<span class="minireview">DateLens</span> It's a whopping 45MB MPEG file, but I did watch it. To come full circle, you can also see a Flash demo of DateLens in action <a href="http://www.windsorinterfaces.com/datelens.shtml#">here</a>. Or, as it turns out, you can actually use a free desktop version of DateLens -- there's an add-in (written in C# for the 1.1 .NET Framework) available <a href="http://www.cs.umd.edu/hcil/datelens/">here</a>. I'm trying it out today. It's full of powerful ideas! I love how the grid grows and shrinks to accommodate arbitrary amounts of content, how search results are mapped onto the scroll bar, and how exactly-like and similar items are color-coded with a single click. 
</p>
<p>
The retiling as you zoom in and out is, admittedly, a bit disconcerting. That's partly because this version isn't optimized for the desktop -- tooltips don't reveal the next level of detail, and there's no use of the right mouse button, which in this case could have been used for zooming out -- and partly because this is just a very different way of doing things. Too radical for mainstream use? Perhaps so. I've learned the hard way that UI acceptance is very much driven by expectations, and unexpected behavior tends to be penalized. Several people pointed out to me that there's plenty of UI innovation going on, but in games rather than business software. I suppose that's true. Some changes may require a generational shift.
</p>
</body>
</item> 

<item num="a829">
<title>Our own devices</title>
<date>2003/10/21</date>
<body>
<p> 
<a href="http://allconsuming.net/item.cgi?isbn=0375407227"><img vspace="6" hspace="6" align="right" alt="our own devices" src="http://weblog.infoworld.com/udell/gems/ourOwnDevices.jpg" border="1"/></a> Edward Tenner, author of <a href="http://allconsuming.net/item.cgi?isbn=0679747567">Why Things Bite Back: Technology and the Revenge of Unintended Consequences</a>, has a new book called <a href="http://allconsuming.net/item.cgi?isbn=0375407227">Our Own Devices: The Past and Future of Body Technology</a>. Meticulously researched chapters trace the historical development of footgear, chairs, keyboards, and eyeglasses. The unifying theme is the coevolution of technology and technique: how we both change and are changed by these &quot;body technologies&quot; as we use them. 
</p>
<p> <img vspace="6" hspace="6" align="left" alt="birkenstock" src="http://weblog.infoworld.com/udell/gems/birkenstock.jpg"/> Feet in barefoot or sandal-wearing cultures, for example, differ from feet in shod cultures, and so do styles of locomotion. I can walk all day in Birkenstocks at a brisk pace, but I can't sprint for a bus. It's a tradeoff I made years ago in response to a foot injury.
</p>
<p> <img vspace="6" hspace="6" align="right" alt="floating arms" src="http://weblog.infoworld.com/udell/gems/floatingarmschair.gif"/> A different kind of injury -- repetitive strain -- landed me in what my friends call my &quot;Captain Kirk keyboard chair&quot; (aka <a href="http://tim.griffins.ca/gallery/keyboard/FloatArm">Floating Arms</a>). It has its own tradeoffs: I gain a radical redistribution of the load on my arms and shoulders while typing, complete realignment of my posture, and infinite adjustability, but give up floor space and convenient access. 
</p>
<p> Meanwhile WiFi has ushered in a new technology/technique dynamic. The untethered TiBook is seductive, as are the many postures that can accommodate its use. Of course its keyboard would be useless to me were it not for a <a href="http://gnufoo.org/ucontrol/ucontrol.html">software modification</a> that -- albeit imperfectly -- relocates the CONTROL key, all-important to the emacs user, to a less crippling location.
</p>
<p> Tenner concludes with an essay on the newest form of mobile data entry, thumb-driven text messaging:
</p>
<blockquote cite="Edward Tenner"> The major laboratories did not predestine the thumb as the successor to the index finger, though they did help make the thumb's usefulness possible. The full capacities of the digit were discovered through the joint experimentation of users, designers, and manufacturers. 
</blockquote> 
<p>
And here is the lesson he draws from this example: 
<blockquote cite="Edward Tenner"> 
One challenge of advanced industrial societies is a degree of standardization that threatens to choke off both new technologies and new techniques. The remedy is a return to the collaboration between user and maker that marked so many of the great innovations, whether the shaping of the classic American fire helmet or the development of the touch method by expert typists. 
</blockquote>
It's a great point. I particularly wonder about software, which is in theory infinitely plastic but in practice rarely modified. Every time I use a computer at my local library, for example, I boost the display resolution, because the default setting invariably delivers far less screen real estate than the machine can support and than I want. Vast numbers of displays are underutilized in this way, and we know too that the majority of features in software applications are never found or used. Collaboration between user and maker is clearly the right thing, but in the realm of software we're a long way from making it happen. 
</p>
</body>
</item> 

<item num="a828">
<title>W3C vs. OASIS patent policies</title>
<date>2003/10/20</date>
<body>
<p>
At last week's Digital ID conference, Phil Windley gave an excellent overview of a number of base technologies, including <a href="http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=security">SAML</a> and <a href="http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=xacml">XACML</a>, two OASIS standards. Cory Doctorow suggested that these are OASIS rather than W3C standards because the <a href="http://www.oasis-open.org/who/intellectualproperty.php">OASIS patent policy</a> doesn't encourage patent disclosure and royalty-free licensing as vigorously as the <a href="http://www.w3.org/Consortium/Patent-Policy-20030520.html">W3C patent policy</a> now does.
</p>
<p>
Phil didn't have an immediate answer and, after rereading the two policies, neither do I. They are superficially similar but, of course, I am not a lawyer. I do note that, in the case of XACML, approval was delayed -- according to OASIS' Karl Best -- until the technical committee convinced itself that there wasn't going to be a patent hitch:
</p>
<blockquote cite="Karl Best">
The XACML TC has taken reasonable steps to ensure and to document that all features of this language are derived from previous work in the field that is not under patent restrictions. We intentionally made one feature of the language - Obligations - not mandatory to implement in order to make it easier for implementations to avoid a particular feature that might, in some cases, infringe on a known IP claim.
<br/><br/>
It is the belief of the TC that useful and fully-compliant implementations of XACML 1.0 can be created that are royalty free. The TC therefore requests that OASIS adopt the XACML 1.0 specification as submitted. [<a href="http://lists.oasis-open.org/archives/xacml/200302/msg00016.html">OASIS xacml mailing list</a>]
</blockquote> 
<p>
Still, Cory raises an interesting point. I haven't seen a side-by-side legal analysis of the W3C and OASIS policies, nor of the motivations vendors have for aligning with one or the other. <a href="http://radiocomments.userland.com/comments?u=100887&amp;p=828&amp;link=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2F2002%2F12%2F14.html%23a828">Comments and pointers</a> are welcome.
</p>
</body>
</item> 

<item num="a827">
<title>GUIs, linking, and interface experimentation</title>
<date>2003/10/20</date>
<body>
<p>
<blockquote cite="InfoWorld">
In one crucial way, the rich GUI is tragically disadvantaged with respect to its poor browser cousin. Trying to sort out a permissions problem with IIS 6, I clicked a Help button and landed on a Web page. The page could only describe the tree-navigation procedure required to find the tabbed dialog box where I could address the problem. It could not link to that dialog box. This is nuts when you stop and think about it. Documentation of GUI software needs pages of screenshots and text to describe procedures that, on the Web, are encapsulated in links that can be published, bookmarked, and e-mailed. A GUI that doesn't embrace linking can never be truly rich. [<a href="http://www.infoworld.com/article/03/10/17/41OPstrategic_1.html">InfoWorld: How rich is the rich GUI?: October 17, 2003</a>]
</blockquote>
Everybody agreed with the central theme of this column: the &quot;rich&quot; GUI ought to embrace linking. The secondary theme -- that the &quot;rich&quot; GUI ought to be richer on its own terms -- provoked a variety of responses. Danny Ayers worked up an interesting <a href="http://dannyayers.com/2003/10/fisheye.html">CSS/JavaScript</a> variant of Samuel Wan's <a href="http://www.samuelwan.com/downloads/com.samuelwan.eidt/fisheyemenu/FisheyeMenuDemo.html">fisheye demo</a>. Hamish Harvey's response:
</p>
<blockquote cite="MishMash">
I may be wrong, but it seems to me that the CSS provides a pretty effect, whereas the Flash version provides a genuinely different (though I'm not yet sure if it's useful) way to deal with long lists in short spaces. [<a href="http://hamish.blogs.com/mishmash/2003/10/links_are_an_es.html">MishMash</a>]
</blockquote>
<p>
Actually, I thought the CSS variant was cool, and could probably mimic the Flash technique more closely if needed. Like Hamish, I'm not sure this is a useful technique, but I mentioned it in the column just because it's striking how little UI innovation there is. <a href="http://www.xml.com/pub/a/2003/08/06/x3d.html">Len Bullard</a> wrote to second that observation:
</p>
<blockquote cite="Len Bullard">
To improve the GUI, we don't need more dancing 
tabs, we need richer integration metaphors. Can 
we get those without creating domain-focused 
limitations?  In other words, going from GUIs 
that anyone can use to GUIs that only specialists 
can use somewhat following the evolution path 
of domain languages?  Is that progress?
<br/><br/>
Should we look more carefully at the intersections 
of 3D virtual characters and RPGs as means to find 
and present information, faces on the agents?  Should 
we work harder at understanding the relationships among 
the sign systems that people use in daily life and 
the effects these have on human emotions?
<br/><br/>
Most of what we do with GUIs, trees, tabs, dropdowns, 
cascading menus and so on, is list process named items. 
A richer interface would be conversational and would 
organize itself dynamically according to its role.
</blockquote>
<p>
Len asks great questions, and we won't be able to answer them until we do the experiments. Oddly, there doesn't seem to be much experimentation going on. 
</p>
<p>
<b>Update</b>: J. Scott Anderson wrote to say:
<blockquote cite="J. Scott Anderson">
I found it interesting that you mention the fisheye effect yet failed 
to mention the effect in active everyday use -- the Macintosh OS X Dock. 
In my opinion, there is a very successful implementation of such an 
effect.
</blockquote>
Good point. I use OS X myself, and should have made that connection. Thanks!
</p>
</body>
</item> 

<item num="a825">
<title>Why Mozilla matters</title>
<date>2003/10/11</date>
<body>
<blockquote cite="InfoWorld"> 
The browser is dead. It's gone to meet its maker, shuffled off its mortal coil, and joined the bleedin' choir invisible. Macromedia knows this. Microsoft knows this. The makers of countless variations on the theme of the next-generation rich Internet client know this. Everyone knows this except the folks who build and deploy Internet apps. We surveyed them recently, and they told us last-century Web apps are not only alive and kicking, but dominant. Is that nostalgia, or a leading indicator? Both, I suspect. [Full story at <a href="http://www.infoworld.com/article/03/10/10/40OPstrategic_1.html">InfoWorld.com</a>] 
</blockquote> 
<p>
John Dowdell <a href="http://www.markme.com/jd/archives/003452.cfm">wonders</a> why I led this article the way I did. It was intended ironically. I don't in any way think the browser is dead.</p>
<p>
In related news, Derek Robinson wrote to point out that MSIE does indeed have an analog to Mozilla's W3C DOM Traversal and Range API, called <a href="http://www.irt.org/xref/TextRange.htm">TextRange</a>. Neither of these technologies has seen much use, judging by the paucity of Google results for <a href="http://www.google.com/search?q=msie+textrange">msie textrange</a> and <a href="http://www.google.com/search?q=dom+traversal+range">dom traversal range</a>. I don't know. Maybe I'm just swimming against the current here, but the ability to <a href="http://weblog.infoworld.com/udell/2003/10/09.html#a824">lift a well-formed XML object right off a web page</a> strikes me as remarkably useful.
</p>
</body>
</item> 

<item num="a824">
<title>Interactive microcontent</title>
<date>2003/10/09</date>
<body>

<p>
<blockquote cite="O'Reilly Network">
The friction that really wears us down is at the interface between people and data. I'm not too worried about how we represent XML fragments, but very curious about how we enable people to interact with them. These little experiments are hardly conclusive, but they do hint at the still-untapped potential of the scriptable document object model. [Full story at <a href="http://www.xml.com/pub/a/2003/10/08/udell.html">O'Reilly Network</a>]
</blockquote>
<script src="http://weblog.infoworld.com/udell/gems/quote.js"/>
<script src="http://weblog.infoworld.com/udell/gems/getFragment.js"/>
</p>
<p>
A glitch that affected the live examples in that article is being repaired. Meanwhile, I'm reproducing the examples here. The first is a DOM-aware citation bookmarklet (Mozilla-only, alas). This is the text of the bookmarklet:
</p>
<p>
<tt>
javascript:(function(){var element=document.createElement('script'); \
element.setAttribute('src','http://weblog.infoworld.com/udell/gems/quote.js');\
document.body.appendChild(element); })()
</tt>
</p>
<p>
And here's the actual <a href="javascript:(function(){var element=document.createElement('script'); element.setAttribute('src','http://weblog.infoworld.com/udell/gems/quote.js'); document.body.appendChild(element); })()">quote</a> bookmarklet, draggable to your linkbar if you're running Mozilla.
</p>
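<p>
For anyone who'd rather read the bookmarklet than squint at it, here is an unrolled sketch of the same logic. The function names <tt>injectScript</tt> and <tt>makeBookmarklet</tt> are mine, invented for illustration, and passing the document in as a parameter is an assumption I've made so the code can be exercised outside a browser.
</p>

```javascript
// Readable form of the bookmarklet. `doc` is passed in (an assumption made
// for this sketch) so the function can be tried outside a browser.
function injectScript(doc, src) {
  const element = doc.createElement('script');
  element.setAttribute('src', src);
  doc.body.appendChild(element); // attaching the element is what triggers the load
  return element;
}

// Collapse the same logic into a one-line javascript: URL for the linkbar.
function makeBookmarklet(src) {
  return "javascript:(function(){var element=document.createElement('script'); " +
    "element.setAttribute('src','" + src + "'); " +
    "document.body.appendChild(element); })()";
}
```

<p>
In a browser you'd call <tt>injectScript(document, 'http://weblog.infoworld.com/udell/gems/quote.js')</tt>. The point of the indirection is that the bookmarklet itself stays tiny, while the script it loads can be updated on the server.
</p>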
<p>
The next example uses similar technology to find a fragment of XML matching some pattern, lift it out of a web page, and do something with it. In this case, the fragment is simply echoed to a window. One link finds a fragment with a cal:date attribute, the other finds the enclosing fragment with a cal:range attribute. Click on one or the other of the events to set focus first. 
</p>
<table border="1" cellspacing="0" cellpadding="8">
<tr><td>
<p><a href="javascript:getFragment('cal:date')">grab calendar event</a>, <a href="javascript:getFragment('cal:range')">grab calendar events</a></p>
</td></tr>
<tr><td>
<table><tr><td>
Here's the schedule:
<table xmlns:cal="urn:cal" xmlns:team="urn:team" class="calCalendar" cal:range="20031011-20031012" cellspacing="0" cellpadding="4">
<tr class="calEvent" cal:date="20031011"><td>
<div class="calName"><span team:id="42">Red Sox</span> vs. <span team:id="27">Yankees</span> Saturday 4PM</div>
<div class="calLocation">Fenway Park</div>
</td></tr>
<tr class="calEvent" cal:date="20031012"><td>
<div class="calName"><span team:id="42">Red Sox</span> vs. <span team:id="27">Yankees</span> Sunday 7:30PM</div>
<div class="calLocation">Fenway Park</div>
</td></tr></table>
</td></tr></table>
</td></tr></table>
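<p>
The core of the lookup can be sketched in a few lines. This is not the code in getFragment.js; it's a minimal illustration that assumes the script starts from whatever node has focus and climbs until it finds an element carrying the requested attribute. The name <tt>findEnclosing</tt> is hypothetical.
</p>

```javascript
// Walk up from a starting node to the nearest enclosing element that carries
// the named attribute. `startNode` stands in for wherever focus was set.
function findEnclosing(startNode, attrName) {
  let node = startNode;
  while (node) {
    if (node.nodeType === 1) {            // 1 means an element node
      if (node.hasAttribute(attrName)) {
        return node;
      }
    }
    node = node.parentNode;
  }
  return null;                            // nothing up the chain matched
}
```

<p>
Starting inside the first event, <tt>findEnclosing(node, 'cal:date')</tt> would stop at the table row, while <tt>findEnclosing(node, 'cal:range')</tt> would keep climbing to the enclosing table. In Mozilla the result could then be turned back into markup with <tt>XMLSerializer</tt>.
</p>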
<p>
Just this morning, I read a well-reasoned argument on the <a href="http://www.mezzoblue.com/archives/2003/09/03/markup_bulle/">fragility of XHTML</a>. I can't really disagree; it's true that the effort/reward ratio is currently out of whack. But for what it's worth, I'm looking for ways to reduce the effort required to create and maintain the stuff, and increase the rewards for doing so. Note that in the case of these calendar fragments, it's not actually necessary for the whole page to be valid XHTML. Well-formed fragments can live inside documents that are not themselves guaranteed to be well-formed, and that may be a useful possibility.
</p>
<p>
<b>Update:</b> When I transplanted the calendar example from the ORA site to here, I noticed something odd, the significance of which didn't sink in until just now. The sample fragment in the published article was <i>not</i> well-formed. I wrote the article in an XHTMLish way, but didn't vet it for well-formedness -- as I do now for what I write here. As it turns out, the table containing the calendar fragment was ill-formed. I've corrected the version shown here, just to keep my XML database clean, but the fact is that I could have published the ill-formed content, exploited the leniency of the browser, and still been able to extract well-formed fragments from it. Hmmm. 
</p>
</body>
</item> 

<item num="a823">
<title>Video citation and blogging frustration</title>
<date>2003/10/08</date>
<body>

<p>
<a href="http://weblog.infoworld.com/udell/gems/bc01.ram"><img align="right" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/bconAmy.jpg"/></a>
A few months ago, I wrote about the <a href="http://weblog.infoworld.com/udell/2003/07/30.html">tragic inaccessibility</a> of audio and video content on the web. Today, while trying to summarize one of the key insights that came out of the BloggerCon conference I attended this past weekend, I was again reminded of this problem. The issue I want to highlight is the gap between what technically proficient users of the blogging medium (or indeed any kind of web authoring) can achieve, and what average users can achieve. Ironically the best way for me to make that point -- by citing portions of the webcast -- is yet another illustration of the problem. In several different sessions, people made compelling pleas for simplicity. It ought to be easy for me to send you to those places in the webcast. It isn't.
</p>
<p>
In <a href="http://weblog.infoworld.com/udell/gems/bc01.ram">this two-minute clip</a>, industry veteran <a href="http://www.wohl.com/">Amy</a> <a href="http://amywohl.weblogger.com/">Wohl</a> talks about her frustrations with blogging tools. It's a powerful statement. Writing for the web is something I've done for so long that I forget how much tacit knowledge I've accumulated over the years, and how it empowers me. But for Amy (and for most people) effective use of hypertext, images, and tabular data in a fast-paced communication environment is unreasonably hard. The analogous problem for me is communicating with video clips. Here the knowledge isn't tacit, and the communication doesn't flow easily.
</p>
<p>
Let's look at the steps required for me to bring you Amy's clip. On the <a href="http://blogs.law.harvard.edu/bloggerCon/webcast">BloggerCon webcast page</a> there is a link to the <a href="http://cyber.law.harvard.edu/ml/output.pl/35514/stream/temp.ram">day 2 technology session</a>. Let's look at that link:
</p>
<p>
http://cyber.law.harvard.edu/ml/output.pl/35514/stream/temp.ram
</p>
<p>
In order to refer to a location within that stream, I need to inspect and modify the contents of that file. Here's what it contains:
</p>
<p>
<div>rtsp://cyber.law.harvard.edu/ml/25b50bb11838407597b70be18172f8e3-\</div>
<div>1065623830.rm</div>
</p>
<p>
Of course, I can't just load the .ram file into a browser, because it's hardwired to launch a player to play the stream described in the file. Some alternate HTTP client is needed. Already we've fallen off the continental shelf in terms of any normal person's knowledge. Here's what I did:
</p>
<p>
curl http://cyber.law.harvard.edu/ml/output.pl/35514/stream/temp.ram
</p>
<p>
In other words I used curl, a command-line HTTP client, to fetch the contents of the .ram file. Next, I embedded the rtsp URL that it contained into a new .ram file, and added start/stop parameters like so:
</p>
<p>
<div>rtsp://cyber.law.harvard.edu/ml/25b50bb11838407597b70be18172f8e3-\</div>
<div>1065623830.rm?start=14:05&amp;end=16:10</div>
</p>
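<p>
The fetch-and-rewrite step is mechanical enough to script. Here's a rough sketch, assuming the .ram text has already been fetched (by curl or anything else) and contains a single rtsp:// line with no query string of its own. The function name <tt>clipRam</tt> is made up for illustration, and the ampersand is spelled as a character code only so the listing stays well-formed inside this page.
</p>

```javascript
// The query separator, spelled as a character code so this listing stays
// well-formed XML when it sits inside a page like this one.
const AMP = String.fromCharCode(38);

// Given the text of a fetched .ram file, return the text of a new .ram file
// that plays only the span between `start` and `end` (mm:ss strings).
// Assumes the stream URL does not already carry a query string.
function clipRam(ramText, start, end) {
  const lines = ramText.split(/\r?\n/).map(line => line.trim());
  const url = lines.find(line => line.startsWith('rtsp://'));
  if (!url) {
    throw new Error('no rtsp:// URL found in .ram text');
  }
  return url + '?start=' + start + AMP + 'end=' + end + '\n';
}
```

<p>
Calling <tt>clipRam(text, '14:05', '16:10')</tt> yields the one-line file shown above. It does nothing about the fiddling needed to find the right times, which is the part that really wants a tool.
</p>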
<p>
Of course it took some fiddling to get the start and end times I wanted. Finally, I uploaded the .ram file to the <a href="http://weblog.infoworld.com/udell/gems/bc01.ram">address</a> I cited at the beginning of this posting. Even this procedure, which I take for granted, is far outside the mainstream. To this day, uploading an arbitrary file to a public URL is something that most people can't do. We (as a species) know how to attach files to email. We don't know how to post files and communicate their addresses.
</p>
<p>
So anyway, the end result of this process is very powerful. I can point you to something compelling that Amy said, and you can watch her say it. Equally noteworthy is the fact that I have now made Amy's video clip <i>visible to Google</i>. Unfortunately the process that yields these powerful results is absurd.
</p>
<p>
Hundreds of such moments during the conference (and indeed in every conference that's ever been webcast) ought to have been similarly exposed to Google, and to the linking and citation mechanisms that drive Google. Nobody has time to watch complete webcasts. We have the technology to quote from them, and thereby index them. But it never happens, because we haven't packaged the technology for effective use. This sucks.
</p>
<p>
<a href="http://weblog.infoworld.com/udell/gems/bc02.ram"><img align="right" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/bconClip.jpg"/></a>
Clearly writing and production tools don't even scratch the surface of what's possible and, if we're to realize more than a fraction of the web's potential as a communication medium, what's necessary. For consumers, though, I'd have said that things were pretty well in hand. But the discussion in my own session on aggregators showed me that here too, major ease-of-use barriers remain. Here's <a href="http://weblog.infoworld.com/udell/gems/bc02.ram">the five-minute clip</a> that captures that segment of the discussion. It was the most animated five minutes of the whole hour-and-a-half session. I know that the impassioned plea made by the guy shown in this screenshot annoyed some of the more technical attendees. But let's think about what he's telling us. And let's also think about why these kinds of powerful moments, even when recorded for the web, don't participate fully in the web.
</p>

<p><b>Update:</b> AAARRGGHH!!! It gets worse! The .rm addresses encapsulated in those .ram addresses have changed since this afternoon. Dynamic, maybe? Phooey. </p>

<p><b>Later update:</b> Fixed. This from Hal Roberts:
<blockquote cite="Hal Roberts">
Sorry you're having trouble with the bloggercon video links.  We store all of
our video clips in a media archive (http://cyber.law.harvard.edu/ml) that
includes (among other things) permissions on some of the files.  The system
only gives out temporary rtsp links because it can't password protect them. 
In any case, I've mirrored the files to our static helix server.  You can get
to the file you need at: <br/><br/>
rtsp://cyber.law.harvard.edu/BloggerCon 2003/BloggerCon Day 2 - Technology.rm<br/><br/>
The other files are at analagous locations (eg. 'BloggcerCon Day 2 - Aggregators.rm').
</blockquote>
Excellent! The quotes are working again. Thanks much, Hal!
</p>


</body>
</item> 

<item num="a822">
<title>Reuse patterns and antipatterns</title>
<date>2003/10/07</date>
<body>

<p>
My second panel next week is <a href="http://www.fawcette.com/conferences/eas/sessions.aspx#14PatternsAnti">Patterns and Antipatterns: Reusability Hits Prime Time</a>. To be honest, I had to look up the term &quot;antipattern.&quot; Google's <a href="http://www.antipatterns.com/">first result</a> was: &quot;AntiPatterns are an exciting new form of expert software knowledge, a Dilbert-like extension to design patterns.&quot; Nice tagline! <sup>1</sup> Ward Cunningham's Wiki describes &quot;antipattern&quot; this way:
</p>
<blockquote cite="PortlandPatternRepository">
An AntiPattern is a pattern that tells how to go from a problem to a bad solution. (Contrast to an AmeliorationPattern, which is a pattern that tells how to go from a bad solution to a good solution.) <br/><br/>
A good AntiPattern also tells you why the bad solution looks attractive (e.g. it actually works in some narrow context), why it turns out to be bad, and what positive patterns are applicable in its stead.<br/><br/>
...<br/><br/>
Accordingly to JimCoplien: an anti-pattern is something that looks like a good idea, but which backfires badly when applied.<br/><br/>
It's not fun documenting the things that most people agree won't work, but it's necessary because many people may not recognize the AntiPattern. 
<br/><br/>
In the old days, we used to just call these 'bad ideas'. The new name is much more diplomatic. [<a href="http://c2.com/cgi/wiki?AntiPatterns">PortlandPatternRepository: Anti Patterns</a>]
</blockquote>
<p>
In InfoWorld's recent <a href="http://www.infoworld.com/article/03/09/26/38FErrcode_1.html">programming survey</a>, we asked some questions about software reuse. Developers were quite sharply divided in terms of their satisfaction with the level of reuse they achieve:
</p>
<p>
<img border="1" src="http://weblog.infoworld.com/udell/gems/psReuse01.gif"/>
</p>
<p>
They were also sharply divided in terms of the perceived obstacles to reuse:
</p>
<p>
<img border="1" src="http://weblog.infoworld.com/udell/gems/psReuse02.gif"/>
</p>
<p>
I wish I could follow up with more questions. How much of the source of satisfaction lies on the producer side of the relationship -- that is, in the design strategies, tools, and frameworks that make it possible to package software for reuse? And how much lies on the consumer side -- that is, in the determination and skill required to ferret out and apply that which is available to be reused? Likewise for dissatisfaction: how much is methods and tools, how much culture and behavior?
</p>
<p>
The response to our question about the relative benefits of different flavors of reusable software could also use some unpacking:
</p>
<p>
<img border="1" src="http://weblog.infoworld.com/udell/gems/psReuse03.gif"/>
</p>
<p>
For a long time, I've thought that what I call the components-and-glue pattern is the ultimate recipe for effective reuse. In this pattern, relatively few systems programmers (who create kernels, libraries, and components in &quot;hard&quot; languages) support relatively many application developers (who use these kernels, libraries, and components from &quot;soft&quot; languages). I've also assumed that pattern would iterate one level up: relatively few application developers packaging idioms (&quot;solutions&quot;) for relatively many power users to customize and apply. The progression shown here could be interpreted as suggesting that reuse thrives best on the forest floor, not up in the canopy. Is the democratization of reuse perhaps an antipattern? Good question for the panel. If you'd like to suggest other questions, <a href="http://radiocomments.userland.com/comments?u=100887&amp;p=822&amp;link=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2F2002%2F12%2F14.html%23a822">please do</a>.
</p>

<hr/>
<p>
<sup>1</sup> Because antipatterns.com used &lt;meta name=&quot;description&quot; content=&quot;...&quot;&gt;, Google coughs up that nice pithy phrase instead of whatever random stuff it would otherwise glean from the page. Heh. How did I never notice <i>that</i> pattern before?
</p>
</body>
</item> 

<item num="a821">
<title>XML vocabularies: freedom and control</title>
<date>2003/10/07</date>
<body>

<p>
I'll be at the Enterprise Architect Summit in Palm Springs next week, on a couple of panels. One's entitled <a href="http://www.fawcette.com/conferences/eas/sessions.aspx#13SchemasWild">Schemas in the wild: XML takes on the vertical industries</a>, and the panelists are Jon Bosak and Jean Paoli. The single most important question I'd like to ask these guys is: how do we strike the proper balance between freedom and control? By freedom I mean incremental and iterative evolution of data structures in response to patterns of real-world use. By control I mean the predictable regularity enforced by a DTD or XSD. 
</p>
<p>
I had a great conversation about this with <a href="http://www.mhxml.com/myinfo/dave.htm">Dave Hollander</a> the other day. Dave's the CTO of <a href="http://www.contivo.com/">Contivo</a> and co-chairs the XML Schema Working Group. Contivo's data integration product combines line-of-business-specific XML vocabularies with a mapper that (partly) automates translation to and from these vocabularies. My pushback was: OK, fine, but we don't <i>really</i> know what those vocabularies need to be until people start using software that speaks them. We agreed that extensible schemas are a good and necessary thing. But that's like arguing for mom and apple pie. More concretely, how does it work? Sure, we have <a href="http://www.w3schools.com/schema/schema_complex_any.asp">xsd:any</a>, just as we've always had email X-Headers and a thousand other escape hatches. What lessons from the past will help us create a sustainable and evolvable ecosystem based on XML vocabularies? And what new perspectives (if any) does XML bring to this old problem?
</p>
<p>
I'm sure I'll think of other things to ask my esteemed panelists, but this issue is top-of-mind for me. If there's something you'd like me to ask, <a href="http://radiocomments.userland.com/comments?u=100887&amp;p=821&amp;link=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2F2002%2F12%2F14.html%23a821">fire away</a>.
</p>


</body>
</item> 

<item num="a820">
<title>Office 2003 perspectives</title>
<date>2003/10/06</date>
<body>

<p>
InfoWorld's <a href="http://www.infoworld.com/reports/SRoffice.html">special report on Office 2003</a> appears this week. I was interested to compare our take with that of the New York Times. David Pogue wrote:
</p>
<blockquote cite="New York Times">
In Office 2003, Microsoft has made shockingly few changes to Word, Excel and PowerPoint. The message seems to be: &quot;You people didn't like it when we piled on features? O.K., fine. Let's see how you like it when we add none at all.&quot; [<a href="http://www.nytimes.com/2003/09/25/technology/circuits/25stat.html?ei=5007&amp;en=e7875031a32cde5d&amp;ex=1379908800&amp;partner=USERLAND&amp;pagewanted=print&amp;">New York Times</a>]
</blockquote>
<p>
In a brief paragraph, he summarized one of the major advances that I've devoted many articles to:
</p>
<blockquote cite="New York Times">
Another juicy corporate morsel: Office documents can now incorporate what's called XML code (extensible markup language). That statement may mean nothing to you, but it will give corporate geeks dilated pupils and sweaty palms. XML can tie together specially defined areas in ordinary Word and Excel documents with big, humming corporate databases. Fill in the blanks of the company expense report, and the company's SQL database can inhale and process it automatically.
</blockquote>
<p>
Given how Microsoft has soft-pedaled the XML infrastructure now woven into Office, I can see why Pogue relegated it to a footnote. And the truth is that it will probably take at least one more turn of the crank before XML in Office, or indeed in any mainstream application, makes this kind of scenario -- which I suggest in the lead story -- a routine thing:
<blockquote cite="InfoWorld">
An entire document can always be saved as XML, but you can bind just a subset of a document to a schema and manage it accordingly. It's always been possible to attach metadata to an Office document using global properties. With this approach, the metadata can appear anywhere in the document. A paragraph or section, for example, might be assigned to a category and thus exposed to a category-aware search engine. [<a href="http://www.infoworld.com/article/03/10/03/39FEoffice_1.html">InfoWorld.com</a>]
</blockquote>
</p>
<p>
In a separate story, I looked at the collaborative modes now available when you use Office apps together with SharePoint and Live Communication Server. First, Outlook users get what Apple's Mail.app users have had for some time now: presence indicators in the email client:
</p>
<p>
<a href="http://weblog.infoworld.com/udell/gems/lcsOutlook.gif"><img border="1" width="299" height="152" alt="presence in outlook" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/lcsOutlook.gif"/></a>
</p>
<p>
Likewise in SharePoint:
</p>
<p>
<a href="http://weblog.infoworld.com/udell/gems/lcsSharePointLibrary.gif"><img border="1" width="335" height="102" alt="presence in sharepoint" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/lcsSharePointLibrary.gif"/></a>
</p>
<p>
And perhaps most interestingly, in the SharePoint pane of Excel (or Word):
</p>
<p>
<a href="http://weblog.infoworld.com/udell/gems/lcsExcelAndSharepoint.gif"><img border="1" width="322" height="266" alt="presence in sharepoint" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/lcsExcelAndSharepoint.gif"/></a>
</p>

<p>
Pretty hot stuff! I concluded:
<blockquote cite="InfoWorld">
E-mail, the intranet, and IM have been on a collision course for some time now. I am delighted to see Microsoft not only embracing all three modes but also looking for ways to weave them together. Yet I can't avoid a sense of deja vu. In the 1990s, Netscape tried something similar, offering a suite of collaboration servers and a matching suite of clients. There were compelling benefits, but also a lot of moving parts. I feel the same way about Office, Exchange, SharePoint, and Live Communication Server. Users will find no single unifying theme akin to the Groove shared space. Administrators will have to install and manage three or four sets of clients and servers. The new capabilities are exciting, but it'll take lots more integration to make Office-based collaboration a seamless and manageable experience. [<a href="http://www.infoworld.com/article/03/10/03/39FEmscollab_1.html">InfoWorld.com</a>]
</blockquote>
</p>

<p>
That conclusion was reinforced for me this weekend at BloggerCon, during Joi Ito's wonderful session about <a href="http://joi.ito.com/joiwiki/BloggerConCommunity">online community</a>. Even highly advanced bloggers are often unfamiliar with alternate modes such as IRC and Wiki. Helping users to integrate these different communication modes and cognitive styles is going to be a real challenge, but it's great to see movement in that direction.
</p>

</body>
</item> 

<item num="a819">
<title>If it's Tuesday, it must be 10AM</title>
<date>2003/10/06</date>
<body>
<p>
A long-ago friend who was the alpha math/science geek in our junior high school used to set his watch by the stars. If programmers had their way, we'd all use astronomically-pure sidereal time. Or at least we'd abandon the absurd notion of time zones. Daylight Saving Time? Don't even go there. I have seen world-renowned software architects go ballistic when that hated subject comes up. Look at the ill-disguised contempt in the IETF's <a href="http://www.ietf.org/rfc/rfc3339.txt">RFC 3339</a>, Date and Time on the Internet:
<blockquote cite="RFC3339">
All times expressed have a stated relationship (offset) to Coordinated Universal Time (UTC).  (This is distinct from some usage in scheduling applications where a local time and location may be known, but the actual relationship to UTC may be dependent on the unknown or unknowable actions of politicians or administrators. The UTC time corresponding to 17:00 on 23rd March 2005 in New York may depend on administrative decisions about daylight savings time. This specification steers well clear of such considerations.)
</blockquote>
</p>
<p>
This week, Ray Ozzie touched off a cross-blog discussion by asking why we don't yet have a standard way to exchange &quot;virtual objects so basic as calendars.&quot; His comment inspired me to dust off a back-burner project to export my Outlook calendar in iCalendar format (<a href="http://www.ietf.org/rfc/rfc2445.txt">RFC 2445</a>), so that users of Apple's iCal, or Mozilla Calendar, or other iCalendar-aware programs could subscribe to it. Along the way I rediscovered why calendaring, though indeed basic, is far from simple. 
</p>
<p>
I'll spare you the details of my excursion into MAPI, Microsoft's mail API, using Python's Win32 and COM extensions plus code I cribbed from the SpamBayes plug-in for Outlook. Suffice it to say that after some fiddling, I got Outlook to disgorge my events as iCalendar VEVENT records. Then the fun began.
</p>
<p>
My calendar for October includes several trips to other timezones. (For good measure, some occur before the end of Daylight Saving Time, others after.) Those of you who travel more than I do have already guessed what dilemma now arose. Outlook's COM interface handed my Python script a UTC time object, and another field with my timezone, which it reported as GMT-05:00 Eastern Time (US &amp; Canada). How should I represent that time in my published calendar? Here's one choice for a 10AM appointment on October 15:
</p>
<p>
DTSTART;TZID=US/Eastern:20031015T100000
</p>
<p>
But wait! That appointment is in Denver. So while it looks right to me in Outlook now, it's wrong for people in Denver. Alternatively I can set Outlook's timezone to GMT-07:00, record the event, and then switch back. Now the appointment will be right for people in Denver, but wrong for me. 
</p>
<p>
The problem, as everybody who runs into this dilemma soon realizes, is that calendar programs typically don't allow you to distinguish between the location of the event you're scheduling and the location of the computer you are using to record the event. 
</p>
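<p>
To make the distinction concrete, here is a small sketch of the two ways an event's start can be written in iCalendar: as wall-clock time tagged with the event's own zone, or as an absolute UTC instant. This is not what my Python script does. The function names are hypothetical, the <tt>US/Mountain</tt> identifier is assumed by analogy with <tt>US/Eastern</tt>, and the caller has to supply the UTC offset, which is exactly the piece of knowledge that calendar programs tend not to ask for.
</p>

```javascript
// Zero-pad a number to the given width.
function pad(n, width) {
  return String(n).padStart(width, '0');
}

// Wall-clock time in a named zone: DTSTART;TZID=...:YYYYMMDDTHHMMSS
function dtstartLocal(tzid, y, mo, d, h, mi) {
  return 'DTSTART;TZID=' + tzid + ':' +
    pad(y, 4) + pad(mo, 2) + pad(d, 2) + 'T' + pad(h, 2) + pad(mi, 2) + '00';
}

// The same instant written in UTC. `offsetHours` is the zone's offset from
// UTC on that date (for example -6 for Denver during Daylight Saving Time).
// The caller supplies it, because working it out is the hard part.
function dtstartUtc(y, mo, d, h, mi, offsetHours) {
  // Date.UTC normalizes out-of-range hours, so crossing midnight is handled.
  const t = new Date(Date.UTC(y, mo - 1, d, h - offsetHours, mi));
  return 'DTSTART:' +
    pad(t.getUTCFullYear(), 4) + pad(t.getUTCMonth() + 1, 2) + pad(t.getUTCDate(), 2) +
    'T' + pad(t.getUTCHours(), 2) + pad(t.getUTCMinutes(), 2) + '00Z';
}
```

<p>
For the Denver appointment, <tt>dtstartLocal('US/Mountain', 2003, 10, 15, 10, 0)</tt> keeps the 10AM that people in Denver expect, while <tt>dtstartUtc(2003, 10, 15, 10, 0, -6)</tt> pins down the instant for everybody else. A calendar program that stored both could display either.
</p>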
<p>
When a computer in one timezone schedules an event in another timezone, the computer doing the scheduling needs to be able to accept and display both. Since the feature usually isn't needed, it should ideally be hidden but easily accessible. That's admittedly a thorny user interface problem. I'm sure programmers could solve it -- if they weren't so indignant about humanity's perversion of astronomical time. And now, if you'll excuse me, I've got to go. It's 0100 UTC and the sun's coming up.
</p>
</body>
</item> 

<item num="a818">
<title>Beyond linking: the challenge and opportunity of citation</title>
<date>2003/10/02</date>
<body>
<p>
Louis Menand's <i>The End Matter</i>, in this week's New Yorker, is one of the funniest articles I've read in years. The lead is priceless:
<blockquote cite="The New Yorker">
<p class="descender">It is 2:30 <span class="smallcaps">a.m</span>. of a Monday, spring semester, 1983. Things are looking extremely good. Forty-eight hours of high-intensity stack work and some inspired typing have produced the thirty-page final paper for Modern European History (Mr. Blague, MW 9-10) that you were supposed to be working on all semester but that an unfortunate dispute involving a car, which, as you have repeatedly pointed out, really wasn't in such good shape when you borrowed it, has prevented you from giving the time and attention you sincerely intended.</p> [<a href="http://www.newyorker.com/critics/books/?031006crbo_books1">The New Yorker: The Critics: Books: End Matter</a>]
</blockquote>
The Chicago Manual of Style and Microsoft Word receive equal attention in this ferocious satire. In this excerpt, they conspire to make citation of web content particularly vexing:
<blockquote cite="The New Yorker">
On the aggravating business of citing a Web page, Chicago recommends giving the entire URL, usually in addition to any print data (journal volume number, year, page range, and so on), plus a &quot;descriptive locator&quot; (where to find the quotation on the screen, since electronic editions sometimes do not paginate), plus the date accessed. This can make for a very long note. Here is one of the samples the &quot;Manual&quot; offers, as it would appear if you reproduced it in Word:
<blockquote>
<span class="item">Hlatky, M. A., D. Boothroyd, E. Vittinghoff, P. Sharp, and M. A. Whooley. 2002. Quality-of-life and depressive symptoms in postmenopausal women after receiving hormone therapy: Results from the Heart and Estrogen/Progestin Replacement Study (HERS) trial. <span class="italic">Journal of the American Medical Association </span>287, no. 5 (February 6), <a class="external" href="http://jama.ama-assn.org/issues/v287n5/rfull/joc10108.html#aainfo"> http://jama.ama-assn.org/issues/v287n5/rfull/joc10108.html#aainfo</a> (accessed January 7, 2002).</span>
</blockquote>
Try to prevent Word from doing that blue thing to whatever it recognizes as a hyperlink. There is undoubtedly a way to reset this, but it is deep within the bowels of the machine, guarded by dozens of angry pop-ups.
</blockquote>
</p>
<p>
It's a wonderful piece worth reading in full. And of course there's a serious point behind all the satire. The web came from scholars and is all about sharing knowledge. Citation is the conversational medium in which we do that. Links are powerful tools that we're still learning to use, but citation is about more than just linking. I'm becoming deeply interested in how we can publish fragments that are easy to cite and that, when cited, carry rich context with them. Phil Windley's <a href="http://www.windley.com/2003/09/20.html#a831">quote bookmarklet</a> is an example of what can be done. If you are running Mozilla and want to see a markup-preserving variation on that theme, select some text on this page and then <a href="javascript:(function(){var element=document.createElement('script'); element.setAttribute('src','http://weblog.infoworld.com/udell/gems/quote.js'); document.body.appendChild(element); })()">click here</a>. For best effect, sweep out a selection that crosses an element boundary, for example everything from &quot;Windley's&quot; through to &quot;an example&quot; in the sentence before the previous one. You should get this complete paragraph, a la Mozilla's right-click View Selection Source feature, plus some metadata. 
</p>

</body>
</item> 

<item num="a817">
<title>Happy birthday, Nigeria</title>
<date>2003/10/01</date>
<body>
<p>
<a href="http://www.derechos.org/human-rights/afr/nigeria/"><img align="right" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/nigeria.gif"/></a>
Every now and then my aggregator brings me an especially surreal juxtaposition. First this:
</p>
<blockquote cite="MailBucket">
<pre>
     from: Mr Muhamed Hassan
       to: clever@mailbucket.org
 reply-to: 
  subject: Good day
 
First=2C I must solicit your strictest
confidence in this transaction... 
[<a href="http://www.mailbucket.org/clever-1055.html">MailBucket.org</a>]
</pre>
</blockquote>
<p>
So after one day, it's time to unsub the <a href="http://weblog.infoworld.com/udell/2003/09/30.html#a815">clever feed</a>. That was no surprise. But then, eerily, here was the next item in my queue, from Kingsley Idehen:
</p>
<blockquote cite="Kingsley Idehen">
I am a Nigerian reminiscing as my country turns 43 today. October the 1st is an emotional day for many Nigerians, especially those of us in the Diaspora. Our country remains a paradox as the excerpts below attest. [<a href="http://www.openlinksw.com/blog/~kidehen/?id=380">Kingsley Idehen's Blog</a>]
</blockquote>
<p>
Kingsley cites Ars Technica's achingly funny spoof:
</p>
<blockquote cite="Ars Technica">
DEAR SIR/MADAM:<br/><br/> I AM MR. DARL MCBRIDE CURRENTLY SERVING AS THE PRESIDENT AND CHIEF EXECUTIVE OFFICER OF THE SCO GROUP, FORMERLY KNOWN AS CALDERA SYSTEMS INTERNATIONAL, IN LINDON, UTAH, UNITED STATES OF AMERICA. I KNOW THIS LETTER MIGHT SURPRISE YOUR BECAUSE WE HAVE HAD NO PREVIOUS COMMUNICATIONS OR BUSINESS DEALINGS BEFORE NOW.<br/> <br/> MY ASSOCIATES HAVE RECENTLY MADE CLAIM TO COMPUTER SOFTWARES WORTH AN ESTIMATED $1 BILLION U.S. DOLLARS. I AM WRITING TO YOU IN CONFIDENCE BECAUSE WE URGENTLY REQUIRE YOUR ASSISTANCE TO OBTAIN THESE FUNDS.<br/> [<a href="http://arstechnica.com/wankerdesk/03q2/nigerian-sco.html">Ars Technica: The Nigerian SCO Connection</a>]
</blockquote>
<p>
On the flip side, Kingsley reminds us that he's not the only notable Nigerian technical innovator. Being an XML geek, I am of course well aware of <a href="http://www.kuro5hin.org/user/Carnage4Life/diary">Dare Obasanjo</a> and <a href="http://uche.ogbuji.net/uche.ogbuji.net/caramusis/">Uche Ogbuji</a>. I'm embarrassed to say, though, that <a href="http://www.emeagwali.com/index.shtml">Philip Emeagwali's</a> backstory was unknown to me. 
</p>
<p>
Strange juxtapositions. But...happy birthday, Nigeria!
</p>
</body>
</item> 

<item num="a816">
<title>Adobe's trial balloon</title>
<date>2003/09/30</date>
<body>

<p>
Earlier this month I <a href="http://weblog.infoworld.com/udell/2003/09/04.html#a792">brokered</a> a dialogue between Kirk Holbrook, an Acrobat developer who'd read my <a href="http://weblog.infoworld.com/udell/2003/08/21.html#a778">Acrobat and InfoPath</a> column, and Adobe executive Chuck Myers, whom I'd interviewed for the column. Apparently Chuck's response to Kirk made some news in the PDF community:
</p>
<blockquote cite="Planet PDF">
As we reported from last fall's PDF Conference in Las Vegas, there was a mix of <a href="http://www.planetpdf.com/mainpage.asp?webpageid=2486">confusion and consternation</a>
regarding the new Reader Extensions technology and its implications,
representing a change in direction for Adobe -- pushing the cost from
the end user to the forms creator/publisher. And the referenced cost
was significant, putting the solution well out of the budget for all
but the largest companies, organizations and agencies -- and denying
the new forms-enabling capabilities to the vast number of small- to
mid-size Acrobat and PDF users.<br/>
<br/>
In subsequent discussions with Adobe, there were hints that the initially
announced pricing structure was something of a trial balloon -- and
based on some of the reaction -- likely to be revised. But no public
announcement by Adobe of any such revision followed. <br/>
<br/>
Surprisingly, details of a change in the pricing structure emerged recently in a public response by Adobe's Chuck Myers to a <a href="http://weblog.infoworld.com/udell/2003/09/04.html#a792">Weblog</a> item posted by InfoWorld's Jon Udell. Previously Udell had posted an item titled &quot;<a href="http://weblog.infoworld.com/udell/2003/08/21.html#a778">Acrobat and InfoPath</a>,&quot;
which triggered a response and some questions from one of his readers.
Adobe's Myers responded, addressing the concerns and noting that Adobe
had implemented a revised pricing model for its Document Server for
Reader Extensions. [<a href="http://www.planetpdf.com/mainpage.asp?webpageid=3080&amp;nl">Planet PDF</a>]
</blockquote>
<p>
The author of the Planet PDF article, Kurt Foss, goes on to say that he has &quot;confirmed this week with Adobe PR staff what the official purchasing options are for ADSFRE,&quot; and he concludes:
</p>
<blockquote cite="Planet PDF">
This revision won't exactly qualify as a fire sale, and seems doubtful to
bring the technology much closer to reality for small- to mid-size
customers. The minimum spend is still in the $60K-plus range. Time will
tell if this second trial balloon will float or burst.
</blockquote>
<p>
I agree with that analysis. The ability to gather structured data -- from anyone, anywhere -- is one of the fundamental enablers of the business web. In different and complementary ways, Acrobat and InfoPath are superior instruments for the task, but if they can't get within shouting distance of ubiquity then it won't matter. Because the browser, contrary to popular opinion, is not standing still. Well, at least the new standard-bearer, Mozilla, isn't. It has a lot of the raw ingredients and serious momentum. Funding's an issue now, but if Mitch Kapor can score <a href="http://blogs.osafoundation.org/mitch/000414.html#000414">$2.75 million in grants for OSAF</a>, then maybe Mozilla.org can pull a rabbit out of its hat too.
</p>

</body>
</item> 

<item num="a815">
<title>MailBucket: an email-to-RSS gateway</title>
<date>2003/09/30</date>
<body>

<p>
Back in March, I <a href="http://weblog.infoworld.com/udell/2003/03/17.html#a640">mentioned</a> that <a href="http://www.throwingbeans.org/">Tom Dyson</a> is working on XPath bindings for PostgreSQL. Today he wrote to announce something completely different: an email-to-RSS gateway called <a href="http://www.mailbucket.org/">MailBucket</a>. It couldn't be simpler to use. Moments ago I sent an email to <a href="mailto:clever@mailbucket.org">clever@mailbucket.org</a>. Almost immediately, I was able to subscribe to <a href="http://www.mailbucket.org/clever.xml">http://www.mailbucket.org/clever.xml</a>, which is also rendered <a href="http://www.mailbucket.org/clever-395.html">here</a>. 
</p>
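<p>
As a rough sketch of the naming convention at work here, the mapping from mailbox to feed fits in a few lines of JavaScript. The function name <tt>feedUrlFor</tt> is hypothetical, and the URL pattern is inferred only from the single example above, not from any MailBucket documentation:
</p>
<pre class="code javascript">
// Hypothetical helper: derive the feed URL from a MailBucket address.
// Assumes the pattern seen above: name@mailbucket.org maps to /name.xml
function feedUrlFor(address) {
  var parts = address.split('@');
  if (parts.length !== 2 || parts[1].toLowerCase() !== 'mailbucket.org') {
    throw new Error('not a MailBucket address: ' + address);
  }
  return 'http://www.mailbucket.org/' + encodeURIComponent(parts[0]) + '.xml';
}

feedUrlFor('clever@mailbucket.org');
// 'http://www.mailbucket.org/clever.xml'
</pre>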
<p>
I'm reminded of Steve Yost's <a href="http://www.quicktopic.com/">QuickTopic</a>, another clever hack that crosses communication-mode boundaries. I feel certain that MailBucket will be useful, though I can't say exactly in what ways. Whether and how names expire or get recycled is one interesting question that will come up. For example, I've just claimed the word 'clever' on a whim -- but when that thread dies, should 'clever' be tied perpetually to my whim? 
</p>
<p>
Then, of course, there's the question of spam. A long-lived MailBucket namespace will undoubtedly get spammed. But perhaps an instance of the service supporting collaboration within a known group could make effective use of whitelisting. 
</p>
<p>
In any case, it's a worthy experiment and a clever idea. Nice going, Tom!
</p>

</body>
</item> 

<item num="a814">
<title>The avocado-green fridge</title>
<date>2003/09/30</date>
<body>

<p>
<a href="http://www.therealmartha.com/greenfridge/"><img border="1" align="right" alt="green fridge" width="190" height="202" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/greenFridge.jpg"/></a>
Tom Yager wrote an unusually poetic column this week. I am particularly savoring the first two sentences of his lead:
<blockquote cite="Tom Yager">
The PC is the black and white TV in the wood cabinet. It's the round, tan thermostat dial, the avocado-green fridge, the Steve Miller Band.
 [<a href="http://www.infoworld.com/article/03/09/26/38OPcurve_1.html">InfoWorld: Die, die, accursed PC</a>]
</blockquote>
</p>
<p>
And here's my favorite line:
<blockquote cite="Tom Yager">
A 3GHz Pentium 4 desktop is an IBM PC/AT wearing a mail-order gown and too much rouge.
</blockquote>
Great stuff! Of course there's something to be said for the avocado-green fridge, as Russell Beattie -- lusting over Tim Bray's new PowerBook -- observes:
<blockquote cite="Russell Beattie">
More and more <a href="http://www.tbray.org/ongoing/When/200x/2003/09/26/NewMac">webloggers are receiving theirs</a> in the mail (the bastards, all of them) and I'm seeing <a href="http://www.tbray.org/ongoing/When/200x/2003/09/26/-big/macPair.jpg">pics</a> and getting more and more insane with jealousy. I really *want* that box. But man, it's expensive. These guys must be made of money. $2500+ for a laptop? That's a serious premium over the $1200 I paid for this Toshiba. [<a href="http://www.russellbeattie.com/notebook/1004446.html">Russell Beattie's Notebook</a>]
</blockquote>
</p>
<p>
As for me, I'm happy to have a foot in both camps. The original 15&quot; TiBook is, by a long shot, the most useful laptop I've ever carried around. But I collect avocado-green fridges too. They're real handy gadgets.
</p>

</body>
</item> 

<item num="a813">
<title>XPath everywhere</title>
<date>2003/09/29</date>
<body>

<p>
XPath-aware blog engines are sprouting like weeds. Over at Sam Ruby's place, you can for example <a href="http://www.intertwingly.net/blog/?q=//xhtml:cite[contains(.,'Udell')]">find entries that cite me</a>. Kimbro Staken's <a href="http://www.syncato.org/">Syncato</a> should soon find its way onto more machines now that Rick Bradley has <a href="http://www.rickbradley.com/code/syncatomatic/">automated</a> its <a href="http://www.intertwingly.net/blog/1598.html">prerequisite installation hairball</a>.
</p>
<p>
Meanwhile, OpenLink's Kingsley Idehen points out that his product, Virtuoso (which I've written about [<a href="http://www.infoworld.com/article/02/04/12/020415plvirtuoso_1.html">1</a>, <a href="http://weblog.infoworld.com/udell/categories/infoworld/2003/03/24.html#a647">2</a>]) can play this game too. Riffing on Kimbro's transclusion feature, which pulls the result of an XPath query into a blog posting, Kingsley shows off a <a href="http://www.openlinksw.com/blog/~kidehen/index.vspx?id=377">dynamic XQuery-based transclusion</a> that reformulates a chunk of XML -- specifically, the <a href="http://cyber.law.harvard.edu/blogs/gems/bloggerCon/opml/day1.opml">BloggerCon RSS feeds</a>. The <a href="http://www.openlinksw.com/blog/~kidehen/queries/bloggercon.xml">XQuery code</a> that does this is:
<pre class="code" lang="xquery">
&lt;sql:xquery sql:context=
 &quot;http://cyber.law.harvard.edu/blogs/gems/bloggerCon/opml/day1.opml&quot;&gt;
&lt;attendees&gt; 
{
for $o in document(&quot;day1.opml&quot;)//outline
  return 
      &lt;table border=&quot;1&quot;&gt;  
      &lt;tr&gt;&lt;td&gt;
               {string($o/@text)}&lt;/td&gt;
      &lt;td&gt;&lt;a href={string($o/@url)}&gt;{string($o/@url)}&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
}
&lt;/attendees&gt;
&lt;/sql:xquery&gt;
</pre>
</p>
<p>
Instructive! Note that if you use Mozilla's nifty View Selection Source on the above, you'll see this:
<pre class="code" lang="xhtml">
&lt;pre class=&quot;code&quot; lang=&quot;xquery&quot;&gt;
...
&lt;/pre&gt;
</pre>
</p>
<p>
<a href="http://weblog.infoworld.com/udell/gems/xquerySnippets.gif"><img width="288" height="205" align="right" alt="xquery snippets" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/xquerySnippets.gif"/></a>
As a result, my own <a href="http://weblog.infoworld.com/udell/gems/blogsearch.html">XPath search mechanism</a> can find that fragment. You can too, using IE or Mozilla, but I can't offer links to queries because the mechanism works by downloading a big swonk of XML into memory. 
</p>
<p>
We're obviously going to want linkable queries. I see two parallel paths forward. Along one path, dynamic blog servers will start to matter more than they have. (And we're going to see serious attention paid to XPath/XSLT performance.) Along another, services such as Technorati and Feedster will start to process XHTML content (when available) as well as RSS metadata.
</p>
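<p>
To make the idea of a linkable query concrete, here is a minimal sketch that packs an XPath expression into a URL, following the <tt>?q=</tt> convention visible in the link to Sam Ruby's search above. The helper names are hypothetical, and whether a given server expects the expression percent-encoded is an assumption:
</p>
<pre class="code javascript">
// Hypothetical helpers: round-trip an XPath expression through a URL.
// Assumes the server reads the expression from a "q" parameter.
function xpathQueryUrl(base, xpath) {
  return base + '?q=' + encodeURIComponent(xpath);
}

function xpathFromUrl(url) {
  // \x26 is an ampersand; stop at the next parameter or fragment
  var match = /[?]q=([^#\x26]*)/.exec(url);
  return match ? decodeURIComponent(match[1]) : null;
}

var url = xpathQueryUrl('http://www.intertwingly.net/blog/',
                        "//xhtml:cite[contains(.,'Udell')]");
xpathFromUrl(url);  // yields the original expression
</pre>
<p>
Encoding the expression keeps brackets, commas, and slashes from being mangled when the link is pasted into email or another blog posting.
</p>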
<p>
One way or another, XML fragments are going to be in play. And then things are going to get really interesting.
</p>

</body>
</item> 

<item num="a810">
<title>Ace travel agents available</title>
<date>2003/09/29</date>
<body>

<p>
In my <a href="http://www.infoworld.com/article/03/08/29/34OPstrategic_1.html">column</a> a few weeks ago, I wrote:
<blockquote cite="InfoWorld">
Why, for example, does Expedia ask you to specify the number of children traveling with you, and even their ages, and then proceed to show you hotel rooms that can't accommodate the kids?<br/><br/>
When I'm traveling on business, I get to bypass this nonsense. I just call up IDG Travel and talk to Bruce or Michael. These guys can hack through the plane/hotel/rental-car system like hot knives through butter. It's true they have access to privileged information, but I'm sure they could also make better use of public information than you or I. Their ability to recognize and exploit patterns is what makes them so effective. 
</blockquote>
Sadly, as of today, as a result of a restructuring at IDG Travel, I can't just call up Bruce Powell or Michael McCarthy. But my loss could be your gain. If you know of a Bay Area opportunity for an ace travel agent, I'll be happy to pass it along. 
</p>
<p>
And thanks again, guys. You'll be sorely missed.
</p>

</body>
</item> 

<item num="a809">
<title>Permissions on the edge</title>
<date>2003/09/28</date>
<body>

<p>
<blockquote cite="InfoWorld">
CoreStreet has just signed a deal with Swedish locksmith Assa Abloy that will enable doors to enforce highly granular card access policies without wired (or wireless) connections. When an employee swipes a card at the main entrance, it's refreshed with a daily set of proofs. The door need only check that the proof binds a resource (itself) to an identity (the employee) at a certain time (today). <br/><br/>
CoreStreet's president, Phil Libin, sketches another interesting scenario. Suppose an employee needs a proof to access her own laptop but can't contact the network. Since proofs are minimally just 20 bytes, it's feasible to convey one in a phone call.<br/><br/>
We'll always have to manage permissions centrally. But CoreStreet's method of distributing them to the edge of the network -- and beyond -- strikes me as an excellent way to tackle a thorny logistical problem. [Full story at <a href="http://www.infoworld.com/article/03/09/26/38OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
</p>

</body>
</item> 

<item num="a808">
<title>Mechanical memory</title>
<date>2003/09/27</date>
<body>

<p>
<img align="right" alt="carlton fisk" width="152" height="262" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/fisk.jpeg"/>
Example one. I want to call a friend, so I look for the cellphone. Could use the cordless, but it doesn't know my friend's number, and the cellphone does. Except damn, can't find the cellphone. Better call it. So I reach for the cordless. Except double damn, can't find it either. Better page it. So I go to the cordless's base, page the cordless, locate it, call the cellphone, locate it, and call my friend.
</p>
<p>
Example two. I want to cite the movie <a href="http://us.imdb.com/title/tt0209144/">Memento</a> in this blog entry, because example one reminds me of Guy Pearce in that movie, without any short-term memory, tattooing reminders on his skin and annotating Polaroid snapshots. Except damn, I can't remember the movie's name, or Guy Pearce's last name, but I do remember he was in L.A. Confidential, so I ask imdb.com for his <a href="http://us.imdb.com/name/nm0001602/">filmography</a> and that leads me to Memento.
</p>
<p>
Example three. I'm in Cambridge, MA one evening, and I wander into the <a href="http://info.wordsworth.com/www/subst/info/help/where/ww27813542710219">WordsWorth</a> bookstore in Harvard Square. The New Yorker's octogenarian baseball writer, <a href="http://www.amazon.com/exec/obidos/Author%3DRoger%20Angell/">Roger Angell</a>, is there for a lecture and book-signing. I attend the lecture. Someone asks: &quot;What's the biggest change you've seen in your lifetime?&quot; Angell answers: &quot;Without a doubt, television.&quot; He says instant replay has undermined our ability to perceive and to remember. And he relates a remarkable conversation with Carlton Fisk, whose epic home run in game 6 of the 1975 World Series is probably one of the most replayed sports moments ever. I'm not much of a baseball fan (though it's odd how the subject keeps coming up lately), but even I've seen that clip dozens of times: Fisk dancing up and down, waving the ball fair. So anyway, Roger Angell says he asked Fisk, in an interview, how many times Fisk has seen that clip. And Fisk says he's hardly ever seen it, maybe just a few times. When it comes on TV, he leaves the room to avoid watching it, because he doesn't want to overwrite the original memory in his head with a different version.
</p>
<p>
We all like to joke, nowadays, about how Google has become humanity's collective memory, and we're properly grateful not to have to remember a lot of things that we know we can just look up. We've gone through this before, of course. Pre-Gutenberg, we routinely memorized vast amounts of verse. Then we learned to offload chunks of memory to print. Now we're learning to offload a whole lot more memory to the Net. I'm not saying I'd have it otherwise, but sometimes I wonder about the tradeoffs we're making.
</p>	
</body>
</item> 

<item num="a806">
<title>Monoculture on the Potomac</title>
<date>2003/09/26</date>
<body>

<body>
<p>
My next column, which appears online tonight and in print next week, quotes from a <a href="http://www.simc-inc.org/archive0002/February02/Speakers/geer-keynote.htm">speech</a> given last year by now-former @Stake CTO Dan Geer. (I also <a href="http://weblog.infoworld.com/udell/2002/08/28.html#a389">referred</a> to that speech last August in this weblog.) Today my RSS feed is full of news about Geer, who was principal author of a paper that was presented on Wednesday at the 30th annual Washington Caucus sponsored by the <a href="http://www.ccianet.org">Computer and Communications Industry Association (CCIA)</a>. Most reports suggest Geer was fired for his role in the report, though some suggest he resigned.
</p>
<p>
I hadn't known about the CCIA or its annual event, but the <a href="http://www.ccianet.org/meetings/03caucus_agenda.pdf">caucus agenda</a> gave me some sense of the event:
<blockquote cite="CCIA">
3:30 p.m. Discussion: Software Security -- Facing the Problem<br/>
Following the unveiling of CCIA's paper on Software Security, prepared by
industry's leading experts on encryption and cybersecurity, a panel will discuss
the report which will include representatives from Congress, the media, and
authors of the report, including Dan Geer.<br/><br/>
6:00 p.m. Yacht Cruise, Reception and Dinner<br/>
Mingle with distinguished guests at one of Washington's premier venues. The
USS Sequoia served as the official Presidential yacht from President Herbert
Hoover until President Jimmy Carter. Enjoy the beautiful scenery of Washington
and Old Town Alexandria as we cruise the Potomac River.<br/>
Location: The USS Sequoia Presidential Yacht, 600 Water Street, S.W.
</blockquote>
An odd juxtaposition, but hey, why not? Microsoft was, after all, forced to become a Washington influence-peddler. As Michael Kinsley, in a column for Slate, said this week:
<blockquote cite="Michael Kinsley">
Refusing to wallow like a reptile in the influence-trading swamp is almost a violation of a big company's fiduciary duty to its shareholders. [<a href="http://slate.msn.com/id/2088408/">Slate</a>]
</blockquote>
So we should expect no less of Microsoft's critics. On Thursday, News.com's Robert Lemos pointed out:
</p>
<p>
<blockquote cite="News.com">
The paper is the latest salvo fired by the <a href="http://www.ccianet.org/index.php3">CCIA</a>
at Microsoft. And although the argument has been made in security
circles before, this may be the first time that the position has been
outlined to legislators. [<a href="http://news.com.com/2100-1029-5081214.html?tag=nl">News.com</a>]
</blockquote>
</p>
<p>
That position is, in a nutshell, that the Microsoft software monoculture is a national (indeed global) security risk. I agree. Yet I found both the paper itself, and the reporting that surrounded it, strangely unsatisfactory. 
</p>
<p>
Entitled <i>CyberInsecurity: The Cost of Monopoly</i> and subtitled <i>How the Dominance of Microsoft's Products Poses a Risk to Security</i>, the paper was written by a security dream team: <br/><br/>
Daniel Geer, Sc.D - Chief Technical Officer, @Stake<br/>
Charles P. Pfleeger, Ph.D - Master Security Architect, Exodus Communications, Inc.<br/>
Bruce Schneier - Founder, Chief Technical Officer, Counterpane Internet Security<br/>
John S. Quarterman - Founder, InternetPerils, Matrix NetSystems, Inc.<br/>
Perry Metzger - Independent Consultant<br/>
Rebecca Bace - CEO, Infidel<br/>
Peter Gutmann - Researcher, Department of Computer Science, University of Auckland<br/>
</p>
<p>
Here are the conventional news sources I collected this morning: 
<a href="http://news.com.com/2100-1029-5081214.html?tag=nl">News.com on the report</a>, 
<a href="http://news.com.com/2100-1014_3-5082649.html">News.com on the firing</a>, 
<a href="http://www.techweb.com/wire/story/TWB20030924S0008">TechWeb</a>, 
<a href="http://www.forbes.com/technology/newswire/2003/09/25/rtr1092228.html">Forbes</a>, 
<a href="http://www.infoworld.com/article/03/09/26/HNdisowns_1.html">InfoWorld</a>, 
<a href="http://www.washingtonpost.com/wp-dyn/articles/A54872-2003Sep23.html">Washington Post on the report</a>, 
<a href="http://www.washingtonpost.com/ac2/wp-dyn/A2328-2003Sep25?language=printer">Washington Post on the firing</a>.
</p>
<p>
Reading through these, I was amazed not to find a single link to the report -- whose URL, by the way, is <a href="http://www.ccianet.org/papers/cyberinsecurity.pdf">http://www.ccianet.org/papers/cyberinsecurity.pdf</a>. The only hint that it was even available online came in the second Washington Post story, which reports that <a href="http://www.cio.com/">CIO Magazine</a> declined to rent its subscriber mailing list to CCIA, which had wanted to notify CIO's readers of the report. The Post's story reads:
</p>
<blockquote cite="Washington Post">
At the same time, the editor for the magazine's Web site posted a poll asking readers what they thought of the report, which he linked to through the CCIA Web site. [<a href="http://www.washingtonpost.com/ac2/wp-dyn/A2328-2003Sep25?language=printer">Washington Post</a>]
</blockquote>
<p>
How odd that I had to use Google to find the CCIA website, and then scan it for the link to the report at the center of all this hullabaloo! I was sure that the blogosphere would handle this very differently, and of course it did. I ran a <a href="http://www.feedster.com/search.php?hl=en&amp;ie=UTF-8&amp;q=geer+schneier+ccia&amp;btnG=Search&amp;sort=date">Feedster search</a> for &quot;geer schneier ccia&quot;. The first result didn't link to the report, but the <a href="http://www.cs.rochester.edu/~bukys/weblog/archives/2003/09/25.html#001621">second one</a> did, as did the <a href="http://newsforge.com/article.pl?sid=03/09/24/1518219">fourth</a>. 
</p>
<p>
I find this generally true nowadays. Folks who consume news by way of blogs are likelier to be exposed to primary sources than folks who rely on conventional news sources. Of course everyone's time is finite, so I'm sure those primary sources often go unread, but at least they're <i>available</i>. When conventional news websites don't bother, they make themselves much less valuable. 
</p>
<p>
In any case, I finally tracked down and read the report. And I found myself agreeing with The Register's skeptical view of its central assertion:
</p>
<blockquote cite="The Register">
From the security perspective monoculture is not of necessity bad; the problems (as indeed the document argues) lie in the flawed nature of the design of the base product, magnified many times by the ubiquity of that product, and again by the complexities introduced under the banner of integration and automation. So in theory at least, it seems to us, you could have a monoculture whose fundamental design premise was not fatally flawed, and whose security issues would therefore not be magnified by &quot;cascade failure&quot; across the network. Sure you could still argue it was lining the pockets of a bunch of greedheads who were stifling diversity, but that's a different argument. [<a href="http://www.theregister.co.uk/content/4/33082.html">The Register</a>]
</blockquote>
<p>
I agree. The argument I put forward in my controversial <a href="http://www.infoworld.com/article/03/09/05/35OPstrategic_1.html">Security Blame Games</a> column, and in the <a href="http://weblog.infoworld.com/udell/2003/09/08.html#a793">posting</a> that aired some follow-on email discussion, is that we now have an intense and healthy competition between two different approaches to the construction of secure software: the open source way, and the new methodologies in effect in Redmond now that Microsoft has belatedly gotten religion. 
</p>
<p>
With governments and major corporations now cozying up to Linux, some suggest that significant erosion of the monoculture has already begun. In fact, I think that's a stretch, and so does Karsten Self:
</p>
<blockquote cite="Karsten Self">
I _strongly_ suspect that many of these announcements are part of the current round of license (re)negotiation with Microsoft, rather than sincere efforts to deploy alternatives.  The pattern has been for $MAJOR_FIRM or $COUNTRY to make a similar declaration, and a delegation from Microsoft to visit, followed by denials of any special deals. [<a href="http://zgp.org/linux-elitists/Pine.LNX.4.33.0211201354470.28925-100000@hydrogen.leitl.org.html">Linux-elitists mailing list</a>]
</blockquote>
<p>
He adds, though:
</p>
<blockquote cite="Karsten Self">
The announcements aren't credible unless the threat is credible, and I do feel that a GNU/Linux-based desktop _is_ credible at this point.
</blockquote>
<p>
Absolutely. From the perspective of national and global security, what's needed in my view is not simply software diversity, but competition among different ways of producing software. The vitality of OSS is, in many ways, a competitive reaction to Microsoft. I argue the reverse is also now becoming true. Microsoft is reinventing itself again because it faces a real competitive threat. These two software-producing cultures are forcing one another to raise the level of their games.
</p>
<p>
The CyberInsecurity report concludes:
</p>
<blockquote cite="CCIA">
While appropriate remedies require significant debate, these three alone would
engender substantial, lasting improvement if Microsoft were vigorously forced to:
<ul>
<li><p>Publish interface specifications to major functional components of its code, both Windows and Office.</p></li>
<li><p>Foster development of alternative sources of functionality through an approach comparable to the highly successful 'plug and play' technology for hardware components.</p></li>
<li><p>Work with consortia of hardware and software vendors to define specifications and interfaces for future developments, in a way similar to the Internet Society's RFC process to define new protocols for the Internet.
</p></li>
</ul>
</blockquote>
<p>
Sounds great! But how do we get there? The forces currently in play may well produce the right results. In the case of XML and Web services, I'd argue they already have. The growing viability of OSS will advance the report's agenda more effectively, I suspect, than any legislation can.
</p>
</body>

</body>
</item> 

<item num="a805">
<title>More baseball lessons</title>
<date>2003/09/25</date>
<body>

<p>
My recent column about baseball and IT was an odd departure for me, since I lack the sports gene and don't even try to pretend otherwise. But as it turns out, the baseball/IT connection is right up the alley of Jeff Angus, an InfoWorld <a href="http://search.infoworld.com/servlet/query.html?qt=jeff+angus">contributing editor</a> whose background (who knew?) includes baseball reporting as well as management consulting. 
</p>
<p>
Jeff says he's writing a book that'll be called &quot;Almost Everything I Need to Know About Management, I Learned From Baseball.&quot; And he's working out the ideas on a <a href="http://cmdr-scott.blogspot.com/2003_09_01_cmdr-scott_archive.html">weblog</a>; here's a sample: 
</p>
<blockquote cite="Jeff Angus">
TIP: Take a lesson from baseball and consider using late-cycle projects with determined outcomes (bad and good) as September Call-Up opportunities for less-experienced staffers. Your risk will just about never be lower, and the chances for high returns will just about never be higher. [<a href="http://cmdr-scott.blogspot.com/2003_09_01_cmdr-scott_archive.html#106339019660449047">Management by Baseball</a>]
</blockquote>
<p>
Cool. But dude...get yourself an RSS feed! Although, come to think of it, how does a Blogger user do that at the moment? According to <a href="http://new.blogger.com/feature_giveaway/pro_email.pyra">this message</a> now attached to the Blogger Pro link at www.blogger.com: 
</p>
<blockquote cite="Evan Williams">
<p>
We're no longer offering Blogger Pro as a separate product and we're folding most of the features into regular (free) Blogger.
</p>
<p>
Don't worry - nothing you paid for is going away. Your subscription is still valid, and you will continue to have access to features like RSS and post-via-email that are still not in the free version. 
</p>
</blockquote>
<p>
Sounds like a Catch-22. Jeff can't upgrade to Blogger Pro, but free Blogger doesn't yet support RSS? 
</p>
</body>
</item> 

<item num="a804">
<title>Loosely-coupled publishing</title>
<date>2003/09/24</date>
<body>
<p>
I've long admired the way my favorite book site, <a href="http://www.allconsuming.net/">All Consuming</a>, pulls book discussions out of thin air. For example, yesterday I posted an item pointing to my current InfoWorld column, which riffs on the themes of Michael Lewis' book, Moneyball. In that posting, I included an image of the book, and I linked the image to the <a href="http://allconsuming.net/item.cgi?isbn=0393057658">Moneyball page</a> at All Consuming. Alternatively, I could have linked the image to the <a href="http://amazon.com/o/asin/0393057658/">Moneyball page</a> at Amazon. Either way, I knew that All Consuming would scan my blog, find a reference to a book, and assimilate that reference into the Moneyball discussion. And sure enough, today the Moneyball page at All Consuming includes this element:
</p>
<p>
    <b class="bold">Weblogs that mentioned this book within the last week</b>

    <ul class="pad">
      
      <li class="pad"><a href="http://allconsuming.net/travel.cgi?url=http://weblog.infoworld.com/udell/#0393057658">http://weblog.infoworld.com/udell/</a> (<a href="/weblog.cgi?url=http://weblog.infoworld.com/udell/">site info</a>) (1 days ago)      </li>

      
      <li class="pad"><a href="http://allconsuming.net/travel.cgi?url=http://ross.typepad.com/blog/#0393057658">http://ross.typepad.com/blog/</a> (<a href="/weblog.cgi?url=http://ross.typepad.com/blog/">site info</a>) (1 days ago)<br/>
      <span style="font-size: 12px;">&quot;Jon Udell's latest column discusses the <span class="lw">Moneyball</span>
approach to doing more with less through measurement. You can't manage
what you can't measure and when it comes to human and social capital
opportunities for advantage abound:
&quot; [<a href="http://allconsuming.net/travel.cgi?url=http://ross.typepad.com/blog/#0393057658">read more</a>]</span></li>
      
      <li class="pad"><a href="http://allconsuming.net/travel.cgi?url=http://www.sauria.com/blog/#0393057658">http://www.sauria.com/blog/</a> (<a href="/weblog.cgi?url=http://www.sauria.com/blog/">site info</a>) (1 days ago)<br/>
      <span style="font-size: 12px;">&quot;It's Friday, which means its time for the columnists at Infworld to post
their latest. I'm impressed by how consistently Jon Udell comes up with
something interesting to write about. This week it's themes from
Michael Lewis' <span class="lw">Moneyball</span>.
&quot; [<a href="http://allconsuming.net/travel.cgi?url=http://www.sauria.com/blog/#0393057658">read more</a>]</span></li>
      
      <li class="pad"><a href="http://allconsuming.net/travel.cgi?url=http://www.jcwinnie.us/MT/weblog/#0393057658">http://www.jcwinnie.us/MT/weblog/</a> (<a href="/weblog.cgi?url=http://www.jcwinnie.us/MT/weblog/">site info</a>) (5 days ago)<br/>
      <span style="font-size: 12px;">&quot;Moneyball: The Art of Winning an Unfair Game ASIN: 0393057658
&quot; [<a href="http://allconsuming.net/travel.cgi?url=http://www.jcwinnie.us/MT/weblog/#0393057658">read more</a>]</span></li>

      
      <li class="pad"><a href="http://allconsuming.net/travel.cgi?url=http://www.techlawadvisor.com/#0393057658">http://www.techlawadvisor.com/</a> (<a href="/weblog.cgi?url=http://www.techlawadvisor.com/">site info</a>) (5 days ago)<br/>
      </li>
      
    </ul>
</p>
<p>
I'm happy to report that InfoWorld.com has now dipped a toe into these waters. The template for articles now includes this element:
</p>
<table width="165" border="0" cellspacing="0" cellpadding="4" style="border: 1px dashed rgb(153, 153, 153);">
<tbody><tr>
<td><b>TOP SITE REFERRALS</b><br/><br/>
<a href="http://www.infoworld.com/article/03/09/23/HNsymantec_1.html">
SMS Virus Alert</a><br/>
(<a href="http://www.smartmobs.com/archives/001691.html">Smart Mobs</a>)<br/><br/>
<a href="http://www.infoworld.com/article/03/09/12/36OPcringely_1.html">
DVD Licensing and SCO as a Verb</a><br/>
(<a href="http://blogs.law.harvard.edu/cmusings/2003/09/23#a364">A Copyfighter's Musings</a>)
</td>
</tr>
</tbody></table>
<p>
It's made of pairs of links, the first being an InfoWorld.com story, the second being a weblog item that refers to the story. Currently the items included here are culled, by an editor, from the output of Technorati and Feedster, both of which offer views (<a href="http://www.technorati.com/watchlists/rss.html?wid=928">1</a>, <a href="http://www.feedster.com/rss.php?q=infoworld&amp;sort=date&amp;ie=UTF-8">2</a>) of InfoWorld-related blog discussion. The rule for inclusion in the culled list is, roughly, &quot;items that do not merely cite the InfoWorld article, but say something substantive about it, and/or advance the story in some useful way&quot;.
</p>
<p>
Publication-related websites often use the TalkBack device, whereby every article is (potentially) the root of a discussion thread. What we're trying here is more like TrackBack. Should we go all the way and let TrackBack-capable blogs ping InfoWorld.com articles directly? Perhaps. But for now, since we have more culled referral data than we are using, I'd like to see the current mechanism made context-sensitive. So, for example, <a href="http://www.infoworld.com/article/03/09/19/37FEcodeedit_1.html">this story</a> by Maggie Biggs has been widely cited in recent days (<a href="http://www.blueskyonmars.com/archives/2003_09_23.html#001027">1</a>, <a href="http://www.teammurder.com/archives/000896.html">2</a>, <a href="http://jerobins.freeshell.org/house/archives/000075.php">3</a>). When such references are available for an article, they ought to supersede the sitewide references on that article's page, thusly:
</p>
<p>
<table width="165" border="0" cellspacing="0" cellpadding="4" style="border: 1px dashed rgb(153, 153, 153);">
<tbody><tr>
<td><b>BLOG CHATTER</b><br/><br/>
<a href="http://www.blueskyonmars.com/archives/2003_09_23.html#001027">Blue Sky On Mars</a>: <i>Of course, Emacs with all of the Java tools incorporated could readily be considered an IDE.</i><br/><br/>
<a href="http://www.teammurder.com/archives/000896.html">Team Murder</a>: <i>Maybe it's because I rarely code anything terribly complex but I find most IDEs way too pushy.</i><br/><br/>
<a href="http://jerobins.freeshell.org/house/archives/000075.php">The Robinson House</a>: <i>I also believe that IDEs tend to groom bad habits; clicking 'build' constantly to catch errors is the biggest time waster I've observed.</i><br/>
</td>
</tr>
</tbody></table>
</p>
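<p>
The supersede rule is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not how InfoWorld.com's templates work: the function name <tt>referrals_for</tt>, the two data structures, and the <tt>limit</tt> parameter are all my own assumptions.
</p>
<pre class="code" lang="python">
def referrals_for(article_url, per_article, sitewide, limit=3):
    """Return the referrals to show on one article's page.

    per_article maps an article URL to a list of (blog name, blog URL)
    pairs; sitewide is the editor's culled list for the whole site.
    Both structures are assumptions made for this sketch.
    """
    specific = per_article.get(article_url, [])
    # Article-specific references supersede the sitewide list when present.
    chosen = specific if specific else sitewide
    return chosen[:limit]


per_article = {
    "http://www.infoworld.com/article/03/09/19/37FEcodeedit_1.html": [
        ("Blue Sky On Mars", "http://www.blueskyonmars.com/archives/2003_09_23.html#001027"),
        ("Team Murder", "http://www.teammurder.com/archives/000896.html"),
        ("The Robinson House", "http://jerobins.freeshell.org/house/archives/000075.php"),
    ],
}
sitewide = [
    ("Smart Mobs", "http://www.smartmobs.com/archives/001691.html"),
    ("A Copyfighter's Musings", "http://blogs.law.harvard.edu/cmusings/2003/09/23#a364"),
]
</pre>
<p>
Asked about the Maggie Biggs story, the function returns the three blog references; asked about any article with no references of its own, it falls back to the sitewide list.
</p>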
<p>
I'm sure we'll see this kind of thing evolve in coming weeks and months. For blog veterans, all this may seem obvious. But in the realm of traditional publications, the instinct has always been to try to form gated communities, not federate with independent voices. I think the loosely-coupled model is more compelling, so I'm delighted to see InfoWorld.com taking its first step in that direction.
</p>

</body>
</item> 

<item num="a803">
<title>iCal explorations</title>
<date>2003/09/23</date>
<body>
<p>
A couple of people wrote to point out that I'd given the impression that iCal was an Apple-only thing. Not so, of course. iCal is a <a href="http://www.ietf.org/rfc/rfc2445.txt">standard</a> with many implementations. One that I hadn't tried until yesterday is <a href="http://www.mozilla.org/projects/calendar/">Mozilla Calendar</a> -- it's available as an XPI-style extension and works, for me, with Windows and Mac versions of Firebird. (I had to also use the <a href="http://quicktools.mozdev.org/installation.html">QuickTools</a> extension in order to get Calendar to show up on the menu.)
</p>
<p>
As an experiment, I decided to see what it would take to get Outlook, which is currently my canonical PIM, to publish events to a public iCal file. I started with <a href="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/spambayes/spambayes/Outlook2000/sandbox/dump_props.py?rev=1.10&amp;content-type=text/plain">dump_props.py</a>, which is a tool included with Mark Hammond's SpamBayes Outlook plug-in. This script, which showed me how to <a href="http://webservices.xml.com/pub/a/ws/2003/05/13/email.html">extract</a> mail for fulltext indexing, is a great starting point for the novice Python/MAPI coder. 
</p>
<p>
The script I came up with is crufty and doesn't do import yet, just export, but it works. As a result, I'm now even more aware of what I suspected about <a href="http://www.ietf.org/proceedings/02mar/I-D/draft-ietf-calsch-many-xcal-01.txt">xCal</a>, the (apparently now expired) proposal to XMLize iCal. xCal proposed a straight mapping of iCal elements to XML equivalents. Fair enough, and certainly helpful to applications that are parsing this stuff, but what struck me when I saw my events showing up in iCal, and when I looked around at other iCal data, is how impoverished the stuff is.
</p>
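<p>
To make the &quot;straight mapping&quot; idea concrete, here is a minimal Python sketch that turns iCal lines into nested XML. It is not the xCal draft's actual vocabulary: the lower-cased element names, the <tt>ical</tt> root element, and the function name <tt>ical_to_xml</tt> are assumptions of mine, and the sample event is borrowed from the Celtics calendar I looked at a couple of days ago.
</p>
<pre class="code" lang="python">
import xml.etree.ElementTree as ET

ICAL_LINES = [
    "BEGIN:VEVENT",
    "SUMMARY:Celtics vs Heat",
    "LOCATION:FleetCenter",
    "DTSTART;TZID=US/Eastern:20031029T190000",
    "DURATION:PT2H30M",
    "END:VEVENT",
]


def ical_to_xml(lines):
    """Map BEGIN/END blocks to nested elements and properties to children."""
    root = ET.Element("ical")  # hypothetical wrapper element
    stack = [root]
    for line in lines:
        # Naive split: assumes no colon appears inside a parameter value.
        name, _, value = line.partition(":")
        if name == "BEGIN":
            stack.append(ET.SubElement(stack[-1], value.lower()))
        elif name == "END":
            stack.pop()
        else:
            # Parameters such as TZID=US/Eastern become XML attributes.
            prop, *params = name.split(";")
            child = ET.SubElement(stack[-1], prop.lower())
            for param in params:
                key, _, val = param.partition("=")
                child.set(key.lower(), val)
            child.text = value
    return root


print(ET.tostring(ical_to_xml(ICAL_LINES), encoding="unicode"))
</pre>
<p>
The result is easier for an XML toolchain to parse, but it carries exactly the same handful of fields as the original. That is the impoverishment I mean.
</p>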
<p>
Back in 1997, at the dawn of the Java servlet era, I wrote a <a href="http://www.byte.com/art/9708/sec8/sec8.htm">servlet-based group calendar</a>. One of the delightful things about it -- to me, anyway -- was that calendar entries accepted HTML, so could contain links and rich formatting. Now we have the technology to treat such entries as structured data. But there's no place in iCal to put that stuff. XML calendar metadata would be useful, but not revolutionary. XML payloads, though, could really open some doors.
</p>
<p>
As <a href="http://www.snee.com/bob/">Bob DuCharme</a> said to me yesterday:
<blockquote cite="Bob DuCharme">
When I give an XSLT class, I like to point out that one key to its success is that you can create whatever XML serves your purpose and translate between your XML and that of your business partners as needed without waiting and waiting for a common format to share.
</blockquote>
Exactly. There are a handful of things that absolutely must be nailed down to make something like iCal work. And then there are a million &quot;nice-to-haves&quot; that will never make it through the standards bottleneck. With XML Web services, we've concluded that all payloads are extensible. That seems like a sound principle to me, and fertile ground for innovation in other areas too. <i>&quot;But then, calendaring might as well just be an XML Web service.&quot;</i> OK.
</p>
</body>
</item> 

<item num="a802">
<title>Baseball lessons for software teams</title>
<date>2003/09/23</date>
<body>
<p>
<a href="http://allconsuming.net/item.cgi?isbn=0393057658"><img vspace="6" hspace="6" alt="moneyball" align="right" src="http://images.amazon.com/images/P/0393057658.01.MZZZZZZZ.jpg"/></a>
<blockquote cite="InfoWorld">
Doing more with less is the theme of Michael Lewis' terrific new book, Moneyball. This David-versus-Goliath tale explains how the low-budget Oakland Athletics consistently win more games than much richer teams. Moneyball is not just a baseball book; it's a treatise on the science and economics of individual and team performance. The methods pioneered by Oakland General Manager Billy Beane, based on the theoretical foundations laid by maverick statistician Bill James, hold important lessons for enterprise IT. [Full story at <a href="http://www.infoworld.com/article/03/09/19/37OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
</p>
</body>
</item> 

<item num="a801">
<title>The calendar fiasco</title>
<date>2003/09/21</date>
<body>
<p>
Ray Ozzie reminds us what a fiasco calendars still are:
<blockquote cite="Ray Ozzie">
Each fall, as I manually enter the entire Celtics season schedule, my company's holidays and my childrens' school calendars into my own personal calendar, I am again reminded how ridiculous it is that The Net has not yet ubiquitously embraced the everyday exchange of virtual objects so basic as calendars and as vCards - which can also likewise be subscribed-to, aggregated into Contact Lists and auto-updated via personal RSS feeds. Bizarre. [<a href="http://www.ozzie.net/blog/2003/09/20.html#a109">Ray Ozzie's Weblog</a>]
</blockquote>
The situation is perhaps slightly less dismal than that, but not much less. I asked Google for Celtics schedules. It found a <a href="http://www.nba.com/celtics/news/downloadable_schedule.html">CSV file from nba.com</a> that I could download into Outlook, and also an <a href="http://icalshare.com/viewer/month.php?cal=http%3A%2F%2Fical.mac.com%2Fical%2FCeltics.ics&amp;getdate=20031128&amp;sid=20020918234608300">iCal calendar at iCalShare.com</a> that I could subscribe to on the Mac. I don't know how many people have bothered to acquire the Outlook-compatible data, but according to iCalShare's stats, an underwhelming number of Mac folk have subscribed to that version.
</p>
<p>
Let's look at the formats used by these two services. First the CSV file:
</p>
<pre class="code" lang="csv">
START_DATE,START_TIME,START_TIME_ET,SUBJECT,LOCATION,\
 DESCRIPTION,END_DATE,END_TIME,END_TIME_ET,\
 REMINDER_ON_OFF,REMINDER_DATE,REMINDER_TIME,REMINDER_TIME_ET
&quot;10/29/2003&quot;,&quot;07:00 pm&quot;,&quot;07:00 pm&quot;,&quot;Celtics vs. Miami&quot;,&quot;FleetCenter&quot;,\
 &quot;Regular Season - Celtics vs. Miami&quot;,&quot;10/29/2003&quot;,&quot;10:00 pm&quot;,&quot;10:00 pm&quot;,\
 &quot;TRUE&quot;,&quot;10/29/2003&quot;,&quot;01:00 pm&quot;,&quot;01:00 pm&quot;
&quot;10/31/2003&quot;,&quot;08:00 pm&quot;,&quot;08:00 pm&quot;,&quot;Celtics @ Memphis&quot;,&quot;The Pyramid&quot;,\
  &quot;Regular Season - Celtics @ Memphis&quot;,&quot;10/31/2003&quot;,&quot;11:00 pm&quot;,&quot;11:00 pm&quot;,\
 &quot;TRUE&quot;,&quot;10/31/2003&quot;,&quot;01:00 pm&quot;,&quot;01:00 pm&quot;
</pre>
<p>
And then the iCal data:
</p>
<pre class="code" lang="vcal">
BEGIN:VCALENDAR
CALSCALE:GREGORIAN
X-WR-TIMEZONE;VALUE=TEXT:US/Eastern
X-WR-CALDESC;VALUE=TEXT:Boston Celtics
METHOD:PUBLISH
PRODID:-//Apple Computer\, Inc//iCal 1.0//EN
X-WR-RELCALID;VALUE=TEXT:109067FB-EC55-11D7-8B7F-0003937196E0
X-WR-CALNAME;VALUE=TEXT:Celtics
VERSION:2.0
BEGIN:VEVENT
SEQUENCE:2
DTSTAMP:20030917T173759Z
SUMMARY:Celtics vs Heat
LOCATION:FleetCenter
LOCATION: Boston
UID:108F66AE-EC55-11D7-8B7F-0003937196E0
DTSTART;TZID=US/Eastern:20031029T190000
DURATION:PT2H30M
END:VEVENT
BEGIN:VEVENT
SEQUENCE:2
DTSTAMP:20030917T173759Z
SUMMARY:Celtics at Grizzlies
LOCATION:The Pyramid
LOCATION: Memphis
UID:108F6A6D-EC55-11D7-8B7F-0003937196E0
DTSTART;TZID=US/Eastern:20031031T200000
DURATION:PT2H30M
END:VEVENT
...
END:VCALENDAR
</pre>
<p>
If you want to translate between formats like these, God help you. It's a game I call the import/export shuffle. In one version of this game, you run all your Outlook mail and contacts through Mozilla in order to send them to another application. An <a href="http://weblog.infoworld.com/udell/2002/06/28.html">item</a> I posted last year about this insane procedure is <i>still</i> drawing hits. 
</p>
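<p>
For a sense of what even the direct translation involves, here is a rough Python sketch that turns one row of the CSV into a VEVENT. I've trimmed the row to six of its columns, and the function names, the assumption that the times are US/Eastern, and the omission of the UID and DTSTAMP lines that the iCal sample carries are all mine.
</p>
<pre class="code" lang="python">
import csv
import io
from datetime import datetime

# A trimmed copy of the first row of the nba.com file shown above.
CSV_TEXT = (
    "START_DATE,START_TIME,SUBJECT,LOCATION,END_DATE,END_TIME\n"
    '"10/29/2003","07:00 pm","Celtics vs. Miami","FleetCenter","10/29/2003","10:00 pm"\n'
)


def parse_stamp(date_text, time_text):
    """Turn '10/29/2003' and '07:00 pm' into a datetime."""
    return datetime.strptime(date_text + " " + time_text.upper(), "%m/%d/%Y %I:%M %p")


def ical_duration(delta):
    """Format a timedelta as an iCal duration such as PT3H or PT2H30M."""
    minutes = int(delta.total_seconds()) // 60
    hours, minutes = divmod(minutes, 60)
    return "PT%dH" % hours + ("%dM" % minutes if minutes else "")


def row_to_vevent(row):
    """Build the lines of one VEVENT from a CSV row (time zone is assumed)."""
    start = parse_stamp(row["START_DATE"], row["START_TIME"])
    end = parse_stamp(row["END_DATE"], row["END_TIME"])
    return [
        "BEGIN:VEVENT",
        "SUMMARY:" + row["SUBJECT"],
        "LOCATION:" + row["LOCATION"],
        "DTSTART;TZID=US/Eastern:" + start.strftime("%Y%m%dT%H%M%S"),
        "DURATION:" + ical_duration(end - start),
        "END:VEVENT",
    ]


events = [row_to_vevent(row) for row in csv.DictReader(io.StringIO(CSV_TEXT))]
print("\n".join(events[0]))
</pre>
<p>
Even this toy exposes a disagreement between the two sources: the CSV's 7:00-to-10:00 window works out to a three-hour game, while the iCalShare calendar says PT2H30M.
</p>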
<p>
Here's another example of the import/export shuffle:
<blockquote cite="Mac OS X Hints">
A tip for anyone trying to get info from
Outlook to iCal (or even just into Entourage for that matter!) using
Palm Desktop 4 for OS X as a middleman: <br/>
<br/> Export your calendar events from Outlook as a tab-delimited text
file, open the text file in Excel (or other spreadsheet which can read
delimited text files), change the appropriate column headings to read: <b>Title</b>, <b>Date (Start)</b>, <b>Start Time</b>, <b>End Date</b>, <b>End Time</b>, <b>Category 1</b>, <b>Category 2</b>, and <b>Private</b>. Delete any remaining columns. Save in the same delimited text format as the file was when you opened it.
<br/>
<br/> Now download and install the newest free Palm Desktop for OS X,
import this freshly edited delimited text file into the Palm Desktop
datebook, then turn around and export the datebook from Palm Desktop as
a vCal file, which imports just fine into iCal. Also, Entourage says it
will import from Palm Desktop, though I've not tried that. <br/>
<br/>
Phew! That was easy wasn't it?
[<a href="http://www.macosxhints.com/article.php?story=20020912063701529">Mac OS X Hints</a>]
</blockquote>
Indeed. Procedures like this make me want to lie down until the urge to accomplish whatever I was trying to do just fades away.
</p>
<p>
Why are we in this terrible mess? It seems to me that the Net has yet to embrace ubiquitous sharing of <i>any</i> kind of structured data. Weblogs as we know them today take two steps in the right direction. They make it easy to share unstructured information. It's now as trivial as it should be to post some text up on the Web at a URL that anybody else can access. Weblogs also make it easy to share a very specialized kind of structured data. The items we post are wrapped in XML metadata that can be aggregated and mined. We call this wrapper RSS.
</p>
<p>
Ray asks:
</p>
<blockquote cite="Ray Ozzie">
Has a method to embed xCal [a hypothetical XMLization of iCal] events/etc ever been suggested as a viable item type for RSS?
</blockquote>
<p>
For months now I've been asking, and trying to answer, another question: What if the items we can now so easily publish and subscribe to were made of, not merely wrapped in, XML? Here are a couple of cells in an XHTML table:
</p>
<p>
<table border="1" cellspacing="0" cellpadding="4" class="ical">
<tr class="week">
<td class="day" date="20031029">
<div class="event" start="20031029T190000" duration="PT2H30M">
<span class="date">
October 29, 2003
</span><br/>
<span class="eventSummary">
Celtics vs Heat
</span><br/>
<span class="Location">
FleetCenter
</span><br/>
<span class="Location">
Boston
</span><br/>
</div>
</td>
<td class="day" date="20031031">
<div class="event" start="20031031T200000" duration="PT2H30M">
<span class="date">
October 31, 2003
</span><br/>
<span class="eventSummary">
Celtics at Grizzlies
</span><br/>
<span class="Location">
The Pyramid
</span><br/>
<span class="Location">
Memphis
</span><br/>
</div>
</td>
</tr>
</table>
</p>
<p>
Here is the XHTML markup behind it:
</p>
<pre class="code" lang="xhtml">
&lt;table border=&quot;1&quot; cellspacing=&quot;0&quot; cellpadding=&quot;4&quot; class=&quot;ical&quot;&gt;
&lt;tr class=&quot;week&quot;&gt;
&lt;td class=&quot;day&quot; date=&quot;20031029&quot;&gt;
&lt;div class=&quot;event&quot; start=&quot;20031029T190000&quot; duration=&quot;PT2H30M&quot;&gt;
&lt;span class=&quot;date&quot;&gt;
October 29, 2003
&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;eventSummary&quot;&gt;
Celtics vs Heat
&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;Location&quot;&gt;
FleetCenter
&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;Location&quot;&gt;
Boston
&lt;/span&gt;&lt;br/&gt;
&lt;/div&gt;
&lt;/td&gt;
&lt;td class=&quot;day&quot; date=&quot;20031031&quot;&gt;
&lt;div class=&quot;event&quot; start=&quot;20031031T200000&quot; duration=&quot;PT2H30M&quot;&gt;
&lt;span class=&quot;date&quot;&gt;
October 31, 2003
&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;eventSummary&quot;&gt;
Celtics at Grizzlies
&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;Location&quot;&gt;
The Pyramid
&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;Location&quot;&gt;
Memphis
&lt;/span&gt;&lt;br/&gt;
&lt;/div&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;
</pre>
<p>
I posted this chunk of data over on Kimbro Staken's site to explore what kinds of access his engine makes possible. Some examples:
</p>
<p>
<a href="http://www.xmldatabases.org/WK/blog/*//table[@class='ical' and contains (.//span[@class='date'], 'October')]//div[@class='event']">Celtics games in October.</a>
</p>
<p>
<a href="http://www.xmldatabases.org/WK/blog/*//table[@class='ical']//span[@class='Location' and contains(., 'Boston')]/ancestor::div[@class='event']">Home games (fuzzy match for Boston).</a>
</p>
<p>
<a href="http://www.xmldatabases.org/WK/blog/*//table[@class='ical']//span[@class='Location' and normalize-space(./text())='Boston']/ancestor::div[@class='event']">Home games (exact match for Boston).</a>
</p>
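<p>
You don't need Kimbro's engine to try queries like these. Here is a small Python sketch that rebuilds the table in memory and asks the same two questions. Python's ElementTree supports only a subset of XPath, with no <tt>contains()</tt> function and no <tt>ancestor::</tt> axis, so the sketch walks down from each event div instead of climbing up from the span. The helper names are my own, and I've left out the <tt>br</tt> elements for brevity.
</p>
<pre class="code" lang="python">
import xml.etree.ElementTree as ET

GAMES = [
    ("20031029T190000", "October 29, 2003", "Celtics vs Heat", ["FleetCenter", "Boston"]),
    ("20031031T200000", "October 31, 2003", "Celtics at Grizzlies", ["The Pyramid", "Memphis"]),
]


def build_table(games):
    """Rebuild the calendar table above as an ElementTree."""
    table = ET.Element("table", {"class": "ical"})
    week = ET.SubElement(table, "tr", {"class": "week"})
    for start, date, summary, locations in games:
        day = ET.SubElement(week, "td", {"class": "day", "date": start[:8]})
        event = ET.SubElement(day, "div", {"class": "event", "start": start, "duration": "PT2H30M"})
        ET.SubElement(event, "span", {"class": "date"}).text = date
        ET.SubElement(event, "span", {"class": "eventSummary"}).text = summary
        for place in locations:
            ET.SubElement(event, "span", {"class": "Location"}).text = place
    return table


def events_where(table, span_class, matches):
    """Return event divs having a span of span_class whose text satisfies matches."""
    found = []
    for event in table.iterfind(".//div[@class='event']"):
        spans = event.iterfind("span[@class='%s']" % span_class)
        if any(matches((span.text or "").strip()) for span in spans):
            found.append(event)
    return found


def summaries(events):
    """Pull the eventSummary text out of each event div."""
    return [event.find("span[@class='eventSummary']").text for event in events]


table = build_table(GAMES)
october = events_where(table, "date", lambda text: "October" in text)
home = events_where(table, "Location", lambda text: text == "Boston")
print(summaries(october))
print(summaries(home))
</pre>
<p>
The first query finds both games and the second finds only the game against the Heat, which is what the October and exact-match links above are asking for.
</p>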
<p>
I am in violent agreement with Ray. It's crazy that we haven't got a ubiquitous way to share this kind of data. And weblogs-plus-RSS clearly ought to be helping us do that. An &quot;RSS item type&quot; for event records is one approach. A few months ago, I was envisioning a similar kind of thing for job postings. But now I'm asking myself: Which comes first? The chicken of Web content that's derived from a structured format (as <a href="http://icalshare.com/viewer/month.php?cal=http%3A%2F%2Fical.mac.com%2Fical%2FCeltics.ics&amp;getdate=20031128&amp;sid=20020918234608300">this calendar view</a> is derived from <a href="webcal://ical.mac.com/ical/Celtics.ics">this data</a>), or the egg of the structured data itself?
</p>
<p>
The answer that keeps coming back is: Neither. Maybe the chicken and the egg are really the same thing. An XHTML table with a certain structure, whose rows and cells are decorated in a certain way, can render calendar entries (or job postings) directly on the Web for people to read. That same XHTML table, flowing out in the xhtml:body of an item in an RSS feed, or accessed by way of a blog's XPath (or other) API, can be manipulated by applications. 
</p>
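<p>
As a rough illustration of the second half of that claim, here is a Python sketch that reads one decorated event div and writes it back out as iCal lines. The function names and the default time zone are assumptions of mine, and I emit one LOCATION line per span only because that mirrors the iCalShare sample.
</p>
<pre class="code" lang="python">
import xml.etree.ElementTree as ET


def sample_event():
    """Build one decorated event div like those in the table above."""
    event = ET.Element("div", {"class": "event", "start": "20031031T200000", "duration": "PT2H30M"})
    for css_class, text in [
        ("date", "October 31, 2003"),
        ("eventSummary", "Celtics at Grizzlies"),
        ("Location", "The Pyramid"),
        ("Location", "Memphis"),
    ]:
        ET.SubElement(event, "span", {"class": css_class}).text = text
    return event


def event_to_ical(event, tzid="US/Eastern"):
    """Read the class and attribute decorations back out as iCal lines."""
    summary = event.find("span[@class='eventSummary']").text.strip()
    places = [span.text.strip() for span in event.iterfind("span[@class='Location']")]
    lines = ["BEGIN:VEVENT", "SUMMARY:" + summary]
    lines += ["LOCATION:" + place for place in places]
    lines += [
        "DTSTART;TZID=%s:%s" % (tzid, event.get("start")),
        "DURATION:" + event.get("duration"),
        "END:VEVENT",
    ]
    return lines


print("\n".join(event_to_ical(sample_event())))
</pre>
<p>
The same markup that a browser renders for people gives the program everything it needs, with no second format to keep in sync.
</p>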
<p>
I realize of course that XHTML was never intended to be used this way. The right answer is, arguably, to create special-purpose XML vocabularies for events, jobs, and other &quot;virtual objects,&quot; then extend our content-producing and -consuming tools accordingly. But at the rate we invent special-purpose XML vocabularies and standardize on tools that support them, that could take a very long time. 
</p>
<p>
Meanwhile, we are busily publishing and subscribing to information flows that come so close to being made of general-purpose XML stuff that I see low-hanging fruit everywhere I look. 
</p>
</body>
</item> 

<item num="a800">
<title>Kimbro's science experiment</title>
<date>2003/09/19</date>
<body>
<p>
Kimbro Staken's new science experiment, <a href="http://www.syncato.org/">Syncato</a>, is bubbling right along. I just used the new comments feature to post a comment to an <a href="http://www.xmldatabases.org/WK/blog/507?t=item">item</a>. In the comment, which is well-formed, I &quot;transcluded&quot; the result of an XPath query of Syncato's XML database. I also quoted Kimbro in the comment, and used my own blog's convention -- <tt>&lt;blockquote cite=&quot;...&quot;&gt;</tt> -- to do so. Now watch:
</p>
<p>
<a href="http://www.xmldatabases.org/WK/blog/comment[author='Jon Udell']">Comments I've posted.</a>
</p>
<p>
<a href="http://www.xmldatabases.org/WK/blog/*//blockquote">All blockquote elements.</a>
</p>
<p>
<a href="http://www.xmldatabases.org/WK/blog/*//blockquote[@cite='Kimbro Staken']">All quotations of Kimbro.</a>
</p>
<p>
<a href="http://www.xmldatabases.org/WK/blog/*//blockquote[@cite='Kimbro Staken']/ancestor::comment">Comments, written by me, that quote Kimbro.</a>
</p>
<p>
<a href="http://www.xmldatabases.org/WK/blog/comment[author='Jon%20Udell']//blockquote[@cite='Kimbro%20Staken']/ancestor::comment//pre[@class='code' and @lang='xpath']">XPath code fragments contained within comments where I quote Kimbro.</a>
</p>
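<p>
Since the query is just a URL, composing one from a program is trivial. Here is a minimal Python sketch; the function names and the choice of which punctuation to leave unescaped are my own assumptions, not part of Syncato.
</p>
<pre class="code" lang="python">
from urllib.parse import quote

BASE = "http://www.xmldatabases.org/WK/blog/"


def xpath_url(xpath, base=BASE):
    """Append an XPath expression to the blog's base URL, escaping spaces."""
    # Keep XPath punctuation readable; percent-encode spaces and the rest.
    return base + quote(xpath, safe="/[]@='*:(),")


def quoting(author, by=None):
    """XPath for comments that quote author, optionally limited to one commenter."""
    # No attempt is made to escape quote characters inside the names.
    start = "comment[author='%s']//" % by if by else "*//"
    return start + "blockquote[@cite='%s']/ancestor::comment" % author


print(xpath_url(quoting("Kimbro Staken", by="Jon Udell")))
</pre>
<p>
The printed URL is the same as the last link above, minus its trailing step that selects the XPath code fragments.
</p>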
<p>
Awesomely cool. Kimbro writes:
<blockquote cite="Kimbro Staken">
In reality Syncato is much more then just a weblog system, it's an XML fragment management system. [<a href="http://www.syncato.org/WK/blog/508?t=page">Syncato</a>]
</blockquote>
Great description. And what a powerful concept!
</p>
</body>
</item> 

<item num="a799">
<title>Language Instincts</title>
<date>2003/09/18</date>
<body>
<p>
<blockquote cite="O'Reilly Network">
The dictionary of the Semantic Web may one day be written. But not until we've done a lot of yammering, a lot of listening, and a lot of imitating. We need to find ways to help these behaviors flourish. [<a href="http://www.xml.com/pub/a/2003/09/17/udell.html">O'Reilly Network</a>]
</blockquote>
The ideas put forward in this article got a big boost this week when Kimbro Staken revealed <a href="http://www.xmldatabases.org/WK/blog/262?t=item">Syncato</a>, his way-cool new blog software that's based on XML DB and that uses XPath expressions in its URLs. Today Kimbro <a href="http://www.xmldatabases.org/WK/blog/503?t=item">released the source code</a> which (I can't resist) you could also find using this search: <a href="http://www.xmldatabases.org/WK/blog/item//a[contains (@href, '.gz')]">http://www.xmldatabases.org/WK/blog/item//a[contains (@href, '.gz')]</a>. 
</p>
<p>
Another related item I've been meaning to mention comes from Sam Ruby's blog. Referring to an <a href="http://weblog.infoworld.com/udell/2003/08/29.html#a787">earlier posting of mine</a> on this theme, Sam wrote:
<blockquote cite="Sam Ruby">
Converging on well formed XML will encourage <a href="http://www.infoworld.com/article/02/12/17/021219opwebserv_1.html">spontaneous integration</a>. [<a href="http://www.intertwingly.net/blog/1582.html">Intertwingly</a>]
</blockquote>
In the comments attached to Sam's item, Jemaleddin Cole noted:
<blockquote cite="Jemaleddin Cole">
If all you want from a blogging tool is to make sure that the xml is well formed, Jesse Ruderman has a solution. <a href="http://www.squarefree.com/archives/000033.html">http://www.squarefree.com/archives/000033.html</a>.
</blockquote>
<a href="http://weblog.infoworld.com/udell/gems/blogidate.gif"><img align="right" width="221" height="106" vspace="6" hspace="6" alt="blogidate" src="http://weblog.infoworld.com/udell/gems/blogidate.gif"/></a>
I picked up Jesse's &quot;blogidate xml well-formedness&quot; bookmarklet there, and it immediately became part of my routine. Enlarge the screenshot and you'll see it in action, in Mozilla (works in IE too), pinpointing a well-formedness error in a draft of this posting. The red tint tells me there's a problem; the location is highlighted; the parser error shows up in the browser's status line. When I fix the error, I'm green and good to go. Excellent!
</p>
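<p>
The same kind of check is easy to run outside the browser. Here is a minimal Python sketch; it is not Jesse's bookmarklet, just the general idea, and the function name <tt>check_well_formed</tt> is my own. Note that it parses strict XML, so a draft has to have a single root element, and HTML-only named entities will be reported as errors.
</p>
<pre class="code" lang="python">
import xml.etree.ElementTree as ET


def check_well_formed(markup):
    """Return None if markup parses as XML, else (line, column, message)."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as error:
        # The parser reports where it gave up, as a (line, column) pair.
        line, column = error.position
        return line, column, str(error)
    return None
</pre>
<p>
Hand it a draft with a mismatched or unclosed tag and it returns the line, the column, and the parser's message, which is the same information the bookmarklet puts in the highlight and the status line.
</p>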
</body>
</item> 

<item num="a798">
<title>Weblogs, prior art, and virtual machines</title>
<date>2003/09/17</date>
<body>
<p>
<a href="http://www.uspto.gov/"><img border="1" align="right" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/uspto.gif" alt="US PTO"/></a>
Ray Ozzie recently posted what may prove to be the single most influential weblog item ever written: <a href="http://www.ozzie.net/blog/stories/2003/09/12/savingTheBrowser.html">Saving the Browser</a>. As you probably already know, Ray makes a compelling argument that the 1993-era Lotus Notes should have been considered prior art for the Eolas <a href="http://164.195.100.11/netacgi/nph-Parser?Sect1=PTO1&amp;Sect2=HITOFF&amp;d=PALL&amp;p=1&amp;u=/netahtml/srchnum.htm&amp;r=1&amp;f=G&amp;l=50&amp;s1='5838906'.WKU.&amp;OS=PN/5838906&amp;RS=PN/5838906">patent</a> filed in 1994 and issued in 1998. Ray's extraordinary essay might conceivably save Microsoft ten times what it invested in Groove, should the argument prove decisive in an appeal of the recent ruling in favor of Eolas. Of more interest to those who weep only crocodile tears for Microsoft in this case, it might prevent a bunch of other applecarts from being upset: Flash, Mozilla, Safari. 
</p>
<p>
It's fun to imagine that a single weblog posting could turn out to be worth a half-billion dollars. But Ray's essay fascinates me for other reasons too. First, it shows how weblogs could help accelerate the flow of information through the patent system. The workings of that system are revealed in another remarkable patent-related essay posted this week: Tim Bray's <a href="http://www.tbray.org/ongoing/When/200x/2003/09/15/SWPatents">Software Patents from the Inside</a>. Curious about how the examiners working on his own patent application do their background research, Tim learned from his lawyer that: 
<blockquote cite="Tim Bray">
...examiners are insanely overworked and under huge pressure to get through the maximum number of claims every day, and (at least in this first-cut situation) may take an approach as simple as digging up other patents in the space and running through them with the PDF search function and a thesaurus. [<a href="http://www.tbray.org/ongoing/When/200x/2003/09/15/SWPatents">ongoing</a>]
</blockquote>
Although Tim was in the end surprised by the quality of the review his application finally received, the critical bottleneck is clearly awareness of relevant material. The weblog network is, above all, an amplifier of such awareness. Using it I rely less on my own capacity to search for and to absorb raw material, and more on a network of people, the results of whose searching, reading, and analysis are made available to me. Whatever you think of software-related patents -- and Tim's views on the subject are complex -- I've got to think that weblog technology can help to improve the patent process.
</p>
<p>
Ray's essay also shows poignantly how an innovator can be blindsided by a competing technology that's less advanced in many ways, but tuned for ubiquity and accessibility:
<blockquote cite="Ray Ozzie">
In 1993 or thereabouts, we saw the emergence of TCP/IP, HTML, HTTP, Mosaic and the Web. From our perspective, all of these were simplistic emulations of a tiny subset of what we'd been doing in Notes for years. TCP/IP instead of Netbeui or IPX/SPX, HTML instead of CD [Compound Document] records, HTTP instead of the Notes client/server protocols, httpd instead of a Notes server. And we were many years ahead in other ways: embedded compound objects, security, composition of documents as opposed to just 'browsing' them, and a sophisticated development environment. I am quite embarassed to say that we frankly didn't 'get' what was so innovative about this newfangled 'Web' thing, given the capabilities of what had already been built. [<a href="http://www.ozzie.net/blog/stories/2003/09/12/savingTheBrowser.html">Ray Ozzie: Saving the Browser</a>]
</blockquote>
It's easy for everyone to see now that the Web had to trump Notes, along the crucial axes of ubiquity and accessibility. Ray blushes to admit that he and his team didn't see that coming a decade ago, but the reason why not is instructive for all of us. Innovation is an act of willful imagination. What you dream up can indeed change the world. But the world keeps changing too, in ways that can require you to refocus your dream. Not an easy thing to do.
</p>
<p>
A final interesting aspect of Ray's posting is his use of VMware to reconstitute the 1993-era software environment: DOS 6.22, Windows for Workgroups 3.11, Excel 5.0, Notes 3.0. If I had to cop to a technology that I underestimated a decade ago, it'd be virtualization. It's still true, today, that most of the software I use is compiled for real CPUs and runs on real systems. But I can now foresee a time when such resources will be more often virtualized than not. Managed environments like the JVM and the CLR are slowly but I think inexorably advancing. System emulation -- whether it's Linux on the mainframe, or x86 on the Mac, or x86 on x86 -- is becoming less exotic and more routine. 
</p>
<p>
There's been a lot of second-guessing about Ray's decision not to port Groove from Windows. At the rate things are going, I wonder when that might become a non-issue. Not soon enough, alas, for the Virtual PC-equipped 800MHz TiBook I'm typing on at the moment.
</p>
</body>
</item> 

<item num="a797">
<title>Kimbro Staken's REST/XPath blog</title>
<date>2003/09/16</date>
<body>
<p>
Kimbro Staken's new blog software, built on top of Sleepycat's Berkeley DB XML, echoes a theme I've been working with myself for a while. A collection of well-formed weblog entries is, implicitly, an XML database whose contents can be searched and intelligently recombined. I've been toying with a simple file-based solution that creates an XPath search interface to my blog content. Kimbro's approach takes the next step:
<blockquote cite="Kimbro Staken">
Now the really interesting feature of this system is that it's really an XML database Web Service. I exposed an XPath query facility through the URL so that the database can be queried via HTTP GET. [<a href="http://www.xmldatabases.org/WK/blog/262?t=item">Inspirational Technology</a>]
</blockquote>
</p>
<p>
Kimbro gives this example:
</p>
<p>
<a href="http://www.xmldatabases.org/WK/blog/item//a">http://www.xmldatabases.org/WK/blog/item//a</a> (all links)
</p>
<p>
But I can change it to, for example:
</p>
<p>
<a href="http://www.xmldatabases.org/WK/blog/item//table[contains(.,'Annie Lennox')]">http://www.xmldatabases.org/WK/blog/item//table[contains(.,'Annie Lennox')]</a> (tables containing 'Annie Lennox')
</p>
<p>
Very cool! As Kimbro points out:
<blockquote cite="Kimbro Staken">
The possibilities of this are endless, especially as you add more meaningful markup to your posts.
</blockquote> 
I just love this idea of incorporating XPath into RESTian URLs. With Kimbro's approach, you get immediate use of the markup you create -- just the kind of incentive that's needed.
</p>
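<p>
For comparison, the file-based approach can be sketched in a few lines of Python. This is an illustration rather than the script I've actually been toying with: the two-item blog is a stand-in built in memory, the function names are assumptions, and because ElementTree has no <tt>contains()</tt> function the text test is done in Python.
</p>
<pre class="code" lang="python">
import xml.etree.ElementTree as ET


def build_blog():
    """Build a two-item stand-in for a file of well-formed blog entries."""
    blog = ET.Element("blog")
    first = ET.SubElement(blog, "item", {"num": "a1"})
    link = ET.SubElement(ET.SubElement(first, "p"), "a", {"href": "http://example.com/src.tar.gz"})
    link.text = "source"
    second = ET.SubElement(blog, "item", {"num": "a2"})
    cell = ET.SubElement(ET.SubElement(ET.SubElement(second, "table"), "tr"), "td")
    cell.text = "Annie Lennox"
    return blog


def search(blog, path, containing=None):
    """Run a path query, optionally keeping only matches whose text contains a string."""
    hits = blog.findall(path)
    if containing is not None:
        hits = [hit for hit in hits if containing in "".join(hit.itertext())]
    return hits


blog = build_blog()
links = search(blog, "item//a")
tables = search(blog, "item//table", containing="Annie Lennox")
print([a.get("href") for a in links], len(tables))
</pre>
<p>
With a real archive you would load the tree from the blog's XML file instead of building it by hand. What Kimbro adds is the database underneath and the URL on top.
</p>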
</body>
</item> 

<item num="a796">
<title>Email's special power</title>
<date>2003/09/15</date>
<body>
<p>
<blockquote cite="InfoWorld">
Software that requires people to explicitly declare the formation of these groups, and to acknowledge their dissolution, is too blunt an instrument for such ephemeral social interaction. Like an operating-system thread, an e-mail thread is a lightweight construct, cheap to set up and tear down. Could a protocol other than SMTP, and an application other than e-mail, support such interaction? Sure, but any other communication medium that has e-mail's special power to convene groups will suffer the same diseases that afflict e-mail: spam, abuse, infoglut. [Full story at <a href="http://www.infoworld.com/article/03/09/12/36OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
When I recently <a href="http://weblog.infoworld.com/udell/2003/08/29.html#a786">threw some cold water</a> on the notion that RSS is ready to displace email, I failed to articulate what, exactly, is so special about email. This column makes the case. 
</p>
<p>
As much as I've used and thought about email, I never fully appreciated what an extraordinary thing the cc: header is. Not only do we use it to form groups spontaneously, across all kinds of boundaries, we use it to affirm or to adjust the membership of those groups. If you watch from a bird's eye perspective -- as social network analysts are beginning to do -- you see groups forming and dissolving with the unconscious fluidity of a cocktail party. Of course people also come and go in discussion forums, blog threads, and chats. But in these venues, every message doesn't reconstitute the group. When we strip everything else away from email, that single amazing quality -- at once powerful and problematic -- remains. 
</p>
</body>
</item> 

<item num="a795">
<title>TCP 135 and the loss of end-to-end</title>
<date>2003/09/12</date>
<body>
<p>
I've never spent much time tethered to an Exchange Server, other than on an experimental basis, so I'd forgotten -- or never knew -- that Outlook contacts Exchange on TCP port 135. That is, of course, the same port that Blaster has lately been partying on with wild abandon. I'd also heard that some ISPs had begun blocking 135, on the grounds that it's more trouble than it's worth. As <a href="http://support.cox.net/custsup/safety/port_135.shtml">this document</a> from Cox High Speed Internet notes: 
</p>
<blockquote cite="Cox High Speed Internet">
<strong>Customers who use Microsoft Outlook to connect directly to a Microsoft Exchange server may no longer be able to connect when this port filter is applied.</strong> We recommend the use of a Virtual Private Network (VPN) to the company or group who operates the Exchange server. Please contact the network administrator or helpdesk for that company or group for additional details. 
</blockquote>
<p>
Yesterday, PJ Connolly and I discovered the effects of this kind of policy first hand. He's got an Exchange Server 2003 in the test center lab, I've got an Outlook 2003 client in my home lab. You wouldn't normally connect these over the open Internet without using a VPN, but for the purposes of this temporary experiment we thought we could. Yes and no. I have two DSL circuits here, from two different ISPs. One allows me to make that connection, but the other is blocking 135. PJ had the exact same experience from his home lab, which also has two DSL circuits provided by yet another pair of different ISPs. One allows 135 traffic, one blocks it.
</p>
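<p>
If you want to repeat the check from your own circuits, a minimal sketch follows. It assumes only that a plain TCP connection attempt is a fair proxy for what Outlook does first; the function name <tt>port_reachable</tt> is mine, not part of any tool. A failure tells you the path is closed somewhere, not that your ISP in particular is the one filtering.
</p>

```python
import socket


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds in time."""
    try:
        # create_connection completes the TCP handshake, so a filtered port
        # shows up here as a timeout or a refused connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refusals, and name-resolution failures alike.
        return False
```

<p>
Calling <tt>port_reachable("exchange.example.com", 135)</tt> (a placeholder host name) once from each circuit would show the kind of split PJ and I saw: True on one, False on the other.
</p>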
<p>
(This is, incidentally, a nice opportunity to try out an interesting new feature of Outlook 2003: <a href="http://www.win2000mag.net/Articles/Index.cfm?ArticleID=40018">RPC-over-HTTP</a>, an alternate, special-purpose kind of VPN that's arguably both as secure as and more flexible than the standard approach.)
</p>
<p>
It worries me to see this kind of systematic erosion of end-to-end connectivity. Of course you ought not run unencrypted Exchange traffic over the open Internet in the first place, so maybe we shouldn't mourn the loss of that capability. But I hate to see the vulnerabilities of Windows' RPC implementation tar other uses of RPC. My understanding (which is far from thorough) is that port 135 was used, long before Microsoft glommed onto it, as a rendezvous point for DCE RPC applications -- an <a href="http://support.entegrity.com/private/doclib/docs/osfhtm/develop/apgstyle/Apgst154.htm">endpoint mapper</a> used to convert a partial binding to a host into a complete binding to a host and port.
</p>
<p>
Given the mess we are in, arguing against blocking 135 at the center rather than at the edge sounds kind of academic, and maybe it is. But once these things are done I don't see how they'll ever be undone. Someday (we can hope) the flaws that provoked these actions will be fixed. But the center will have ceased to be a dumb neutral fabric. Policy that used to reside at the edge will have migrated to the center, and will be difficult (maybe impossible) to dislodge. That doesn't feel good.
</p>
</body>
</item> 

<item num="a794">
<title>Apples to apples</title>
<date>2003/09/09</date>
<body>
<p>
Bob McMillan, who wrote tons of interviews and analysis for <a href="http://www.linux-mag.com/auth.html#Robert%20McMillan">Linux Magazine</a> before joining IDG News Service recently, <a href="http://www.infoworld.com/article/03/09/09/HNforrester_1.html">reports today</a> on a Microsoft-sponsored Forrester study that finds Microsoft cheaper than Linux/J2EE for enterprise software development. As Bob points out, what largely accounts for the difference is the price of the BEA and Oracle software used on the Linux side of the fence. Others can (and will) dissect Forrester's motives and objectivity, but the report is, on its face, unsurprising. It's reasonable for a Linux-based enterprise to choose BEA and Oracle, and it's obvious that these are expensive choices. Interestingly, the report says that some shops prefer Linux despite these higher price tags, for cultural and/or strategic reasons.
</p>
<p>
I'm bothered, though, by Forrester's eagerness to blur the distinction between base operating systems and layered development and service platforms. For example, the cost of system administration is pegged at an identical $200,000 in both large-organization scenarios. Say what? The patchfest that Windows has been lately costs no more to manage than Linux? I find that hard -- no, impossible -- to believe. I'm also not sure I'm willing to accept Oracle -- which runs on Windows too, let's not forget -- as a fair swap for SQL Server. I like SQL Server 2000 just fine, but it's long in the tooth nowadays. And Oracle has been pushing the envelope aggressively. An enterprise will run Oracle because it thinks it has to, not because it has chosen Linux and then gone shopping for a database.
</p>
<p>
The Forrester study is an equation with too many variables. Some of them can be held constant, and I'd like to see that done. For example, the study mentions Zope and PHP but dismisses these as not being serious options for the task at hand -- what Forrester's John Rymer calls &quot;mainstream portal style applications.&quot; I wonder what NATO, which has <a href="http://www.infoworld.com/article/03/08/01/30OPstrategic_1.html">based its worldwide intranet on Zope</a>, will make of that? Note that although you can run Zope on Windows as well as Linux, hardly anybody (according to Zope Corp.) does.
</p>
<p>
Of course, Visual Studio.NET is a darned productive environment too. It's a shame we can't make an apples-to-apples comparison. Wait, I've got it! Microsoft just needs to port VS.NET, ASP.NET, and the CLR to Linux. Then we can settle this thing once and for all.
</p>
</body>
</item> 

<item num="a793">
<title>The security blame game</title>
<date>2003/09/08</date>
<body>
<p>
<blockquote cite="InfoWorld">
You can't turn Windows' installed base on a dime, but you can turn it eventually. In four or five years, the true nature of the struggle between the methodologies of Microsoft and the open source community may finally begin to emerge. My hunch is that both strategies will produce reliable and secure software, and that competition between them will benefit everyone. Neither strategy will deliver perfect security, of course, because no such thing exists. We'll always be assessing risks and making trade-offs. [<a href="http://www.infoworld.com/article/03/09/05/35OPstrategic_1.htm">http://www.infoworld.com/article/03/09/05/35OPstrategic_1.htm</a>]
</blockquote>
Last week's column provoked more than the usual number of responses. Here are some of them.
</p>
<p><b>Dan Gaters:</b></p>
<blockquote cite="Dan Gaters">
<p>
While any OS might be vulnerable to security attacks to some extent, the problem is that the OS that dominates 95% of the market has been quite problematic for over a decade.
</p>
<p>
From a hacker's POV if the market looks like:<br/>
<br/>
Windows: 95%<br/>
Others: 5%
</p>
<p>
it's easy to see what would get targeted.
</p>
<p>
However, if the market looked like this:
</p>
<p>
Windows 9/XP: 35%<br/>
SuSE/Debian/Mandrake: 20%<br/>
Mac OS 9/X: 20% <br/>
Red Hat: 15% <br/>
*BSD: 5% <br/>
Others: 5% 
</p>
<p>
the target would not be so obvious, given the fact that methods of attack
and conduits of transmission would not be predictably the same. From
agriculture to finance, we try to avoid monoculture; why do we foolishly
tolerate it in software?
</p>
</blockquote>
<p>
Dan, I agree. Monoculture delivers economies of scale by externalizing costs, but the costs don't go away. And yet...it's seductive. See next letter.
</p>
<p><b>Aaron Cohen</b></p>
<blockquote cite="Aaron Cohen">
<p>
I enjoyed reading your article &quot;Security blame games&quot; but I noticed that
it mentioned that:
<blockquote>
&quot;If more people used Linux and/or Mac OS X, more attackers would exploit
the vulnerabilities of these systems.&quot;
</blockquote>
</p>
<p>
This is where the fact that there are multiple distributions/flavors of
Linux comes in handy. Yes, there will be more viruses that will be aimed
at Linux, but not all distributions will have the same security holes.
Different distributions can run different types/configurations of
software. So when a whole slew of viruses for Linux come out they will
probably be aimed at a certain distribution of Linux that has that
vulnerability. It is easier to switch to a different distro of Linux
without the vulnerability than wait for Microsoft's next more secure
operating system (Longhorn release date 2005).
</p>
<p>
Good article!
</p>
</blockquote>
<p>
Aaron: Point taken. I must admit, though, that when trolling for a piece of software to add to my collection, I am not immune to the allure of a Windows or MacOS binary that I know will just work, versus a Linux binary or source download with unknown version and dependency issues. Monoculture may be an unhealthy vice, but it has its virtues too.
</p>
<p><b>Jim Mooney</b></p>
<blockquote cite="Jim Mooney">
<p>
The premise of your article is unfounded.
</p>
<p>
You state that if Macs were the majority, that there would be viruses 
for it as well.
</p>
<p>
There is no basis for that &quot;fact&quot;.  It is not a fact at all.
</p>
<p>
You are misleading your readers.
</p>
<p>
The fact is there are no viruses for Mac OS X provided you are 
Microsoft-free (some nasty scripting problems with Outlook and 
Entourage which due to Microsoft's poor programming standards were allowed 
to trickle in). Any updates are done automatically and allow the user 
to continuously be a safe computer user. There is no need for a user 
to go out of their way, it just works and updates itself.
</p>
<p>
There are a lot of journalists writing about OS X and Macs and they 
indeed do not know the real facts.  Perhaps you could take 10 minutes 
and do some research and  perhaps try using a Mac for a day (safely may 
I add).
</p>
</blockquote>
<p>
Jim: I do use a Mac, every day. Also Windows. Also Linux.
</p>
<p><b>Ralph Loader</b></p>
<blockquote cite="Ralph Loader">
<p>
You wrote:
<blockquote> 
If more people used Linux and/or Mac OS X, more attackers would exploit the vulnerabilities of these systems.
</blockquote>
</p>
<p>
eventually drawing the conclusion that
<blockquote>
... any dominant software player would have created a similar mess.
</blockquote>
</p>
<p>
This is easily seen to be wrong. For web serving software, Apache is
the dominant player, with Microsoft's product in a distant minority, but
still dominating real life security problems.
</p>        
<p>
Examining my web server logs for connections from web server worms, I
see hundreds of hits per day from compromised machines running MS
software, and a few a year from others.  On that metric, MS web servers
are tens of thousands of times worse than the more popular Apache web
server.
</p>
<p>
In any case, the number of attackers writing viruses or worms that
exploit vulnerabilities of a system seems pretty irrelevant.  It only
takes one worm written by one person to propagate over the Internet and
cause havoc.
</p>
<p>
To my mind, the old adage &quot;quality not quantity&quot; sums up this matter
well.
</p>
</blockquote>
<p>
Ralph: Points well taken. I completely agree that the open source methodology of collaboration and review produces software that is inherently more secure, and that gets fixed faster when vulnerabilities do surface. The question that we'll never be able to answer is: Would these qualities have emerged had open source not been an intense competitive reaction to the disastrous results produced by Microsoft's methodology, which the Trustworthy Computing initiative by its very existence admits was shoddy? A question that we will be able to answer in a couple of years, I think, is: What happens when a well-funded organization brings military discipline to bear on the problem of building secure and reliable software? Competition cuts both ways. We need an open source movement to challenge Microsoft, but we need a Microsoft to keep open source on its toes too. Example: buffer overflows continue to create vexing security problems on all platforms. Managed code is not a panacea, but it sure helps. And the next version of ASP.NET is entirely a managed application. My point is not that Microsoft is blame-free. It patently is not. Rather, my point is that open source has pushed Microsoft to evolve in ways beneficial to everybody, and that Microsoft can (and I hope will) do the same for open source.
</p>
<p><b>Doug Glenn</b></p>
<blockquote cite="Doug Glenn">
<p>
You're correct in that as Linux grows larger more worms or viruses 
will be targeted to it.  You're also correct that the competition 
from Linux is causing Microsoft to finally begin reviewing its code and 
making it more secure. It may take a couple of years, but they will get 
it right eventually although it may take a rewrite from the ground up. 
They will even reach a point where they have as good security as the 
unices. But I believe by the time they reach that point, the migration 
to an OSS-based platform will have gone beyond critical mass and start 
them on a long downward spiral.
</p>
<p>
What I also believe, is that without competition from Linux, MS 
would have continued to go on with business as usual. 
</p>
</blockquote>
<p>
Doug: My crystal ball is cloudier than yours when it comes to predicting the fate of OSS relative to MS. But we agree that the competition is healthy.
</p>
<p><b>Tracy Reed</b></p>
<blockquote cite="Tracy Reed">
<p>
You wrote:
<blockquote>
Open source software partisans never seem to follow their argument to its
logical conclusion, however. If more people used Linux and/or Mac OS X,
more attackers would exploit the vulnerabilities of these systems.
</blockquote>
</p>
<p>
Ah, but that is NOT the logical conclusion because open source software
has FAR fewer vulnerabilities. Our email programs are not designed to
automatically execute attachments or render HTML in a preview pane nor do
we routinely operate our computers with administrator rights. These simple
things make a HUGE difference in our susceptibility to viruses and worms.
</p>
<p>
Plus Linux is developing important new security technologies while
Microsoft does nothing. Linux 2.6 has a system called SE Linux built into
it. It is basically a system of important security restrictions which
prevent programs from doing things they normally should not do.  Are you
familiar with Linux/Unix at all? Telnet to ultraviolet.org with username
root and password root. Try to undermine the system in some way. You will
find that you are unable to. Now that's impressive security! :)
</p>
</blockquote>
<p>
Tracy: The default behavior of Microsoft email clients has been an unmitigated disaster, no argument there. Likewise the fact that a Windows user was historically a superuser by default. The questions now before us: Are things changing? Answer: Yes. How fast? Answer: Not fast enough. 
</p>
<p>
I think <a href="http://www.nsa.gov/selinux/">Security-enhanced Linux</a> is a great idea. But I wouldn't agree that Microsoft has &quot;done nothing&quot; in this area. The .NET managed-code initiative, with its emphasis on evidence-based secure code paths, is one important example. Palladium is another. The ultimate foundation for bulletproof security is, of course, a secure kernel that works hand-in-hand with securable hardware. Opponents of DRM (digital rights management) wish that Microsoft were doing less, not more, along these lines!
</p>
<p><b>Craig Franklin</b></p>
<blockquote cite="Craig Franklin">
<p>
I am a Linux user.  I agree with most of what you said in your article. 
The lack of competition is the major problem that created this mess.
</p>
<p>
We are angry because Microsoft has let these problems languish for
years.  They haven't needed to fix them, because there has been no real
competition.
</p>
<p>
Why wouldn't we be angry?  We were forced to buy a flawed product with
an excessively high price.
</p>
<p>
Microsoft has been able to accumulate more than 49 billion dollars,
while basic problems existed in its products. Their efforts are focused
at maintaining their position in the market, not improving their
products.  
</p>
<p>
Microsoft should make money, but until they stop abusing their monopoly,
expect more criticism when the next Sobig hits.
</p>
</blockquote>
<p>
Craig: Agreed. When the next Sobig hits, Microsoft will have earned the criticism it receives.
</p>
</body>
</item> 

<item num="a792">
<title>Adobe Q and A</title>
<date>2003/09/04</date>
<body>
<p>
A reader named Kirk Holbrook raised some interesting issues in response to my <a href="http://weblog.infoworld.com/udell/2003/08/21.html#a778">column</a> on Acrobat and InfoPath. Although he addressed his email to me, Kirk was really hoping for a response from Chuck Myers, the Adobe executive whose conversation with me I reported in the column. Below I have reprinted Kirk's message (with his permission), and Chuck's response, relayed to me by an Adobe PR representative and also reprinted with permission. It's an interesting and useful exchange, but one that wouldn't have come to light without several intermediaries. I was happy to help in this case, but such intermediation clearly won't scale. I continue to believe that thoughtful, articulate, and passionate spokespeople like Chuck would be doing themselves and their companies a favor by establishing weblogs and using them to address these kinds of issues in a proactive -- yet personable -- way.
</p>
<p>
Consider the security issue that Kirk raises, and Chuck's response to it. I gather that in this particular case, the risk is that the for-pay features of Acrobat could be unlocked in an unauthorized way. So Adobe stands to lose revenue, but neither the customer who uses ADSRE to enable advanced reader features (such as digital signature) nor the user of those features is at risk. Fair enough. But how can developers and users best exploit Acrobat's digital signature capabilities? What opportunities are being overlooked? What lessons have been learned? These are complex stories that can't adequately be told in whitepapers. They must evolve over time, revisiting the same issues from different perspectives, reacting to current events and public commentary, and finding an authentic voice. I know it's scary for companies to communicate that way. I wonder if we'll get to the point where it's scarier not to. 
</p>
<hr align="center" width="70%"/>
<p><b>Kirk Holbrook's message</b></p>
<blockquote cite="Kirk Holbrook">
<p>
Hi Jon,
</p>
<p>
I've enjoyed your articles in InfoWorld for some time now. I just read
your article about Adobe and XML and have a few comments.
</p>
<p>
First, I agree that Acrobat is a great tool. Simply using its &quot;print this
like it was meant to be printed&quot; capabilities is only a fraction of what
one can do with Acrobat. Several years ago I used Acrobat (3) for a
multimedia CD-ROM project for a client who needed some rather advanced
functions, but didn't have the budget for a more &quot;advanced&quot; solution -- it
worked out well.
</p>
<p>
That said, I have some problems with Adobe's strategy for Acrobat forms.
Document Server for Reader Extensions (DSRE) seems to be astronomically
priced. I say &quot;seems&quot; because I have been unable to get ahold of anyone at
Adobe who can give me a price (I've been trying for three days), although
I did find an old press release from before DSRE was released that says
pricing &quot;starts at $75,000.&quot;
</p>
<p>
At the very least, Adobe needs to provide a service whereby Adobe will
take a PDF that I create and enable these functions at a reasonable price
-- the key here being reasonable. There are many small businesses that
could make use of these functions in Reader. Reader has a huge
distribution (although Adobe seems to think that the download is not an
issue for users, as the Reader download is getting bigger and bigger) and
most people do not need the complete functionality of the Full version of
Acrobat. Acrobat Approval was a step in the right direction, but it seems
to have been discontinued -- again, the key is price. Acrobat Standard is
too pricey for small businesses to force their customers/clients to pony up
the cash for.
</p>
<p>
Reader needs to provide a way for users to fill out Acrobat forms and save
their data and print the form with the user's filled-in data (not simply
the default data). As long as Reader cannot perform these two tasks (at a
reasonable cost for PDF developers), there's no way that Acrobat will
outsell InfoPath. InfoPath is built on top of MS Office, and as such has a
much bigger foot in the door than Acrobat ever will -- at least until
Adobe allows users with Reader to sign and save Acrobat forms data. When
that happens (and I hope it's soon) Adobe can take over the world!
</p>
<p>
BTW there seem to be some issues with the manner in which DSRE signs PDF
files as it enables the reader extensions. What's the sense in having a
signed doc if it can easily be spoofed? See:
http://lists.insecure.org/lists/vulnwatch/2003/Jan-Mar/0103.html
</p>
<p>
Thanks for your time and all the great articles,
</p>
<p>
Kirk Holbrook<br/>
Senior Programmer
</p>
</blockquote>
<p><b>Chuck Myers' response:</b></p>
<blockquote cite="Chuck Myers">
<p>
Jon (and, indirectly, Kirk),
</p>
<p>
There seem to be three questions:
</p>
<p>
1) Pricing for Adobe Document Server for Reader Extensions (ADSRE)
</p>
<p>
2) What ADSRE does and how it is positioned relative to our previous low-priced licensing-only Acrobat (Approval)
</p>
<p>
3) The ADSRE &quot;spoofing issue&quot; that Kirk mentions
</p>
<p>
The answers are:
</p>
<p>
1) Some of the pricing listed in the email is our first-round pricing from when we released the product last Fall (that was 75K for 10 forms, 1.5M for unlimited). Since then, we have refined the pricing with two models: form-based and user-based. Form-based (commercial) pricing starts at $6,250 per form (minimum number of forms is 10), and goes as low as $2,500/form for quantities over 500.
</p>
<p>
User-based pricing is intended to resolve issues when you have a large/infinite number of forms/documents and a smaller/finite number of users. A good hypothetical example is an insurance company that has 30 forms, each with 50 state variations (1,500 forms total) that they send to their 4,000 agents on a nationwide basis. Here, the per-user model is much more effective. In this case, the pricing starts at $59/user (minimum 250 users) and goes as low as $24 in volume.
</p>
<p>
2) ADSRE enables three basic functions: form fill-in and save, digital signatures, and offline comment/collaboration. Details can be found at <a href="http://www.adobe.com/products/server/readerextensions/pdfs/ds_docserver_readerext.pdf">http://www.adobe.com/products/server/readerextensions/pdfs/ds_docserver_readerext.pdf</a> or the Flash-based demo at <a href="http://www.adobe.com/products/server/readerextensions/main.html#">http://www.adobe.com/products/server/readerextensions/main.html#</a> (under Web tour). This is done for any document that has been run through the ADSRE server; it can then be used by any user of the free Adobe Reader, whether on Windows or Mac, as long as they use Reader 5.1 or above.
</p>
<p>
Acrobat Approval is still available for sale, but the user-based pricing for ADSRE is only slightly more than Approval was, and it gives more capabilities (most notably the annotation tools). Whether this ADSRE pricing fits the definition of &quot;reasonable&quot; is in the eye of the beholder; $59 is much less than the $299 for Acrobat Standard.
</p>
<p>
3) ADSRE Issue. Back in March, it appeared that Elcomsoft published the details of a method to enable advanced features in Reader. This does not pose any security risk to our customers who have implemented solutions based on ADSRE.
</p>
</blockquote>
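<p>
To see why Chuck calls the per-user model more effective for his hypothetical insurance company, it helps to run the numbers. This is only a rough sketch using the prices quoted above. Two things are my assumptions, not Adobe's: that 1,500 forms would earn the lowest per-form price, and that the 4,000-agent tier (which the response doesn't state) falls somewhere between the lowest and highest per-user prices.
</p>

```python
# Figures quoted in Chuck Myers' response (US dollars).
FORM_PRICE_LOW = 2500    # per form, for quantities over 500
USER_PRICE_HIGH = 59     # per user, starting price (minimum 250 users)
USER_PRICE_LOW = 24      # per user, lowest volume price

# The hypothetical insurance company from the response.
forms = 30 * 50          # 30 forms, each with 50 state variations
agents = 4000

# Assumption: 1,500 forms qualifies for the "over 500" per-form price.
form_based_cost = forms * FORM_PRICE_LOW

# Assumption: the tier for 4,000 users is not stated, so bound it instead.
user_based_low = agents * USER_PRICE_LOW
user_based_high = agents * USER_PRICE_HIGH

print(f"form-based: ${form_based_cost:,}")
print(f"user-based: ${user_based_low:,} to ${user_based_high:,}")
```

<p>
Even at the top of the per-user range, the user-based figure is a small fraction of the form-based one, which squares with Chuck's point.
</p>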
</body>
</item> 

<item num="a791">
<title>Politically incorrect observations about Mac OS X and Windows</title>
<date>2003/09/03</date>
<body>
<p>
A few minutes ago, I had to hard-reset the TiBook I'm typing on. This happens at least once every week or two. Some of these events have been seemingly random, others I can almost -- but not quite -- reliably reproduce. One happens (very rarely, just once or twice ever) when the machine fails to wake from sleep. The other happens (much more often, but by no means always) when, after switching Wi-Fi networks, I connect to my Windows network. Meanwhile, my workhorse desktop machine running Windows XP has yet to bluescreen.
</p>
<p>
Of course, this is hardly an apples-to-apples (!) comparison. The TiBook leads an entirely different life than the Windows desktop machine does. It has three personalities -- a Unix core, a Mac face, and a Windows alter ego (Samba, Remote Desktop Connection, VirtualPC). It's on the road a lot, never knowing what network it'll attach to, and is expected to sleep or wake at all hours. And it manages all this more gracefully than any PC notebook I've ever had my hands on.
</p>
<p>
The truth is the Mac is only having trouble meeting expectations that it itself has raised. I had learned not to expect that a PC notebook would connect to unknown networks, assimilate new devices, and manage battery power in a reliable, predictable, and graceful manner. That the Mac with OS X surprises me by occasionally failing to do these things, rather than by occasionally succeeding, is pretty remarkable.
</p>
<p>
Still, there's no excuse for instability. I consider myself a resolute non-partisan when it comes to computer platform choices, so I'm compelled to point out that the TiBook/Jaguar combination leaves room for improvement. I expect Panther to be, first and foremost, more stable than Jaguar has been.
</p>
<p>
I'm also compelled to point out that Windows doesn't suck. Windows 9x does, to be sure, but the NT codebase never did, and doesn't now. If you plunk down $600 for a new PC that comes with Windows XP (Home Edition, even), you are getting an OS that bears little relationship to Windows 9x and shares a huge amount of its DNA with Microsoft's latest and most mature enterprise-class offering, Windows Server 2003. Why don't more people seem to appreciate this fact? To achieve a transition from the 9x codebase to the NT codebase, Microsoft had to downplay the extent of the change, and the reasons for it. To do otherwise would have been to admit, &quot;Well, yes, Windows used to suck, but now it doesn't.&quot;
</p>
<p>
That epochal transition is still underway. The center of gravity of an installed base moves slowly. But it does move, and the old criticisms based on 9x become less valid every day. 
</p>
</body>
</item> 

<item num="a790">
<title>Code reading and literary criticism</title>
<date>2003/08/30</date>
<body>
<p>
Brian Marick has posted a <a href="http://www.testing.com/cgi-bin/blog/2003/08/29#reader-response">wonderful essay</a> on the subject of commenting code. &quot;I do believe that code with comments should often be written to be more self-explanatory,&quot; he says. &quot;But code can only ever be self-explanatory with respect to an expected reader.&quot; To illustrate, he shows an algorithm in C, then translates to the kind of Lisp a C programmer would write, then retranslates to the kind of Lisp a Lisp programmer would write. Then he walks through the Lisp code line by line, exploring how the code itself sets up and then satisfies expectations in the mind of a reader who is presumed Lisp-proficient.
</p>
<blockquote cite="Brian Marick">
The line by line analysis I gave above was inspired by the literary critic Stanley Fish. He has a style of criticism called &quot;affective stylistics&quot;. In it, you read something (typically a poem) word by word, asking what effect each word (and punctuation mark, and line break...) will have on the canonical reader's evolving interpretation of the poem. [<a href="http://www.testing.com/cgi-bin/blog/">Exploration Through Example</a>]
</blockquote>
<p>
You don't run into the juxtaposition of Lisp and lit-crit every day! But as a former lit-crit person myself, I think there's a lot of merit to what Brian is saying here. The practical conclusions:
</p>
<blockquote cite="Brian Marick">
The more diverse your audience, the more likely you'll need comments. Teams will naturally converge on a particular &quot;canonical reader&quot;, but perhaps that process could be accelerated if people were mindful of it.
</blockquote>
<p>
Fascinating stuff, Brian.
</p>
</body>
</item> 

<item num="a789">
<title>In search of Office 2003 developers</title>
<date>2003/08/30</date>
<body>
<p>
You: A developer building an Office solution that couldn't have been done prior to Office 2003 -- i.e., that leverages the XML capabilities of (in particular) InfoPath, Excel, and Word.
</p>
<p>
Me: A journalist interested in interviewing you, starting Tuesday of next week.
</p>
<p>
Serious <a href="mailto:judell@mv.com?subject=Office2003Solution">inquiries</a> only, please.
</p>
</body>
</item> 

<item num="a788">
<title>More pleasant surprises, please</title>
<date>2003/08/30</date>
<body>
<blockquote cite="InfoWorld">
I want to be pleasantly surprised by software that notices when message patterns indicate the formation of a group or project, and volunteers to set up folders and filters for me. Likewise, I want to be pleasantly surprised by an RSS newsreader that notices how I save and organize items from my subscribed feeds. No breakthrough in artificial intelligence is needed to make this happen. We do the pattern recognition ourselves, quite naturally, as we process our information flows. If software paid more attention to what we attend to, and how, there could be more pleasant surprises. Full story at [<a href="http://www.infoworld.com/article/03/08/29/34OPstrategic_1.html">InfoWorld.com</a>]. 
</blockquote>
<p>
I had a bit of a pleasant surprise today. Last night <a href="http://www.rassoc.com/gregr/weblog/">Greg Reinacker</a> wrote to say that Outlook 2003 does have a Bayesian filtering capability (as I'd heard), and he's getting pretty good mileage out of it, though he admits there's no documentation on whether or how to train it, and points out that it's odd there's a &quot;Not Junk&quot; button but no &quot;Junk This&quot; button. For about a week I'd been using Outlook 2003 in parallel with Outlook 2000 + SpamBayes. In OL2003 I kept manually dragging spams -- mostly Sobig.F's -- into the junk folder, but it didn't seem to catch on. Nor did it catch more than a few non-Sobig.F's. Then, last night, I loaded up a batch of about 5000 choice spam messages from my SpamBayes training database into OL2003's junk folder. That seemed to do the trick. Now it's doing much better at catching non-Sobig.F's, though for reasons I can't determine it still hasn't trained on Sobig.F's, even though they're decorated with SpamAssassin headers. Is it possible it only looks at the body, not the header? Anyway, it's not as lame as I thought, though not nearly -- so far as I can tell -- as useful as SpamBayes.
</p>
</body>
</item> 

<item num="a787">
<title>Well-formed writing and information routing</title>
<date>2003/08/29</date>
<body>
<p>
The tagging conventions I've been applying for the last four months are really springing to life, now that <a href="http://weblog.infoworld.com/udell/gems/blogsearch.html">structured search of my blog</a> is available. For example, my convention has been to write quotations like so:
</p>
<p>
&lt;p class=&quot;quotation&quot; source=&quot;...&quot;&gt;...&lt;/p&gt;
</p>
<p>
On the search page, one of the canned queries uses this XPath expression to find all the places where I quote Ward Cunningham:
</p>
<p>
<tt>//*[@class='quotation' and @source='Ward Cunningham']</tt>
</p>
<p>
If I want to find Don Box quotes, I can just change that -- in the form's accompanying input field -- to:
</p>
<p>
<tt>//*[@class='quotation' and @source='Don Box']</tt>
</p>
<p>
While I'm at it, I might as well acknowledge all of the voices that have enriched my blog over the past four months. A snippet of XSLT found them:
</p>
<blockquote>
<i>
Adam Curry, Alf Eaton, Allie Rogers, Annrai O'Toole, Bernard Teo, Bill de hÓra, Bill Gates, Bob Clary, Brendan Eich, Brian Marick, Chad Dickerson, Chris Brumme, Crazy Apple Rumors, Dan Brickley, Danny Ayers, Dave Winer, Don Box, Douwe Osinga, Gordon Weakliem, Hiawatha Bray, Ian Hixie, James Farmer, Jenny Levine, Jesse James Garrett, Jim O'Halloran, John Markoff, Ken Manheimer, Les Orchard, Matt Griffith, Micah Alpern, Mitch Kapor, Nancy McGough, Patrick Logan, Paul Everitt, Paul Graham, Paul Philp, Pete Cole, Peter Wayner, Phil Wainewright, Philip Brittan, Ray Kurzweil, Ray Ozzie, Rob Howard, Robert Ivanc, Robert L. Vaessen, Robert Scoble, Sam Ruby, Samuel Pepys, Sandeepan Banerjee, Scott Reynen, Sean McGrath, Stefano Mazzocchi, Ted Leung, Ted Neward, Tiernan Ray, Tim Bray, Tim Oren, Tom Yager, Tonico Strasser, Ward Cunningham
</i>
</blockquote>
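<p>
For readers who would rather script this than use the search form, here is a minimal Python sketch of the same two steps: finding one person's quotations, and listing the distinct sources. It is an illustration, not the XSLT mentioned above. The sample document is made up, and because the standard library's ElementTree supports only a subset of XPath (no "and"), the two attribute tests are written as chained predicates.
</p>
<pre class="code" lang="python">
import xml.etree.ElementTree as ET

# Build a tiny stand-in for the blog archive (made-up sample data).
blog = ET.Element("blog")
for source, text in [
    ("Ward Cunningham", "first sample quotation"),
    ("Don Box", "second sample quotation"),
    ("Ward Cunningham", "third sample quotation"),
]:
    p = ET.SubElement(blog, "p", {"class": "quotation", "source": source})
    p.text = text

def quotations_by(root, source):
    # ElementTree has no "and" operator, so chain two predicates instead.
    # Names containing an apostrophe would need different quoting.
    return root.findall(f".//*[@class='quotation'][@source='{source}']")

def distinct_sources(root):
    # Collect every source attribute on a quotation, without duplicates.
    found = {el.get("source") for el in root.findall(".//*[@class='quotation']")}
    return sorted(found)

print([el.text for el in quotations_by(blog, "Ward Cunningham")])
print(distinct_sources(blog))
</pre>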
<p>
It's great to be able to reuse content like this. A point I made yesterday bears repeating, because it's central to what <a href="http://www.crn.com/weblogs/stevegillmor/">Steve Gillmor</a> calls the &quot;information routing&quot; aspect of RSS and blogging. Well-formed content is a powerful enabler for a couple of reasons.
</p>
<p>
First, you have more control over your own material. If you want to develop a series of elements -- mine include quotations, mini-reviews, tips, and code snippets -- there's no special content-management machinery needed to do so. Just start tagging things accordingly; structured search immediately brings these views to life. Some will merit formalization in the CMS, others won't. This exploratory mode is to the CMS world what dynamic languages and interactive environments are to the world of programming.
</p>
<p>
The second reason is subtler. Your content doesn't just live on your blog. It flows through the RSS network. If others can perform structured search of your content, and use automated methods to recombine it, then your stuff can resonate more powerfully and is more likely to retain its fidelity as it gets routed around.
</p>
<p>
To ante up for this game, you have to produce well-formed content. The mainstream blog-writing tools aren't helping at all. Most well-formed writing is done in emacs, still. Can we please change that soon?
</p>
</body>
</item> 

<item num="a786">
<title>RSS to replace email? Nah.</title>
<date>2003/08/29</date>
<body>
<p>
I've heard a lot about how Outlook 2003, both alone and in combination with Exchange Server 2003, has been beefed up to fight the war on spam. From a client-only perspective, it doesn't look too promising. Apart from filtering messages that have been externally processed -- for example, by SpamAssassin -- the primary strategy appears to be blacklisting or whitelisting senders. As this screenshot illustrates, Sobig-like worms destroy that strategy. I can neither whitelist nor blacklist email appearing to be from Dave Ogle or Anne Manes or Tom Thompson or Lowell Rapaport. Quite likely, none of these folks has even been infected with the worm. Their names just happened to be chosen randomly from the address books of users who were infected. 
<img border="1" vspace="6" hspace="6" alt="sobig" src="http://weblog.infoworld.com/udell/gems/sobig.jpg"/>
</p>
<p>
For what it's worth, my current lines of defense are:
<ol>
<li>
<p>
<a href="http://www.spampal.org/">SpamPal</a>, a local proxy that I use for RBL (realtime blacklist) checking. I point Outlook 2000 at SpamPal on localhost; it rewrites the headers of RBL positives; Outlook filters send them straight to Deleted Items for review.
</p>
</li>
<li>
<p>
<a href="http://spamassassin.org/">SpamAssassin</a>. Mail to my InfoWorld address is checked by SpamAssassin. Until Sobig came along, I wasn't getting much mileage out of SpamAssassin, because the IW guys have it running in a conservative mode. SpamBayes, my third line of defense, was doing most of the work. But this SpamAssassin rule has been highly effective against Sobig:
<br/>
<div>
<tt>
MICROSOFT_EXECUTABLE (10.0 points) RAW: Message includes Microsoft executable program</tt>
</div>
<br/>
Again, Outlook filters send these straight to Deleted Items for review.
</p>
</li>
<li>
<p>
<a href="http://spambayes.sourceforge.net/">SpamBayes</a>. 
I'm quite sure that SpamBayes alone would have adapted to Sobig. But by letting SpamAssassin do the grunt work, I reserve SpamBayes for subtler discrimination. You'd think that during this onslaught, my MaybeSpam folder -- where SpamBayes puts messages it's not sure about -- would be overflowing. In fact, only five or ten messages a day land there, and as usual they are messages that I legitimately have to decide how I want to handle.
</p>
</li>
</ol>
The only real accommodation I've had to make is to reduce the amount of mail I leave on the server, because the volume -- which seems not to be slackening -- was causing quota problems. Also, to be fair, I spend more time scanning for false positives, though nowhere near the amount of time I used to spend sorting things out before I implemented this layered strategy.
</p>
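<p>
The layering described above can be sketched as a short routing function. This is only an illustration of the order in which the three checks apply. The header names, the "Spam" folder, and the classifier labels are assumptions made for the example, not the actual output of SpamPal, SpamAssassin, or SpamBayes.
</p>
<pre class="code" lang="python">
# Hypothetical mapping from a Bayesian classifier's verdict to a folder.
BAYES_FOLDERS = {"spam": "Spam", "unsure": "MaybeSpam", "ham": "Inbox"}

def route(headers, bayes_label):
    """Return the folder a message should land in."""
    # Layer 1: the RBL proxy has rewritten the headers of blacklisted senders.
    # "X-RBL-Check" is a made-up header name.
    if headers.get("X-RBL-Check") == "positive":
        return "Deleted Items"
    # Layer 2: the server-side scanner has flagged the message.
    # "X-Spam-Flag" is assumed here as the marker it adds.
    if headers.get("X-Spam-Flag") == "YES":
        return "Deleted Items"
    # Layer 3: the Bayesian classifier decides whatever is left.
    return BAYES_FOLDERS[bayes_label]

print(route({"X-Spam-Flag": "YES"}, "ham"))
print(route({}, "unsure"))
</pre>
<p>
Because the cheap header checks run first, the classifier only has to rule on the messages that the first two layers let through.
</p>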
<p>
There's been a lot of talk about replacing email with RSS. I don't buy it. Although I am a huge fan of RSS, and expect it to largely replace email for subscription-related purposes (e.g., mailing lists), I don't see it as a general solution for ad-hoc person-to-person communication. Nor do I buy the argument that we need to toss SMTP. Obviously, we need to use it in a slightly different way. Of the various proposals floating around, the <a href="http://www.ietf.org/internet-drafts/draft-danisch-dns-rr-smtp-02.txt">RMX</a> idea -- a DNS-based solution that enables a receiving mail server to verify whether the sender's IP address is authorized to send from the domain within the sender's address -- seems particularly interesting. (I mentioned RMX in the <a href="http://www.infoworld.com/article/03/07/18/28FEspam_1.html">Canning Spam</a> article last month.) But it would be nuts to throw out the SMTP baby with the spam bathwater, and I'd be really surprised if that were to happen.
</p>
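<p>
To make the RMX idea concrete, here is a rough Python sketch of the check a receiving server would perform. It does not implement the draft itself: a dictionary stands in for the DNS lookup, and the domains and address ranges are made-up documentation examples.
</p>
<pre class="code" lang="python">
import ipaddress

# Stand-in for DNS: each domain lists the networks allowed to send its mail.
# A real receiver would look these records up; the data here is made up.
AUTHORIZED = {
    "example.com": ["192.0.2.0/24"],
    "example.org": ["198.51.100.16/28"],
}

def sender_is_authorized(mail_from, client_ip):
    """True if client_ip may send mail for the domain in mail_from."""
    domain = mail_from.rsplit("@", 1)[-1].lower()
    ip = ipaddress.ip_address(client_ip)
    # A domain with no record is treated as unauthorized in this sketch;
    # a real deployment would more likely treat it as unknown.
    networks = AUTHORIZED.get(domain, [])
    return any(ip in ipaddress.ip_network(net) for net in networks)

print(sender_is_authorized("alice@example.com", "192.0.2.25"))
print(sender_is_authorized("alice@example.com", "203.0.113.9"))
</pre>
<p>
The point is that the receiver checks the connecting IP address against what the claimed domain publishes, so a worm that forges a sender address from an infected machine elsewhere fails the test.
</p>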
</body>
</item> 

<item num="a785">
<title>Enjoying XPath search</title>
<date>2003/08/28</date>
<body>
<p>
I've made a few refinements to yesterday's <a href="http://weblog.infoworld.com/udell/gems/blogsearch.html">XPath search</a> hack. And, now that it's so easy to classify and locate code fragments, I wanted to update yesterday's posting with a detail I forgot to mention. Apart from the XML escaping, the data that came out of Radio wasn't quite what I needed. I wanted to turn elements like &lt;table name=&quot;00000666&quot;&gt; into &lt;table name=&quot;a666&quot;&gt;, and elements like &lt;date name=&quot;when&quot; value=&quot;Wed, 27 Aug 2003 12:42:28 GMT&quot;/&gt; into &lt;date name=&quot;when&quot; value=&quot;2003/08/27&quot;/&gt;, so that the search script could more easily form URLs pointing back to found items. Used to be, I'd reach for Perl on occasions like this. But nowadays, Python seems to have become the tool of choice. Here's what I did:
</p>
<pre class="code" lang="python">
import re
# email.utils.parsedate does the job of the old rfc822.parsedate on Python 3.
from email.utils import parsedate
 
def dateRepl(matchobj):
    # Rewrite the RFC 822 date inside the matched tag as YYYY/MM/DD.
    ret = matchobj.group(0)
    dt = matchobj.group(1)
    l = parsedate(dt)
    d = &quot;%04d/%02d/%02d&quot; % (l[0], l[1], l[2])
    return ret.replace(dt, d)
 
def permaRepl(matchobj):
    # Replace the leading zeros of the item number with the letter 'a'.
    ret = matchobj.group(0)
    oldperma = matchobj.group(1)
    newperma = re.sub(r'^0+', 'a', oldperma)
    return ret.replace(oldperma, newperma)
 
with open('weblog.xml') as f:
    s = f.read()
r = re.sub(r'&lt;date value = &quot;([^&quot;]+)&quot; name = &quot;when&quot; &gt;', dateRepl, s)
r = re.sub(r'&lt;table name = &quot;(\d+)&quot; &gt;', permaRepl, r)
print(r)
</pre>
<p>
Next, I wanted to reverse the top-level &lt;table&gt; elements in Radio's XML dump. The XSLT search finds things in document order, and I wanted to reverse that, to make newest results come first. Here's the XSLT transform to reverse the order of the items:
</p>
<p>
<pre class="code" lang="xslt">
&lt;?xml version=&quot;1.0&quot;?&gt; 
&lt;xsl:stylesheet 
  xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot; version=&quot;1.0&quot;&gt;
 
&lt;xsl:output method=&quot;xml&quot; indent=&quot;yes&quot; encoding=&quot;us-ascii&quot;/&gt;
 
&lt;xsl:template match=&quot;node() | @*&quot;&gt;
  &lt;xsl:copy&gt;
    &lt;xsl:apply-templates select=&quot;@* | node()&quot;/&gt;
  &lt;/xsl:copy&gt;
&lt;/xsl:template&gt;
 
&lt;xsl:template match=&quot;/blog&quot;&gt;
&lt;blog&gt;
&lt;xsl:for-each select=&quot;table[starts-with(@name, 'a')]&quot; &gt;
&lt;xsl:sort select=&quot;substring(@name, 2)&quot; data-type=&quot;number&quot; order=&quot;descending&quot;/&gt;
&lt;table name=&quot;{@name}&quot;&gt;
&lt;xsl:apply-templates /&gt;
&lt;/table&gt;
&lt;/xsl:for-each&gt;
&lt;/blog&gt;
&lt;/xsl:template&gt;
    
&lt;/xsl:stylesheet&gt;
</pre>
</p>
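<p>
One detail worth watching when reordering by names like a666 is whether the comparison is textual or numeric. The Python sketch below uses made-up names to show the difference. With names that all have the same number of digits the two orderings agree, which is an assumption that may not hold for a longer archive.
</p>
<pre class="code" lang="python">
def entry_number(name):
    # Strip the leading "a" and compare what is left as an integer.
    return int(name[1:])

names = ["a99", "a666", "a785", "a100"]

# Text comparison puts "a99" ahead of "a785" because "9" sorts after "7".
as_text = sorted(names, reverse=True)

# Numeric comparison gives the newest-first order that is wanted here.
as_numbers = sorted(names, key=entry_number, reverse=True)

print(as_text)
print(as_numbers)
</pre>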
<p>
Now the snippets in this entry will show up first in searches for Python and XSLT fragments. The whole entry will show up in the canned search I just added, entitled &quot;complete entries containing the phrase 'XPath search'.&quot; If I modify the query to read &quot;//body[contains ( . , 'XPath search' ) and contains (., 'reverse')]&quot; I'll currently find just this entry.
</p>
<p>
Cool! Now there's a virtuous circle. The various flavors of query -- 'body contains phrase', 'element has class attribute with value', 'link text contains text', 'link address contains text' -- remind me what's possible. Each of these queries is a template that encourages substitution and variation. As I do the substitution and create new variations, I think of new kinds of elements that might exist, and new kinds of searches they could enable.
</p>
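<p>
The template idea can be sketched in a few lines of Python. The template names follow the flavors listed above, but the XPath bodies are plausible guesses rather than the queries on the search page, and <tt>make_query</tt> is a hypothetical helper.
</p>
<pre class="code" lang="python">
# One XPath template per flavor of query; {0} marks the slot to fill in.
TEMPLATES = {
    "body contains phrase": "//body[contains(., '{0}')]",
    "element has class attribute with value": "//*[@class='{0}']",
    "link text contains text": "//a[contains(., '{0}')]",
    "link address contains text": "//a[contains(@href, '{0}')]",
}

def make_query(flavor, value):
    # XPath 1.0 has no escape for quotes inside a literal, so refuse
    # values that would break out of the single-quoted string.
    if "'" in value:
        raise ValueError("value must not contain a single quote")
    return TEMPLATES[flavor].format(value)

print(make_query("body contains phrase", "XPath search"))
print(make_query("element has class attribute with value", "quotation"))
</pre>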
<p>
A particularly nifty aspect of this, which took me very much by surprise when I first realized it, is the effect of dynamically collapsing the document to just the found elements, while preserving their style and structural integrity. This has an interesting -- and to me pleasing -- visual effect. But there's also a universal-canvas kind of thing happening. In MSIE, a View Source of the generated results page only shows you the script. Likewise in Mozilla, but if you select a fragment and do right-click and then View Selection Source, you'll get the nearest enclosing XHTML element that contains your selection. You can then capture the element, for purposes of quoting, most likely, with little effort and no loss of integrity. That's <i>very</i> interesting.
</p>
</body>
</item> 

<item num="a784">
<title>De-anonymizing with Google</title>
<date>2003/08/28</date>
<body>
<p>
The other day Mitch Kapor posted an <a href="http://blogs.osafoundation.org/mitch/000338.html">anonymized request</a> for a guest spot on his blog. He asked: &quot;I wonder how many of these were sent out?&quot; Well, I got one, and so did some other folks I know. I responded privately to the PR person, reiterating what I've said before (<a href="http://weblog.infoworld.com/udell/2002/08/14.html#a383">1</a>, <a href="http://weblog.infoworld.com/udell/2003/04/09.html#a663">2</a>, <a href="http://weblog.infoworld.com/udell/2003/04/11.html#a665">3</a>) -- that executives who wish to influence the conversation can and should join the conversation by writing regularly on their own blogs. 
</p>
<p>
I wasn't going to comment on this latest incident, but the story has taken a fascinating turn. From Viswanath Gondi's <a href="http://blogs.law.harvard.edu/vgondi/2003/08/26#a443">Design Media</a>, I learned that -- in a comment on Kapor's blog -- <a href="http://www.jjg.net/">Jesse James Garrett</a> found a clever way to de-anonymize Kapor's posting:
</p>
<blockquote cite="Jesse James Garrett">
Here's a hint for those interested in identifying the subject: the phrase &quot;disruptive competitive advantage&quot; appears on very few Web pages. [<a href="http://www.jjg.net/">Jesse James Garrett</a>]
</blockquote>
<p>
Had this possibility occurred to Kapor? My guess is probably not. But once again, the all-seeing Google has rewritten the rules. Identity is woven deeply into texts, and many texts are now public. I am sure we will soon see applications of the Google API that automate what Garrett did -- in other words, that answer the question: &quot;Does Google clearly identify who owns (or is closely bound to) the words in this document?&quot; Whether Kapor would have used that application, and if so whether he would have further obscured the text, only he can say. Certainly the PR person responsible for this misguided effort would like to have done so!
</p>
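<p>
Here is a rough Python sketch of how such an application might work: break the text into short phrases, ask a search service how many pages contain each one, and report the rarest. No real search API is called. The <tt>fake_hit_count</tt> function and its numbers are made up, and a real version would also need to strip punctuation.
</p>
<pre class="code" lang="python">
def phrases(text, n=3):
    """Yield every run of n consecutive words in text."""
    words = text.lower().split()
    for i in range(len(words) - n + 1):
        yield " ".join(words[i:i + n])

def rarest_phrases(text, hit_count, n=3, top=3):
    """Rank the n-word phrases in text by how few pages contain them."""
    unique = set(phrases(text, n))
    # Sort by hit count first, then alphabetically so ties are stable.
    return sorted(unique, key=lambda p: (hit_count(p), p))[:top]

# Made-up counts standing in for a real search service.
FAKE_COUNTS = {"disruptive competitive advantage": 12}

def fake_hit_count(phrase):
    return FAKE_COUNTS.get(phrase, 1000000)

sample = "our platform delivers disruptive competitive advantage to customers"
print(rarest_phrases(sample, fake_hit_count, top=1))
</pre>
<p>
A phrase that appears on only a handful of pages is a fingerprint. If the same few pages turn up for several of a document's rarest phrases, the owner of the words is probably identified.
</p>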
</body>
</item> 

<item num="a783">
<title>Closing the loop on XHTML blog content</title>
<date>2003/08/27</date>
<body>
<p>
James Farmer asks about the difference between WYSIWYG XML and HTML editing:
</p>
<blockquote cite="James Farmer">
Micah says we need a WYSIWYG XML editing tool, um (me being naive here), what's wrong with WYSIWYG HTML tool? [<a href="http://radio.weblogs.com/0120501/2003/08/27.html#a333">James Farmer</a>]
</blockquote>
<p>
Here was Micah Alpern's response:
</p>
<blockquote cite="Micah Alpern">
The bottom line is we need a way to edit/create structure while editing formatting.  The high priests of XML want us to think about structure directly (this section should be H1 or H2), when most humans think in formatting (18 point font vs 9 point).  WYSIWYG HTML editors have gotten us some of the way there, but the structure behind HTML is limited and co-mingles the presentation.
<br/>
<br/>
While as users we often want to manipulate the structure and the presentation at the same time it's important that within the underlying representation (the code behind the WYSIWYG editor) these two layers remain separate. 
[<a href="http://www.alpern.org/weblog/2003/08/26.html#a785">Micah Alpern</a>]
</blockquote>
<p>
I noticed these remarks because I've recently upgraded to Firebird 0.6.1 (from 0.6) in order to try out <a href="http://jake.userland.com/2003/08/12.html#a857">Jake Savin's Midas implementation</a> for Radio UserLand. What I'd forgotten about Midas is that, unlike <a href="http://mozile.mozdev.org/">Mozile</a>, it doesn't produce XHTML. I can probably coerce its output to XHTML using <a href="http://tidy.sourceforge.net/">Tidy</a>, and may do so because Midas is the more powerful of the two as an editor. 
</p>
<p>
Meanwhile, though, I got to thinking about why I'm writing XHTML content in the first place. I laid out the case in <a href="http://weblog.infoworld.com/udell/2003/04/16.html#a667">this article</a>, but although I've been steadily accumulating well-formed content since then, I hadn't gotten around to mining it. 
</p>
<p>
Today I took the plunge, starting with this UserTalk fragment: 
</p>
<pre class="code" lang="usertalk">
local(s);
for i = sizeof(weblogdata.posts) - 112 to sizeof(weblogdata.posts) 
  {
  s = s + table.tableToXml(@weblogdata.posts[i]);
  };
file.writeWholeFile( &quot;export.xml&quot;, s ) 
</pre>
<p>
In other words, 112 postings ago I began requiring myself to post well-formed XHTML. This snippet exports those postings as XML. 
</p>
<p>
Of course Murphy struck immediately. Radio was not expecting me to store well-formed XHTML, so it escaped all my entries. I was able to undo that escaping with this transformation:
</p>
<pre class="code" lang="xslt">
&lt;?xml version=&quot;1.0&quot;?&gt; 
&lt;xsl:stylesheet 
  xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot; version=&quot;1.0&quot;
 &gt;
&lt;xsl:output method=&quot;xml&quot; indent=&quot;yes&quot; encoding=&quot;us-ascii&quot;/&gt;
&lt;xsl:template match=&quot;node() | @*&quot;&gt;
  &lt;xsl:copy&gt;
    &lt;xsl:apply-templates select=&quot;@* | node()&quot;/&gt;
  &lt;/xsl:copy&gt;
&lt;/xsl:template&gt;
&lt;xsl:template match=&quot;body&quot;&gt;
&lt;body&gt;
&lt;xsl:apply-templates/&gt;
&lt;/body&gt;
&lt;/xsl:template&gt;
&lt;xsl:template match=&quot;text()&quot;&gt;
&lt;xsl:value-of disable-output-escaping=&quot;yes&quot; select=&quot;.&quot;/&gt;
&lt;/xsl:template&gt;
&lt;/xsl:stylesheet&gt;
</pre>
<p>
But not completely. The problem is that when I write an entry, I distinguish between markup tags, such as &lt;p&gt;, and non-markup, such as &amp;lt;description&amp;gt;. Radio's escaping eliminated that distinction. So after recovering my XML, I had to do a combination of scripted and manual fixup to restore it. Ugh. Going forward, I'll either have to convince Radio not to escape my stuff, or else maintain new items in the XML file I've extracted from Radio.
</p>
<p>
Anyway, the point of all this is to be able to blend style tags that make sense to ordinary users with structural cues that can facilitate intelligent search and recombination of content. Here's a <a href="http://weblog.infoworld.com/udell/gems/blogsearch.html">search example</a>. It's similar to some others I've done recently, and relies on the ability of MSIE or Mozilla to suck in XML and dynamically restructure it based on XPath search. It's not optimal for client-side use over the Web, since the first search hauls in .5MB of XML. Obviously a server-side implementation can be easily done as well, if needed.
</p>
<p>
So this closes the loop for me. Now when I add a CSS class attribute to an element -- like 'quotation' or 'minireview' -- I can think about it in two ways. As a writer, I'll assign some appropriate style to it. As a reader, I'll be able to filter my whole blog down to just elements of that class. Or to subsets of the class. For example, the XSLT and UserTalk fragments in this item are found by the corresponding canned XPath queries in the example.
</p>
<p>
This is not, in itself, very interesting to other people, though it's incredibly helpful to me as the author of my own blog. As it stands, of course, the cost/benefit ratio is way out of whack for most people. I'm willing to jump through hoops to make this happen, because I can and because I see the value of it. What I envision, though, is that a Midas-like thingy (tweaked to save as XHTML, and to integrate its CSS awareness with that of the host blog tool) could be used by lots of folks to enrich their blogs with named styles. If those blogs then flow their XHTML content out through RSS, we have a way to close the loop on a grander scale. Should people decide that a 'minireview' is a cool kind of blog element, they can use CSS styling to distinguish them visually. Meanwhile, as a secondary benefit, aggregators can collect and recombine these elements.
</p>
<p>
There are too many moving parts here, I realize, and it's going to be hard to get this whole concept over the activation threshold. But I'm ever hopeful!
</p>
</body>
</item> 

<item num="a782">
<title>Exploration and discovery</title>
<date>2003/08/26</date>
<body>
<p>
This week's <a href="http://www.infoworld.com/article/03/08/22/33OPstrategic_1.html">column on dynamic languages</a>, and its associated <a href="http://weblog.infoworld.com/udell/2003/08/25.html#a780">blog entry</a>, provoked some interesting reactions. From Don Box:
</p>
<blockquote cite="Don Box">
Jon Udell's recent post on using dynamic languages to work with XML and web services echos a common meme.  
<br/>
<br/>
More and more, however, I'm less convinced the problem is with languages per se but rather is related to how dynamic linking and binding works in most modern object systems.
<br/>
<br/>
More often than not, the developer needs to make some assumptions about the expected shape of a given data structure.
<br/>
<br/> 
The problem is that we bake a fair amount of non-semantic information into the way their code compiles against those assumptions.
<br/>
<br/> 
I think we can do way better than the current state of the practice.
<br/>
<br/> 
I don't think the answer is for everyone to adopt Perl - if that's our only hope let's give up now. [<a href="http://www.gotdotnet.com/team/dbox/default.aspx?key=2003-08-26T04:06:38Z">Don Box's spoutlet</a>]
</blockquote>
<p>
To which Patrick Logan responded:
</p>
<blockquote cite="Patrick Logan">
I think Don is half right. It *is* about how dynamic linking and binding works in a system. But how that works is intimately associated with the language design.
<br/>
<br/> 
Have you ever used an interactive environment for a Pascal-like language? It is a very different experience from using one for Lisp, Smalltalk, or Python.
<br/>
<br/> 
I won't mention Perl. That's the red herring for static typing loyalists.  [<a href="http://patricklogan.blogspot.com/2003_08_24_patricklogan_archive.html#106191509346137314">Making it stick</a>]
</blockquote>
<p>
In this discussion, Perl is a red herring in more ways than one. First, there's the ongoing confusion between two axes of typing -- strong versus weak, and dynamic versus static. For example, both Perl and Python are dynamic, in the sense that you need not declare a type when first assigning to a variable. But while Perl's typing is weak -- it will quietly treat the string &quot;2&quot; as a number when you add to it -- Python's is strong. Once a variable has a value, Python cares very deeply about what its type says that thing can or can't do. Bruce Eckel's <a href="http://www.mindview.net/WebLog/log-0025">assertion to the contrary</a> <a href="http://www.artima.com/weblogs/viewpost.jsp?thread=7590">raised hackles</a> in the Python community. Dynamic typing and strong typing are orthogonal.
</p>
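<p>
A few lines of Python make the two axes concrete. The snippet is only an illustration: rebinding a name to a value of another type is allowed (dynamic), but mixing a string and a number in one expression is refused rather than silently coerced (strong). The equivalent addition in Perl would yield 4.
</p>
<pre class="code" lang="python">
# Dynamic: a name can be rebound to a value of a different type.
x = 42
x = "forty-two"

# Strong: values are not silently coerced across types.
try:
    result = "2" + 2
except TypeError:
    result = "TypeError"

# The conversion has to be spelled out.
total = int("2") + 2

print(type(x).__name__, result, total)
</pre>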
<p>
But there's another sense in which Perl is a red herring here. Perl isn't interactive in the manner of Python -- or, for that matter, VB6 as compared to VB.NET. Those who have resisted adopting VB.NET have sometimes been characterized as knuckle-scraping Neanderthals who must be dragged kicking and screaming into the modern OOP era, or else left behind. But while the .NET Framework has much to offer, I think the VB6 crowd are right to demand a more interactive way to use it. As programming increasingly relies on external services and alien environments, it becomes as much a game of exploration and discovery as of design and specification. I think dynamic languages and interactive programming environments help make us better explorers and discoverers, and I think that's only going to matter more as time goes on.
</p>
</body>
</item> 

<item num="a781">
<title>Bootstrapping location-based services</title>
<date>2003/08/25</date>
<body>
<p>
<a href="http://www.irishcarrentals.us/roadsigns.html">
<img vspace="6" hspace="6" alt="irish road signs" align="right" src="http://weblog.infoworld.com/udell/gems/irishSpeedLimit.gif"/>
</a>
Sean McGrath has a great idea for bootstrapping location-based services in Ireland:
</p>
<blockquote cite="Sean McGrath">
So, the Irish Government is <a href="http://www.rte.ie/news/2003/0825/traffic.html">overhauling</a> the speed limit signs. Every town/village in the country has at least one speed limit sign. So, while changing it for the new system, allocate each one a unique four character code. Then, tie the location of the road sign to the code on the web and you have a very simple, very cheap way to deploy location based services e.g. where am I, where is the nearest hospital to me, how far is it to Sligo town, whatever. Just get people to jot down the 4 letter code and then they can use that to find your website etc. Why not? [<a href="http://seanmcgrath.blogspot.com/2003_08_24_seanmcgrath_archive.html#106182021978384365">Sean McGrath</a>]
</blockquote>
<p>
Undoubtedly the folks responsible for the design of <a href="http://www.irishcarrentals.us/roadsigns.html">Irish road signs</a> would raise concerns. Will a four-character code be legible? Will it interfere with the primary function of the sign? Still, this is a brilliant suggestion that merits serious consideration. 
</p>
<p>
Internet-related signage has always been problematic. URLs tend not to work well on signage, for the same reason they tend not to work well in spoken discourse. I noticed the other day that the police cars in my town sport a dot-com-era relic: http://www.ci.keene.nh.us/police. I find this www.ci.CITY.STATE.us naming convention to be utterly non-mnemonic, though perhaps that's because so few meaningful services attach to these URLs that they just haven't had a chance to sink in. 
</p>
<p>
The <a href="http://tinyurl.com/">TinyURL-like</a> compression achieved by Sean's proposed four-character code is one factor that makes me think this idea could be workable. Another and perhaps more compelling factor is that the codes need not function only as Internet addresses; they could also enter ordinary discourse as a way for people to express locations on highways more granularly than &quot;five point two miles past exit 37.&quot; 
</p>
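<p>
As a back-of-the-envelope check on the idea, the Python sketch below counts how many four-character codes are available and shows the kind of lookup a service could do once codes are tied to coordinates. The 36-character alphabet is an assumption, and the sign codes and coordinates are made up for the example.
</p>
<pre class="code" lang="python">
import math

ALPHABET_SIZE = 36          # assuming letters A-Z plus digits 0-9
CODE_SPACE = ALPHABET_SIZE ** 4

# Made-up sign codes and approximate coordinates (latitude, longitude).
SIGNS = {"SL01": (54.27, -8.47), "GW07": (53.27, -9.05)}
HOSPITALS = {"Sligo": (54.27, -8.46), "Galway": (53.28, -9.07)}

def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points (haversine)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

def nearest_hospital(code):
    # Look up where the sign stands, then pick the closest hospital.
    here = SIGNS[code]
    return min(HOSPITALS, key=lambda name: distance_km(here, HOSPITALS[name]))

print(CODE_SPACE)
print(nearest_hospital("SL01"))
</pre>
<p>
With roughly 1.7 million combinations, four characters leave plenty of room for every sign in a country, even after dropping easily confused characters.
</p>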
<p>
Terrific idea, Sean! I'll bet there are similar opportunities latent in many classes of signage. When upgrades do occur, it would be great if designers were sensitive to those opportunities.
</p>
</body>
</item> 

<item num="a780">
<title>Dynamic languages and virtual machines</title>
<date>2003/08/25</date>
<body>
<blockquote cite="Jon Udell">
What about robustness? In a world where computation lived within a single VM, strong type-checking and bytecode verification may have been reasons to prefer languages such as C# and Java. But we don't live in that world any more. Computation is distributed; interfaces are language neutral and document oriented; cross-domain trust is a work in progress. In these circumstances, dynamic languages -- which neither the Java nor .Net VMs yet fully embrace -- may be the best way to tame the services network we are now constructing.
[Full story at <a href="http://www.infoworld.com/article/03/08/22/33OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
<p>
In a <a href="http://www.sauria.com/blog/2003/08/23#507">comment</a> on this article, Ted Leung asks: &quot;Web services networks are networks of loosely coupled services. Part of the flexiblity of dynamic languages results in looser coupling in programs. Is this what Jon was after?&quot;
</p>
<p>
Yes, in part. There are so many facets to this issue, though. One that I didn't allude to in the column, but think about a lot, is the role that I have long believed dynamic languages should play in the rapid development of what we are now calling &quot;rich Internet clients,&quot; or what Adam Bosworth has recently been calling the <a href="http://www.adambosworth.net/archives/000005.html">Web services browser</a>. It's always been interesting to me that in the early years of the Web, the terms &quot;Perl script&quot; and &quot;cgi-bin&quot; were often synonymous. The server-side Web was programmable, and a dynamic language was the way you did that programming. Those of us who saw application lifecycles shrink from years to days, or from months to hours, got a thrilling taste of what it could be like to evolve software in near-realtime, in response to immediate feedback from users. 
</p>
<p>
The client-side Web was programmable too, but in ways that have only recently begun to live up to the original promise. Meanwhile, our communication clients -- email (and for some of us, nowadays, RSS) -- remain fixed-function monoliths that are, to this day, extended mainly by C, C++, or maybe nowadays Java or C# programmers. 
</p>
<p>
I do think that dynamic languages can help us tame the complexity of server-to-server communication on Web services networks. The reason: we simply don't know which arrangements will prove workable and which won't, and we'll need highly productive RAD tools to plow through lots of experiments. By the same token, we know very little about how to best enable people to interact with Web services. Here too, we'll need all the productivity and flexibility we can get.
</p>
</body>
</item> 

<item num="a779">
<title>Chris Brumme's blog</title>
<date>2003/08/22</date>
<body>
<p>
Microsoft senior developer Chris Brumme doesn't post often to his <a href="http://blogs.gotdotnet.com/cbrumme/">weblog</a>, but every one of his essays is a lengthy, authoritative, and candidly self-critical exploration of .NET and CLR arcana, the sort of thing you might expect to read on MSDN (minus the self-criticism, that is). And in fact, the absence of this material from MSDN is controversial. Back in June, Dare Obasanjo <a href="http://www.haloscan.com/comments.php?user=scoble&amp;comment=3249#2192">complained</a> about that. Robert Scoble's response was:
</p>
<blockquote cite="Robert Scoble">
Publishing is too hard for many Microsoft employees. Blogging makes it easy. Would Chris even bother if he needed to figure out who was responsible for publishing stuff like his over at MSDN? Would Chris bother if he needed to have three meetings just to get his stuff approved to post up? I wouldn't. I'm not gonna publish on microsoft.com or msdn.com unless I have to. The process is just too daunting...Think that most of Microsoft's 55,000 employees know how to get something through the publishing system at MSDN? I don't think so. Blogs take up the slack. [<a href="http://radio.weblogs.com/0001011/2003/06/09.html#a3264">The Scobleizer weblog</a>]
</blockquote>
<p>
To which Pete Cole responded:
</p>
<blockquote cite="Pete Cole">
Errr, as a stupid sap paying $1000s for MSDN subscription I would rather that a company the size of Microsoft SORTED ITSELF OUT - please explain to me why I should even have to answer the question of which I would rather he do? If the MSDN people are a pain in the butt, then management should sort them out. 
<br/>
<br/>
The trouble for me is that the API surface I write against is documented neither on MSDN nor the Web - I spent my life in a haystack of needles looking for the right one to put the thread through. [<a href="http://www.profundis.co.uk/peteblog/2003/06/11.html#a472">Pete Cole's weblog</a>]
</blockquote>
<p>
So premium content's path of least resistance is the blog, not the premium channel. Pete's right to suggest that if there's going to be a premium channel, it ought to figure out how to be the path of least resistance for premium content like Chris Brumme's. Of course, I'd hate to see Chris' voice vanish from the public scene. Wednesday's edition of his blog, which featured a lengthy analysis of how to shut down a managed application, concluded with a &quot;security addendum&quot; that reads in part:
</p>
<blockquote cite="Chris Brumme">
I haven't blogged in about a month. That's because I spent over 2 weeks (including weekends) on loan from the CLR team to the DCOM team.  If you've watched the tech news at all during the last month, you can guess why.  It's security.
<br/>
<br/>
From outside the company, it's easy to see all these public mistakes and take a very frustrated attitude. When will Microsoft take security seriously and clean up their act?  I certainly understand that frustration.  And none of you want to hear me whine about how it's unfair.
<br/>
<br/>
The company performed a much publicized and hugely expensive security push.  Tons of bugs were filed and fixed.  More importantly, the attitude of developers, PMs, testers and management was fundamentally changed.  Nobody on our team discusses new features without considering security issues, like building threat models.  Security penetration testing is a fundamental part of a test plan.
<br/>
<br/>
Microsoft has made some pretty strong claims about the improved security of our products as a result of these changes.  And then the DCOM issues come to light.
<br/>
<br/>
Unfortunately, it's still going to be a long time before all our code is as clean as it needs to be.
<br/>
<br/>
Some of the code we reviewed in the DCOM stack had comments about DGROUP consolidation (remember that precious 64KB segment prior to 32-bit flat mode?) and OS/2 2.0 changes.  Some of these source files contain comments from the 80s.  I thought that Win95 was ancient! [<a href="http://blogs.gotdotnet.com/cbrumme/PermaLink.aspx/dac5ba4a-f0c8-42bb-a5cf-097efb25d1a9">.Net notes</a>]
</blockquote>
<p>
It's easy to throw rocks at a faceless monolith. It's harder to throw them at a human face speaking with a human voice. I can only guess at the struggle that must be going on inside Microsoft, these days, between those who seek to control the message (a legitimate and necessary business instinct!) and those who want credible and candid voices to be heard directly. I'm not a huge fan of the book genre that chronicles high-tech corporate intrigue, but when this story is finally told I'll be fascinated to read it.
</p>
</body>
</item> 

<item num="a778">
<title>Acrobat and InfoPath</title>
<date>2003/08/21</date>
<body>
<blockquote cite="Jon Udell">
Look at Adobe's <a href="http://www.adobe.com/products/server/readerextensions/pdfs/incometaxform.pdf">interactive income tax form</a>. That document is licensed, by the Document Server for Reader Extensions, to unlock the form fill-in and digital signature capabilities of the reader. Filling in a form and then signing it digitally is an eye-opening experience. It's more interesting now that the form's data is schema-controlled and, Myers adds, can flow in and out by way of WSDL-defined SOAP transactions. The only missing InfoPath ingredient is a forms designer that nonprogrammers can use to map between schema elements and form fields. That's just what the recently announced Adobe Forms Designer intends to be. I like where Adobe is going. The familiarity of paper forms matters to lots of people. And unless Microsoft's strategy changes radically, those folks are far likelier to have an Adobe reader than an InfoPath client. [Full story at <a href="http://www.infoworld.com/article/03/08/15/32OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
<p>
Among the comments I've received on this piece was one from Philip Brittan, chairman of <a href="http://www.droplets.com/">Droplets</a>, who pointed to an earlier java.net blog entry that says in part:
</p>
<blockquote cite="Philip Brittan">
The question on everyone's tongue now is how these products [Acrobat and InfoPath] will compete with each other. A deeper question is how they will compete with HTML/XForms and whether they will indeed progress towards being full application delivery platforms. It seems that there is market pressure for a platform to provide a continuum of capabilities from document publishing to application delivery. Maybe docs, forms, and apps are really all meant to be the same thing. But how we'll achieve that is still far from clear. [<a href="http://weblogs.java.net/pub/wlg/320">java.net</a>]
</blockquote>
<p>
Well said. In response to those who ask whether I think Acrobat will prove to be a better form designer and info-gathering tool than InfoPath, I'd say two things. First, of course, neither is shipping. Second, and more important, they aim at different targets. Although I despise paper forms, our business culture is deeply rooted in paper and will be for a long time. Interactive digital paper is a necessary bridge technology. Meanwhile, new workflows are emerging that have no ties to the world of paper and printers. So I actually see Acrobat and InfoPath as complementary. Of course Philip's point is well taken. Neither Acrobat nor InfoPath is a first-class native citizen of the Web. I'd love to see both move in that direction. Adobe can help by making SVG easier to deploy and use. Microsoft can help by making schema-aware data gathering easier to deploy and use. There's plenty of headroom for commercial products based on these technologies, but they need foundations on which to stand.
</p>
</body>
</item> 

<item num="a777">
<title>Too close for comfort</title>
<date>2003/08/21</date>
<body>
<p>
<img vspace="6" hspace="6" align="right" width="324" height="354" src="http://weblog.infoworld.com/udell/gems/seaotter.jpg"/>
I've just returned from vacation, one of the highlights of which was paddling around Monterey Bay in a sea kayak. It was a real treat for me and my family to hang out in close proximity to the sea lions, seals, and sea otters. For the woman shown in this picture, though, things got way too close for comfort. This sea otter jumped into her kayak, gnawed on her life jacket, and gave her quite a scare.
</p>
<p>
I told her companion I'd post the photo, so here it is. Not much of a technical hook for this item, unless you count the fact that Google was the easiest way for her companion to find me afterward so I could point him to this picture. Floating in kayaks without paper or pencil, I just said &quot;Google me, and write to the address you find.&quot; He did, and SpamAssassin and SpamBayes had no trouble distinguishing his message from the thousands of Sobig.F attacks. So the spam shields held up nicely, but I wonder if next time I kayak in Monterey Bay I'll need a critter shield.
</p>
</body>
</item> 

<item num="a776">
<title>Ratholes and training wheels</title>
<date>2003/08/12</date>
<body>
<p>
Tim Bray speaks bluntly about &quot;what over in the <a href="http://www.w3.org/2001/tag">W3C TAG</a> we refer to as a 'rat-hole'&quot;:
</p>
<blockquote cite="Tim Bray">
At the end of the day, markup is just a bunch of labels. We should be grateful that XML makes them (somewhat) human-readable and internationalized, and try to write down what we want them to mean as clearly as and cleanly as we can, with a view to the needs of the downstream implementors and users.
<br/>
<br/>
But we shouldn't try to kid ourselves that meaning is inherent in those pointy brackets, and we really shouldn't pretend that namespaces make a damn bit of difference. [<a href="http://www.tbray.org/ongoing/When/200x/2003/08/11/SymbolGrounding">ongoing</a>]
</blockquote>
<p>
Well, that's my gut feeling too. But when the W3C director writes an <a href="http://www.sciam.com/print_version.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21">article in Scientific American</a> suggesting otherwise, it calls for a bit of due diligence. I'm satisfied for now, though, on two key points: people say what the labels mean, and the tools we have for processing the labels are not too shabby.
</p>
<p>
Speaking of tools, <a href="http://www.snee.com/bob/">Bob DuCharme</a> has done a handy <a href="http://www.snee.com/xml/nscomments.html">namespace visualizer</a>. Here's how it parses <a href="http://www.w3.org/2000/06/webdata/xslt?xslfile=http%3A%2F%2Fwww.snee.com%2Fxml%2Fnscomments.xsl&amp;xmlfile=http%3A%2F%2Fwww.snee.com%2Fxml%2Ftb.xml&amp;transform=Submit">Tim Bray's example</a>, and here's how it parses <a href="http://www.w3.org/2000/06/webdata/xslt?xslfile=http%3A%2F%2Fwww.snee.com%2Fxml%2Fnscomments.xsl&amp;xmlfile=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2Frss.xml&amp;transform=Submit">my RSS feed</a> (Firebird users: do View Source in this case). Thanks, Bob! That's just the kind of <a href="http://weblog.infoworld.com/udell/2003/08/09.html#a774">training wheel</a> I had in mind.
</p>
</body>
</item> 

<item num="a775">
<title>Symbol grounding and extensible aggregators</title>
<date>2003/08/11</date>
<body>
<p>
Last week's items about RDF evoked lots of feedback. First, Patrick Logan:
</p>
<blockquote cite="Patrick Logan">
This is beyond RSS, beyond RDF, even beyond XML.
<br/>
<br/>
This is known as <a href="http://www.ecs.soton.ac.uk/~harnad/Papers/Harnad/harnad90.sgproblem.html">The Symbol Grounding Problem</a>.
<br/>
<br/>
Namespaces allow you to create unique (enough) symbols. There is no consistent way to interpret them. All XML-based standards should fully support namespaces. The minimum acceptable standards should support distinguished symbols from among various standards.
<br/>
<br/>
What those mixtures &quot;mean&quot; in any specific context will be, well, context dependent. All we can hope for from XML alone is an arrangement of symbols. Whoever tells us of the arrangement will also have to tell us how to interpret the arrangement. [<a href="http://patricklogan.blogspot.com">Patrick Logan</a>]
</blockquote>
<p>
The bibliography in the article Patrick refers to is full of names familiar to me: Chomsky, Dennett, Fodor, Haugeland, Miller, Minsky, Newell, Penrose, Pylyshyn. I've read all these writers, and have had a decades-long fascination with the relationship between human and computer languages. Will we ever figure out a useful mapping between the two? I hope so, but I won't make a short-term bet either way. Therefore, it seems to me, we need a strategy that doesn't depend on the outcome of that bet. In that vein, I found this exchange on the Atom mailing list to be noteworthy:
</p>
<blockquote cite="Danny Ayers">
One thing about the syntax that concerns me greatly is that there doesn't yet appear to be a consistent way of interpreting material from other namespaces. I believe this to be a make or break issue for interop. [<a href="http://www.imc.org/atom-syntax/mail-archive/msg00013.html">Danny Ayers writing on the atom-syntax mailing list</a>]
</blockquote>
<blockquote cite="Tim Bray">
This problem has never been solved in the general case that I know of. So I really hope that you're wrong on its make-or-break-ness. Worth taking a whack at, but don't underestimate the difficulty. [<a href="http://www.imc.org/atom-syntax/mail-archive/msg00013.html">Tim Bray writing on the atom-syntax mailing list</a>]
</blockquote>
<p>
Meanwhile, Bill de hÓra, responding to my question -- &quot;Shouldn't we then say, there is no reason to create any mixed-namespace XML document that is not RDF?&quot; -- writes:
</p>
<blockquote cite="Bill de hÓra">
Yes we should say that, but that would be saying the Emperor has no clothes. Does anyone want to hear it? [<a href="http://www.dehora.net/journal/archives/000349.html#000349">Bill de hÓra</a>]
</blockquote>
<p>
Absolutely. If there's a naked emperor's butt flapping in the breeze, I definitely do want to know about it. In my experience, though, XPath search and XSLT transformation are quite effective, and Bill seems to agree:
</p>
<blockquote cite="Bill de hÓra">
Web services for the most part are predicated on XML namespaced vocabularies, as are any number of behind the firewall integration efforts. In those worlds, there's historically been zero agreement on uniform content models, which is precisely why transformation is such an effective technology for integrating systems. Get the data into XML and start pipelining. And though neither the declarative or the API/RPC school of integration may like the idea of chaining processes with XML, in my and my employer's experience, the results speak for themselves. In truth, XML Namespaces are incidental to a transformation architecture. [<a href="http://www.dehora.net/journal/archives/000349.html#000349">Bill de hÓra</a>]
</blockquote>
<p>
But Bill also believes that a uniform content model is the best strategy, and he defines it in an interesting way:
</p>
<blockquote cite="Bill de hÓra">
By the way, if you don't like all the semantic web stuff that RDF is associated with, here's another way of looking at it. Think of RDF as a CVM, a Content Virtual Machine, out of which any content can be described and by which content codecs can interoperate, by sharing a uniform view of the data. That's all there really is to RDF - an instruction set for content description. This is no more naive a view than Java's WORA. [<a href="http://www.dehora.net/journal/archives/000349.html#000349">Bill de hÓra</a>]
</blockquote>
<p>
<a href="http://safari.oreilly.com/0596002637">
<img alt="Practical RDF" align="right" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/practicalRDF.gif"/>
</a>
I find this formulation very appealing in the abstract. I'm still not sure what it means concretely, though. To get a better picture of how the CVM works, I read Shelley Powers' very well-written new book, <a href="http://safari.oreilly.com/?XmlId=0-596-00263-7">Practical RDF</a>. I read it online, actually. Very cool to be able to do that. (<a href="http://weblog.infoworld.com/udell/2003/08/07.html#a772">Tank, I need a pilot program for a B-212 helicopter.</a>) My eyelids fluttered for a while, and when I opened them again it was <a href="http://safari.oreilly.com/?XmlId=0-596-00263-7/pracrdf-CHP-10">Chapter 10: Querying RDF: RDF as Data</a> that emerged as pivotal. Let's look at an example:
</p>
<pre>
SELECT ?value
WHERE (?x, &lt;pstcn:presentation&gt;, ?resource),
(?resource, &lt;pstcn:requires&gt;, ?resource2),
(?resource2, &lt;pstcn:type&gt;, &quot;stylesheet&quot;),
(?resource2, &lt;rdf:value&gt;, ?value)
USING pstcn FOR &lt;http://burningbird.net/postcon/elements/1.0/&gt;,
      rdf FOR &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#&gt;

The result from running this query is:

http://burningbird.net/de.css
</pre>
<p>
The backstory here is that a resource -- and the running example through the book is <a href="http://www.burningbird.net/articles/monsters1.htm">this article</a> (monsters1.htm) -- has an <a href="http://www.burningbird.net/articles/monsters1.rdf">RDF description</a> (monsters1.rdf) based on an RDF vocabulary, called PostCon, whose development the book demonstrates. When you run monsters1.rdf through an <a href="http://www.w3.org/RDF/Validator">RDF parser</a> (e.g. http://www.w3.org/RDF/Validator) -- try it <a href="http://www.w3.org/RDF/Validator/ARPServlet?RDF=%3C%3Fxml+version%3D%221.0%22%3F%3E%0D%0A%3Crdf%3ARDF+xmlns%3Ardf%3D%22http%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23%22%0D%0A++xmlns%3Adc%3D%22http%3A%2F%2Fpurl.org%2Fdc%2Felements%2F1.1%2F%22%3E%0D%0A++%3Crdf%3ADescription+rdf%3Aabout%3D%22http%3A%2F%2Fwww.w3.org%2F%22%3E%0D%0A++++%3Cdc%3Atitle%3EWorld+Wide+Web+Consortium%3C%2Fdc%3Atitle%3E+%0D%0A++%3C%2Frdf%3ADescription%3E%0D%0A%3C%2Frdf%3ARDF%3E%0D%0A++&amp;PARSE=Parse+URI%3A+&amp;URI=http%3A%2F%2Fwww.burningbird.net%2Farticles%2Fmonsters1.rdf&amp;TRIPLES_AND_GRAPH=PRINT_TRIPLES&amp;FORMAT=PNG_EMBED&amp;NODE_COLOR=Black&amp;NODE_TEXT_COLOR=Blue&amp;EDGE_COLOR=Darkgreen&amp;EDGE_TEXT_COLOR=Red&amp;FONT_SIZE=10&amp;ORIENTATION=LR">here</a> -- you get a list of subject-predicate-object triples. For example, the 57th triple says:
</p>
<table cellpadding="4" cellspacing="0">
<tr>
<td>subject</td>
<td>predicate</td>
<td>object</td>
</tr>
<tr>
<td>http://burningbird/articles/monsters1.htm</td>
<td>http://burningbird.net/postcon/elements/1.0/reason</td>
<td>&quot;Collapsed into Burningbird&quot;</td>
</tr>
</table>
<p>
The context (an intentionally loaded word!) of this triple is something like: &quot;The subject URL is partly described by an RDF vocabulary, PostCon, which can be used to track the history of its 'movement' -- that is, from one Web address to another. Whenever such a move occurs, there is a reason given. This triple gives the reason for one such movement.&quot;
</p>
<p>
Armed with this model, and with an understanding of the PostCon vocabulary, whose domain elements are detailed in <a href="http://safari.oreilly.com/?XmlId=0-596-00263-7/pracrdf-CHP-6-SECT-3">this section</a> of the book, we can see how the query works its way through the triples to answer the question: &quot;What CSS resource is required (in the PostCon sense) by the subject URL&quot;?
</p>
<p>
This is cool. RDF triples are relations, and here we see that they're amenable to relational processing. I can grok that.
</p>
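<p>
To convince myself of that, here is a minimal sketch, in JavaScript, of how a query like the one above can be answered by nothing more than pattern matching and joining over a list of triples. It is not how any particular RDF query engine works, and it is not code from the book. The triples are invented for illustration (only the de.css value comes from the quoted result), and the <tt>unify</tt> and <tt>query</tt> helpers are hypothetical names of my own.
</p>
<pre class="code javascript">
// Hypothetical triples, loosely modeled on the PostCon example. Only the
// de.css value is taken from the query result quoted above.
const triples = [
  ["monsters1.htm", "pstcn:presentation", "_:pres"],
  ["_:pres", "pstcn:requires", "_:req1"],
  ["_:req1", "pstcn:type", "stylesheet"],
  ["_:req1", "rdf:value", "http://burningbird.net/de.css"],
  ["_:pres", "pstcn:requires", "_:req2"],
  ["_:req2", "pstcn:type", "logo"],
  ["_:req2", "rdf:value", "http://example.org/logo.png"],
];

// Terms that begin with "?" are variables; everything else is a constant.
const isVar = (term) => term.startsWith("?");

// Try to match one pattern against one triple under the bindings so far.
// Returns the extended bindings, or null if the triple does not fit.
function unify(pattern, triple, bindings) {
  const next = { ...bindings };
  for (const [i, term] of pattern.entries()) {
    const value = triple[i];
    if (isVar(term)) {
      if (term in next) {
        if (next[term] !== value) return null;
      } else {
        next[term] = value;
      }
    } else if (term !== value) {
      return null;
    }
  }
  return next;
}

// Join the patterns left to right: each pattern extends or discards the
// candidate bindings produced by the patterns before it.
function query(patterns, data) {
  let solutions = [{}];
  for (const pattern of patterns) {
    const extended = [];
    for (const bindings of solutions) {
      for (const triple of data) {
        const next = unify(pattern, triple, bindings);
        if (next !== null) extended.push(next);
      }
    }
    solutions = extended;
  }
  return solutions;
}

const stylesheets = query(
  [
    ["?x", "pstcn:presentation", "?resource"],
    ["?resource", "pstcn:requires", "?resource2"],
    ["?resource2", "pstcn:type", "stylesheet"],
    ["?resource2", "rdf:value", "?value"],
  ],
  triples
).map((b) => b["?value"]);

console.log(stylesheets);
</pre>
<p>
With this sample data, the second pattern finds two required resources, the third pattern discards the one typed &quot;logo&quot;, and the only value left is the de.css URL. Each pattern acts like a relational selection, and a variable shared between two patterns acts like a join column.
</p>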
<p>
Now, back to this notion of the Content Virtual Machine. Commenting on my <a href="http://weblog.infoworld.com/udell/2003/08/06.html#a769">Plain Old Metadata</a> proposal, which focused on the idea of putting job-board postings into RSS as structured payloads, Danny Ayers wrote:
</p>
<blockquote cite="Danny Ayers">
Ok, so what happens when we need a vacation language? Right, build it all again from scratch, I'm sure those aggregator developers will welcome the opportunity to do virtually the same work all over again... [<a href="http://dannyayers.com/archives/001685.html">Danny Ayers</a>]
</blockquote>
<p>
This isn't just idle speculation. A very real situation looms for both RSS and Atom alike, as Ted Leung points out:
</p>
<blockquote cite="Ted Leung">
Jon Udell is writing about extending RSS 2.0, asking whether it should be done via namespaces or via RSS. Either way you do it, you've just entered the realm of extensible aggregators, because the jobs namespace is just the first of many that will come pouring through the gate once we open it. The question then becomes, how do you build an aggregator in such a way that we don't have download after download of new aggregator binaries, or aggregator extension/plugin binaries? [<a href="http://www.sauria.com/blog/2003/08/06#453">Ted Leung on the air</a>]
</blockquote>
<p>
Exactly. Now, what the RDF advocates appear to be saying is that if extensions show up as sets of RDF triples, then the problem is solved. An aggregator that can consume job-related triples already &quot;knows what to do with&quot; vacation-related triples. 
</p>
<p>
I'm with Patrick Logan here: you can't finesse the symbol grounding problem so easily. When I write an RDF query involving job-related and vacation-related RDF triples, I'll need to know which predicates exist in these vocabularies, what they are documented to mean, and how to construe operations that combine them.
</p>
<p>
I do absolutely see value in a common processing model, and I like the RDF style of triple-oriented querying. But I also like XPath-oriented querying, and I especially like the emerging styles of XQuery for cross-document joins in pure XML space, and SQL/XML for joins across relational and XML spaces. 
</p>
<p>
If the RDF folks have really solved the symbol grounding problem, I'm all ears. I'll never turn down a free lunch! If the claim is, more modestly, that RDF gives us a common processing model for content -- a Content Virtual Machine -- then I will assert a counter-claim. XML is a kind of Content Virtual Machine too, and XPath, XQuery, and SQL/XML are examples of unifying processing models. As we move into the realm of extensible aggregators we'll face the same old issues of platform support and code mobility. Nothing new there. However, as XQuery and SQL/XML move into the mainstream -- as is <a href="http://weblog.infoworld.com/udell/2003/07/30.html#a760">rapidly occurring</a> -- aggregator developers are going to find themselves in possession of new data-management tools that can combine and query structured payloads. Those tools will not, because they cannot, know <i>a priori</i> what those payloads mean. But they'll provide leverage, and will simplify otherwise more complex chores. I can't see the endgame, but for me this is enough to justify doing the experiment.
</p>
</body>
</item> 

<item num="a774">
<title>Namespace training wheels</title>
<date>2003/08/09</date>
<body>
<blockquote cite="Jon Udell">
In general, we don't have much experience creating and using simple XML vocabularies, never mind mixed ones. InfoPath, the first application making a serious bid to enable mainstream folks to routinely gather and use XML data, hasn't even shipped. I think the creators of InfoPath and similar tools -- who hope that use of modular XML vocabularies will turn out to be like riding a bicycle -- ought to provide some training wheels. [Full story at <a href="http://www.infoworld.com/article/03/08/08/31OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
<p>
In a brief item called <a href="http://www.tbray.org/ongoing/When/200x/2003/08/08/NamespaceOrNot">Namespace Pedantry</a>, Tim Bray points out that it's wrong to say, as I did in this column, that &quot;by default, every element in an XML document is assigned to the 'empty' namespace.&quot; The correct statement would be something like 'by default, elements and attributes are not in any namespace.' Fair enough. I don't object to pedantry, and I assume that careful distinctions exist for good reasons. I'm curious to know the reason in this case. I can imagine, for example, that if I wrongly refer to the e6 and a1 in Tim's example as &quot;in the empty namespace&quot; rather than correctly refer to them as &quot;not in any namespace,&quot; the flawed conceptual model could lead to errors in my ability to deal with these things. Is that so? And if so, what patterns of error are typical? <a href="http://radiocomments.userland.com/comments?u=100887&amp;p=774&amp;link=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2F2002%2F12%2F14.html%23a774">Comments</a>
</p>
<p>
I guess this all further illustrates the point of my column, which is that learning to ride the XML namespace bicycle requires a bit of help. An example of the kind of thing I have in mind is <a href="http://saxadapter.sourceforge.net/XMLNamespaceTutorial.html">this tutorial</a> by Mark Priest, which enumerates the namespace assignments in a sample document under various conditions. It strikes me that an online service that reports such assignments for any uploaded document would be extremely helpful.
</p>
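<p>
To make the idea concrete, here is a minimal sketch of such a report, in JavaScript. It is not Mark Priest's code or any real service. The element tree, the <tt>example.org</tt> namespace URIs, and the <tt>resolve</tt> and <tt>report</tt> helpers are all invented for illustration, and the tree is built by hand rather than parsed from XML text. It applies only the basic scoping rules: a declaration is in scope for the element that carries it and for that element's descendants, the default namespace applies to unprefixed elements but not to unprefixed attributes, and a name with no applicable declaration is not in any namespace (shown here as <tt>null</tt>).
</p>
<pre class="code javascript">
// A hand-built element tree (hypothetical), standing in for this document:
//   &lt;feed xmlns:j="http://example.org/jobs">
//     &lt;item id="1">&lt;j:salary j:currency="USD"/>&lt;/item>
//     &lt;note xmlns="http://example.org/notes">&lt;para/>&lt;/note>
//   &lt;/feed>
const doc = {
  name: "feed",
  attrs: { "xmlns:j": "http://example.org/jobs" },
  children: [
    {
      name: "item",
      attrs: { id: "1" },
      children: [
        { name: "j:salary", attrs: { "j:currency": "USD" }, children: [] },
      ],
    },
    {
      name: "note",
      attrs: { xmlns: "http://example.org/notes" },
      children: [{ name: "para", attrs: {}, children: [] }],
    },
  ],
};

// Resolve a qualified name against the bindings in scope. The default
// namespace is stored under the key "". A result of null means "not in
// any namespace". (A real parser would reject an undeclared prefix; this
// sketch simply reports null for it.)
function resolve(qname, scope, isAttribute) {
  const parts = qname.split(":");
  if (parts.length === 2) return scope[parts[0]] || null;
  if (isAttribute) return null;
  return scope[""] || null;
}

// Walk the tree, collecting one report line per element and attribute.
function report(node, inherited, lines) {
  const scope = { ...inherited };
  for (const [key, value] of Object.entries(node.attrs)) {
    if (key === "xmlns") scope[""] = value;
    else if (key.startsWith("xmlns:")) scope[key.slice(6)] = value;
  }
  lines.push("element " + node.name + " -> " + resolve(node.name, scope, false));
  for (const key of Object.keys(node.attrs)) {
    if (key === "xmlns" || key.startsWith("xmlns:")) continue;
    lines.push("attribute " + key + " -> " + resolve(key, scope, true));
  }
  for (const child of node.children) report(child, scope, lines);
  return lines;
}

const lines = report(doc, {}, []);
console.log(lines.join("\n"));
</pre>
<p>
For this sample tree the report shows <tt>feed</tt>, <tt>item</tt>, and the <tt>id</tt> attribute as <tt>null</tt>, <tt>j:salary</tt> and <tt>j:currency</tt> in the jobs namespace, and both <tt>note</tt> and its unprefixed child <tt>para</tt> in the notes namespace.
</p>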
<p>
<b>Update</b>: It's not just me. Clemens Vasters (whose new design looks like <a href="http://weblog.infoworld.com/udell/gems/clemensInFirebird.gif">this</a> in Firebird, BTW) also <a href="http://staff.newtelligence.net/clemensv/PermaLink.aspx?guid=47c865b9-48d4-473d-9c0e-a90a35f2bf52">tripped over</a> the &quot;empty&quot; vs. &quot;not-in-a&quot; namespace distinction. In any case, although Clemens suggests that RSS 2.0's lack of a top-level namespace is the issue, the controversy is broader. I mention in my column that Sean McGrath has raised general questions about namespace use. His <a href="http://www.itworld.com/nl/xml_prac/04112002/">column</a> on the subject, and related references embedded within it (<a href="http://lists.xml.org/archives/xml-dev/200204/threads.html">1</a>, <a href="http://lists.xml.org/archives/xml-dev/200204/msg00170.html">2</a>), elaborate. My conclusion, nevertheless, is not &quot;don't use namespaces&quot; but rather &quot;use training wheels when learning to ride the namespace bicycle.&quot;
</p>
</body>
</item> 

<item num="a773">
<title>An RSS/RDF epiphany</title>
<date>2003/08/08</date>
<body>
<p>
Some fascinating conversations have been weaving their way through blogspace and email in the last few days. As a result, I think I've reached a new understanding of the seemingly endless debate about whether and how to use RDF (Resource Description Framework) and RSS together. I mentioned <a href="http://www.w3.org/People/DanBri/">Dan Brickley's</a> comments the other day. He expands on his remarks over on Shelley Powers' blog:
</p>
<blockquote cite="Dan Brickley">
For me, this is all about data mixing, and it's the only real way I know how to do it in XML. I'm just used to it, maybe. With RDF I can take a couple of RDF documents and merge them by adding the triples together. I just don't know how to do that with non-RDF XML.
<br/>
<br/>
RDF's syntax is hard to learn, and the underlying triples model isn't obvious to those without a built-in mental RDF parser. But there's also truth in the concern that we don't really know how to freely inter-mix independently defined XML namespaces when those namespaces are defined with XML rather than RDF schema languages... So imho spending time trying to have best of both worlds isn't a waste. [Commenting in Shelley Powers' <a href="http://burningbird.net/cgi-bin/mt-comments.cgi?entry_id=1428">Practical RDF</a>]
</blockquote>
<p>
Arguably I should get a life :-), but for me this remark was an epiphany. I've long suspected that we won't really understand what it means to mix XML namespaces until we do some large-scale experimentation. What I hadn't fully appreciated, until just now, is the deep connection between RDF and namespace-mixing. Dan's original hard-line position, he now explains, was that there is no sane way to mix namespaces without some higher-order model, and that RDF is that model. That he is now modulating that position, and saying that none of us yet knows whether or not that is true, strikes me as both intellectually honest and potentially a logjam-breaker.
</p>
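<p>
Dan's point about merging by &quot;adding the triples together&quot; is easy to illustrate. What follows is a minimal sketch in JavaScript, not anything Dan wrote: the two documents, their <tt>example.org</tt> URIs, the <tt>jobs:</tt> prefix, and the <tt>mergeGraphs</tt> helper are all hypothetical. It treats each document as a list of subject-predicate-object triples and merges them as a set union. It also sidesteps blank nodes, which a real merge has to keep distinct between documents, by naming every subject with a URI.
</p>
<pre class="code javascript">
// Hypothetical triples from two independently written documents that
// happen to describe the same resource.
const jobsDoc = [
  ["http://example.org/item/1", "dc:title", "Programmer wanted"],
  ["http://example.org/item/1", "jobs:location", "Anytown"],
];
const feedDoc = [
  ["http://example.org/item/1", "dc:title", "Programmer wanted"],
  ["http://example.org/item/1", "dc:creator", "Jane Doe"],
];

// Merging is set union: walk every triple of every graph and keep one
// copy of each, keyed by its serialized form.
function mergeGraphs(...graphs) {
  const seen = new Map();
  for (const graph of graphs) {
    for (const triple of graph) {
      seen.set(JSON.stringify(triple), triple);
    }
  }
  return [...seen.values()];
}

const merged = mergeGraphs(jobsDoc, feedDoc);
console.log(merged.length); // 3: the shared dc:title triple is kept once
</pre>
<p>
Nothing in the merge step needs to know what <tt>jobs:location</tt> or <tt>dc:creator</tt> means. That is what makes the operation so easy, and it is also why it leaves open the question of how to interpret the result.
</p>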
<p>
Meanwhile, in email, <a href="http://www.betaversion.org/~stefano/">Stefano Mazzocchi</a> made this striking comment (which I hope he won't mind being quoted here):
</p>
<blockquote cite="Stefano Mazzocchi">
The mental model that XML promotes is basically a tree of couples.
<br/>
<br/>
The mental model that RDF promotes is basically a collection of triples.
<br/>
<br/>
Sounds familiar doesn't it? The Hierarchical vs. Relational war over 
again 30 years later?
</blockquote>
<p>
Indeed, it does. Stefano's formulation suggests to me that the troubled relationship between RSS and RDF may have been a red herring all along. Either we do or don't need some higher-order model to manage mixed namespaces sanely. Nobody knows yet. That the question arose in the context of RSS may simply have been an unfortunate historical accident -- RSS happened to be a likely candidate for the necessary large-scale experimentation, and got caught in the crossfire. 
</p>
<p>
Atom is headed into the same field of fire, but if I'm right in my analysis, this isn't about syndication at all. It's about the general question of using XML namespaces. And yet, again and again, RSS gets entangled in the discussion. Today for example, Patrick Phelan referred me to <a href="http://www.xml.com/lpt/a/2003/07/23/extendingrss.html">this XML.com article</a> in which <a href="http://dannyayers.com/archives/001685.html">Danny Ayers</a> writes:
</p>
<blockquote cite="Danny Ayers">
There's no consistent means of interpreting material from other namespaces that may appear in an RSS 2.0 document.
</blockquote>
<p>
To which I responded:
</p>
<blockquote cite="Jon Udell">
Shouldn't we then substitute XML for RSS 2.0 in that sentence, and say there is no consistent way to interpret material from other namespaces in any XML document, period?
<br/>
<br/>
Shouldn't we then say, there is no reason to create any mixed-namespace XML document that is not RDF?
<br/>
<br/>
This is the conclusion which it seems Dan Brickley is, recently, trying to avoid. I'm glad to see him raising the issue. This has been pigeonholed as an RSS thing for too long, it's really much larger I think.
</blockquote>
<p>
Given this analysis, Dave Winer's comment, over on Shelley's blog, also merits deep consideration: 
</p>
<blockquote cite="Dave Winer">
Jon, I'd add this -- the working from both ends towards the middle should take place away from ongoing commercial development. It would be like experimenting with space travel on the construction site for the Golden Gate Bridge. The purpose of the bridge might be confusing to the motorists.
</blockquote>
<p>
An excellent point. Over on his blog, commenting on my Plain Old Metadata proposal, Danny Ayers writes:
</p>
<blockquote cite="Danny Ayers">
He [me] also talks of &quot;plain old metadata&quot; - ok, how are we going to present this - in a random, inconsistent HTML tag soup kind of a fashion? Or shall we try and do it in a way that tries to maximise the potential utility of the data? [<a href="http://dannyayers.com/archives/001685.html">Danny's Raw Blog</a>]
</blockquote>
<p>
Well, I'm with Dan Brickley on this:
</p>
<blockquote cite="Dan Brickley">
Of these three paths for job-data-in-RSS, 'entity escape it and stuff it in the description', 'use non-RDF namespace extensions', and 'use RDF namespace extensions', to my mind only one of them stands out as clearly the worst way forward. [Commenting in Shelley Powers' <a href="http://burningbird.net/cgi-bin/mt-comments.cgi?entry_id=1428">Practical RDF</a>]
</blockquote>
<p>
What we have now is 'entity escape and stuff in description' and I doubt anyone will argue that's good. Like Dan, I don't know which of the other two options is best. I do know, however, that I can easily shred an XML document with XPath and XSLT, pick out subsets -- whether or not they're namespace-qualified -- and do useful things with them. I don't believe that doing that, without first settling on a higher-order semantic model, is a bad idea. Far from it. It's abundantly clear to me that we've wasted years, that we must do that experiment ASAP, and that it will yield new killer applications. No agreement on the higher-order model need be reached as a precondition. If some higher-order model is going to ultimately prevail, then a lot of existing data will have to get converted into it. Would you rather convert 'entity-escaped-and-stuffed-in-the-description' data, which is all we have now, or XML data that you can at least shred and manipulate? That choice seems transparently clear to me.
</p>
<p>
Finally, a plea to all concerned. Let's stop punishing RSS syndication for its success by asking it to carry the whole burden of XML usage in the semantic Web. 
</p>
</body>
</item> 

<item num="a772">
<title>Tank, I need a pilot program for a B-212 helicopter</title>
<date>2003/08/07</date>
<body>
<p>
<img vspace="6" hspace="6" align="right" alt="trinity" src="http://weblog.infoworld.com/udell/gems/trinity.jpeg"/>
Used to be, it was hard to find stuff out, especially in a hurry. Now Google makes it so easy to find stuff out in a hurry that I often feel like Trinity in that scene in The Matrix where she barks:
<blockquote cite="Trinity">
<i>
Tank, I need a pilot program for a B-212 <sup>1</sup> helicopter. 
</i>
</blockquote>
As a matter of fact, it is now possible to find out so much stuff so quickly that I found myself today imagining this alternate version of the scene:
</p>
<blockquote cite="imaginary">
<b>Trinity:</b> Tank, I need a pilot program for a B-212 helicopter. While you're at it, give me wind turbulence at 4th and 32nd, altitude 73 meters. Also calculate shear produced by a gatling gun firing continuously from the helicopter's bay. And run a simulation of Neo and Morpheus hanging from 3.5cm braided cable, I need to know if that cable will hold. Oh, and don't forget to factor in the rate of fuel loss with a couple of bullet holes in the fuel tank. Let's see, what else...
</blockquote>
<blockquote cite="imaginary">
<b>(Agent Smith's hands close around Trinity's neck.)</b>
</blockquote>
<blockquote cite="imaginary">
<b>Trinity:</b> Urp, gurgle.
</blockquote>
<p>
Don't get me wrong. Too much information is a good problem to have. Beats the hell out of <i>not</i> being able to call Tank and download the pilot program. But it does create an interesting new dilemma. If you like to be well-informed, as I do, it's getting harder than ever to draw the line and say: &quot;Enough research, time to act.&quot;
</p>
<hr/>
<p>
<sup>1</sup> Also variously reported on Web pages as: V-212, M-109... 
</p>
</body>
</item> 

<item num="a771">
<title>SpamBayes now accepting donations</title>
<date>2003/08/07</date>
<body>
<p>
A reader wrote yesterday to say:
<blockquote>
<i>
Your SPAMBayes recommendation alone has saved me hours, since I receive 10-15 SPAM per hour, 24x7. It's on the list of software that I'd pay for if I could.  I'll probably make a donation to the project in some form.
</i>
</blockquote>
Now you can. <a href="http://altis.pycs.net/2003/08/07.html#a86">Kevin Altis</a> notes that you can make a <a href="http://spambayes.sourceforge.net/donations.html">donation</a> to the <a href="http://www.python.org/psf/">Python Software Foundation</a>. &quot;All donations will go to the PSF,&quot; Kevin told me in email, &quot;but the donation is marked so that we know the donation was due to appreciation of SpamBayes.&quot;
</p>
<p>
Done!
</p>
<p>
<a href="http://weblog.infoworld.com/udell/gems/spambayesDonation.gif">
<img alt="spambayes donation" width="463" height="317" src="http://weblog.infoworld.com/udell/gems/spambayesDonation.gif"/>
</a>
</p>
</body>
</item> 

<item num="a769">
<title>Chicken and egg</title>
<date>2003/08/06</date>
<body>
<p>
When I look at today's Web, I see precious little metadata. We mine the scraps we have -- email addresses, URLs, HTML metatags -- for all they're worth. We know intuitively that with more and richer metadata, we could build more and richer applications. People much smarter than me imagine what it would be like if machines could &quot;reason&quot; about the things described with metadata. I'd love to see those people get the chance to do their experiment. So would Tim Bray, who also thinks the Web is &quot;terribly metadata-thin&quot; and has issued a challenge to produce a <a href="http://tbray.org/ongoing/When/200x/2003/05/21/RDFNet">killer app for RDF (Resource Description Framework)</a>.
</p>
<p>
But there's a chicken-and-egg problem. You can't do the RDF experiment until there are interesting amounts of metadata floating around. But if we say that all metadata that can benefit from RDF must first be expressed in RDF, it's kind of a non-starter. 
</p>
<p>
Why can't we first get a bunch of POM (plain old metadata) flowing through the system? Job feeds carrying metadata packets in namespaced payloads, for example? Once quantities of real data are in circulation, I'll bet the RDF gang could RDF-ify it, and then we can all learn -- finally -- what kinds of higher-order reasoning will become possible. 
</p>
<p>
Meanwhile, the POM would be darned useful in its own right. The difference between an opaque unstructured job item and a transparent structured job item is night and day. 
</p>
</body>
</item> 

<item num="a768">
<title>A killer app for RSS</title>
<date>2003/08/06</date>
<body>
<p>
Some feedback on yesterday's trial balloon:
</p>
<blockquote cite="Alf Eaton">
Jon Udell wants to put RDF into RSS. I'm not sure if it's a good idea (it's certainly ugly) - wouldn't it be better to rewrite the RDF to fit the RSS format, and keep a separate RDF feed for the pure data? [<a href="http://www.pmbrowser.info/hublog/archives/000430.html">HubLog</a>]
</blockquote>
<p>
Actually, I'm not saying that I <i>want</i> to put RDF into RSS. I'm trying to ask and answer two questions: 1) Is it feasible? and 2) What benefits would it confer? 
</p>
<p>
Here's Scott Reynen's reaction to the trial balloon:
</p>
<blockquote cite="Scott Reynen">
I don't really understand a lot of things about his example, and many of his answers to questions posted in the comments reveal that he doesn't either. [<a href="http://weblog.randomchaos.com/index.php?date=2003-08-06&amp;title=rss+for+job+feeds">randomchaos</a>]
</blockquote>
<p>
It's true. Until somebody proves otherwise, my gut feeling is that the combination of RSS 2.0 and a job-related namespace is the sweet spot. I wouldn't want to close the door on RDF, because where there's smoke there may be fire, and a lot of smart people are smoking RDF. Hence yesterday's exploration of the idea that RDF can intermix with non-RDF XML vocabularies. Dan Brickley says he's leaning toward the position that such mixtures can work. Great! I'll look forward to seeing what results may come from that approach.
</p>
<p>
(<b>Update:</b> From <a href="http://radiocomments.userland.com/comments?u=100887&amp;p=767&amp;link=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2F2002%2F12%2F14.html%23a767">yesterday's comments</a>, Shelley Powers: &quot;At no point did Dan mention that you can throw RDF and plain-vanilla XML together and have it 'work'.&quot; Fair enough. I know it's not that simple.)
</p>
<p>
Meanwhile, setting aside RDF for the moment, why don't the various job-related RSS applications deliver basic metadata (salary, location) in a format that can be parsed sanely? And why don't we yet see any RSS readers being extended to make use of such metadata? 
</p>
<p>
I submit that all the ingredients are in place. Job sites exist, and they do deliver metadata (albeit screen-scraping is required to recover it in useful form). RSS 2.0 feeds can be extended to include job-related metadata. RSS readers can be extended to do useful things with that metadata -- filtering, prioritizing. Sounds like a killer app to me, and one that could finally prove the point that RSS is really a generalized system for delivering payloads of structured data. So what's the holdup? Scott Reynen again:
</p>
<blockquote cite="Scott Reynen">
I started wondering if there isn't already a namespace dedicated to job posts that could simply be put inside an RSS feed. sure enough, there are multiple XML formats for job posts. <a href="http://216.239.39.104/search?q=cache:fqFSjCvZyjwJ:www.postingpal.co.uk/jobboards_samplexml.asp+%22xml+format%22+job+salary&amp;hl=en&amp;ie=UTF-8">The first I found</a> is only accessible through Google's cache anymore, and looks a bit verbose. But then I stumbled upon <a href="http://www.hr-xml.org/">the HR-XML consortium</a>, a <a href="http://www.hr-xml.org/channels/membership.cfm">high-price</a> club including some <a href="http://www.hr-xml.org/channels/members_list.cfm">big name companies</a>, dedicated to developing XML formats for human resources (that's what they call us when they give us jobs). The irony here is that monster.com is paying tens of thousands of dollars to this consortium, and hasn't even implemented anything as useful as what I and rssjobs.com have for free just by scraping their pages. (Note to monster.com: give me that money, and i'll make you some XML feeds and write the software users would need to read them.)
<br/>
<br/>
Unfortunately, <a href="http://www.hr-xml.org/channels/projects_main.cfm#">all of HR-XML's formats</a> are geared towards being used by businesses rather than job seekers, and so don't include information any job seeker would probably want, such as salary. So I'm just going to expand on <a href="http://ilrt.org/discovery/2000/11/rss-query/jobvocab.rdf">Jon's very brief description what information a job post would include</a>. [ed: FYI: That's not my description, it comes from Dan Brickley's example.]
<br/>
<br/>
The only decent-looking XML format I found in all this was <a href="http://xmlresume.sourceforge.net/">XML resume library</a>, a project that will possibly open up some automated job matching possibilities once we get a job format established.
[<a href="http://weblog.randomchaos.com/index.php?date=2003-08-06&amp;title=rss+for+job+feeds">randomchaos</a>]
</blockquote>
<p>
Scott, my $0.02 is go for it. Use prior art, such as the XML resume library, if it makes sense. Define a simple RSS 2.0 module for job metadata. Deliver a job feed that's enriched with data in that module's namespace. Invite one or more RSS aggregators to support it. It would be a win-win for everybody.
</p>
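<p>
To make the filtering and prioritizing idea concrete, here's a minimal sketch in JavaScript. It assumes a reader has already parsed an enriched feed into plain objects. The field names (salary, currency, location), the sample values, and the functions filterJobs and prioritize are all hypothetical; none of them comes from an existing aggregator or RSS module.
</p>

```javascript
// Hypothetical items, as a jobs-aware reader might hold them after parsing
// a feed that carries salary, currency, and location as separate fields.
const items = [
  { title: 'Job title for job1 goes here', salary: 100000, currency: 'USD', location: 'Boston, MA' },
  { title: 'Job title for job2 goes here', salary: 150000, currency: 'UKP', location: 'Bristol, UK' },
  { title: 'Job with no structured metadata' }
];

// Keep only items that declare the wanted currency and meet a salary floor.
// Items lacking the metadata are dropped rather than guessed at.
function filterJobs(jobs, wanted) {
  return jobs
    .filter((job) => typeof job.salary === 'number')
    .filter((job) => job.currency === wanted.currency)
    .filter((job) => job.salary >= wanted.minSalary);
}

// Highest-paying matches first; the input array is left untouched.
function prioritize(jobs) {
  return [...jobs].sort((a, b) => b.salary - a.salary);
}

const matches = prioritize(filterJobs(items, { currency: 'USD', minSalary: 90000 }));
console.log(matches.map((job) => job.title));
```

<p>
With opaque items, none of this is possible without scraping the description first. With structured items it's a few lines.
</p>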
</body>
</item> 

<item num="a767">
<title>Using RSS 2.0 and RDF together</title>
<date>2003/08/05</date>
<body>
<p>
I've been working on a series of issue analyses for the <a href="http://blogs.law.harvard.edu/tech/">RSS 2.0 site</a>. One of the questions I've been wanting to explore is whether RDF might be used in conjunction with RSS 2.0, and if so how. Today, in the <a href="http://blogs.law.harvard.edu/tech/comments?u=tech&amp;p=167&amp;link=http%3A%2F%2Fblogs.law.harvard.edu%2Ftech%2F2003%2F08%2F04%23a167">comments section</a> of the site, Dan Brickley pointed me to the example I've been looking for. He writes:
</p>
<blockquote cite="Dan Brickley">
This week, a new 'RSS and jobs' site is getting some interest. <a href="http://www.rssjobs.com/rssjobs/index.jsp">http://www.rssjobs.com/rssjobs/index.jsp</a> There is a similar effort at <a href="http://jobs.perl.org/rss/">http://jobs.perl.org/rss/</a> (eg. see <a href="http://jobs.perl.org/rss/telecommute.rss">http://jobs.perl.org/rss/telecommute.rss</a>) and an old example scenario that Libby and I worked on at <a href="http://ilrt.org/discovery/2000/11/rss-query/">http://ilrt.org/discovery/2000/11/rss-query/</a>.
<br/>
<br/>
I hope we all agree that such applications are an exciting part of the future of RSS and RSS-like technology. To my mind, the big question is, how can we partition the work so that we have a Web of complementary namespaces which fit together to give us better descriptions in our XML feeds.
<br/>
<br/>
Looking at the feeds currently served by rssjobs.com, all the structure is hidden, entity escaped, inside the 'description' tag. Date, job title, employer, location, blurb... all crushed into a single field. 
</blockquote>
<p>
Suppose you wanted to do an RSS 2.0 feed that would expose those job fields as first-class XML. And suppose further that you wanted to express the job data in terms of RDF. What might that look like?
</p>
<p>
Here's a <a href="http://weblog.infoworld.com/udell/gems/rdfJobs.xml">trial balloon feed</a> (also reproduced below). I'm sure there are problems with it. I have no idea whether, or if so how well, something like this would meet the need -- which, to be clear, is to leverage RSS 2.0 to transmit application-specific data. There's lots that I still don't know about XML namespaces, and even more that I don't know about RDF, so even though this example is <a href="http://feeds.archive.org/validator/check?url=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2Fgems%2FrdfJobs.xml">valid</a>, I know there are issues to explore, both in terms of RDF usage specifically, and more generally with respect to how namespaces should best work together. 
</p>
<p>
I'd like to surface those issues, in the spirit of better understanding how to achieve the &quot;Web of complementary namespaces&quot; that Dan rightly envisions. <a href="http://radiocomments.userland.com/comments?u=100887&amp;p=767&amp;link=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2F2002%2F12%2F14.html%23a767">Comments.</a>
</p>
<p>
<b>Update</b>: In the comments, Scott Reynen pointed out that &quot;your rss tag ends (&gt;) before the namespaces (as well as after).&quot; Heh. Good catch. That explains why I couldn't attach the rdf:about attribute to the channel, and some other oddities. Thanks Scott. I've amended the example accordingly. Now, I still have no idea whether this is, or could be, coherent from an RDF perspective. But here's what Dan Brickley added today over on the <a href="http://blogs.law.harvard.edu/tech/discuss/msgReader$167?mode=topic">RSS 2.0 site</a>:
</p>
<blockquote cite="Dan Brickley">
 The tricky case is where RDF namespaces are mixed in an XML context where the XML is not generally being arranged as an encoding of an RDF graph. In such cases, all bets are off, we don't really know what is meant by an unknown XML element or attribute, since there is no over-arching set of cross-namespace rules that are being followed.
<br/>
<br/>
So, what to do? A stark line (one I don't believe anyone is taking) would be to assert that RDF vocabularies can only appear in XML documents that are structured according to the (current) RDF/XML syntax. So any use of FOAF in non-RDF XML would be illegal, for example. That strikes me as wrong for a couple of reasons. Firstly it increases the gap between RDF and XML when we should be working to reduce it. Secondly, it would create an upgrade hell if W3C (or anyone else) were trying to deploy an alternative XML encoding of RDF graphs (there are a few of these and one may catch on...). There may end up being other XML encodings of RDF graphs, and we won't want to go changing all our RDF vocab namespace URIs to celebrate that occasion.
<br/>
<br/>
So a less stark line, and one I'm currently inclined towards, is to explore some rules along the lines of &quot;It is OK for XML markup to draw upon RDF namespaces even when the enclosing XML markup isn't in RDF syntax, so long as all elements below that RDF-namespaced element are structured in accordance with a regular RDF syntax.&quot;
</blockquote>
<p>
So, let's follow through on this. Scott Reynen also points out that he's implemented a service that converts Monster.com and HotJobs.com searches into RSS feeds. Here's one such feed for <a href="http://weblog.randomchaos.com/jobfeeds.php?source=http%3A%2F%2Fhotjobs.yahoo.com%2Fjobseeker%2Fjobsearch%2Fsearch_results.html%3Fkeywords_all%3D%26industry1%3DTEL%26city1%3DBoston%26state1%3DMA%26country1%3DUSA%26search_type_form%3Dquick%26updated_since%3Danytime%26quicksearch_x%3D1%26metro_area%3D1%26search%3DSearch&amp;format=rss2.0">technology jobs in or around Boston</a>. And here's a sample of the RSS 2.0 results:
<pre class="code" lang="xml">
&lt;item&gt;
&lt;title&gt;Vice President WW Channel, Enterprise Sales ( Boston, MA ) &lt;/title&gt;
&lt;link&gt;http://us.rd.yahoo.com/hotjobs/searchresults.... &lt;/link&gt;
&lt;description&gt;DuVall &amp; Associates&lt;br /&gt;$160K-$200K &lt;/description&gt;
&lt;author&gt;scott@randomchaos.com &lt;/author&gt;
&lt;pubDate&gt;Tue, 05 Aug 2003 12:00:00 +0800&lt;/pubDate&gt;
&lt;/item&gt;
</pre>
Cool. Now the question I'm asking, which Dan Brickley's comment partly answers, is: &quot;Can Scott Reynen augment this RSS 2.0 feed with RDF-isms that carry application-specific data that a jobs-aware RSS reader could use to filter or more richly display the items in this feed?&quot; Dan seems to be saying &quot;Yes.&quot; 
</p>
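<p>
To see why the crushed-together description is a problem, consider what a reader has to do today to recover the salary from that sample item. Here's a rough JavaScript sketch. It assumes the markup has already been stripped, and that salaries always look like the one in the sample, which real listings certainly won't. The function name parseSalaryRange is my own invention.
</p>

```javascript
// Parse a salary range such as "$160K-$200K" out of free text.
// Returns null when no range in that exact shape is present.
function parseSalaryRange(text) {
  const match = /\$(\d+)K\s*-\s*\$(\d+)K/.exec(text);
  if (match === null) {
    return null;
  }
  return { min: Number(match[1]) * 1000, max: Number(match[2]) * 1000 };
}

// The salary portion of the sample description shown above.
console.log(parseSalaryRange('$160K-$200K'));

// Anything that departs from the assumed pattern is simply not recognized.
console.log(parseSalaryRange('Salary: competitive'));
```

<p>
A feed that carried the salary as its own element or attribute would make this kind of guesswork unnecessary.
</p>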
<p>
I'll ask and answer a more general question. Can a modular extension to RSS, built to carry job-related metadata, add value for users of RSS readers aware of that extension? The answer here is clearly yes. This kind of thing has long been possible, and is now -- I feel sure -- going to explode on the scene. 
</p>
<p>
If RDF can add value above and beyond what a non-RDF job-related metadata extension can offer, I would really like to see that clearly explained, ideally in the context of this example. Can the RDF-isms in this trial-balloon example be made coherent from an RDF perspective? What advantages would they confer, above and beyond the job-related metadata that any application aware of the job: and wn: namespaces could pick up?
</p>
<h3>Mockup of an RSS 2.0/RDF jobs feed</h3>
<pre class="code" lang="xml">
&lt;?xml version=&quot;1.0&quot;?&gt;
&lt;rss version=&quot;2.0&quot;
 xmlns:wn=&quot;http://xmlns.com/wordnet/1.6/&quot;
 xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;
 xmlns:job=&quot;http://ilrt.org/discovery/2000/11/rss-query/jobvocab.rdf#&quot;
 &gt;
&lt;channel rdf:about=&quot;http://ilrt.org/discovery/2000/11/rss-query/jobs-rss.rdf&quot;&gt;
  &lt;title&gt;A hypothetical job listings channel&lt;/title&gt;
  &lt;link&gt;http://ilrt.org/discovery/2000/11/rss-query/ &lt;/link&gt;
  &lt;description&gt;
    This example shows RSS used as a lightweight data transport mechanism
  &lt;/description&gt;
  &lt;rdf:items&gt;
     &lt;rdf:Seq&gt;
        &lt;rdf:li resource=&quot;http://example.com/job1.html&quot; /&gt;
        &lt;rdf:li resource=&quot;http://example.com/job2.html&quot; /&gt;
     &lt;/rdf:Seq&gt;
  &lt;/rdf:items&gt;
  &lt;item rdf:about=&quot;http://example.com/job1.html&quot;&gt;
     &lt;title&gt;The title of job1 goes here&lt;/title&gt;
     &lt;link&gt;http://example.com/job1.html &lt;/link&gt;
     &lt;description&gt;
        (Job1-Job1-Job1...)
     &lt;/description&gt;
  &lt;job:advertises&gt; 
       &lt;wn:Job job:title=&quot;Job title for job1 goes here&quot;
          job:salary=&quot;100000&quot;
          job:currency=&quot;USD&quot;
          &gt;
        &lt;job:orgHomepage rdf:resource=&quot;http://www.ukoln.ac.uk/&quot;/&gt;
        &lt;/wn:Job&gt;
  &lt;/job:advertises&gt;
  &lt;/item&gt;
  &lt;item rdf:about=&quot;http://example.com/job2.html&quot;&gt;
      &lt;title&gt;The title of job2 goes here&lt;/title&gt;
      &lt;link&gt;http://example.com/job2.html &lt;/link&gt;
      &lt;description&gt;
         (Job2-Job2-Job2...)
      &lt;/description&gt;
     &lt;job:advertises&gt; 
          &lt;wn:Job job:title=&quot;Job title for job2 goes here&quot;
             job:salary=&quot;150000&quot;
             job:currency=&quot;UKP&quot;
             &gt;
          &lt;job:orgHomepage rdf:resource=&quot;http://ilrt.org/&quot;/&gt;
          &lt;/wn:Job&gt;
      &lt;/job:advertises&gt;
   &lt;/item&gt;
&lt;/channel&gt;
&lt;/rss&gt;
</pre>
</body>
</item> 

<item num="a766">
<title>Test-driven development</title>
<date>2003/08/04</date>
<body>
<blockquote cite="Jon Udell">
Programmer and author Dave Johnson shared an anecdote on his <a href="http://www.rollerweblogger.org/page/roller/20021201#you_did_good">weblog</a> last year about what happened when his 5-year-old son walked up behind him while he was coding. &quot;He saw the JUnit green bar on the screen,&quot; Johnson reports, &quot;and said 'Dad, you did good.'&quot; There's more to this touching father-and-son moment than meets the eye. The idea that software development can proceed by tackling a sequence of small tasks -- the successful completion of which is evident even to a child -- is fueling a groundswell of interest in the so-called &quot;xUnit&quot; testing frameworks (see <a href="http://archive.infoworld.com/article/03/08/01/30FEtesttools_1.html">sidebar</a>) and in a companion work style called &quot;test first&quot; or &quot;test driven.&quot; [Full story at <a href="http://www.infoworld.com/article/03/08/01/30FEtestmain_1.html">InfoWorld.com</a>.]
</blockquote>
<p>
For this story I also interviewed <a href="http://www.infoworld.com/infoworld/article/03/08/01/30FEtestward_1.html">Ward Cunningham</a> and <a href="http://www.infoworld.com/infoworld/article/03/08/01/30FEtestbrian_1.html">Brian Marick</a>. Both of these conversations were fascinating; here are some outtakes that didn't fit into the article.
</p>
<div>
<b>Ward on the relationship of QA and test-first development:</b>
</div>
<blockquote cite="Ward Cunningham">
If you're the head of QA, and you hear that your developers are working test-first, you should think, &quot;Good for them, now we can focus on the truly diabolical tests instead of working on these dead-on-arrival problems.&quot;
</blockquote>
<div>
<b>Ward on doing only what is declared:</b>
</div>
<blockquote cite="Ward Cunningham">
When somebody says this is test-first code, you believe that it's going to be more robust than otherwise. And if you do run into a limitation, it's more likely that you can get past it. Test-first means that it's clear what the code does, and that what it does has been tested, but also that the code doesn't devote a lot of effort to doing things that aren't declared. There is not a lot of cruft lying around in there that's going to prevent you from getting what you want. That's part of the whole extreme programming promise: if you change your mind and ask for something new, it's not an exorbitant penalty for not asking up front.
</blockquote>
<div>
<b>Ward on the inevitability of test-first:</b>
</div>
<blockquote cite="Ward Cunningham">
The idea is so unarguable that everybody who ships software has got to be doing some form of it, they've discovered it themselves. They might not be doing it with the aid of software. Really, test-first is just be clear about what you want, write it, and then see if that's what you got. Although when you get the computer to do the checking, you can make your work easier. 
</blockquote>
<div>
<b>Ward on using FIT:</b>
</div>
<blockquote cite="Ward Cunningham">
We had 50 people on the project, 10 were business analysts. When we introduced this FIT style, we told them they could use spreadsheets but they had to work out the test cases. Their first response was: &quot;This is trivial, I can't believe you can't do it.&quot; The woman who wrote the first one thought it would take few minutes, and it took a day. As she began writing out the details, she wanted them to be right. She told me that to her surprise, people started lining up outside her door when they heard she was working on the [test-case] spreadsheet. Over the months they did this, as you got from simple to harder, the spreadsheets and test cases got awfully intricate. They had a team of five or six who accepted the responsibility, on an iteration basis, of being prepared to explain what was required in the next iteration, and to have detailed spreadsheets that worked through those cases. It was a huge amount of work. They really felt they had taken on a significant part of the job -- and they had. &quot;This isn't fair, is it?&quot; they'd ask. I'd say: &quot;Think about what you're doing when you do this, would you rather have the developers do it?&quot; They'd say: &quot;The developers should be smarter.&quot; I'd say: &quot;They're getting smarter, but they're also worrying about EJB and all that junk.&quot; In the end, the business analysts became totally engaged. They didn't like all this work, they wished there was an easier way. But it sure beat writing docs and then bitching about nobody reading them.
</blockquote>
<div>
<b>Brian on completeness and sufficiency:</b>
</div>
<blockquote cite="Brian Marick">
When we talk about how testing can drive development, it tends to embroil logically-minded people in pointless arguments of the following sort. The requirements document is a bunch of abstract statements that describe what the program should do for all possible inputs. All the test does is say, for this input, you should get this output. So if the program passes 100 tests, how do you know it's what you really want? What's becoming true, I think, is that we're trying to figure out how to write tests sufficient to cause the programmer to write the right program, even if logically the tests are just a bunch of examples. 
</blockquote>
<div>
<b>Brian on incrementalism:</b>
</div>
<blockquote cite="Brian Marick">
This whole notion of incremental discovery is really central. The big difference between the way I used to do test-first programming, and the way I do it now, is that I used to think of writing all the tests up front. That makes it much more like a complete representation of the program. Whereas the XP style is radically different, it's one test at a time. That makes a very big difference. If I were going to emphasize one thing to your audience, it would be that. 
</blockquote>
<div>
<b>Brian on the psychology of test-first development:</b>
</div>
<blockquote cite="Brian Marick">
When I taught testing back in the early 90s, I had people do tests after they wrote the code. I would say in my class, &quot;this all works better if you do it up front,&quot; but I didn't have Kent Beck's forcefulness to say &quot;of course you'll write the tests up front, how could you not?&quot; So when you do it after the fact, it becomes a chore, not a tool that helps the programmer. The key thing for adoptability of any test technique is that it has to make programmers believe they're doing more of, and getting better at, what they want to do most of all, which is write code. If you do the tests up front, they're helping you think through the problem. That's good. Programmers are not happy doing any kind of design documentation up front, and one of the reasons is that there's so much of it you don't get your daily fix of coding. The essential thing about test-first programming that makes it possible is that it's a design technique intimately interwoven with the act of coding. Therefore it's sustainable, it's much less likely to get dropped under pressure, because it's not a separate activity.
</blockquote>
<div>
<b>Brian on the test-first safety net:</b>
</div>
<blockquote cite="Brian Marick">
When I'm working on a new feature, I'll go in and deliberately break something in the guts of the program, then run the tests and see which ones fail. It's a form of traceability that is automated. And it's a really powerful feeling -- a switch from the usual feeling that your program is an elaborate house of cards. Now you can pull the card out, see what tumbles down, and then just make it work. It's a way of converting programming from an adrenalin-filled activity to something smooth. It's about smooth, steady, fast progress.
</blockquote>
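<p>
Here's a toy JavaScript illustration of the safety net Brian describes. This is my own sketch, not code from either interview, and it doesn't use JUnit or any xUnit framework: the function cartTotal, its deliberately broken twin, and the little failingTests runner are all made up for the example. Each test is just one concrete input paired with the output it should produce.
</p>

```javascript
// A small function under test: total price of a cart.
function cartTotal(items) {
  return items.reduce((sum, item) => sum + item.price * item.quantity, 0);
}

// Each test is one concrete example: this input should give this output.
const tests = [
  { name: 'empty cart is zero', input: [], expected: 0 },
  { name: 'single item', input: [{ price: 5, quantity: 1 }], expected: 5 },
  { name: 'quantity multiplies price', input: [{ price: 5, quantity: 3 }], expected: 15 }
];

// Run every example against an implementation; return names of failing tests.
function failingTests(implementation) {
  return tests
    .filter((test) => implementation(test.input) !== test.expected)
    .map((test) => test.name);
}

// A deliberately broken variant that ignores quantity.
function brokenCartTotal(items) {
  return items.reduce((sum, item) => sum + item.price, 0);
}

console.log(failingTests(cartTotal));       // nothing fails
console.log(failingTests(brokenCartTotal)); // only the quantity test fails
```

<p>
Pull the card out, and the failing test names tell you which behavior depended on it.
</p>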
</body>
</item> 

<item num="a765">
<title>Revisiting Zope</title>
<date>2003/08/04</date>
<body>
<blockquote cite="Jon Udell">
For years I've been following the adventures of Zope, an open source application server that is particularly adept at content management. The Zope engine and its layered applications are written in Python, and the whole system is built on top of a Python-based object database called ZODB. Having done a lot of Zope development myself, I know firsthand how powerful and productive this arrangement can be. Admittedly it's an unorthodox approach that an enterprise IT planner might be reluctant to bet on. But as I learned recently on a visit to Zope's headquarters in Fredericksburg, Va., some big organizations are doing just that. NATO's worldwide intranet, for example, is based on Zope. [Full story at <a href="http://www.infoworld.com/article/03/08/01/30OPstrategic_1.html">InfoWorld.com</a>]
</blockquote>
</body>
</item> 

<item num="a764">
<title>Exploring Ultraseek</title>
<date>2003/08/01</date>
<body>
<p>
Chad Dickerson emailed me to note a couple of things about Ultraseek. First, the <a href="http://downloadcenter.verity.com/ds030103/Ultraseek/5.1.0/doc/Ultra5.1_Custom.pdf">docs</a>. Heh. I stand corrected. REST interfaces are often poorly documented, but not in this case. To wit:
</p>
<blockquote cite="Ultraseek documentation">
<div>
The ql (query level) parameter determines whether the default search form should be simple or advanced.
</div>
</blockquote>
<p>
Here's another cool thing that Chad points out, which has evidently been added since I first worked with Ultraseek years ago. If you switch from:
</p>
<blockquote>
<i>
<a href="http://search.infoworld.com/servlet/query.html?col=ifwspid&amp;qt=mycroft">http://search.infoworld.com/servlet/<b>query.html</b>?col=ifwspid&amp;qt=mycroft</a>
</i>
</blockquote>
<p>
To:
</p>
<blockquote>
<i>
<a href="http://search.infoworld.com/servlet/saquery.xml?col=ifwspid&amp;qt=mycroft">http://search.infoworld.com/servlet/<b>saquery.xml</b>?col=ifwspid&amp;qt=mycroft</a>
</i>
</blockquote>
<p>
Then presto! XML results. It's only a short hop from there to an RSS feed that watches for new items matching a search term. Here's an <a href="http://weblog.infoworld.com/udell/gems/ultraseek.xml">XSLT stylesheet</a> that converts Ultraseek results to RSS 2.0. Here's a <a href="http://www.w3.org/2000/06/webdata/xslt?xslfile=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2Fgems%2Fultraseek.xml&amp;xmlfile=http%3A%2F%2Fsearch.infoworld.com%2Fservlet%2Fsaquery.xml%3Fcol=ifwspid%26rf=1%26qt=udell">prototype feed</a> to which you can actually subscribe. It runs the results of a search for items at InfoWorld.com mentioning 'Udell' through the XSLT transform to produce RSS 2.0.
</p>
<p>
Granted, that's not too useful as is. The devil is always in the details. In this case, Ultraseek's date ordering (rf=1) isn't yielding all the newest stuff in strict reverse chronological order, which is what you'd want. A real solution would need to collect metadata about actual publication dates. It's just lying around in metatags of IW pages (e.g., &lt;meta name=&quot;publicationDate&quot; content=&quot;2003-06-13&quot;&gt;) and URLs of blog pages (e.g. http://weblog.infoworld.com/udell/2003/08/01.html#a763), waiting to be used.
</p>
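<p>
As a rough sketch of how the dates embedded in blog URLs could be put to work, here's some JavaScript that recovers a date from a permalink and sorts results newest first. It assumes every permalink follows the /YYYY/MM/DD.html pattern of the example URL above. The function names and the shape of the result objects are hypothetical, and the metatag case is not handled.
</p>

```javascript
// Pull a YYYY-MM-DD date out of a blog permalink shaped like
// http://weblog.infoworld.com/udell/2003/08/01.html#a763
function dateFromBlogUrl(url) {
  const match = /\/(\d{4})\/(\d{2})\/(\d{2})\.html/.exec(url);
  return match === null ? null : `${match[1]}-${match[2]}-${match[3]}`;
}

// Sort results newest first; results with no recoverable date go last.
function newestFirst(results) {
  return results
    .map((result) => ({ ...result, date: dateFromBlogUrl(result.url) }))
    .sort((a, b) => (b.date || '').localeCompare(a.date || ''));
}

// Illustrative search results, deliberately out of order.
const results = [
  { url: 'http://weblog.infoworld.com/udell/2003/08/01.html#a763' },
  { url: 'http://weblog.infoworld.com/udell/2003/08/06.html#a768' },
  { url: 'http://weblog.infoworld.com/udell/2003/08/04.html#a765' }
];

console.log(newestFirst(results).map((result) => result.date));
```

<p>
Because the dates are written year-month-day with fixed widths, a plain string comparison is enough to order them.
</p>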
<p>
Still, you gotta love that Web pipelining!
</p>
</body>
</item> 

<item num="a763">
<title>InfoWorld.com updates</title>
<date>2003/08/01</date>
<body>
<p>
There are several InfoWorld.com updates to report today. First, the news gang have started a team blog at <a href="http://www.infoworld.com/techwatch/">http://www.infoworld.com/techwatch/</a>. Excellent! 
</p>
<p>
Second, the site's search engine has switched from Google to Ultraseek. For users of Mozilla Firebird, here are some search plugins you can use to search the whole site, or just this blog:
</p>
<script src="http://weblog.infoworld.com/udell/gems/mycroft.js" type="text/javascript"/>
<p>
Mozilla search plugin for InfoWorld: <a href="http://weblog.infoworld.com/udell/gems/infoworld.src">src</a>, <a href="http://weblog.infoworld.com/udell/gems/infoworld.gif">img</a>, <a href="javascript:addEngine('infoworld', 'gif', 'Tech')">install to Mozilla</a>
</p>
<p>
Mozilla search plugin for Jon's Radio: <a href="http://weblog.infoworld.com/udell/gems/jonblog.src">src</a>, <a href="http://weblog.infoworld.com/udell/gems/jonblog.gif">img</a>, <a href="javascript:addEngine('jonblog', 'gif', 'Tech')">install to Mozilla</a>
</p>
<p>
Finally, our survey on programming tools and practices is up at <a href="http://www.infoworld.com/161">www.infoworld.com/161</a>. 
</p>
<p>
<b>Update:</b> I did a bit more spelunking with Ultraseek's REST interface, ran into a problem, and then solved it. I decided to document the process here because it illustrates why I think REST interfaces should be more carefully specified.
</p>
<p>
The Firebird plugin, and now also the local blog search in the left navbar, are using Ultraseek's advanced search to match only results with URLs containing weblog.infoworld.com/udell. Here's an example of my first try:
<blockquote>
<i>
<a href="http://search.infoworld.com/servlet/query.html?tx0=mycroft&amp;tx1=weblog.infoworld.com%2Fudell&amp;op0=%2B&amp;fl0=&amp;ty0=w&amp;ql=a&amp;op1=%2B&amp;fl1=url%3A">http://search.infoworld.com/servlet/query.html?tx0=mycroft &amp;tx1=weblog.infoworld.com%2Fudell&amp;op0=%2B &amp;fl0=&amp;ty0=w&amp;ql=a&amp;op1=%2B&amp;fl1=url%3A</a>
</i>
</blockquote>
</p>
<p>
But if you follow that link, you'll see there's a kludge: the skip-to-content link at the top is an awkward way to get past the advanced search instructions. I was able to append #skip to the URL formed by the Mozilla plugin, which sort of worked. A different method would have been required to append #skip to the URL created by the search form on this page -- probably a JavaScript hack. Really, though, I'd rather have just used the simple search page, invoking the URL restriction like so:
</p>
<blockquote>
<i>
<a href="http://search.infoworld.com/servlet/query.html?qt=%2Bmycroft+%2Burl%3Aweblog.infoworld.com/udell">http://search.infoworld.com/servlet/query.html? qt=%2Bmycroft+%2Burl%3Aweblog.infoworld.com/udell</a>
</i>
</blockquote>
<p>
Much simpler, easier, cleaner. The problem was that interpolating a variable search term (e.g., 'mycroft') into a string template that includes 'url:weblog.infoworld.com/udell' would have required a different JavaScript hack for the form on this page, and might not even be supportable in the Mozilla plugin.
</p>
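<p>
For the record, here's roughly what that interpolation would look like as a JavaScript helper. It's a sketch only: the function name simpleSearchUrl is made up, and I haven't wired it into the form on this page or into the plugin.
</p>

```javascript
// Base URL and restriction taken from the simple-search link above.
const SEARCH_BASE = 'http://search.infoworld.com/servlet/query.html';
const URL_RESTRICTION = 'url:weblog.infoworld.com/udell';

// Build a simple-mode query: the term and the restriction share one qt value,
// so the term has to be spliced into the string before encoding.
function simpleSearchUrl(term) {
  const query = `+${term} +${URL_RESTRICTION}`;
  return `${SEARCH_BASE}?qt=${encodeURIComponent(query)}`;
}

console.log(simpleSearchUrl('mycroft'));
```

<p>
Note that encodeURIComponent writes the space as %20 where the link above uses a plus sign. I'm assuming the server treats the two alike in a query string.
</p>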
<p>
At this point, I concluded it was an overloading problem. With Ultraseek, as with Google and others, the advanced syntax is available in either the advanced or basic mode. But Ultraseek, unlike Google, overloads &quot;advanced&quot; to mean not only a syntax, but a display format. And there's no way (I thought) to use the advanced flavor of the syntax, which enables the term and the URL restriction to be decoupled as separate variables, while preserving the simple flavor of display.
</p>
<p>
As I wrote the first version of that previous paragraph, though, the solution occurred to me. The advanced syntax used the variable/value pair &quot;ql=a&quot; which presumably says &quot;query language = advanced.&quot; What if I kept the rest of the advanced-style URL, but removed the &quot;ql=a&quot;? Sure enough, that worked. Here's the solution:
</p>
<blockquote>
<i>
<a href="http://search.infoworld.com/servlet/query.html?tx0=mycroft&amp;tx1=weblog.infoworld.com/udell&amp;op0=%2B&amp;fl0=%20&amp;ty0=w&amp;op1=%2B&amp;fl1=url%3A&amp;ty1=w">http://search.infoworld.com/servlet/query.html?tx0=mycroft &amp;tx1=weblog.infoworld.com/udell&amp;op0=%2B&amp;fl0=%20 &amp;ty0=w&amp;op1=%2B&amp;fl1=url%3A&amp;ty1=w</a>
</i>
</blockquote>
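<p>
The nice thing about this form is that the search term travels in its own parameter, so building the URL needs no string surgery. Here's a minimal JavaScript sketch using the standard URLSearchParams class. The parameter names and values are copied from the link above; the function name blogSearchUrl is my own, and I'm only guessing at what op, fl, and ty mean, so the code just passes them through.
</p>

```javascript
const SEARCH_BASE = 'http://search.infoworld.com/servlet/query.html';

// Build the advanced-syntax URL with the term and the URL restriction as
// separate parameters, leaving out ql=a so the simple display is kept.
function blogSearchUrl(term) {
  const params = new URLSearchParams({
    tx0: term,                          // the user's search term
    tx1: 'weblog.infoworld.com/udell',  // the fixed URL restriction
    op0: '+',
    fl0: ' ',
    ty0: 'w',
    op1: '+',
    fl1: 'url:',
    ty1: 'w'
  });
  return `${SEARCH_BASE}?${params.toString()}`;
}

console.log(blogSearchUrl('mycroft'));
```

<p>
URLSearchParams percent-encodes a few characters differently from the hand-built link (the slash in tx1, for example), which should decode to the same values on the server side.
</p>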
<p>
This is all very geeky I know, but I do have a larger point. It took me a lot longer to work this out than it ought to have. REST interfaces, in general, are far more capable than people realize, but the possibilities are usually so poorly documented that you have to be extra-motivated to figure them out. Why don't people who work so hard to provide cool features make it easier for other people to discover and use them?
</p>
</body>
</item> 

<item num="a762">
<title>iChat AV, iSight, and FlashCom</title>
<date>2003/08/01</date>
<body>
<p>
<div class="minireview">iChat AV/iSight</div> 
People keep badgering me to buy webcams, and I'm always glad when they do. It seems like just yesterday that <a href="http://radio.weblogs.com/0113297/">Jeremy Allaire</a> insisted I get hold of a Logitech QuickCam in order to try out the video capabilities that were latent in Flash 6, and that would be unlocked by the then-imminent Flash Communication Server MX (<a href="http://www.infoworld.com/article/02/07/19/020722apappserve_1.html">1</a>, <a href="http://webservices.xml.com/pub/a/ws/2002/08/02/flashcomm.html">2</a>). In fact, that happened a year ago. The FlashCom server has since been <a href="http://www.macromedia.com/devnet/mx/flashcom/articles/exploring.html">updated</a> to version 1.5, which reportedly makes a number of improvements.
</p>
<p>
I haven't had a chance to try FlashCom 1.5 yet, but now I'm curious to take a look. The reason is that I got my arm twisted again, yesterday, to buy another camera. This time <a href="http://www.crn.com/weblogs/stevegillmor/">Steve Gillmor</a> did the arm-twisting. The problem was that Apple's new iChat AV only likes FireWire cameras. My USB QuickCam does work with the TiBook (once I acquired the <a href="http://www.ioexperts.com/usbwebcam.html">driver</a>), and I've used it to have an <a href="http://www.xmeeting-project.net/ohphoneX-docs/ohphoneX.html">OhPhoneX</a> conference with <a href="http://groove.jpj.net/guerrillanetworking/">Paul Venezia</a>. The results weren't great, though, and I really wanted to try iChat AV, so I picked up an <a href="http://www.apple.com/isight/">iSight</a> today. It's a sweet little device, which is consolation for my later discovery that <a href="http://www.ecamm.com/mac/ichatusbcam/">Ecamm</a> could've made iChat AV work with the Logitech. 
</p>
<p>
As everyone says, the iChat AV/iSight combo delivers an awesome experience. I'm not any kind of AV expert, so I can't even try to impress you with details about framerates, synch, or any of that stuff, though I should probably mention that we were both on DSL, me with a 256kbps up-and-down link that does somewhat better than that, and Steve with what appeared to be a 128kbps up-and-down.
</p>
<p>
What I can say is that 1) it all worked correctly, without any fuss whatsoever, and 2) it delivered enough interactivity to make the experience a useful approximation of face-time. And really, that's all that needs to be said.
</p>
<p>
Now, of course, I've got the itch to marry these two things. iChat AV is a wonder, but it's self-contained. You can't (yet) build apps around it. The FlashCom server is a wildly productive way to build apps that use videoconferencing (and recording and playback of AV streams) as elements of applications that also share event-driven GUI widgets with small groups. Of course FlashCom is intimately tied to the AV features built into the Flash 6 player, which doesn't yet even recognize the iSight camera so far as I can tell (<a href="http://www.macromedia.com/cfusion/search/index.cfm?loc=en_us&amp;term=isight">1</a>, <a href="http://www.macromedia.com/support/flashcom/ts/documents/cam_matrix.htm">2</a>). So I'm not sure how these two technologies can come together, but I'd love to see it happen.
</p>
</body>
</item> 

<item num="a761">
<title>Zora's list</title>
<date>2003/07/30</date>
<body>
<p>
<a href="http://www.graffiti.org/lm/lm_2.html">
<img alt="zora" align="right" width="300" height="210" vspace="6" hspace="6" src="http://weblog.infoworld.com/udell/gems/zora.jpg"/>
</a>
Yesterday's Scripting News <a href="http://scriptingnews.userland.com/2003/07/29#When:7:07:24PM">points</a> to a <a href="http://blogs.law.harvard.edu/lydon/2003/07/29#a211">Christopher Lydon interview</a> with Steve Kinzer, a New York Times correspondent. When I used to live in Boston, I often heard Lydon on WBUR's <a href="http://www.theconnection.org/">The Connection</a>; it's a real treat to catch his class act now on the Web. There's something missing from the audioblogging experience, though, and I've written about it <a href="http://weblog.infoworld.com/udell/2003/01/18.html">before</a>: audio is a more opaque datatype than it ought to be.
</p>
<p>
We routinely quote fragments of text, and although the tools available for doing so leave a lot to be desired, it's something most people can figure out. We almost never quote fragments of audio. Here, for example, is a <a href="http://weblog.infoworld.com/udell/gems/kinzer.wav">clip</a> from the Lydon/Kinzer interview. It wasn't incredibly hard to quote that bit of audio, but it wasn't trivial either. I'm on Windows today, so it took some fiddling to switch from microphone to wave recording, and to capture the exact quote I wanted. In my case, I only needed to save to Radio's /www/gems folder, so the upstreaming of kinzer.wav was automatic, but in other situations that step is less straightforward. All in all it's doable to include an audio quote extracted from an MP3, but you need to be pretty motivated to do it. It's not going to happen casually.
</p>
<p>
In theory, it should be much easier in the case of streaming audio. With RealAudio, for example, you can form a URL that includes start and end times, so you can literally quote from a stream. But as I discovered in my <a href="http://weblog.infoworld.com/udell/2003/01/18.html">earlier posting</a>, it's again not trivial. In that posting, I mentioned one of my favorite episodes of <a href="http://www.thisamericanlife.com">This American Life</a>, entitled <a href="http://www.thisamericanlife.com/pages/descriptions/01/178.html">Superpowers</a>. The second act of that episode tells the story of a modern-day wonder woman called Zora. It includes an extraordinary recitation of the list of skills that Zora set out to master. I've sometimes wanted to bookmark and point people directly to that place in the stream. So I finally sorted out how to do it. The base URL of the stream is <a href="rtsp://a129.r.akareal.net/ondemand/7/129/1854/7d2416a19/www.wbez.org/ta/178.rm">here</a>, but although that URL works on Windows, on the Mac -- because QuickTime wants to handle the rtsp: protocol -- it won't work unless you encapsulate it in a wrapper file, e.g. <a href="http://weblog.infoworld.com/udell/gems/superpowers.ram">superpowers.ram</a>. 
</p>
<p>
Here's a URL for Zora's list -- that is, the minute-and-a-quarter starting at 24 minutes and 24 seconds from the beginning of the stream:
</p>
<p>
<a href="rtsp://a129.r.akareal.net/ondemand/7/129/1854/7d2416a19/www.wbez.org/ta/178.rm?start=24:24&amp;end=25:40">rtsp://a129.r.akareal.net/ondemand/7/129/1854/7d2416a19/ www.wbez.org/ta/178.rm?start=24:24&amp;end=25:40</a>
</p>
<p>
Nailing the exact start and end times was, again, a tricky endeavor. The Real player doesn't give you any help bookmarking those offsets. And then I had to make and upload the encapsulation file. Finally I achieved the result. So here, for your listening pleasure, is <a href="http://weblog.infoworld.com/udell/gems/zorasList.ram">Zora's list</a>.
</p>
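<p>
Here is a rough Python sketch of the two mechanical steps: appending the offsets to the stream URL, and writing the wrapper file. The stream URL and the offsets are the ones from this posting; the helper names are mine, and I'm assuming that a .ram wrapper is nothing more than a text file containing the URL on a line by itself.
</p>

```python
from urllib.parse import urlencode

STREAM = ("rtsp://a129.r.akareal.net/ondemand/7/129/1854/7d2416a19/"
          "www.wbez.org/ta/178.rm")

def to_seconds(mmss):
    """Convert a 'minutes:seconds' offset to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

def clip_url(base, start, end):
    """Append start and end offsets (mm:ss) to a stream URL."""
    # safe=":" keeps the colons in the offsets readable.
    return base + "?" + urlencode({"start": start, "end": end}, safe=":")

def write_ram(path, url):
    """Write a one-line wrapper file that points at the clip (assumed format)."""
    with open(path, "w") as f:
        f.write(url + "\n")

url = clip_url(STREAM, "24:24", "25:40")
clip_length = to_seconds("25:40") - to_seconds("24:24")
print(url)
print(clip_length, "seconds")
```

<p>
None of this helps with the hard part, which is finding the right offsets in the first place.
</p>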
<p>
When we think about the opacity of audio (and video) datatypes, we tend to assume that some kind of Google-on-steroids is going to come along and make all this stuff more accessible. And indeed, technology like Fast-Talk Communications' <a href="http://www.infoworld.com/article/02/12/13/021216apfastalk_1.html">phonetic indexing</a> will certainly be part of the answer. But it strikes me that plain old Google could do a lot for us too, if only quoting from streams were easier to do. A few days after I post this entry, for example, the phrase 'Zora's list' will be in Google's index. Thereafter, when I want to point someone to that audio fragment, I'll be able to say: &quot;Search Google for Zora's list.&quot; 
</p>
<p>
Achieving that effect ought to be way easier than it is. Why, in discussions of weblog formats, APIs, and tools, do issues like this never seem to arise?
</p>
</body>
</item> 

<item num="a760">
<title>The marriage of SQL and XML</title>
<date>2003/07/30</date>
<body>
<blockquote cite="Jon Udell">
A major shift in the style of enterprise data management is under way, and there are huge architectural issues yet to be resolved. Oracle, not surprisingly, wants you to store everything in a centralized hybrid DBMS. IBM says it would rather enable you to federate data across a range of sources. Each strategy has merit, and most enterprises will wind up pursuing both -- in different ways, for various reasons. Despite these differences, we are witnessing a sacred union. SQL and XML have been pronounced man and wife, and the honeymoon has begun. [Full story at <a href="http://www.infoworld.com/article/03/07/25/29FEdocs_1.html">InfoWorld.com</a>]
</blockquote>
<p>
I just reread this story, and the quote that most resonates with me is this one, from Oracle's Sandeepan Banerjee (or, as San Francisco, Calif.-based InfoWorld's copy editors would have it, &quot;Redwood Shores, Calif.-based Oracle's Banerjee&quot;):
</p>
<blockquote cite="Sandeepan Banerjee">
It's possible that developers will want to stay within an XML abstraction for all their data sources.
</blockquote>
<p>
I've been living that experiment for a few months. My last few O'Reilly Network columns (<a href="http://www.xml.com/pub/a/2003/07/09/udell.html">1</a>, <a href="http://webservices.xml.com/pub/a/ws/2003/06/10/xpathsearch.html">2</a>) describe an XML-oriented approach to data management that I am continuing to find fruitful -- even without the capabilities that XML-savvy databases bring to the table. When you think about how long it took for SQL to become an established discipline, it helps put SQL/XML hybridization -- the subject of this InfoWorld story -- into perspective. It could take a decade or more for this stuff to really start to sink in. Along the way all sorts of new opportunities will emerge, and I find the whole thing terrifically exciting.
</p>
</body>
</item> 

<item num="a759">
<title>More Zope tips</title>
<date>2003/07/29</date>
<body>
<p>
In response to my <a href="http://weblog.infoworld.com/udell/2003/07/24.html">Zope tips</a> entry the other day, I've received a few more tips and clarifications. <a href="http://tonico.freezope.org/Home.xhtml">Tonico Strasser</a> solved the problem of getting Casey Duncan's <a href="http://www.zope.org/Members/Caseman/ExternalEditor">ExternalEditor</a> to work with Firebird:
</p>
<blockquote cite="Tonico Strasser">
You can get it working with the &quot;Things They Left Out (TTLO)&quot; extension for Firebird: <a href="http://cdn.mozdev.org/ttlo/">http://cdn.mozdev.org/ttlo/</a>.
</blockquote>
<p>
Yup. That did the trick. Thanks Tonico! The TTLO extension includes the MIME-type editor that you use to map application/x-zope-edit to the helper app which in turn launches your text editor. I've been doing a bunch of Zope stuff lately, and it just rocks to be able to load Zope objects (Python scripts, HTML and JavaScript files) straight into my text editor of choice (<a href="http://www.lugaru.com/">Epsilon</a>) on Windows. (There's a recipe for a <a href="http://www.zope.org/Members/Feneric/ExtEditMacOSX">MacOS X</a> helper app as well, but I couldn't quite get it to work. If you know where a binary can be found, let me know and I'll post that address.)
</p>
<p>
The TTLO extension, by the way,  brings back some familiar stuff (certificate management, control of HTTP Keep-Alive) but also introduces some Firebird-specific options. You can, for example, tweak the Find As You Type feature. By default, Firebird finds links on the current page that match what you type. You can extend the search to everything on the page by typing a forward slash, but that little extra bit of modality has prevented me from getting over the activation threshold and really using the feature. In TTLO you can turn off the Links Only setting, so that typing in a Web page always searches the complete text of the page. Nice!
</p>
<p>
In my earlier item I mentioned that it's a challenge to retain the interactivity of Python while working in the Zope environment. Ken Manheimer responded:
</p>
<blockquote cite="Ken Manheimer">
There are some great opportunities for interacting with Zope directly from the Python prompt -- I describe a few (in some detail) in a paper I presented at the PyCon DC 2003 conference: <a href="http://www.zope.org/Members/klm/ZopeDebugging/ConversingWithZope">http://www.zope.org/Members/klm/ZopeDebugging/ConversingWithZope</a>.
</blockquote>
<p>
Excellent. In the class I attended, Casey Duncan showed how it's possible to launch an entire instance of Zope interactively at the Python prompt, like so:
</p>
<pre class="code" lang="python">
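import Zope       # the Zope 2 package must be importable from this Python
app = Zope.app()  # returns the root application object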
</pre>
<p>
I did so, and was able to poke around inside the app object, but the missing ingredient that we didn't have time to explore in the class was the ZEO (Zope Enterprise Objects) &quot;storage server&quot; which enables multiple instances of Zope to communicate with a common object database. It's normally used for high-availability clustering but is also handy for advanced debugging. A ZEO-aware command-line-interactive instance of Zope can share a database with a standard (also ZEO-aware) instance that responds normally to Web requests. Ken's paper gives the recipe for setting up that scenario. 
</p>
</body>
</item> 

<item num="a758">
<title>2003 InfoWorld awards for innovative tech projects</title>
<date>2003/07/29</date>
<body>
<p>
If your company is launching an innovative technology project this year, why not <a href="http://www.infoworld.com/awards/awd_iwo_sub.html">nominate</a> it for an InfoWorld 100 award? This year's deadline is Monday, Sept. 8.
</p>
<p> 
Here are the all-too-brief blurbs on the <a href="http://www.infoworld.com/awards/awd_iwo_2002.html">2002 winners</a>. Each one hints at an interesting story. I'll bet more than a few of the 2003 entries will be able to point to blogs that tell those stories in more depth. If you can, please do!
</p>
</body>
</item> 

<item num="a757">
<title>GAIA and the services fabric</title>
<date>2003/07/27</date>
<body>
<blockquote>
<i>
<p>
I've written often about using active intermediaries to modulate the flow of messages in a network of XML Web services. Gaia makes it trivial to inject such intermediaries into the fabric. Its API supports filters, monitors, and transforms. Filters can reject messages that fail a test, for example an authorization check. Monitors gather data for logging and traffic analysis. Transforms can change the behavior of a SOAP endpoint by rewriting its input and output messages.
</p>
<p>
During the next few years, we'll all be exploring how a services fabric can robustly support and flexibly adapt to the needs of our businesses. Gaia makes that vision seem closer -- and simpler -- than you might think. [Full story at <a href="http://www.infoworld.com/article/03/07/25/29OPstrategic_1.html">InfoWorld.com</a>]
</p>
</i>
</blockquote>
<p>
I often hear it said that &quot;SOAP is overkill.&quot; <a href="http://www.themindelectric.com/gaia/index.html">GAIA</a> is a great example of the kinds of benefits you can get when you ante up to play the SOAP way. It's based on The Mind Electric's <a href="http://www.themindelectric.com/glue/index.html">GLUE</a>, which makes producing and consuming SOAP services nice and easy. With GAIA, you do no extra work but your services (or indeed any SOAP services, GLUE-based or not) are published into a &quot;fabric&quot; that handles failover automatically, and makes intermediation easy. We hear a lot lately about what's variously called the &quot;enterprise service bus&quot; or &quot;message bus.&quot; GAIA makes the concepts simple and approachable, and I think it's wicked cool.
</p>
</body>
</item> 

<item num="a756">
<title>RSS top-level namespace</title>
<date>2003/07/25</date>
<body>
<p>
In order to be able to encapsulate RSS payload into other XML applications, it will be necessary to explicitly place RSS into its own namespace. It's been speculated that you can do that without causing any breakage. This posting tests that theory. Did it work? <a href="http://radiocomments.userland.com/comments?u=100887&amp;p=756&amp;link=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2F2002%2F12%2F14.html%23a756">Comments.</a>
</p>
<p>
Well that was exciting! Sorry for jumping the gun, should've done this in a test-first way! I've <a href="http://weblog.infoworld.com/udell/gems/rssInNamespace.html">cached</a> the original version of this posting. Here's a <a href="http://weblog.infoworld.com/udell/gems/rssInNamespace.xml">static version</a> of what I think should test the theory. It's just a simple RSS 2.0 file with one item, and a top-level namespace declaration. NNW seems to like it, but not SharpReader and Radio. 
</p>
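<p>
Here is one way, sketched in Python, that a top-level namespace can trip up a reader. I don't know how any of these aggregators actually parse feeds, so treat this as an illustration of the general hazard rather than a diagnosis. The namespace URI is a made-up placeholder, and the helper names are mine.
</p>

```python
import xml.etree.ElementTree as ET

NS = "http://example.org/rss-namespace"  # hypothetical URI, for illustration only

def qualify(tag, ns=None):
    """Return the tag name, namespace-qualified if a namespace is given."""
    return "{%s}%s" % (ns, tag) if ns else tag

def make_feed(ns=None):
    """Build a one-item RSS 2.0 document, optionally inside a namespace."""
    rss = ET.Element(qualify("rss", ns), {"version": "2.0"})
    channel = ET.SubElement(rss, qualify("channel", ns))
    item = ET.SubElement(channel, qualify("item", ns))
    ET.SubElement(item, qualify("title", ns)).text = "RSS top-level namespace"
    # Serialize and reparse, so we see what a feed reader would see.
    return ET.fromstring(ET.tostring(rss))

def naive_titles(root):
    """Look up item titles by bare element name, ignoring namespaces."""
    return [t.text for t in root.findall("channel/item/title")]

def aware_titles(root, ns):
    """Look up item titles using namespace-qualified names."""
    path = "/".join(qualify(tag, ns) for tag in ("channel", "item", "title"))
    return [t.text for t in root.findall(path)]

plain = make_feed()
namespaced = make_feed(NS)
print(naive_titles(plain))           # finds the title
print(naive_titles(namespaced))      # comes up empty
print(aware_titles(namespaced, NS))  # finds the title again
```

<p>
A reader that matches on bare element names sees an empty feed once the namespace is declared, even though nothing else about the document has changed.
</p>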
</body>
</item> 

<item num="a755">
<title>Zope tips</title>
<date>2003/07/24</date>
<body>
<p>
<img border="1" hspace="6" vspace="6" align="right" src="http://weblog.infoworld.com/udell/gems/zope.gif"/>
I'm back from a visit to Zope headquarters in Fredericksburg, Virginia. I was there to attend the training seminar that supports Zope Corporation's TurboIntranet product, but it was also a great opportunity to refresh my understanding of the various layers of underlying Zope technology.
</p>
<p>
Although I use Zope, I've yet to incorporate some of the more recently-added layers  -- including the Content Management Framework and Zope Page Templates -- into my use of the product. I also picked up on some basic best practices for managing Zope. One is the use of the <a href="http://www.zope.org/Members/4am/instancehome">INSTANCE_HOME</a> environment variable to separate Zope's database, extensions, and installed products from the software installation, so that it's easier to upgrade the base software. 
</p>
<p>
Another is <a href="http://www.zope.org/Members/Caseman/ExternalEditor">ExternalEditor</a>, a server product and helper app combo -- written by Casey Duncan, the instructor for the course I attended -- that enables you to launch an external editor instead of editing Zope scripts and templates inside the TEXTAREA widgets supplied by Zope's management interface. I haven't gotten it working with Mozilla Firebird yet, but with IE it works like a charm.
</p>
<p>
One of the challenges of working in any complex object-oriented framework is figuring out, given some object, just what you can do with it. Python handles this nicely with <tt>dir</tt>, which if you haven't seen it, goes like this:
</p>
<pre class="code" lang="python">
&gt;&gt;&gt; import re
&gt;&gt;&gt; dir (re)
['DOTALL', 'I', 'IGNORECASE', 'L', 'LOCALE', 'M', 'MULTILINE', 'S', 'U',
 'UNICODE', 'VERBOSE', 'X', '__all__', '__builtins__', '__doc__', '__file__',
 '__name__', 'compile', 'engine', 'error', 'escape', 'findall', 'match',
 'purge', 'search', 'split', 'sub', 'subn', 'template']
</pre>
<p>
The first line imports the regular expression module. Then <tt>dir (re)</tt> asks the module to report what it knows how to do. The <tt>re.findall</tt> function is the one I use most often, but when I've been away from Python for a while I can forget its name. This is how I remember it.
</p>
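<p>
If the list that <tt>dir</tt> returns is too long to scan, you can filter it. This is a small sketch, not anything built into Python; the helper names <tt>public_callables</tt> and <tt>grep_names</tt> are my own inventions.
</p>

```python
import re

def public_callables(obj):
    """Return the public, callable attribute names of obj, sorted."""
    return sorted(
        name for name in dir(obj)
        if not name.startswith("_") and callable(getattr(obj, name))
    )

def grep_names(obj, fragment):
    """Return the names in dir(obj) that contain fragment, ignoring case."""
    return [name for name in dir(obj) if fragment.lower() in name.lower()]

# Half-remembering the name is enough to find it again.
print(grep_names(re, "find"))
print(public_callables(re))
```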
<p>
Although Zope's written in Python, you lose this immediacy because, like any Web application server, Zope introduces a bunch of intermediate layers: templates, scripts, the browser. But I learned of a few ways to make exploring Zope a more interactive affair. A Zope Product called <a href="http://www.dieter.handshake.de/pyprojects/zope/DocFinder.html">DocFinder</a> enables you to append <tt>/showDocumentation</tt> to any Zope URL and reveal the docstrings (documentation) for all the classes from which the object represented by that URL is made. Of course this information is, as DocFinder's author Dieter Maurer notes, &quot;unreliable&quot; -- that is, it presumes the docstring is useful and correct. But what is reliable -- and incredibly useful -- is simply knowing which methods are available and what parameters they take.
</p>
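<p>
The same idea can be approximated in plain Python, outside of Zope. The sketch below is not DocFinder's code and doesn't touch Zope at all; it walks an object's class ancestry and reports each method's parameters and the first line of its docstring. The classes <tt>Folder</tt> and <tt>Document</tt> are hypothetical stand-ins, as is the function name <tt>show_documentation</tt>.
</p>

```python
import inspect

def show_documentation(obj):
    """Map each class in obj's ancestry to its methods' parameters and docs."""
    report = {}
    for cls in type(obj).__mro__:
        methods = {}
        # vars(cls) lists only what this class defines itself.
        for name, attr in vars(cls).items():
            if name.startswith("_") or not inspect.isfunction(attr):
                continue
            doc = (attr.__doc__ or "(no docstring)").strip()
            methods[name] = (str(inspect.signature(attr)), doc.splitlines()[0])
        if methods:
            report[cls.__name__] = methods
    return report

class Folder:
    def list_ids(self, spec=None):
        """Return the ids of contained objects."""
        return []

class Document(Folder):
    def render(self, REQUEST=None):
        return ""

report = show_documentation(Document())
for cls_name, methods in report.items():
    for name, (params, doc) in methods.items():
        print("%s.%s%s: %s" % (cls_name, name, params, doc))
```

<p>
The docstring column is only as good as the docstrings, but the parameter lists come from the code itself, which is why that part can be trusted.
</p>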
<p>
Another incredibly useful add-on product is <a href="http://hathaway.freezope.org/Software/VerboseSecurity">VerboseSecurity</a>. When a Zope operation is blocked by a security check, I've often struggled to figure out why access was denied. As a matter of fact, one such puzzle arose during the class I attended, and VerboseSecurity revealed the solution. As the docs explain, you wouldn't want to dump this information to the browser, because you don't want to give away any details about your security setup. But for developers, VerboseSecurity -- which logs informative messages to the console -- is a terrific resource.
</p>
</body>
</item> 

<item num="a754">
<title>OS X sendmail enabler</title>
<date>2003/07/21</date>
<body>
<p>
It happens all the time. You're in a hotel room, with a nice fast DSL or cable connection, but no mail relay. When I used to tote a ThinkPad, I ran Windows 2000 Server and its SMTP server. Now I tote a PowerBook and, although it comes with sendmail, I've never -- until just now -- gotten it working. For starters, although I'm comfortable with all kinds of software configuration and have run many kinds of servers, sendmail just has never been my thing. Plus, OS X's sendmail is a horse of a slightly different color. James Duncan Davidson wrote a great <a href="http://www.macdevcenter.com/pub/a/mac/2002/09/10/sendmail.html">tutorial</a> that almost got me over the hump, but not quite. Tonight, though, I really needed a solution, and I'd like to thank Bernard Teo for providing it. In the wee hours of Monday morning, he wrote on his weblog:
</p>
<blockquote cite="Bernard Teo">
I put <a href="http://www.roadstead.com/weblog/Tutorials/SMSource.html">Sendmail Enabler</a> up on versiontracker.com on Sunday. Around midnight, I had confirmation that it had been listed. I went to sleep and woke up to find almost a thousand downloads had occurred. [<a href="http://www.roadstead.com/weblog/index.php?entry=/Commentary/geometricprogression.txt">The Ultimate Business Machine</a>]
</blockquote>
<p>
I'm not surprised. There was a huge need for <a href="http://www.roadstead.com/weblog/Tutorials/SMSource.html">Sendmail Enabler</a>, a beautifully-done little OS X app that implements the recipes described in Davidson's article. Thanks Bernard! Your enabler worked like a champ, and you made my day!
</p>
<p>
<b>Update from Nancy McGough</b>:
</p>
<blockquote cite="Nancy McGough">
For most users it makes more sense to use a remote always-on non-dynamic-IP SMTP server that they can authenticate to using SMTP AUTH. This is:
<br/>
<br/>* easier to set up 
<br/>* less likely to be blocked by receiving SMTP servers
<br/>
<br/>I am collecting a list of IMAP Service providers here
<br/>
<br/> http://www.ii.com/internet/messaging/imap/isps/
<br/>
<br/>and many of them support SMTP AUTH.
<br/>
<br/>BTW, Panther is going to use Postfix rather than Sendmail so it probably doesn't make a lot of sense for most Mac users to invest a lot of time learning Sendmail (actually that's probably true for all sys admins these days!).
</blockquote>
<p>
Excellent points. Thanks, Nancy!
</p>
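<p>
For the curious, this is roughly what sending through an authenticated relay looks like from Python's standard library. It's a sketch under assumptions: the host name, account, and addresses are placeholders, and I'm assuming the provider accepts STARTTLS on port 587.
</p>

```python
import smtplib
from email.message import EmailMessage

def build_message(sender, recipient, subject, body):
    """Assemble a plain-text e-mail message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg

def send_via_auth_relay(msg, host, user, password, port=587):
    """Send msg through a remote relay, authenticating with SMTP AUTH."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()             # encrypt before sending credentials
        smtp.login(user, password)  # this is the SMTP AUTH step
        smtp.send_message(msg)

msg = build_message("me@example.com", "you@example.com",
                    "Greetings from the hotel", "The relay works.")
# Needs a real server and account, so the call is left commented out:
# send_via_auth_relay(msg, "smtp.example.com", "me", "secret")
```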

</body>
</item> 

<item num="a753">
<title>Canning spam</title>
<date>2003/07/20</date>
<body>
<p>
<blockquote>
<i>
Spontaneous end-to-end communication used to be the Internet's magic ingredient. But scarcity of IPv4 address space and legions of vandals resulted in NATs and firewalls. Now, unfiltered end-to-end communication happens, for the most part, by invitation only. Until recently, the lone exception was e-mail. You didn't need permission to contact someone by e-mail, and you could be reasonably certain that a message you sent would land in the recipient's inbox. Inevitably that had to change, too. The spam epidemic compels us to create and use the e-mail equivalent of NATs and firewalls: a combination of content filters, white lists, and blacklists. 
</i>
</blockquote>
[Full story at <a href="http://www.infoworld.com/article/03/07/18/28FEspam_1.html">InfoWorld.com</a>]
</p>
</body>
</item> 

<item num="a752">
<title>Aspects revisited</title>
<date>2003/07/20</date>
<body>
<p>
<blockquote>
<i>
 Ken Wing Kuen Lee's <a href="http://www.cs.ust.hk/~scc/comp610e/assignment/reading04.pdf">An Introduction to Aspect-Oriented Programming</a> is a concise and cogent primer on the subject. By way of example, Lee imagines a development shop where the policy is that methods, in any class, must invoke a logger on entry and exit. AOPers call this kind of requirement a &quot;crosscutting concern,&quot; meaning that it affects classes without regard to their kinship in the class hierarchy. You can make a rule that programmers have to call the logger from every class, but there's no easy way to ensure that they'll do it at all, never mind correctly. The AOP solution is to define a pattern that matches the set of methods that should call the logger, and to rewrite the code automatically so they do. That's what happens under the covers, anyway, but the idea is that the person who'd like to enforce the policy simply declares it, and tools make it so -- without requiring the cooperation or even the knowledge of the programmers responsible for the affected classes.
</i>
</blockquote>
[Full story at <a href="http://www.infoworld.com/article/03/07/18/28OPstrategic_1.html">InfoWorld.com</a>]
</p>
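<p>
To make the idea concrete, here is a minimal Python sketch of the logging policy. It isn't what AOP tools do under the covers, since the article describes tools that rewrite the code itself; it only approximates the effect with a class decorator. The names <tt>log_calls</tt> and <tt>Account</tt> are made up for illustration.
</p>

```python
import fnmatch
import functools
import inspect
import logging

logger = logging.getLogger("policy")

def _logged(owner, func):
    """Wrap func so the logger is called on entry and on exit."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.info("enter %s.%s", owner, func.__name__)
        try:
            return func(*args, **kwargs)
        finally:
            logger.info("exit %s.%s", owner, func.__name__)
    return wrapper

def log_calls(pattern="*"):
    """Class decorator: wrap every plain method whose name matches pattern."""
    def weave(cls):
        for name, attr in list(vars(cls).items()):
            if name.startswith("__") or not inspect.isfunction(attr):
                continue
            if fnmatch.fnmatch(name, pattern):
                setattr(cls, name, _logged(cls.__name__, attr))
        return cls
    return weave

@log_calls("*")
class Account:
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        # No logging code here; the policy is applied from outside.
        self.balance += amount
        return self.balance

logging.basicConfig(level=logging.INFO)
print(Account().deposit(5))
```

<p>
Because <tt>log_calls("*")</tt> returns an ordinary function, a policy module could also apply it to classes it doesn't own, which is closer to the spirit of declaring the policy in one place.
</p>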
</body>
</item> 

<item num="a751">
<title>An announcement about RSS</title>
<date>2003/07/18</date>
<body>
<p>
As <a href="http://scriptingnews.userland.com/2003/07/18#rss20News">noted</a> on Scripting News today, the RSS 2.0 copyright has been transferred to Harvard, the spec has been placed under a Creative Commons license, and an <a href="http://blogs.law.harvard.edu/tech/advisoryBoard">advisory board</a> -- initially Dave Winer, <a href="http://www.inessential.com/">Brent Simmons</a>, and me -- has been formed.
</p>
<p>
By way of disclosure, I have no financial stake in any weblog-related products or services. I do, clearly, have a huge personal and professional interest in current and future incarnations of the publish/subscribe technologies now known as weblogs and RSS.
</p>
<p>
Although the uses and benefits of these technologies have become clear to a lot of us over the last five years, a surprising number of folks in the technical community have yet to really exploit them. And most people outside that narrow community haven't even scratched the surface. I hope my involvement with this effort will help me to help them become more aware of what is now possible. 
</p>
<p>
So what about <a href="http://www.intertwingly.net/wiki/pie/FrontPage">Pie/Echo/Necho/Atom</a>? I had hoped that project would be able to find more common ground with the RSS legacy while advancing its technical objectives. Prior to this announcement, that didn't seem possible. Frankly, it may still not be, but I think continuity matters so I hope otherwise. Time will tell.
</p>
<p>
<b>Update</b>: My aggregator brought me an interesting juxtaposition just now:
</p>
<table width="80%" align="center" class="dwsTable" border="0" cellspacing="1" cellpadding="5">
<tr bgcolor="#F5F5F5">
<td class="dwsTableCellHeader"> </td>
<td class="dwsTableCellHeader">
<b>
<a href="http://weblog.infoworld.com/udell/" title="Jon Udell's Radio Blog"/> Jon's Radio, 7/18/2003; 3:37:31 PM.</b> <a href="?xmlUrl=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2Frss.xml" title="View stories in the Jon's Radio channel."/>
</td>
<td class="dwsTableCellHeader" valign="bottom">
<center>
<a href="http://weblog.infoworld.com/udell/rss.xml" title="Click to view the current XML source text for the channel."/>
</center>
</td>
</tr>

			<tr bgcolor="#FFFFFF">

				<td class="dwsTableCell" valign="top">
<input type="checkbox" name="108975" checked="on"/>
</td>

				<td class="dwsTableCell" valign="top">
<table>
<tr>
<td class="dwsTableCell">  </td>
<td class="dwsTableCell">
<a name="2"/>
<a href="http://weblog.infoworld.com/udell/2003/07/18.html#a751">An announcement about RSS</a>. As <a href="http://scriptingnews.userland.com/2003/07/18#rss20News">noted</a> on Scripting News  today, the RSS 2.0 copyright has been transferred to Harvard, the spec has been placed under a Creative Commons license, and an <a href="http://blogs.law.harvard.edu/tech/advisoryBoard">advisory board</a> -- initially Dave Winer, <a href="http://www.inessential.com/">Brent Simmons</a>, and me -- has been formed.
 <b>...</b>
</td>
</tr>
</table>
</td>
				<td class="dwsTableCell" valign="top">
<center>
<a href="/?idStory=108975" title="Add this story to your weblog."/>
</center>
</td>
				</tr>
			<tr bgcolor="#F5F5F5">
<td class="dwsTableCellHeader"> </td>
<td class="dwsTableCellHeader">
<b>
<a href="http://www.infoworld.com/news/index.html" title="Top News from InfoWorld"/> InfoWorld:  Top News, 7/18/2003; 3:36:06 PM.</b> <a href="?xmlUrl=http%3A%2F%2Fwww.infoworld.com%2Frss%2Fnews.rdf" title="View stories in the InfoWorld:  Top News channel."/>
</td>
<td class="dwsTableCellHeader" valign="bottom">
<center>
<a href="http://www.infoworld.com/rss/news.rdf" title="Click to view the current XML source text for the channel."/>
</center>
</td>
</tr>
			<tr bgcolor="#FFFFFF">
				<td class="dwsTableCell" valign="top">
<input type="checkbox" name="108968" checked="on"/>
</td>
				<td class="dwsTableCell" valign="top">
<table>
<tr>
<td class="dwsTableCell">  </td>
<td class="dwsTableCell">
<a name="1"/>
<a href="http://www.infoworld.com/article/03/07/18/28NNrss_1.html">Debate flares over Weblog standards</a>. Despite technical battles, Weblogs prepare to alter the collaboration and content management space</td>
</tr>
</table>
</td>
				</tr>
</table>
<p>
The second story, by Cathleen Moore, appears on InfoWorld's home page today.
</p>
</body>
</item> 

<item num="a750">
<title>Who would need a telephone?</title>
<date>2003/07/17</date>
<body>
<p>
In an essay on the difficulty of explaining RSS to folks in the Netherlands, where he lives, Adam Curry writes: 
</p>
<blockquote cite="Adam Curry">
Really Simple translates well, but there is no dutch equivalent for the word Syndication. The country is so small it hasn't ever needed syndicated content. TV and radio signals are nationwide, so there is no network of affiliates to speak of. [<a href="http://live.curry.com/stories/2003/07/13/rssInTheLowlands.html">Adam Curry's weblog</a>]
</blockquote>
<a href="http://photo2.si.edu/infoage/infoage.html">
<img align="right" alt="early phones" vspace="6" hspace="6" src="http://photo2.si.edu/infoage/bell1t.gif"/>
</a>
<p>
A fascinating observation. It reminds me of a talk once given by a Lotus exec, Larry Moore, who was trying to explain why Notes' biggest problem was that nobody understood what problem it solved. To illustrate by example, Moore told a story -- possibly apocryphal, but wonderful nonetheless -- about the early road shows evangelizing the telephone. To demonstrate the new technology, Moore said, the evangelists would unspool some copper wire, stretch it across the stage, connect two phones, and show how two people could talk to one another over the wire. But nobody got it. Any fool could plainly see that those two people could as easily shout across the stage to one another! Who would need a telephone? They didn't get that the wire could also stretch across continents and oceans!
</p>
</body>
</item> 

<item num="a749">
<title>Publishing, permanence, and transparency</title>
<date>2003/07/16</date>
<body>
<p>
<table align="right" cellspacing="0" cellpadding="6">
<tr>
<td>
<a href="http://www.nb.no/baser/schoyen/5/5.1/ms575.jpg">
<img alt="palimpsest" width="123" height="175" src="http://www.nb.no/baser/schoyen/5/5.1/ms575.jpg"/>
</a>
</td>
</tr>
<tr>
<td width="125">
<div class="realsmall">A <a href="http://www.iath.virginia.edu/elab/hfl0243.html">palimpsest</a> is a manuscript on which an earlier text has been effaced and the vellum or parchment reused for another.</div>
</td>
</tr>
</table>
Under heavy surveillance (which has now <a href="http://diveintomark.org/ww/">ceased</a>), Dave Winer reacted:
</p>
<blockquote cite="Dave Winer">
Now that people have set up a system to record everything on Scripting that I post within five minute intervals, I don't think I'll be writing any more of that stuff here. I guess it's time for weblogs to become like television. Polished and politically correct. Impersonal. Commercial. [<a href="http://scriptingnews.userland.com/2003/07/14#l08a5766959d139341772b0cbc0134695">Scripting News</a>]
</blockquote>
<p>
I understand and sympathize, but I think a bigger story is unfolding around us. Last year, I wrote an item entitled <a href="http://weblog.infoworld.com/udell/2002/02/19.html#a72">Walking the fault lines</a> about my experiences with SOAP and WSDL. Scripting News <a href="http://scriptingnews.userland.com/backissues/2002/02/19#l68d705dc391c89b3a837e674464f4300">picked up on it</a>. (This was the same posting that began my serendipitous association with an Indian programmer named Nishant S. [<a href="http://weblog.infoworld.com/udell/2002/02/20.html">1</a>, <a href="http://weblog.infoworld.com/udell/2003/03/19.html">2</a>].) Later that day, using the <a href="http://www.oreillynet.com/meerkat/">Meerkat</a> aggregator, I noticed there were two versions of Dave's commentary, and I wrote:
</p>
<blockquote cite="Jon Udell">
Meerkat captured two versions. I like them both.  When the Wayback Machine really gets cranking, we'll have to accept that all our revisions can be seen. This seems like it should be scary. But it doesn't seem to bother me much. Palimpsests are intriguing to read, and fun to write. [<a href="http://weblog.infoworld.com/udell/2002/02/19.html#a74">Jon's Radio</a>]
</blockquote>
<p>
<a href="http://radio.weblogs.com/0100887/images/my/meerkatTwoVersions.jpg">
<img alt="two versions" vspace="6" hspace="6" width="280" height="145" align="right" src="http://radio.weblogs.com/0100887/images/my/meerkatTwoVersions.jpg"/>
</a>
Still later on the same day, Dave was riffing on the theme of &quot;Internet 3.0&quot; and wrote:
</p>
<blockquote cite="Dave Winer">
Jon discovers an important feature of Internet 3.0. Real-time edits preserved for perpetuity. [<a href="http://scriptingnews.userland.com/backissues/2002/02/19#lc4d854b528bf8bea11885546a54a53ef">Scripting News</a>]
</blockquote>
<p>
I think it probably <i>is</i> a feature, but one that will have profound effects that we're all just beginning to come to terms with. Remember when, as a kid, you imagined that if you jumped in and out of the pool <i>really quick</i> you wouldn't get wet? Throughout most of the short life of Web publishing, we've been able to sort of pull that trick off. No more. Increasingly the Web sees and remembers everything. 
</p>
<p>
When Tim Bray <a href="http://www.tbray.org/ongoing/When/200x/2003/07/10/StillMovie">pointed</a> the other day to <a href="http://www.acmqueue.org/modules.php?name=Content&amp;pa=showpage&amp;pid=43">Jim Gray's musings</a> on the future of storage -- practically infinite capacity, sequential vs. random-access devices, log-structured filesystems, unconstrained archiving and versioning -- it reminded me of Ted Nelson's vision of Xanadu as the all-seeing, all-remembering, write-once, versioned memory of our species. 
</p>
<p>
The current debate about <i>depublishing</i> (<a href="http://www.tenreasonswhy.com/weblog/archives/2003/07/10/the_ethics_of_depublishing.html">1</a>, <a href="http://eepi.ubib.eur.nl/iliit/archives/000244.html">2</a>) also reminds me, yet again, of David Brin's seminal book <a href="http://weblog.infoworld.com/udell/2002/02/14.html#a65">The Transparent Society</a>. Brin argues that we'll be unable to prevent what happens in public spaces -- physical or virtual -- from being recorded, and that the best we can do is to assure equality of access to that data. 
</p>
<p>
Maybe Brin's wrong. Maybe we will find a technological substitute for the veil of practical obscurity that historically protected us from undue scrutiny. But while that may be possible in <a href="http://weblog.infoworld.com/udell/2003/06/21.html">some cases</a>, I suspect that in general Brin is right. We can still enjoy realms of privacy -- both physical and virtual -- but public acts will become part of an increasingly detailed and indelible public record. That will cause problems that have no technological solutions, only human ones. I can think of two. First, as with email, we're going to have to accept that what goes to the Web tends to stay there. Second, since we are all going to make mistakes, say things we wish we hadn't, and suffer the effects of software glitches, we're all going to have to learn to cut one another a lot more slack.
</p>
</body>
</item> 

<item num="a748">
<title>The Mozilla Foundation</title>
<date>2003/07/16</date>
<body>
<p>
I've been writing about Mozilla quite a lot lately (<a href="http://weblog.infoworld.com/udell/2003/07/02.html#a685">1</a>, <a href="http://www.infoworld.com/article/03/06/13/24OPstrategic_1.html">2</a>, <a href="http://weblog.infoworld.com/udell/2003/06/12.html#a720">3</a>, <a href="http://weblog.infoworld.com/udell/2003/06/06.html#a714">4</a>, <a href="http://weblog.infoworld.com/udell/2003/06/05.html">5</a>, <a href="http://weblog.infoworld.com/udell/2003/06/04.html">6</a>, <a href="http://weblog.infoworld.com/udell/2003/06/02.html">7</a>), now that Firebird has -- quite unexpectedly -- become my browser of choice on Windows, Mac OS X (Safari notwithstanding), and Linux. In a <a href="http://www.infoworld.com/article/03/06/13/24OPstrategic_1.html">recent column</a> I implored AOL to do the right thing by Mozilla, and it seems that is happening. Today AOL announced financial and logistical support for the newly-hatched <a href="http://www.mozillafoundation.org/press/mozilla-foundation.html">Mozilla Foundation</a>. Excellent!
</p>
<p>
Of course, the $2 million that AOL is tucking into Mozilla's pocket, as it sends the project out to make its own way in the world, is only a drop in the bucket. Mitch Kapor writes:
</p>
<blockquote cite="Mitch Kapor">
Now, Mozilla's fate is under its own control. AOL has given it a good send-off so there is enough in the way of resources to get going, but it's going to need to gather more financial support from corporations and others for its long-term future. [<a href="http://blogs.osafoundation.org/mitch/">Mitch Kapor's weblog</a>]
</blockquote>
<p>
My hunch is that Mozilla will now find the air supply it needs to keep going. I'm curious to see how Microsoft will respond. As has been often noted, the stated or implied plan to make IE 6 hold the fort until Longhorn's arrival in 2005 (or whenever) suddenly looks pretty shaky. I'd love to see Microsoft turn the competitive crank a notch, for example by pushing some of the InfoPath and Word 2003 XML technologies into the IE browser where everybody (well, everybody on Windows) can get at them. Six months ago there wouldn't have been a business case for doing that. I'd love to see Mozilla's success make that case, and in so doing create its own next great challenge.
</p>
</body>
</item> 

<item num="a747">
<title>Test-driven development</title>
<date>2003/07/15</date>
<body>
<p>
I've been researching the subject of test-driven development for an upcoming story. My sources include Kent Beck's excellent book, <a href="http://safari.oreilly.com/?XmlId=0-321-14653-0">Test-Driven Development by Example</a>, and interviews with <a href="http://weblog.infoworld.com/udell/2003/02/13.html">Ward Cunningham</a> (again!), <a href="http://www.testing.com/cgi-bin/blog">Brian Marick</a>, and others. As one who has yet to incorporate any of the <a href="http://www.xprogramming.com/software.htm">xUnit</a> family of <a href="http://c2.com/cgi/wiki?TestingFramework">testing frameworks</a> into my coding practice, I'm perhaps not the best person to tell this story. On the other hand, I do think I'm well suited to understand the message that test-first practitioners have been consistently delivering, and to broadcast it to a wider audience.
</p>
<p>
Test-first is quite radically different from test-later. And the most important difference I've discovered so far is that test-first, in the hands of people like Kent Beck and Ward Cunningham and Martin Fowler and the many others who practice the technique, is as much a tool for rationalizing the exploration of the problem domain, and sequencing the work of refactoring, as it is a way of protecting against regression. Those of you to whom this has already become second nature are probably yawning and thinking &quot;welcome to the club.&quot; But I'm pretty  sure this take on the purpose and benefit of test-first has not been adequately conveyed to a general audience of IT practitioners who regard &quot;extreme programming&quot; as, well, a bit extreme. It's going to be an interesting challenge!
</p>
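<p>
For readers who haven't seen the style, here is a minimal sketch of what writing the test first looks like, using Python's built-in <tt>unittest</tt> module (a member of the xUnit family). The <tt>Dollar</tt> class and its <tt>times</tt> method are illustrative names, loosely modeled on the money example in Beck's book rather than copied from it. The test is written first and pins down the desired behavior; the class that follows is the least code that satisfies it.
</p>

```python
import unittest


class DollarTest(unittest.TestCase):
    """Written first: states what we want before any implementation exists."""

    def test_multiplication(self):
        five = Dollar(5)
        # times() should return a new value and leave the original untouched.
        self.assertEqual(10, five.times(2).amount)
        self.assertEqual(15, five.times(3).amount)
        self.assertEqual(5, five.amount)


class Dollar:
    """Written second: just enough code to make the test above pass."""

    def __init__(self, amount):
        self.amount = amount

    def times(self, multiplier):
        return Dollar(self.amount * multiplier)
```

<p>
Saved as a module, this can be run with <tt>python -m unittest</tt>. The point of the ordering is that the third assertion (the original stays at 5) records a design decision, not just a regression check.
</p>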
</body>
</item> 

<item num="a746">
<title>Test Center programming survey</title>
<date>2003/07/14</date>
<body>
<p>
InfoWorld's Test Center is conducting another survey. This time the subject is programming. Tom Yager and I contributed questions, and then it occurred to me that readers of this weblog might like to weigh in on the questions we proposed, or to contribute others. If so, please <a href="http://radiocomments.userland.com/comments?u=100887&amp;p=746&amp;link=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2F2003%2F05%2F12.html%23a746">fire away</a>!
</p>
<p>
(Note: This is <i>not an actual survey</i>. It's an invitation to help us shape what the survey will ask.)
</p>
<hr align="center" width="25%"/>
<p>
 How much do you spend per year on internal software development (not including staff salaries)?
<div>   - ...range of choices...</div>
</p>
<p>
 What did you spend last year on outsourced development? 
<div>   - ...range of choices...</div>
</p>
<p>
 What will you spend in the next 12 months on outsourced development?
<div>   - ...range of choices...</div>
</p>
<p>
 Which frameworks or APIs do you and your engineers use for your projects (choose all that apply)?
<div>   -  J2EE</div>
<div>   -  Microsoft .Net</div>
<div>   -  Microsoft Win32/COM/DCOM or VB6</div>
<div>   -  Unix or Linux</div>
<div>   -  Macintosh Classic or OS X</div>
<div>   -  Common Gateway Interface (CGI)/HTML/xScript/DOM</div>
<div>   -  Other</div>
</p>
<p>
 The software you write is used by (choose as many as apply) 
<div>   -  Internal users</div>
<div>   -  External customers</div>
<div>   -  Partners</div>
</p>
<p>
 Which programming languages do you and your engineers use for development (choose as many as apply)?
<div>   -   Java</div>
<div>   -   C</div>
<div>   -   Visual Basic</div>
<div>   -   C++</div>
<div>   -   C#</div>
<div>   -   Perl</div>
<div>   -   PHP</div>
<div>   -   Python</div>
<div>   -   JavaScript/ECMAScript</div>
<div>   -   Unix shell scripts</div>
<div>   -   Other languages (e.g., Ruby, Eiffel, Oberon)</div>
</p>
<p>
 Which vendors supply your development tools? 
<div>   -   Borland</div>
<div>   -   Microsoft</div>
<div>   -   IBM</div>
<div>   -   Sun Microsystems</div>
<div>   -   BEA</div>
<div>   -   Oracle</div>
<div>   -   Other</div>
</p>
<p>
 Do you, or would you, permit your programmers to use some work time for open source projects?</p>
<p>
 Do you require new hires to be proficient in multiple programming languages and tools?</p>
<p>
 You tend to upgrade to new versions of tools or frameworks:
<div>   -   As soon as they become available</div>
<div>   -   Only when they solve specific problems or supply needed features</div>
<div>   -   Only for new projects</div>
<div>   -   Rarely</div>
<div>   -   Never</div>
</p>
<p>
 Which technologies are part of your server development (choose as many as apply)? 
<div>   -   Web services</div>
<div>   -   Business process management</div>
<div>   -   Web applications (Dynamic HTML, server-side scripting)</div>
<div>   -   XML</div>
<div>   -   XSLT</div>
<div>   -   Relational databases</div>
<div>   -   XML or object-oriented databases</div>
<div>   -   Clustering</div>
<div>   -   Monitoring and self-healing</div>
<div>   -   Object brokers</div>
<div>   -   Speech</div>
<div>   -   Other</div>
</p>
<p>
 What are the types of data stored and manipulated by the software you and your engineers write (choose as many as apply)? 
<div>   -  Relational</div>
<div>   -  XML</div>
<div>   -  Unstructured text</div>
<div>   -  Persistent objects</div>
<div>   -  PDF/spreadsheet/others (please specify)</div>
</p>
<p>
 Your preferred programming abstraction for dealing with data is:
<div>   -  Relational</div>
<div>   -  Object</div>
<div>   -  XML</div>
</p>
<p>
 Do you find dynamic languages (e.g., Python, Perl, VBScript) appropriate, neutral, or inappropriate for:
<div>   - Automation of build and test procedures</div>
<div>   - Data reduction and analysis</div>
<div>   - Consumption of components and/or Web services</div>
<div>   - Production of components and/or Web services</div>
<div>   - Production of user-facing Web applications</div>
<div>   - Production of GUI applications</div>
</p>
<p>
 How do you and your programmers write tests (choose as many as apply)?
<div>   -  Before coding</div>
<div>   -  While coding</div>
<div>   -  After coding</div>
<div>   -  Never</div>
</p>
<p>
 Are you satisfied with the level of software reuse you achieve?
<div>   -  Yes</div>
<div>   -  No</div>
<div>   -  Not sure</div>
</p>
<p>
 Rate the following strategies (0 = low benefit, 1 = some benefit, 2 = high benefit) used to package software for reuse:
<div>   -  Shared libraries (e.g., .so or .dll, Java class, .NET assembly)</div>
<div>   -  Components (e.g., COM objects, Java beans)</div>
<div>   -  <s>Shared scripts</s> dynamic-language (e.g., Python or Perl) modules</div>
<div>   -  Web services</div>
<div>   -  Other</div>
</p>
<p>
 The single biggest obstacle to reuse is:
<div>   -  Effort required to design software for reuse</div>
<div>   -  Programmer disinclination to package software for reuse</div>
<div>   -  Effort required to package software for reuse</div>
<div>   -  Lack of awareness of what software is available for reuse</div>
<div>   -  Effort required to learn and effectively apply software available for reuse</div>
<div>   -  Other</div>
</p>
<p>
 When your software fails to satisfy users, the reason is most likely:
<div>   -  Implementation of a flawed spec</div>
<div>   -  Flawed implementation of a correct spec</div>
<div>   -  Inability to integrate with other software</div>
<div>   -  Poor application response time</div>
<div>   -  Dissatisfaction with user interface</div>
<div>   -  Failure to evolve in a timely way as business needs change</div>
<div>   -  Other</div>
</p>
<p>
 When your software presents a user interface, your preferred technology is:
<div>   - Web-style (HTML/DHTML/JavaScript) </div>
<div>   - fat-client GUI (such as Windows, Flash, or Java/Swing)</div>
<div>   - thin-client (remote) GUI </div>
<div>   - Other</div>
</p>
<p>
 The ability to present a <s>user interface</s> single common GUI on various client platforms (Windows, Unix/Linux, Mac OS X) is:
<div>   -  Essential</div>
<div>   -  Important</div>
<div>   -  Not needed</div>
</p>
<p>
 Activities currently supported by your programming tools include (choose all that apply):
<div>   -  Edit / compile / link / debug</div>
<div>   -  Testing</div>
<div>   -  Documentation/modeling/diagramming</div>
<div>   -  Source control/version management</div>
<div>   -  Configuration/deployment management</div>
<div>   -  Logging/monitoring</div>
<div>   -  Issue tracking/bug database</div>
<div>   -  Business rules/workflow</div>
</p>
<p>
 The activities, not currently supported by your programming tools, for which you most need support (rank top 3) are:
<div>   -  Testing</div>
<div>   -  Documentation/modeling/diagramming</div>
<div>   -  Source control/version management</div>
<div>   -  Configuration/deployment management</div>
<div>   -  Logging/monitoring</div>
<div>   -  Issue tracking/bug database</div>
<div>   -  Business rules/workflow</div>
</p>
</body>
</item> 

<item num="a745">
<title>Core and periphery</title>
<date>2003/07/14</date>
<body>
<p>
<blockquote>
<i>
Successful evolution can't simply be a matter of choosing the right boundary because that's a moving target. Even at a given point in time, there are many ways to draw the &quot;right&quot; boundaries. Linux, for example, is a so-called monolithic system. It shunned the microkernel architecture that became fashionable years ago, yet it enjoys wild success. And while Windows and Mac OS X embrace the microkernel approach, nobody calls that the key to their success. In all these cases, other kinds of boundaries are being drawn, and other balances struck. There's no easy answer here, but I do have a hunch about what works best. [Full story at <a href="http://www.infoworld.com/article/03/07/11/27OPstrategic_1.html">InfoWorld.com</a>]
</i>
</blockquote>
</p>
</body>
</item> 

<item num="a744">
<title>The document is the database</title>
<date>2003/07/14</date>
<body>
<p>
<blockquote>
<i>
When you need to store and display a modest amount of structured or semistructured data, it's tempting to store it directly in an HTML file. I've used this strategy many times; undoubtedly you have too. The advantages and disadvantages of working directly with a presentation format are pretty clear. It's handy that the &quot;database&quot; is a self-contained package that can be updated using any text editor, emailed, read directly from a file system, or served by any web server. But it's awkward to share the work of updating with other people or to isolate and edit parts of the file as it grows. When we convert to a database-backed web application in order to solve these problems, we trade away the convenience of the file-oriented approach. Can we have our cake and eat it too? This month's column explores the idea that a complete web application can be wrapped around an XHTML document, using XSLT for search, insert, and update functions. [Full story at <a href="http://www.xml.com/pub/a/2003/07/09/udell.html">O'Reilly Network</a>]
</i>
</blockquote>
</p>
</body>
</item> 

<item num="a743">
<title>SpamBayes update</title>
<date>2003/07/11</date>
<body>
<p>
I've been comparing notes with <a href="http://weblog.infoworld.com/yager/">Tom Yager</a>, who notices lately that spammers' use of nonsense words, especially in Subject: headers, seems to be effective against the Bayesian filter in OS X's Mail.app. I checked, and <a href="http://www.infoworld.com/article/03/05/16/20TCspam_1.html">SpamBayes</a> is (so far) unaffected by this ploy. One of the cool things about SpamBayes is its ability to reveal how it analyzes messages. See below for its take on a message that has the Subject: line &quot;Jon nezinyunyane inflechies&quot; and a bunch of angle-bracketed garbage in the text. 
</p>
<p>
Evidently SpamBayes ignores all the garbage tags, but since these serve as word delimiters, it winds up seeing a bunch of word fragments -- like 'innov' and 'ative' -- which it finds suspicious. And as the spam counts indicate, it has seen these fragments before, so over time their discriminatory power should only grow, not diminish.
</p>
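<p>
The fragmenting effect is easy to reproduce. What follows is a rough sketch, not SpamBayes' actual tokenizer: it assumes only that anything between angle brackets is dropped and that the gap left behind acts as a word boundary. The names <tt>junk</tt> and <tt>tokenize</tt> are made up for the illustration.
</p>

```python
import re

# Angle brackets are written as escapes so this listing can sit inside markup as-is.
LT, GT = "\x3c", "\x3e"

# Matches one tag: an opening bracket, anything that is not a closing bracket, a closing bracket.
TAG = re.compile(LT + "[^" + GT + "]*" + GT)


def junk(name):
    """Wrap a nonsense word in angle brackets, as the spammer did."""
    return LT + name + GT


def tokenize(text):
    """Replace each tag with a space, then split on whitespace.

    The tag disappears, but the space left in its place still splits
    the word it was embedded in.
    """
    return TAG.sub(" ", text).lower().split()


body = ("innov" + junk("FkEhUrwj") + "ative product in Ame"
        + junk("SatnmBdYiP") + "rica today")
print(tokenize(body))
# ['innov', 'ative', 'product', 'in', 'ame', 'rica', 'today']
```

<p>
The fragments 'innov', 'ative', 'ame', and 'rica' all show up in the analysis below, which is what you'd expect if the real tokenizer behaves roughly this way.
</p>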
<p>
It's also fascinating to look at the handling of the giveaway phrase &quot;Multi-Trillion Dollar Market.&quot; &quot;Multi-Trillion&quot; does not appear on SpamBayes' list of interesting tokens, though &quot;dollar&quot; and &quot;market&quot; do. That SpamBayes makes no effort to correlate these adjacent words seems like an obvious limitation, and yet it is (so far) continuing to perform spectacularly well for me despite that.
</p>
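<p>
To make the single-word limitation concrete, here is a small sketch contrasting single-word tokens with adjacent-pair tokens. It illustrates the idea only; the function names are hypothetical and nothing here is taken from SpamBayes itself.
</p>

```python
def unigrams(text):
    """Single-word tokens: each word is scored on its own."""
    return text.lower().split()


def bigrams(text):
    """Adjacent-pair tokens: each word is joined with its right-hand neighbor."""
    words = unigrams(text)
    return [a + " " + b for a, b in zip(words, words[1:])]


phrase = "Join a Multi-Trillion Dollar Market."
print(unigrams(phrase))
# ['join', 'a', 'multi-trillion', 'dollar', 'market.']
print(bigrams(phrase))
# ['join a', 'a multi-trillion', 'multi-trillion dollar', 'dollar market.']
```

<p>
A pair such as 'dollar market.' is the kind of token a filter would need in order to treat the phrase as a unit. The price is a much larger vocabulary to train on.
</p>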
<p>
<b>Update</b>: As I keep forgetting for some reason, and as Giorgio  Valoti reminds me, Mail.app uses <a href="http://www.pacificavc.com/blog/2003/02/10.html#a78">latent semantic analysis</a>, not the Bayesian technique.
</p>
<a name="analysis"/>
<p>
Spam Score: 1
</p>
<pre>
word                                spamprob         #ham  #spam
'*H*'                               0                   -      -
'*S*'                               1                   -      -
'jon-'                              0.0313807          38      1
'from:addr:jonathan'                0.06584             3      0
'noheader:mime-version'             0.267816         3682   1332
'there'                             0.357648         1865   1027
'web'                               0.359379         1678    931
'noheader:reply-to'                 0.398404         8311   5444
'reply-to:none'                     0.398404         8311   5444
'your'                              0.607781         3493   5354
'now'                               0.609287         1198   1848
'header:Date:1'                     0.614892         5565   8789
'header:From:1'                     0.616075         5536   8787
'live'                              0.617098          227    362
'subject:Jon'                       0.628519          123    206
&quot;you've&quot;                            0.635875          294    508
'potential'                         0.637791          171    298
'header:Received:6'                 0.639839          738   1297
'url:com'                           0.643368         3651   6515
'must'                              0.651722          330    611
'year.'                             0.654359          135    253
'proto:http'                        0.657505         4086   7759
'area'                              0.657698          183    348
'break'                             0.668753           75    150
'skip:m 10'                         0.671444          690   1395
'header:Return-Path:1'              0.698156         3807   8710
'join'                              0.72811           152    403
'market.'                           0.736942           76    211
'serious'                           0.779225           64    224
'header:Message-Id:1'               0.782263         1220   4336
'sell'                              0.789965           88    328
'life'                              0.850081          110    618
'walls'                             0.86821            15     99
'url:htm'                           0.870883          144    962
'to:addr:jon'                       0.871895           87    587
'url:index'                         0.877064          152   1074
'dollar'                            0.917399           19    211
'url:173'                           0.934783            0      3
'unique,'                           0.949503            5     97
'subject:\\xe9'                      0.95032             1     23
'pac'                               0.96723             3     94
'independence'                      0.969427            3    101
'earn'                              0.969434           11    352
'000'                               0.971088            3    107
'url:61'                            0.974407            1     46
'url:133'                           0.97619             0      9
'margin'                            0.977151            2     94
'$100,'                             0.977616            2     96
'url:41'                            0.981928            0     12
'ing'                               0.982844            2    126
'infor'                             0.987666            1     97
'ailability.'                       0.994822            0     43
'rofit'                             0.994822            0     43
'ative'                             0.994938            0     44
'innov'                             0.994938            0     44
'aking'                             0.99505             0     45
'busin'                             0.99505             0     45
'ited'                              0.99505             0     45
'pture'                             0.99505             0     45
'azing.'                            0.995156            0     46
'lim'                               0.995156            0     46
'ame'                               0.99545             0     49
'che'                               0.99545             0     49
'message-id:@vampiress.zzn.com'     0.995627            0     51
'from:addr:vampiress.zzn.com'       0.99579             0     53
'rica'                              0.996014            0     56
'ess'                               0.996937            0     73
'amearn'                            0.997592            0     93
'amed'                              0.997592            0     93
'fina'                              0.997592            0     93
'kage.'                             0.997592            0     93
'ncial'                             0.997592            0     93
'ney!'                              0.997592            0     93
'dre'                               0.997667            0     96
'mation'                            0.997738            0     99
'to:name:jon'                       0.998132            0    120
'to:addr:songline.com'              0.998351            0    136
</pre>
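<p>
Part of the spamprob column can be reproduced from the counts. For tokens never seen in ham, the listed values are consistent with a smoothed estimate that pulls a raw probability toward a neutral prior, more weakly as evidence accumulates. The sketch below assumes a prior of 0.5 with a strength of 0.45; those two values are my assumption, chosen because they match the figures, and the corpus totals passed in are placeholders (for spam-only tokens they cancel out).
</p>

```python
def spamprob(ham_count, spam_count, total_ham, total_spam, s=0.45, x=0.5):
    """Smoothed spam probability for one token.

    s (strength) and x (prior) are assumed values, not taken from the post.
    total_ham / total_spam are the number of trained messages of each kind.
    """
    n = ham_count + spam_count
    if n == 0:
        return x  # never seen: fall back to the prior
    ham_ratio = ham_count / total_ham
    spam_ratio = spam_count / total_spam
    p = spam_ratio / (ham_ratio + spam_ratio)  # raw estimate from the counts
    return (s * x + n * p) / (s + n)           # pulled toward x when n is small


# Placeholder totals; any positive values give the same result when ham_count is 0.
TOTAL_HAM, TOTAL_SPAM = 10000, 10000

for token, ham, spam in [("innov", 0, 44), ("url:173", 0, 3), ("url:41", 0, 12)]:
    print(token, round(spamprob(ham, spam, TOTAL_HAM, TOTAL_SPAM), 6))
# innov 0.994938
# url:173 0.934783
# url:41 0.981928
```

<p>
This is why a fragment seen 44 times scores higher than one seen 3 times even though neither has ever appeared in ham: more sightings leave less room for the prior. Tokens seen in both ham and spam depend on the real corpus totals, which aren't shown, so the sketch can't check those rows.
</p>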
<p>Message Stream:</p>
<pre>
Date: Thu, 10 Jul 2003 21:08:32 -0800
Subject: Jon nezinyunyane inflechies
X-Sender: Cole Nickol &lt;jonathan@vampiress.zzn.com&gt;
&lt;ioIVkQkR&gt;&lt;OlpyCsaP&gt;&lt;MSvpk&gt;
&lt;dHEnke&gt;&lt;LmKTnRTowC&gt;&lt;rrRsACDfT&gt;
&lt;oxwCx&gt;&lt;fRmtryOnGq&gt;&lt;UUkBB&gt;
&lt;XCKfvJ&gt;&lt;cNPVNIa&gt;&lt;qOaXN&gt;
&lt;pJIdrYdnW&gt;&lt;ipvub&gt;&lt;TigRpgXSHL&gt;
&lt;jOMnGnk&gt;&lt;fAdDuDwhKg&gt;&lt;vtWrFAnErq&gt;
&lt;kSYDLiOeO&gt;&lt;PMcHf&gt;&lt;lmkJTSd&gt;
&lt;JRFwXqJrU&gt;&lt;aFFNaxP&gt;&lt;PcdrQ&gt;
&lt;interesses&gt;&lt;nenahospodari&gt;&lt;interesses&gt;
&lt;p&gt;&lt;font face=&quot;Trebuchet MS&quot;&gt;Jon-&lt;/font&gt;&lt;/P&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt; &lt;nenahospodari&gt;&lt;interesses&gt;&lt;/p&gt;
&lt;p&gt;&lt;font face=&quot;Trebuchet MS&quot;&gt;Ca&lt;NLsQcKSV&gt;pture Your Dre&lt;bSVhiD&gt;amEarn
Fina&lt;interesses&gt;ncial Independence&lt;/font&gt;&lt;/p&gt;
&lt;p&gt;&lt;font face=&quot;Trebuchet MS&quot;&gt;You can now for the first
time,&lt;nenahospodari&gt;
&lt;BJxQNXNYyQ&gt;own a busin&lt;Pqktbw&gt;ess in your area with the most unique,
&lt;KvynRHwW&gt;innov&lt;FkEhUrwj&gt;ative product in Ame&lt;SatnmBdYiP&gt;rica today. Work
le&lt;mQvuxWo&gt;ss a week with the potential to earn
$100,&lt;interesses&gt;000 a year. There is no sell&lt;QgBARjeQ&gt;ing and not
ML&lt;gtdyhs&gt;M. Join a Multi-Trillion Dollar Market.&lt;/font&gt;&lt;/p&gt;
&lt;p&gt;&lt;font face=&quot;Trebuchet MS&quot;&gt;The p&lt;KChHi&gt;rofit margin is
am&lt;SIGOVYIdkA&gt;azing.&lt;nenahospodari&gt;&lt;/font&gt;&lt;/p&gt;
&lt;p&gt;&lt;font face=&quot;Trebuchet MS&quot;&gt;&lt;nenahospodari&gt;Break down the walls and live
this life you've only dre&lt;interesses&gt;amed about.&lt;/font&gt;&lt;/p&gt;
&lt;p&gt;&lt;font face=&quot;Trebuchet MS&quot;&gt;Lim&lt;QkWkUgv&gt;ited av&lt;UPeOWb&gt;ailability.
for Your Fr&lt;interesses&gt;ee infor&lt;RCbUkNPVg&gt;mation
pac&lt;afkLWbCAa&gt;kage.&lt;/font&gt;&lt;/p&gt;
&lt;p&gt;&lt;font face=&quot;Trebuchet MS&quot;&gt;&lt;oCgNBti&gt;&lt;a
href=&quot;http://61.173.41.133/3a22147895.com/index.htm&quot;&gt;-web-site-&lt;/a&gt;&lt;/font&gt;
&lt;/p&gt;
&lt;p&gt;&lt;nenahospodari&gt;&lt;/p&gt;
&lt;p&gt;&lt;font face=&quot;Trebuchet MS&quot;&gt;Y&lt;rcMWMN&gt;ou must che&lt;bkvKm&gt;ck this out if you
are serious about m&lt;TQYVM&gt;aking mo&lt;interesses&gt;ney!&lt;/font&gt;&lt;/p&gt;
&lt;nenahospodari&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;font face=&quot;Trebuchet MS&quot;&gt;O&lt;interesses&gt;pt o&lt;yxXafJRXuQ&gt;ut at
web&lt;interesses&gt;site&lt;interesses&gt;&lt;/font&gt;&lt;/p&gt;
</pre>
</body>
</item> 

<item num="a742">
<title>The starship and the canoe</title>
<date>2003/07/11</date>
<body>
<p>
<a href="http://www.amazon.com/exec/obidos/asin/0060910305/">
<img align="right" vspace="6" hspace="6" src="http://images.amazon.com/images/P/0060910305.01.MZZZZZZZ.jpg"/>
</a>
I am unfortunately not able to be at this year's open source conference, but I've been reading about it at the <a href="http://www.oreillynet.com/weblogs/">O'Reilly Network</a> site and also on <a href="http://www.windley.com/">Phil Windley's site</a> where he is, as has become his habit, doing a spectacular job of blogging the show.
</p>
<p>
Today Phil writes about <a href="http://www.windley.com/2003/07/11.html#a729">a talk by George Dyson</a> featuring materials from the archives of the Institute for Advanced Study. I'd love to have been there! 
</p>
<p>
If you've never read Kenneth Brower's <a href="http://www.amazon.com/exec/obidos/asin/0060910305/">The Starship and the Canoe</a>, by the way, it's a real treat. First published 25 years ago (!), it's a father/son biography of Freeman Dyson and George Dyson. The &quot;starship&quot; in the title refers to <a href="http://www.wikipedia.org/wiki/Nuclear_pulse_propulsion">Project Orion</a>, an effort by Freeman Dyson and <a href="http://www.wikipedia.org/wiki/Ted_Taylor">Ted</a> <a href="http://www.sondra.net/concerns/ttspeech.htm">Taylor</a> to produce a nuclear-bomb-propelled interplanetary vessel. The &quot;canoe&quot; refers to a seagoing kayak that George, then a tree-dwelling rebel living in British Columbia, was building. I wonder how the Dysons felt about the book then, and I wonder how they feel about it now. 
</p>
</body>
</item> 

<item num="a741">
<title>ACM Queue</title>
<date>2003/07/11</date>
<body>
<p>
Tim Bray's <a href="http://www.tbray.org/ongoing/When/200x/2003/07/10/StillMovie">videoblogging experiment today</a> points to a wonderful <a href="http://www.acmqueue.org/modules.php?name=Content&amp;pa=showpage&amp;pid=43">Conversation between Dave Patterson and Jim Gray</a>. Tim writes:
</p>
<blockquote cite="Tim Bray">
Patterson and Gray are both pretty famous in our profession, but neither is as famous as he deserves to be.
</blockquote>
<p>
I'll second that. I once got Jim (along with Jeri Edwards) to write <a href="http://www.byte.com/art/9504/sec11/art3.htm">an article for BYTE</a> that's still a great introduction to the subject of TP monitors. 
</p>
<p>
The interview Tim points to is featured in a magazine I'd not heard of before, <a href="http://www.acmqueue.org/">ACM Queue</a>, which seems to have started quite recently (March 2003). The articles appear online, but I couldn't find an RSS feed, so I <a href="http://weblog.infoworld.com/udell/gems/acmqueue.py.txt">made</a> <a href="http://weblog.infoworld.com/udell/gems/acmqueue.xml">one</a>. Ah. Life is good!
</p>
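<p>
The feed-writing half of such a script is short. Here is a minimal sketch, not the script linked above: it assumes the article titles and links have already been scraped into a list (the scraping is site-specific and omitted), and it emits a bare-bones RSS 2.0 document. The function name and the sample item link are placeholders.
</p>

```python
import xml.etree.ElementTree as ET


def build_rss(title, link, description, items):
    """Return an RSS 2.0 document as a string.

    items is a list of (title, link) pairs, assumed to have been
    collected already by whatever scraping the site requires.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    for tag, text in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, tag).text = text
    for item_title, item_link in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = item_link
    # ElementTree escapes markup characters in text, so titles and URLs stay well-formed.
    return ET.tostring(rss, encoding="unicode")


feed = build_rss(
    "ACM Queue",
    "http://www.acmqueue.org/",
    "Unofficial feed of ACM Queue articles",
    [("Conversation between Dave Patterson and Jim Gray",
      "http://example.org/placeholder-article-link")],  # placeholder link
)
print(feed)
```

<p>
Building the tree with a library rather than pasting strings together matters here, since article URLs on that site carry query-string separators that must be escaped in XML.
</p>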
</body>
</item> 

<item num="a740">
<title>Wrappers, injectors, and writing tools</title>
<date>2003/07/10</date>
<body>
<p>
<a href="http://www.jasa.net.au/quillpen.htm">
<img vspace="6" hspace="6" src="http://www.jasa.net.au/images/scribe.gif" align="right"/>
</a>
I gather that <a href="http://www.tnl.net/channels/rss2necho/http://weblog.infoworld.com/udell/rss.xml">this way</a> of representing my RSS feed is ready to <a href="http://tbray.org/ongoing/When/200x/2003/07/09/PieSchema01">declare victory</a> over <a href="http://weblog.infoworld.com/udell/rss.xml">this way</a>. Wake me up when it's over. At the end of the day, any XML metadata wrapper around the content of our blog entries will do the job, and it's trivial to transform one flavor of wrapper to another. If there were no legacy to consider, it'd be a toss-up as to which I'd prefer. Since there is a legacy, I'd rather preserve it, but that's a complicated matter about which too much has been said, and I'm only one of many voices.
</p>
<p>
Similarly, I gather that <a href="http://wellformedweb.org/news/5">this way</a> of injecting an item into blogspace is preparing to declare victory over <a href="http://www.xmlrpc.com/metaWeblogApi">this way</a> or <a href="http://www.blogger.com/developers/api/1_docs/">this way</a>. Again, wake me up when it's over. Were there no legacy to consider, I'd much prefer the new approach. I like its RESTian purity, though I'd also be open to a SOAP variant that could optionally leverage all of the authorization and routing machinery that's finding its way into SOAP headers. Of course there is a legacy here as well, and in this case, it seems to <a href="http://www.intertwingly.net/wiki/pie/XmlRpcDiscussion">carry some weight</a>.
</p>
<p>
Oddly, despite all my blogging, I seem to depend on none of these injector APIs. When the &quot;Blog This&quot; bookmarklet first surfaced I used it for a while, but soon lost interest. I never wanted to blog just a link with a sentence of description. I always wanted to write something more substantial. The same holds true for comments, another major use case for the injector APIs. No matter which API wins, we will still -- so far as I can see -- be dumped into HTML TEXTAREA widgets to compose the content that is the ultimate purpose of all this blogging. Isn't it?
</p>
<p>
So while wiser heads than I debate the pros and cons of various wrapper and injector strategies, I've decided to try to achieve some forward motion on a different tack. For reasons and in ways I've recently been demonstrating, blog content would be a lot more valuable if it were easier for non-emacs-using civilians to write XHTML. I'm particularly interested in finding ways to relate the style vocabulary of a standard wordprocessor, which is the only kind of granular metadata that people will consistently apply, to an emergent semantic vocabulary in blogspace. But that's a long-term thing. In the short term, I'm just looking for ways to empower regular folks to create well-formed blog content.
</p>
<p>
My first experimental subject will be Word 2003. I've just received the &quot;beta refresh&quot; so this is a good time to explore whether it's practical to write in WordML (Word 2003's XML vocabulary) and then transform to XHTML. If you've already gone down that path and have experiences and/or XSLT code to share, I'd love to hear from you.
</p>
</body>
</item> 

<item num="a739">
<title>Voltage Security's identity-based encryption</title>
<date>2003/07/08</date>
<body>
<blockquote cite="John Markoff">
Today, in order to exchange a secure electronic message it is necessary for the sender to go to a directory to find the recipient's public key. That public key is then used to scramble the message in such a way that only the recipient can read it by using a second, privately held key to unscramble it. [<a href="http://www.nytimes.com/2003/07/07/technology/07CODE.html?ex=1372910400&amp;en=78010b33326549db&amp;ei=5007&amp;partner=USERLAND">A Simpler, More Personal Key to Protect Online Messages, John Markoff, New York Times</a>]
</blockquote>
<p>
False. Today, I attach S/MIME signatures to all my outbound messages. The signature includes my public key. Thousands of recipients of messages from me could choose to encrypt messages to me, since they already have my public key. Of course, no-one does. Conversely, all six of the folks who have ever attached an S/MIME signature to a message sent to me have, by doing so, inserted their public keys into my mail client's keystore. I rarely choose to encrypt messages to one of these people, but when I do there is no need to look up a public key in a directory. I already have it.
</p>
<p>
You can't blame John Markoff for this error. Essentially no-one understands the suite of S/MIME signature and encryption capabilities built into the major email clients. Anyway, the real benefit of identity-based encryption, according to <a href="http://www.voltagesecurity.com/technology/ibe.htm">Voltage Security's writeup</a>, is different, and interesting. It purports to do away with the need for certificates and certification authorities.
</p>
<p>
As detailed in its <a href="http://www.voltage.com/pdf/VoltagePlatformTechOverview.pdf">white paper</a>, the Voltage scheme proposes to:
</p>
<ol>
<li>
<p>Use email addresses, IM screennames, or other short mnemonic identity handles as public keys.</p>
</li>
<li>
<p>Bind, to your email and IM applications, a module that retrieves private keys from a key server.</p>
</li>
</ol>
<p>
What's most attractive here is that users need not &quot;pre-enroll&quot; with a certification authority -- something that users have so far avoided like the plague. 
</p>
<p>
I see a few problems. First, it supposes that a sender would choose to encrypt a message if the procedure's barrier to entry were lower. This fails the <a href="http://www.ozzie.net/">Ray Ozzie</a> test for &quot;complacency-immunity&quot; -- the sender must still choose to encrypt, and given a choice almost none will.
</p>
<p>
Second, the scheme works bidirectionally only if both parties use Voltage's key server.
</p>
<p>
Then there's the question of what widespread encryption would do to an email infrastructure that has, rather suddenly, grown to depend on content analysis for spam detection.
</p>
<p>
A related issue: although we've never exploited it, the peer-to-peer nature of S/MIME or PGP key exchange creates an implicit whitelist. Today you are only likely to receive an encrypted email from someone you've previously written to. Imagine what might happen if spammers needed only your email address -- which they obviously already have -- to encrypt messages to you. 
</p>
<p>
I could be wrong; maybe this really is the silver bullet. But, like <a href="http://radio.weblogs.com/0125664/2003/07/07.html#a23">Pito Salas</a>, I am skeptical.
</p>
<p>
<b>Update</b>: Douwe Osinga writes:
</p>
<blockquote cite="Douwe Osinga">
On the other hand, encrypting a message is a relatively time
consuming thing to do. It would make sending out millions of
messages much harder, because each would have to be encrypted separately. 0.1 sec per message would be okay for normal users, but if you need to send out 1 million messages to make one sale, it pretty much destroys your model.
</blockquote>
<p>
Excellent point. It would be a kind of <a href="http://www.cypherspace.org/hashcash/">hashcash</a>, wouldn't it?   
</p>
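<p>
Douwe's numbers are easy to multiply out. The sketch below does only that; the 0.1-second cost and the million-message campaign are his illustrative figures, not measurements, and <tt>encryptionCost</tt> is a name made up for this example.
</p>

```javascript
// Back-of-envelope cost of encrypting each message separately.
// Both inputs are the illustrative figures from Douwe's note (assumptions).
function encryptionCost(messages, secondsPerMessage) {
    var totalSeconds = messages * secondsPerMessage;
    return {
        seconds: totalSeconds,
        hours: totalSeconds / 3600,
        days: totalSeconds / 86400
    };
}

var cost = encryptionCost(1000000, 0.1);
console.log(cost.hours.toFixed(1) + " hours of computation per million messages");
```

<p>
At 0.1 seconds apiece, a million messages come to 100,000 seconds, or roughly 27.8 hours of computation on a single machine, for what Douwe supposes is one sale.
</p>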
</body>
</item> 

<item num="a738">
<title>Tweedledum, Tweedledee, and active intermediaries</title>
<date>2003/07/07</date>
<body>
<blockquote>
<i>
<p>
<a href="http://www.sabian.org/Alice/lgchap04.htm">
<img hspace="6" vspace="6" align="right" src="http://www.sabian.org/Alice/lg18.gif"/>
</a>
The nascent Web services industry has so far focused mainly on the technical implications of these active intermediaries. They do make it vastly easier to integrate systems that pass around packets of self-describing data. But the reasons for this go beyond the regularity of XML data and the ubiquity of tools that can parse, search, and transform it. XML data flows fundamentally alter the political landscape of IT, shifting the locus of control away from the service endpoints and into the fabric of the network itself.
</p>
<p>
Closed systems that use proprietary APIs and speak binary protocols are a recipe for finger-pointing. &quot;I can't adjust the discount until Tweedledum upgrades the purchasing module,&quot; says Tweedledee. &quot;Contrariwise,&quot; says Tweedledum, &quot;I don't control that logic. My hands are tied.&quot;
</p>
<p>
We've all been on both sides of this dispute, with no Alice to point out that we are only fighting over a rattle. Web services, however, can take us through the looking glass, ending the blame games to reveal the truth. 
</p>
[Full story at <a href="http://www.infoworld.com/article/03/07/03/26FEpipelinecontrol_1.html">InfoWorld.com</a>]
</i>
</blockquote>
<p>
See also Phil Windley's <a href="http://www.infoworld.com/article/03/07/04/26FEpipeline_1.html?s=tc">Pipelining to connect IT infrastructure</a>, in which he sums up what he learned while evaluating <a href="http://www.infoworld.com/article/03/06/06/23TCsonic_1.html">Sonic ESB</a>, <a href="http://www.infoworld.com/article/03/04/11/15grand_1.html">Grand Central</a>, and <a href="http://www.infoworld.com/article/03/07/03/26TCconfluent_1.html">Confluent</a>.
</p>
</body>
</item> 

<item num="a737">
<title>The network song</title>
<date>2003/07/05</date>
<body>
<blockquote>
<i>
The first server I connected to the Internet sat on the floor of my office, close enough so I could hear -- and feel -- its response to heavy load. It seems weird to admit that I relied on those sensory cues, but I've talked to enough system administrators to know I'm not alone. The sounds of a working machine enable the pattern recognition engine in your brain to create a baseline -- and to detect deviations from it -- in ways that are effortless, automatic, and incredibly efficient. [Full story at <a href="http://www.infoworld.com/infoworld/article/03/07/03/26OPstrategic_1.html">InfoWorld.com</a>]
</i>
</blockquote>
<p>
My title for this column was &quot;Listening to my server&quot; but InfoWorld's CEO Kevin McKean came up with a better one: &quot;<a href="http://www.infoworld.com/article/03/07/03/26OPeditor_1.html">The network song</a>.&quot; In his column, Kevin points out that InfoWorld.com is now using a <a href="http://tinyurl.com/">TinyUrl-like</a> scheme to compress URLs printed in the magazine. Excellent! Since moving my column from its online-only venue to its current position in the printed magazine and online, I've been frustrated by the need to curtail the amount of linking I do. This will help a lot.
</p>
</body>
</item> 

<item num="a736">
<title>In search of XHTML guidance</title>
<date>2003/07/02</date>
<body>
<p>
<a href="http://www.higher-guidance.com.au/">
<img align="right" width="225" height="200" src="http://www.higher-guidance.com.au/assets/HGlogo.jpg" alt="higher guidance"/>
</a>
I've got yesterday's <a href="http://weblog.infoworld.com/udell/gems/tagPractices.html">example</a> working in Mozilla now as well, thanks to <a href="http://www.google.com/search?q=bob%20clary">Bob Clary</a>, who wrote to say:
</p>
<blockquote cite="Bob Clary">
There are two issues in Mozilla, Firebird and other Gecko-based browsers which cause your example to fail.
<br/>
<br/>
1. There is a Cross Domain issue in terms of accessing the content on the Tag site. Not only are you reading content you are actually writing to the DOM cross domain. To support this you need to enable privileges to allow cross domain read and write. 
<br/>
<br/>
Add the following to the MOZtransform function after the beginning of the function.
<br/>
<br/>
netscape.security.PrivilegeManager.enablePrivilege('UniversalBrowserRead UniversalBrowserWrite');
<br/>
<br/>
If the user does not grant permission it will throw an exception so you probably want it in a try catch block.
<br/>
<br/>
You will need to enable the ability to override security. I am not sure how to do that in Firebird but in Mozilla and Netscape 7.1 you can either add the required preference in about:config or install the user.xpi from http://devedge.netscape.com/toolbox/examples/2003/CSpider/
<br/>
<br/>
http://devedge.netscape.com/toolbox/examples/2003/CSpider/user.xpi
<br/>
<br/>
which adds the preference user_pref(&quot;signed.applets.codebase_principal_support&quot;, true); 
<br/>
<br/>
2. The second issue is that the document http://www.w3.org/TR/2003/WD-webarch-20030627/ is NOT XML. It is HTML as you can see by the Content Type returned by the server. xml.load actually results in an EMPTY xml document since the source URI is HTML. There are ways to get around this by using XMLHttpRequest and forcing the Content Type.
<br/>
<br/>
This is a common issue in the current use of XHTML served as HTML. Complain to the Tag that they should be serving XHTML as application/xhtml+xml or as text/xml but not as text/html!
</blockquote>
<p>
Thanks Bob! After invoking http://devedge.netscape.com/toolbox/examples/2003/CSpider/user.xpi and adding the enablePrivilege call to the script, I used this approach to load the XML:
</p>
<pre class="code" lang="javascript">
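try
    {
    // Fetch the document synchronously; xmlurl is set elsewhere in the script.
    var myXMLHTTPRequest = new XMLHttpRequest(); 
    myXMLHTTPRequest.open(&quot;GET&quot;, xmlurl, false); 
    myXMLHTTPRequest.send(null);                 
    // The server labels the document text/html, so parse the raw text as XML here.
    var xmltext = myXMLHTTPRequest.responseText;          
    var xml = document.implementation.createDocument(&quot;&quot;, &quot;xml&quot;, null);
    var objDOMParser = new DOMParser();
    var objDoc = objDOMParser.parseFromString(xmltext, &quot;text/xml&quot;);
    // Empty the placeholder document, then copy in the parsed nodes.
    while (xml.hasChildNodes())
        xml.removeChild(xml.lastChild); 
    for (var i=0; i &lt; objDoc.childNodes.length; i++)
        {
        var objImportedNode = xml.importNode(objDoc.childNodes[i], true);
        xml.appendChild(objImportedNode);
        }
    }
catch (e)
    {
    // Reached if permission is refused or the fetch or parse fails.
    alert(&quot;Could not load &quot; + xmlurl + &quot;: &quot; + e);
    }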
</pre>
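<p>
Bob also mentions &quot;forcing the Content Type.&quot; I haven't tried that route, but a minimal sketch might look like the following. It relies on the <tt>overrideMimeType</tt> method of Mozilla's XMLHttpRequest; the function name <tt>loadAsXml</tt> and the idea of passing the constructor in as a parameter are my own inventions, the latter so that the function can be exercised outside a browser.
</p>

```javascript
// Fetch a URL and ask the browser to treat the response as XML even when
// the server labels it text/html. XHR is normally XMLHttpRequest; it is a
// parameter here only so the sketch can be tried with a stand-in object.
function loadAsXml(url, XHR) {
    var request = new XHR();
    request.open("GET", url, false);       // synchronous, like the script above
    request.overrideMimeType("text/xml");  // has to be called before send()
    request.send(null);
    return request.responseXML;            // the parsed document, if parsing worked
}
```

<p>
In Mozilla the call would be <tt>loadAsXml(xmlurl, XMLHttpRequest)</tt>. The cross-domain privilege is still needed; this changes only how the response is parsed.
</p>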
<table align="right" cellspacing="0" cellpadding="6">
<tr>
<td>
<a href="http://weblog.infoworld.com/udell/gems/tagPracticesFirebird.jpg">
<img hspace="6" src="http://weblog.infoworld.com/udell/gems/tagPracticesFirebird.jpg" width="300" height="275" alt="xpath local search in firebird"/> </a>
</td>
</tr>
<tr>
<td>
<div align="center" class="realsmall">xpath local search in firebird</div>
</td>
</tr>
</table>
<p>
The security stuff (in either browser) is a bit scary, but note that it is not central to what's being illustrated in my example. Had I asked for and received permission to serve the TAG's document from my site, or were the viewer being served from the TAG's site, there would be no cross-domain conflict.
</p>
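<p>
Bob's advice to put the privilege request in a try/catch block could be sketched roughly as follows. The privilege string is the one from his message; the function name <tt>requestCrossDomainAccess</tt> is hypothetical, and the privilege manager is passed in as a parameter so the sketch does not assume the Mozilla-only global exists.
</p>

```javascript
// Ask for cross-domain read/write privileges and report the outcome.
// In Mozilla, privilegeManager would be netscape.security.PrivilegeManager.
function requestCrossDomainAccess(privilegeManager) {
    try {
        privilegeManager.enablePrivilege("UniversalBrowserRead UniversalBrowserWrite");
        return true;
    } catch (e) {
        // Per Bob's note, an exception means permission was not granted.
        return false;
    }
}
```

<p>
A caller can then skip the cross-domain fetch, or explain what went wrong, when the function returns <tt>false</tt>.
</p>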
<p>
As to the issue about which content-type to use, I am no expert but Bob's message reminded me of Ian Hixie's note, <a href="http://www.hixie.ch/advocacy/xhtml">Sending XHTML as text/html Considered Harmful</a>. The upshot of this very well-reasoned brief is stated at the beginning:
</p>
<blockquote cite="Ian Hixie">
It is suggested that XHTML delivered as text/html is broken and XHTML delivered as text/xml is risky, so authors intending their work for public consumption should stick to HTML 4.01.
</blockquote>
<p>
I have long been fascinated by the notion that XHTML can be a best-of-both-worlds approach, combining the immediate accessibility of HTML (i.e., no transformation needed to read it in a browser) with the advanced capabilities of XML (e.g., the kinds of dynamic XPath-driven views my example illustrates). It seems to be unclear, at the moment, how or even whether to use XHTML in this best-of-both-worlds way. I will be on the lookout for guidance.
</p>
</body>
</item> 

<item num="a735">
<title>Specification and emergence</title>
<date>2003/07/01</date>
<body>
<p>
<table align="right" cellspacing="0" cellpadding="6">
<tr>
<td>
<a href="http://weblog.infoworld.com/udell/gems/tagPractices.gif">
<img hspace="6" src="http://weblog.infoworld.com/udell/gems/tagPractices.gif" width="225" height="300" alt="www architecture: local search"/> </a>
</td>
</tr>
<tr>
<td>
<div align="center" class="realsmall">
<a href="http://weblog.infoworld.com/udell/gems/tagPractices.html">Dynamic views</a> of the <br/> <a href="http://www.w3.org/TR/2003/WD-webarch-20030627/">WWW Architecture document</a>
</div>
</td>
</tr>
</table>
The <a href="http://www.w3.org/2001/tag/">WWW Technical Architecture Group</a> (TAG) has been gradually codifying a bunch of principles and best practices in a document entitled <a href="http://www.w3.org/TR/2003/WD-webarch-20030627/">The Architecture of the World Wide Web</a>. One of the authors, Tim Bray, asks:
</p>
<blockquote cite="Tim Bray">
<b>Mystery</b> Why, I wonder, did nobody ever get around to writing this stuff down in one place before? [<a href="http://tbray.org/ongoing/When/200x/2003/06/30/WebArch20030627">ongoing</a>]
</blockquote>
<p>
It does, admittedly, seem odd. Roy Fielding's influential thesis, for example, which kicked off the RESTian reinterpretation, did not appear until 2000. Maybe this statement, from the TAG's own document, says why:
</p>
<blockquote cite="WWW TAG">
The architecture described in this document is principally the result of experience.
</blockquote>
<p>
In any architecture, the interplay between what is specified and what emerges is subtle and complex, far beyond my ability to understand. For example, when I noted that the categorization of points within the document (i.e., what is a &quot;constraint,&quot; &quot;principle,&quot; or &quot;practice&quot;) is still <a href="http://www.w3.org/TR/2003/WD-webarch-20030627/#app-principles">in flux</a>, I wondered how to visualize that. The result -- another one of my recent series of XPath search experiments -- is shown in the screenshot. If you're an IE user, you can try it <a href="http://weblog.infoworld.com/udell/gems/tagPractices.html">here</a>, after visiting Tools -&gt; Internet Options -&gt; Security -&gt; Internet -&gt; Custom Level -&gt; Miscellaneous -&gt; Access data sources across domains (shouldn't there be <i>URLs</i> for these things?), then switching the default ('Disable') to, I would recommend, 'Prompt'. The reason, by the way, is that the page, which loads from my site, has a script that fetches the TAG document from its site, so it's cross-domain access. (If you're a Firebird user (as I also am), maybe you can help me sort out why <a href="http://weblog.infoworld.com/udell/gems/tagPractices.html">this</a> doesn't work but <a href="http://weblog.infoworld.com/udell/misc/oscom/search.html">this</a> does. It's not just about cross-domain access, I don't think.)
</p>
<p>
I'm really excited by this way of creating dynamic views of documents in the browser. It's possible, in this case, because of some other kinds of best practices: the document is written in XHTML, and its style tags are used in a consistent way. Having worked through a couple of these implementations, I'm starting to catch glimpses of further best practices -- related to use of structure and use of style tags in XHTML documents -- that could amplify the power of this approach and minimize some of its weaknesses. Could I write them down now? No way. It's a process of discovery. Could that discovery happen if people had not agreed on and documented formats and protocols and APIs? Also, obviously, no way. 
</p>
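<p>
To give a flavor of what such a view involves, here is a minimal sketch, not the code behind the screenshot. It assumes that each kind of point carries a distinguishing <tt>class</tt> attribute; the class name used below is a guess for illustration, and both function names are invented. The second function uses the DOM Level 3 XPath <tt>evaluate</tt> method, so it applies only to browsers that implement it.
</p>

```javascript
// Build an XPath expression that matches any element whose class
// attribute contains the given token. The token is an assumption.
function xpathForClass(className) {
    return "//*[contains(concat(' ', normalize-space(@class), ' '), ' " + className + " ')]";
}

// Count the matching elements in a loaded XML document.
// The fourth argument, 1, is the value of XPathResult.NUMBER_TYPE.
function countByClass(doc, className) {
    var result = doc.evaluate("count(" + xpathForClass(className) + ")", doc, null, 1, null);
    return result.numberValue;
}

console.log(xpathForClass("practice"));
```

<p>
Padding the attribute value with spaces keeps a search for one class from matching a longer class name that merely contains it, which is exactly the kind of small convention that consistent use of style tags makes possible.
</p>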
<p>
So I guess it <i>is</i> a mystery.
</p>
</body>
</item> 

<item num="a734">
<title>Voices</title>
<date>2003/06/29</date>
<body>
<p>
So many voices in this most tumultuous of the many tumultuous moments I've lived through, in my five years of involvement with the RSS phenomenon. So many people taking time away from friends and family, this weekend, to consider the matters at hand. So tempting to simplify it all as a silly-season little-endian/big-endian tempest in a teapot. So much at stake. <b>Update</b>: So sad the <a href="http://www.scripting.com/defaultJul29.html">voice that started it all</a> has, for now, gone silent. <b>Further update</b>: And now <a href="http://scriptingnews.userland.com/2003/06/29#When:4:25:49PM">is back</a>, thankfully.
</p>
<p>
<b>Paul Philp</b>:
</p>
<blockquote cite="Paul Philp">
Dave Winer and Userland Software lack the market power to create enough opportunity to incent competitors to co-operate. There is a fundamental mismatch between Winer's attempt to control the standard and his market power. It is this difficult reality Winer must now face.
<br/>
<br/>
Sam Ruby and his collaborators on the Echo project need to understand the economics of switching costs and the price of uncertainty. The RSS format is adopted broadly today. Many web sites, web tools, aggregators, programmers, editors and publishers know and trust the RSS format and name. They simply will not incur the risk and cost of switching to a new format unless the benefit is a full order of magnitude larger than the benefit of the current RSS format. 
<br/>
<br/>...<br/>
<br/>
The predictable outcome of the RSS/Echo debate will be exactly the outcome the participants wanted to avoid in the beginning - Microsoft controlling the web content syndication format.
<br/>
<br/>
A vendor neutral industry standard based on the RSS format and name is the one path I see to avoid this outcome. [<a href="http://www.longharvest.com/archives/000126.html">The Long Harvest</a>]
</blockquote>
<p>
A fascinating analysis worth reading in full. I mostly agree, but would qualify a few aspects. First, although there would be switching costs, they would be less onerous than format-related switching costs usually are -- because of the inherent nature of weblogs and RSS. The role of RSS so far is primarily that of a connection broker. People don't yet store and reuse RSS content. This can and should and will begin to change, for reasons, and in ways, I've <a href="http://webservices.xml.com/pub/a/ws/2003/04/15/semanticblog.html">described</a>, but probably not soon enough to affect near-term switching costs. If such costs must be borne. They need not, and I continue to hope they will not.
</p>
<p>
Second, I'm suspicious of the notion that Microsoft -- or InsertYourFavoriteBigCoHere -- would seek control of the format. These companies are learning fast that formats, in the XML era, are kind of beside the point. Services and applications are what matter, and they can easily be adapted to any format so long as it <i>is</i> XML. The recent evolution of SOAP, for example, boils down to agreeing that the payload is an XML document, not an expression of protocols or APIs.
</p>
<p>
<b>Patrick Logan</b>:</p>
<blockquote cite="Patrick Logan">
The best thing to come of this recent RSS/Echo issue is that Sam Ruby has demonstrated the value of the Wiki Wiki Web.
<br/>
<br/>
What do we need, another Echo? Nope. We really need better applications. What happened to that Web Services revolution anyway, that the heat generated in 2003 is about HTTP? [<a href="http://patricklogan.blogspot.com/archives/2003_06_22_patricklogan_archive.html#105685124195556317">Making it stick</a>]
</blockquote>
<p>
That weblog technology is not the primary means of its own re-analysis is a fascinating observation that occurred to me yesterday too. I've been a Wiki (and a <a href="http://weblog.infoworld.com/udell/2003/02/13.html#a605">Ward Cunningham</a>) fan for years, but I would say that Wiki, too, is suboptimal for the task at hand. Ideally XML, not raw ASCII text, would be the stuff that was written, and refactored, and then mined to produce coherent views. We have no tools that come close to enabling that to happen. 
</p>
<p>
Such tools, combining the power of XML with the flexibility of freeform text, and operating on a universal canvas, are what will really drive mainstream adoption of a two-way Web. In my OSCOM keynote, when I revealed <a href="http://weblog.infoworld.com/udell/misc/oscom/emacs.html">the depressing truth about how I wrote my slideshow</a>, I noted that for most of the people in the audience, writing XML in emacs seemed completely normal. But while it seems so to me, and probably seems so to most involved in the Echo project, it most decidedly is not. 
</p>
<p>
One of the reasons I can speak effectively in this discussion is that I've mastered skills -- for quickly gathering and reshaping raw material, and composing original material -- that most people will never, and should never, be expected to master. Empowering non-techies was the torch that Dave Winer lit with Radio UserLand. One particular BigCo, Microsoft, well understands what it takes to carry it forward, and to empower most people to achieve 80% of the fluency with 20% of the effort. After years of foot-dragging, because of a historical format lock-in that will soon be irrelevant and will be abandoned, Microsoft is on the verge of delivering the kinds of applications (<a href="http://weblog.infoworld.com/udell/2003/02/24.html#a617">1</a>, <a href="http://weblog.infoworld.com/udell/2003/02/21.html#a615">2</a>) that can be decisive. I have applauded their efforts. I wish I saw credible competition on the horizon, because the health of the ecosystem requires it. The opportunity is NOT primarily tied to syndication formats, or to weblog APIs. I think Microsoft gets that, and I wish more people did.
</p>
<p>
<b>Gordon Weakliem</b>:</p>
<blockquote cite="Gordon Weakliem">
Dare also mentions politics as being a driver behind Echo.  Evidently, this is what the controversy's about.  I think that it's a terrible thing to even tacitly admit this sentiment into your mission.  I've never yet found software that can fix interpersonal relationships.  When all's said and done and Echo is a reality, the same people will still be around, disliking each other. [<a href="http://radio.weblogs.com/0106046/2003/06/28.html#a292">Gordon Weakliem's Weblog</a>]
</blockquote>
<p>You're right, Gordon. I'd add: and Microsoft will still be busily creating the applications that make these politically-charged formats and APIs relevant to the masses.
</p>
<p>
<b>Sam Ruby</b>:</p>
<blockquote cite="Sam Ruby">
<ul>
<li>
<p>
I'd like to get to the point where the original functionality of the RSS 0.90 link tag can be achieved with the xpath expression &quot;//a/@href&quot; on those feeds that have well formed HTML. 
</p>
</li>
<li>
<p>
If you are a user of a recent version of IE or Mozilla, you already have a validator for wellformedness.
</p>
</li>
</ul> [<a href="http://www.intertwingly.net/blog/1501.html">Intertwingly</a>]
</blockquote>
<p>
Sam, we are in violent agreement as to the value of well-formed and XPATH-searchable content, something I have recently sought to demonstrate (<a href="http://weblog.infoworld.com/udell/misc/oscom/xsltAndSlideML.html">1</a>, <a href="http://weblog.infoworld.com/udell/misc/oscom/search.html">2</a>). 
</p>
<p>
I think -- no, I am sure -- that you radically underestimate the distance between possession of a tool (IE or Mozilla) that can validate well-formedness, and ability to produce well-formed content in routine written communication. In terms of such ability, I'm undoubtedly in the 99.9th percentile, and it's still something that slows me down and makes me think and sometimes trips me up. That makes it, by definition, a non-starter for most people. You called my OSCOM keynote <a href="http://www.intertwingly.net/blog/1441.html">best in show</a>. The message wasn't that we need more or different formats or APIs. I'm not saying formats or APIs don't matter; obviously they do, and obviously they must and will evolve. But my message was, and is, that weblog technology has to empower more people to communicate more easily and more effectively. Many more people, and much more easily and effectively, than now. I illustrated with some simple best practices that seemed to resonate powerfully with a lot of people. And I showed how, even in the midst of elaborating some of those best practices, my own <a href="http://weblog.infoworld.com/udell/misc/oscom/slideScript.html">geek blinders</a> prevented me from seeing a simple and obvious thing. Those of us with techie DNA have got to try to understand, especially at this critical juncture, how it influences our perceptions and behavior.
</p>
<p>
<b>Tim Bray</b>:</p>
<blockquote cite="Tim Bray">
Jeepers, how many more levels deeper are we going on this one till we get to the bottom? [<a href="http://www.tbray.org/ongoing/When/200x/2003/06/28/Learning">ongoing</a>]
</blockquote>
<p>
Tim, in my view, you won't hit paydirt until the non-emacs-using civilians -- the people who are expected to create and gather and process this content whose representation we all (myself included) find so fascinating -- are put front and center. You write:
</p>
<blockquote cite="Tim Bray">
Having a wonderfully readable language is not a win if it's such a pain in the ass to write code for that nobody does...So the real situation is this: the interests of those who for one reason or another hand-author and hand-read syndication feeds are directly in conflict with those of the people who write the software that reads them.
</blockquote>
<p>
Being one of those very few inclined to hand-author XHTML, I should be flattered to see that my interests figure so prominently in the Echo process. In fact it worries me, as the continuing non-adoption of XML by civilians has always worried me. The elephant in the room here is this thing called the semantic web. We are all like the proverbial blind men trying to feel the shape of that elephant. There is one aspect of the beast I see clearly: we're not going to have a semantic web until regular people can effectively write it. Weblogs were a major step forward, hence all this excitement and energy. Why the urge now to tweak the formats and APIs? Politics aside, because it's what we do, it's what we know, it's how we think. Some tweaking may be necessary, and establishing a framework within which to tweak is even more necessary. But none of that adds up to the next major step forward. 
</p>
<p>
<b>Dave Winer</b>:</p>
<blockquote cite="Dave Winer">
How about let's try to put this back together so that RSS stays what it is, a simple syndication format, with a set of best practices that all parties adhere to, so that the format isn't vulnerable to takeover by one or more BigCo's. [<a href="http://scriptingnews.userland.com/2003/06/28#daveWinerIsAngry">Scripting News</a>]
</blockquote>
<p>
Dave, I'm obviously 100% in favor of defining best practices, preserving simplicity, and ensuring continuity. One way or another, it looks as though weblog formats and APIs are heading onto the standards track -- something that I think everyone involved regards as necessary yet problematic. I'd vastly prefer to see this happen with little disruption to the existing RSS ecosystem. But if it winds up being a lot, I wouldn't characterize that mainly as a &quot;format takeover.&quot; I'm tempted to call it a &quot;techie takeover&quot; instead. Casting this as BigCo's vs. SmallCo's is too easy. What you have always contributed -- in the form of ideas and of implementations -- is a profound user sensibility. We have never needed that more than now. 
</p>
<p>
I've written recently about <a href="http://weblog.infoworld.com/udell/2003/06/22.html#a730">finishing work</a>. Jean Paoli has dreamed for half his life of bringing XML to the masses, and he well knows that Microsoft's ability to pour resources into usability analysis and finishing work is his best shot at making the dream real. Perhaps naively, I don't think it's our only shot. Big or small, open source or commercial, I believe anybody who focuses on what really matters can create forward motion -- thanks in no small measure to the breakthroughs you have pioneered. Formats and APIs are not what really matter here. Figuring out what people can use, in ways that make sense to them, is what matters. That's what you've always done, what I have always supported and loudly acclaimed, and what no-one else is doing in a big-picture way. I know you'll keep pointing the way forward.
</p>

</body>
</item> 

<item num="a733">
<title>My conversation with Mr. Safe</title>
<date>2003/06/27</date>
<body>
<a href="http://www.fplsafetyworld.com/safe_choice/">
<img hspace="6" align="right" src="http://www.fplsafetyworld.com/safe_choice/media/safe_choice.gif"/> </a>
<p>
<b>Mr. Safe</b>:
Hey, I've been reading about that RSS thing you were telling me about. It was mentioned recently in the New York Times, and also the Wall Street Journal. I'm thinking maybe it's a safe choice after all.
</p>
<p>
<b>Jon</b>:
Not so fast.
</p>
<p>
<b>Mr. Safe</b>:
Oh, why? What's up?
</p>
<p>
<b>Jon</b>:
Well, a bunch of people in the RSS community have decided to push the reset button, and redesign the format from the ground up. The new thing is called Echo.
</p>
<p>
<b>Mr. Safe</b>:
But I thought you said RSS was a mature and tested format, the only real problems being confusion about which version to use, a potential copyright issue, and no official stamp of approval from a standards body.
</p>
<p>
<b>Jon</b>:
I did, but I also said there were political problems. This is mainly about the politics.
</p>
<p>
<b>Mr. Safe</b>:
Uh oh.
</p>
<p>
<b>Jon</b>:
Look, it'll be fine, really. There are a bunch of really smart people, and they're collaborating in a Wiki.
</p>
<p>
<b>Mr. Safe</b>:
A what-i?
</p>
<p>
<b>Jon</b>:
It's an online blackboard, sort of, where everybody gets to scribble their thoughts, and reorganize other people's thoughts. 
</p>
<p>
<b>Mr. Safe</b>:
How many people are actively involved?
</p>
<p>
<b>Jon</b>:
I don't know, fifty maybe, it's hard to tell.
</p>
<p>
<b>Mr. Safe</b>:
Look, I'm no expert, but are the best Internet standards really designed by committees of fifty on shared blackboards?
</p>
<p>
<b>Jon</b>:
Like I said, this is politically necessary. RSS has been handled in a far less collaborative -- some say dictatorial -- manner. There's a huge amount of resentment over that, and it's fueling this new movement. So the redesign is taking place in a fully-transparent environment -- a smoke-free room.
</p>
<p>
<b>Mr. Safe</b>:
Will it still be called RSS?
</p>
<p>
<b>Jon</b>:
No.
</p>
<p>
<b>Mr. Safe</b>:
OK, but will it be backward-compatible with RSS?
</p>
<p>
<b>Jon</b>:
It doesn't look that way.
</p>
<p>
<b>Mr. Safe</b>:
Look, I'm no expert, but don't Internet standards take years -- not weeks -- to mature?
</p>
<p>
<b>Jon</b>:
Yes.
</p>
<p>
<b>Mr. Safe</b>:
Uh oh.
</p>
<p>
<b>Jon</b>:
It'll be fine, really. I'll tell you the dirty little secret about all of this: the RSS format is kind of trivial when you come right down to it. Although there will be a lot of disruption and fallout, the blog community could probably reconstitute itself around a completely different format in a week or two.
</p>
<p>
<b>Mr. Safe</b>:
So this isn't rocket-science technology?
</p>
<p>
<b>Jon</b>:
I'll tell you another secret. You know that billion-dollar application the Wall Street Journal was talking about? Replacing broadcast email with RSS? That would work just fine with even the most primitive version of RSS. Hell, it'd work fine with the progenitor of RSS, an old Microsoft format called CDF.
</p>
<p>
<b>Mr. Safe</b>:
Really? Microsoft had a horse in this race? What happened?
</p>
<p>
<b>Jon</b>:
They couldn't grasp the idea of personal publishing, and made the CDF network into a small club for rich media titans. It took somebody like Dave Winer to see that the real opportunity was to radically lower the barrier to entry, and empower everybody to publish as well as subscribe.
</p>
<p>
<b>Mr. Safe</b>:
Who's Dave Winer?
</p>
<p>
<b>Jon</b>:
He's the guy everybody in this new post-RSS movement is pissed off at. And believe me, they have their reasons. Even though the RSS format is not the critical thing here, as I've explained, Dave's been an incredible control freak about it.
</p>
<p>
<b>Mr. Safe</b>:
So if the format doesn't matter so much, where's the magic?
</p>
<p>
<b>Jon</b>:
The magic is in knowing how to use RSS. Knowing what to read and write, and how, and when. Absorbing and transmitting awareness. 
</p>
<p>
<b>Mr. Safe</b>:
Uh oh. This sounds like that touchy-feely Wiki thing you were talking about.
</p>
<p>
<b>Jon</b>:
Well, kind of. Anyway, Dave showed everybody how to <i>use</i> RSS. That's his crowning achievement in my book. Not the format. Not even the tools his company created to make it easy for anybody to write for the RSS network. Like I said, this is simple stuff, and now there are lots of tools. One guy, to make a point about simplicity, wrote a blogging tool in 30 lines of Perl. So it's not about the format, and it's not about the tools. It's about a new way of communicating, one that's defined by personal publishing and subscribing, and that empowers writers and readers as never before. Dave knew that earlier, and still knows it better, than anyone.
</p>
<p>
<b>Mr. Safe</b>:
Hmm. It sounds kind of abstract. But you're saying this touchy-feely stuff is actually starting to resonate with the New York Times and the Wall Street Journal?
</p>
<p>
<b>Jon</b>:
Yes, finally, after about four years of gestation. It's a hard thing to describe, and it takes a long time to sink in, but a lot of people -- me included -- are as excited about this now as we were in 1994 about the Web.
</p>
<p>
<b>Mr. Safe</b>:
And now they're going to hit the reset button on the format? 
</p>
<p>
<b>Jon</b>:
Apparently.
</p>
<p>
<b>Mr. Safe</b>:
Do you think that's a good idea?
</p>
<p>
<b>Jon</b>:
I'm never a fan of fixing what ain't broken. Arguably, though, there was no other way forward in this case. The worm at the core of the weblog apple had to be extracted. It's true that vast numbers of yet-to-be-written RSS applications need no more than what RSS already does, or can be extended to do using the mechanisms it sanctions. It's also true that vast numbers of yet-to-be-written RSS applications will require RSS to evolve. It had to become possible for that evolution to occur in an open and vendor-neutral way, and when the dust settles I think it will be possible.
</p>
<p>
<b>Mr. Safe</b>:
But is this evolution, or is it revolution?
</p>
<p>
<b>Jon</b>:
Good question. Now that the dam has broken, Dave has endorsed the new effort. It must have been an incredibly hard thing to do. I have a teenage daughter and when it's time for her to leave the nest, in a couple of years, I hope I'll handle that transition as graciously as Dave is handling this one. Meanwhile, the Echo designers are -- not surprisingly -- converging on a core that looks a lot like RSS. So far they've discovered that a blog entry has a link, an author, a publication date, and one or more semantically-equivalent content items. Any day now, they'll conclude that it also has a description. There's really not much mystery about this stuff.
</p>
<p>
<b>Mr. Safe</b>:
So what's the safe choice?
</p>
<p>
<b>Jon</b>:
I don't know. Would you feel better if this Echo process were a continuation of RSS rather than a recreation of it?
</p>
<p>
<b>Mr. Safe</b>:
Yes.
</p>
<p>
<b>Jon</b>:
Me too.
</p>
</body>
</item> 

<item num="a732">
<title>Fixing RSS's public-relations problem</title>
<date>2003/06/25</date>
<body>
<p>
Yesterday I spoke with two acquaintances, both of whom have decades-long track records in the high-tech biz, and neither of whom has ever used an RSS newsreader. When I mentioned RSS as an alternative to mailing lists, both said the same thing: &quot;But I don't have time to visit 30 different websites in order to find things out.&quot; Of course, that is exactly the problem that RSS solves. And has been solving, for me, since 1999.
</p>
<p>
Over the years, people have asked me which version of RSS to use. I've always said it doesn't matter, they all do the same thing. But the question always annoys me, because while I've tried to pretend otherwise, the fragmentation of RSS really is a problem. I think it's part of the reason my two acquaintances aren't using RSS today. And if they're not, how can we really expect <a href="http://tbray.org/ongoing/When/200x/2003/06/19/RSS4All">Tim Bray's Mr. Safe</a> to jump onboard?
</p>
<p>
Despite the confusion, a very notable Mr. Safe -- the BBC -- just did <a href="http://davenet.userland.com/2003/06/24/bbcNewsArchiveWeblogsAndRss">jump aboard</a>. Given a choice of three formats -- RSS .9x, 1.0, and 2.0 -- the BBC opted for the safe choice: .9x. That's sad in a couple of ways. First, because people at the BBC had to worry about this choice at all. Second, because 2.0 has a stronger core and a well-defined mechanism for extension. It should have been the safe choice.
</p>
<p>
I'm delighted to see that Sam Ruby has launched <a href="http://www.intertwingly.net/wiki/pie/FrontPage">a collaborative effort</a> to review the RSS core and delineate its periphery. According to <a href="http://tbray.org/ongoing/When/200x/2003/06/23/SamsPie">Tim</a>, Sam's employer -- IBM -- has given him the go-ahead to work fulltime on the project. This is <i>huge</i>. I'm equally delighted to see that Dave Winer is both <a href="http://scriptingnews.userland.com/2003/06/24">reporting on</a> and <a href="http://scriptingnews.userland.com/2003/06/25#postIds">contributing</a> <a href="http://scriptingnews.userland.com/2003/06/25#theLizardBrainOfRss">to</a> the discussion. 
</p>
<p>
If the goal of this effort is to nail down what last year's RSS 2.0 process also aimed to achieve -- a solid and universally-acknowledged RSS core, freely extensible in a solid and universally-acknowledged way -- then I hope it moves swiftly to achieve that. The existing core, in my view, requires few (if any) changes. What we need is consensus on the core, the sooner the better.
</p>
<p>
The periphery is vast. It includes commenting, threaded discussions, semantic modeling, authentication and encryption, and an endless amount of other stuff. All that can come in due course. 
</p>
<p>
Let's be clear: RSS is in no way broken. I, for example, will be using RSS to <a href="http://www.intertwingly.net/wiki/pie/RecentChanges?action=rss_rc">monitor</a> this current round of analysis and specification. I don't really care whether tags are written as mixed-case or lowercase. But there are issues in the core, and issues related to the delineation of the periphery, that do matter to me. RSS will empower me to tune into its own review process in the most efficient way. What's broken is that not nearly enough people know about, or use, this model of awareness diffusion. That's a public-relations problem, not a technology problem, and one that I hope will at last be fixed.
</p>
</body>
</item> 

<item num="a731">
<title>AT&amp;T, CheckFree, and electronic bill presentment</title>
<date>2003/06/23</date>
<body>
<p>
<a href="http://www.allie.att.com/nlq/virtual_rep.jsp">
<img hspace="6" src="http://weblog.infoworld.com/udell/gems/attAllie.gif" alt="AT&amp;T Allie" align="right"/>
</a>
<a href="http://weblog.infoworld.com/udell/2003/05/23.html#a702">Harry Tuttle</a> failed to come to my rescue, so I wound up transferring payee data from my bank's old bill-payment system to the new one. The cloud's silver lining is bill presentment, which the old system didn't support but the new one does. The first of my payees to indicate support for bill presentment was AT&amp;T.
</p>
<p>
In the old system, I used my phone number as my account number. Seemed logical, since the bill reads <b>Customer ID: 603 xxx-xxxx</b>, and it worked. But no, the new system doesn't recognize that account number. Hmm. Scanning the bill again, I notice at the bottom: <b>Customer ID: 603 xxx-xxxx D</b>. Could that be it? Nope, it isn't. Now I'll have to -- <i>shudder</i> -- call them.
</p>
<p>
Wouldn't it be cool if a customer-service call to AT&amp;T went like this?
</p>
<p>
<b>AT&amp;T IVR</b>: Welcome to AT&amp;T. Please enter your 10-digit telephone number.</p>
<p>
<b>Jon</b>: Beep beep boop, beep boop boop, beep beep boop beep.</p>
<p>
<b>AT&amp;T IVR</b>: Welcome to IVR Hell. Press any digit to be annoyed, or scream 'HUMAN BEING!' to be connected to a carbon-based life form. </p>
<p>(Unfortunately, you still have to press 0 to reach that person. Here's how it really went.)</p>
<p>
<b>AT&amp;T IVR</b>: Welcome to AT&amp;T. Please enter your 10-digit telephone number.</p>
<p>
<b>Jon</b>: Beep beep boop, beep boop boop, beep beep boop beep.</p>
<p>
<b>AT&amp;T IVR</b>: Blah, blah, blah, press 0 to be connected to...</p>
<p>
<b>Jon</b>: 0.</p>
<p>
<b>AT&amp;T Human</b>: AT&amp;T Consumer Services, please tell me your 10-digit phone number.</p>
<p>(This part always amazes me. Why bother to ask the first time, and then immediately forget?)</p>
<p>
<b>Jon</b>: Digit digit digit, digit digit digit, digit digit digit digit.</p>
<p>
<b>AT&amp;T Human</b>: Our records show you are using AT&amp;T long-distance service on that line. Thank you for using AT&amp;T. Do you have another line?</p>
<p>
<b>Jon</b>: Yes.</p>
<p>
<b>AT&amp;T Human</b>: Please tell me the 10-digit phone number.</p>
<p>(This part always amazes me too. Do they really not know this?)</p>
<p>
<b>Jon</b>: Digit digit digit, digit digit digit, digit digit digit digit.</p>
<p>
<b>AT&amp;T Human</b>: Our records show you are using AT&amp;T long-distance service on that line. Thank you for using AT&amp;T. Do you have any other lines?</p>
<p>
<b>Jon</b>: No.</p>
<p>
<b>AT&amp;T Human</b>: How can I help you today?</p>
<p>
<b>Jon</b>: I'm trying to sign up for online bill payment and electronic bill presentment, and I need to know my account number. I thought it was just my phone number, but evidently not.</p>
<p>
<b>AT&amp;T Human</b>: I can send you a package of forms to fill out...</p>
<p>
<b>Jon</b>: HUMAN BEING!</p>
<p>
<b>AT&amp;T Human</b>: Excuse me sir?</p>
<p>(OK, that last bit didn't really happen. But as Dave Barry would say, I am not making the rest of this up.)</p>
<p>
<b>Jon</b>: No, no, no. You don't understand. AT&amp;T has a relationship with CheckFree, and CheckFree has a relationship with my bank, and I'm on the signup screen for online bill payment and bill presentment as we speak. I just need to know what AT&amp;T thinks my account number is.</p>
<p>
<b>AT&amp;T Human</b>: It's not your phone number?</p>
<p>
<b>Jon</b>: No.</p>
<p>
<b>AT&amp;T Human</b>: Then try this: Digit digit digit digit digit digit digit digit digit digit digit digit digit.</p>
<p>(Oh, of course, it's obvious. That's the first 13 digits of the unpunctuated 51-digit number at the bottom of my bill. Silly me, why didn't I think of that? Sheesh.)</p>
<p>
<b>Jon</b>: Let's see if it'll work. Click, click...yup, that's it. Thanks so much.</p>
<p>
<b>AT&amp;T Human</b>: Thank you for choosing AT&amp;T.</p>
<p>OK, now we're cooking with gas.</p>
<p>Not.</p>
<p>So far, I have only recreated what I had before: the ability to pay AT&amp;T online. There's an additional signup step for online bill presentment. The form -- presented by CheckFree, populated and handled by AT&amp;T -- wants the account number, and cleverly supplies it as the default. Great! Except:
</p>
<p>
<b>AT&amp;T</b>: Unknown account number.</p>
<p>Sigh. This was supposed to save us time and money, right?</p>
<blockquote>
<i>
A leading driver of biller conversion to EBPP is potential cost-savings. As companies and other stakeholders recognize the significant expense reduction and revenue creation opportunities to be generated by EBPP, the depth of product capabilities and delivery alternatives will continue to grow. [<a href="http://about.reuters.com/newsreleases/art_13-11-2000_id452.asp">TowerGroup Research shows move to Electronic Bill Presentment &amp; Payment could save U.S. Billers $5.5 Billion a Year, and Consumers $4.4 Billion</a>]
</i>
</blockquote>
<p>
That report was issued in 2000. I wonder how much billers and consumers have so far saved, and at what cost in time and effort. 
</p>
<p>
The picture isn't all gloomy. The next payee on my list, <a href="http://www.countrywide.com/">Countrywide Home Loans</a>, worked flawlessly. Still, you'd have thought that AT&amp;T, who gave the world Unix, would know a thing or two by now about process pipelines.
</p>
</body>
</item> 

<item num="a730">
<title>Engines, steering wheels, and open source</title>
<date>2003/06/22</date>
<body>
<p>
<table align="right" cellspacing="0" cellpadding="6">
<tr>
<td>
<img src="http://weblog.infoworld.com/udell/gems/engine.jpg"/>
</td>
</tr>
<tr>
<td>
<img src="http://weblog.infoworld.com/udell/gems/steering.gif"/>
</td>
</tr>
</table>
The VP of technology for a leading enterprise software vendor recently told me that he spends a lot of time wondering how open source projects can possibly work. &quot;You take out the internal combustion engine,&quot; he said, &quot;yet somehow the car still runs.&quot; My take was that there is, indeed, a powerful engine purring under the hood of open source. Creative programming is a deeply addictive behavior. Part of the rush comes from the endorphins released when the mind enters a state of flow. And part of it comes from the peer acclaim that all programmers crave. 
</p>
<p>
What open source projects often lack is not the engine, but the steering wheel. &quot;Too many programmers,&quot; says Tony Byrne, who tracks commercial and open source content management systems for <a href="http://www.cmswatch.com/">CMSWatch</a>, &quot;but not enough product managers.&quot; <a href="http://radio.weblogs.com/0116506/">Paul Everitt</a>, co-founder of Zope Corporation, puts it this way: &quot;We suck at finishing work.&quot; Writing documentation doesn't make endorphins flow. Neither does organizing a usability study, or doing triage on bug reports, or writing a bulletproof installer, or internationalizing a product for fourteen languages, or creating an intuitive user interface.
</p>
<p>
Mind you, I'm not complaining. Every day I use Perl, Python, Linux, Apache, Mozilla, Zope, emacs, and countless supporting libraries and tools. But I also use Windows, Mac OS X, MSIE, Outlook, and a bevy of commercial software products. We tend to think of the open source stuff as low-level developer-oriented infrastructure, and the commercial stuff as user-facing applications. The two realms need not be separate and distinct, though, as I am daily reminded when I check my Outlook email. Thanks to the <a href="http://www.infoworld.com/article/03/05/16/20TCspam_1.html">SpamBayes plug-in for Outlook</a>, this is an experience that I no longer dread. 
</p>
<p>
The SpamBayes engine is open source software that could be integrated with any email program. Most open source folk wouldn't want to actually <i>do</i> that integration. But Mark Hammond saw it as an interesting challenge. He did the finishing work: packaging up the Python distribution needed to run SpamBayes and the plug-in, writing the user-interface code that enables the plug-in to work with my Outlook filters and folders, and delivering the whole thing as a clickable installer. 
</p>
<p>
I once asked a pair of open source wizards if they were inspired by the thought that their software could improve the lives of millions of people. Nope. &quot;We build infrastructure for other developers,&quot; they said. &quot;If they use it to make software that makes people happy, then fine, but it's not what motivates us.&quot; 
</p>
<p>
You can't argue with success, so I won't. The infusion of open source infrastructure into the enterprise is a remarkable success story, and I truly do regard those responsible as heroes. There are only so many infrastructure projects to go around, though. Whether and how open source energy flows into user-facing applications is a key question for enterprise IT. 
</p>
<p>
Mark Hammond is a different kind of open source hero, and I hope others will emerge. Finishing work is at least as great a challenge as infrastructure work, and may even produce endorphin flow in some minds. Of course, the act need not be its own reward. SleepyCat, MySQL, and other open source companies are proving there's money to be made on licensing, support, and customization. These projects are built from the ground up. But there's also an opportunity to piggyback on the finishing work that Microsoft has already done.
</p>
<p class="realsmall">
<a href="http://www.infoworld.com/article/03/06/20/25OPstrategic_1.html">InfoWorld.com version</a>
</p>
</body>
</item> 

<item num="a729">
<title>The translucent veil</title>
<date>2003/06/21</date>
<body>
<p>
<blockquote>
<i>
As we shift to an economy based on access to networked services more than on ownership of goods, translucency will be harder to achieve. Identity, after all, is a condition of access to such services. Even so, when customer data need not necessarily be personalized, translucency is a powerful technique that can meet your requirements, satisfy your customers, and keep the feds happy too. [Full story at <a href="http://www.infoworld.com/article/03/06/20/25FEprivacydb_1.html">InfoWorld.com</a>]
</i>
</blockquote>
When I challenged Peter to nail down the practical uses and limits of translucency, he responded with <a href="http://www.wayner.org/books/td/u1.php">an analysis</a> of how Amazon might apply it. He concludes that it would be practical for Amazon to avoid storing a lot of data, and notes that the problem is really more in our heads than in our databases:
</p>
<blockquote cite="Peter Wayner">
Of course, just because an idea is simple and stops terrorism (among other things), doesn't mean that it can or will be widely adopted. I think the resistance in ourselves is deeply buried, perhaps even below our logical layer. Many people still feel a packrat's instinct with data. They feel that this information should be kept around, just in case. This is a natural human wish, but it should also be balanced by the just as natural aversion to responsibility. Most businesses don't have to pay the price if a customer's identity gets stolen, their credit cards get cloned, or their bank account is raided. This may change as more people and businesses become aware of the danger of misused information and the responsibility to protect it. [<a href="http://www.wayner.org/books/td/u1.php">www.wayner.org</a>]
</blockquote>
</body>
</item> 

<item num="a728">
<title>Certifying email senders</title>
<date>2003/06/20</date>
<body>
<p>
Last month, in an <a href="http://weblog.infoworld.com/udell/2003/05/08.html">item about SpamBayes</a>, I mentioned IronPort's <a href="http://www.senderbase.org/">SenderBase</a> service and Bonded Sender (<a href="http://www.bondedsender.com/">1</a>, <a href="http://www.bondedsender.com/">2</a>) program. SenderBase is an extraordinary resource for investigating email senders. The homepage lists the top high-volume senders, but you can drill all the way down to small-fry IP addresses. If you're on a DSL or cable connection with a static IP address, try looking yourself up:
</p>
<script language="javascript" src="http://weblog.infoworld.com/udell/gems/senderbaseLookup.js"/>
<form name="ipform" method="get" action="javascript:senderbaseLookup(document.ipform.ip.value)">
Your IP address: <input name="ip"/> <input type="submit" value="SenderBase Lookup"/>
</form>
<p>
Now, here's part of the result screen you get when you look up <a href="http://www.senderbase.org/search/?searchString=206.16.1.160&amp;searchBy=ipaddress">206.16.1.160</a>, which is a CNET address:
</p>
<p>
<a href="http://www.senderbase.org/search/?searchString=206.16.1.160&amp;searchBy=ipaddress">
<img width="500" height="298" alt="senderbase" src="http://weblog.infoworld.com/udell/gems/bondedCnet.gif"/>
</a>
</p>
<p>
CNET is a bonded sender. The company promises not to send spam, and stands to forfeit its bond (to TRUSTe, not IronPort) if it breaks the promise. The SenderBase service provides a <a href="http://www.dnsbl.com/">DNSBL</a>-style lookup that works like this:
</p>
<p>
1% nslookup 160.1.16.206.query.bondedsender.org<br/>
<br/>
Name:    160.1.16.206.query.bondedsender.org<br/>
Address:  127.0.0.10
</p>
<p>
Normally any IP-address response from such a query means the queried address is listed on an RBL (realtime black list), but in this case it's the converse: an RWL (realtime white list). 
</p>
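<p>
The name in that query is just the IP address with its octets reversed, prepended to the list's zone. Here's a minimal JavaScript sketch of that construction, assuming IPv4 only. The function names <tt>listQueryName</tt> and <tt>isOctet</tt> are mine, for illustration, and aren't part of any IronPort tool:
</p>

```javascript
// Build the DNS name for a DNSBL-style lookup: reverse the IPv4 octets
// and append the list's zone. Both function names are hypothetical.
function listQueryName(ip, zone) {
  const octets = ip.split('.');
  const valid = octets.length === 4 ? octets.every(isOctet) : false;
  if (!valid) {
    throw new Error('not a dotted-quad IPv4 address: ' + ip);
  }
  return octets.reverse().join('.') + '.' + zone;
}

// True when the string is one to three digits with a value of 255 or less.
function isOctet(s) {
  return /^\d{1,3}$/.test(s) ? 255 >= Number(s) : false;
}

console.log(listQueryName('206.16.1.160', 'query.bondedsender.org'));
```

<p>
Feed the resulting name to any resolver. On a white list like this one, an address record in the reply means the sender is listed; a failed lookup means it isn't.
</p>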
<p>
I had a long talk with IronPort's founder and CTO Scott Banister yesterday, as part of my research for an upcoming InfoWorld article. I've long been fascinated with digital identity schemes. This one certifies what Scott calls the single unforgeable element of an email message: the sender's IP address. It's a really interesting way for legitimate high-volume senders to bypass content filters deployed in gateways, servers, and clients.
</p>
<p>
Today, when I interviewed Mark Mallett, who runs <a href="http://www.mv.com/">my local ISP</a>, I learned of an individual variation on the bonded-sender theme: <a href="http://www.habeas.com">Habeas</a>. With Habeas, senders license a haiku that they embed (along with legalese) in message headers, like so:
</p>
<p>
X-Habeas-SWE-1: winter into spring<br/>
X-Habeas-SWE-2: brightly anticipated<br/>
X-Habeas-SWE-3: like Habeas SWE (tm)<br/>
X-Habeas-SWE-4: Copyright 2002 Habeas (tm)<br/>
X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this<br/>
X-Habeas-SWE-6: email in exchange for a license for this Habeas<br/>
X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant<br/>
X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this<br/>
X-Habeas-SWE-9: mark in spam to &lt;http://www.habeas.com/report/&gt;.
</p>
<p>
The idea, as Mark explains, is that in the absence of anti-spam laws with teeth, it's nevertheless possible to use existing copyright and trademark law to attack spammers who misappropriate the copyrighted and trademarked haiku (lines 1-3 of the headers).
</p>
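<p>
On the receiving side, a filter would first have to notice the mark. A rough sketch of that check follows, assuming the headers have already been parsed into a plain object keyed by exactly-cased header names. The helper <tt>carriesHabeasHaiku</tt> is hypothetical, not something Habeas ships:
</p>

```javascript
// The three haiku lines, copied from the headers shown above.
const HABEAS_HAIKU = {
  'X-Habeas-SWE-1': 'winter into spring',
  'X-Habeas-SWE-2': 'brightly anticipated',
  'X-Habeas-SWE-3': 'like Habeas SWE (tm)'
};

// Hypothetical helper: true only when every haiku header is present with
// the expected text. `headers` is assumed to map header names to values.
function carriesHabeasHaiku(headers) {
  return Object.keys(HABEAS_HAIKU).every(function (name) {
    const value = headers[name];
    return typeof value === 'string' ? value.trim() === HABEAS_HAIKU[name] : false;
  });
}
```

<p>
Finding the haiku proves nothing by itself, of course, since anyone can paste it in. The scheme depends on the legal threat, so a real filter would presumably pair a check like this with some way of spotting known abusers.
</p>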
<blockquote cite="Hiawatha Bray">
But would it hold up in court? Jonathan Zittrain, assistant professor at the Harvard Law School and co-director of the Berkman Center on Internet and Society, has his doubts about the haiku copyright infringement charge. [<a href="http://www.habeas.com/about/boston_globe-2002-08-26.html">Hiawatha Bray</a>]
</blockquote>
<p>
I'm not a lawyer, and can't evaluate that. I will note, though, that there are usability challenges for individuals who want to self-certify messages using the version of the Habeas license that's free for non-commercial purposes. Consider this HOWTO for OS X Mail.app users: 
</p>
<blockquote cite="Robert L. Vaessen">
Here's the command I typed at the Terminal prompt in order to add the 
Habeas custom headers:
<br/>
<br/>
[localhost:~] rvaessen% defaults write com.apple.mail UserHeaders '{&quot;X-Habeas-SWE-1&quot; = &quot;winter into spring&quot;;&quot;X-Habeas-SWE-2&quot; = &quot;brightly anticipated&quot;;&quot;X-Habeas-SWE-3&quot; = &quot;like Habeas SWE (tm)&quot;;&quot;X-Habeas-SWE-4&quot; = &quot;Copyright 2002 Habeas (tm)&quot;;&quot;X-Habeas-SWE-5&quot; = &quot;Sender Warranted Email (SWE) (tm). The sender of this&quot;;&quot;X-Habeas-SWE-6&quot; = &quot;email in exchange for a license for this Habeas&quot;;&quot;X-Habeas-SWE-7&quot; = &quot;warrant mark warrants that this is a Habeas Compliant&quot;;&quot;X-Habeas-SWE-8&quot; = &quot;Message (HCM) and not spam. Please report use of this&quot;;&quot;X-Habeas-SWE-9&quot; = &quot;mark in spam to &lt;http://www.habeas.com/report/&gt;.&quot;;}'
<br/>
<br/>
[<a href="http://www.habeas.com/pipermail/technical-discussion/2002-November/000011.html">Habeas discussion list</a>]
</blockquote>
<p>
I dunno. If you're really ready to get that geeky, why not go all the way and set yourself up with a regular S/MIME cert, which is also useful for signing and encryption? Of course, <a href="http://www.baltimore.com/devzone/pki/ocsp.asp">OCSP</a> (online certificate status protocol) lookups can't leverage the same DNS-oriented infrastructure that the RBLs and RWLs use. But from the perspective of an implementor who's doing lookups from an email gateway, server, or client, it's six of one, half-dozen of the other, I would think. Why haven't VeriSign, Baltimore, and the rest of the PKI gang glommed onto this? 
</p>
<p>
Well, the answer's pretty obvious. OCSP and CRL (certificate revocation list) have never been heavily used, and are unlikely to scale. (For the same reason, I suspect that if Habeas or any other individual certification scheme became popular, it would become a victim of its own success.) There are, however, some new approaches in the PKI space that aim to make massive scale-up practical -- see, for example, <a href="http://www.corestreet.com/technology.html">CoreStreet Technology's Real Time Credentials Validation Authority</a>. I'll be watching this area with interest.
</p>
</body>
</item> 

<item num="a727">
<title>Rules engine/debugger as system service?</title>
<date>2003/06/19</date>
<body>
<p>
<table align="right" cellspacing="6">
<tr>
<td>
<a href="http://www.hsmo.org/kids/">
<img alt="rules" src="http://weblog.infoworld.com/udell/gems/rules.gif"/>
</a>
</td>
</tr>
</table>
I like to imagine new OS system services. Yesterday, it struck me that a rules engine, logger, and debugger would be an appropriate bundle of stuff to generalize as a standard system service. Two experiences that seemed quite different, but were really the same, led me to this conclusion. First, I wrangled with my Microsoft Outlook email filters. Second, I tweaked the ipfw firewall on Mac OS X. In both cases, the job boiled down to defining conditions and actions, thinking about the order in which rules fire, twiddling the rules, and trying to visualize the effects of the twiddling.
</p>
<p>
In Outlook, the rule-twiddling was motivated by further experimentation with anti-spam tools. I'm still a happy user of SpamBayes (<a href="http://weblog.infoworld.com/udell/2003/05/08.html#a684">1</a>, <a href="http://weblog.infoworld.com/udell/2003/05/09.html#a685">2</a>, <a href="http://www.infoworld.com/article/03/05/16/20TCspam_1.html">3</a>), but I've been exploring the use of other solutions too. Spam received on my InfoWorld account is tagged by SpamAssassin, so it's interesting to compare its judgements to what SpamBayes can do. And in order to compare the effects of an RBL (Realtime Blackhole List) solution, I've added <a href="http://www.spampal.org/">SpamPal</a> to the mix. It's pretty cool, actually -- runs as a local proxy, and makes it easy to try out combinations of RBLs.
</p>
<p>
It's gotten tricky, though, to use all these schemes in parallel. I have rules for SpamAssassin and for SpamPal that should, in theory, move those messages to appropriate folders and then exclude them from further rule processing. In practice that mostly happens, but there's some leakage. Sometimes a SpamAssassin message lands in the SpamPal folder, or a SpamPal message lands in the SpamBayes folder. How do you debug something like that? There's no easy way.
</p>
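<p>
What I have in mind is something like the following sketch: an ordered rule list, a stop-processing flag, and a trace of which rules fired. Everything here is an assumption made for illustration. The rule shape, the folder names, and the idea that SpamAssassin sets an <tt>X-Spam-Flag</tt> header while SpamPal tags the subject line are stand-ins, not a description of how Outlook evaluates its rules:
</p>

```javascript
// Run an ordered list of rules against one message and record a trace.
// Each rule is assumed to look like: { name, test(message), folder, stop }.
function applyRules(rules, message) {
  const trace = [];
  let folder = 'Inbox';
  for (const rule of rules) {
    const matched = rule.test(message);
    trace.push(rule.name + ': ' + (matched ? 'matched' : 'skipped'));
    if (matched) {
      folder = rule.folder;
      if (rule.stop) {
        // Later rules never see this message.
        trace.push('stopped after ' + rule.name);
        break;
      }
    }
  }
  return { folder: folder, trace: trace };
}

// Hypothetical rules; the header and subject tag are assumed conventions.
const rules = [
  { name: 'SpamAssassin', folder: 'SA-Spam', stop: true,
    test: function (m) { return m.headers['X-Spam-Flag'] === 'YES'; } },
  { name: 'SpamPal', folder: 'SpamPal', stop: true,
    test: function (m) { return m.subject.indexOf('**SPAM**') === 0; } }
];

const result = applyRules(rules, {
  subject: '**SPAM** cheap watches',
  headers: { 'X-Spam-Flag': 'YES' }
});
console.log(result.folder);
console.log(result.trace.join('\n'));
```

<p>
With a trace like that, a message that lands in the wrong folder can at least tell you which rule claimed it, and which rules never got a look.
</p>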
<p>
Meanwhile, over on Mac OS X, I was preparing to move the TiBook out from behind a NAT to do some videoconferencing tests. I'm no firewall expert, so I made a few of the classic saw-off-the-branch-you're-sitting-on kinds of mistakes before I got what I wanted. Now, of course, I'm reminded that there are loads of creative and useful things I could be doing with this firewall -- if it were easier to experiment with, and verify the effects of, more complex rulesets.
</p>
</body>
</item> 

<item num="a726">
<title>SpamBayes/Outlook review</title>
<date>2003/06/18</date>
<body>
<p>
Thomas Bayes, a Presbyterian minister and mathematician born just over 300 years ago, would be shocked to see most of the e-mail messages that bid for our attention nowadays. He would be thrilled to know, however, that his statistical inference theorem has inspired a potent counterattack. An open source project called SpamBayes has emerged as a powerful weapon in the war on spam. There are a few different implementations of SpamBayes. I'll focus here on an Outlook add-in, written by renowned Python hacker Mark Hammond. I've been skeptical about the long-term prospects for content-based e-mail filtering. But the Python-based SpamBayes engine, and Hammond's brilliant add-in (also written in Python), are rapidly making me a believer. 
Full story at <a href="http://www.infoworld.com/article/03/05/16/20TCspam_1.html">InfoWorld.com</a>
</p>
</body>
</item> 

<item num="a725">
<title>Towards a unification of strengths</title>
<date>2003/06/18</date>
<body>
<p>
Yesterday's item about FreeBSD drew a number of responses pointing out that the <a href="http://www.freebsd.org/ports/">FreeBSD ports</a> collection, and associated <a href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/packages-using.html">package tools</a>, are comprehensive and convenient. To clarify: I was unable to use these because I'm operating in user space on this particular box, not as root. In that case, as pkg_add says, &quot;You're on your own!&quot;
</p>
<p>
<img align="right" alt="catdog" src="http://weblog.infoworld.com/udell/gems/catdog.jpg"/>
As sometimes happens in these rambling blog entries, I found my real point at the end. That Linux makes user-space installation of software easier for people on hosted boxes, as compared to FreeBSD or alternatives, is interesting if you're using a hosted box. What really intrigues me, though, is the odd Windows/Unix culture clash we have here, and it runs in both directions. 
</p>
<p>
It becomes clearer to me every day that running XML transformations in series is the modern incarnation of the venerable Unix pipeline, a philosophy that's baked into the platform. But although you're never in doubt of finding ls or awk on a box, xslt isn't there by default, and getting hold of it is a crapshoot. 
</p>
<p>
Meanwhile, over on Windows, the xslt component is standard, but the pipeline philosophy is not well supported. For example, you have to hunt around to find <a href="http://msdn.microsoft.com/webservices/building/xmldevelopment/xslt/default.aspx?pull=/library/en-us/dnxml/html/msxsl.asp">msxsl.exe</a>, the command-line wrapper for the XSLT engine. I get tremendous mileage out of it, but I'll bet I'm part of a small minority who do.
</p>
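<p>
The pipeline idea itself is easy to express in code. In this sketch each stage is a function from document text to document text, and the stages run in series, left to right. The two stages shown are trivial stand-ins, not real XSLT transformations, and <tt>pipeline</tt> is a name I've made up for the example:
</p>

```javascript
// Compose stages left to right, the way a shell pipeline does.
// Each stage is assumed to take document text and return document text.
function pipeline(...stages) {
  return function (input) {
    return stages.reduce(function (doc, stage) { return stage(doc); }, input);
  };
}

// Stand-in stages: trim the ends, then collapse runs of whitespace.
const normalize = pipeline(
  function (doc) { return doc.trim(); },
  function (doc) { return doc.replace(/\s+/g, ' '); }
);

console.log(normalize('  two   stages  '));
```

<p>
Swap each stand-in for a call to whatever XSLT engine the box happens to have, and the shape stays the same.
</p>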
<p>
Allie Rogers is CTO of Triple Point Technology; I interviewed him for a recent <a href="http://www.infoworld.com/article/03/06/06/23FEj2eeweapons_1.html">J2EE story</a>. He writes:
</p>
<blockquote cite="Allie Rogers">
Hear hear! I go through this all the time and loathe it.
<br/>
<br/>
However, while I agree with your &quot;it's already in the box&quot; plea for Windows, that platform has its own share of similar frustrations, including the lack of a proper shell in which to script, hack, run daemons, etc. Every time I'm forced to do something like your project, Windows is convenient, but also unsatisfying for a different set of reasons. My dream is that Mac OS X eventually gets it all just right.
</blockquote>
<p>
That'd be sweet. After all these years, the Unix and Windows cultures are still profoundly unaware of one another's strengths. Maybe an outside perspective can finally unify those strengths.
</p>
</body>
</item> 

<item num="a724">
<title>Perils of the road less taken</title>
<date>2003/06/17</date>
<body>
<p>
Last month, Chad Dickerson wrote:
</p>
<blockquote cite="Chad Dickerson">
With Linux approaching ho-hum status in its ubiquity and MySQL getting its fair share of attention these days, there are still a few open source projects out there that sometimes slip under the hype radar...The free BSD derivatives -- FreeBSD, OpenBSD, and NetBSD -- have been around for years and deserve consideration alongside Linux. [<a href="http://www.infoworld.com/article/03/05/23/21OPconnection_1.html">CTO Connection</a>]
</blockquote>
<p>
That's true. But as I was reminded yesterday, stepping off the Linux path can lead to serious headaches. A system on which I deploy some Zope-based applications used to run BSDI, and now runs FreeBSD. Building Python and Zope on the box was a lot trickier than it would have been on Linux, just because FreeBSD is the road less taken. I'd pretty much forgotten about that until yesterday, when I decided to add an XSLT processor to the mix. That proved way harder than it should have been.
</p>
<p>
The first choice was <a href="http://xml.apache.org/xerces-c/index.html">Xerces-C</a> and <a href="http://xml.apache.org/xalan-c/index.html">Xalan-C</a>. I had no luck getting Xerces-C built, though. Partly that's because the box is ISP-controlled and has a fairly minimal configuration, but partly -- I think -- it's because most of the Unix folk who build and run this software are doing so on Linux or Solaris.
</p>
<p>
The next option was to try the Java versions of Xerces and Xalan. But apparently, running Java on FreeBSD isn't exactly a cakewalk. You need to build Java from its sources, and even then, the <a href="http://www.eyesbeyond.com/freebsddom/java/index.html">available patches</a> don't officially support recent JDKs. 
</p>
<p>
I know there are people who enjoy doing this kind of thing. I'm not one of them. If I want an XSLT processor, it's because I need it for some project. When I find myself re-registering at java.sun.com in order to download the Java sources in order to build a JDK so that I can download the Xerces-J/Xalan-J sources, in order to build the XSLT processor that I wanted to use in the first place, I've learned to take a deep breath and regroup. It's all too easy to lose hours to a procedure like this and come up empty-handed. 
</p>
<p>
After a refreshing oxygen break, I decided to give <a href="http://www.gingerall.com/charlie/ga/xml/p_sab.xml">Sablotron</a> a try. Here were the two statements that were good predictors of success:
</p>
<blockquote>
<i>
Sablotron is a fast, compact and portable XML toolkit implementing XSLT 1.0, DOM Level2 and XPath 1.0.
</i>
</blockquote>
<blockquote>
<i>
Sablotron uses James Clark's <a href="http://sourceforge.net/projects/expat">expat</a> XML parser.
</i>
</blockquote>
<p>
I liked the sound of that! I know expat will work everywhere -- indeed, it's already working on this box, as part of my Python/Zope kit. And the 6MB of Xerces-C's compressed sources, plus 1MB of Xalan-C sources, does seem a tad hefty. Sablotron weighs in at .5MB of compressed sources, plus .3MB of expat's, which seems like it ought to be enough to do the job. 
</p>
<p>
And sure enough, I did get it working, though I had to back off the latest expat-1.95.6 to the prior expat-1.95.5, and I haven't extensively tested Sablotron yet. Now, what was it I needed it for in the first place? Oh, yeah, I remember. It was the project that, had I been working on a Windows server, I'd be deep in the middle of, because XML parsing and transformation are standard features of the Windows platform :-)
</p>
<p>
So the moral isn't just that Linux would have saved me some headaches. Really, the issue of XSLT support shouldn't even have come up. It ought to just be there. The fact that it isn't standard on Linux, or FreeBSD, or Mac OS X for that matter, doesn't speak well for these platforms.
</p>
</body>
</item> 

<item num="a723">
<title>The universal client</title>
<date>2003/06/14</date>
<body>
<p>
<blockquote>
<i>
The game of Web services is played by passing around XML documents. Office 2003 will be the superior technology for writing/editing (InfoPath) and analyzing (Excel) such documents, but in many cases users will be searching, viewing, tweaking, approving, and routing. It's a huge win if we can use Web standards to do these things in a lightweight, cross-browser, cross-platform way. We've waited so long for this moment to come. AOL, please don't screw it up. If you don't get why this matters, turn Mozilla over to an organization that does. [Full story at <a href="http://www.infoworld.com/article/03/06/13/24OPstrategic_1.html">InfoWorld.com</a>]
</i>
</blockquote>
</p>
<p>
<table align="right" cellspacing="0" cellpadding="6">
<tr>
<td>
<a href="http://weblog.infoworld.com/udell/gems/radioMacPC.jpg">
<img border="1" alt="universal client" width="300" height="200" src="http://weblog.infoworld.com/udell/gems/s_radioMacPC.gif"/>
</a>
<div align="center" class="realsmall">a different kind of universal client</div>
</td>
</tr>
</table>
I posted this item in an odd way. As the screenshot shows, I'm running the Windows version of Radio UserLand on Mac OS X, by way of <s>Connectix</s> Microsoft Virtual PC. I don't enjoy these hall-of-mirrors effects for their own sake anymore. There's actually a reason for this crazy setup. I discovered a while ago that my standard procedure for cloning Radio from my desktop to my ThinkPad -- xcopy t:\radio\. c:\radio\. /s -- wasn't going to work for Mac OS X. There are scads of hardcoded paths scattered throughout the various .root files. (Phil Windley's <a href="http://www.windley.com/2003/06/06.html#a653">generalization</a> of my retitling script barely scratches the surface, as it turns out.) If there's a reliable fixup script, I'd like to hear about it. Meanwhile, it's a chance to put Virtual PC to the test. When it comes to universal clients, There's More Than One Way To Do It. Well, here goes...
</p>
<p>
...heh, I'll be <s>damned</s> (sorry, SurfControl) darned, it worked. Sure does <s>suck</s> drain the battery, though.
</p>
</body>
</item> 

<item num="a722">
<title>Mozilla search plugins</title>
<date>2003/06/13</date>
<body>
<p>
<table align="right" cellspacing="0" cellpadding="6">
<tr>
<td>
<a href="http://weblog.infoworld.com/udell/gems/mycroft.jpg">
<img border="1" alt="mycroft in mozilla firebird" width="300" height="187" src="http://weblog.infoworld.com/udell/gems/s_mycroft.gif"/>
</a>
<div align="center" class="realsmall">Mozilla search plugins</div>
</td>
</tr>
</table>
Don Box <a href="http://www.gotdotnet.com/team/dbox/default.aspx?key=2003-06-13T08:15:02Z">notices</a> a cool IE feature. The view-source: protocol is supported. I tried it and it worked. Even cooler, I wasn't in IE at the time, I was in Firebird. I guess we should call it a <i>browser</i> feature :-)
</p>
<p>
Speaking of cool features, I've been meaning to mention <a href="http://mycroft.mozdev.org/">Mycroft</a>, a collection of search plugins for the Mozilla toolbar. These plugins work like <a href="http://ranchero.com/huevos/">Huevos</a>, and a bunch of other doodads, so there's really nothing new here, but it's handy to be able to extend the browser's integrated search toolbar.
</p>
<p>
The screenshot shows Firebird with the standard set of plugins -- for Google, Google News, Amazon, and dmoz.org (Open Directory) -- plus a handful I've added. They're easy to make. As with Huevos and other tools, you just need to describe your search URL and parameterize the search term. For Mozilla, you also supply a 16x16 image to differentiate each plugin in the dropdown list. A simple description goes like this:
</p>
<code>
&lt;search name=&quot;Jon's Radio&quot; version=&quot;1.0&quot; method=&quot;GET&quot; action=&quot;http://search.atomz.com/search/&quot;&gt;<br/>
&lt;input name=&quot;sp-q&quot; user&gt;<br/>
&lt;input name=&quot;sp-a&quot; value=&quot;sp10022a3d&quot;&gt;<br/>
&lt;/search&gt;
</code>
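<p>
For the curious, here's a minimal Python sketch of the URL that a description like this one implies. It rests on my reading of the format, which is an assumption: the browser substitutes the typed term for the input flagged <tt>user</tt>, and sends the remaining inputs as constants. The function and parameter names are made up for illustration.
</p>

```python
from urllib.parse import urlencode

def search_url(action, term_field, term, fixed_fields):
    """Build the GET URL that a plugin description implies.

    action       -- the action attribute of the search element
    term_field   -- name of the input flagged 'user' (receives the typed term)
    fixed_fields -- the remaining inputs, which carry constant values
    """
    params = {term_field: term}
    params.update(fixed_fields)
    return action + "?" + urlencode(params)

url = search_url(
    "http://search.atomz.com/search/",
    "sp-q",
    "xpath search",
    {"sp-a": "sp10022a3d"},
)
print(url)
```

<p>
If a search works when you paste a URL of this shape into the address bar, it will probably work as a plugin too.
</p>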
<p>
Here are the files for the plugins shown in the screenshot:
</p>
<script src="http://weblog.infoworld.com/udell/gems/mycroft.js" type="text/javascript"/>
<ul>
<li>
<p>Google MS: <a href="http://weblog.infoworld.com/udell/gems/googlems.src">src</a>, <a href="http://weblog.infoworld.com/udell/gems/googlems.gif">img</a>, <a href="javascript:addEngine('googlems', 'gif', 'Tech')">install to Mozilla</a>
</p> </li>
<li>
<p>InfoWorld: <a href="http://weblog.infoworld.com/udell/gems/infoworld.src">src</a>, <a href="http://weblog.infoworld.com/udell/gems/infoworld.gif">img</a>, <a href="javascript:addEngine('infoworld', 'gif', 'Tech')">install to Mozilla</a>
</p>
</li>
<li>
<p>Jon's Radio: <a href="http://weblog.infoworld.com/udell/gems/jonblog.src">src</a>, <a href="http://weblog.infoworld.com/udell/gems/jonblog.gif">img</a>, <a href="javascript:addEngine('jonblog', 'gif', 'Tech')">install to Mozilla</a>
</p>
</li>
<li>
<p>Safari Books Online: <a href="http://weblog.infoworld.com/udell/gems/safari.src">src</a>, <a href="http://weblog.infoworld.com/udell/gems/safari.gif">img</a>, <a href="javascript:addEngine('safari', 'gif', 'Tech')">install to Mozilla</a>
</p>
</li>
<li>
<p>Switchboard: <a href="http://weblog.infoworld.com/udell/gems/switchboard.src">src</a>, <a href="http://weblog.infoworld.com/udell/gems/switchboard.gif">img</a>, <a href="javascript:addEngine('switchboard', 'gif', 'Tech')">install to Mozilla</a>
</p>
</li>
</ul>
<p>
You put them into a searchplugins directory whose location is platform-dependent. Here on Mac OS X, with Firebird, it's /Applications/Mozilla Firebird.app/Contents/MacOS/searchplugins.
</p>
<p>
The Switchboard example is slightly gratuitous. You can't find anybody in Switchboard with a single search term, so it just winds up being a quick way to launch Switchboard with the person's last name already filled in.
</p>
<p>
Nothing earthshattering, but certainly handy. Here's the <a href="http://mycroft.mozdev.org/download.html">Mycroft directory</a> of search plugins.
</p>
</body>
</item> 

<item num="a721">
<title>Censored!</title>
<date>2003/06/13</date>
<body>
<p>
<table align="right" cellspacing="0" cellpadding="6">
<tr>
<td>
<a href="http://weblog.infoworld.com/udell/gems/surfControl.gif">
<img border="1" width="275" height="169" src="http://weblog.infoworld.com/udell/gems/s_surfControl.gif" alt="censored!"/>
</a>
<div align="center" class="realsmall">weblog.infoworld.com considered harmful</div>
</td>
</tr>
</table>
A reader wrote to point out that weblog.infoworld.com is <a href="http://mtas.surfcontrol.com/mtas/MTAS.asp">categorized</a> by <a href="http://www.surfcontrol.com/">SurfControl</a> as &quot;Usenet News&quot; and is, therefore, being blocked for people in organizations that deploy SurfControl's server-based filter. 
</p>
<p>
Curious, I downloaded a copy of their software to check it out. Before I got a chance to look at it, I got a sales call from the SurfControl folks, whom I proceeded to grill on their procedures. Is all &quot;Usenet News&quot; blocked by default? Yes. Can users override the block? Yes, either per-category or per-site. Does each such override require a transaction with the local SurfControl administrator? Yes. Will a subsequent recategorization by SurfControl take precedence over a locally administered override? No. Does the operator of a categorized site receive any notification of the status assigned to it? No, it's up to you to <a href="http://mtas.surfcontrol.com/mtas/MTAS.asp">check for yourself.</a> 
</p>
<p>
And then, of course, the burning question: how did my site get assigned to an objectionable category that's blocked unless users request an override?
</p>
<blockquote>
<i>
<p>
<b>SurfControl</b>: We have a worldwide team of expert researchers who evaluate sites 24x7.
</p>
<p>
<b>Jon</b>: Hmm. Well it seems there's been an error. Can you please correct it?
</p>
<p>
<b>SC</b>: Give me the URL and I'll review your site.
</p>
<p>
<b>JU</b>: Look, just type &quot;j-o-n&quot; into Google and follow the first link. 
</p>
</i>
</blockquote>
<p>
(OK, that bit was harsh, but geez, give me a break...)
</p>
<p>
I guess there's a good chance this will be fixed in a few days. Not that it matters much to readers of this blog, since few of you -- I suspect -- work in SurfControlled organizations. Still, the experience has opened a window onto a world I'm glad I don't live in.
</p>
<p>
<b>Update</b>: Jenny points out that maybe I do, or soon will, live in that world:
</p>
<blockquote cite="Jenny Levine">
Imagine for a moment that you don't have internet access at home, school is closed for the day, or you're just at the public library doing your homework. In the near future, Congress may finally get its wish to force libraries to install filtering software like SurfControl on all of their internet terminals. They claim it's to protect the kids, but in reality it censors adults and kids alike. States are trying to do this, too, rather than leaving it up to you and your local community. [<a href="http://www.theshiftedlibrarian.com/2003/06/13.html#a4121">The Shifted Librarian</a>]
</blockquote>
<p>
I also note that although Scripting News <a href="http://scriptingnews.userland.com/backissues/2003/06/13#When:7:04:05AM">may often be blocked</a>, SurfControl has www.scripting.com in the presumably non-objectionable Computing &amp; Internet category.
</p>
</body>
</item> 

<item num="a720">
<title>Structured writing, structured search</title>
<date>2003/06/12</date>
<body>
<p>
<blockquote>
<i>
From a user's point of view, XPath query strings are pretty darned geeky. I'm hopeless with them myself unless I have examples in front of me. I find that having a list of examples available in the context of my own live data, and synchronizing it to an input box in which examples can be modified, leads me to discover and record more useful patterns. A subtler thing happens too. As you're writing the XHTML, the search possibilities begin to guide your choices. [Full story at <a href="http://webservices.xml.com/pub/a/ws/2003/06/10/xpathsearch.html">O'Reilly Network</a>]
</i>
</blockquote>
I always think that my latest invention is the coolest one ever, so you should take this with a grain of salt, but I can't stop thinking about the implications of this one. First, because of the cross-browser, cross-OS angle introduced by Mozilla. Second, because it strikes me that XPath really could be packaged up for use by civilians (i.e.,  non-geeks). Third, because the availability of structured search -- during the writing process -- can have a profound effect on how (and why) we structure what we write. 
</p>
</body>
</item> 

<item num="a719">
<title>The WASTE affair</title>
<date>2003/06/11</date>
<body>
<p>
<img alt="waste" hspace="6" src="http://weblog.infoworld.com/udell/gems/waste.gif" align="right"/>
Once upon a time (1998) I prototyped a <a href="http://udell.roninhouse.com/dhttp/dhttp.html">P2P system</a> based on two simple ideas:
<ol>
<li>A local webserver implemented in a scripting language.</li>
<li>Browser-based access to local (or remote) apps.</li>
</ol>
</p>
<p>
The possibilities inherent in this architecture were, and are, astonishing. Since it was a Web-style system, it was obvious -- yet still somehow surprising -- that applications could display anywhere. Less obvious was the ease with which I could make data, and even code, replicate. And all that <i>before</i> I implemented the notion of local proxies, which could intercept and act on browser-initiated connections to remote nodes.
</p>
<p>
Although there are a million reasons to use proxies, the one that motivated me was security. I wanted to see if I could encrypt the connection to a remote node, and I wanted to do it without having to implement SSL in my tiny HTTP listener. Today, I'd probably just use <a href="http://www.stunnel.org/">stunnel</a>. But then, as an exercise, I added <a href="http://www.counterpane.com/blowfish.html">Blowfish</a> encryption between nodes, as described in <a href="http://safari.oreilly.com/?XmlId=1-56592-537-8/ch15-10097">a section</a> of my book. When it came time to post the sample software for the book, though, I removed the Blowfish implementation and substituted a non-cryptographic transformation. Why? I didn't understand the export regulations, but suspected there could be problems.
</p>
<p>
Times haven't changed much. Ray Ozzie recently <a href="http://www.ozzie.net/blog/2003/05/30.html#a91">called</a> <a href="http://www.ozzie.net/blog/2003/06/04.html#a93">attention</a> to the controversy over AOL's withdrawal of Nullsoft's <a href="http://www.nullsoft.com/free/waste/download.html">WASTE</a>. 
</p>
<blockquote cite="Ray Ozzie">
...let's say that AOL really does regard this as proprietary software, as its web page now seems to imply, and that the shadowy Dick Pumpaloaf didn't just write the doc ... he indeed posted Justin's code without AOL's consent.  Then AOL would have had to apply for an export license, which takes many many months for a technical review and, in our experience takes on average about six months just for renewals or major version modifications. [<a href="http://www.ozzie.net/blog/2003/06/04.html#a93">Ray Ozzie's weblog</a>]
</blockquote>
<p>
A bit of explanation is in order here. The source distribution of the WASTE P2P system included a Word file describing WASTE's architecture. (<a href="http://www.danger-island.com/~dav/writeon/archives/000903.shtml">Apparently</a> the acronym refers to the underground postal system in Thomas Pynchon's <a href="http://allconsuming.net/item.cgi?isbn=0060931671">The Crying of Lot 49</a>.) The file's author property is Dick Pumpaloaf, whose name was, as I wrote this, a <a href="http://www.google.com/search?q=%22dick%20pumpaloaf%22">googlewhack</a>. The Last-Saved-By property is, however, Justin Frankel, the subject of this sensational Yahoo! News headline: <a href="http://story.news.yahoo.com/news?tmpl=story&amp;u=/ap/20030603/ap_on_hi_te/aol_nullsoft_1">Rogue AOL Subsidiary Leader to Resign</a>.
</p>
<p>
I have, of course, destroyed any and all copies of the Software, including by deleting it from my computer, as per <a href="http://www.nullsoft.com/free/waste/download.html">instructions</a>. While investigating it, though, I was reminded of my own experiments with encryption and proxying. After establishing a private WASTE network behind my NAT, I took one of the nodes to a separate Internet-connected subnet, after punching a hole in the first subnet's NAT for port 1337, and connected back into the network. Then I created a new node on the outside network, and introduced it to the original outside node (by exchanging public keys). Now both nodes were visible to everybody on the inside network. WASTE nodes are routers; public keys are broadcast; nodes accept broadcast public keys by default. There isn't complete firewall/NAT transparency, but so long as any member of a private network is world-visible, it seems that way.
</p>
<p>
Ad-hoc and administratorless VPNs are the name of this game. It's what Groove does, and what other tools -- including WASTE and its successors -- will also do. For a long time, RSA's patent and trademark claims impeded a lot of the innovation that should have been happening in this area. Now, as Ray points out, there's still the export-regulation impediment, and ignoring it doesn't make it go away. I'm glad Ray's shining a light on this issue.
</p>
<p>
Let me see if I understand correctly. Take <a href="http://www.winfosec.com/">Winfosec</a> for example. The company offers <a href="http://www.winfosec.com/download.php">SIMP</a>, a secure IM tool, and makes the source freely available. It is therefore subject to a <a href="http://www.bxa.doc.gov/Encryption/PubAvailEncSourceCodeNofify.html">notification requirement</a>, which boils down to sending mail to crypt@bis.doc.gov, enc@ncsc.mil, and web_site@bis.doc.gov. OK. That's easy enough for Winfosec to have done. 
</p>
<p>
There's also, Ray says, &quot;an affirmative obligation to use efforts to block the 'rogue nations.'&quot; Hence this Winfosec warning:
</p>
<blockquote>
<i>
WARNING: You may not download SIMP or its source code if you are a resident of Cuba, Iran, Iraq, Libya, North Korea, Syria, or Sudan.
<br/>
<br/>
Notice: SIMP contains strong encryption, which is classified as a munition in the United States. If you live in the US and plan to redistribute either SIMP or its source code, you may be required to notify the US Bureau of Export Administration (BXA). 
</i>
</blockquote>
<p>
I see similar language at <a href="http://www.groove.net/downloads/groove/">groove.net</a>:
</p>
<blockquote>
<i>
The software you are about to download contains restricted cryptographic functionality subject to U.S. export control law. You may not download this software if you are located in Cuba, Iran, Iraq (for some purposes), Libya, North Korea, Syria, or the Sudan. U.S. export control law forbids the export or re-export of this software to any destination in any of these countries or to any person on the U.S. Department of Commerce's list of denied persons or entities or the U.S. Treasury Department's master list of Specially Designated Nationals and Blocked Persons. By downloading this software, you agree that you will not transfer it to any destination in (or to a national of) any of the countries listed above or to any person or entity on a U.S. denial list, and that you will comply with all import regulations imposed by jurisdictions other than the United States. 
</i>
</blockquote>
<p>
That also seems straightforward enough. But terrorists aren't any likelier to be stopped by such notices than are teenagers when confronted with &quot;You Must Be Over 18&quot; entry pages. Are Winfosec, or Groove, or others expected to do more? For example, to try to block specific ISPs or IP address ranges? 
</p>
<p>
That would be quixotic anyway, for the very reason that WASTE demonstrates: the protean power of proxies. And of course the government can't come right out and say that. So it seems we'll continue to muddle along, trying to do the right thing, but worrying (some more than others) that we've done the wrong thing. I wish we could dispel this cloud of confusion and guilt that surrounds cryptographic software, and just get on with it. There's a ton of vital experimentation and innovation that still needs to occur. 
</p>
</body>
</item> 

<item num="a718">
<title>In search of the JBoss Pet Store</title>
<date>2003/06/10</date>
<body>
<p>
<table align="right" cellspacing="0" cellpadding="6">
<tr>
<td>
<a href="http://weblog.infoworld.com/udell/gems/jbossPetstore.gif">
<img width="300" height="200" alt="jboss petstore" src="http://weblog.infoworld.com/udell/gems/jbossPetstore.gif"/>
</a>
<div class="realsmall" align="center">netbooting the jboss petstore</div>
</td>
</tr>
</table>
So I'm checking out JBoss 4, and I figured that a good way to do that is to try running Sun's Pet Store demo app. Various sets of instructions are available for getting it to work with earlier versions of JBoss, but I didn't find anything for 4.0 (or 3.2). Along the way, though, I did learn about a neat JBoss trick called netboot -- it's old news in the JBoss community but apparently not widely known. As the <a href="http://www.jboss.org/index.html?module=html&amp;op=userdisplay&amp;id=demos/netboot">jboss.org netboot page</a> explains, you simply download a very small Java stub, then type:
</p>
<p>
./bin/run.sh --netboot http://jboss.sf.net/demo/netboot --config default
</p>
<p>
Pretty soon, depending on the speed of your net connection, you're running a minimal JBoss 3.0 server. It's amazingly cool. And on the <a href="http://www.jboss.org/index.html?module=html&amp;op=userdisplay&amp;id=demos/netboot-advanced#Pet_Store">advanced netboot</a> page, there's this hopeful suggestion:
</p>
<blockquote>
<i>
This configuration provides the required components for Sun's Java Pet Store demo running inside of JBoss! <br/>
<br/>
./bin/run.sh --netboot http://jboss.sf.net/demo/netboot --config petstore
</i>
</blockquote>
<p>
Ah, the Pet Store! Just what I'm looking for. That'd be awesome. Except, oops,
</p>
<blockquote>
<i>
TODO: Make the config... this won't work until there is a config up there...
</i>
</blockquote>
<p>
OK, now I'm curious. A fairly recent jboss-development list <a href="http://www.mail-archive.com/jboss-development@lists.sourceforge.net/msg34653.html">message</a> notes that the current JBoss CVS tree includes, under applications, a builder that can migrate Sun's Pet Store to JBoss and even netboot it. I tried it, and the screenshot displays the hopeful result. Unfortunately, it doesn't work. The Pet Store <s>build</s> deployment gave me a bunch of these messages:
</p>
<p>
09:16:44,483 WARN  [verifier] EJB spec violation: <br/>
Bean   : LineItemEJB<br/>
Section: 10.6.13<br/>
Warning: The class must provide suitable implementation of the hashCode() method.
</p>
<p>
Now I'm a complete JBoss novice, so there are a million ways I could have gotten this wrong. Very possibly some reader of this posting will point me to the forehead-slapping solution. But still, you've got to wonder.
</p>
<p>
&quot;Marketing,&quot; wrote Marc Fleury on the <a href="http://www.mail-archive.com/jboss-development@lists.sourceforge.net/msg32048.html">list</a>, &quot;ain't it a bitch.&quot; Perhaps, but it strikes me that if a working netbootable Pet Store were visible anywhere on the net -- or heck, forget netboot, just a working JBoss 3.2/4.0 Pet Store that could be downloaded and dropped into a deployment directory on a test server -- then those URLs would become rather well known whether or not they were marketed. I think the word we are looking for here is 'finishing':
</p>
<blockquote cite="Paul Everitt">
That extra touch is called finishing work, and it's the kind of thing that, too often, we in open source don't do well. I'm not sure why, but my guess is that in the hierarchy of needs, finishing work provides little positive reinforcement to open source developers. [<a href="http://radio.weblogs.com/0116506/2003/06/06.html#a100">Zope Dispatches</a>]
</blockquote>
<p>
Paul Everitt made this same point at the OSCOM conference I recently attended. It's a profound issue for open source.
</p>
</body>
</item> 

<item num="a717">
<title>RSS reader stats re-analyzed</title>
<date>2003/06/08</date>
<body>
<p>
My <a href="http://weblog.infoworld.com/udell/2003/06/04.html#a712">report on June 3's access_log</a> sparked a fair bit of commentary. I realized today, though, that I neglected to account for the rss.xml requests that were answered with an HTTP 304 (&quot;not modified&quot;) response. The RSS community owes <a href="http://www.pocketsoap.com/weblog/stories/2002/05/0015.html">Simon Fell</a> a vote of thanks for noticing, about a year ago, that the volume of redundant RSS traffic could be radically reduced by exploiting the HTTP/1.1 ETag and If-None-Match headers. Aggregators promptly jumped on Simon's suggestion, and soon after things got a lot faster and more efficient.
</p>
<p>
From a log analysis perspective, the result is that an RSS request can look like this:
</p>
<p>
&quot;GET /udell/rss.xml HTTP/1.1&quot; <b>304 0</b> &quot;http://ranchero.com/software/netnewswire/&quot; &quot;NetNewsWire Lite/1.0.2 (Mac OS X)&quot;
</p>
<p>
Or this:
</p>
<p>
&quot;GET /udell/rss.xml HTTP/1.1&quot; <b>200 39861</b> &quot;http://ranchero.com/software/netnewswire/&quot; &quot;NetNewsWire Lite/1.0.1 (Mac OS X)&quot;
</p>
<p>
In the former case, the newsreader was informed that there was no change to the RSS file since the last time it was fetched. In the latter case, there was a change, and 39861 bytes of RSS data were shipped to the client.
</p>
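<p>
Here's a bare-bones Python sketch of the server's side of that exchange. It's a simplification under stated assumptions: the ETag is derived from an MD5 hash of the feed (one common choice, not necessarily what any given server does), and the If-None-Match value is compared as a single tag, whereas the real header may carry a list. The function names are mine.
</p>

```python
import hashlib

def make_etag(body):
    """Derive a validator from the feed bytes (an assumed scheme)."""
    return '"' + hashlib.md5(body).hexdigest() + '"'

def respond(body, if_none_match=None):
    """Return (status, etag, payload) for a GET of the feed."""
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, etag, b""   # unchanged: no body is sent
    return 200, etag, body      # first fetch, or the feed has changed

feed_v1 = b"first version of rss.xml"
feed_v2 = b"second version of rss.xml"

# First fetch: the reader has no ETag yet, so it gets the whole feed.
status, etag, payload = respond(feed_v1)

# Later polls send the remembered ETag back in If-None-Match.
second = respond(feed_v1, if_none_match=etag)   # nothing new
third = respond(feed_v2, if_none_match=etag)    # feed was updated

print(status, second[0], third[0])
```

<p>
The saving comes entirely from the empty payload in the 304 case; the request itself still shows up in the log.
</p>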
<p>
For my 55354 log entries on June 3, the big picture looks like this:
</p>
<pre>
         total requests:  55354
   requests for rss.xml:  19165
   for rss.xml, non-304:   6986
</pre>
<p>
Here's a more detailed breakdown:
</p>
<table width="80%" border="1" cellspacing="0" cellpadding="4">
 <tr>
  <td/>
  <td align="right">rss.xml requests</td>
  <td align="right">HTTP 200 responses</td>
  <td align="right">200s as % of requests</td>
  <td align="right">HTTP 304 responses</td>
  <td align="right">304s as % of requests</td>
 </tr>
 <tr>
  <td>total</td>
  <td align="right">19165</td>
  <td align="right">6986</td>
  <td align="right">36%</td>
  <td align="right">12179</td>
  <td align="right">64%</td>
 </tr>
 <tr>
  <td>newswire</td>
  <td align="right">4280</td>
  <td align="right">1179</td>
  <td align="right">28%</td>
  <td align="right">3101</td>
  <td align="right">72%</td>
 </tr>
 <tr>
  <td>sharpreader</td>
  <td align="right">2941</td>
  <td align="right">506</td>
  <td align="right">17%</td>
  <td align="right">2435</td>
  <td align="right">83%</td>
 </tr>
 <tr>
  <td>newsgator</td>
  <td align="right">1105</td>
  <td align="right">189</td>
  <td align="right">17%</td>
  <td align="right">916</td>
  <td align="right">83%</td>
 </tr>
 <tr>
  <td>feedreader</td>
  <td align="right">1128</td>
  <td align="right">640</td>
  <td align="right">57%</td>
  <td align="right">488</td>
  <td align="right">43%</td>
 </tr>
</table>
<p>
Clearly Simon's technique is saving everybody a ton of bandwidth. 
</p>
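<p>
A tally like the one above can be sketched in a few lines of Python. This isn't the script I actually ran, just an approximation that assumes combined-format log lines. The first two sample lines are the ones quoted earlier; the other two, including their user-agent strings, are invented.
</p>

```python
import re
from collections import Counter

# Captures method, path, status, and byte count from a combined-format line.
LINE = re.compile(r'"([A-Z]+) (\S+) [^"]*" (\d{3}) (\d+|-)')

def tally(lines, feed_path="/udell/rss.xml"):
    """Count all requests, feed requests, and feed requests per status code."""
    counts = Counter()
    for line in lines:
        match = LINE.search(line)
        if match is None:
            continue                    # skip lines that do not parse
        method, path, status, size = match.groups()
        counts["total"] += 1
        if path == feed_path:
            counts["feed"] += 1
            counts["feed_" + status] += 1
    return counts

def percent(part, whole):
    """Whole-number percentage, guarding against an empty denominator."""
    return round(100 * part / whole) if whole else 0

SAMPLE = [
    '"GET /udell/rss.xml HTTP/1.1" 304 0 "http://ranchero.com/software/netnewswire/" "NetNewsWire Lite/1.0.2 (Mac OS X)"',
    '"GET /udell/rss.xml HTTP/1.1" 200 39861 "http://ranchero.com/software/netnewswire/" "NetNewsWire Lite/1.0.1 (Mac OS X)"',
    '"GET /udell/rss.xml HTTP/1.1" 304 0 "-" "ExampleReader/0.1"',
    '"GET /udell/ HTTP/1.1" 200 51234 "-" "ExampleBrowser/1.0"',
]

counts = tally(SAMPLE)
print(counts["total"], counts["feed"], counts["feed_200"], counts["feed_304"])
print(percent(counts["feed_304"], counts["feed"]))
```

<p>
Feeding the totals from the table through <tt>percent</tt> gives the same 36% and 64% split.
</p>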
<p>
Does this diminish the importance of RSS readers? It depends what you count. If you subtract the rss.xml requests from the total, there were 36,189 non-RSS requests. But of course, many of these were for images of coffee cups and other UI paraphernalia. The number of requests for HTML pages was only 7752, a number only slightly greater than the 6986 RSS requests that yielded fresh content. That's quite remarkable. And yes, I am curious to see whether and how today's &lt;xhtml:body&gt; experiment might affect that ratio.
</p>
<p>
<b>Update</b>: More fiddling yields this:
</p>
<table border="1" cellpadding="4" cellspacing="0">
<tr>
<td align="right"/>       <td align="right">&quot;views&quot; <sup>1</sup>
</td>    <td align="right">unique IPs</td>    <td align="right">adjusted <br/> unique IPs <sup>2</sup>
</td>
<td>adjusted %</td>
</tr>

<tr>
<td align="right">HTML</td>   <td align="right">8410</td>    <td align="right">1306</td>    <td align="right">996</td> <td align="right">75%</td>
</tr>

<tr>
<td align="right">RSS</td>    <td align="right">6685</td>   <td align="right">2005</td>   <td align="right">1695</td>  <td align="right">85%</td>
</tr>
</table>
<p>
<i>
<sup>1</sup> For HTML, requests for */ or *.html. For RSS, requests for rss.xml, excluding HEAD requests and 304 responses.
</i>
</p>
<p>
<i>
<sup>2</sup> For HTML, IPs found in HTML requests and not in RSS requests. For RSS, the inverse.
</i>
</p>
<p>
It appears there are two quite disjoint populations of readers. 75% of the IP addresses that requested HTML pages never requested the RSS feed, and 85% of the IP addresses that requested the feed never requested an HTML page. Another way to look at this is to combine the two lists of IP addresses, dedupe them, and check how many appear in both. The combined list holds 3001 distinct IPs. Of these, only 310 appear in both.
</p>
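<p>
The bookkeeping here is plain set arithmetic. A small Python sketch, using made-up addresses from the documentation range, followed by a consistency check on the counts reported above:
</p>

```python
def overlap_report(html_ips, rss_ips):
    """Summarize how two collections of client addresses overlap."""
    html_ips, rss_ips = set(html_ips), set(rss_ips)
    return {
        "html_only": len(html_ips - rss_ips),
        "rss_only": len(rss_ips - html_ips),
        "both": len(html_ips.intersection(rss_ips)),
        "combined": len(html_ips.union(rss_ips)),
    }

# Invented addresses, purely for illustration; duplicates are deduped.
html = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.2"]
rss = ["192.0.2.3", "192.0.2.4"]
report = overlap_report(html, rss)
print(report)

# The same arithmetic applied to the numbers in the table.
html_total, rss_total, in_both = 1306, 2005, 310
assert html_total - in_both == 996                 # adjusted unique IPs, HTML
assert rss_total - in_both == 1695                 # adjusted unique IPs, RSS
assert html_total + rss_total - in_both == 3001    # distinct IPs overall
```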
</body>
</item> 

<item num="a716">
<title>RSS bugfixes and experiments</title>
<date>2003/06/08</date>
<body>
<p>
I've been tinkering again with my RSS feeds. First, I found and fixed the problem that was causing the <a href="http://weblog.infoworld.com/udell/gems/longDescriptionFeed.xml">alternate &quot;long-descriptions&quot; feed</a> to sometimes break. The problem was that, when an item was tagged with a category, the feed got processed twice -- and the second time through, there was no &lt;content:encoded&gt; element to latch onto and turn into the &lt;description&gt;. The fix, for now, is to not postprocess category feeds. In general, I'm rethinking this whole business of rendering categories as completely separate HTML subtrees and RSS feeds. It feels too heavyweight. The primary feed, after all, includes things like &lt;category&gt;InfoWorld&lt;/category&gt;. An XPath search could just pick those items out of the standard feed, right?
</p>
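<p>
As a rough illustration of that last idea: in XPath the selection would be something like //item[category='InfoWorld']. The sketch below expresses the same selection as a plain filter over items that have already been parsed into objects. The record shape, the function name, and the category assignments are assumptions made up for the example.
</p>

```javascript
// Items as they might look after parsing the primary feed.
// The category assignments here are hypothetical sample data.
const items = [
  { title: 'Choosing your J2EE weapons', categories: ['InfoWorld'] },
  { title: 'Searchable slides', categories: [] },
  { title: 'Winning the browser peace', categories: ['InfoWorld'] },
];

// Equivalent in spirit to the XPath //item[category='InfoWorld'].
function itemsInCategory(allItems, category) {
  return allItems.filter((item) => item.categories.includes(category));
}

const infoWorldTitles = itemsInCategory(items, 'InfoWorld').map((item) => item.title);
console.log(infoWorldTitles);
```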
<p>
Another experimental change: I've tweaked the transformation applied to the default feed so that it truncates the &lt;xhtml:body&gt; in the same way that the &lt;description&gt; is truncated. For now, only the first element is included. Here's the XSLT template, if you're curious:
</p>
<pre class="code" lang="xslt">
&lt;xsl:template match=&quot;//xhtml:body&quot;&gt;
&lt;body xmlns:xhtml=&quot;http://www.w3.org/1999/xhtml&quot;&gt;
&lt;xsl:copy-of select=&quot;./*[position()=1]&quot; /&gt;
&lt;p&gt;
[Full story: 
&lt;a&gt;
&lt;xsl:attribute name=&quot;href&quot;&gt;
&lt;xsl:value-of select=&quot;ancestor::item/link&quot;/&gt;
&lt;/xsl:attribute&gt;
&lt;xsl:value-of select=&quot;ancestor::item/title&quot;/&gt;
&lt;/a&gt;
]
&lt;/p&gt;
&lt;/body&gt;
&lt;/xsl:template&gt;
</pre>
<p>
I'm doing this for a couple of reasons. First, because I've realized that the newer readers, like <a href="http://www.sharpreader.net/">SharpReader</a>, pick up the &lt;xhtml:body&gt; by default. For the same reason that I truncate &lt;description&gt; for older readers -- namely, to ensure the scannability of my feed amidst a bunch of other feeds -- it seems I should also now truncate &lt;xhtml:body&gt;. Those who really do prefer to take the entire contents of each item in the feed can still use the alternate feed. Personally, I'm still annoyed by feeds that dump full items into my aggregator, and I still consider item truncation to be the polite way to present your feed. 
</p>
<p>
Still, I'm not certain I'll stick with this policy. One of the reasons I created the full &lt;xhtml:body&gt;, after all, was that I wanted to encourage aggregators to offer the kinds of advanced XPath searching I've been exploring lately. Truncating the &lt;xhtml:body&gt; conflicts with that desire. With respect to polite presentation, an advanced aggregator that receives a full &lt;xhtml:body&gt; can choose to truncate -- even on a per-feed basis -- at the user's discretion, without being constrained by the feed provider's policy.
</p>
<p>
Clearly other issues are in play as well. Since InfoWorld.com is an ad-supported operation, the fact is that any kind of full-content feed can be used to disintermediate the site's advertising. Will in-feed ads substitute for on-site ads? Will any kind of conventional advertising work in the RSS world? Dunno. I think we're all just making it up as we go along. 
</p>
</body>
</item> 

<item num="a715">
<title>Choosing your J2EE weapons</title>
<date>2003/06/07</date>
<body>
<p>
<blockquote>
<i>
When Sun finally shipped J2EE in December 1999, Java was already well-established as a platform for serious enterprise applications. The features that Java app server vendors were already delivering -- transaction and session management, clustering, role-based security -- would now, it seemed, coalesce into a standard. But the dizzying complexity of that standard touched off a debate that continues to this day. It boils down to robustness vs. agility. In theory, J2EE can combine both by cleanly separating plumbing from business logic. In practice, the means of achieving that separation are subtle, varied, and controversial. [Full story at <a href="http://www.infoworld.com/article/03/06/06/23FEj2eeweapons_1.html">InfoWorld.com</a>]
</i>
</blockquote>
</p>
<p>
This story ran alongside a <a href="http://www.infoworld.com/article/03/06/06/23FEj2ee_1.html">review of four J2EE servers</a> (WebLogic, WebSphere, JBoss, and EAServer) whose authors, Oliver Rist and David Aubrey, deserve congratulations for a job done thoroughly and well. 
</p>
</body>
</item> 

<item num="a714">
<title>Winning the browser peace</title>
<date>2003/06/07</date>
<body>
<p>
<blockquote>
<i>
Mozilla has emerged from its long nuclear winter to become a pillar of the Linux desktop. Alpha geeks everywhere (including Sun and Microsoft) are running Safari on their PowerBooks. But here's the reality check you knew was coming: cross-browser and cross-OS compatibility remains nearly as elusive as ever. I won't bore you with the details. Let's just say that testing CSS and JavaScript effects on the three major OS platforms, in six different browsers, isn't a good use of anybody's time. [Full story at <a href="http://www.infoworld.com/article/03/06/06/23OPstrategic_1.html">InfoWorld.com</a>]
</i>
</blockquote>
</p>
<p>
As my blog entries this week indicate, I'm still deeply contemplating the state of the browser art. It seems kind of retro to yearn for cross-browser and cross-OS support of advanced standards but, when the pieces of the puzzle fall into place -- as XSLT did for me this week -- it can still give me the same kind of buzz I got when the 4.0 browsers (was it really <i>1997</i>?) delivered CSS, client certs, and rich mail/news clients. 
</p>
</body>
</item> 

<item num="a713">
<title>Cross-platform, cross-browser XML apps</title>
<date>2003/06/05</date>
<body>
<p>
<table align="right">
<tr>
<td>
<a href="http://weblog.infoworld.com/udell/gems/firebird.gif">
<img width="300" height="235" src="http://weblog.infoworld.com/udell/gems/firebird_s.gif"/>
</a>
<div align="center" class="realsmall">Firebird Mac running an XSLT app</div>
</td>
</tr>
</table>
Let's review what's happening in this screen shot. I'm running Mozilla Firebird on my Mac. The application is a structured search of my OSCOM slides. There's no search engine beyond the browser itself, which provides the JavaScript UI, the XPath-based search, and the XSLT-driven results display. 
</p>
<p>
This app behaves identically in Firebird on Windows and Linux. It also behaves identically in MSIE 6. 
</p>
<p>
One more thing. Although it's not obvious from this screenshot, my AirPort is turned off at the moment. Search still works, since the DOM that was built from the XML content at infoworld.com is hanging around in memory.
</p>
<p>
Pretty freaking cool. I'll leave as an exercise for the reader the fact that this DOM could have been built from a SOAP call, the mechanism for which is now also commonly available in Mozilla and MSIE.
</p>
</body>
</item> 

<item num="a712">
<title>User agents revisited</title>
<date>2003/06/04</date>
<body>
<p>
It's been a while since I took a look at my own browser stats. So long that the term is really obsolete, given the rise of the RSS newsreader. We might as well just call the things that fetch web pages what they technically are: user agents. Anyway, I started by looking for a comprehensive list of user-agent signatures, and found a promising candidate at <a href="http://www.pgts.com.au/pgtsj/pgtsj0212d.html">PGTS</a>. (Got a better one? Let me know.) Their <a href="http://www.pgts.com.au/download/data/browser_list.txt">compilation</a> of about 6600 user-agent strings seemed reasonably current. I ran yesterday's 55000 log entries for this blog through it and got this:
</p>
<table cellspacing="0" cellpadding="2">
<tr>
<td>
<i>unclassified</i>
</td>
<td align="right">30085</td>
<td align="right">54.350</td>
</tr>
<tr>
<td>
<b>MSIE</b>
</td>
<td align="right">17554</td>
<td align="right">31.712</td>
</tr>
<tr>
<td>
<b>Mozilla</b>
</td>
<td align="right">3852</td>
<td align="right">6.959</td>
</tr>
<tr>
<td>
<b>Safari</b>
</td>
<td align="right">1380</td>
<td align="right">2.493</td>
</tr>
<tr>
<td>
<b>Netscape</b>
</td>
<td align="right">842</td>
<td align="right">1.521</td>
</tr>
<tr>
<td>
<b>Opera</b>
</td>
<td align="right">611</td>
<td align="right">1.104</td>
</tr>
<tr>
<td>
<b>Galeon</b>
</td>
<td align="right">433</td>
<td align="right">0.782</td>
</tr>
<tr>
<td>
<b>Konqueror</b>
</td>
<td align="right">170</td>
<td align="right">0.307</td>
</tr>
<tr>
<td>
<b>Python-urllib</b>
</td>
<td align="right">170</td>
<td align="right">0.307</td>
</tr>
<tr>
<td>
<b>Java</b>
</td>
<td align="right">82</td>
<td align="right">0.148</td>
</tr>
<tr>
<td>
<b>Powermarks</b>
</td>
<td align="right">52</td>
<td align="right">0.094</td>
</tr>
<tr>
<td>
<b>Lynx</b>
</td>
<td align="right">38</td>
<td align="right">0.069</td>
</tr>
<tr>
<td>
<b>Crazy Browser</b>
</td>
<td align="right">18</td>
<td align="right">0.033</td>
</tr>
<tr>
<td>
<b>iCab</b>
</td>
<td align="right">15</td>
<td align="right">0.027</td>
</tr>
<tr>
<td>
<b>OmniWeb</b>
</td>
<td align="right">14</td>
<td align="right">0.025</td>
</tr>
<tr>
<td>
<b>PHP</b>
</td>
<td align="right">14</td>
<td align="right">0.025</td>
</tr>
<tr>
<td>
<b>lwp-trivial</b>
</td>
<td align="right">13</td>
<td align="right">0.023</td>
</tr>
<tr>
<td>
<b>Wget</b>
</td>
<td align="right">8</td>
<td align="right">0.014</td>
</tr>
<tr>
<td>
<b>CFNetwork</b>
</td>
<td align="right">2</td>
<td align="right">0.004</td>
</tr>
<tr>
<td>
<b>Download Ninja</b>
</td>
<td align="right">1</td>
<td align="right">0.002</td>
</tr>
</table>
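<p>
The tally above can be approximated as follows. This is a minimal sketch, assuming classification is a simple substring match against a list of known signatures; the four-entry list stands in for the much larger PGTS compilation, and the sample user-agent strings are illustrative rather than taken from the log.
</p>

```javascript
// Known signatures, in the spirit of the PGTS list (tiny hypothetical subset).
// Order matters: MSIE and Safari strings also contain "Mozilla".
const signatures = ['MSIE', 'Safari', 'Opera', 'Mozilla'];

// Return the first signature found in the user-agent string, or 'unclassified'.
function classify(userAgent, knownSignatures) {
  const hit = knownSignatures.find((sig) => userAgent.includes(sig));
  return hit === undefined ? 'unclassified' : hit;
}

// Count hits per class and report each class's share of all requests,
// sorted by descending count.
function tally(userAgents, knownSignatures) {
  const counts = new Map();
  for (const ua of userAgents) {
    const name = classify(ua, knownSignatures);
    counts.set(name, (counts.get(name) || 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([name, count]) => ({
      name,
      count,
      percent: ((100 * count) / userAgents.length).toFixed(3),
    }));
}

// Illustrative user-agent strings, not real log entries.
const sampleLog = [
  'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)',
  'Mozilla/5.0 (Macintosh; U; PPC Mac OS X) AppleWebKit/85 (KHTML, like Gecko) Safari/85',
  'NetNewsWire/1.0.2 (Mac OS X)',
  'Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)',
];
const rows = tally(sampleLog, signatures);
console.log(rows);
```

<p>
Anything the signature list does not recognize lands in the <i>unclassified</i> bucket, which is why that bucket dominates when the list predates the newer RSS readers.
</p>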
<p>
Clearly that <i>unclassified</i> category wants to be unpacked. So I scanned the log for user-agent names, producing a list like this:
<pre>
amaya/5.1
aolbrowser/1.0
curl/7.7.1
curl/7.9.8
gazz/2.1
gnome-vfs/1.0.1
iCab/2.8
iCab/2.9
Mozilla/4.5
iCab/2.9.1
</pre>
</p>
<p>
I threw away the versions, deduped, and scanned my log entries again, giving preference to the PGTS list (bolded in the tables) but then falling back to my secondary names (italicized in the tables). Of the many interesting points that could be drawn from this data, I'll just focus on one for now. Browsers whose names begin with &quot;Mozilla&quot; make up more than a third of what was the <i>unclassified</i> category. Those plus the Mozillas recognized by the PGTS list add up to about 27% of all requests, versus MSIE's 32%. Meanwhile, as I showed yesterday, Mozilla has become a platform that can support a rather interesting XML application -- a specialized information viewer, with its own built-in structured search engine -- on Windows, Mac, and Linux.
</p>
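<p>
The two-pass approach described above might look like this. It is a sketch under assumptions: the leading token of the user-agent string, cut at the first slash or space, is taken as the fallback name, and the function names are invented for the example.
</p>

```javascript
// Take the leading product token of a user-agent string and drop its version:
// 'curl/7.9.8' becomes 'curl', 'iCab/2.9.1' becomes 'iCab'.
function productName(userAgent) {
  return userAgent.trim().split(/[\s\/]/)[0];
}

// Build the deduped list of fallback names from raw user-agent strings.
function fallbackNames(userAgents) {
  return [...new Set(userAgents.map(productName))];
}

// Prefer a name from the primary list; otherwise fall back to the product name.
function classifyWithFallback(userAgent, primary, fallback) {
  const known = primary.find((sig) => userAgent.includes(sig));
  if (known !== undefined) return { name: known, source: 'primary' };
  const name = productName(userAgent);
  return fallback.includes(name)
    ? { name, source: 'fallback' }
    : { name: 'unclassified', source: 'none' };
}

const names = fallbackNames([
  'curl/7.7.1',
  'curl/7.9.8',
  'iCab/2.8',
  'iCab/2.9.1',
  'gnome-vfs/1.0.1',
]);
console.log(names);
console.log(classifyWithFallback('curl/7.9.8', ['MSIE', 'Safari'], names));
```

<p>
Cutting at the first space as well as the first slash would account for one-word names such as &quot;Web&quot; or &quot;The&quot; in the revised table, though that is a guess about how those rows arose.
</p>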
<p>
Having reached this point after long struggle, will the Mozilla project now find a sponsor worthy of its ambition? I hope so.
</p>
<p>
Here's the revised table:
</p>
<table cellspacing="0" cellpadding="4">
<tr>
<td>
<b>MSIE</b>
</td>
<td align="right">17554</td>
<td align="right">31.712</td>
</tr>
<tr>
<td>
<i>Mozilla</i>
</td>
<td align="right">11052</td>
<td align="right">19.966</td>
</tr>
<tr>
<td>
<i>NetNewsWire</i>
</td>
<td align="right">4339</td>
<td align="right">7.839</td>
</tr>
<tr>
<td>
<b>Mozilla</b>
</td>
<td align="right">3852</td>
<td align="right">6.959</td>
</tr>
<tr>
<td>
<i>SharpReader</i>
</td>
<td align="right">2998</td>
<td align="right">5.416</td>
</tr>
<tr>
<td>
<i>Radio</i>
</td>
<td align="right">2364</td>
<td align="right">4.271</td>
</tr>
<tr>
<td>
<b>Safari</b>
</td>
<td align="right">1380</td>
<td align="right">2.493</td>
</tr>
<tr>
<td>
<i>Feedreader</i>
</td>
<td align="right">1123</td>
<td align="right">2.029</td>
</tr>
<tr>
<td>
<i>NewsGator</i>
</td>
<td align="right">1114</td>
<td align="right">2.013</td>
</tr>
<tr>
<td>
<i>Wildgrape</i>
</td>
<td align="right">924</td>
<td align="right">1.669</td>
</tr>
<tr>
<td>
<b>Netscape</b>
</td>
<td align="right">842</td>
<td align="right">1.521</td>
</tr>
<tr>
<td>
<i>Syndirella</i>
</td>
<td align="right">673</td>
<td align="right">1.216</td>
</tr>
<tr>
<td>
<b>Opera</b>
</td>
<td align="right">611</td>
<td align="right">1.104</td>
</tr>
<tr>
<td>
<i>Web</i>
</td>
<td align="right">581</td>
<td align="right">1.050</td>
</tr>
<tr>
<td>
<i>RssBandit</i>
</td>
<td align="right">554</td>
<td align="right">1.001</td>
</tr>
<tr>
<td>
<i>Java</i>
</td>
<td align="right">479</td>
<td align="right">0.865</td>
</tr>
<tr>
<td>
<b>Galeon</b>
</td>
<td align="right">433</td>
<td align="right">0.782</td>
</tr>
<tr>
<td>
<i>unclassified</i>
</td>
<td align="right">377</td>
<td align="right">0.681</td>
</tr>
<tr>
<td>
<i>nntp</i>
</td>
<td align="right">340</td>
<td align="right">0.614</td>
</tr>
<tr>
<td>
<i>AmphetaDesk</i>
</td>
<td align="right">287</td>
<td align="right">0.518</td>
</tr>
<tr>
<td>
<i>curl</i>
</td>
<td align="right">220</td>
<td align="right">0.397</td>
</tr>
<tr>
<td>
<i>LWP::Simple</i>
</td>
<td align="right">218</td>
<td align="right">0.394</td>
</tr>
<tr>
<td>
<b>Konqueror</b>
</td>
<td align="right">170</td>
<td align="right">0.307</td>
</tr>
<tr>
<td>
<b>Python-urllib</b>
</td>
<td align="right">170</td>
<td align="right">0.307</td>
</tr>
<tr>
<td>
<i>clevercactus</i>
</td>
<td align="right">150</td>
<td align="right">0.271</td>
</tr>
<tr>
<td>
<i>Hep</i>
</td>
<td align="right">133</td>
<td align="right">0.240</td>
</tr>
<tr>
<td>
<i>Soup</i>
</td>
<td align="right">130</td>
<td align="right">0.235</td>
</tr>
<tr>
<td>
<i>gnome-vfs</i>
</td>
<td align="right">129</td>
<td align="right">0.233</td>
</tr>
<tr>
<td>
<i>PHP</i>
</td>
<td align="right">107</td>
<td align="right">0.193</td>
</tr>
<tr>
<td>
<i>Wget</i>
</td>
<td align="right">106</td>
<td align="right">0.191</td>
</tr>
<tr>
<td>
<i>Python-urllib</i>
</td>
<td align="right">100</td>
<td align="right">0.181</td>
</tr>
<tr>
<td>
<i>SwitchCrawler</i>
</td>
<td align="right">94</td>
<td align="right">0.170</td>
</tr>
<tr>
<td>
<i>Genecast</i>
</td>
<td align="right">86</td>
<td align="right">0.155</td>
</tr>
<tr>
<td>
<b>Java</b>
</td>
<td align="right">82</td>
<td align="right">0.148</td>
</tr>
<tr>
<td>
<i>Hapax</i>
</td>
<td align="right">78</td>
<td align="right">0.141</td>
</tr>
<tr>
<td>
<i>Broked</i>
</td>
<td align="right">72</td>
<td align="right">0.130</td>
</tr>
<tr>
<td>
<i>Straw</i>
</td>
<td align="right">59</td>
<td align="right">0.107</td>
</tr>
<tr>
<td>
<i>http://www.almaden.ibm.com/cs/crawler</i>
</td>
<td align="right">55</td>
<td align="right">0.099</td>
</tr>
<tr>
<td>
<i>blagg</i>
</td>
<td align="right">54</td>
<td align="right">0.098</td>
</tr>
<tr>
<td>
<i>libwww-perl</i>
</td>
<td align="right">53</td>
<td align="right">0.096</td>
</tr>
<tr>
<td>
<b>Powermarks</b>
</td>
<td align="right">52</td>
<td align="right">0.094</td>
</tr>
<tr>
<td>
<i>PostNuke:</i>
</td>
<td align="right">49</td>
<td align="right">0.089</td>
</tr>
<tr>
<td>
<i>Syndic8</i>
</td>
<td align="right">48</td>
<td align="right">0.087</td>
</tr>
<tr>
<td>
<i>Hatena</i>
</td>
<td align="right">41</td>
<td align="right">0.074</td>
</tr>
<tr>
<td>
<i>Googlebot</i>
</td>
<td align="right">39</td>
<td align="right">0.070</td>
</tr>
<tr>
<td>
<b>Lynx</b>
</td>
<td align="right">38</td>
<td align="right">0.069</td>
</tr>
<tr>
<td>
<i>NIF</i>
</td>
<td align="right">37</td>
<td align="right">0.067</td>
</tr>
<tr>
<td>
<i>Awasu</i>
</td>
<td align="right">36</td>
<td align="right">0.065</td>
</tr>
<tr>
<td>
<i>Scooter</i>
</td>
<td align="right">34</td>
<td align="right">0.061</td>
</tr>
<tr>
<td>
<i>rssSearch</i>
</td>
<td align="right">33</td>
<td align="right">0.060</td>
</tr>
<tr>
<td>
<i>Frontier</i>
</td>
<td align="right">31</td>
<td align="right">0.056</td>
</tr>
<tr>
<td>
<i>MagpieRSS</i>
</td>
<td align="right">30</td>
<td align="right">0.054</td>
</tr>
<tr>
<td>
<i>MovableType</i>
</td>
<td align="right">30</td>
<td align="right">0.054</td>
</tr>
<tr>
<td>
<i>Opera</i>
</td>
<td align="right">30</td>
<td align="right">0.054</td>
</tr>
<tr>
<td>
<i>Channel</i>
</td>
<td align="right">30</td>
<td align="right">0.054</td>
</tr>
<tr>
<td>
<i>Aggie</i>
</td>
<td align="right">28</td>
<td align="right">0.051</td>
</tr>
<tr>
<td>
<i>Zao</i>
</td>
<td align="right">28</td>
<td align="right">0.051</td>
</tr>
<tr>
<td>
<i>CFMX</i>
</td>
<td align="right">24</td>
<td align="right">0.043</td>
</tr>
<tr>
<td>
<i>ia_archiver</i>
</td>
<td align="right">24</td>
<td align="right">0.043</td>
</tr>
<tr>
<td>
<i>spnlib</i>
</td>
<td align="right">24</td>
<td align="right">0.043</td>
</tr>
<tr>
<td>
<i>KNewsTicker</i>
</td>
<td align="right">24</td>
<td align="right">0.043</td>
</tr>
<tr>
<td>
<i>Edu_RSS</i>
</td>
<td align="right">24</td>
<td align="right">0.043</td>
</tr>
<tr>
<td>
<i>XSA</i>
</td>
<td align="right">24</td>
<td align="right">0.043</td>
</tr>
<tr>
<td>
<i>servalBlagg.py</i>
</td>
<td align="right">23</td>
<td align="right">0.042</td>
</tr>
<tr>
<td>
<i>mt-rssfeed</i>
</td>
<td align="right">21</td>
<td align="right">0.038</td>
</tr>
<tr>
<td>
<i>Twisted</i>
</td>
<td align="right">21</td>
<td align="right">0.038</td>
</tr>
<tr>
<td>
<i>OpenTextSiteCrawler</i>
</td>
<td align="right">19</td>
<td align="right">0.034</td>
</tr>
<tr>
<td>
<i>Dual</i>
</td>
<td align="right">19</td>
<td align="right">0.034</td>
</tr>
<tr>
<td>
<b>Crazy Browser</b>
</td>
<td align="right">18</td>
<td align="right">0.033</td>
</tr>
<tr>
<td>
<i>ScoopRDF</i>
</td>
<td align="right">16</td>
<td align="right">0.029</td>
</tr>
<tr>
<td>
<i>timboBot</i>
</td>
<td align="right">16</td>
<td align="right">0.029</td>
</tr>
<tr>
<td>
<b>iCab</b>
</td>
<td align="right">15</td>
<td align="right">0.027</td>
</tr>
<tr>
<td>
<b>OmniWeb</b>
</td>
<td align="right">14</td>
<td align="right">0.025</td>
</tr>
<tr>
<td>
<b>PHP</b>
</td>
<td align="right">14</td>
<td align="right">0.025</td>
</tr>
<tr>
<td>
<i>ActiveRefresh</i>
</td>
<td align="right">14</td>
<td align="right">0.025</td>
</tr>
<tr>
<td>
<b>lwp-trivial</b>
</td>
<td align="right">13</td>
<td align="right">0.023</td>
</tr>
<tr>
<td>
<i>Popdexter</i>
</td>
<td align="right">12</td>
<td align="right">0.022</td>
</tr>
<tr>
<td>
<i>larbin_2.6.2</i>
</td>
<td align="right">12</td>
<td align="right">0.022</td>
</tr>
<tr>
<td>
<i>QuepasaCreep</i>
</td>
<td align="right">11</td>
<td align="right">0.020</td>
</tr>
<tr>
<td>
<i>FeedDemon</i>
</td>
<td align="right">11</td>
<td align="right">0.020</td>
</tr>
<tr>
<td>
<i>MyHeadlines</i>
</td>
<td align="right">11</td>
<td align="right">0.020</td>
</tr>
<tr>
<td>
<i>IdeaLibHttp</i>
</td>
<td align="right">10</td>
<td align="right">0.018</td>
</tr>
<tr>
<td>
<i>Fresh</i>
</td>
<td align="right">9</td>
<td align="right">0.016</td>
</tr>
<tr>
<td>
<i>ovidiubot</i>
</td>
<td align="right">8</td>
<td align="right">0.014</td>
</tr>
<tr>
<td>
<i>RSSMirandaPlugin</i>
</td>
<td align="right">8</td>
<td align="right">0.014</td>
</tr>
<tr>
<td>
<i>Browser</i>
</td>
<td align="right">8</td>
<td align="right">0.014</td>
</tr>
<tr>
<td>
<i>lwp-trivial</i>
</td>
<td align="right">8</td>
<td align="right">0.014</td>
</tr>
<tr>
<td>
<b>Wget</b>
</td>
<td align="right">8</td>
<td align="right">0.014</td>
</tr>
<tr>
<td>
<i>effnews</i>
</td>
<td align="right">8</td>
<td align="right">0.014</td>
</tr>
<tr>
<td>
<i>janes-blogosphere</i>
</td>
<td align="right">7</td>
<td align="right">0.013</td>
</tr>
<tr>
<td>
<i>FAST-WebCrawler</i>
</td>
<td align="right">6</td>
<td align="right">0.011</td>
</tr>
<tr>
<td>
<i>RPT-HTTPClient</i>
</td>
<td align="right">6</td>
<td align="right">0.011</td>
</tr>
<tr>
<td>
<i>Microsoft</i>
</td>
<td align="right">5</td>
<td align="right">0.009</td>
</tr>
<tr>
<td>
<i>FeedOnFeeds</i>
</td>
<td align="right">5</td>
<td align="right">0.009</td>
</tr>
<tr>
<td>
<i>vw-http</i>
</td>
<td align="right">4</td>
<td align="right">0.007</td>
</tr>
<tr>
<td>
<i>Gazette</i>
</td>
<td align="right">4</td>
<td align="right">0.007</td>
</tr>
<tr>
<td>
<i>vspider</i>
</td>
<td align="right">4</td>
<td align="right">0.007</td>
</tr>
<tr>
<td>
<i>eCatch</i>
</td>
<td align="right">4</td>
<td align="right">0.007</td>
</tr>
<tr>
<td>
<i>synerge</i>
</td>
<td align="right">4</td>
<td align="right">0.007</td>
</tr>
<tr>
<td>
<i>httpSocket</i>
</td>
<td align="right">3</td>
<td align="right">0.005</td>
</tr>
<tr>
<td>
<i>Mail</i>
</td>
<td align="right">3</td>
<td align="right">0.005</td>
</tr>
<tr>
<td>
<i>Feedster</i>
</td>
<td align="right">3</td>
<td align="right">0.005</td>
</tr>
<tr>
<td>
<i>Plucker</i>
</td>
<td align="right">3</td>
<td align="right">0.005</td>
</tr>
<tr>
<td>
<i>DMonitor</i>
</td>
<td align="right">3</td>
<td align="right">0.005</td>
</tr>
<tr>
<td>
<i>MobiPocket</i>
</td>
<td align="right">2</td>
<td align="right">0.004</td>
</tr>
<tr>
<td>
<i>grimp:</i>
</td>
<td align="right">2</td>
<td align="right">0.004</td>
</tr>
<tr>
<td>
<i>NPBot</i>
</td>
<td align="right">2</td>
<td align="right">0.004</td>
</tr>
<tr>
<td>
<i>The</i>
</td>
<td align="right">2</td>
<td align="right">0.004</td>
</tr>
<tr>
<td>
<i>ColdFusion</i>
</td>
<td align="right">2</td>
<td align="right">0.004</td>
</tr>
<tr>
<td>
<i>MnogoSearch</i>
</td>
<td align="right">2</td>
<td align="right">0.004</td>
</tr>
<tr>
<td>
<i>ASPseek</i>
</td>
<td align="right">2</td>
<td align="right">0.004</td>
</tr>
<tr>
<td>
<i>iSiloX</i>
</td>
<td align="right">2</td>
<td align="right">0.004</td>
</tr>
<tr>
<td>
<i>EbiNess</i>
</td>
<td align="right">2</td>
<td align="right">0.004</td>
</tr>
<tr>
<td>
<i>linkhype.com</i>
</td>
<td align="right">2</td>
<td align="right">0.004</td>
</tr>
<tr>
<td>
<i>MiracleAlphaTest</i>
</td>
<td align="right">2</td>
<td align="right">0.004</td>
</tr>
<tr>
<td>
<i>LinkWalker</i>
</td>
<td align="right">2</td>
<td align="right">0.004</td>
</tr>
<tr>
<td>
<b>CFNetwork</b>
</td>
<td align="right">2</td>
<td align="right">0.004</td>
</tr>
<tr>
<td>
<i>SURF</i>
</td>
<td align="right">1</td>
<td align="right">0.002</td>
</tr>
<tr>
<td>
<i>InfoMinder</i>
</td>
<td align="right">1</td>
<td align="right">0.002</td>
</tr>
<tr>
<td>
<i>PocketFeed</i>
</td>
<td align="right">1</td>
<td align="right">0.002</td>
</tr>
<tr>
<td>
<i>Watchfire</i>
</td>
<td align="right">1</td>
<td align="right">0.002</td>
</tr>
<tr>
<td>
<i>daypopbot</i>
</td>
<td align="right">1</td>
<td align="right">0.002</td>
</tr>
<tr>
<td>
<i>htdig</i>
</td>
<td align="right">1</td>
<td align="right">0.002</td>
</tr>
<tr>
<td>
<i>Blogosphere</i>
</td>
<td align="right">1</td>
<td align="right">0.002</td>
</tr>
<tr>
<td>
<i>Internet</i>
</td>
<td align="right">1</td>
<td align="right">0.002</td>
</tr>
<tr>
<td>
<b>Download Ninja</b>
</td>
<td align="right">1</td>
<td align="right">0.002</td>
</tr>
<tr>
<td>
<i>lachesis</i>
</td>
<td align="right">1</td>
<td align="right">0.002</td>
</tr>
<tr>
<td>
<i>Calzilla</i>
</td>
<td align="right">1</td>
<td align="right">0.002</td>
</tr>
<tr>
<td>
<i>Openbot</i>
</td>
<td align="right">1</td>
<td align="right">0.002</td>
</tr>
<tr>
<td>
<i>LinkScan</i>
</td>
<td align="right">1</td>
<td align="right">0.002</td>
</tr>
<tr>
<td>
<i>FlickBot</i>
</td>
<td align="right">1</td>
<td align="right">0.002</td>
</tr>
<tr>
<td>
<i>BlogBot</i>
</td>
<td align="right">1</td>
<td align="right">0.002</td>
</tr>
<tr>
<td>
<i>MSProxy</i>
</td>
<td align="right">1</td>
<td align="right">0.002</td>
</tr>
</table>
</body>
</item> 

<item num="a711">
<title>Searchable slides</title>
<date>2003/06/03</date>
<body>
<p>
I've added an <a href="http://weblog.infoworld.com/udell/misc/oscom/search.html">XPath search feature</a> to the OSCOM slideshow. Thanks to <a href="http://wp.netscape.com/comprod/columns/techvision/innovators_be.html">Brendan Eich</a>, it's working identically in both IE and Mozilla (though coded somewhat differently for each). I'll have more to say about this technique in an upcoming article, but since the feature is already public I thought I'd at least mention it. Sure is nifty to be able to do something like this in a serverless and cross-browser fashion -- well, for IE and Mozilla, anyway; others need not apply, I'm afraid. 
</p>
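<p>
For readers wondering what &quot;coded somewhat differently for each&quot; can mean in practice, here is one plausible shape for it. This is not the slideshow's actual code: it is a sketch that assumes Mozilla exposes the W3C evaluate() method on the XML document while MSXML exposes selectNodes(). It is written in present-day JavaScript syntax for readability, so it would need older syntax to run in the browsers discussed here, and the stub documents exist only so the function can be tried outside a browser.
</p>

```javascript
// Numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE in the DOM XPath API.
const ORDERED_NODE_SNAPSHOT_TYPE = 7;

// Run an XPath query against an XML document and return the matches as an
// array, using whichever API the document object exposes.
function selectNodes(xmlDoc, xpath) {
  if (typeof xmlDoc.evaluate === 'function') {
    // W3C DOM XPath route (Mozilla).
    const result = xmlDoc.evaluate(xpath, xmlDoc, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
    return Array.from({ length: result.snapshotLength }, (_, i) => result.snapshotItem(i));
  }
  if (typeof xmlDoc.selectNodes !== 'undefined') {
    // MSXML route (IE): selectNodes returns a node list with length and item().
    const list = xmlDoc.selectNodes(xpath);
    return Array.from({ length: list.length }, (_, i) => list.item(i));
  }
  throw new Error('No XPath support found on this document');
}

// Stand-in documents (hypothetical) that mimic the two API shapes.
const mozillaLikeDoc = {
  evaluate: () => ({ snapshotLength: 2, snapshotItem: (i) => 'slide' + i }),
};
const msxmlLikeDoc = {
  selectNodes: () => ({ length: 1, item: (i) => 'slide' + i }),
};
console.log(selectNodes(mozillaLikeDoc, '//slide'));
console.log(selectNodes(msxmlLikeDoc, '//slide'));
```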
<p>
In my talk I gave the example of Phil Gyford's <a href="http://www.pepysdiary.com">The Diary of Samuel Pepys</a>, and asked the rhetorical question: &quot;What CMS environment would make it easier for Phil to achieve this effect?&quot;
</p>
<p>
If you haven't seen it, here's a fragment of the diary:
</p>
<blockquote cite="Samuel Pepys">
Thanks to God I got to bed in my own poor cabin, and slept well till 9 o'clock this morning.  <a href="http://www.pepysdiary.com/p/725.php">Mr. North</a> and <a href="http://www.pepysdiary.com/p/770.php">Dr. Clerke</a> and all the great company being gone, I found myself very uncouth all this day for want thereof.  <a href="http://www.pepysdiary.com/p/112.php">My Lord</a> dined with <a href="http://www.pepysdiary.com/p/110.php">the Vice-Admiral</a> to-day (who is as officious, poor man! as any spaniel can be; but I believe all to no purpose, for I believe he will not hold his place), so I dined commander at the coach table to-day, and all the officers of the ship with me, and Mr. White of Dover.  After a game or two at <a href="http://www.pepysdiary.com/p/684.php">nine-pins</a>, to work all the afternoon, making above twenty orders.
</blockquote>
<p>
And here's the underlying structure Phil's creating:
</p>
<table border="1" cellspacing="0" cellpadding="4">
<tr>
<td>Mr. North</td> <td>http://www.pepysdiary.com/p/725.php</td>
<td> /People</td>
</tr>
<tr>
<td>nine-pins</td> <td>http://www.pepysdiary.com/p/684.php</td>
<td> /Entertainment/Games</td>
</tr>
</table>
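<p>
To make the link between structured writing and structured search concrete, here is a small sketch that treats the two rows above as records and searches them by category path. The record shape and the function name are assumptions for illustration, not a description of how Phil's site is built.
</p>

```javascript
// The two rows from the table above, as records.
const annotations = [
  { text: 'Mr. North', url: 'http://www.pepysdiary.com/p/725.php', category: '/People' },
  { text: 'nine-pins', url: 'http://www.pepysdiary.com/p/684.php', category: '/Entertainment/Games' },
];

// Find every annotation filed under a category or any of its subcategories.
function findByCategory(records, prefix) {
  return records.filter(
    (r) => r.category === prefix || r.category.startsWith(prefix + '/')
  );
}

const entertainment = findByCategory(annotations, '/Entertainment').map((r) => r.text);
console.log(entertainment);
```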
<p>
Phil noticed my links to him and wrote to me to say: &quot;I keep wondering myself what CMS environment would help me...&quot;
</p>
<p>
I don't have the whole answer, but my gut tells me that we all want to be able to achieve these kinds of effects, and that structured writing and structured search are closely related.
</p>
</body>
</item> 

<item num="a710">
<title>Mozilla on the move</title>
<date>2003/06/02</date>
<body>
<p>
Joel Spolsky's <a href="http://www.joelonsoftware.com/news/20030601.html">endorsement of Mozilla Firebird</a> -- he says &quot;it has finally caught up with Internet Explorer&quot; -- has attracted lots of notice. It is, indeed, a sweet piece of work. I love how the <a href="http://texturizer.net/firebird/extensions.html">extensions</a> work. The first one I picked up was <a href="http://texturizer.net/firebird/extensions.html#LiveHTTPHeaders">LiveHTTPHeaders</a> which seems to instantly obsolete Proxomitron for purposes of HTTP protocol sniffing and website reverse-engineering. I was also delighted to see that there's a build which <a href="http://www.mozillazine.org/talkback.html?article=3216">includes the DOM Inspector</a> -- and, soon after, an <a href="http://www.mozilla.gr.jp/~mal/inspector-mozfb-ahm.xpi">extension</a> that added DOM Inspector to my existing Firebird installation. Also, XSLT is working now (since Mozilla 1.2, in fact). Extremely cool!
</p>
<p>
What I'm not finding, yet, is the Mozilla equivalent of the MSXML.DOMDocument and MSXML2.XSLTemplate interfaces -- which I am actually using at the moment in MSIE for an interesting project. If these do exist, I trust someone will enlighten me. Otherwise, I'll rate Firebird (and Mozilla) as not yet the full equal of MSIE in terms of these advanced features -- but moving quickly now, which is <i>so</i> gratifying to see.
</p>
<p>
<b>Update:</b>
Brendan Eich let no dust gather on that one:
<blockquote cite="Brendan Eich">
A couple of pointers re: <a href="http://weblog.infoworld.com/udell/2003/06/02.html">http://weblog.infoworld.com/udell/2003/06/02.html</a> -- Mozilla Firebird (actually, the Gecko engine shared with other Mozilla apps) does support the w3c-standard form of MSXML.DOMDocument, created via document.implementation.createDocument().  See <a href="http://www.mozilla.org/newlayout/xml/#load">http://www.mozilla.org/newlayout/xml/#load</a> for details -- there's a short example at <a href="http://lxr.mozilla.org/seamonkey/source/content/xml/tests/load/load.html">http://lxr.mozilla.org/seamonkey/source/content/xml/tests/load/load.html</a>.
<br/>
<br/>
Mozilla also supports XSLT complete with scriptable interfaces.  See <a href="http://devedge.netscape.com/viewsource/2003/xslt-js/">http://devedge.netscape.com/viewsource/2003/xslt-js/</a> for docs that probably cover the cases you handled using MSXML.XSLTemplate.  Let me know if something's missing.
</blockquote>
Excellent! This is just what I was looking for. Thanks, Brendan! 
</p>
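<p>
Brendan's pointers suggest a feature-detecting wrapper along these lines. It is a sketch, not tested browser code: the env parameter is a stand-in for the browser's global object, the function names are invented, and 'Microsoft.XMLDOM' is just one ProgID commonly used to create an MSXML document. It uses present-day JavaScript syntax for readability.
</p>

```javascript
// `env` stands in for the browser's global object (window), passed in
// explicitly so the sketch can be exercised outside a browser.

// Create an empty XML document.
function createXmlDocument(env) {
  const impl = (env.document || {}).implementation || {};
  if (typeof impl.createDocument === 'function') {
    // W3C DOM route, as described for Mozilla.
    return impl.createDocument('', '', null);
  }
  if (typeof env.ActiveXObject !== 'undefined') {
    // MSXML route in MSIE.
    return new env.ActiveXObject('Microsoft.XMLDOM');
  }
  throw new Error('No XML document factory found');
}

// Apply a stylesheet document to a source document.
function transform(env, xmlDoc, xslDoc) {
  if (typeof env.XSLTProcessor === 'function') {
    // Mozilla: scriptable XSLTProcessor, returns a document fragment.
    const processor = new env.XSLTProcessor();
    processor.importStylesheet(xslDoc);
    return processor.transformToFragment(xmlDoc, env.document);
  }
  if (typeof xmlDoc.transformNode !== 'undefined') {
    // MSXML: transformNode returns the result as a string.
    return xmlDoc.transformNode(xslDoc);
  }
  throw new Error('No XSLT support found');
}
```

<p>
Note that the two branches of transform() return different things, a document fragment in one case and a string in the other, so a caller still has to handle each result in its own way.
</p>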
</body>
</item> 

<item num="a709">
<title>Patterns of persistence</title>
<date>2003/06/02</date>
<body>
<p>
<i>Programmers spend time and effort translating between objects represented in high-level programming languages, such as Java, and structures stored in relational databases. Object databases can remove that impedance, automatically binding programming-language objects to database objects. But doing so transparently requires some deep magic. [Full story at <a href="http://www.infoworld.com/article/03/05/30/22OPstrategic_1.html">InfoWorld.com</a>]</i>
</p>
</body>
</item> 

<item num="a708">
<title>OSCOM Wrapup</title>
<date>2003/06/01</date>
<body>
<p>
Here are the <a href="http://weblog.infoworld.com/udell/misc/oscom/intro.html">slides</a> from my OSCOM keynote at Harvard on Friday. The title of the talk was: &quot;Everything you need to know about content management, you (should have) learned in grade school.&quot; Grade school lesson #1 was: &quot;Write effective titles.&quot; It's surprisingly hard to remember that lesson. On <a href="http://weblog.infoworld.com/udell/misc/oscom/titlesMatter.html">these</a> <a href="http://weblog.infoworld.com/udell/misc/oscom/slideScript.html">two</a> slides, I admitted the ironic fact that <i>my own slideshow software</i> did not, initially, create meaningful HTML doctitles. Driving down to the conference, I realized I'd made the same error at another level. The entire slideshow was untitled! I wondered who would notice. Leonard Megliola <a href="http://radio.weblogs.com/0125462/2003/05/31.html#a20">nailed it</a>. After the talk Tony Byrne, of <a href="http://www.cmswatch.com/CMSWatch/">CMSWatch</a>, quoted an old adage: &quot;Naming and cache invalidation are the hardest problems.&quot; How true!
</p>
<p>
Thanks to <a href="http://radio.weblogs.com/0116506/">Paul Everitt</a> for inviting me to give the talk. The last time that happened was my <a href="http://www.oreillynet.com/pub/a/network/2000/02/02/zopekeynote.html">keynote</a> at the 8th International Python conference in January 2000. Looking back on it now, I can see that some of the ideas have become a bit clearer -- notably document-oriented web services and XML storage. Here's hoping that three years hence some of the ideas from the Harvard talk -- high-quality embeddable XML writing tools, a user-friendly approach to descriptive tagging -- will seem realer than they do today.
</p>
</body>
</item> 

<item num="a707">
<title>Translucent databases revisited</title>
<date>2003/05/29</date>
<body>
<p>
<a href="http://allconsuming.net/item.cgi?isbn=0967584418">
<img alt="translucent databases" vspace="4" hspace="4" align="right" src="http://weblog.infoworld.com/udell/gems/translucentDatabases.jpg"/>
</a>
For an upcoming article on the eternal riddle of identity and privacy, I revisited Peter Wayner's notion of <a href="http://allconsuming.net/item.cgi?isbn=0967584418">translucent databases</a> (<a href="http://weblog.infoworld.com/udell/2002/08/07.html#a374">1</a>, <a href="http://weblog.infoworld.com/udell/2002/08/02.html#a364">2</a>, <a href="http://weblog.infoworld.com/udell/2002/07/19.html#a345">3</a>), which can hide even from themselves -- and from their authorized or unauthorized operators -- such personally-identifying data as is not strictly needed.
</p>
<p>
I asked Peter how large a class of applications he thought might be suitable for this treatment. As a thought experiment, he's investigating to what degree an e-commerce system like Amazon could work translucently. Some aspects of this are seemingly straightforward. By keying your purchase history to the hash of your name and a password known only to you, for example, Amazon could in theory deliver all the personalization you expect, and do all the aggregate analysis it needs to do, without tying your name to purchase records. Why do we personalize data more than is necessary? It's a fascinating question about which I hope to see broader discussion when Peter posts his analysis.
</p>
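<p>
The keying idea above can be sketched in a few lines using Node's built-in crypto module. The choice of SHA-256, the separator byte, the in-memory Map, and all the names are assumptions made for the example; a real system would want a salted, deliberately slow key-derivation function rather than a bare hash.
</p>

```javascript
const crypto = require('crypto');

// Derive an opaque record key from a name and a secret only the customer knows.
// The store sees only the key, never the name or the password.
function customerKey(name, password) {
  return crypto
    .createHash('sha256')
    .update(name + '\u0000' + password, 'utf8')
    .digest('hex');
}

// A purchase history keyed by the derived value rather than by identity.
const purchases = new Map();

function recordPurchase(name, password, item) {
  const key = customerKey(name, password);
  purchases.set(key, [...(purchases.get(key) || []), item]);
}

function historyFor(name, password) {
  return purchases.get(customerKey(name, password)) || [];
}

// Hypothetical customer and purchase.
recordPurchase('Alice Example', 'correct horse', 'Translucent Databases');
console.log(historyFor('Alice Example', 'correct horse'));
console.log(historyFor('Alice Example', 'wrong guess'));
```

<p>
Aggregate analysis still works on such a store, since purchase records remain grouped by key, but nothing in it says whose key is whose.
</p>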
<p>
Still, Amazon obviously has to store your name somewhere, plus your credit card number and street address, in order to do the e-commerce dance, right? Well, actually, no, it does not need to store those data, it needs your permission to use them -- and a means to access them. This was, of course, the Hailstorm vision. Microsoft floated that trial balloon a couple of years ago, and it got shot down. It's clear now that Microsoft won't own the identity business, and that identity systems will federate. But we ought not forget that at the core of Hailstorm is an idea that is correct, necessary, and inevitable. Services don't need to store your data, they need to use it with your permission. Hailstorm, as originally conceived, <i>was</i> a translucent database -- and a darned good idea.
</p>
</body>
</item> 

<item num="a706">
<title>InfoWorld survey on enterprise security</title>
<date>2003/05/29</date>
<body>
<p>
InfoWorld invites your participation in a <a href="http://www.surveymonkey.com/s.asp?u=58601196071">survey</a> on enterprise security. Note: if you would be inclined to answer &quot;IT staff&quot; (as opposed to manager/director/engineer/analyst/developer) on question 1, &quot;What is your title?&quot;, or &quot;Not involved&quot; on question 2, about purchasing, then don't bother -- it's a quick trip to the exit in that case. For those of you who do qualify and do participate, thanks. It really is useful feedback for us.
</p>
<p>
By the way, I found it surprisingly hard to complete the last page of the survey. I think I've figured out why. It uses tables that look like this:
</p>
<img border="1" src="http://weblog.infoworld.com/udell/gems/surveyMonkey.GIF"/>
<p>
This alternating-color-bar design is a convention I've never much liked, but I guess we're stuck with it for now. It might be OK for reading, but I think it's really bad for input. Something about the alternation kept making me skip rows, so it was -- as I said -- surprisingly hard to complete the survey.
</p>
</body>
</item> 

<item num="a705">
<title>The business of RSS</title>
<date>2003/05/28</date>
<body>
<p>
How do you count subscribers in the RSS network? Tim Bray meditates on the question in an <a href="http://www.tbray.org/ongoing/When/200x/2003/05/25/Subscribers">essay on the subject</a>. Dave Winer says that Radio's <a href="http://radio.userland.com/whosReadingMyXml">Web Bug Simulator</a> (WBS) solved the problem last year. There are a few different issues here to tease out, but in the end I'm not sure there is, or ever was, a problem.
</p>
<p>
As it's currently used, the WBS is more about transparency than comprehensive measurement. Consider this <a href="http://radio.xmlstoragesystem.com/rcsPublic/referers?site=graham%20glass%3A%20what%27s%20next%3F&amp;group=rss">report</a>, which tells us who has fetched Graham Glass' RSS feed from Radio UserLand<sup>1</sup> today. As you can see, I am one of his subscribers. The WBS technique makes Graham's subscribership transparently visible to the world -- an interesting situation that was the focus of my <a href="http://radio.weblogs.com/0100887/2002/03/03.html#a99">comment</a> which Dave cited. Now as it happens, Graham could otherwise know that I subscribe to him, since my channelroll (which <a href="http://cheerleader.yoz.com/">Yoz Grahame</a> has cleverly ascended to the top of -- nice hack!) already makes that fact known to the world. But if I didn't choose to publicize the fact, you could still piece together my subscriptions by crawling through the WBS reports. And if you'd rather not publicize your reading habits, you can <a href="http://localhost:5335/system/pages/prefs?page=5.11">disable</a> the WBS. The transparency effect is really cool, and is essential to the constitution of blogspace in ways I think none of us yet fully understand. 
</p>
<p>
The WBS technique could be used to measure subscribership for commercial purposes. Other measurements exist too. In RadioSpace, the <a href="http://radio.xmlstoragesystem.com/rcsPublic/rssHotlist">RSS Hotlist</a> reports the top 100 most-subscribed-to RSS feeds. Note that as of today, my blog appears twice: at #37, and again at #100. See <a href="http://weblog.infoworld.com/udell/2002/11/19.html">The RSS Hotlist: quantity vs quality</a> for background on this. Briefly, when I moved my feed, I couldn't redirect subscribers to the new address. (When I moved again, I learned that happily <a href="http://weblog.infoworld.com/udell/2003/04/17.html">a solution now exists</a>, at least for Radio and NetNewsWire.) Back in November, I had 251 Radio UserLand subscribers to the new feed but 145 were still fetching the old one. A few of the 145 have dropped away, but today the numbers stand at 365 and 125. This means that 125 &quot;subscribers&quot; have continued to fetch my old RSS feed, which hasn't changed in over 6 months. Who are these &quot;subscribers&quot;? In this case, it's users of the Radio UserLand aggregator<sup>2</sup> who haven't noticed my feed go dark. It's hard to notice the absence of something. If one of your 100+ feeds goes dark, would you notice? If it's one you care about, yes. Otherwise, no.
</p>
<p>
Robots, of course, don't care about anything, and the vast majority of RSS fetching is robotic. I submit that the WBS technique, if widely implemented, would become largely robotic too. As Steve Gillmor <a href="http://www.crn.com/weblogs/stevegillmor/2003/05/27/27.asp#42197">points out</a>, it becomes a question of who's subscribing. And, I'd add, who's <i>responding</i>. There are already well-established ways of figuring this stuff out, and they play beautifully into the existing weblog/RSS architecture.
</p>
<p>
It'd be useful to see something other than generic aggregator signatures in server logs, so the ideas that Tim was kicking around, and/or a WBS-like approach, would be helpful. But even if such information arrived, nobody at InfoWorld would much care at the moment. Our webstats people basically just ignore all rss.xml page-&quot;views&quot; because robots aren't interesting. What's interesting is people who respond to the feeds. And they respond, in the time-honored fashion, by clicking through to the website, thereby displaying ads. Because our CMS decorates the RSS URLs delivered in the feed (as it decorates URLs in different areas of the website), the RSS-referred views are known as such.
</p>
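<p>
The URL decoration described above can be sketched in a few lines. This is only an illustration: the parameter name <tt>source</tt> and the value <tt>rss</tt> are hypothetical, not what InfoWorld's CMS actually emits.
</p>

```javascript
// Sketch: tag feed links so click-throughs can be attributed to RSS.
// The parameter name "source" and the value "rss" are hypothetical.
function decorate(link, source) {
  const url = new URL(link);
  url.searchParams.set('source', source);
  return url.toString();
}

// On the web server side, the same parameter identifies an RSS-referred view.
function referredBy(requestUrl) {
  return new URL(requestUrl).searchParams.get('source');
}

const decorated = decorate(
  'http://www.infoworld.com/article/03/05/23/21OPstrategic_1.html',
  'rss'
);
console.log(decorated);
console.log(referredBy(decorated));
```

<p>
The point of the design is that the measurement rides on an ordinary page view: the robot that fetches rss.xml is ignored, and only the human who clicks through gets counted.
</p>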
<p>
How do we know who's subscribing? We don't, yet. If and when we want to attach value specifically to an RSS feed, though, the mechanisms for doing so are again -- I think -- well-known. Currently, for example, this blog comes in three flavors. You can read it on the web, along with an ad. You can receive RSS blurbs that invite you to click through to the website -- again, displaying the ad. Or, as I mentioned yesterday, you can take the full feed in XML, disintermediating the browser. Few folks do that. If more chose to, it could become a paid feature, or a free benefit to InfoWorld subscribers. 
</p>
<p>
The relationship between UserLand and the New York Times illustrates how this might work. When you install Radio UserLand, you're offered a special set of Times newsfeeds, with more stuff than you'd usually get. These are password-protected and not available to the general public. Inside Radio, there's a table of URLs that look like this:
</p>
<p>
http://radiouser:xxxxxxx@partners.userland.com/nyt/arts.xml
</p>
<p>
In other words, Radio encodes the name/password credentials for these special feeds. As I understand it<sup>3</sup>, this is effectively a group credential shared by all RU users. There's a relationship between UserLand and the Times, not between individual RU users and the Times. But it's easy to see how, using standard e-commerce techniques, the Times could arrange to individualize its relationships with RSS subscribers.
</p>
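<p>
For readers who haven't looked closely at URLs of this shape, here is a small sketch of how the credentials come apart. It assumes the server uses HTTP Basic authentication, which I haven't verified, and the password is the masked placeholder from above, not a real one.
</p>

```javascript
// Sketch: pull the shared credentials out of a feed URL like the one above.
// The password is the masked placeholder from the post, not a real one.
const feed = new URL('http://radiouser:xxxxxxx@partners.userland.com/nyt/arts.xml');

// Assumption: HTTP Basic authentication, which sends "user:password",
// base64-encoded, in the Authorization header.
const basicAuthHeader =
  'Basic ' + Buffer.from(feed.username + ':' + feed.password).toString('base64');

console.log(feed.username);   // radiouser
console.log(feed.hostname);   // partners.userland.com
console.log(feed.pathname);   // /nyt/arts.xml
console.log(basicAuthHeader);
```

<p>
An individualized arrangement would differ only in the first part: each subscriber's aggregator would carry its own name and password rather than the shared <tt>radiouser</tt> one.
</p>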
<p>
I believe this kind of thing will happen. So far as I can see, though, there's prior web art for all the pieces of the e-commerce puzzle. There is a ton of innovation that can and should still occur around blogs and RSS. But are innovations needed purely to enable a blog-related business model? Maybe, but if so I'm not sure what they are. 
</p>
<hr/>
<p>
<sup>1</sup> I think other aggregators could show up here, but currently don't.
</p>
<p>
<sup>2</sup> Because only RU feeds this data to the hotlist, right?
</p>
<p>
<sup>3</sup> Again, please correct me if I'm wrong.
</p>
</body>
</item> 

<item num="a704">
<title>The Harry Tuttle award</title>
<date>2003/05/27</date>
<body>
<p>
The weekend's Harry Tuttle award goes to <a href="http://clarity.awakeheart.net">Robert Ivanc</a>. On Friday he wrote to inform me that my weblog was interfering with an otherwise painless visit to the dentist:
</p>
<blockquote cite="Robert Ivanc">
A few days ago, I was waiting at a dentist and trying to kill the time thought of using my Nokia 3650 (with Doris HTML browser) to have a look at your site, to see if there's anything there that might put my mind on other matters than the precarious closeness of the dentist drilling machines! And what I found out was how hard it was to get to the actual content on your site...I had to scroll through all of what is usually hidden...after about 10 minutes or so I finally got to the content. Any way to redesign it, so that content gets loaded first or putting up a mobile lightweight version?
</blockquote>
<p>
Excellent point. I thought about this for five seconds and realized that Rob could solve this problem for himself -- and for others -- in a very simple way. I pointed him at the solution, and he picked up the ball and ran with it. 
</p> 
<p>
My blog is currently available in two XML flavors: the <a href="http://weblog.infoworld.com/udell/rss.xml">standard feed</a> and the <a href="http://weblog.infoworld.com/udell/gems/longDescriptionFeed.xml">extended feed</a>. My suggestion to Rob was to write an XSLT transform for one or the other, and pipe the XML content through it (using the W3C's public XSLT transformation service) to create a lightweight HTML rendering.
</p>
<p>
Here is the <a href="http://awakeheart.net/rss2html.xsl">XSLT file</a> Rob wrote. Here's how <a href="http://www.w3.org/2000/06/webdata/xslt?xslfile=http%3A%2F%2Fawakeheart.net%2Frss2html.xsl&amp;xmlfile=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2Frss.xml&amp;transform=Submit">it renders</a> my standard feed. Here's how <a href="http://www.w3.org/2000/06/webdata/xslt?xslfile=http%3A%2F%2Fawakeheart.net%2Frss2html.xsl&amp;xmlfile=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2Fgems%2FlongDescriptionFeed.xml&amp;transform=Submit">it renders</a> my extended feed.
</p>
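<p>
The two rendering links above are just the W3C service's address with two URL-encoded parameters attached. A sketch of how such a link can be assembled follows; the parameter names come from the links themselves, while the <tt>xsltServiceUrl</tt> helper is my own invention.
</p>

```javascript
// Sketch: build the W3C XSLT service URL from a stylesheet URL and a feed URL.
// Parameter names (xslfile, xmlfile, transform) are taken from the links above.
function xsltServiceUrl(xslfile, xmlfile) {
  const params = new URLSearchParams({ xslfile, xmlfile, transform: 'Submit' });
  return 'http://www.w3.org/2000/06/webdata/xslt?' + params.toString();
}

console.log(xsltServiceUrl(
  'http://awakeheart.net/rss2html.xsl',
  'http://weblog.infoworld.com/udell/rss.xml'
));
```

<p>
Swap in the extended feed's address as the second argument and you get the second link. Nothing else is required: no account, no API key, no coordination with anyone.
</p>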
<p>
As Rob notes in <a href="http://clarity.awakeheart.net/archives/000233.html#000233">his writeup</a>, there was a problem with the extended feed, so originally he was only able to pipe the standard feed to his Nokia. But that was my fault, not his. I kicked my setup and it seems to be working properly now. Rob's conclusion:
</p>
<blockquote cite="Robert Ivanc">
Wow, that was pretty simple and quite powerful. The power of this kind of ad hoc scripting never ceases to amaze me! [<a href="http://clarity.awakeheart.net/">Clarity's Blog</a>]
</blockquote>
<p>
Amen.
</p>
<p>
Clearly I'd prefer (and InfoWorld would prefer, and doubtless the W3C would prefer) not to see this solution used inappropriately. InfoWorld's advertisers do support this work, after all, and the W3C is not in the business of providing production transformation services. But if you're stuck in the dentist's office with only your Nokia, there oughta be a better way. 
</p>
<p>
I've written before about the extraordinary fact that a new and useful service can be created by simply posting a file to a weblog. And about how that new service can combine other services (an XSLT transformer, an RSS datasource), without requiring the knowledge or cooperation of the providers of those other services, and with essentially zero coordination cost. But yeah, it really does never cease to amaze me.
</p>
</body>
</item> 

<item num="a703">
<title>Upcoming talks</title>
<date>2003/05/27</date>
<body>
<p>
I've got a couple of speaking engagements coming up. On Friday morning, I'm the keynote at <a href="http://www.oscom.org/Conferences/Cambridge/">OSCOM</a> (Open Source Content Management), following Dave Winer's Thursday AM keynote. The event is at the Berkman Center in Cambridge, MA.
</p>
<p>
In December, I'm speaking at <a href="http://www.xmlconference.org/xmlusa/">XML 2003</a> in Philadelphia. That's a long way off, but if you want to <a href="http://www.xmlconference.org/xmlusa/2003/call.asp">submit a proposal</a> for a presentation or tutorial, the conference planners would like to remind you that those proposals are due June 2.
</p>
</body>
</item> 

<item num="a702">
<title>APIs, protocols, and rogue plumbers</title>
<date>2003/05/23</date>
<body>
<p>
<table align="right">
<tr>
<td>
<a href="http://members.tripod.com/airstripone/as6.html">
<img width="170" height="140" alt="harry tuttle" src="http://weblog.infoworld.com/udell/gems/tuttle.jpg"/>
</a>
</td>
</tr>
</table>
<blockquote>
<i>
The inexorable logic of Web services sets aside APIs in favor of protocols. XML messages flowing through the pipes enact those protocols. Anyone with authorized access to that plumbing will be able to monitor and inject messages quite easily, and everyone will know that's true. If B or S or V can't unclog the pipes, we'll elect Harry Tuttle, and he'll do it without even getting his hands dirty. [Full story at <a href="http://www.infoworld.com/article/03/05/23/21OPstrategic_1.html">InfoWorld.com</a>]
</i>
</blockquote>
</p>
<p>
This was the 35th weekly column I've written for InfoWorld, and it might be my favorite one so far. 
<table align="right" cellpadding="6" cellspacing="0">
<tr>
<td>
<!-- BEGIN PULSEPOLL CODE -->
<!-- COPYRIGHT 1999 PULSEPOLL.COM, INC. ALL RIGHTS RESERVED -->
<script language="javascript" src="http://www.pulsepoll.com/scripts/mgwms32.dll?MGWLPN=LOCAL&amp;wlapp=PollServe&amp;SiteID=7880&amp;Profile=1&amp;Width=110&amp;LanguageID=35&amp;LocationID=1"/>
<noscript>You must turn on JavaScript to view the <a href="http://www.pulsepoll.com">PulsePoll</a>. For tech support: <a href="http://www.co-laboratory.com">co-laboratory</a>
</noscript>
<!-- END PULSEPOLL CODE  -->
</td>
</tr>
</table>
We had a bit of a discussion about how to play the column on the cover of next week's magazine. &quot;Web services conquer all&quot; would likely have been tweaked, but the idea was to invoke a hot theme. A klunky phrase like &quot;Web services&quot; doesn't leave much maneuvering room in a short headline, though. &quot;Protocols and rogue plumbers,&quot; a shorter version of the title that actually appears on the column in the magazine, is arguably too obscure for the cover. Of course &quot;Harry Tuttle's Revenge&quot; is completely obscure, but might its very oddity provoke interest? In the end it was moot, because a design change pushed the column tease off the cover. But if you have an opinion, fire away!
</p>
<p>
PS: There is still time for a certain value-added reseller, V, to save me (and hundreds of other customers) the trouble of manually transferring records from bank B to service-provider S. But in a few weeks, the <a href="http://members.tripod.com/airstripone/as6.html">music</a> will stop.
</p>
</body>
</item> 

<item num="a699">
<title>Koha and the Library of Congress</title>
<date>2003/05/22</date>
<body>
<p>
I've added a tenth OPAC vendor to the <a href="http://weblog.infoworld.com/udell/stories/2002/12/11/librarylookupGenerator.html">LibraryLookup generator</a>, and this one is kind of special. It's an open source product called <a href="http://www.koha.org/">Koha</a>, commissioned by the HLT Library in New Zealand. Pat Eyler, the Kaitiaki (Guardian) of the project, told me about it way back in November, and the email got buried in a folder which I didn't revisit until this week. Sorry Pat, but better late than never. Very cool to see an open source OPAC in use.
</p>
<p>
What got me to revisit my LibraryLookup email folder was a message from David Carter-Tod. It seems that Raymond Yee has been pondering for some time how to write an URL that would address a Library of Congress record. He finally <a href="http://iu.berkeley.edu/rdhyee/2003/05/20#a802">cracked it</a>, and David <a href="http://instructionaltechnology.editthispage.com/2003/05/21#a3830">connected the dots</a> to make a <a href="javascript:var%20re=/([\/-]|is[bs]n=)(\d{7,9}[\dX])/i;if(re.test(location.href)==true){var%20isbn=RegExp.$2;void(win=window.open('http://catalog.loc.gov/cgi-bin/Pwebrecon.cgi?v3=1&amp;DB=local&amp;CMD=020a+'+isbn+'&amp;CNT=10+records+per+page','LibraryLookup','scrollbars=1,resizable=1,location=1,width=575,height=500'))}">LibraryLookup bookmarklet for the Library of Congress</a>. (Note: the bookmarklet which is the address of the previous link adds three backslashes that Manila evidently swallowed when David posted his item.) Excellent!
</p>
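<p>
Bookmarklets are hard to read when squeezed onto one line, so here is the same logic unpacked into plain functions. This is a sketch for illustration: it leaves out the <tt>window.open</tt> call so it can run outside a browser, and the function names are mine.
</p>

```javascript
// Sketch: the bookmarklet's logic as plain functions (no window.open).
// The regular expression is the one from the bookmarklet, backslashes included.
const ISBN_IN_URL = /([\/-]|is[bs]n=)(\d{7,9}[\dX])/i;

// Return the ISBN embedded in a page URL, or null when there is none.
function isbnFromUrl(href) {
  const match = ISBN_IN_URL.exec(href);
  return match ? match[2] : null;
}

// Build the Library of Congress search URL; spaces serialize as "+".
function locLookupUrl(isbn) {
  const params = new URLSearchParams({
    v3: '1',
    DB: 'local',
    CMD: '020a ' + isbn,
    CNT: '10 records per page',
  });
  return 'http://catalog.loc.gov/cgi-bin/Pwebrecon.cgi?' + params.toString();
}

const isbn = isbnFromUrl('http://allconsuming.net/item.cgi?isbn=0967584418');
console.log(isbn);
console.log(locLookupUrl(isbn));
```

<p>
The pattern looks for eight to ten characters (digits, with an optional trailing X) that follow a slash, a hyphen, or <tt>isbn=</tt> in the current page's address, which is how book pages on many sites happen to be addressed.
</p>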
</body>
</item> 

<item num="a698">
<title>The new old man</title>
<date>2003/05/22</date>
<body>
<p>
<table align="right">
<tr>
<td>
<a href="http://www.pilgriminn.com/attractions.htm">
<img src="http://www.pilgriminn.com/oldman.jpg"/>
</a>
<div align="center" class="realsmall">the old old man</div>
</td>
</tr>
<tr>
<td>
<img src="http://weblog.infoworld.com/udell/gems/oldman.jpg"/>
<div align="center" class="realsmall">the new old man</div>
</td>
</tr>
</table>
My rule is never to post anything here that doesn't have a technical hook, but maybe by the time I get to the end of this item, I'll have thought of one. I live in New Hampshire, and we had a tough winter. Some winters are extra cold, some extra long, and some extra snowy. This year we got the trifecta. And to cap it off, New Hampshire's state symbol, a granite formation known as the <a href="http://www.mutha.com/oldmanmt.html">Old Man</a>, came crashing down earlier this month. He's on every license plate, and on every state highway sign, so this was quite an odd thing to have happen -- though not unexpected, since he'd been on life support for years.
</p>
<p>
Last week, hiking with a friend in the woods near our homes, I spotted a small formation that looked eerily like the Old Man. Yesterday, my friend mentioned it to our local newspaper, and today they ran a photo with this caption:
</p>
<blockquote>
<i>
An area of cliff along the Washington Street extension in Keene resembles the late Old Man of the Mountain in Franconia. This formation is located on the gated road that leads to Beaver Brook Falls. The natural rock formation resembling an old man's face - perhaps New Hampshire's best-known symbol - fell from Cannon Mountain earlier this month. (Sentinel photo by MICHAEL MOORE) 
</i>
</blockquote>
<p>
So what's the technical hook? Damned if I know. I guess I'll have to think of something else to write today to push this item off InfoWorld's home page.
</p>
	</body>
</item> 

<item num="a697">
<title>Googling for social security numbers</title>
<date>2003/05/22</date>
<body>
<p>
Now and again, I google for my social security number, hoping that the number of hits will be zero but fearing that it won't be. So far, so good. In case you've never tried it, here's an interesting experiment. Search for the first digit, then the first two digits, and so on until you build up the string of all nine digits. Here's the pattern for me:
<pre>
Digits of SS#    Google hits

1                952,000,000
2                182,000,000
3                  5,900,000
4                 14,700,000 (Because it spells a year in the last century.)
5                     13,300
6                        683
7                         22
8                          3
9                          0
</pre>
There is, of course, a class of 10-digit numbers -- namely phone numbers -- that produce  <a href="http://www.google.com/search?q=6033558980">Google results</a> that usually shock people who haven't seen them before. How shocked would you be to find your social security number effective as a Google search term? Does this in fact already happen sometimes?
</p>
</body>
</item> 

<item num="a696">
<title>Blogs, chaos, and neural nets</title>
<date>2003/05/21</date>
<body>
<p>
Testing guru Brian Marick thought that yesterday's item <a href="http://www.testing.com/cgi-bin/blog/2003/05/20#fit-in-the-news">kind of muddles together several issues</a> -- Ward Cunningham's FIT framework, testing, and the phenomenon of Windows rot. I agree. The distance from the opening point to the conclusion was, in retrospect, more than a comfortable stretch. The editor who lives inside my head told me that too, but not quite loudly enough. As I've mentioned <a href="http://weblog.infoworld.com/udell/2003/03/09.html">elsewhere</a>, amplification of that internal voice with external feedback is one of the great benefits of writing for the Web. Of course I've now subscribed to Brian's <a href="http://www.testing.com/cgi-bin/blog/2003/05/20#fit-in-the-news">RSS feed</a>, so that going forward I'll have one more window open onto the world of agile programming and testing.
</p>
<p>
How much RSS input can a person usefully process? There's got to be a limit, but I've picked up a bunch of feeds lately, bringing my total to 130, and it's still surprisingly manageable. Ray Ozzie <a href="http://www.ozzie.net/blog/2003/05/17.html#a84">says</a> he's even more voracious. &quot;I've got nearly 150 feeds that I monitor in one way or another,&quot; he writes. Of course monitoring doesn't mean exhaustively reading. Like Ray I scan my feeds and only read them selectively, but absolutely depend on the aggregate effect they create. &quot;I won't truly understand what's going on out there unless I mix it up a bit,&quot; Ray says, adding that RSS is a way to &quot;force some chaos&quot; into his routine. I like that metaphor a lot. Here's another: the blog network is a nervous system, and we are the neurons. As another Ray -- Ray Kurzweil -- has pointed out:
</p>
<blockquote cite="Ray Kurzweil">
The mission of the neuron is to render a judgment. It does this by summarizing the thousands of chaotic messages it receives into a coherent answer that reflects its particular view of the world. ... Key to this process is the concept of feedback, without which a net of neurons would be unable to learn. [<a href="http://www.kurzweilai.net/articles/art0254.html?printable=1">KurzweilAI.net</a>]
</blockquote>
<p>
Hmm. Now that I think of it, shouldn't the <a href="http://www.kurzweilai.net/news/">KurzweilAI.net news</a> be an RSS feed as well as an <a href="http://www.kurzweilai.net/email/?main=signup.html">email newsletter</a>?
</p>
</body>
</item> 

<item num="a695">
<title>Testing for Windows rot</title>
<date>2003/05/20</date>
<body>
<p>
It's nice to see the New York Times <a href="http://www.nytimes.com/2003/05/19/technology/19NECO.html">mentioning</a> Ward Cunningham as the father of <a href="http://www.c2.com/cgi/wiki?WelcomeVisitors">Wiki</a>. I wonder, though, whether another of Ward's efforts -- Extreme Programming, and in particular his advocacy of test-driven software development -- might not ultimately affect more people's lives. 
</p>
<p>
The column in which I <a href="http://weblog.infoworld.com/udell/2003/02/13.html">interviewed</a> Ward attracted a lot of notice. Bill de hÓra <a href="http://www.dehora.net/journal/archives/000192.html">wrote</a>: 
</p>
<blockquote cite="Bill de hÓra">
Stop what you're doing. Ward Cunningham is quite possibly the most vital actor and thinker on software development over the last ten years. [<a href="http://www.dehora.net/journal/">Bill de hÓra</a>]
</blockquote>
<p>
Tim Bray <a href="http://www.tbray.org/ongoing/When/200x/2003/02/13/NamingFinishing">wrote</a>: 
</p>
<blockquote cite="Tim Bray">
A couple of [Ward's] remarks have been creating rumbling echoes that won't die down in the back of my brain. [<a href="http://tbray.org/ongoing">ongoing</a>]
</blockquote>
<p>
One of the things Ward told me about in that interview, which I omitted from my writeup -- because, frankly, I didn't really get the significance of it at the time -- is the <a href="http://fit.c2.com/">Fit framework</a>. The O'Reilly Network has a <a href="http://www.macdevcenter.com/lpt/a/3014">good article</a> about Fit. You can see it in action at the <a href="http://fit.c2.com/wiki.cgi?WhatsWhat">Fit Wiki</a>, about which Ward writes, in his typically direct way:
</p>
<blockquote cite="Ward Cunningham">
This site is about tests that people can read. Here is a sample. Green is good.
</blockquote>
<p>
Fit uses the simple construct of an HTML table on a Web page to synchronize the activities of testers and developers. There is art involved on both sides. The tester's art is to represent conditions and expected outcomes in tabular form. It turns out that all kinds of software behavior can be represented in this way, but doing so effectively takes thought, skill, and experience. The developer's art is to write the &quot;fixtures&quot; that map between the tabular representation and the code under test. Also a subtle game. In the end, the magical effect is this: the same HTML table that represents the tests becomes the dashboard that displays test results. 
</p>
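<p>
To illustrate the table-in, table-out idea, here is a conceptual sketch. It is emphatically not the Fit API: the rows, the <tt>divide</tt> function under test, and the <tt>runTable</tt> helper are all hypothetical stand-ins.
</p>

```javascript
// Conceptual sketch only: this is not the Fit API, just the table-in, table-out idea.
// Each row pairs inputs with an expected result, as a tester might write them.
const rows = [
  { numerator: 10, denominator: 2, expected: 5 },
  { numerator: 9, denominator: 3, expected: 3 },
  { numerator: 1, denominator: 4, expected: 0.5 },   // deliberately wrong: 1/4 is 0.25
];

// Hypothetical code under test.
function divide(numerator, denominator) {
  return numerator / denominator;
}

// The "fixture": maps a table row onto a call into the code under test.
function divisionFixture(row) {
  return divide(row.numerator, row.denominator);
}

// Run every row and annotate it with the actual value and a color.
function runTable(table, fixture) {
  return table.map((row) => {
    const actual = fixture(row);
    const color = actual === row.expected ? 'green' : 'red';
    return { ...row, actual, color };
  });
}

const results = runTable(rows, divisionFixture);
console.table(results);
```

<p>
The first two rows come back green and the third red. The output has the same shape as the input, plus a verdict per row, and that is the whole trick: the test specification and the results dashboard are one artifact.
</p>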
<p>
<table align="right">
<tr>
<td>
<a href="http://www.bwtse.co.uk/fungal_decay.htm">
<img width="237" height="139" vspace="6" hspace="6" align="right" alt="dry rot" src="http://www.bwtse.co.uk/images/rot/dry_rot_cuboidal_cracking.jpg"/>
</a>
<div align="center" class="realsmall">dry rot</div>
</td>
</tr>
</table>
What got me thinking about this, again, was <a href="http://www.sciam.com/article.cfm?chanID=sa006&amp;articleID=000DAA41-3B4E-1EB7-BDC0809EC588EEDF">this Scientific American article</a> on self-repairing computers. A couple of days ago, while wasting an afternoon undoing some rot that had crept into a Windows XP installation, I wished I could fast-forward to that happy world of micro-rebooting and rapid recovery. But when I stop to think about it, Windows rot isn't really about catastrophic failure, it's about, well, rot. I love this description:
</p>
<blockquote cite="Jim O'Halloran">
The problem with WinRot is that its a process that just seems to &quot;happen&quot; over a period of time.  There's no warning, no messages in the event log, no &quot;Windows would like to rot now.  Is this ok? Yes/No&quot; dialog.  Nothing. [<a href="http://www.jimohalloran.com/archives/000201.html">Jim O'Halloran's Weblog</a>]
</blockquote>
<p>
In my case, two bizarre symptoms appeared on the same day:
</p>
<ul>
<li>
<p>MSIE 6 began crashing hard immediately upon loading any Amazon.com page. It was fine, though, with every other site. And MSIE 6 on another machine was (of course) fine with Amazon.</p>
</li>
<li>
<p>Radio UserLand's GUI interface, accessed by right-clicking the tray icon and selecting Open Radio, became inaccessible. No other functions of the program were noticeably affected.</p>
</li>
</ul>
<p>
I surmised these two oddities were related to a slew of software installation I'd done recently, so I began playing the System Restore roulette game. Pick a day, revert to that day. Problem solved? If yes, assess collateral damage. If no, pick an earlier day and repeat. After three tries I was back 12 days, and the Radio glitch was solved. But not the MSIE/Amazon glitch. Applying the most recent MSIE patch did solve that one too, for reasons I'll never know. Then the collateral damage. Office 2003 apps got unregistered, as did the SpamBayes add-in for Outlook 2000. My Jython-based email searcher broke because of a rollback to a previous CLASSPATH. And there were a few more things like that.
</p>
<p>
Probably it was software installation or uninstallation that caused the two symptoms of rot. Maybe not. Either way, lack of immediate feedback made recovery much worse. System Restore is actually a darned useful utility, but if you don't catch the cancer early you're in for a painful treatment. Problem was, the disease attacked functions that I don't use every day, or even every week. 
</p>
<p>
I wonder if regular testing of installed applications could enable early detection. And if so, I wonder how that might practically be done.
</p>
<p>
Yes, I do also use Mac OS X. And while I can't confirm rot, I do detect questionable odors. In the final analysis, any complex system can benefit from regular and disciplined verification.
</p>
</body>
</item> 

<item num="a694">
<title>Experimental journalism</title>
<date>2003/05/18</date>
<body>
<p>
<blockquote>
<i>
Thomas Bayes, a Presbyterian minister and mathematician born just over
300 years ago, would be shocked to see most of the e-mail messages
that bid for our attention nowadays. He would be thrilled to know,
however, that his statistical inference theorem has inspired a potent
counterattack. An open source project called SpamBayes has emerged as
a powerful weapon in the war on spam. There are a few different
implementations of SpamBayes. I'll focus here on an Outlook add-in,
written by renowned Python hacker Mark Hammond. I've been skeptical
about the long-term prospects for content-based e-mail filtering. But
the Python-based SpamBayes engine, and Hammond's brilliant add-in
(also written in Python), are rapidly making me a believer.
[Full story at <a href="http://www.infoworld.com/article/03/05/16/20TCspam_1.html">InfoWorld.com</a>]
</i>
</blockquote>
This article was, among other things, an experiment in combining print journalism with blogging. When I wrote the review, I was so excited about the results I was getting with SpamBayes that I wanted to blog it immediately. Instead, I wrote a <a href="http://weblog.infoworld.com/udell/2003/05/08.html#a684">companion piece</a> online and then, based on feedback, <a href="http://weblog.infoworld.com/udell/2003/05/09.html#a636">another one</a> the next day. The latter URL will presciently appear in the print article that InfoWorld subscribers will receive next week. This time-travel effect is kind of cool, but it also demonstrates what I think is a really powerful kind of print/online synergy. 
</p>
<p>
I love the idea of a story that begins online, is snapshotted for a print readership, and then continues online. I first tried the experiment in 1996, when I was researching a BYTE cover story. Why, I wondered, should the Internet serve only as a mechanism for after-the-fact feedback? Why couldn't I post the general outline of the story I was envisioning, in order to attract perspectives that could usefully shape the story? That's just what I did, and it was a transforming experience. An engineer at JPL told me about a compelling use of Java for distributed data visualization, and that became a central element of the story.
</p>
<p>
The SpamBayes review worked a bit differently. It was already in production when I blogged the companion pieces, but there was still time to incorporate feedback into the review. (I see that one correction wasn't made, though. I'd asked to remove the reference to Mac OS X mail from this -- &quot;Several e-mail programs, including the Mail program bundled with Mac OS X, use Bayesian techniques&quot; -- because of what I learned <a href="http://weblog.infoworld.com/udell/2003/05/12.html#a686">here</a>.) 
</p>
<p>
A more complete example of print/online synergy is an article I'm working on right now, a companion piece to a review of J2EE servers. I posted some initial thoughts <a href="http://weblog.infoworld.com/udell/2003/05/12.html#a688">here</a>, and collected lots of useful feedback both in comments and privately. Based on that feedback, I've realized that the original &quot;is J2EE/EJB overkill?&quot; theme was a bit dated. The folks I've talked to have gotten past that issue. They are choosing from the smorgasbord of J2EE services in thoughtful and clever ways. But it also became clear that the &quot;agility versus robustness&quot; theme continues to resonate with everybody. I hope that the story I'll write tomorrow or Monday will do justice to that theme. But the material I've gathered from interviews with Adam Bosworth, Marc Fleury, <a href="http://radio.weblogs.com/0118231/">Steve Muench</a>, Annrai O'Toole, and others -- plus <a href="http://radiocomments.userland.com/comments?u=100887&amp;p=688&amp;link=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2F2003%2F05%2F12.html%23a688">online commentary</a> and email correspondence -- is far more than an 800- or 1000-word InfoWorld print article can accommodate. Happily the blog exists, and can carry the theme forward at greater length, over time, and in collaboration with other blogs.
</p>
<p>
A great deal has been said and written about weblogs and journalism, but I've not seen the following point articulated clearly. In a world full of weblogs, written from all kinds of perspectives, information and opinion are commodities. But selection, analysis, synthesis, and coherent storytelling -- the highest and best functions of journalism -- are arguably more valuable than ever. That value cannot be delivered from an ivory tower, though. It must flow from an intense collaboration with what Dan Gillmor calls <a href="http://www.google.com/search?q=dan+gillmor+former+audience">the former audience</a>.
</p>
</body>
</item> 

<item num="a693">
<title>Toggling between HTML and XML</title>
<date>2003/05/17</date>
<body>
<p>
The HTML Tidy procedure I spelled out <a href="http://weblog.infoworld.com/udell/2003/04/14.html#a666">here</a> is proving awkward. If I screw something up that Tidy can't fix, I wind up checking the document in an XML parser. A convenient parser is the one in MSIE, and lately I've been using it for this purpose. This is awkward too, however. For a few days, I tried this procedure:
</p>
<ul>
<li>Write the file, e.g. file.html</li>
<li>Copy the file to, e.g., file.xml.</li>
<li>Check file.xml in MSIE.</li>
<li>Extract and post contents.</li>
</ul>
<p>
But the edit/copy/check cycle is ridiculous. Couldn't I toggle the same file between two modes? Here's one approach. By adding this to the file:
</p>
<pre>
&lt;head>
&lt;meta http-equiv="content-type" content="text/xml"/>
&lt;/head>
</pre>
<p>
I could view it as XML and check well-formedness. Now toggling between HTML and XML modes was as simple as changing between &quot;text/xml&quot; and &quot;text/html&quot; and reloading. Here are both views, for this in-progress document, in MSIE on Mac OS X:
</p>
<p align="center">
<img width="486" height="415" border="1" src="http://weblog.infoworld.com/udell/gems/xhtml01.gif"/>
<div align="center" class="realsmall">Content-type: text/html</div>
</p>
<p align="center">
<img width="522" height="479" border="1" src="http://weblog.infoworld.com/udell/gems/xhtml02.gif"/>
<div align="center" class="realsmall">Content-type: text/xml</div>
</p>
<p>
What I didn't discover until later is that, for reasons I'm sure someone will explain to me, this doesn't work in MSIE 6. Not knowing that, I next attempted what seemed like a brilliant hack. Why not make a couple of bookmarklets to poke the META HTTP-EQUIV tags into the DOM? Then it'd be just a one-click deal to toggle from HTML to XML and back. 
</p>
<p>
Here's a snippet that (in theory) does the HTML-to-XML switcheroo:
</p>
<p>
<a href="javascript:void( function() {var element=document.createElement('meta'); element.setAttribute('http-equiv','content-type'); element.setAttribute('content', 'text/xml'); var oldhead = document.getElementsByTagName('head')[0]; var newhead=document.createElement('head'); newhead.appendChild(element); document.firstChild.replaceChild(newhead, oldhead);} () )">switchToXML</a>
</p>
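<p>
Since the idea was a pair of bookmarklets, one per content type, here is a sketch of how both could be generated from a single template. The helper name <tt>makeContentTypeBookmarklet</tt> is hypothetical, not something from the original snippet, and it swaps the head via <tt>parentNode</tt> rather than <tt>document.firstChild</tt>, on the assumption that the first child might be a doctype node. It only builds the <tt>javascript:</tt> URL strings; it makes no claim that a browser will honor the change.
</p>

```javascript
// Hypothetical helper (not part of the original posting): builds a
// javascript: URL that replaces the document head with one containing
// a META HTTP-EQUIV element for the given content type.
function makeContentTypeBookmarklet(contentType) {
  const body = [
    "var element=document.createElement('meta');",
    "element.setAttribute('http-equiv','content-type');",
    "element.setAttribute('content','" + contentType + "');",
    "var oldhead=document.getElementsByTagName('head')[0];",
    "var newhead=document.createElement('head');",
    "newhead.appendChild(element);",
    // Assumption: use parentNode instead of document.firstChild,
    // which may be a doctype node rather than the html element.
    "oldhead.parentNode.replaceChild(newhead,oldhead);",
  ].join(" ");
  // Wrap the statements in an immediately-invoked function; void()
  // keeps the browser from navigating to the expression's result.
  return "javascript:void(function(){" + body + "}())";
}

const switchToXML = makeContentTypeBookmarklet("text/xml");
const switchToHTML = makeContentTypeBookmarklet("text/html");
console.log(switchToXML);
console.log(switchToHTML);
```

<p>
Each string can be pasted into a link's <tt>href</tt> or dragged to the bookmarks bar.
</p>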
<p>
Mozilla's amazingly cool DOM inspector enables us to verify that the expected change was made. Here's the document in its default text/html mode:
</p>
<p align="center">
<img width="491" height="380" border="1" src="http://weblog.infoworld.com/udell/gems/xhtml03.gif"/>
<div align="center" class="realsmall">Content-type: text/html</div>
</p>
<p>
And here's the document after clicking on that javascript: link to modify the META HTTP-EQUIV element:
</p>
<p align="center">
<img width="477" height="384" border="1" src="http://weblog.infoworld.com/udell/gems/xhtml04.gif"/>
<div align="center" class="realsmall">Content-type: text/xml</div>
</p>
<p>
It works! Except no, it doesn't. Although the DOM exactly matches the DOM obtained by reading in a file that originally contains an HTTP-EQUIV of text/xml, MSIE doesn't react to the dynamic change.
</p>
<p>
Sadly it's all kind of moot anyway. Safari won't parse XML. Mozilla parses it, albeit without an XML viewer, but like MSIE 6 it doesn't seem to want to let HTTP-EQUIV override the filename's extension. <a href="http://radiocomments.userland.com/comments?u=100887&amp;p=693&amp;link=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2F2003%2F05%2F12.html%23a693">Ideas?</a>
</p>
</body>
</item> 

<item num="a692">
<title>Tools for rules</title>
<date>2003/05/16</date>
<body>
<p>
<a href="http://www.amazon.com/exec/obidos/tg/detail/-/0387583505/">
<img align="right" hspace="6" src="http://images.amazon.com/images/P/0387583505.01.MZZZZZZZ.jpg"/>
</a>
The dust was thick on my copy of the 1985 Clocksin and Mellish classic, Programming in Prolog. But Ted Neward, author of the forthcoming book Effective Enterprise Java, brought it all rushing back: expert systems, declarative rules engines, predicate calculus, backward- vs. forward-chaining evaluation. 
<br/>
<br/>
On his weblog, Neward gives an example of why this obscure discipline is back in vogue. 
</p>
<blockquote cite="Ted Neward">
&quot;If the guy filing the expense report files a 334-B form, then the upper limit on the total is twice his Personal Expense Liability total (which you get from the HR database, of course), unless his boss is an Assistant Vice President, in which case we have to get departmental approval from two managers and the AVP himself.&quot; [<a href="http://www.neward.net/ted/weblog/index.jsp?date=20030222#1045902285278">The Mountain of Worthless Information</a>]
</blockquote>
<p>
Today we program this stuff in procedural languages, and we make a hell of a mess doing so. Wouldn't it be great if we could declare a bunch of rules and have a rules engine work out the consequences? As Ted points out, this is the moral equivalent of using SQL to say what you want done with data, not how. [Full story at <a href="http://www.infoworld.com/article/03/05/16/20OPstrategic_1.html">InfoWorld.com</a>]
</p>
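<p>
To make &quot;declare the rules, let the engine work out the consequences&quot; concrete, here is a toy forward-chaining loop. It is only a sketch: the fact names (<tt>form</tt>, <tt>bossTitle</tt>, <tt>personalExpenseLiability</tt>, <tt>total</tt>) and the three rules are hypothetical, loosely modeled on the expense-report example, and real engines use far more efficient matching than this repeated sweep.
</p>

```javascript
// A toy forward-chaining loop. Field names and the rules themselves are
// hypothetical, loosely modeled on the expense-report example.
const rules = [
  {
    name: "reject totals over the limit",
    // Stays false until some other rule has derived a limit.
    when: (f) => f.total > f.limit,
    then: () => ({ status: "rejected" }),
  },
  {
    name: "334-B limit is twice the liability total",
    when: (f) => (f.form === "334-B" ? f.bossTitle !== "AVP" : false),
    then: (f) => ({ limit: 2 * f.personalExpenseLiability }),
  },
  {
    name: "AVP boss requires three approvals",
    when: (f) => (f.form === "334-B" ? f.bossTitle === "AVP" : false),
    then: () => ({ approvalsNeeded: 3 }),
  },
];

function runRules(ruleSet, initialFacts) {
  const facts = { ...initialFacts };
  const fired = new Set();
  let progress = true;
  // Keep sweeping until a full pass fires no new rule.
  while (progress) {
    progress = false;
    for (const rule of ruleSet) {
      if (fired.has(rule.name)) continue;
      if (!rule.when(facts)) continue;
      // Merge the rule's conclusions into the known facts.
      Object.assign(facts, rule.then(facts));
      fired.add(rule.name);
      progress = true;
    }
  }
  return facts;
}

const result = runRules(rules, {
  form: "334-B",
  bossTitle: "Director",
  personalExpenseLiability: 500,
  total: 1200,
});
console.log(result.limit, result.status);
```

<p>
Note that the rejection rule is listed first but fires last, because it has to wait for another rule to derive the limit. The order in which rules are declared doesn't matter, which is the whole appeal.
</p>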
<p>
For those (like me) who haven't followed recent developments in the world of rules engines, here are some signposts:
</p>
<ul>
<li>
<a href="http://drools.org/">drools</a>
</li>
<li>
<a href="http://www.yasutech.com">QuickRules</a>
</li>
<li>
<a href="http://www.ilog.com/products/rules/engines/jrules/">ILOG JRules</a>
</li>
<li>
<a href="http://www.w3.org/2000/10/swap/doc/rule-systems">A W3C taxonomy of rule-based systems</a>
</li>
<li>
<a href="http://herzberg.ca.sandia.gov/jess/">Jess</a>
</li>
<li>
<a href="http://herzberg.ca.sandia.gov/jess/docs/52/rete.html">the Rete algorithm</a>
</li>
<li>
<a href="http://www.jcp.org/en/jsr/detail?id=094">JSR 94, the Java Rule Engine API</a>
</li>
</ul>
</body>
</item> 

<item num="a691">
<title>Applied social network analysis</title>
<date>2003/05/15</date>
<body>
<p>
Somehow Eric Promislow's <a href="http://www.baconizer.com">Amazing Baconizer</a> escaped my attention until Eric mentioned it to me recently. Eric was co-creator of <a href="http://www.stilo.com/products/omnimark/buildingxmlmiddleware.html">OmniMark</a>, an ahead-of-its-time XML-oriented programming language, and is a senior developer at <a href="http://www.activestate.com/Corporate/People/Senior_Developers.html">ActiveState</a>. &quot;The Baconizer,&quot; he says, &quot;is where I go to play in your basic LABP world (I'm too lazy to replace Berkeley DB with My SQL).&quot; I've seen a few other applications that automate the traversal of Amazon's &quot;Customers who bought this book also bought...&quot; links, but Eric's does so in a goal-directed way. Here, for example, is the 12-hop path from my book to my wife's book:
</p>
<p>
<a href="http://www.baconizer.com/cgi-bin/boston?title1=1565925378&amp;title2=1579903002">http://www.baconizer.com/cgi-bin/boston?title1=1565925378&amp;title2=1579903002</a>
</p>
<p>
You can reverse the connection, yielding a 14-hop path:
</p>
<p>
<a href="http://www.baconizer.com/cgi-bin/boston?title2=1565925378&amp;title1=1579903002">http://www.baconizer.com/cgi-bin/boston?title2=1565925378&amp;title1=1579903002</a>
</p>
<p>
You can also walk a random path to my book:
</p>
<p>
<a href="http://www.baconizer.com/cgi-bin/boston?t1_type=7&amp;t2_type=7&amp;bnum1=0&amp;bnum2=332017&amp;title2=Practical+Internet+Groupware&amp;formtype=6">http://www.baconizer.com/cgi-bin/boston?<br/>t1_type=7&amp;t2_type=7&amp;bnum1=0&amp;bnum2=332017&amp;<br/>title2=Practical+Internet+Groupware&amp;formtype=6</a>
</p>
<p>
If you do that a few times, you'll notice that all paths are roughly the same length, rarely fewer than 10 hops or more than 14. You'll also notice that the final hops to my book are almost always: <a href="http://www.amazon.com/exec/obidos/ASIN/059600110X">Peer-to-Peer</a> -&gt; <a href="http://www.amazon.com/exec/obidos/ASIN/0793148782">P2P</a> -&gt; <a href="http://www.amazon.com/exec/obidos/ASIN/076454893X">Get in the Groove</a> -&gt; <a href="http://www.amazon.com/exec/obidos/ASIN/0789726777">Using Groove 2.0</a> -&gt; <a href="http://www.amazon.com/exec/obidos/ASIN/1565925378">Practical Internet Groupware</a>. This is emphatically not a good thing, and partly explains why my book is out of print: it failed to appeal to a critical mass of overlapping interest groups. Pick almost any other book, and you'll find that there are at least several paths leading to it.
</p>
<p>
There are lots of these affinity browsers kicking around nowadays, for lots of different kinds of networks. Yesterday, for example, Phil Windley posted a <a href="http://www.windley.com/2003/05/14.html#a616">nice write-up</a> on Jo Walsh's FOAF (Friend of a Friend) session at ETCON, and included a pointer to a <a href="http://swordfish.rdfweb.org/discovery/2002/02/paths/">path-finding demo</a> that connects two people by way of a sequence of linked FOAF files. 
</p>
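<p>
I don't know how the Baconizer or the FOAF demo actually find their paths, but the textbook way to connect two nodes in a link graph is a breadth-first search. Here's a minimal sketch over a made-up graph; the node names and the <tt>shortestPath</tt> function are illustrative assumptions only. Because &quot;also bought&quot; links are directed, the path from A to E needn't be as long as the path from E to A, which would account for the 12-hop versus 14-hop asymmetry.
</p>

```javascript
// Breadth-first search over directed links. The graph below is made up;
// each key lists the items its "also bought" links point to.
const links = {
  A: ["B", "C"],
  B: ["D"],
  C: ["D"],
  D: ["E"],
  E: ["A"],
};

function shortestPath(graph, start, goal) {
  // cameFrom doubles as the visited set and the trail back to start.
  const cameFrom = new Map([[start, null]]);
  const queue = [start];
  while (queue.length > 0) {
    const node = queue.shift();
    if (node === goal) {
      // Walk the trail backwards to rebuild the path.
      const path = [];
      for (let at = goal; at !== null; at = cameFrom.get(at)) {
        path.unshift(at);
      }
      return path;
    }
    for (const next of graph[node] || []) {
      if (!cameFrom.has(next)) {
        cameFrom.set(next, node);
        queue.push(next);
      }
    }
  }
  return null; // no path exists
}

const forward = shortestPath(links, "A", "E");
const backward = shortestPath(links, "E", "A");
console.log(forward.join(" -> "));
console.log(backward.join(" -> "));
```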
<p>
When you encounter one of these affinity browsers, it usually takes about 5 minutes to traverse the <a href="http://www.fawcette.com/dotnetmag/2002_09/magazine/columns/trends/figure1.asp">hype cycle</a> from the peak of inflated expectations to the trough of disillusionment. This stuff is fun, but what's it really good for?
</p>
<table align="right" cellpadding="6">
<tr>
<td>
<a href="http://semanticstudios.com/publications/semantics/000006.php">
<img alt="social network analysis" width="205" height="170" src="http://semanticstudios.com/publications/semantics/images/02212002_snastory.gif"/>
</a>
</td>
</tr>
</table>
<p>
For book publishers, second-order analysis of these affinities could prove to be a powerful weapon in what has become a brutal war of attrition. Here's how social network analysis pioneer <a href="http://www.orgnet.com/">Valdis Krebs</a> envisions it:
</p>
<blockquote>
<i>
If you follow these links a few steps out, says Krebs, clusters emerge, and sometimes those clusters represent disjoint interests connected only through one book. He offers Thomas Petzinger's <a href="http://www.amazon.com/exec/obidos/ASIN/0684863103/">The New Pioneers</a> as an example. It connected two different groups -- one reading books on business and strategy, the other reading books on complexity science and chaos theory. Now there are a number of books that broker that connection, but Petzinger's was one of the first popular books to do so, according to Krebs. [<a href="http://webservices.xml.com/pub/a/ws/2002/06/04/udell.html">Seeing and Tuning Social Networks</a>, O'Reilly Network]
</i>
</blockquote>
<p>
<span class="minireview">Visible Path</span> There isn't yet a tool that solves this problem for publishers, at least not that I've heard of. But a company called <a href="http://www.visiblepath.com">Visible Path</a> has big plans to use social network analysis to turbocharge sales cycles. The company's founder, Antony Brydon (formerly general manager of <a href="http://www.iuma.com/">IUMA</a>), recently walked me through a demo. For salesfolk, it's all about access -- getting to the right people at the right levels in target organizations. The Visible Path software mines relationship data from contact databases, builds a network map by scanning email headers, and says to the salesperson: &quot;You need to get to person X at company Y? Here are the paths that link you to X. Would you like to request an introduction via intermediaries?&quot; 
</p>
<p>
A delicate email protocol then ensues, because to safeguard privacy the system will not reveal the identity of intermediaries until they agree to participate in the referral. The salesperson can pursue multiple paths in parallel; activity is co-ordinated with Salesforce.com's CRM system. <s>Powerful</s> <a href="http://www.ozzie.net/blog/2003/05/13.html#a83">Superconductive</a> stuff.</p>
<p>
If you're the kind of person who prefers not to think about how sausage gets made, you might find this all somewhat creepy -- particularly when you're approached by an automated relationship manager asking you to make an introduction. Personally, I'm fascinated to see how this will unfold.
</p>
<p>
Can it really work? Well, Antony demoed his own system to me, with real prospects, so he is clearly eating the dogfood. How Visible Path fares, during this enterprise software sales drought, will be one way to measure the validity of the concept. 
</p>
</body>
</item> 

<item num="a690">
<title>Indexing and searching Outlook email</title>
<date>2003/05/14</date>
<body>
<p>
<blockquote>
<i>
I never thought I'd find myself digging around in my Outlook message store, but Mark's SpamBayes addin -- which is written in Python -- turns out to be a great Python/MAPI tutorial. Borrowing heavily from his examples, I came up with a script to extract my Outlook mail to a bunch of files that I could feed to a standalone indexer. [Full story at <a href="http://www.xml.com/pub/a/ws/2003/05/13/email.html">O'Reilly Network</a>]
</i>
</blockquote>
</p>
<p>
This was a fun project that gave me a chance to explore three different technologies: the Lucene search engine, Jython, and Python's MAPI interface. As I learned this morning, my closing lament -- that the CPython/MAPI and Jython/Lucene halves of this project do not communicate directly -- is somewhat mitigated by the existence of Lupy (<a href="http://sourceforge.net/projects/lupy/">1</a>, <a href="http://www.divmod.org/Lupy/">2</a>), a Python port of Lucene. But I think the general point still stands. Must every component be rewritten in every language? Let's not go there. 
</p>
<p>
I'm only somewhat satisfied with the search solution I've cobbled together, by the way. The major challenge so far has been learning when and how to use various Lucene search idioms. For example, I can restrict messages to a date in March like so:
</p>
<pre>
yager AND 03/??/03 -&gt; 20 docs
</pre>
<p>
But watch this:
</p>
<pre>
yager and 03/??/03 -&gt; 777 docs
</pre>
<p>
Evidently 'AND' is a Boolean operator, but 'and' is just a noise word. And since Lucene (somewhat annoyingly, to my taste) defaults to OR between terms, this winds up being:
</p>
<pre>
yager OR 03/??/03 -&gt; 777 docs
</pre>
<p>
It's harder to generalize the date to 2003:
</p>
<pre>
yager AND ??/??/03 -&gt; org.apache.lucene.queryParser.ParseException
</pre>
<p>
You can't begin a term with a wildcard. This will work:
</p>
<pre>
yager AND (0?/??/03 1?/??/03) -&gt; 141 docs
</pre>
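<p>
If I wanted to hide that geekiness behind a form, a small query builder could do the expansion. This is a sketch only: <tt>yearQuery</tt> is a hypothetical helper, and it assumes dates are indexed as mm/dd/yy tokens as in the examples here, and that Lucene ORs the terms inside the parentheses by default.
</p>

```javascript
// Hypothetical helper: builds a query for "term AND any date in year yy",
// assuming dates are indexed as mm/dd/yy tokens.
function yearQuery(term, yy) {
  // A term can't begin with a wildcard, so enumerate the two possible
  // leading month digits (0 and 1) instead of writing ??/??/yy.
  const dateTerms = ["0", "1"].map((digit) => digit + "?/??/" + yy);
  return term + " AND (" + dateTerms.join(" ") + ")";
}

console.log(yearQuery("yager", "03"));
```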
<p>
But that's getting pretty darned geeky. Lucene also supports proximity search, but it's a subtle thing as well. Consider:
</p>
<pre>
&quot;from date mcalister dickerson&quot;~20 -&gt; 8 docs
</pre>
<p>
This is a nicely fuzzy search in which the ~20 specifies a 20-word window, and 'from' and 'date' bind that window to the message header. In Outlook, apart from it being ungodly slow to search for messages where Matt McAlister and Chad Dickerson appear in To: or From: headers, I'd have to be too specific -- i.e., From: Matt, or To: Chad. On the other hand, proximity is a tricky thing: 
</p>
<pre>
&quot;from date mcalister dickerson&quot;~30 -&gt; 8 docs
&quot;from date mcalister dickerson&quot;~40 -&gt; 12 docs
&quot;from date mcalister dickerson&quot;~50 -&gt; 17 docs
</pre>
<p>
What's the &quot;right&quot; amount of fuzziness? And then there's this:
</p>
<pre>
&quot;from date mcalister dickerson&quot; -&gt; 0 docs
</pre>
<p>
No docs are found because the literal string does not appear anywhere.
</p>
<p>
I'm the kind of person who'll play around with these variations, but in general, people expect not to have to. The Web has trained us, rightly, to expect that we just type in a word or two and get the &quot;right&quot; answer. I don't know what the stats are on use of Google's advanced search, or any advanced search, but my gut tells me such features are rarely used.
</p>
<p>
I used to think the answer was to standardize on query syntax. Now I think that might help some, but not much. More fruitful, perhaps, would be to use multiple search strategies in parallel, suggest &quot;best&quot; outcomes, and factor the user's choices into future determinations of &quot;best.&quot; 
</p>
<p>
For years now, we've been able to find things on the Web more easily than we can find things in our own personal data stores. There's a huge opportunity, and a huge need, to swing that pendulum back toward the center. 
</p>
</body>
</item> 

<item num="a689">
<title>Google cache looker-upper</title>
<date>2003/05/13</date>
<body>
<p>
<table align="right">
<tr>
<td>
<img alt="google" src="http://weblog.infoworld.com/udell/gems/google.jpg"/>
</td>
</tr>
</table>
This is wicked cool. <a href="http://www.rentzsch.com/">Jonathan Rentzsch</a> has <a href="http://www.rentzsch.com/notes/googleCacheHacking">invented</a> (and <a href="http://fuse.ghostcassette.com/">Fuse</a> has refined) a <a href="http://weblog.infoworld.com/udell/stories/2002/12/11/librarylookup.html">LibraryLookup-style</a> bookmarklet that looks up your current page in Google's cache. It's useful if a page is temporarily or permanently 404. It's also a nifty way to check a prior version of a page, or to see which version of your page Google last cached.
</p>
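<p>
I haven't reproduced Jonathan's bookmarklet here, but the general shape of such a lookup is easy to sketch. The two URL patterns below are assumptions based on how Google's <tt>cache:</tt> operator and the Wayback Machine's wildcard listing are commonly addressed, and neither service guarantees them. The function names are mine.
</p>

```javascript
// Assumed URL patterns; neither service documents these as stable.
function googleCacheUrl(pageUrl) {
  // Hand Google a "cache:" query for the page.
  return "http://www.google.com/search?q=" +
    encodeURIComponent("cache:" + pageUrl);
}

function waybackUrl(pageUrl) {
  // The * asks the archive to list every saved copy of the page.
  return "http://web.archive.org/web/*/" + pageUrl;
}

// In a bookmarklet, pageUrl would be location.href; here it's a sample.
const sample = "http://example.com/a";
console.log(googleCacheUrl(sample));
console.log(waybackUrl(sample));
```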
<p>
In a column on the <a href="http://web.archive.org/">Wayback Machine</a>, I speculated on a more automatic form of redirection:
<blockquote>
<i>
If the Archive proves reliable, we may well soon see servers and/or clients, upon encountering 404s, try to fail gracefully by redirecting to the most-recently-saved archive page. This would be more than a major convenience. It could help bring hypertextual writing finally into the mainstream. Linking is the most profound way in which the web alters (or should alter) how we communicate. The lack of widespread and easy-to-use hypertext writing tools has been an impediment. But the vexing problem of linkrot is the real barrier. We won't collectively invest much effort in weaving the web until we can begin to regard its namespace as less fragile than it has so far proved to be. [<a href="http://udell.roninhouse.com/bytecols/2001-11-30.html">Digital Archives</a>]
</i>
</blockquote>
</p>
<p>
Jonathan's posting suggests the same idea. At the moment, Google is our online cache, and the Wayback Machine is our near-line cache. The shoe that's waiting to drop is copyright. I wonder if <a href="http://www.creativecommons.org/">Creative Commons</a> licenses can be used as is, or with modifications, to express the intent: &quot;Please cache this page so that in case I can no longer support it at its original address, resolvers will be able to find it in the Web's online or near-line caches.&quot;
</p>
</body>
</item> 

<item num="a688">
<title>Appropriate use of J2EE/EJB</title>
<date>2003/05/12</date>
<body>
<p>
An oft-heard complaint, echoed recently by <a href="http://capescience.capeclear.com/articles/j2ee/index.shtml">Annrai O'Toole</a>, is that J2EE app servers are oversold:
</p>
<blockquote cite="Annrai O'Toole">
The J2EE vendors have done a fantastic job of convincing the world
that you can't run a line of Java unless it runs inside a J2EE
container. This is just pure bunkum.
</blockquote>
<p>
I like his formulation that &quot;J2EE is the Java equivalent of a mainframe.&quot; We also have, of course, in COM+, the Windows version of the same idea, which in its earlier MTS incarnation predated J2EE. I also notice that the real mainframes haven't gone away. With respect to the middleware services for which the J2EE server is best known --  &quot;TP-heavy&quot; transaction management, connection and object pooling, role-based security, and declarative control of these aspects -- the question of when and why this stuff is or isn't overkill seems never to go away.
</p>
<p>
Here are some of the arguments and counter-arguments I hear:
</p>
<table border="1" cellpadding="4" cellspacing="1">
<tr>
<td>
<b>You need a J2EE container / EJB architecture because...</b>
</td>
<td align="center">
<b>But...</b>
</td>
</tr>
<tr>
<td>
Your app has to scale.
</td>
<td>
Other clustering and load-balancing solutions are available -- for Java, for Windows (with and without COM+), and for LAMP.
</td>
</tr>
<tr>
<td>
Your app is heavily transactional.
</td>
<td>
TP-Heavy solutions like CICS, Tuxedo, and TopEnd haven't gone away. TP-Lite is arguably getting more interesting, as well, now that the database engines can cluster, and produce/consume Web services.
</td>
</tr>
<tr>
<td>
You need object-relational persistence.
</td>
<td>
And there are other ways to get it. E.g.: Castor, servlets with JDO (Java Data Objects), Apple's Enterprise Objects Framework. 
</td>
</tr>
<tr>
<td>
You need a low-impact way to evolve business logic.
</td>
<td>
J2EE's complexity bites you. It's easy to change what's inside a chunk of business logic, but it's hard to refactor in-the-large.
</td>
</tr>
<tr>
<td>
You need your business logic available to multiple applications.
</td>
<td>
What if those other applications aren't Java-based?
</td>
</tr>
<tr>
<td>
You want to standardize on one environment and one language.
</td>
<td>
Even the most dyed-in-the-wool J2EE developer is likely to touch (at least) SQL, JSP, XSLT, and a scripting language for automation. 
</td>
</tr>
</table>
<p>
For an upcoming article, I'd like to explore these and related themes with people on all sides of this complex, many-sided discussion. In an earlier posting, I characterized one axis of the debate as robustness versus agility, and wondered how we can have our cake and eat it too. Sam Ruby pointed to <a href="http://www.intertwingly.net/blog/1376.html">WS-Transaction</a>, to which Patrick Logan replied &quot;I think we need a rethinking of databases, messages, and coordination.&quot; I'm sure that's true. Meanwhile, what to do?
</p>
<p>
I'm going to contact some people privately, but it seems useful to invite feedback here as well, hence this <a href="http://radiocomments.userland.com/comments?u=100887&amp;p=688&amp;link=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2F2003%2F05%2F12.html%23a688">comment link</a>.
</p>
</body>
</item> 

<item num="a687">
<title>Interfaces and habits</title>
<date>2003/05/12</date>
<body>
<p>
<table align="right">
<tr>
<td>
<a href="http://www.amazon.com/exec/obidos/asin/0201379376">
<img width="96" height="140" alt="jef raskin, the humane interface" src="http://images.amazon.com/images/P/0201379376.01.MZZZZZZZ.jpg"/>
</a>
</td>
</tr>
</table>
<blockquote>
<i>
It seems kind of unfair, doesn't it? First, developers have to understand and accommodate users' habits. Then we have to deliver solutions that add value while surreptitiously encouraging users to adopt better habits. Finally, we have to bring to the surface, examine, and modify our own deeply-ingrained habits. That's a painful and psychologically hard thing to do. But happy users are not the only reward. The habit of breaking habits will serve you well. [Full story at <a href="http://www.infoworld.com/article/03/05/09/19OPstrategic_1.html">InfoWorld.com</a>]
</i>
</blockquote>
</p>
</body>
</item> 

<item num="a686">
<title>Bayesian vs. latent semantic analysis</title>
<date>2003/05/12</date>
<body>
<p>
By way of <a href="http://aldoblog.com/blog/2003-05-04">Michael Alderete's blog</a>, I found <a href="http://www.pacificavc.com/blog/2003/02/10.html#a78">this fascinating item</a> by Tim Oren, a venture capitalist whose eight-year stint at Apple included advanced research on the use of latent semantic analysis for document categorization. Although he can't say for sure, Oren strongly suspects that although OS X Mail is <a href="http://www.c-command.com/spamsieve/">widely</a> <a href="http://email.about.com/cs/macclientreviews/gr/mail.htm">thought</a> to use Bayesian techniques, it in fact uses latent semantic analysis:
</p>
<blockquote cite="Tim Oren">
So what's Apple doing with latent semantics to catch spam? Not sure. The simplest approach is to use a related factor analysis technique to find the best fit to predicting spam/not-spam in a training sample; it's not a full PCA but I suppose you could call it latent semantics. It would be more interesting if they are using the full deal, maybe computing separate models for spam/not. Because, you see, latent semantics naturally lends itself to automatic sorting and organization of the document space over which the model was computed. And a few years back, the Apple group that I and then Dan Rose managed did an automatic e-mail organization project rather unfelicitously called <a href="http://www.acm.org/sigchi/bulletin/1998.2/rose.html">piles</a>, that included a user interface for just such a thing.  (IP alert:  Some of it was <a href="http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&amp;Sect2=HITOFF&amp;p=1&amp;u=/netahtml/search-bool.html&amp;r=1&amp;f=G&amp;l=50&amp;co1=AND&amp;d=ptxt&amp;s1=oren.INZZ.&amp;s2=timothy.INZZ.&amp;OS=IN/oren+AND+IN/timothy&amp;RS=IN/oren+AND+IN/timothy">patented.</a>)  Hmmm.... [<a href="http://www.pacificavc.com/blog/">Due Diligence</a>]
</blockquote>
<p>
I wonder if Apple will clarify? Meanwhile, I just picked up a weekend's worth of mail. Here's the score:
<pre>
     rightly sent to Spam folder:  237
     wrongly sent to Spam folder:    0
rightly left in original folders:   14
</pre>
In other words, a perfect performance. How long can this last? One of the good messages, from an old acquaintance, pointed out that the SpamBayes glow will inevitably fade as spammers regroup to attack it. I'm sure that's true. In the long run a multi-pronged approach seems best. Server-based gateways to keep the worst of the junk off your network, digital identity, filtering, blacklisting, whitelisting, digital postage -- all of these strategies will play important roles.
</p>
<p>
To reiterate, though, there's more going on here than spam prevention. Bringing advanced computational methods for document categorization to the desktop will create a host of new opportunities. Our personal data stores are about to become the laboratory for some really fascinating experimentation.
</p>
</body>
</item> 

<item num="a685">
<title>SpamBayes futures</title>
<date>2003/05/09</date>
<body>
<p>
<table align="right" cellpadding="6">
<tr>
<td>
<a href="http://www.infinitytrading.com/pork_belly_futures_options.htm">
<img alt="pork futures" src="http://www.infinitytrading.com/images/pork_bellies.jpg"/>
</a>
</td>
</tr>
</table>
Spam being the visceral topic that it is, yesterday's item provoked a number of responses. An email correspondent asks whether SpamBayes can deal with tricks like image-only messages, or words obfuscated with interposed characters. So far, not a problem. Part of the reason seems to be that the email headers contribute to the analysis. As Paul Graham notes, spammers:
</p>
<blockquote cite="Paul Graham">
would have to change (and keep  changing) their whole infrastructure, because otherwise the headers would look as bad to the Bayesian filters as ever, no matter what they did to the message body. [<a href="http://www.paulgraham.com/spam.html">A Plan for Spam</a>]
</blockquote>
<p>
Sam Ruby asks:
</p>
<blockquote cite="Sam Ruby">
What if we could marry
<a href="http://www.osafoundation.org/Chandler_Compelling_Vision.htm">
Chandler</a> and
<a href="http://spambayes.sourceforge.net/">SpamBayes</a> (both in
Python)... [<a href="http://www.intertwingly.net/blog/1390.html">Intertwingly</a>]
</blockquote>
<p>
Yup, that's a natural. Though the Python-ness of both may not be directly relevant. In Mark Hammond's Outlook implementation, the SpamBayes engine could as easily be a COM component or a local Web service as a Python module. It's important to SpamBayes that it's written in a flexible, dynamically-typed language. Likewise to the Outlook addin. But Python isn't, and shouldn't be, necessarily the glue between them.
</p>
<p>
Several bloggers have advanced a line of thinking that I too find fascinating, and that points to implications far beyond the world of spam:
</p>
<p>Matt Griffith:</p>
<blockquote cite="Matt Griffith">
My problem is information overload. I'm much more interested in seeing the same thing for RSS. Instead of blocking stuff I don't want I want it to highlight the stuff I might want. [<a href="http://matt.griffith.com/weblog/2003/05/08.html#a129">matt.griffith</a>]
</blockquote>
<p>Les Orchard:</p>
<blockquote cite="Les Orchard">
Ditto.  Using a Bayesian approach, or some other form of machine learning, as applied to my aggregator and my viewing patterns is something I've been wanting for awhile now. [<a href="http://www.decafbad.com/blog/tech/rssbayes_now.phtml">0xDECAFBAD</a>]
</blockquote>
<p>
In fact, a kind of RSS-Bayes is already available to users of <a href="http://www.newsgator.com/">NewsGator</a>, since you could process its messages through SpamBayes along with your email. I wouldn't, though, unless it were possible to use multiple instantiations of SpamBayes, because the ham/spam distinction in email is very different from the read/skip distinction in RSS.
</p>
<p>
The multiple-instantiation idea is potentially huge, I think. Consider just your email. I can imagine many dimensions of classification beyond spam/ham. For example: family/not-family, projectX/not-projectX. I actually go to the trouble of creating filters for some of these kinds of things, but it's arguably more trouble than it's worth. A multidimensional classifier that could notice these patterns emerging, offer to set up the foldering and filtering for me, and then reinforce the classification by observing my behavior over time -- wow, isn't that what computers were supposed to be for?
</p>
<p>
One other thought, prompted by my conversation with a PR person yesterday about mail gateways: it's true that even if I decide SpamBayes is a total success, my email administrator has a bigger problem. He'd like to keep that stuff off his disks and off his wires. And I think I see how that can happen. I'm almost, but not quite, ready to tell Outlook to delete what lands in my Spam folder, sight unseen. If I do make that choice, why not replicate my SpamBayes database up to the server? Since my local database is in constant flux -- as my disposition of messages refines it -- this would require an ongoing message flow between client and server. Sounds like a job for Web services!
</p>
</body>
</item> 

<item num="a684">
<title>SpamBayes rocks</title>
<date>2003/05/08</date>
<body>
<p>
<table align="right">
<tr>
<td>
<a href="http://www.oreilly.com/catalog/spam/index.html">
<img width="141" height="191" alt="stopping spam" src="http://weblog.infoworld.com/udell/gems/stoppingSpam.gif"/>
</a>
</td>
</tr>
</table>
<span class="minireview">SpamBayes with Outlook Addin</span>
In an upcoming InfoWorld article, which will post next Friday and appear in print the following week, I review the <a href="http://spambayes.sourceforge.net/">SpamBayes</a> filtering engine and Mark Hammond's brilliant <a href="http://starship.python.net/crew/mhammond/spambayes/">Outlook addin</a>. Thanks to this remarkable open source duo, I am ready to declare victory on spam.
</p>
<p>
I can't post the whole review yet, but neither can I resist reporting here what I think are remarkable results. I've been a skeptic when it comes to content-filtering solutions. I thought about driving down to Cambridge in January for the <a href="http://www.spamconference.org/proceedings2003.html">Spam Conference</a>, for example, but at the time felt that its narrow focus on filtering -- to the exclusion of many alternatives, including strong identity management -- was short-sighted. Now I think I'm the one who was short-sighted. 
</p>
<p>
Here are the first two paragraphs from a particularly interesting spam:
</p>
<blockquote>
<i>
<p>
I came across your web site- 'http://udell.roninhouse.com/', the
official website of &quot;Jon Udell&quot;. I found your website to be very
impressive and I went through the contents of your website which were
quite interesting. Your article on &quot;Distributed HTTP&quot; has been great! I
thoroughly enjoyed browsing through various web pages via the
interactive links (InfoWorld) given in your site. The link to 'Analysis
| XML alone won't cure Web security ills' has been very educative. I
give you all the credit for creating such an incredible site and wish
you all the best. I believe that you can enrich your prospects by having
a better hosting platform for your website. 
</p>
<p>
I represent xxxx.com a professional web hosting and a web design
provider currently servicing over 75000 customers world wide and we are
currently promoting a trial offer. I want to offer you 1 Year of web
hosting absolutely FREE OF CHARGE.  This is our attempt to project an
important juncture or probability for you to move on to a better web
host. 
</p>
</i>
</blockquote>
<p>
What's interesting is that this spammer has spidered my personal home page in order to gather vocabulary (&quot;xml,&quot; &quot;distributed,&quot; &quot;platform&quot;) typical of legitimate mail to me. This is precisely the kind of tactic that you'd think might fool a Bayesian filter, which looks for both positive and negative evidence. It did, in fact, fool SpamAssassin. 
</p>
<p>
For SpamBayes, in this case, the &quot;hammy&quot; words did help counteract the &quot;spammy&quot; words, yielding a score of only 80%. That was enough uncertainty to land the message in my MaybeSpam folder. After I declared it as spam, it rescored to 99%.
</p>
<p>
Paul Graham:
</p>
<blockquote cite="Paul Graham">
I think it's possible to stop spam, and that content-based filters are the way to do it. The Achilles heel of the spammers is their message. They can circumvent any other barrier you set up. They have so far, at least. But they have to deliver their message, whatever it is. If we can write software that recognizes their messages, there is no way they can get around that. [<a href="http://www.paulgraham.com/spam.html">Paul Graham: A Plan for Spam</a>]
</blockquote>
<p>
It's hard, at first, to see how SpamBayes can possibly work. When you look at a message from <a href="#stats">SpamBayes'</a> point of view, you see a different and far more granular approach than SpamAssassin's, which reports things like:
</p>
<p>
<tt>
PENIS_ENLARGE  (2.2 points)  BODY: Information on getting a larger penis
<div>
SUB_FREE_OFFER (0.3 points)  Subject starts with &quot;Free&quot;
</div>
<div>
US_DOLLARS (2.0 points)  BODY: Nigerian scam key phrase (million dollars)
</div>
</tt>
</p>
<p>
SpamBayes doesn't know any of these rules. It just knows what I want to see, and what I don't want to see. It knows because I show it a bunch of positive and negative examples up front, and then refine its understanding of my wishes continuously as I process my (surprisingly few) MaybeSpam messages. 
</p>
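<p>
Here's a minimal sketch of that learning step, assuming a naive whitespace tokenizer and a simple smoothing rule that pulls rarely seen tokens toward a neutral 0.5. The class name, the tokenizer, and the two constants are my own illustrative assumptions, not the SpamBayes implementation, and they aren't tuned to reproduce the numbers shown in this entry.
</p>

```python
from collections import Counter


class TokenStats:
    """Counts how many ham and spam messages each token has appeared in."""

    def __init__(self, strength=0.45, unknown_prob=0.5):
        # strength and unknown_prob are illustrative smoothing constants.
        self.strength = strength
        self.unknown_prob = unknown_prob
        self.ham_counts = Counter()
        self.spam_counts = Counter()
        self.n_ham = 0
        self.n_spam = 0

    def train(self, text, is_spam):
        # Count each distinct token once per message.
        tokens = set(text.lower().split())
        if is_spam:
            self.n_spam += 1
            self.spam_counts.update(tokens)
        else:
            self.n_ham += 1
            self.ham_counts.update(tokens)

    def spamprob(self, token):
        ham = self.ham_counts[token]
        spam = self.spam_counts[token]
        n = ham + spam
        if n == 0:
            # Never seen in training: no evidence either way.
            return self.unknown_prob
        ham_ratio = ham / max(self.n_ham, 1)
        spam_ratio = spam / max(self.n_spam, 1)
        raw = spam_ratio / (ham_ratio + spam_ratio)
        # Pull rarely seen tokens toward the neutral prior.
        return (self.strength * self.unknown_prob + n * raw) / (self.strength + n)
```

<p>
Each call to <tt>train</tt> plays the role of dragging a message into the ham or spam pile. The smoothing matters because a token seen in a single spam shouldn't be treated as conclusive evidence.
</p>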
<p>
When I sent the review to my InfoWorld colleagues, I sent it twice:
</p>
<div>
<tt>
Subject: Penis enlargement
</tt>
</div>
<div>
<tt>
Subject: SpamBayes review
</tt>
</div>
<p>
SpamAssassin jumped all over the first message. But SpamBayes knew that neither was anything to worry about. The &quot;spammy&quot; clues were strong but the &quot;hammy&quot; evidence completely overwhelmed them -- in ways that are specific to my own unique patterns of communication.
</p>
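<p>
To see how hammy evidence can outweigh spammy evidence, here's a rough sketch that combines per-token probabilities in the naive Bayesian style Paul Graham describes in A Plan for Spam. As I understand it, SpamBayes itself uses a different (chi-squared) combining scheme, so treat this as an illustration of the idea only. The function names and the folder cutoffs are assumptions I made up for the example.
</p>

```python
import math


def combine(probs):
    """Graham-style combination of per-token spam probabilities.

    Assumes every probability is strictly between 0 and 1.
    """
    # Work in log space so long messages do not underflow.
    log_spam = sum(math.log(p) for p in probs)
    log_ham = sum(math.log(1.0 - p) for p in probs)
    # Clamp the exponent so math.exp cannot overflow.
    diff = max(min(log_ham - log_spam, 700.0), -700.0)
    return 1.0 / (1.0 + math.exp(diff))


def folder_for(score, ham_cutoff=0.15, spam_cutoff=0.90):
    """Map a score to a folder; the cutoffs are illustrative assumptions."""
    if score >= spam_cutoff:
        return "Spam"
    if score >= ham_cutoff:
        return "MaybeSpam"
    return "Inbox"
```

<p>
Feed <tt>combine</tt> a couple of strongly spammy values alongside a larger number of strongly hammy ones and the score drops toward zero, which is the effect I saw with the &quot;Penis enlargement&quot; subject line.
</p>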
<p>
You get some of this effect with Mac OS X's Mail app, but it doesn't feel like a complete solution to me. SpamBayes, as implemented for Outlook by Mark Hammond, does. I asked Mark if I could send him a PayPal contribution. He said: &quot;No, it would be inappropriate for this project, as so many people smarter than I worked on the back end.&quot; Fair enough. Thanks to all of them for a job well done!
</p>
<p>
<b>Update:</b> I just got a phone call from a PR representative wanting to tell me about <a href="http://www.ironport.com/">IronPort's messaging gateway</a>, <a href="http://www.senderbase.org/">SenderBase service</a>, and <a href="http://www.bondedsender.com/">bonded sender program</a>. It's interesting stuff. We talked some about client versus server solutions, and finally she asked: &quot;So, how did my email message to you score?&quot; I went back and looked: 64%. Slightly spammy, but not over the threshold. Here were the spammiest clues:
</p>
<pre>
'truste'                     0.955709            0      5
'affiliate'                  0.963994            3     93
'spam'                       0.965056           14    424
'7000'                       0.974385            0      9
'forever!'                   0.99203             0     30
</pre>
<p>
She took notes. We were both surprised to see that the word <a href="http://www.truste.com/">TrustE</a> has so far shown up in 5 spams and no hams (until this one).
</p>
<hr/>
<a name="stats"/>
<div>
<b>An ingenious approach foiled:</b>
</div>
<pre>
Spam Score: 0.993749
word                            spamprob         #ham  #spam
'url:roninhouse'                0.0461277          81      4
'jon'                           0.0506434        1722     99
'thanks!'                       0.0882482         136     14
'xml'                           0.120668          278     41
'interesting.'                  0.127463           33      5
'to:addr:judell'                0.188539          949    238
'great!'                        0.203616           26      7
'lead'                          0.20497           173     48
'appreciate'                    0.206009           72     20
'also,'                         0.20929           172     49
'to:addr:mv.com'                0.214304          900    265
'contents'                      0.218866          103     31
'pages'                         0.273282          153     62
'article'                       0.273369          180     73
'header:Received:3'             0.283642          330    141
'platform'                      0.308406          102     49
'user'                          0.312891          169     83
'around'                        0.313592          288    142
'etc.'                          0.316824          136     68
'best.'                         0.320998           12      6
'process,'                      0.322662           43     22
'know'                          0.326371          797    417
'enjoyed'                       0.331886           17      9
'web'                           0.332325          720    387
'mention'                       0.333422           65     35
'point'                         0.333817          233    126
'&quot;distributed'                  0.340883            2      1
'hosting'                       0.344502           46     26
'udell,'                        0.345753           51     29
'there'                         0.346756          778    446
'quite'                         0.346958          103     59
'were'                          0.361252          357    218
'which'                         0.363948          801    495
'time.'                         0.365849          151     94
'represent'                     0.368909           35     22
'having'                        0.377575          226    148
'hear'                          0.379814          112     74
'noheader:reply-to'             0.381912         3028   2021
'reply-to:none'                 0.381912         3028   2021
'cure'                          0.62087             5      9
'to:no real name:2**0'          0.624296         1508   2707
'probability'                   0.625744            6     11
'alone'                         0.626741           17     31
'such'                          0.627115          301    547
'ensure'                        0.631623           35     65
'accounts'                      0.635156           35     66
'url:com'                       0.639288         1312   2512
'prospects'                     0.644734            5     10
'send'                          0.646326          355    701
'number,'                       0.64719            11     22
'online'                        0.650194          228    458
'experience'                    0.650383           83    167
'proto:http'                    0.651725         1483   2998
'link'                          0.656753          221    457
'currently'                     0.658234          109    227
'header:Return-Path:1'          0.665302         1635   3511
'official'                      0.670506           29     64
'skip:1 10'                     0.671471           95    210
'best'                          0.672719          262    582
'pleasure'                      0.675451            7     16
'further'                       0.678206           93    212
'please'                        0.682766          747   1737
'witness'                       0.685674            2      5
'immediate'                     0.695489           40     99
'name'                          0.700578          170    430
'attempt'                       0.705123           20     52
'website'                       0.707746           66    173
'simply'                        0.71687            92    252
'subject:Jon'                   0.717693           33     91
'account'                       0.720615           86    240
'incredible'                    0.723674           14     40
'thoroughly'                    0.729777            5     15
'includes'                      0.729946           80    234
'professional'                  0.741174           38    118
'interest'                      0.742465          101    315
'absolutely'                    0.744416           31     98
'header:Mime-Version:1'         0.755206          230    767
'url:udell'                     0.756494          302   1014
'mr.'                           0.758107           22     75
'trial'                         0.774293           19     71
'email'                         0.788011          345   1386
'here'                          0.790386          387   1577
'contact'                       0.792903          179    741
'dear'                          0.79496            76    319
'toll'                          0.800125            7     31
'card'                          0.803112           40    177
'offer'                         0.807203           98    444
'free'                          0.80753           224   1016
'subject:About'                 0.809947            2     10
'educative.'                    0.83645             0      1
'unaccounted'                   0.83645             0      1
'visiting,'                     0.83645             0      1
'x-mailer:ximian evolution 1.0. 0.83645             0      1
'satisfied'                     0.841567            4     24
'header:Message-Id:1'           0.844496          315   1849
'offer.'                        0.851727            9     57
'maintenance,'                  0.861673            1      8
'ordering'                      0.864587            3     22
'skip:h 30'                     0.894696            1     11
'obtained'                      0.898357            2     21
'credit'                        0.902204           29    291
'&quot;pay'                          0.902236            0      2
'supplemented'                  0.902236            0      2
'wish'                          0.905889           50    522
'assisting'                     0.909155            1     13
'check&quot;'                        0.93028             0      3
'cancel'                        0.938587            3     53
'incase'                        0.945821            0      4
'juncture'                      0.945821            0      4
'servicing'                     0.945821            0      4
'24/7'                          0.981978            0     13
'charge.'                       0.991468            0     28
</pre>
</body>
</item> 

<item num="a683">
<title>MTU mysteries and regedit links</title>
<date>2003/05/07</date>
<body>
<p>
<table align="right" cellspacing="6">
<tr>
<td>
<a title="Why doesn't this work?" href="javascript:alert('Why doesn\'t this work?')">reg:\//HKEY_ <br/>LOCAL_MACHINE/ <br/>System/ <br/>CurrentControlSet/ <br/> Services/ <br/>Tcpip/Parameters/<br/> Interfaces</a>
</td>
</tr>
</table>
A problem on my home network has been solved, but not the mystery behind it. Here's the scoop. My wife's studio is on the other end of the house from my office, where the Ethernet hub and the Linksys router live. She'd been getting great downloads, but crummy uploads. Really crummy, worse than 28.8-dialup. Here's the setup:
</p>
<pre>
+-----+              +----+   +-------+  
|WinXP|--A--90ft--B--|hub |---|Linksys|
+-----+              +----+   +-------+
</pre>
<p>
When I attach my PowerBook to point A, I see the same syndrome. When I attach my PowerBook to point B, no problem. Gotta be the wiring, right? Of course, I've lost the charger for my old Microtest Compas, a network tester that was cool in its day (TCP/IP and IPX/SPX, woo hoo). While I dilly-dallied about how to replace it, my wife (aka the shoemaker's barefoot spouse) had a couple of bad days when outbound email was not just slow, but actually stalled. 
</p>
<p>
To the rescue came <a href="http://groove.jpj.net/guerrillanetworking/">Paul Venezia</a>, who lives and breathes networks. While charging up his tester, he got to wondering about MTU (Maximum Transmission Unit) settings. The Ethernet default, 1500, wasn't causing any trouble for devices on my end of the suspect cable, but it turns out that on the other end, 1470 is the magic number. And no, this isn't a PPPoE situation, in case you're wondering. And, by the way, the cable tested fine.
</p>
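<p>
For a sense of what the change costs, here's a bit of arithmetic, assuming IPv4 and TCP headers of 20 bytes each with no options (an assumption; real traffic may carry options). It shows the largest TCP payload that fits in one packet at each MTU.
</p>

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def max_segment_size(mtu):
    """Largest TCP payload that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - TCP_HEADER


for mtu in (1500, 1470):
    # Prints "1500 1460" and then "1470 1430".
    print(mtu, max_segment_size(mtu))
```

<p>
Dropping to 1470 gives up only 30 bytes of payload per packet, so the fix itself is cheap. It doesn't explain why the larger packets fail on that run of cable.
</p>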
<p>
So problem solved, marriage saved. But mystery unsolved. Why does MTU 1500 at point B need to become MTU 1470 at point A? Well, perhaps the <a href="http://www.lazyweb.org/">Lazy Web</a> will tell us. Meanwhile, though, I can't resist putting in another plug for an idea I had long ago:
</p>
<blockquote>
<i>
Many of NT's admin tools are primarily navigators and editors of specialized information spaces. A web-enabled User Manager would be able to remember and replay paths through its space -- the directory -- using bookmarks. The tool most in need of this capability isn't User Manager, though. It's RegEdit, the registry editor. Spend a day within earshot of an NT administrator and you'll hear mantras like this chanted repeatedly: &quot;HKEY_LOCAL_MACHINE, System, CurrentControlSet, Services, W3SVC, Parameters...&quot; Once the target key is found and altered (and the machine has been rebooted), the problem may still not be solved. So the administrator must repeat the same mantra and travel the same path to the same registry key for another try. This is nuts! In web mode, you'd bookmark that page after the second or third visit to it. [<a href="http://safari.oreilly.com/?XmlId=1-56592-537-8/ch14-8891">Practical Internet Groupware, Chapter 14, Automating Internet Components</a>]
</i>
</blockquote>
<p>
I wrote those words five years ago. But here we are, still transcribing registry keys, and still chanting. In his wonderful analysis of Apple's <a href="http://tbray.org/ongoing/When/200x/2003/04/30/AppleWA">iTunes Music Store URI</a>, Tim Bray frowns on the gratuitous new itms: scheme. I agree. As Tim points out, http: plus a media type is the better way, given that what's behind that itms: URI is just XML that a browser might want to use directly.
</p>
<p>
I'm not sure that application-specific schemes are always wrong, though. For an upcoming O'Reilly Network column in which I learn how to extract, index, and search my Outlook mail, I was reminded that these protocols:
</p>
<pre>
outlook:/Personal Folders/Inbox/Infoworld
outlook:000000001a994298592a2b49af2a7f782e8f774f245b4300
</pre>
<p>
jump you directly into an Outlook folder or message. This is darned useful.
</p>
<p>
Take a look at <a href="http://www.winguides.com/registry/display.php/280/">this MTU-tweaking page</a>, which, as Paul observed, is tagged as <font color="green">popular</font>. Some sites go so far as to provide <a href="http://www.dslreports.com/front/drtcp.html">reg-tweaking utilities</a>, but why can't the pages that describe how and why to do the tweaking also jump you straight into the appropriate regedit locations?
</p>
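<p>
The handler side of such a link needn't be complicated. Here's a sketch, assuming a purely hypothetical <tt>reg://</tt> URL form (no such scheme is registered, and the function name is mine), that turns a link into the backslash-separated key path a registry editor would need.
</p>

```python
from urllib.parse import unquote

KNOWN_HIVES = {
    "HKEY_LOCAL_MACHINE",
    "HKEY_CURRENT_USER",
    "HKEY_CLASSES_ROOT",
    "HKEY_USERS",
    "HKEY_CURRENT_CONFIG",
}


def registry_path_from_url(url):
    """Translate a hypothetical reg:// URL into a registry key path."""
    prefix = "reg://"
    if not url.startswith(prefix):
        raise ValueError("not a reg: URL")
    # Split on slashes, drop empty segments, and undo percent-encoding.
    segments = [unquote(s) for s in url[len(prefix):].split("/") if s]
    if not segments or segments[0] not in KNOWN_HIVES:
        raise ValueError("unknown registry hive")
    return "\\".join(segments)


print(registry_path_from_url(
    "reg://HKEY_LOCAL_MACHINE/System/CurrentControlSet/Services/Tcpip/Parameters/Interfaces"
))
```

<p>
What's missing is the other half: a registered protocol handler that opens the registry editor at that path, which is exactly the piece I'm wishing for.
</p>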
</body>
</item> 

<item num="a682">
<title>Crazy like a fox</title>
<date>2003/05/06</date>
<body>
<p>
<table align="right" cellpadding="6">
<tr>
<td valign="center">
<img width="58" height="51" alt="yager" src="http://weblog.infoworld.com/udell/gems/yager.gif"/>
</td>
<td valign="center">
<img width="46" height="51" alt="jobs" src="http://weblog.infoworld.com/udell/gems/jobs.jpg"/>
</td>
</tr>
</table>
Astute readers may have noticed an apparent contradiction in this week's issue of InfoWorld. First this:
</p>
<blockquote cite="Tom Yager">
The technology Microsoft pulled together for Windows Server 2003 is its best effort by far, an uncannily good fit for the myriad challenges modern IT organizations face. It is, in the best sense, a total solution in a box. [<a href="http://www.infoworld.com/article/03/05/02/18TCms2003_1.html">Microsoft's platform play hits the big time</a>, Tom Yager, InfoWorld, May 2, 2003]
</blockquote>
<p>
And then this:
</p>
<blockquote cite="Tom Yager">
I am turning away from two core assumptions: that all x86 machines are destined to run Windows and that all worthwhile entry-level servers, business desktops, and serious notebooks use Intel CPUs...The two Windows servers that are presently the hub of my network -- the sacred production boxes that serve my directory, mail, Web, streaming video, database, and applications -- will soon be gone. I'm replacing them with Apple Xserve machines...
 [<a href="http://www.infoworld.com/article/03/05/02/18OPcurve_1.html">Windows doesn't live here anymore</a>, Tom Yager, InfoWorld, May 2, 2003]
</blockquote>
<p>
Well, the enterprising reporters at Crazy Apple Rumors did some digging, and here's what they found:
</p>
<blockquote cite="Crazy Apple Rumors">
When reached for comment, Yager at first denied any attempt on his part to subvert the process. Under a withering barrage of questions from Crazy Apple Rumors Site reporters, however, his story quickly collapsed.
<br/>
<br/>
&quot;Oh, jeez, I just thought for sure if I kept writing about Apple they'd can my ass! I mean, the magazine's for enterprise users, for Pete's sake! No one who reads it cares about Apple! They hate Apple!
<br/>
<br/>
&quot;What do I have to do to get fired from this place?&quot; a frustrated Yager asked, &quot;Prance around with a tangerine iBook like a little girl?&quot;
<br/>
<br/>
Yager even sports a black turtleneck a la Steve Jobs in the photograph of him that appears next to every column.
<br/>
<br/>
&quot;I'm trying to play it just short of crazy, you know?&quot; Yager said. &quot;Just enough to make them uncomfortable but not have a black mark on my resumé.&quot;
<br/>
<br/>
Asked why it was he was hoping to get fired, Yager replied &quot;I've been a technology columnist for a while now, but what I really want to do is direct.&quot; [<a href="http://www.crazyapplerumors.com/2003_05_04_archive.htm#200249990">Crazy Apple Rumors</a>]
</blockquote>
<p>
Funny stuff! We're flattered. But in all seriousness, I don't see any contradiction here. Like Tom, I'm a zealous non-zealot. I touch Windows, Linux, FreeBSD, and Mac OS X every day. My only religion is what works.
</p>
<p>
As for Apple, its current portfolio packs more of an enterprise punch than the company lets on. An eclectic wizard like Tom is just the guy you want doing a serious investigation of the possibilities. 
</p>

</body>
</item> 

<item num="a681">
<title>Computer/telephone integration: Why don't we expect more?</title>
<date>2003/05/05</date>
<body>
<p>
<table cellspacing="6" align="right">
<tr>
<td>
<a href="http://weblog.infoworld.com/udell/gems/spiderPhone.gif">
<img alt="SpiderPhone" width="200" height="100" src="http://weblog.infoworld.com/udell/gems/spiderPhone.gif"/>
</a>
<div align="center" class="realsmall">SpiderPhone</div>
</td>
</tr>
</table>
<span class="minireview">SpiderPhone</span>
I'm always on the lookout for innovative CTI (computer/telephone integration) tricks that work with our existing hybrid infrastructure. Sure, VoIP's right around the corner, but it's been right around the corner for a long time. Meanwhile, the humble business conference call remains a comedy of errors. One solution that recently impressed me is <a href="http://www.spiderphone.com/">SpiderPhone</a>. 
</p>
<p>
The screenshot depicts a three-way call. The connection between the voice call and this Web application is the really cute trick. 
<table cellspacing="6" align="right">
<tr>
<td>
<a href="http://weblog.infoworld.com/udell/gems/spiderPhoneJoin.gif">
<img alt="SpiderPhone: Joining a call" width="200" height="100" src="http://weblog.infoworld.com/udell/gems/spiderPhoneJoin.gif"/>
</a>
<div align="center" class="realsmall">Joining a call</div>
</td>
</tr>
</table>
I'm the only one of the three participants so connected, and I lied about my name to see if the system would notice (it didn't), but here's how it works. Once you're dialed into a call, you visit the Website and click the <a href="https://www.spiderphone.com/Conference/vwWhoAreYou.asp">Join a call</a> link. 
After you declare your identity for the call, the application generates a 4-digit number and invites you to press * on your phone and enter the number. Now the Web app and the phone call are connected. Slick!
</p>
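<p>
I don't know how SpiderPhone implements this, but the pairing trick can be sketched in a few lines, assuming the Web app and the conference bridge share some state. The class and method names here are hypothetical.
</p>

```python
import secrets


class CallPairing:
    """Links a Web session to a phone line via a short code keyed in on the phone."""

    def __init__(self):
        self.pending = {}  # code -> Web session id
        self.linked = {}   # phone line id -> Web session id

    def issue_code(self, session_id):
        # Generate a 4-digit code that is not already awaiting entry.
        # (Loops forever if all 10000 codes are pending; fine for a sketch.)
        while True:
            code = "{:04d}".format(secrets.randbelow(10000))
            if code not in self.pending:
                self.pending[code] = session_id
                return code

    def enter_code(self, line_id, digits):
        # Called when a caller presses * and keys in the digits.
        session_id = self.pending.pop(digits, None)
        if session_id is None:
            return False
        self.linked[line_id] = session_id
        return True
```

<p>
The code is single-use: once a caller keys it in, it's removed from the pending set, so a second caller can't attach to the same Web session with it.
</p>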
<p>
Monitoring the call in your browser, you can see the names (or numbers) of the speakers flickering in the &quot;Talker:&quot; field. Basic screensharing is available, and Web participants can whisper (&quot;Psst&quot;) to one another in a chat window -- though I didn't get to try these features, being the only Web participant on the call. As the first screen shows, I did switch recording on and off, then on again. I record a lot of telephone interviews, so this feature really piqued my interest. As it turns out, though, you can't just download your recording. You pay dial-in charges to listen by phone ($0.19/min for the 212 number, $0.24/min for the toll-free number) or through the Web ($0.10/min). So I'll stick with my local phone tap for now. 
</p>
<p>
Still, SpiderPhone is a really clever piece of work. I love it when the phone network and the computer network cooperate in unexpected ways. Of course, the fact that we don't generally expect this kind of thing is puzzling. You'd think the absence of points of contact between our two major instruments of communication would raise more eyebrows than it does.
</p>
</body>
</item> 

<item num="a680">
<title>Enterprise buses and dirt roads</title>
<date>2003/05/03</date>
<body>
<p>
<table cellspacing="6" align="right">
<tr>
<td>
<a href="http://www.gonecamping.net">
<img src="http://www.gonecamping.net/images/dirt-road.jpg"/>
</a>
</td>
</tr>
</table>
<blockquote>
<i>
As vendors begin to identify themselves with SOA, I hope they won't apologize for the dirt road or demand that we pave it. SOAP traffic flowing over the Web and through e-mail isn't a bad thing. We already know how to proxy this stuff. We're about to discover a whole new set of reasons to do it. [Full story at <a href="http://www.infoworld.com/article/03/05/02/18OPstrategic_1.html">InfoWorld.com</a>]
</i>
</blockquote>
</p>
<p>
The &quot;dirt road&quot; metaphor is courtesy of <a href="http://www.capeclear.com/clear_thinking3.shtml">Annrai O'Toole</a>. It resonates for me in a couple of different ways. This column talks about how our &quot;dirt road&quot; protocols, SMTP and HTTP, are routable, cacheable, and proxy-able in ways that we've yet to fully exploit. 
</p>
<p>
The idea of dirt roads also evokes, for me, Larry Wall's famous anecdote about the University of California's approach to designing walkways. At the Irvine campus, according to Larry, planners just sowed grass everywhere and let the paths that emerged define where to put the sidewalks. I wonder about this a lot, lately, when thinking about the differences between LAMP (Linux/Apache/MySQL/Perl|Python|PHP) and .NET/COM+ or J2EE/EJB. Where's the inflection point between these two styles? When you harden an architecture for robust transactions, how do you preserve the fluidity that the agile enterprise requires?
</p>
<p>
As <a href="http://www.gotdotnet.com/team/dbox/default.aspx#nn2003-05-03T11:25:53Z">Don Box</a> points out, <a href="http://www.artima.com/weblogs/index.jsp?blogger=unclebob">Bob Martin</a>, <a href="http://www.artima.com/weblogs/index.jsp?blogger=ward">Ward Cunningham</a>, and <a href="http://www.artima.com/weblogs/index.jsp?blogger=guido">Guido van Rossum</a> have recently joined the blog conversation. Ward, the other day, wrote:
</p>
<blockquote cite="Ward Cunningham">
Two wiki sites are sisters if they join their namespace in such a way that happy collisions occur. [<a href="http://www.artima.com/weblogs/viewpost.jsp?thread=4615">Ward Cunningham's Weblog</a>]
</blockquote>
<p>
I hope to see some happy collisions between the LAMP and .NET/J2EE perspectives.
</p>
</body>
</item> 

<item num="a679">
<title>Tablet vs tablet</title>
<date>2003/05/02</date>
<body>
<p>
<table align="right">
<tr>
<td>
<a href="http://www.jamesshuggins.com/h/bas1/hugginisms.htm">
<img src="http://www.jamesshuggins.com/i/bas1/legal_pad_and_pen.jpg"/>
</a>
</td>
</tr>
<tr>
<td>
<a href="http://www.spiegel.de/netzwelt/netzkultur/0,1518,grossbild-222079-221948,00.html">
<img width="150" height="200" src="http://www.spiegel.de/img/0,1020,222079,00.jpg"/>
</a>
</td>
</tr>
</table>
Here's a fascinating juxtaposition. Rob Howard, ASP.NET Program Manager for Microsoft, offered a glimpse of Bill Gates at work in an internal design review:
</p>
<blockquote cite="Rob Howard">
The first thing I notice as the meeting starts is that Bill is left-handed. He also didn't bring a computer in with him, but instead is taking notes on a yellow pad of paper. [<a href="http://dotnetweblogs.com/rhoward/posts/6128.aspx">Rob Howard's blog</a>]
</blockquote>
<p>
The next day, Gates delivered a speech in which he said, among other things:
</p>
<blockquote cite="Bill Gates">
The PC will have different form factors. One that we'll highlight a lot today is this one called the Tablet form factor...You just simply take this pen here, and you can take notes on the surface. And so if you see an article that's interesting you simply write an annotation and share that with your friends. [<a href="http://www.microsoft.com/billgates/speeches/2003/04-29naa.asp">Bill Gates' Web Site</a>]
</blockquote>
<p>
Common sense would have told me that a Tablet PC running OneNote isn't ready to be Gates' (or my) information-gathering tool of choice. So I'm not surprised to find out that Microsoft's top dog isn't yet eating this particular flavor of dogfood. But consider how Steve Jobs <i>was</i> eating the dogfood this year at MacWorld, delivering his keynote using <a href="http://www.apple.com/keynote/">Keynote</a>, an app built to be his dream presentation tool. Howard's glimpse of Gates scribbling mission-critical notes on a legal pad tells us something about how badly Gates must want to be able to dogfood the Tablet. Imagine the pressure when the design review is focused on the Tablet!
</p>
<p>
It's very cool that these human details are starting to emerge. 
</p>
</body>
</item> 

<item num="a678">
<title>How not to contact me</title>
<date>2003/05/02</date>
<body>
<p>
<img border="1" width="514" height="621" alt="How not to contact me" src="http://weblog.infoworld.com/udell/gems/howNotToContactMe.gif"/>
</p>
</body>
</item> 

<item num="a677">
<title>Revisiting the Virtual Press Room</title>
<date>2003/05/01</date>
<body>
<p>
<table align="right" cellspacing="6">
<tr>
<td>
<a href="http://www.philwainewright.com/about/bio.htm">
<img src="http://www.philwainewright.com/img/philw.jpg"/>
</a>
<div align="center" class="realsmall">Phil Wainewright</div>
</td>
</tr>
</table>
I've just subscribed to Phil Wainewright's <a href="http://www.looselycoupled.com/news/releases.html">archive of press releases</a> at <a href="http://www.looselycoupled.com/">looselycoupled.com</a>. (PR folk take note: I <i>subscribed voluntarily to this feed</i>.) An analyst and writer focused on Web services, Phil has built an application that publicists can use to post their press releases to his website, which in turn flows them out as an <a href="http://www.looselycoupled.com/news/releases.rss">RSS feed</a>. 
</p>
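<p>
The form-to-feed step is simple in principle. Here's a minimal sketch, assuming the form collects a title, a link, a body, and a category; those field names are my guesses for illustration, not Phil's actual form or code.
</p>

```python
import xml.etree.ElementTree as ET
from email.utils import formatdate


def release_to_rss_item(title, link, body, category, timestamp=None):
    """Build an RSS 2.0 item element from submitted press-release fields."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    ET.SubElement(item, "description").text = body
    ET.SubElement(item, "category").text = category
    # RSS 2.0 dates use the RFC 822 format; a timestamp of None means now.
    ET.SubElement(item, "pubDate").text = formatdate(timestamp, usegmt=True)
    return item


item = release_to_rss_item(
    "Example release",
    "http://example.com/releases/1",
    "Pasted press release text goes here.",
    "web services",
)
print(ET.tostring(item, encoding="unicode"))
```

<p>
Building the item with an XML library rather than string concatenation means that whatever a publicist pastes into the form gets escaped properly on the way out.
</p>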
<p>
Boy, does this ever bring back memories! I built a similar application, called the Virtual Press Room, for BYTE.com in 1995, and wrote a <a href="http://www.byte.com/art/9512/sec9/art1.htm">column</a> about it. (There's still a Web page that gamely tries to <a href="http://www.asianet.net/engin588.html">search the VPR</a>, on a server that CMP recycled five years ago.) Back then, it was cutting-edge to enable folks to cut and paste from their Microsoft Word press releases into a Web form, and then to publish that content for them. Nowadays, you'd think that would be passé. Why not just offer RSS feeds? But old habits are hard to break. As Phil <a href="http://www.looselycoupled.com/blog/2003_04_20_lc.htm#200181105">discovered</a>, there aren't yet many publicists doing RSS feeds. There are still quite a few, though, who are willing to do the cut-and-paste. 
</p>
<p>
Phil writes:
</p>
<blockquote cite="Phil Wainewright">
This is all a bit of a departure for a content site, which according to conventional wisdom ought to be trying to suck in as much traffic as possible to its own pages. But I beg to differ. I think it's wrong to think of a website as a static destination. Better to think of it as a delivery hub, the point from which you disseminate information and services over the network on demand. All of this is part of a philosophy that I like to call &quot;content-as-a-service&quot;. The Loosely Coupled website will be elaborating on that philosophy in the coming weeks and months, and in the meantime we welcome your comments and feedback on our services as they evolve. 
</blockquote>
<p>
+1, Phil! I particularly applaud the way your form invites contributors to categorize their entries in ways that make those entries more broadly useful. Press releases are a really interesting source of information, provided that you contextualize them. When you stop to think about it, the whole history of our industry is written on a series of press releases (and also, of course, on a bunch of T-shirts, coffee mugs, and mousepads). Why wouldn't we want to manage this data?
</p>
<p>
This click-to-enlarge screenshot shows how I used to do it on BYTE.com, circa 1997:
</p>
<table align="center" cellspacing="6">
<tr>
<td>
<a href="http://udell.roninhouse.com/bytecols/byte-search-results.gif">
<img border="1" width="300" height="250" alt="BYTE.com search results circa 1997" src="http://udell.roninhouse.com/bytecols/byte-search-results.gif"/>
</a>
<div class="realsmall" align="center">Vendor information in context</div>
</td>
</tr>
</table>
<p>
And here's the <a href="http://udell.roninhouse.com/bytecols/2000-09-06.html">column</a> that explains how. Editorial content was clearly labeled as such. Ditto for user-contributed content (our newsgroups) and vendor-contributed content (press releases). Although it's slightly depressing to see that Phil was driven to the same data-capture solution that I resorted to 8 years ago, I'm nevertheless hopeful that we are, finally, on the cusp of change. 
</p> 
<p>
Sam Ruby today <a href="http://www.intertwingly.net/blog/1371.html">cites</a> this <a href="http://www.wirelessnewsfactor.com/perl/story/21389.html">eCommerceTimes story</a> by Tiernan Ray:
</p>
<blockquote cite="Tiernan Ray">
In effect, the XML standard for structured Web data could be used as a uniform way to transform each tool's blog into another's, in order to hand off control. Not only would this avoid a knowledge disaster in the long term, but it would encourage blog sharing and collaboration in the near term. 
</blockquote>
<p>
Yes. Now, think about the tool used to write the press releases that I used to collect, and that Phil is now collecting. It's Microsoft Word. Eventually, people will figure out that it's easier to save an RSS item directly from Word 2003 (or, if you prefer, from OpenOffice or another XML-savvy tool) than it is to do the cut-and-paste. 
</p>
<p>
That presumes, of course, that it <i>is</i> easier to do it. Don Box assures us that it will be:
</p>
<blockquote cite="Don Box">
I just got the WordML-&gt;RSS20+XHTML transform to work. As much as people bitch about how hideous WordML is, it's considerably easier to handle than XHTML + CSS, as the latter is not XML markup. [<a href="http://www.gotdotnet.com/team/dbox/default.aspx?key=2003-05-01T04:53:37Z">Don Box's spoutlet</a>]
</blockquote>
<p>
Cool! How about sharing? My transforms from beta 1 aren't working in beta 2, and I could use a jumpstart. So could the thousands of other people who have yet to be shown a compelling reason to use the XML features of Word 2003. 
</p>
</body>
</item> 

<item num="a676">
<title>It's a magazine!</title>
<date>2003/04/30</date>
<body>
<p>
<table align="right">
<tr>
<td>
<img alt="It's a magazine!" width="130" height="173" src="http://weblog.infoworld.com/udell/gems/infoworldCover.jpg"/>
</td>
</tr>
</table>
My copy of the new InfoWorld has arrived. It's a magazine! Cool! And I'm surrounded by old friends and new friends. In the old friends category, <a href="http://www.google.com/search?hl=en&amp;ie=UTF-8&amp;oe=UTF-8&amp;q=rick+grehan">Rick Grehan</a> makes the first of what I hope will be many appearances, reviewing <a href="http://www.infoworld.com/article/03/04/25/17TCweblogic_1.html?s=tc">WebLogic Workshop 8.1</a>. Some of you will fondly remember Rick's <i>Some Assembly Required</i> column in BYTE. It predated the BYTE Web archive, but you can still read <a href="http://www.byte.com/art/9406/sec11/art4.htm">later</a> <a href="http://www.byte.com/art/9412/sec13/art3.htm">incarnations</a> of the column. As this week's BEA review shows, Rick hasn't lost his deft touch.
</p>
<p>
I'm also thrilled to see <a href="http://weblog.infoworld.com/yager/">Tom Yager</a> ensconced on the <a href="http://www.infoworld.com/article/03/04/25/17OPcurve_1.html">back page</a>. I count on his analysis of things like <a href="http://www.infoworld.com/article/03/04/18/15centrino_1.html">Centrino</a> and <a href="http://www.infoworld.com/article/03/03/14/11opteron_1.html">Opteron</a> -- subjects I'm interested in, but don't have time to dig into.
</p>
<p>
Not appearing in this week's issue, but joining Rick on the <a href="http://www.infoworld.com/advertise/adv_edt_bet.html">masthead</a>, are two new friends: <a href="http://www.windley.com/">Phil Windley</a> and <a href="http://groove.jpj.net/guerrillanetworking/">Paul Venezia</a>. One of the great pleasures of my life is launching new writers. You'll be seeing more great stuff from these guys, I'm sure.
</p>
</body>
</item> 

<item num="a675">
<title>Blogs and InfoWorld</title>
<date>2003/04/29</date>
<body>
<p>
When a writer for IDG's newsletter asked me some questions about how weblogs relate to InfoWorld's mission, I realized I might as well use the medium self-reflexively. I'm more often an interviewer than an interviewee. But like a lot of other distinctions, that one has lately begun to blur. By posting a draft of my answers to his questions, I hope to demonstrate -- not just describe -- the process of public writing and cross-blog commentary. I expect this writer will wind up with a deeper and richer story as a result. That, of course, is exactly why I think blogs matter to InfoWorld.
</p>
<hr align="center" width="25%"/>
<p>
<b>Q</b>: For the uninitiated, what are blogs? (In layman's terms, please.)
</p>
<p>
<b>A</b>: Blogs are public Web journals. 'Public' can mean a few different things, though. Usually blogs are world-visible. But they can also be company-visible, or department- or workgroup-visible. 
</p>
<p>
<b>Q</b>: Why are they so important to InfoWorld now?
</p>
<p>
<b>A</b>: 
Here's an example. A couple of months ago, I wrote an item on my blog in response to an <a href="http://scriptingnews.userland.com/backissues/2003/02/24#When:7:50:30AM">item on Scripting News</a> about the Windows &quot;blue screen of death.&quot; I used my posting to broaden the discussion, pointing out that device drivers are problematic in every operating system. In passing, I mentioned that Windows Server 2003 moves some HTTP networking code into the kernel, and wondered about the performance/stability tradeoff.
</p>
<p>
An hour later, I happened to check my <a href="http://www.technorati.com/cosmos/links.html?url=http%3A%2F%2Fweblog.infoworld.com%2Fudell">Technorati cosmos</a> -- that's a website (one of several) that keeps track of blogs that cite other blogs. In response to my posting, a Microsoft developer named Ari Pernick (whom I'd never heard of) posted <a href="http://radio.weblogs.com/0100529/2003/02/25.html#a1506">an item on his blog</a> that said, in part, &quot;Well, it's a scary change, but hopefully appropriate.&quot; Ari went on to explain more than I'd seen or heard previously about this potentially controversial feature of Windows Server 2003. In so doing, he helped to defuse controversy.
</p>
<p>
Stop and think about this for a moment. Ari's not one of Microsoft's stars, like <a href="http://www.gotdotnet.com/team/dbox/default.aspx">Don Box</a> or <a href="http://www.simplegeek.com/">Chris Anderson</a>, both of whom have lately become fascinated with blogs. He's a footsoldier in Redmond's army of developers. And yet he felt empowered to clarify how and why part of the HTTP stack was pushed down into the Windows kernel.
</p>
<p>
Some in the journalistic, vendor, and user communities think that this kind of open, cross-blog conversation is the way of the future. If that's true, it'll change a lot of things -- including the nature of tech journalism.
</p>
<p>
A conspiracy theorist, noticing that Ari's blog went silent a few days after that extraordinary posting, might conclude that he'd been shut down. I'm not a conspiracy theorist, however. I see lots of other blogs cooking at Microsoft. Maybe Ari decided he's not cut out for public journalling, or just got too busy.
</p>
<p>
Maybe there's a real change underway, maybe not. And maybe that change will broadly transform the way information flows in the vendor/customer/journalist ecosystem. Time will tell. I certainly hope that we're moving in the direction of more openness and transparency. That can't and shouldn't mean, of course, that people inside vendor or user organizations will divulge secrets on public blogs. It can and should mean they'll present real faces, and speak in real voices about issues of interest to colleagues, partners, and customers. Think of an organization as a single-celled animal. Blogs increase the surface area of the cell, help nutrients flow across its membrane, and promote multicellular cooperation.
</p>
<p>
You haven't asked why people write blogs. At first glance, the whole thing looks like a gigantic vanity press, and indeed there are elements of that. But every serious professional blog has an agenda. Reasons to invest time and effort in writing a blog can include:
</p>
<ul>
<p>To promote yourself, your company, or (typically) both at the same time.</p>
<p>To influence the thinking of people inside and outside your organization.</p>
<p>To communicate directly with customers.</p>
<p>To advertise aspects of your internal process that are not proprietary, and that can benefit from the collaborative energy that a blog can attract.</p>
</ul>
<p>
The blog network is a kind of engine for processing all of these agendas. Think about how science is driven by publication and citation indexing. Blogs, and the aggregators that track them, make publication and citation indexing a realtime 24x7 process. The blog universe is a literal marketplace of ideas, an economy whose currency is the hyperlink.
</p>
<p>
<b>Q</b>: How do they work?
</p>
<p>
<b>A</b>: Technically, blogs are dead simple: static Web pages, with diary entries ordered newest to oldest. In a pinch, you could maintain one with nothing fancier than a text editor and an FTP program. There's a bit more to RSS (Rich Site Summary), the XML format that's the basis of a blog syndication network, but in the end it's dead simple too. The real innovations are cultural. In other modes of electronic discourse -- Usenet newsgroups, Web forums -- you join a shared public space and take turns speaking in that space. Blogs work very differently. Your blog is your own personal space, an extension of yourself that you project with pride, and control with care. You write about things that matter to you, optionally referring to other blogs and acknowledging other blogs' referrals to you. These referrals and acknowledgments are driven by the other cultural novelty of blogging: the use of RSS newsreaders to selectively tune into the &quot;channels&quot; broadcast by other bloggers. 
</p>
<p>
<b>Q</b>: How are they integrated into InfoWorld's site?
</p>
<p>
<b>A</b>: The site points to InfoWorld-written blogs, and vice versa. And all of InfoWorld's print content flows out through RSS feeds advertised on the home page. But there should be more to the story, as I'll explain below.
</p>
<p>
<b>Q</b>: Kevin McKean said that blogs can be a distant early warning of something before it breaks in a conventional news story. Can you explain that?
</p>
<p>
<b>A</b>: I think that blogs can sometimes scoop conventional news stories, but can also support and deepen them. It depends on the nature of the story. Three of the five W's -- who, where, and when -- are becoming commodities exchanged at light speed on the RSS network. But the remaining two -- what and why -- require synthesis and analysis. Journalists who read and write blogs will find themselves better connected and more able to do that synthesis and analysis effectively.
</p>
<p>
<b>Q</b>: Do InfoWorld writers maintain blogs or do they monitor other people's blogs or both?
</p>
<p>
<b>A</b>: The balance between original writing and commentary is a matter of individual style.
</p>
<p>
<b>Q</b>: What does it mean to be a blog-friendly IT site?
</p>
<p>
<b>A</b>: More than just converting columnists into bloggers. It's a two-way street. InfoWorld's regular news, reviews, and features -- as well as our blogs -- are widely read and commented upon. Increasingly those comments appear on readers' weblogs, where <a href="http://www.technorati.com/cosmos/links.html?rank=&amp;url=infoworld.com&amp;sub=Get+Link+Cosmos">Technorati</a>, <a href="http://www.popdex.com/search/?query=www.infoworld.com">Popdex</a>, other aggregators, and our own referral logs can track them. <a href="http://weblog.infoworld.com/dickerson/">Chad Dickerson</a> signed us up for a Technorati watchlist last week, and was blown away by the amount and quality of feedback reflected there. I'd like to see InfoWorld.com weave those perspectives into its presentation. Writing for the Columbia Journalism Review, Dan Gillmor <a href="http://www.cjr.org/year/03/1/gillmor.asp">says</a> that journalism should &quot;help the former audience become part of the process.&quot; I violently agree, and have in fact worked that way since about 1996. That's when I realized how the newsgroups I ran at BYTE.com could enrich the editorial process of BYTE magazine. What was then the exception is now -- I hope -- going to become the rule. 
</p>
</body>
</item> 

<item num="a674">
<title>Don't segment desktop XML</title>
<date>2003/04/28</date>
<body>
<p>
<table align="right">
<tr>
<td>
<img alt="jon's infoworld photo" src="http://images.infoworld.com/img/img_hdshot_82x74_Jon.gif"/>
<div align="center" class="realsmall">Ouch :-)</div>
</td>
</tr>
</table>
<blockquote>
<i>
The future of XML on the desktop is far from certain. Now is not the time to segment a market that has only just begun to grow. I hope Microsoft will reconsider. And I trust that the <a href="http://www.openoffice.org/">competition</a> is paying attention. [<a href="http://www.infoworld.com/article/03/04/25/17OPstrategic_1.html">InfoWorld.com</a>]
</i>
</blockquote>
</p>
<p>
This is the 30th installment of my Strategic Developer column, but the first to appear in InfoWorld's print edition, relaunched today in a spiffy new magazine format. What I like most about the new format is that we carve out a feature well in which to run thematic collections of investigative and analytical pieces. I always enjoyed doing that at BYTE. Doing it weekly rather than monthly is going to be a challenge!
</p>
<p>
Accompanying my column in the magazine is the photo you see here. During the photo shoot, I glanced at the Polaroid preview shot and noticed that the top of my head had been deleted, as is customary for columnist headshots. For some reason, I wondered aloud when this particular design had come into fashion. After all, head shots haven't always looked this way (have they?) -- I guess at some point another style will prevail. The art director gave some reasons, in artspeak which I can't precisely recall. But here's the thing: the photographer scalped me in the camera, leaving the art director no option not to crop, or to crop differently. And this was simply taken for granted. There was no discussion between them. There was no conscious decision. 
</p>
<p>
Of course I couldn't design my way out of a paper bag, so I'm grateful to work with people who can. Still, the incident made me think about the role of habit and custom in professional work. It's a good thing to be able to internalize standards. Conscious awareness of the rationales underlying them would cripple our ability to work. But unconscious habits can become dangerous too. At what point do we need to surface and evaluate them? And how do we do that? I have a hunch that cross-disciplinary blogging will help. InfoWorld's art director probably won't read this item, but if he did, he might find it useful. It's precisely because I am not a designer that I can call attention to things that designers take for granted. Conversely, he might be able to call attention to habits in my work that I'm not aware of.
</p>
<p>
One habit that's going to be tough to break, as I move the column into print, is my addiction to Web-style writing. The first 30 installments of my column went out as an email newsletter with footnoted URLs, which became hyperlinks in the version that was posted to the Web. Space is tight in the print edition, though. I have to boil the columns down from a variable length (often 1000 words or more) to about half that. Plus, long links play havoc with print layouts. Fortunately, the layered strategy I've developed -- pointing to my InfoWorld print articles, and embellishing them here on the blog -- will help me bridge the two formats. 
</p>
<p>
There's an old adage: &quot;I am sorry to write such a long letter, I didn't have time to write a short one.&quot; (I can't find an authoritative source for this. It seems to be variously attributed to Kipling, Twain, Emerson, Voltaire, Proust, Pliny the Younger...) For me, though, density of writing is roughly constant, and quantity of output is linearly proportional to time. What will take more time is the refactoring. That's an engineering term for what is, to me, an engineering problem. I have a hunch that the &quot;desktop XML&quot; which is the subject of this week's column will play a role in the solution. It should be fun! 
</p>
<hr align="left" width="20%"/>
<p>
<b>Update: </b> I'm told the photo in the magazine isn't the cropped head shot after all, but rather a full-length pose. Cool! Can't wait to see the new format, which should arrive here tomorrow.
</p>
<p>
<b>Further update: </b> According to Greg Wilson, the quotation is: <i>&quot;I have made this [letter] longer, because I have not had the time to make it shorter.&quot; Blaise Pascal, &quot;Lettres provinciales&quot;, letter 16, 1657.</i> Bingo. That's it. Thanks Greg!
</p>
</body>
</item> 

<item num="a673">
<title>The global advantage</title>
<date>2003/04/27</date>	
<body>
<p>
<table align="right" class="illustration">
<tr>
<td>
<a href="http://www.fourmilab.ch/cgi-bin/uncgi/Earth?imgsize=200&amp;opt=-l&amp;lat=18.25&amp;ns=North&amp;lon=22.9583&amp;ew=West&amp;alt=150591093&amp;daynight=-d&amp;img=learth.evif">
<img width="200" height="200" alt="fourmilab earth viewer" src="http://weblog.infoworld.com/udell/gems/earth.jpg"/>
</a>
</td>
</tr>
</table>
<blockquote>
<i>
With U.S. enterprises increasingly looking to offshore talent to reduce costs, the American programmer has become, in bottom-line speak, a fungible asset. As the globalization of software development unfolds all around us, it's clear that dollars-per-line-of-code is but one of the equation's variables. Other factors influencing this view include time to market, the speed with which project teams and resources can be assembled, and the rate at which tools and techniques can be transferred between offshore outfits and U.S. and European companies. [Full story at <a href="http://www.infoworld.com/article/03/04/18/16dyndev_1.html?s=feature">InfoWorld.com</a>]
</i>
</blockquote>
</p>
<p>
One of the common threads running through this story and the related <a href="http://www.infoworld.com/article/03/04/04/14stratdev_1.html">conversation with Brian Behlendorf</a> is social software. A much-hyped trend at the moment, it's a critical enabler for the globalization of all kinds of intellectual labor -- including software development. 
</p>
<p>
In today's NY Times, there's a brief item entitled <i>Turning to Friends for Facts</i>. It cites a University of Washington study showing that people learn from friends and associates more than they learn from the Internet. Of course, the social software movement is rapidly erasing the difference between these two modes. Companies should encourage social interaction, one of the authors says, by providing free cafeteria meals. That's a good idea. Companies should also think hard about how to foster collegiality among distributed teams working around the clock in Europe, Asia, and North America. 
</p>
<p>
Paul Venezia, a <a href="http://www.infoworld.com/article/03/04/04/14okena_1.html">new InfoWorld freelancer</a> who's one of the few alpha geeks I get to spend face-time with (he lives in Keene, NH, too), has been musing on his blog about the <a href="http://groove.jpj.net/guerrillanetworking/archives/2003_04.html#000041">slow uptake of videoconferencing</a>. We've yet to make using this technology seem routine and casual, but the nature of distributed work may soon force the issue.
</p>
</body>
</item> 

<item num="a672">
<title>Item 500</title>
<date>2003/04/22</date>
<body>
<p>
<table align="right">
<tr>
<td>
<div style="font-size: 50px; font-weight: bold">500</div>
</td>
</tr>
</table>
Amazingly, this is the 500th item I've posted since I began this blog. In order to recharge for the next 500, I'm taking a bit of time off. Meanwhile, feel free to revisit the <a href="http://weblog.infoworld.com/udell/stories/2002/03/16/storylist.html">other 499 items</a>. You might also want to keep an eye on Tom Yager, whose blog I <a href="http://weblog.infoworld.com/udell/2002/09/04.html#a397">introduced</a> quite a while ago. He has <a href="http://weblog.infoworld.com/yager/">reappeared</a> with some thoughts on AMD's Opteron launch. And this time, I've got a feeling he's going to stick around.
</p>
</body>
</item> 

<item num="a671">
<title>Do the simplest thing that could possibly work</title>
<date>2003/04/18</date>
<body>
<p>
I snuck a little experiment into yesterday's posting. It contains a paragraph that's coded like so:
<pre class="code" lang="html">
&lt;p class=&quot;tip&quot;&gt;
Here's a tip, by the way. When you're hacking around with your 
Radio feeds, turn upstreaming off. Otherwise you'll torment your 
subscribers. 
&lt;/p&gt;
</pre>
</p>
<p>
To the writer and to the reader, this &quot;tip&quot; thingy is a stylistic element. As such, it can acquire a distinctive appearance by way of a CSS binding:
</p>
<pre class="code" lang="css">
 p.tip 
  {
  margin-left: 50px;
  margin-right: 50px;
  font-size: 80%;
  }
 p.tip:before
  {
  font-weight: bold;
  content: &quot;TIP: &quot;;
  }
</pre>
<p>
Of course, CSS implementations being what they still are, this yields differing results:
</p>
<p align="center">
<img border="1" src="http://weblog.infoworld.com/udell/gems/tipSafari.jpg"/>
<div align="center" class="realsmall">Safari / Mac OS X</div>
</p>
<p align="center">
<img border="1" src="http://weblog.infoworld.com/udell/gems/tipMozillaMac.jpg"/>
<div align="center" class="realsmall">Mozilla / Mac OS X</div>
</p>
<p align="center">
<img border="1" src="http://weblog.infoworld.com/udell/gems/tipMsieWin.JPG"/>
<div align="center" class="realsmall">MSIE 6 / Windows</div>
</p>
<p>
I'd hoped to do something a bit more visual, by the way:
</p>
<pre class="code" lang="css">
 p.tip:before
  {
  font-size: 250%;
  float: left;
  margin-right: 10px;
  content: &quot;☛&quot;;
  }
</pre>
<p>
But although I can see the Unicode character U+261B (☛ BLACK RIGHT POINTING INDEX) on my Mac, it doesn't seem to render on Windows. (See <a href="http://www.alanwood.net/unicode/miscellaneous_symbols.html">Alan Wood's Unicode pages</a> if you want to check out what your browser can display.) And even on the Mac, under Safari, Mozilla, and MSIE, the character doesn't seem to cooperate with the CSS :before pseudo-element. 
</p>
<p>
Oh well. The point, in any case, is that I can associate a CSS style with this tip thingy. The label, the font change, and the wide margins conspire to set the element apart from the main flow of the text. As a writer I convey, and as a reader you intuitively understand, that it is a categorized chunk of content. This is immediately and narrowly useful.
</p>
<p>
It is also useful over a longer term, in a broader context. Assume that you've gathered a bunch of RSS feeds carrying xhtml:body elements, or that an aggregator has done so for you. The same <tt>class=&quot;tip&quot;</tt> that CSS uses to style the element can be used to form queries like the ones I demonstrated in my <a href="http://webservices.xml.com/pub/a/ws/2003/04/15/semanticblog.html">Semantic Blog</a> article. In this case, such queries might ask:
</p>
<blockquote>
<p>
<i>
Which of the feeds contains tips?
</i>
</p>
<p>
<i>
Which of the items in Jon's feed contains tips?
</i>
</p>
<p>
<i>
Which items contain tips about Radio upstreaming?
</i>
</p>
</blockquote>
<p>
This approach doesn't rely on anything fancy. Just CSS, RSS, XHTML, XPath. Why, I've been asked a few times lately, am I not advocating <a href="http://www.w3.org/RDF/">RDF</a>? As the extreme programming folks like to say: <a href="http://c2.com/cgi/wiki?DoTheSimplestThingThatCouldPossiblyWork">Do the simplest thing that could possibly work</a>. 
</p>
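<p>
To make that concrete, here is a minimal sketch of the three queries above, using the small XPath subset in Python's standard ElementTree module. The helper names and the two-item sample feed are invented for illustration, and the match is on the exact attribute value, so an element marked <tt>class=&quot;tip big&quot;</tt> would be missed.
</p>
<pre class="code" lang="python">
import xml.etree.ElementTree as ET

XHTML = 'http://www.w3.org/1999/xhtml'
NS = {'xhtml': XHTML}
TIP_PATH = ".//xhtml:p[@class='tip']"

def tips(item):
    '''Return the text of every tip paragraph inside one RSS item.'''
    return [''.join(p.itertext()).strip() for p in item.iterfind(TIP_PATH, NS)]

def items_with_tips(rss, about=None):
    '''Return titles of items that contain tips, optionally tips mentioning a word.'''
    hits = []
    for item in rss.iter('item'):
        texts = tips(item)
        if about is not None:
            texts = [t for t in texts if about in t]
        if texts:
            hits.append(item.findtext('title'))
    return hits

def feed_has_tips(rss):
    '''Answer the first query: does this feed contain any tips at all?'''
    return bool(items_with_tips(rss))

def make_item(channel, title, css_class, text):
    '''Hypothetical helper: add an item whose xhtml:body holds one paragraph.'''
    item = ET.SubElement(channel, 'item')
    ET.SubElement(item, 'title').text = title
    body = ET.SubElement(item, '{%s}body' % XHTML)
    ET.SubElement(body, '{%s}p' % XHTML, {'class': css_class}).text = text
    return item

# A made-up feed: one item carries a tip, the other does not.
rss = ET.Element('rss')
channel = ET.SubElement(rss, 'channel')
make_item(channel, 'RSS redirection', 'tip', 'Turn upstreaming off.')
make_item(channel, 'Item 500', 'plain', 'Taking some time off.')

print(feed_has_tips(rss))                         # True
print(items_with_tips(rss))                       # ['RSS redirection']
print(items_with_tips(rss, about='upstreaming'))  # ['RSS redirection']
</pre>
<p>
An aggregator holding many feeds would simply run <tt>feed_has_tips</tt> over each one. Nothing here knows what a tip means; it relies only on the same class attribute that the stylesheet uses.
</p>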
</body>
</item> 

<item num="a670">
<title>RSS redirection and regex/Frontier/XSLT XML hacking</title>
<date>2003/04/17</date>
<body>
<p>
After I posted yesterday's note about RSS redirection, Dave Winer wrote to remind me that there is a mechanism known to work for both Radio UserLand and NetNewsWire. It looks like this:
</p>
<pre class="code" lang="xml">
&lt;?xml version=&quot;1.0&quot;?&gt; 
&lt;redirect&gt;
&lt;newLocation&gt;
 http://weblog.infoworld.com/udell/gems/longDescriptionFeed.xml
&lt;/newLocation&gt;
&lt;/redirect&gt;
</pre>
<p>
If you hit my <a href="http://www.w3.org/2000/06/webdata/xslt?xslfile=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2Fgems%2FlongFeed.xml&amp;xmlfile=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2Frss.xml&amp;transform=Submit">alternate feed</a> with your browser, you'll see this XML redirect. I'm happy to report that I've just tested it successfully in both RU and NNW. Both adjust the original address to the new one. I'll be interested to know which other readers do, or don't, make the same adjustment.
</p>
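<p>
On the reader's side, honoring a redirect like this needn't take much code. Here is a minimal sketch in Python, assuming the fetched document is either an ordinary feed or the redirect format shown above. The function name and the example.org address are made up, and this is not a description of how RU or NNW actually implement it.
</p>
<pre class="code" lang="python">
import xml.etree.ElementTree as ET

def resolve_feed_address(current_url, document_text):
    '''Return the address a reader should use after fetching document_text.'''
    root = ET.fromstring(document_text)
    if root.tag != 'redirect':
        return current_url              # an ordinary feed: keep the address
    new_location = root.findtext('newLocation')
    if new_location is None or not new_location.strip():
        return current_url              # malformed redirect: change nothing
    return new_location.strip()         # the URL sits on its own line, so trim it

# Build a sample redirect document with the ElementTree API.
redirect = ET.Element('redirect')
ET.SubElement(redirect, 'newLocation').text = (
    '\n http://weblog.infoworld.com/udell/gems/longDescriptionFeed.xml\n')
sample = ET.tostring(redirect, encoding='unicode')

print(resolve_feed_address('http://example.org/old.xml', sample))
</pre>
<p>
The important design choice is the fallback: anything that isn't a well-formed redirect leaves the subscription untouched.
</p>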
<p>
The reason for the switch is that I wanted to clean up my primary feed. There's no reason for it to contain &lt;content:encoded&gt; anymore, and it confuses <a href="http://www.newsmonster.org/news-Jon_Udell_xhtmlbody_and_NewsMonster.html">NewsMonster</a>. So now, my primary feed is just a short description and an &lt;xhtml:body&gt;. My alternate feed contains the whole body encoded within the description, for folks who want to read the blog in a non-&lt;xhtml:body&gt;-aware aggregator.
</p>
<p>
Making these adjustments was trickier than I thought it would be. I'm narrating the process here partly so I can remember it later, and partly because it glosses some recent discussion about XML processing [<a href="http://tbray.org/ongoing/When/200x/2003/03/16/XML-Prog">1</a>, <a href="http://weblog.infoworld.com/udell/2003/03/18.html#a642">2</a>]. There were two parts to the task:
</p>
<ol>
<li>
<p>Copy the original feed -- that is, what my alternate RSS writer produces. Remove &lt;description&gt; and &lt;xhtml:body&gt; from the copy, and rename &lt;content:encoded&gt; to &lt;description&gt;, in order to create the alternate feed.</p>
</li>
<li>
<p>Modify the original feed in situ, removing &lt;content:encoded&gt;, to create the standard feed.</p>
</li>
</ol>
<p>
I tried three approaches: regular expression hacking, Frontier-style XML hacking, and XSLT. 
</p>
<p>
<b>The regex approach</b>
</p>
<p>
Despite my earlier regex advocacy, I didn't have much luck. That's partly, I guess, because Radio's regex.dll doesn't think quite like Perl's regex engine. Anyway, it got to be a mess.
</p>
<p>
<b>The Frontier approach</b>
</p>
<p>
To use the Frontier-style approach, I started like so:
</p>
<pre class="code" lang="usertalk">
xml.compile (&quot;c:\\radio\\www\\rss.xml&quot;, @rss);
</pre>
<p>
This failed embarrassingly, because (as Dave kindly pointed out) I was trying to compile the filename, not the content. This is the right way to turn your RSS file into a Frontier table:
</p>
<pre>
xml.compile (file.readWholeFile(&quot;c:\\radio\\www\\rss.xml&quot;), 
  @scratchpad.rss);
edit ( @scratchpad.rss );
</pre>
<p>
The <tt>edit</tt> statement brings up a Frontier editing window, where you can inspect the RSS file as a Frontier table. You can also write code to walk around inside the table, making changes as needed. You have to use special verbs to get at the addresses of the subtables, which I found confusing, but this hybrid interactive/programmatic approach is nifty. It started to add up to a lot of code to do what I wanted, though, so I decided to try things the XSLT way.
</p>
<p>
<b>The XSLT approach</b>
</p>
<p>
I started with <a href="http://msdn.microsoft.com/webservices/building/xmldevelopment/xslt/default.aspx?pull=/library/en-us/dnxml/html/msxsl.asp">msxsl.exe</a>, a command-line tool for running XSLT transforms. The first stylesheet, <a href="http://weblog.infoworld.com/udell/gems/longDescription.xml">longDescription.xml</a>, was a minor variation on the one that was already transforming my primary feed into my alternate feed (by way of the W3C XSLT service). The only change needed here was to remove &lt;xhtml:body&gt;, so I added this template:
</p>
<pre class="code" lang="xslt">
&lt;xsl:template match=&quot;xhtml:body&quot;&gt;
&lt;/xsl:template&gt;
</pre>
<p>
Here's the second stylesheet, <a href="http://weblog.infoworld.com/udell/gems/xhtmlBody.xml">xhtmlBody.xml</a>:
</p>
<pre class="code" lang="xslt">
&lt;?xml version=&quot;1.0&quot;?&gt; 
&lt;xsl:stylesheet 
  xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot; 
  xmlns:content=&quot;http://purl.org/rss/1.0/modules/content/&quot; 
  xmlns:dc=&quot;http://purl.org/dc/elements/1.1/&quot; 
  xmlns:xhtml=&quot;http://www.w3.org/1999/xhtml&quot; 
  version=&quot;1.0&quot;&gt;
&lt;xsl:output method=&quot;xml&quot; indent=&quot;yes&quot; encoding=&quot;us-ascii&quot;/&gt;
&lt;xsl:template match=&quot;node() | @*&quot;&gt;
  &lt;xsl:copy&gt;
    &lt;xsl:apply-templates select=&quot;@* | node()&quot;/&gt;
  &lt;/xsl:copy&gt;
&lt;/xsl:template&gt;
&lt;xsl:template match=&quot;//content:encoded&quot;&gt;
&lt;/xsl:template&gt;
&lt;/xsl:stylesheet&gt;
</pre>
<p>
The first template in this stylesheet is something the XSLT geeks call &quot;the identity transform.&quot; It just echoes the input to the output. But as you do that, you get the chance to override aspects of the transform. In this case, my second template matches &lt;content:encoded&gt; and does nothing with it. As a result, that element drops out of the feed.
</p>
<p>
To integrate this with Radio UserLand, I packaged up my two command-line transforms into a CMD file:
</p>
<pre class="code" lang="shell">
c:\radio\tools\msxsl.exe c:\radio\www\rss.xml ^
  c:\radio\www\gems\longDescription.xml ^
  -o c:\radio\www\gems\longDescriptionFeed.xml
</pre>
<pre class="code" lang="shell">
c:\radio\tools\msxsl.exe c:\radio\www\rss.xml ^
  c:\radio\www\gems\xhtmlBody.xml -o c:\radio\www\rss.xml
</pre>
<p>
Then, I added this line to my <a href="http://weblog.infoworld.com/udell/gems/rssWriter.txt">alternate RSS writer</a>:
</p>
<pre>
launch.application(&quot;c:\\radio\\tools\\fixrss.cmd&quot;);
</pre>
<p>
There's one more XSLT stylesheet involved here. The alternate feed's original address invokes <a href="http://weblog.infoworld.com/udell/gems/longFeed.xml">longFeed.xml</a>, which used to look a lot like <a href="http://weblog.infoworld.com/udell/gems/longDescription.xml">longDescription.xml</a>, but now simply returns the XML redirect.
</p>
<p class="tip">
Here's a tip, by the way. When you're hacking around with your Radio feeds, turn upstreaming off. Otherwise you'll torment your subscribers.
</p>
<p>
Whew! That was kind of confusing, but I think everything's straightened out now. (Sanity check: is the primary feed <a href="http://feeds.archive.org/validator/check?url=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2Frss.xml">valid</a>? Is the alternate feed <a href="http://feeds.archive.org/validator/check?url=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2Fgems%2FlongDescriptionFeed.xml">valid</a>?) As the Perl guys like to say, There's More Than One Way to Do It. Were Radio's regex engine more Perl-like, I'd probably have solved the problem that way. It's a reflex. Were I more accomplished at Frontier XML hacking, I might have gotten the quickest result using <tt>xml.compile</tt> and friends. In this case, however, XSLT wound up being my weapon of choice. I'm glad to have figured out how to incorporate it into my Radio repertoire.
</p>
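<p>
In that spirit, here is one more way to do the same two-part job, sketched with Python's standard ElementTree module. It is only an illustration, not what runs behind this feed: the function names are invented, and the sample feed is built in code rather than read from rss.xml.
</p>
<pre class="code" lang="python">
import copy
import xml.etree.ElementTree as ET

CONTENT = 'http://purl.org/rss/1.0/modules/content/'
XHTML = 'http://www.w3.org/1999/xhtml'
ENCODED = '{%s}encoded' % CONTENT
BODY = '{%s}body' % XHTML

def drop_children(item, tag):
    '''Remove every direct child of item that has the given tag.'''
    for child in item.findall(tag):
        item.remove(child)

def alternate_feed(rss):
    '''Part 1: drop description and xhtml:body, rename content:encoded to description.'''
    out = copy.deepcopy(rss)            # leave the original feed untouched
    for item in out.iter('item'):
        drop_children(item, 'description')
        drop_children(item, BODY)
        for encoded in item.findall(ENCODED):
            encoded.tag = 'description'
    return out

def standard_feed(rss):
    '''Part 2: the same feed, minus content:encoded.'''
    out = copy.deepcopy(rss)
    for item in out.iter('item'):
        drop_children(item, ENCODED)
    return out

# A made-up one-item feed carrying all three flavors of content.
rss = ET.Element('rss', version='2.0')
item = ET.SubElement(ET.SubElement(rss, 'channel'), 'item')
ET.SubElement(item, 'description').text = 'short'
ET.SubElement(item, ENCODED).text = 'the whole body, escaped'
ET.SubElement(ET.SubElement(item, BODY), '{%s}p' % XHTML).text = 'the whole body'

alt_item = alternate_feed(rss).find('channel/item')
print([child.tag for child in alt_item])    # ['description']
print(alt_item.findtext('description'))     # the whole body, escaped
</pre>
<p>
Like the identity transform, this copies everything and then overrides only the parts that need to change. The difference is that the overrides are imperative edits to a tree rather than declarative templates.
</p>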
<p>
Finally, I'm <i>really</i> glad to know that RSS redirection works, at least for RU and NetNewsWire.
</p>
</body>
</item> 

<item num="a668">
<title>RSS redirection, again</title>
<date>2003/04/16</date>
<body>
<p>
<a href="http://colossus.net/power.redirection.html">
<img border="0" width="220" height="270" align="right" alt="redirect" src="http://colossus.net/images/power.postman.gif"/>
</a> According to <a href="http://www.newsmonster.org/news-Jon_Udell_xhtmlbody_and_NewsMonster.html">NewsMonster's</a> blog, there's a minor problem with my feed. NewsMonster seems to want to render both the &lt;content:encoded&gt; and &lt;xhtml:body&gt; elements. At this point, I'd just as soon drop &lt;content:encoded&gt;. And I probably will. But in so doing, I'll run smack into a long-unresolved problem. There's still no way to cleanly redirect from one RSS feed to another.
</p>
<p>
My current (and not well-conceived) strategy is to send &lt;content:encoded&gt; in my feed, and transform it into &lt;description&gt; in the <a href="http://www.w3.org/2000/06/webdata/xslt?xslfile=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2Fgems%2FlongFeed.xml&amp;xmlfile=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2Frss.xml&amp;transform=Submit">alternate version</a> of my feed. There aren't a lot of people depending on that version, so changing its address won't upset many apple carts. But as I've been reminded several times this week, there's still no solution to this problem. For example, <a href="http://www.planetshwoop.com/blog/">Brian Sobolak</a> wrote to point out that <a href="http://www.mcgeesmusings.net/">Jim McGee's</a> writing might be of interest to me. Indeed! Jim's was one of the first orange icons on my channelroll. But I missed the announcement that he was moving from the community server to his own address, and so lost track of him.
</p>
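<p>
For readers who'd like to see the shape of that alternate-feed transformation without reading a stylesheet, here is a rough sketch. It is not my XSLT: it swaps in Python's ElementTree to do the same job of promoting &lt;content:encoded&gt; to &lt;description&gt;. The sample feed and the function name <tt>to_long_description_feed</tt> are made up for illustration.
</p>
<pre class="code python"><![CDATA[
```python
import xml.etree.ElementTree as ET

CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"

# Hypothetical one-item feed, standing in for a real rss.xml.
SAMPLE_FEED = """<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>First item</title>
      <description>Short teaser.</description>
      <content:encoded>&lt;p&gt;Full text of the item.&lt;/p&gt;</content:encoded>
    </item>
  </channel>
</rss>"""


def to_long_description_feed(rss_text: str) -> str:
    """Copy each item's content:encoded into description, then drop content:encoded."""
    root = ET.fromstring(rss_text)
    encoded_tag = "{%s}encoded" % CONTENT_NS
    # Materialize the item list first, since the loop edits the tree.
    for item in list(root.iter("item")):
        encoded = item.find(encoded_tag)
        if encoded is None:
            continue  # nothing to promote; keep the short description
        description = item.find("description")
        if description is None:
            description = ET.SubElement(item, "description")
        description.text = encoded.text  # re-escaped automatically on output
        item.remove(encoded)
    return ET.tostring(root, encoding="unicode")


long_feed = to_long_description_feed(SAMPLE_FEED)
```
]]></pre>
<p>
The result is an ordinary RSS 2.0 document whose &lt;description&gt; carries the full, HTML-escaped item text.
</p>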
<p>
There was <a href="http://weblog.infoworld.com/udell/2002/09/25.html">some</a> <a href="http://www.intertwingly.net/blog/907.html">discussion</a> of this a while back. It's easy to see why these proposals aren't getting over the activation threshold, though. Most people never run into the problem, and those who do run into it maybe once a year.
</p>
</body>
</item> 

<item num="a667">
<title>The semantic blog</title>
<date>2003/04/16</date>
<body>
<p class="leadpara">
<i>
I've long dreamed of using RSS to produce and consume XML content. We're so close. RSS content is HTML, which is almost XHTML, a gap that HTML Tidy can close. In current practice, the meat of an RSS item appears in the &lt;description&gt; tag, either as an HTML-escaped (aka entity-encoded) string or as a CDATA element. As has been often observed, it'd be really cool to have the option to use XHTML as well. Then I could write blog items in which the &lt;pre&gt; tag, or perhaps a class=&quot;codeFragment&quot; attribute, marks regions for precise search. You or I could aggregate those items into personal XPath-aware databases in order to do those searches locally (perhaps even offline), and public aggregators could offer the same capability over the Web. [<a href="http://webservices.xml.com/pub/a/ws/2003/04/15/semanticblog.html">O'Reilly Network</a>]
</i>
</p>
<p>
I wound up scooping this article on Monday, because I rolled out xhtml:body sooner than expected. So of course, blog commentary relevant to the article appeared even before the article did -- a curious and delightful inversion. On <a href="http://www.intertwingly.net/blog/1333.html">Sam Ruby's blog</a>, Danny Ayers wrote:
</p>
<blockquote cite="Danny Ayers">
Some very good points made in Jon's piece, though I think it's a bit silly jamming any old XML in and calling it &quot;descriptive markup&quot;. We can reasonably assume in this case the implied meaning 'this is the content', but nowhere is it made explicit - no description is given. 
</blockquote>
<p>
I'd like to see more opinions on this, but to me, CSS looks like an excellent bridge technology. It is declarative in nature. If I write &lt;p class=&quot;codeFragment&quot;&gt; I am merely attaching a label to that element. I can then, optionally, associate a style with that declaration. But I can use that same declaration for other purposes as well. In particular, I can use it to precisely search code fragments. I've long advocated this dual-purposing of CSS declarations because I think it fits well with how people actually write. I find that people who'll say they have no time for descriptive markup will nevertheless fiddle quite obsessively with presentation. Seems to me that if there can be synergy between the two, we ought to exploit that to the hilt.
</p>
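<p>
To make the dual-purpose idea concrete, here is a minimal sketch in Python (my choice for illustration; any XPath-capable tool would do). It assumes the content is already well-formed XHTML, and it uses ElementTree's limited XPath subset to search only the elements labeled class=&quot;codeFragment&quot;. The sample text and the function name <tt>search_code_fragments</tt> are hypothetical.
</p>
<pre class="code python"><![CDATA[
```python
import xml.etree.ElementTree as ET

# Hypothetical blog item body; two elements carry the codeFragment label.
SAMPLE_BODY = """<body xmlns="http://www.w3.org/1999/xhtml">
  <p>To initialize the tool, call the init function.</p>
  <p class="codeFragment">tidySuite.init()</p>
  <p>Calling init more than once is harmless.</p>
  <pre class="codeFragment">scratchpad.s = tidySuite.clean(text)</pre>
</body>"""


def search_code_fragments(xhtml_text: str, term: str) -> list[str]:
    """Return the text of every class="codeFragment" element that mentions term."""
    root = ET.fromstring(xhtml_text)
    # ElementTree supports a subset of XPath; [@class='...'] is part of it.
    labeled = root.findall(".//*[@class='codeFragment']")
    hits = []
    for element in labeled:
        text = "".join(element.itertext())
        if term in text:
            hits.append(text)
    return hits


matches = search_code_fragments(SAMPLE_BODY, "init")
```
]]></pre>
<p>
A plain text search for &quot;init&quot; would also hit the two prose paragraphs; the labeled search returns only the code fragment. Note that the exact-match attribute test won't find an element carrying several class names at once, so a real implementation would need to split the attribute value.
</p>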
</body>
</item> 

<item num="a666">
<title>How (and why) to include an xhtml:body in a Radio UserLand RSS feed</title>
<date>2003/04/14</date>
<body>
<p>
<a href="http://www.intertwingly.net/blog/1299.html">Sam Ruby</a> and <a href="http://www.gotdotnet.com/team/dbox/spoutletex.aspx?key=2003-03-30T04:44:35Z">Don Box</a> have both demonstrated valid RSS 2.0 feeds (<a href="http://intertwingly.net/blog/index.rss2">Sam</a>, <a href="http://www.gotdotnet.com/team/dbox/rss.aspx">Don</a>) that include a <tt>&lt;body&gt;</tt> element, properly namespaced as XHTML. Quietly, last week, I joined the party. My primary feed now includes:
</p>
<ol>
<li>
<p>A brief &lt;description&gt;.</p>
</li>
<li>
<p>The full text of the item, as an HTML-escaped string, in a &lt;content:encoded&gt; element.<sup>1</sup>
</p>
</li>
<li>
<p>The full text of the item, as XHTML, in an &lt;xhtml:body&gt; element.</p>
</li>
</ol>
<p>
Although it enlarged my RSS feed, the XHTML body shouldn't have affected -- and indeed seems not to have affected -- any existing RSS-aware software. So, what's it good for? In an upcoming O'Reilly Network column, I lay out the case. I want to aggregate feeds into XML-aware databases, and be able to run precise XPath queries against those databases. Given these capabilities, it makes sense to invest in more and better <a href="http://tbray.org/ongoing/When/200x/2003/04/09/SemanticMarkup">descriptive markup</a>. That, in my view, is how we bootstrap from the existing blogosphere to the semantic web. Not by defining a cosmic ontology. But rather from the bottom up, by building consensus around incremental enrichment of the stuff we write every day.
</p>
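<p>
Here's a small sketch of the kind of query I have in mind, written in Python with ElementTree standing in for a real XML-aware database. The two feeds are invented samples, and the helper name <tt>items_with_pre_containing</tt> is hypothetical. The point is the contrast: the &lt;content:encoded&gt; payload is one opaque string, while the &lt;xhtml:body&gt; payload is a tree you can address element by element.
</p>
<pre class="code python"><![CDATA[
```python
import xml.etree.ElementTree as ET

NS = {
    "content": "http://purl.org/rss/1.0/modules/content/",
    "xhtml": "http://www.w3.org/1999/xhtml",
}

# Two made-up feeds; only Feed A mentions the search term inside a pre element.
FEED_A = """<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <channel>
    <title>Feed A</title>
    <item>
      <title>Tidy settings</title>
      <description>A brief teaser.</description>
      <content:encoded>&lt;pre&gt;output-xml: true&lt;/pre&gt;</content:encoded>
      <xhtml:body><xhtml:p>My settings:</xhtml:p><xhtml:pre>output-xml: true</xhtml:pre></xhtml:body>
    </item>
  </channel>
</rss>"""

FEED_B = """<rss version="2.0" xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <channel>
    <title>Feed B</title>
    <item>
      <title>No code here</title>
      <description>Another teaser.</description>
      <xhtml:body><xhtml:p>Just prose about output-xml.</xhtml:p></xhtml:body>
    </item>
  </channel>
</rss>"""


def items_with_pre_containing(feeds, term):
    """Return (feed title, item title) for items whose xhtml:body has a pre mentioning term."""
    results = []
    for feed_text in feeds:
        channel = ET.fromstring(feed_text).find("channel")
        for item in channel.findall("item"):
            for pre in item.findall("./xhtml:body//xhtml:pre", NS):
                if term in "".join(pre.itertext()):
                    results.append((channel.findtext("title"), item.findtext("title")))
                    break  # one hit per item is enough
    return results


hits = items_with_pre_containing([FEED_A, FEED_B], "output-xml")

# The escaped payload has no child elements; the XHTML payload does.
first_item = ET.fromstring(FEED_A).find("channel/item")
encoded_children = len(first_item.find("content:encoded", NS))
body_children = len(first_item.find("xhtml:body", NS))
```
]]></pre>
<p>
Only Feed A's item is returned, even though Feed B mentions the same term in ordinary prose. Real feeds may declare XHTML as a default namespace on the body element instead of using a prefix; once parsed, the two forms are equivalent.
</p>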
<p>
At the moment, of course, that's just one more theory. So I won't speculate further now. I do, however, want to suggest a way to do the experiment. That boils down to some nitty-gritty implementation details. Here, I'll discuss how to simplify XHTML authoring for Radio UserLand, because that's the blog software I use. I hope users of Movable Type and other platforms will offer similar tutorials.
</p>
<p>
For me, writing XHTML isn't a big deal. Having abandoned Radio's embedded MS DHTML edit control -- because I'm often working on a Mac nowadays, and also because it sucks -- I just write my blog entries in simple, clean HTML. As a result, they're already very close to XHTML. If you remember to quote all your attributes, and close all your tags (for example, &lt;img ... /&gt;, &lt;br/&gt;, and &lt;hr/&gt;), you're almost there. Of course, I don't always remember that, so I need some help. There's also the nasty problem of bare ampersands and HTML entities, which need to be escaped or altered for XML transmission. Life's too short to deal with this kind of thing; clearly you want some tool support.
</p>
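<p>
To show what goes wrong, here is a minimal Python sketch. It is not a substitute for a real cleanup tool: it only tests well-formedness with the standard XML parser and repairs the ampersand and entity problems. The function names <tt>is_well_formed</tt> and <tt>fix_ampersands_and_entities</tt> are hypothetical.
</p>
<pre class="code python"><![CDATA[
```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The five entity names that XML itself predefines.
XML_BUILTIN = {"amp", "lt", "gt", "quot", "apos"}


def is_well_formed(fragment: str) -> bool:
    """True if the fragment parses as XML when wrapped in a single root element."""
    try:
        ET.fromstring("<div>" + fragment + "</div>")
        return True
    except ET.ParseError:
        return False


def fix_ampersands_and_entities(fragment: str) -> str:
    """Escape bare ampersands and turn HTML named entities into numeric references."""

    def replace(match):
        name = match.group(1)
        if name is None:
            return "&amp;"  # a bare ampersand, not the start of any reference
        if name in XML_BUILTIN or name.startswith("#"):
            return match.group(0)  # already legal in XML
        if name in name2codepoint:
            return "&#%d;" % name2codepoint[name]
        return "&amp;" + name + ";"  # unknown name: treat the ampersand as literal

    return re.sub(r"&(?:(#\d+|#x[0-9A-Fa-f]+|\w+);)?", replace, fragment)


raw = "<p>Fish & chips&nbsp;&copy; 2003</p>"
fixed = fix_ampersands_and_entities(raw)
```
]]></pre>
<p>
The raw paragraph fails the parser twice over (a bare ampersand, and entities XML doesn't define); the fixed version passes. Unclosed tags such as a bare &lt;br&gt; fail the same check, and fixing those is where a tool like HTML Tidy earns its keep.
</p>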
<p>
I knew, of course, that I wanted to use <a href="http://tidy.sourceforge.net">HTML Tidy</a>, which can not only clean up the worst of the mess that the DHTML edit control makes, but can also be used to XHTML-ify your content. The question was how to integrate it with Radio UserLand's publishing process. I'm happy to report that David Carter-Tod has done the heavy lifting. His <a href="http://www.wcc.vccs.edu/dtod/frontier/tidy.html">Tidy Tool</a> wraps HTML Tidy in a Radio script. (More generally, his tool shows how to spawn any command-line executable from Radio.) To use it, you need a local copy of the HTML Tidy program. Since the instance of Radio that I publish from runs on Windows, I acquired the Win32 version of HTML Tidy, tidy.exe, and put it in Radio's Tools directory along with David's files: tidy.root and tidyconfig.txt. 
</p>
<p>
After some standalone experimentation with HTML Tidy's XML/XHTML output mode, I settled on these tidyconfig.txt settings:
</p>
<pre>
output-xml: true
numeric-entities: true
markup: true
</pre>
<p>
It seemed to me that only the first of these should have been necessary. But without numeric-entities set to true, my HTML entities weren't escaped as they need to be. And oddly, the same thing happened when the markup setting (which pretty-prints the XML output) wasn't set to true. Perhaps an HTML Tidy expert can explain why the first setting alone wasn't sufficient, but in any case, what I show here is working for me.
</p>
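<p>
If you want to try these settings outside Radio, something like the following Python sketch should work, assuming a <tt>tidy</tt> executable and a tidyconfig.txt holding the three settings above. The <tt>-q</tt> and <tt>-config</tt> options are standard HTML Tidy command-line flags; the function names <tt>build_tidy_command</tt> and <tt>run_tidy</tt> are mine, not part of David's tool.
</p>
<pre class="code python"><![CDATA[
```python
import subprocess


def build_tidy_command(tidy_path: str, config_path: str) -> list[str]:
    """Argument list that makes tidy read HTML on stdin using the given config file."""
    return [tidy_path, "-q", "-config", config_path]


def run_tidy(html: str, tidy_path: str, config_path: str) -> str:
    """Pipe html through tidy and return its cleaned-up output."""
    completed = subprocess.run(
        build_tidy_command(tidy_path, config_path),
        input=html,
        capture_output=True,
        text=True,
    )
    # tidy exits with 0 for success, 1 for warnings, and 2 for errors.
    if completed.returncode >= 2:
        raise RuntimeError(completed.stderr)
    return completed.stdout


# Building the command needs no tidy installation; running it does.
command = build_tidy_command("tidy", "tidyconfig.txt")
```
]]></pre>
<p>
Calling <tt>run_tidy(text, "tidy", "tidyconfig.txt")</tt> then gives you a quick way to toggle one setting at a time and compare the output.
</p>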
<p>
To initialize David's Tidy tool, I typed CTRL-; to launch Radio's QuickScript editor, entered <tt>tidySuite.init()</tt>, and clicked Run. This launches a file browser so you can identify the location of tidy.exe, which in my case was c:\radio\tools\tidy.exe.
</p>
<p>
Next, I tested against some sample postings. In the QuickScript editor, I ran <tt>scratchpad.s = tidySuite.clean( &quot;...&quot; )</tt>, substituting item texts for &quot;...&quot;, and inspected radio.root.scratchpad.s in Radio's database editor. A minor annoyance with HTML Tidy is that it returns complete HTML files, adding &lt;HTML&gt; and &lt;HEAD&gt; and &lt;TITLE&gt; tags if you omit them. In this case, though, only the content of the &lt;BODY&gt; tag is needed, and happily, that's exactly what tidySuite.clean returns by default.
</p>
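<p>
I haven't looked at how tidySuite.clean strips the wrapper, but the idea is easy to sketch. The Python below assumes the tidied document is well-formed XML with no namespace declaration on the root; both the sample document and the function name <tt>body_content</tt> are hypothetical.
</p>
<pre class="code python"><![CDATA[
```python
import xml.etree.ElementTree as ET

# Hypothetical tidied output: a complete document wrapped around the item text.
TIDIED_DOCUMENT = """<html>
<head><title></title></head>
<body>
<p>First paragraph.</p>
<p>Second, with <i>emphasis</i>.</p>
</body>
</html>"""


def body_content(document_text: str) -> str:
    """Serialize only what is inside body, dropping the html/head/title wrapper."""
    root = ET.fromstring(document_text)
    body = root.find("body")
    if body is None:
        raise ValueError("no body element found")
    parts = [body.text or ""]  # any text before the first child element
    for child in body:
        # tostring includes the text that trails each child element.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts).strip()


fragment = body_content(TIDIED_DOCUMENT)
```
]]></pre>
<p>
The returned fragment holds just the two paragraphs, ready to be dropped inside whatever container element the feed uses.
</p>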
<p>
So far so good. Now, how to stuff this XML-ified item into an RSS feed? The <a href="http://backend.userland.com/stories/storyReader$210">new extensibility hooks</a> were almost, but not quite, sufficient to the task. You can do three useful things with these hooks: add namespace declarations to your feed, add channel elements, and add item elements. But when you add an item element, it will automatically be escaped for HTML transmission. That's not what I want here. I really do want to send XML. So, for now, I'm continuing to use the <a href="http://scriptingnews.userland.com/backissues/2002/05/09#aSmallChange">original extensibility hook</a>, which has enabled me to completely replace Radio's RSS writer with a <a href="http://weblog.infoworld.com/udell/gems/myRssWriter.txt">modified version</a>. In that version, here's how I'm adding the XHTML body: 
</p>
<pre class="code" lang="usertalk">
add (&quot;&lt;body xmlns=\&quot;http://www.w3.org/1999/xhtml\&quot;&gt;&quot;);
add (tidySuite.clean( string ( adrpost^.text) ) );
add (&quot;&lt;/body&gt;&quot;)
</pre>
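<p>
Because those three calls just concatenate strings, nothing stops a fragment that isn't well-formed from slipping into the feed. As a sanity check, here is a minimal Python sketch of the same wrapping step plus a parse test; the function names <tt>xhtml_body</tt> and <tt>check_body</tt> are hypothetical and not part of my RSS writer.
</p>
<pre class="code python"><![CDATA[
```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def xhtml_body(cleaned_fragment: str) -> str:
    """Wrap an already well-formed fragment in a namespaced body element."""
    return '<body xmlns="%s">%s</body>' % (XHTML_NS, cleaned_fragment)


def check_body(body_text: str) -> list[str]:
    """Parse the wrapper and return the namespaced tag of each direct child."""
    root = ET.fromstring(body_text)  # raises ParseError if the fragment was not well-formed
    if root.tag != "{%s}body" % XHTML_NS:
        raise ValueError("root element is not an XHTML body")
    return [child.tag for child in root]


wrapped = xhtml_body("<p>Hello, <i>world</i>.</p><pre>add (s)</pre>")
child_tags = check_body(wrapped)
```
]]></pre>
<p>
The default namespace declared on the wrapper is inherited by everything inside it, which is why the child tags come back qualified as XHTML even though the fragment itself never mentions a namespace.
</p>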
<p>
This works. It's a shame, though, not to use the newer, less invasive, more elegant extensibility hooks. If there were a way for item-level callbacks to indicate when angle-bracket-escaping and entity-encoding are not wanted, then you could use this better approach. Of course, for the vast majority of users, a Pref (&quot;Emit XHTML bodies in RSS feed&quot;) would be ideal. So would a version of HTML Tidy that's included as a DLL, just as regular expression support is included in regex.dll. But first things first. If some of us try this bootstrap approach, and if it produces real value, then we can make a case for a more general solution. For now, my screen flashes five or six times, as each item on my homepage is processed in a command window -- but there's no real delay, and that isn't a problem.
</p>
<p>
My next step will be to start aggregating XHTML-aware feeds -- mine, Sam's, Don's, others -- into a database. I want to get a feel for what it's like to search that database with element-level specificity. And then I want to start injecting descriptive markup into my own stuff that will enable me (or you) to search more precisely.
</p>
<hr align="left" width="20%"/>
<p>
<sup>1</sup> The &lt;content:encoded&gt; element is only used by my <a href="http://www.w3.org/2000/06/webdata/xslt?xslfile=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2Fgems%2FlongFeed.xml&amp;xmlfile=http%3A%2F%2Fweblog.infoworld.com%2Fudell%2Frss.xml&amp;transform=Submit">alternate feed</a>, which transforms it into &lt;description&gt; for those who prefer reading complete items in RSS newsreaders, rather than the first-paragraph-only truncated &lt;description&gt; I send in my <a href="http://weblog.infoworld.com/udell/rss.xml">primary feed</a>.
</p>
</body>
</item> 

</blog>

