Compound documents for the web

As has become my custom, I used HTML Slidy to make the presentation I gave at the Paris SOA Forum last Thursday. I've argued that since web standards can support everything a presentation needs to do, the only obstacle is the perennial lack of decent web-oriented authoring tools. But my recent experiences remind me that there's another obstacle: we still have no standard compound document format for the web.

An HTML Slidy presentation is a collection of files: a single main XHTML file, a JavaScript file, one or more CSS files, and one or more media files, which can be images and, in my case, sometimes movies. It runs identically from a local disk using the file: scheme and from the web using HTTP. But for an event, the host usually wants to receive your presentation and load it onto their machine. And the zip-transfer-unzip dance is more friction than anyone needs or wants.

I can think of two possible approaches.

1. Use the web's native compound document features. You're probably wondering: "What compound document features?" Sadly, although it's been supported ever since Netscape's mail and news clients back in the day, this idea never gained traction. In Practical Internet Groupware I quoted this from RFC 2557:

"In order to transfer a complete HTML multimedia document in a single e-mail message, it is necessary to: a) aggregate a text/html root resource and all of the subsidiary resources it references into a single composite message structure, and b) define a means by which URIs in the text/html root can reference subsidiary resources within that composite message structure."

The mid: (message ID) and cid: (content ID) URL schemes, defined in RFC 2392 (originally RFC 2111) and referenced by RFC 2557, answered this requirement. In particular, it was possible -- and at one point for me common -- to create compound HTML documents that included inline images as MIME message parts. Similarly, there's the "data" URL scheme proposed in RFC 2397, which is still used (rarely) to embed images or audio directly in web pages.

2. Use ZIP or JAR, à la OpenOffice or the Java runtime. But neither browsers nor web servers work directly with these archives, and there's approximately zero chance that will change.
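To make the two approaches concrete, here is a minimal Python sketch that packs the same tiny "presentation" both ways, plus the data: URL variant. It uses only the standard library. The file names, the Content-ID value logo@slides.example, and the placeholder image bytes are assumptions made up for illustration; nothing here is how HTML Slidy itself packages anything.

```python
import base64
import io
import zipfile
from email import message_from_bytes, policy
from email.message import EmailMessage

# Hypothetical stand-ins for a presentation's files (assumptions, not real data).
CID = "logo@slides.example"
HTML = f'<html><body><img src="cid:{CID}"></body></html>'
PNG = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8  # placeholder bytes, not a valid image


def bundle_as_mime(html: str, image: bytes, cid: str) -> bytes:
    """Approach 1: a multipart/related message whose HTML root
    references the image part through a cid: URL."""
    msg = EmailMessage()
    msg.set_content(html, subtype="html")  # the text/html root resource
    # add_related turns the message into multipart/related and attaches the
    # image as a subsidiary part carrying a Content-ID header.
    msg.add_related(image, maintype="image", subtype="png", cid=f"<{cid}>")
    return msg.as_bytes()


def as_data_url(data: bytes, mime_type: str) -> str:
    """RFC 2397 variant: embed the bytes directly in the URL itself."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def bundle_as_zip(files: dict) -> bytes:
    """Approach 2: a ZIP archive mapping file names to file contents."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


if __name__ == "__main__":
    mime_bytes = bundle_as_mime(HTML, PNG, CID)
    parsed = message_from_bytes(mime_bytes, policy=policy.default)
    for part in parsed.iter_parts():
        print(part.get_content_type(), part["Content-ID"])

    print(as_data_url(PNG, "image/png")[:40] + "...")

    zip_bytes = bundle_as_zip({"index.html": HTML.encode(), "logo.png": PNG})
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        print(zf.namelist())
```

Either way, producing the single file is the easy part. What's missing is a browser that will open the result and resolve the internal references on its own.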

At the moment, neither of these approaches seems to have a future in the realm of web technology. But as the AJAX juggernaut rolls along, it's going to keep reminding us that we still don't have a standard notion of a web compound document. We could sure use one.


Former URL: http://weblog.infoworld.com/udell/2006/10/09.html#a1540