Tangled in the Threads
Jon Udell, June 6, 2001
The universal canvas
Putting WYSIWYG HTML editors to use
Content management systems can make creative use of the MS DHTML edit control
In a recent column on weblogging as a project management tool, I showed how a simple weblog can be a powerful way to focus the attention of a distributed project team. Now, a couple of weeks farther along in the experiment, it's continuing to have that very useful effect. What prevents this from happening more often, I argued, is not a shortage of tools, but rather a shortage of storytellers who have the skills and inclination to weave a coherent narrative out of the conversations and documents that are the lifeblood of a project.
In truth, while I can and do maintain this project weblog with nothing fancier than ftp and a text editor, there is a tools problem here -- one that's being addressed by a growing number of content management systems. People usually think of a CMS as a big tool (like Vignette StoryServer) used to manage a big site (like ZDNet). Of course, relatively few sites operate on that scale. Lots of companies need help managing smaller Internet, intranet, or extranet sites. Many more such sites (like my project weblog) should exist, but don't. That need defines the opportunity for new content management systems.
What is "content" anyway?
The web was adopted by millions of people because it is a thrilling combination of two familiar media -- publishing and broadcasting. It is like publishing because you read it as you would a book or newspaper. It is like broadcasting because you change channels by clicking. In those two media industries, the product is called "content," and content management systems for the web are tools used to deliver that product through a new channel.
Companies not in the "content" business -- that is, most companies -- are often confused about why and how to be on the web. A friend who ran the website for a health maintenance organization once asked me: "We get 50,000 hits a week -- is that good?" Well, it depends. An HMO is not in the business of attracting eyeballs to its website. It's in the business of delivering health-care services. If those 50,000 hits represented people looking for needed health-care information, and finding it, that would be good. If the hits also represented people actually using health-care services -- scheduling appointments, filling out forms -- that would be great. But often, companies forget that they're not in the entertainment business, and that their product is not "content."
If you're not in the "content" business, then what is your "content" exactly, and why and how do you need to manage it on the web? Mostly, it's business documents that contain words, numbers, tables, charts, and pictures. They're intended to communicate externally with customers and business partners, and internally with coworkers. They're created in word processors and spreadsheets. And they are typically emailed as attachments, rather than shared on the web, because we lack infrastructure that makes such sharing an easy and natural thing to do.
As I mentioned in the earlier column, one of the key services I'm providing as the maintainer of a project weblog is the capture and publication of key documents. A number of these documents are spreadsheets, Word files, and PDF files. I've noticed, though, that none of these documents really depends on the special capabilities of Excel, Word, or Acrobat. All of them, in fact, could have been created using an HTML editor, and would then have been more easily and more effectively shared. Here's how .XLS, .DOC, and .PDF files are actually being used:
Spreadsheets as table editors: The special ability of a spreadsheet is, of course, to interconnect cells according to formulas. But that's not how I see spreadsheets typically being used. On this project, and in many other contexts, I see spreadsheets used almost exclusively to format tabular information.
Word processors as rich-text widgets: Today's word processors are capable of prodigious publishing feats. But on this project, as elsewhere, I see Word files used just to add a bit of formatting to text.
PDF files as compound-document containers: A PDF file is the best available electronic representation of a printed book, magazine, or brochure. This project, however, has no need to produce printed publications. PDF files are used, instead, because they can bind text and a set of images into compound documents that preserve the integrity of all their parts. Word also does this, and that's another reason why Word files appear in the project repository.
I don't deny that these are excellent applications that solve important problems. But they are often used for different reasons -- particularly to format, package, and transmit information that supports collaborative work. And when used this way, they yield less than optimal solutions. Now don't get me wrong. It's infinitely better to have a web-accessible repository of documents in all of these formats, than to have no repository at all. But when the objective is to share information freely, there are also associated costs. Some people don't have all these apps installed. Some do, but incur delay when launching them. Information doesn't flow freely among the apps; you can't easily search across a collection of these disparate file types; you can't refer (with hyperlinks) to elements contained within the files.
The universal canvas
The environment in which such things are possible is one that Microsoft, in its marketing literature, calls "the universal canvas." I've highlighted aspects of this vision in columns on MathML and SVG. My argument is that universal representation of data in XML is as important as universal representation of web-services APIs in XML, and for the same reason: stuff needs to flow. In the case of web services, flow is a network effect that happens when services can trivially interconnect. If XML-RPC and SOAP are succeeding, it's because they reduce barriers to flow more effectively than CORBA/IIOP or DCOM have done. It should be the same with applications. We want the network effect that happens when I can trivially connect a calculation, a table, an image, and a description -- from different sources -- and then share the results. The web can achieve this effect, but not easily. Excel, Word, and Acrobat, built to achieve different goals, are the wrong tools for this job -- overkill on features, weak on integration. The web's own writing tool, the HTML TEXTAREA widget, is conversely well-integrated (with the web) but hopelessly inadequate in terms of features. What we've always needed, and what we still need, is the universal canvas -- a surface on which we view, but also create and edit, words and tables and charts and pictures.
Since IE4, Microsoft has been shipping a component that can be used to achieve at least some of the universal-canvas effect. It's called the DHTML edit control. If you've ever composed an HTML message in Outlook Express, you've used this component, perhaps to add styled text, tables, images, or hyperlinks to an email message. Because it's a component, it can offer these same services to other applications. Recently, I've been in touch with two companies that are doing just that. Both offer content management systems "for the rest of us" -- that is, for the majority who do not need a Vignette-class hammer to pound a ZDNet-sized nail. And both use the DHTML edit control to make the readable web also writable by ordinary users.
Spoke Technologies' CrankSet 2.0
Spoke Technologies operates as a Zope-based application service provider. The company hosts websites that are rich in community-generated content, yet manageable by nontechnical people. Zope itself, of course, is a content management system, but not one that your average kindergarten teacher would find easy to use. Spoke's CrankSet extends Zope to create a CMS that kindergarten teachers, and other regular folks, can use. In doing so, the Spoke developers have tapped into the power -- and experienced the frustration -- of Zope. That's an interesting story in itself, but for another column. Here, I just want to highlight one particular feature of the service. When it accepts user input, for example in a discussion forum, it offers (to IE users) the option of WYSIWYG HTML. In that mode, people can use a familiar Word-like UI to create rich documents that, when posted, merge with the rest of the site's HTML content.
We ought to take this for granted by now but, in fact, it's still a relative novelty, and a surprising and wonderful thing to see. There's also more going on than meets the eye. You can, for example, include an image in a table cell. When you do that, you upload the image along with the text of your document. The generated HTML refers to the image, but something else has subtly and crucially occurred. The document looks like a compound document containing an image, but really, the image has its own name (really, a Zope URL) and therefore its own identity. It's an independent piece of content, which can be used standalone, or referenced in other contexts.
Although WYSIWYG HTML is an attention-grabber, I'm even more interested in this underlying mechanism for accumulating an inventory of reusable assets. That's because what's really important about HTML isn't fonts, colors, and lists, but rather hypertext. We are all consumers of hypertext, but few of us yet are producers of it. The documents we send as attachments in email do not end up in canonically URL-addressable locations, and we cannot therefore refer to them with links. When an environment does make such documents URL-addressable, it creates the possibility of vastly more effective collaboration. As I mentioned in the earlier column, much of the value of my project weblog comes from nothing more than assigning URLs to email attachments and then publishing those URLs.
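To make the idea of assigning URLs to attachments concrete, here is a minimal Python sketch. It is not how my weblog or CrankSet actually works; the base URL, the filenames, and both function names are hypothetical, chosen only to show a filename becoming a stable, linkable address.

```python
from html import escape
from urllib.parse import quote, urljoin

# Hypothetical location where saved attachments are published.
BASE_URL = "http://example.com/project/docs/"

def publish_attachments(filenames, base_url=BASE_URL):
    """Map each attachment filename to a URL under base_url.

    quote() percent-encodes spaces and other unsafe characters, so the
    resulting URL can be pasted into email or another document.
    """
    return {name: urljoin(base_url, quote(name, safe="")) for name in filenames}

def weblog_entry(title, urls):
    """Render a small HTML fragment that links to each published document."""
    lines = ["<h3>%s</h3>" % escape(title), "<ul>"]
    for name, url in urls.items():
        lines.append('<li><a href="%s">%s</a></li>' % (escape(url), escape(name)))
    lines.append("</ul>")
    return "\n".join(lines)

urls = publish_attachments(["Budget Q2.xls", "spec.doc"])
entry = weblog_entry("Documents received this week", urls)
print(entry)
```

Once each document has an address of this kind, any later message or weblog item can point at it instead of carrying another copy.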
Ektron's eWebEditPro and eMPower
Spoke's use of the DHTML edit control prompted me to look around for other implementations. The search led to Ektron, which has refined and packaged the technology to produce something I've long imagined: a complete pluggable replacement for the HTML TEXTAREA widget. The product, which wraps several layers of integration code around the core edit control, works with MSIE and, by way of a plugin, with Netscape -- though for Windows only in both cases.
The XML configuration also defines how you save content. The choices are "minimal," "cleanhtml," and "xhtml." I was delighted to find that XHTML is the default. That makes a great deal of sense. The point of collecting structured text, after all, is to be able to work with that structure. Well-formed content is a rare, poorly understood, but extremely valuable asset.
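Here is a rough Python sketch of what working with that structure can mean. The fragment below is invented for illustration and is not actual eWebEditPro output; the function names are mine. The sketch assumes the saved content is well-formed and carries no XML namespace declaration (with the XHTML namespace present, the tag names passed to `iter` would need the namespace prefix).

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed fragment, standing in for saved editor content.
SAVED = """<div>
  <p>Status report for <a href="http://example.com/specs/plan.html">the plan</a>.</p>
  <table>
    <tr><th>Task</th><th>Owner</th></tr>
    <tr><td>Draft spec</td><td>Ann</td></tr>
    <tr><td>Review</td><td>Raj</td></tr>
  </table>
</div>"""

def is_well_formed(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

def extract_links(root):
    """Return (link text, href) pairs for every anchor that has an href."""
    return [("".join(a.itertext()), a.get("href"))
            for a in root.iter("a") if a.get("href")]

def extract_rows(root):
    """Return each table row as a list of cell strings."""
    rows = []
    for tr in root.iter("tr"):
        rows.append(["".join(cell.itertext()).strip()
                     for cell in tr if cell.tag in ("th", "td")])
    return rows

root = ET.fromstring(SAVED)
links = extract_links(root)
rows = extract_rows(root)
print(links)
print(rows)
```

The same parser rejects typical tag-soup HTML such as `<p>one<br>two</p>`, which is why content saved in a well-formed format is so much easier to search, link into, and reuse.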
Ektron sells the editor bound to one or more URLs. For each URL, you can buy it two ways: $30/user (in $299 10-packs), and $6000/enterprise. The 10-packs would make sense for something like a project weblog, an intranet, or even a public site with a small number of contributors. The enterprise license is needed only when there will be lots of writers of rich content. Ektron also includes the editor in its line of content management products -- eMPower (for ColdFusion), and eMPower Express (for ColdFusion or ASP).
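A quick back-of-the-envelope calculation shows where "lots of writers" begins. This Python sketch uses only the two prices quoted above and assumes, for illustration, that 10-packs must be bought whole and that both prices apply to a single URL; the function names are mine, and actual licensing terms may differ.

```python
import math

PACK_PRICE = 299         # dollars per 10-user pack, per URL (quoted above)
PACK_SIZE = 10           # users covered by one pack
ENTERPRISE_PRICE = 6000  # dollars per URL (quoted above)

def pack_cost(users):
    """Cost of covering `users` writers with whole 10-packs (assumed rule)."""
    return math.ceil(users / PACK_SIZE) * PACK_PRICE

def cheaper_option(users):
    """Return the cheaper license type and its cost for one URL."""
    cost = pack_cost(users)
    if cost <= ENTERPRISE_PRICE:
        return ("10-packs", cost)
    return ("enterprise", ENTERPRISE_PRICE)

for n in (10, 200, 201):
    print(n, cheaper_option(n))
```

Under those assumptions, 20 packs cost $5,980 and cover 200 writers, so the enterprise license only pays off from the 201st writer on a given URL.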
The Spoke and Ektron adaptations of the DHTML edit control give us an inspiring glimpse of what the universal canvas might someday be. There is a long road ahead, to be sure. I don't pretend that an obscure piece of Microsoft code can be dusted off and made into something that meets all the needs that compel people to try to collaborate on the web using spreadsheets, Word, and PDF files. Nor do I miss the point that this is a Windows-only solution. It is, however, a dramatically useful Windows-only solution that will, I hope, make people think about -- and demand -- a true universal canvas.
Jon Udell (http://udell.roninhouse.com/) was BYTE Magazine's executive editor for new media, the architect of the original www.byte.com, and author of BYTE's Web Project column. He is the author of Practical Internet Groupware, from O'Reilly and Associates. Jon now works as an independent Web/Internet consultant. His recent BYTE.com columns are archived at http://www.byte.com/index/threads
This work is licensed under a Creative Commons License.