Tangled in the Threads

Jon Udell, June 6, 2001

The universal canvas

Putting WYSIWYG HTML editors to use

Content management systems can make creative use of the MS DHTML edit control

In a recent column on weblogging as a project management tool, I showed how a simple weblog can be a powerful way to focus the attention of a distributed project team. Now, a couple of weeks farther along in the experiment, it's continuing to have that very useful effect. What prevents this from happening more often, I argued, is not a shortage of tools, but rather a shortage of storytellers who have the skills and inclination to weave a coherent narrative out of the conversations and documents that are the lifeblood of a project.

In truth, while I can and do maintain this project weblog with nothing fancier than ftp and a text editor, there is a tools problem here -- one that's being addressed by a growing number of content management systems. People usually think of a CMS as a big tool (like Vignette StoryServer) used to manage a big site (like ZDNet). Of course, relatively few sites operate on that scale. Lots of companies need help managing smaller Internet, intranet, or extranet sites. Many more such sites (like my project weblog) should exist, but don't. That need defines the opportunity for new content management systems.

What is "content" anyway?

The web was adopted by millions of people because it is a thrilling combination of two familiar media -- publishing and broadcasting. It is like publishing because you read it as you would a book or newspaper. It is like broadcasting because you change channels by clicking. In those two media industries, the product is called "content," and content management systems for the web are tools used to deliver that product through a new channel.

Companies not in the "content" business -- that is, most companies -- are often confused about why and how to be on the web. A friend who ran the website for a health maintenance organization once asked me: "We get 50,000 hits a week -- is that good?" Well, it depends. An HMO is not in the business of attracting eyeballs to its website. It's in the business of delivering health-care services. If those 50,000 hits represented people looking for needed health-care information, and finding it, that would be good. If the hits also represented people actually using health-care services -- scheduling appointments, filling out forms -- that would be great. But often, companies forget that they're not in the entertainment business, and that their product is not "content."

If you're not in the "content" business, then what is your "content" exactly, and why and how do you need to manage it on the web? Mostly, it's business documents that contain words, numbers, tables, charts, and pictures. They're intended to communicate externally with customers and business partners, and internally with coworkers. They're created in word processors and spreadsheets. And they are typically emailed as attachments, rather than shared on the web, because we lack infrastructure that makes such sharing an easy and natural thing to do.

As I mentioned in the earlier column, one of the key services I'm providing as the maintainer of a project weblog is the capture and publication of key documents. A number of these documents are spreadsheets, Word files, and PDF files. I've noticed, though, that none of these documents really depends on the special capabilities of Excel, Word, or Acrobat. All of them, in fact, could have been created using an HTML editor, and would then have been more easily and more effectively shared. Here's how .XLS, .DOC, and .PDF files are actually being used:

I don't deny that these are excellent applications that solve important problems. But they are often used for different reasons -- particularly to format, package, and transmit information that supports collaborative work. And when used this way, they yield less than optimal solutions. Now don't get me wrong. It's infinitely better to have a web-accessible repository of documents in all of these formats than to have no repository at all. But when the objective is to share information freely, these formats carry costs. Some people don't have all these apps installed. Some do, but incur delay when launching them. Information doesn't flow freely among the apps; you can't easily search across a collection of these disparate file types; you can't refer (with hyperlinks) to elements contained within the files.

The universal canvas

The environment in which such things are possible is one that Microsoft, in its marketing literature, calls "the universal canvas." I've highlighted aspects of this vision in columns on MathML and SVG. My argument is that universal representation of data in XML is as important as universal representation of web-services APIs in XML, and for the same reason: stuff needs to flow. In the case of web services, flow is a network effect that happens when services can trivially interconnect. If XML-RPC and SOAP are succeeding, it's because they reduce barriers to flow more effectively than CORBA/IIOP or DCOM have done. It should be the same with applications. We want the network effect that happens when I can trivially connect a calculation, a table, an image, and a description -- from different sources -- and then share the results. The web can achieve this effect, but not easily. Excel, Word, and Acrobat, built to achieve different goals, are the wrong tools for this job -- overkill on features, weak on integration. The web's own writing tool, the HTML TEXTAREA widget, is conversely well-integrated (with the web) but hopelessly inadequate in terms of features. What we've always needed, and what we still need, is the universal canvas -- a surface on which we view, but also create and edit, words and tables and charts and pictures.

Since IE4, Microsoft has been shipping a component that can be used to achieve at least some of the universal-canvas effect. It's called the DHTML edit control. If you've ever composed an HTML message in Outlook Express, you've used this component, perhaps to add styled text, tables, images, or hyperlinks to an email message. Because it's a component, it can offer these same services to other applications. Recently, I've been in touch with two companies that are doing just that. Both offer content management systems "for the rest of us" -- that is, for the majority who do not need a Vignette-class hammer to pound a ZDNet-sized nail. And both use the DHTML edit control to make the readable web also writable by ordinary users.

Spoke Technologies' CrankSet 2.0

Spoke Technologies operates as a Zope-based application service provider. The company hosts websites that are rich in community-generated content, yet manageable by nontechnical people. Zope itself, of course, is a content management system, but not one that your average kindergarten teacher would find easy to use. Spoke's CrankSet extends Zope to create a CMS that kindergarten teachers, and other regular folks, can use. In doing so, the Spoke developers have tapped into the power -- and experienced the frustration -- of Zope. That's an interesting story in itself, but for another column. Here, I just want to highlight one particular feature of the service. When it accepts user input, for example in a discussion forum, it offers (to IE users) the option of WYSIWYG HTML. In that mode, people can use a familiar Word-like UI to create rich documents that, when posted, merge with the rest of the site's HTML content.

We ought to take this for granted by now but, in fact, it's still a relative novelty, and a surprising and wonderful thing to see. There's also more going on than meets the eye. You can, for example, include an image in a table cell. When you do that, you upload the image along with the text of your document. The generated HTML refers to the image, but something else has subtly and crucially occurred. The document looks like a compound document containing an image, but really, the image has its own name (really, a Zope URL) and therefore its own identity. It's an independent piece of content, which can be used standalone, or referenced in other contexts.

Although WYSIWYG HTML is an attention-grabber, I'm even more interested in this underlying mechanism for accumulating an inventory of reusable assets. That's because what's really important about HTML isn't fonts, colors, and lists, but rather hypertext. We are all consumers of hypertext, but few of us yet are producers of it. The documents we send as attachments in email do not end up in canonically URL-addressable locations, and we cannot therefore refer to them with links. When an environment does make such documents URL-addressable, it creates the possibility of vastly more effective collaboration. As I mentioned in the earlier column, much of the value of my project weblog comes from nothing more than assigning URLs to email attachments and then publishing those URLs.
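
To make the idea of assigning URLs to email attachments concrete, here is a minimal sketch in Python. It is not how my weblog or CrankSet does it; the base URL, the function name attachment_urls, and the sample message are all assumptions made up for illustration. The sketch only computes the URL each attachment would get and leaves the actual file copying aside.

```python
from email.message import EmailMessage
from urllib.parse import quote

# Hypothetical location where published attachments would live.
BASE_URL = "http://example.com/project/docs/"

def attachment_urls(message, base_url=BASE_URL):
    """Map each attachment's filename to the URL it would be published at."""
    urls = {}
    for part in message.iter_attachments():
        filename = part.get_filename()
        if filename:
            # Percent-encode the name so spaces and the like survive in a URL.
            urls[filename] = base_url + quote(filename)
    return urls

# Build a sample message with one (fake) spreadsheet attached.
msg = EmailMessage()
msg["Subject"] = "Q3 budget"
msg.set_content("Budget attached.")
msg.add_attachment(
    b"fake spreadsheet bytes",
    maintype="application",
    subtype="vnd.ms-excel",
    filename="q3 budget.xls",
)

urls = attachment_urls(msg)
for name, url in urls.items():
    print(name, "->", url)
```

Once a document has a stable address like this, anyone on the team can link to it from a weblog entry, a forum posting, or another document.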

Ektron's eWebEditPro and eMPower

Spoke's use of the DHTML edit control prompted me to look around for other implementations. The search led to Ektron, which has refined and packaged the technology to produce something I've long imagined: a complete pluggable replacement for the HTML TEXTAREA widget. The product, which wraps several layers of integration code around the core edit control, works with MSIE and, by way of a plugin, with Netscape -- though for Windows only in both cases.

From a developer's perspective, the trick is performed in exactly the right way. When you want to embed the editor in an HTML file, you use JavaScript to source the editor's integration code into the page, define a hidden form variable to accumulate and send its output, and then create an instance of the control. To the user, it looks like Word where a TEXTAREA otherwise would have been. To the developer, it looks the same: that is, when unpacking form variables in a back-end script, the editor's HTML (or XHTML) output shows up in a named field, just like the output of a TEXTAREA widget would have.
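
The back-end half of that claim can be sketched in a few lines of Python. This is an illustration only, not Ektron's integration code: the field names "title" and "body_html" and the helper unpack_form are hypothetical. The point is that the script unpacking the form cannot tell, and need not care, whether the field was filled by a TEXTAREA or by an editor writing into a hidden form variable.

```python
from urllib.parse import parse_qs, urlencode

def unpack_form(body):
    """Decode a urlencoded POST body into a {field name: value} dict."""
    fields = parse_qs(body, keep_blank_values=True)
    # Each field carries a single value in this sketch, so keep the first.
    return {name: values[0] for name, values in fields.items()}

# What a browser would post if the hypothetical "body_html" field
# were an ordinary TEXTAREA.
textarea_post = urlencode({
    "title": "Status",
    "body_html": "Plain text from a TEXTAREA",
})

# What it would post if the same field were a hidden form variable
# filled in by a WYSIWYG editor.
editor_post = urlencode({
    "title": "Status",
    "body_html": "<p>Rich <b>text</b> from the editor</p>",
})

textarea_fields = unpack_form(textarea_post)
editor_fields = unpack_form(editor_post)

# Same field names either way; only the value is richer.
print(sorted(textarea_fields) == sorted(editor_fields))
print(editor_fields["body_html"])
```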

There are three ways to control the behavior of the editor: JavaScript, ActiveX, and XML. The JavaScript API defines properties, methods, and events. Among the properties is a collection of instances, of which there can be several on the page. Methods include load() and save(), which transfer content between the editor and its associated hidden form variable. The ActiveX API allows more granular control -- for example, getting or setting the selected region within the editor.

Although Ektron does offer specific integration support for ASP, ColdFusion, PHP, and JSP, you can deploy the editor from any kind of web server. The XML API is what makes this possible. When the editor loads, its JavaScript glue tells it where (on the server) to find an XML configuration file. That file then governs both the appearance and behavior of the control. Using XML declarations, you can fully customize the toolbar, the menus, and the wiring that connects these either to built-in commands, or to custom commands implemented elsewhere. You can also use XML declarations to define where the control sends attached/embedded content, and how. By default, the upload method is FTP, but you can specify HTTP instead.
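
To show what configuration by XML declaration looks like in general, here is a small Python sketch. The element and attribute names (config, toolbar, button, command, handler, upload, method, target) are invented for this example and are not Ektron's actual schema; only the FTP default is taken from the description above.

```python
import xml.etree.ElementTree as ET

# A made-up configuration file. The schema is hypothetical,
# not the one the real product uses.
CONFIG = """<config>
  <toolbar>
    <button command="bold"/>
    <button command="insert_table"/>
    <button command="spellcheck" handler="custom"/>
  </toolbar>
  <upload method="http" target="/uploads/"/>
</config>"""

def read_config(xml_text):
    """Return toolbar wiring and upload settings from the config text."""
    root = ET.fromstring(xml_text)
    # Each button is wired to a built-in command unless it names a handler.
    buttons = [
        (b.get("command"), b.get("handler", "builtin"))
        for b in root.iter("button")
    ]
    upload = root.find("upload")
    # Fall back to FTP when no upload method is declared.
    method = upload.get("method", "ftp") if upload is not None else "ftp"
    target = upload.get("target") if upload is not None else None
    return {"buttons": buttons, "upload_method": method, "upload_target": target}

settings = read_config(CONFIG)
print(settings)
```

Because the settings live in a plain file fetched from the server, nothing about this approach ties the editor to a particular server-side scripting environment.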

The XML configuration also defines how you save content. The choices are "minimal," "cleanhtml," and "xhtml." I was delighted to find that XHTML is the default. That makes a great deal of sense. The point of collecting structured text, after all, is to be able to work with that structure. Well-formed content is a rare, poorly understood, but extremely valuable asset.
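
Here is a minimal sketch, in Python, of why well-formed output matters. The sample fragment is invented, and the helper names extract_structure and is_well_formed are mine. Because the fragment is well-formed, a stock XML parser can pull out its links and table rows; ordinary tag-soup HTML with an unclosed element is rejected by the same parser.

```python
import xml.etree.ElementTree as ET

# An invented fragment of the kind an editor saving XHTML might produce.
XHTML = """<div>
  <h2>Schedule</h2>
  <p>See the <a href="/docs/plan.html">project plan</a> and the
     <a href="/docs/budget.html">budget</a>.</p>
  <table>
    <tr><td>Design</td><td>June</td></tr>
    <tr><td>Build</td><td>July</td></tr>
  </table>
</div>"""

def extract_structure(xhtml):
    """Return (link targets, table rows) found in a well-formed fragment."""
    root = ET.fromstring(xhtml)
    links = [a.get("href") for a in root.iter("a")]
    rows = [[td.text for td in tr.iter("td")] for tr in root.iter("tr")]
    return links, rows

def is_well_formed(markup):
    """True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

links, rows = extract_structure(XHTML)
print(links)
print(rows)
print(is_well_formed("<p>one<br>two</p>"))  # unclosed <br> is not well-formed
```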

Ektron sells the editor bound to one or more URLs. For each URL, you can buy it two ways: $30/user (in $299 10-packs), and $6000/enterprise. The 10-packs would make sense for something like a project weblog, an intranet, or even a public site with a small number of contributors. The enterprise license is needed only when there will be lots of writers of rich content. Ektron also includes the editor in its line of content management products -- eMPower (for ColdFusion), and eMPower Express (for ColdFusion or ASP).

The Spoke and Ektron adaptations of the DHTML edit control give us an inspiring glimpse of what the universal canvas might someday be. There is a long road ahead, to be sure. I don't pretend that an obscure piece of Microsoft code can be dusted off and made into something that meets all the needs that compel people to try to collaborate on the web using spreadsheets, Word, and PDF files. Nor do I miss the point that this is a Windows-only solution. It is, however, a dramatically useful Windows-only solution that will, I hope, make people think about -- and demand -- a true universal canvas.

Jon Udell (http://udell.roninhouse.com/) was BYTE Magazine's executive editor for new media, the architect of the original www.byte.com, and author of BYTE's Web Project column. He is the author of Practical Internet Groupware, from O'Reilly and Associates. Jon now works as an independent Web/Internet consultant. His recent BYTE.com columns are archived at http://www.byte.com/index/threads

This work is licensed under a Creative Commons License.