Jon Udell: XML for the rest of us

Tangled in the Threads
Jon Udell, April 11, 2001
XML for the rest of us

Web services were the theme of XML DevCon 2001

But web services are just plumbing. When will XML improve the tools that end users see and touch?

I've just returned from the XML Developer's Conference in New York. The upshot, from my perspective, is both good news and bad news. Let's start with the good, of which there is plenty. There can be no doubt that the fabric of the next-generation Internet is being woven of XML. As I'm sure is true for many of you, XML has already insinuated itself into many aspects of my daily work. I write this column in XHTML, because it's easy (for me) to do so, and because well-formedness helps ensure clean HTML rendering. I promote my own website, and others that I work on, using RSS newsfeeds. I produce Linux Magazine's website by running Perl scripts over an XHTML repository. I've helped develop a service -- O'Reilly's Safari -- that stores content as XML, transforms it to HTML by way of XSLT, and performs its business logic using XML-RPC.

There's nothing cutting-edge about any of this. It's just a matter of applying useful tools to basic problems. So it was a treat to lift up my nose from the grindstone and see how other folks are using XML, and where that community is heading.

To some people, I must admit, the notion that there is even such a thing as an "XML community" seems a little odd. Here's a fragment of dialogue between me and a friend who is an IT executive:

me: "I'm going to XML DevCon 2001."

her: "OK, I'll bite. What is the big deal with XML, anyway?"

me: "Well, it's all about universal representation of data."

her: "So, is that like going to the ASCII Developer's Conference?"

She makes a great point. Will XML standards development someday fade into the woodwork? Will the "XML community" dissolve back into the many constituencies from which it emerged -- publishing, software development, e-commerce? The answer to both questions is probably yes, but don't hold your breath. XML's charter puts it on a course to intersect with essentially all of the world's documents, data, and software. That's trivially true for ASCII, which can (and often does) encode all this stuff at a low level. For XML, which aims higher, it's not yet true, and far from trivial.

The XML community is, in fact, wrestling with this whole question of levels of abstraction. During the panel discussion I sat in on, Tim Bray, co-editor of the original XML specification, noted that we already have the "low-level" standards -- specifically SOAP -- that we need for the emerging web services architecture. In response Dave Orchard, an IBM technical architect and XML standards maven, observed that SOAP was until recently seen as a "high-level" standard. In the same way, he argued, what we now see as "high-level" proposed standards for orchestrating SOAP-based interactions -- such as UDDI (Universal Description, Discovery, and Integration) and ebXML -- will work their way down the protocol stack.

Everyone agreed that while XML is stable at its core, there's tectonic movement in the standards accreting around it. According to Bray, that's inevitable. The XML core, he pointed out, borrows heavily from proven SGML technology. Extensions such as XML Schema and XQuery break new ground, synthesizing ideas from object programming, relational database management, and other realms to create what is really a new way of representing and working with data.

For myself, I tend to stick with the stable core, and watch with interest as the tectonic plates slide around on the map. Sometimes I find myself using, in an unofficial way, things that later become official standards. That was true for XHTML, a technique I was emulating in my own work about a year before I ever heard the term. And it's still true for XML-RPC, which I use because it's easy, supported in the environments I use, and good enough for the tasks at hand. Sometimes I keep on doing things in an unofficial way even though an official way exists. So, for example, I still write a lot of transformations in Perl rather than XSLT, which I find to be a powerful declarative transformer but a crummy scripting language. I rely more often on well-formedness than on validity, which the XML founders wisely did not require. I'm not opposed to XML namespaces, but so far I haven't created any of my own. In confessing these things, I suppose I am not alone. Most of us, I guess, don't try to eat the whole XML layer cake, and would get sick if we tried.
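To make the well-formedness versus validity distinction concrete, here is a minimal sketch in Python (a choice made for illustration; the column's own scripts are in Perl). It assumes only the standard library's xml.etree.ElementTree parser, which checks that tags nest and close properly but never consults a DTD or schema. The helper name is_well_formed and the sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if text parses as XML. No DTD or schema is consulted,
    so this tests well-formedness only, not validity."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples: the second closes its tags in the wrong order.
good = "<p>Hello <b>world</b></p>"
bad = "<p>Hello <b>world</p></b>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

A check like this is often enough to guarantee that downstream scripts can parse a document, which is the practical benefit of relying on well-formedness alone.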

As I walked the exhibition floor and listened to presentations, I wondered what will be the next piece of the cake that I'll put onto my own plate. What attracts me the most are the XML databases.

XML databases

I'm a longtime fan of object databases, and have watched with great interest as these products have evolved in the direction of XML-aware storage. Ron Bourret has compiled an excellent overview of the menagerie of things that can be called XML databases. Of those he lists, I've gotten the most real-world mileage out of Zope's wonderful ZODB, though Bourret rightly classifies Zope as an XML application server (which I guess it is, sort of), rather than an XML database (which ZODB isn't, though it has an XML-ish flavor).

A while ago, I spent some time evaluating Excelon (now called B2B Portal Server) which wraps XML interfaces around the powerful ObjectStore database. I don't actually use it, though. Five-digit price (Excelon) or zero-digit price (Zope): you decide.

At XML DevCon, I saw a few examples of what Bourret calls "native XML databases." These aren't object databases which have grown XML interfaces, like Excelon. They're built from the ground up to store, index, search, and retrieve XML. I looked at Ixia's TEXTML server, Neocore's XML Commerce Server, and XYZFind's XYZFind Server. With any of these products, I should be able to dump in the kinds of XHTML-formatted documents that I use to drive websites, have them automatically indexed, and then search them with a precision and structural awareness that's not possible with conventional fulltext search engines.
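The difference between fulltext search and structure-aware search can be sketched in a few lines of Python. This is not how any of the products named above work; it is a toy comparison using the standard library's ElementTree, with two hypothetical documents held in a dictionary called DOCS. For brevity it assumes the XHTML carries no namespace declaration.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-document repository (namespace omitted for brevity).
DOCS = {
    "a.html": "<html><body><h1>XML databases</h1>"
              "<p>Notes on Zope.</p></body></html>",
    "b.html": "<html><body><h1>Web services</h1>"
              "<p>XML databases came up in passing.</p></body></html>",
}


def fulltext_hits(term):
    """Match the term anywhere in a document's text, ignoring structure."""
    hits = []
    for name, src in DOCS.items():
        text = "".join(ET.fromstring(src).itertext())
        if term.lower() in text.lower():
            hits.append(name)
    return sorted(hits)


def heading_hits(term):
    """Match the term only when it appears inside an h1 element."""
    hits = []
    for name, src in DOCS.items():
        root = ET.fromstring(src)
        for heading in root.iter("h1"):
            if term.lower() in "".join(heading.itertext()).lower():
                hits.append(name)
                break
    return sorted(hits)


print(fulltext_hits("XML databases"))
print(heading_hits("XML databases"))
```

The fulltext search returns both documents, while the heading-scoped search returns only the one that is actually about the topic. A native XML database would do this kind of scoping from an index rather than by reparsing every document.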

When I test-drove Excelon, it implemented an XML query language called XQL, in anticipation of its becoming the official XML query standard. That didn't happen, but a lot of XQL's syntax found its way into XPath, which is used to select sets of nodes in XSLT. Now it looks as though XQuery, which uses XPath while also borrowing from the more SQL-oriented XML-QL and other sources, is expected to reach critical mass and stabilize. Meanwhile, these products don't use XQuery. Their proprietary query languages look interesting, and useful, but I'd probably want to hold out for XQuery, or whatever it turns into.

Then there are what Bourret calls "XML-enabled databases" -- conventional SQL-style engines with various kinds of XML adaptations. When I first heard of these, they struck me as ungainly. But the more I see of them, the more I like them. Oracle's Steve Muench, author of Building Oracle XML Applications, gave a great talk on Oracle's approach to XML/SQL hybridization. His product, XSQL Pages, automates a lot of the grunt work involved in translating data between the two disciplines. A developer I met at lunch raved about the mileage he's gotten from XSQL Pages on a big project for Standard and Poor's. I also loved two other things that Muench demonstrated. In one example, he superimposed an XML schema on some Oracle tables, and then queried the tabular data using XPath. In another, he showed how to merge SQL-style tabular querying with XML-style hierarchical querying. The example was an insurance claims database that mixes rows of conventional data with XML-formatted damage reports. The query filtered on the standard rows, while at the same time looking for phrases within nested paths of elements in the damage report. Over in the Microsoft booth, similar capabilities -- implemented differently -- were on display. It's really quite exciting to see data and documents come together in this new way.
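The insurance claims idea can be approximated with ordinary tools. The sketch below is emphatically not Oracle's implementation or XSQL Pages; it swaps in Python's sqlite3 and ElementTree modules to show the general shape of a query that filters on conventional columns and then searches a nested path inside an XML column. The table layout, column names, sample rows, and the find_claims helper are all assumptions made up for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical claims table: ordinary columns plus an XML damage report.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims ("
    "id INTEGER PRIMARY KEY, state TEXT, amount REAL, report TEXT)"
)
conn.executemany(
    "INSERT INTO claims VALUES (?, ?, ?, ?)",
    [
        (1, "NH", 1200.0,
         "<report><vehicle><damage area='front'>"
         "cracked windshield</damage></vehicle></report>"),
        (2, "NH", 300.0,
         "<report><vehicle><damage area='rear'>"
         "dented bumper</damage></vehicle></report>"),
        (3, "VT", 5000.0,
         "<report><vehicle><damage area='front'>"
         "cracked windshield and hood</damage></vehicle></report>"),
    ],
)


def find_claims(state, min_amount, path, phrase):
    """Filter rows with SQL, then look for phrase under path in each report."""
    matches = []
    cursor = conn.execute(
        "SELECT id, report FROM claims "
        "WHERE state = ? AND amount >= ? ORDER BY id",
        (state, min_amount),
    )
    for claim_id, report in cursor:
        root = ET.fromstring(report)
        # findall accepts the limited XPath subset ElementTree supports.
        if any(phrase in "".join(node.itertext())
               for node in root.findall(path)):
            matches.append(claim_id)
    return matches


print(find_claims("NH", 500, "./vehicle/damage", "windshield"))
```

Here the hierarchical test runs in application code after the SQL filter. The appeal of the hybrid engines described above is that both halves of the query run inside the database, where they can share indexes and a single query plan.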

Where are the XML apps we can see and touch?

Don't get me wrong, I've been saying for years that XML-oriented web services would be huge. Now they are, and I'm delighted. But I'm also starting to wonder if there's something missing from the picture that's developing. I started to say so in my panel discussion, but I didn't want to rain on the web-services parade, which I am greatly enjoying. Nevertheless, I've got to ask at some point: where and how do all these web services intersect with people?

A lot of demos featured a Purchase Order or some equivalent XML packet being shunted around from business process to business process. I've been assuming, all along, that we'll get that Purchase Order pretty well figured out. Apparently we haven't yet, and people whom I greatly respect say we need to bake some more standards to get there. I don't disagree, but neither do I have much to add to that discussion. We have the right core in place: XML over HTTP. Let's follow the advice of the Extreme Programming folks -- namely, "Do The Simplest Thing That Could Possibly Work" -- and then figure out where we stand.
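As one illustration of how little is needed to put a purchase order into an XML packet, here is a sketch using XML-RPC via Python's standard xmlrpc.client module. It only builds and re-reads the payload that would travel in an HTTP POST body; no server is contacted. The method name orders.submit and the order fields are hypothetical, not part of any real service.

```python
import xmlrpc.client

# Hypothetical purchase order; field names are invented for illustration.
order = {"sku": "A-100", "quantity": 3}

# Serialize a call to a made-up method into an XML-RPC request body.
payload = xmlrpc.client.dumps((order,), methodname="orders.submit")

# The receiving side can recover the parameters and the method name.
params, method = xmlrpc.client.loads(payload)

print(payload)
print(method, params[0])
```

Everything beyond this, such as agreeing on what fields a purchase order must carry, is where the additional standards work comes in.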

Here's the fly in the ointment. That Purchase Order, in real life, does not wend its way through the machinery untouched by human hands. There is usually a conversation swirling around it. That conversation happens in phone calls and IMs and emails and documents, and there is context woven through all that stuff which -- I am sorry to say -- we are making no progress in capturing, never mind using to improve the business process.

Where's the viral app that does, for the end user, by means of XML, what the browser did for the end user by means of HTML? Where are the XML-enabled tools for writing, for personal information management, for knowledge capture and refinement?

I do a lot of electronic publishing. Nowadays, it's all driven from XML repositories. But none of that content is yet produced by XML-aware tools. Rather, it's converted from other formats. These conversions are painful, and expensive, and still occur in about the same ways they did five years ago. And because there are still no general-purpose XML editing tools that users will accept, changes require another painful conversion.

I did look very closely at i4i's impressive S4/TEXT, which can capture XML documents in Word in ways that feel fairly natural to Word users. This looks like a really effective way for certain kinds of businesses to manage specific classes of documents. But as the i4i folks explained, S4/TEXT has to be adapted on a per-DTD basis to each of these document types. So it isn't (yet) a tool that anyone can just pick up and use to write in an ad-hoc -- but well-structured -- fashion.

Even more so than the word processor, the email client funnels vast quantities of data into the personal and public spaces of the Net. The result is infoglut. We can't shut out what we don't want, and we can't find what we need. XML, I keep hoping, will give us some tools -- structure, metadata -- to help us stay afloat. But so far, it's not reaching the desktop in any direct way. Sure, we're about to get XSLT on the client, but that's just more plumbing -- a way to offload some transformation work from the server farms.

Nobody's more jazzed than me about web services. But let's not forget that people need to engage with those services. The first-generation web did forget that. What was intended to be a collaborative medium turned into something more like television, and reduced people to sets of eyeballs. Will the next-generation web make the same mistake? I hope not, and I believe XML can help us out. But not if it just reinvents the plumbing. It had better also improve the apps we see and touch.

Jon Udell (http://udell.roninhouse.com/) was BYTE Magazine's executive editor for new media, the architect of the original www.byte.com, and author of BYTE's Web Project column. He's now an independent Web/Internet consultant, and is the author of Practical Internet Groupware, from O'Reilly and Associates. His recent BYTE.com columns are archived at http://www.byte.com/index/threads

This work is licensed under a Creative Commons License.