Mining message metadata

Point-to-point integration is out; event-driven communication across a common message bus is in. When you build a system this way, message queues are the first and best way to take the pulse of its real-time state...

Martinez makes a crucial distinction between message data and message metadata. In the realm of Web services, it's the difference between SOAP bodies and SOAP headers. The bodies eventually land in an operational data store, the headers often don't. Yet the headers define the context of the message: who (or what) is sending it and why. For example, a clinical service might be invoked by a monitoring application, or by a compliance officer logging into a portal to research an FDA report. "It's the same message payload," Martinez says, "but contexts are very different." [Full story at]

I can't quite put my finger on it, but there's a connection between this week's column -- based on discussions with Blue Titan's Frank Martinez -- and the latest round of wrangling over the semantic web, bracketed by essays from Clay Shirky and Tim Bray.

Martinez's insight is that in a Web services network, the packets (XML payloads) tend to accrete metadata that can usefully be mined. Relative to the SemWeb discussion, I'd add that this contextual metadata arises naturally, without extra effort, when a business process has been automated -- or, to be more realistic, semi-automated. When Jack routes a purchase order to Jill through the BizTalk pipeline, the context is explicitly encoded in the transaction.
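To make the header/body distinction concrete, here is a minimal Python sketch of the clinical-service example from the column. It is only an illustration under stated assumptions: the `urn:example:context` namespace, the `Caller` and `Purpose` header elements, the `GetClinicalRecord` payload, and the helper names `make_envelope` and `split_message` are all hypothetical, and none of this is meant to represent how Blue Titan's software works. The only real things here are the SOAP 1.1 envelope namespace and Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET
from collections import Counter

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
CTX_NS = "urn:example:context"  # hypothetical namespace for context headers


def make_envelope(caller, purpose):
    """Build a sample SOAP message; only the header varies between calls."""
    return f"""<soap:Envelope xmlns:soap="{SOAP_NS}" xmlns:ctx="{CTX_NS}">
  <soap:Header>
    <ctx:Caller>{caller}</ctx:Caller>
    <ctx:Purpose>{purpose}</ctx:Purpose>
  </soap:Header>
  <soap:Body>
    <GetClinicalRecord><PatientId>12345</PatientId></GetClinicalRecord>
  </soap:Body>
</soap:Envelope>"""


def split_message(envelope_xml):
    """Return (metadata, payload): header fields as a dict, body as XML text."""
    root = ET.fromstring(envelope_xml)
    header = root.find(f"{{{SOAP_NS}}}Header")
    body = root.find(f"{{{SOAP_NS}}}Body")

    metadata = {}
    if header is not None:
        for child in header:
            # Drop the "{namespace}" prefix so keys are plain local names.
            local_name = child.tag.split("}", 1)[-1]
            metadata[local_name] = (child.text or "").strip()

    payload = ""
    if body is not None and len(body):
        payload = ET.tostring(body[0], encoding="unicode").strip()
    return metadata, payload


# Same payload, two different contexts.
monitor_meta, monitor_payload = split_message(
    make_envelope("monitoring-app", "health-check"))
portal_meta, portal_payload = split_message(
    make_envelope("compliance-portal", "fda-report-research"))

# "Mining" in miniature: tally who is calling the service.
callers = Counter(m["Caller"] for m in (monitor_meta, portal_meta))
```

In this sketch the two payloads come out identical while the header dictionaries differ, which is the point of the column: an operational data store that keeps only the body would lose the information that tells the two requests apart.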

What happens if Jack detaches the purchase order from the BizTalk pipeline, as an InfoPath document, and routes it to Jill via email? Now the context is only implicitly encoded in the transaction. The trick is going to be figuring out how to make the implicit context explicit, without interfering with the natural flow of the transaction.

As developers of WinFS begin to blog about it, the outlines of Microsoft's approach start to emerge:

Joe is exactly right to point out that asking the user to add meta data has met with very limited success. I think WinFS addresses this in two ways: 1) the shell will make it very easy to "paint" meta-data on files just by dragging and dropping, something that users do today to organize their files; and 2) the fact that using the meta data is so easy and powerful (again via the shell's dynamic views) makes the effort to add the meta data more worth while. [Mike Deem]

That sounds reasonable. Of course, the trend even within Microsoft Office is away from micromanaging storage by "dragging and dropping." Witness the search folders in Outlook 2003, which create virtual views along multiple dimensions so that you don't have to build containment structures by hand. The Outlook 2003 product manager, in fact, told me that he managed the whole product cycle in an undifferentiated inbox, creating no folders and moving no messages.

My hunch is that as desktop software interacts more often with well-defined services, the context implicit in those interactions will tend to become more available, and will be easier to make explicit. The key is that the context must arise from normal use of software. And as "normal use" comes to mean "participating in a Web of services," it can.
