Further adventures in lightweight service composition

I normally avoid posting and dissecting code here, but I broke the rule last Thursday and will again today because my latest round of LibraryLookup work highlights interesting opportunities for service composition at the tolerant end of the tolerance continuum. Last week's adventure was a lightweight service orchestration that connects Amazon's wishlist to RSS, by way of the OCLC xISBN service and a local library catalog. Among other things, that composite solved a longstanding problem with LibraryLookup: an ISBN identifies just one manifestation of a work -- e.g., paperback versus hardcover -- not all of them.

The OCLC's xISBN service goes a long way toward solving this problem. Feed it one ISBN, and it returns a cluster of related ISBNs. A couple of years ago, when OCLC first announced xISBN, there was no way LibraryLookup bookmarklets could exploit it directly. An external intermediary was needed. And in fact that's still true, because bookmarklets can't handle multi-step service orchestration.

The RSS notifier I showed last week isn't subject to that constraint. It can be implemented in any general-purpose programming language, and deployed on any machine with Internet access and the ability to schedule periodic batch jobs.

The next logical step was to add xISBN capability to the Greasemonkey-based version of LibraryLookup that's shown (along with wishlist/library-availability notification) in the screencast Content, services, and the yin-yang of intermediation. In theory this should have been a snap. I already had a Greasemonkey script that was calling one service -- the library catalog -- using the keystone of AJAX, XMLHTTPRequest. How hard could it be to add a second service to this application?

Harder than I thought, as it turned out. The asynchronous nature of XMLHTTPRequest worked out beautifully for the first implementation. You'd load your Amazon page; the library lookup would be dispatched; if it later returned with a positive result -- in a second or three, the timing didn't matter -- the Amazon page would update to include the result. But when you add in a call to xISBN, followed by one or more library lookups, the asynchrony becomes a problem. How do you defer the lookups until after the xISBN request has finished?

The solution clearly required use of JavaScript events. But while I knew how to register functions to handle events raised by GUI components in the browser -- the onChange event triggered by interaction with a form widget, for example -- I wasn't sure how to programmatically raise an event in response to completion of a service request.

The Greasemonkey script shown below illustrates the technique I came up with. It makes use of the W3C Document Object Model (DOM) Level 2 Events API, which defines the DOMAttrModified mutation event. Near the bottom of the script, you'll find this snippet:

  document.addEventListener("DOMAttrModified", 
    libraryLookup.changeHandler, false);

This says that when an attribute of any node of the DOM is modified, the libraryLookup.changeHandler method is called, and is passed an event object that includes a reference to the modified node.
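
To see the mechanism in isolation, here is a minimal sketch that runs outside the browser. The AttrNode class and the 'attrmodified' event name are my own stand-ins, not part of the DOM API; in Firefox the DOM raises DOMAttrModified for you. The sketch assumes a runtime that provides the standard EventTarget and Event constructors (recent browsers and Node.js).

```javascript
// Hypothetical stand-in for a DOM element: it stores attributes and
// announces every change to its registered listeners.
class AttrNode extends EventTarget {
  constructor(id) {
    super();
    this.attrs = { id: id };
  }
  getAttribute(name) {
    return this.attrs[name];
  }
  setAttribute(name, value) {
    this.attrs[name] = value;
    const event = new Event('attrmodified'); // made-up event name
    event.attrName = name;                   // which attribute changed
    this.dispatchEvent(event);               // listeners run synchronously
  }
}

const seen = [];
const node = new AttrNode('LibraryLookup');

// The handler receives an event whose target is the modified node.
node.addEventListener('attrmodified', function (e) {
  const id = e.target.getAttribute('id');
  seen.push(id + ':' + e.attrName + '=' + e.target.getAttribute(e.attrName));
});

node.setAttribute('done', 'true');
```

The point is only that changing an attribute is enough to wake up a handler, which can then inspect the node to decide what to do next.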

In the current example, the modified node is one that I subsequently create myself by calling the libraryLookup.createHiddenDiv method. Here is the XML representation of that node right after it's created:

<div id="LibraryLookup" isbns="" done="false"/>

Once that's done, the script calls libraryLookup.doLookup, passing the ISBN from the Amazon book page. If the asynchronous call to the library's catalog finds the book, then we're done. There's no need to call xISBN. The doLookup method reports this successful outcome by updating the inserted DOM node, setting its done attribute to true.
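
The availability test itself is just pattern matching on the catalog's response. Here is a rough sketch of that step on its own, using the same two patterns as the script. The checkCatalogPage function and the sample page text are hypothetical; a real catalog page is a full HTML document, and the patterns would need adjusting for a different OPAC.

```javascript
// Patterns for a copy on the shelf and for a checked-out copy with a due date.
const AVAILABLE = /AVAILABLE/;
const DUE_BACK = /DUE (\d{2}-\d{2}-\d{2})/;

// Hypothetical helper: classify the text of a catalog response.
function checkCatalogPage(pageText) {
  if (AVAILABLE.test(pageText)) {
    return { done: true, label: 'On the shelf now!' };
  }
  const due = pageText.match(DUE_BACK);
  if (due) {
    return { done: true, label: 'Due back ' + due[1] };
  }
  return { done: false, label: null };
}

// Made-up page fragments, for illustration only.
const onShelf = checkCatalogPage('Status: AVAILABLE');
const checkedOut = checkCatalogPage('Status: DUE 02-15-06');
const missing = checkCatalogPage('No matches found');
```

Either kind of hit counts as done, because in both cases the library owns the book and there is no reason to try other editions.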

If we're not done, the script calls libraryLookup.xisbn and hands it the original ISBN. When the asynchronous call to the xISBN service returns, this method again updates the DOM node, this time by setting the isbns attribute to a string of one or more ISBNs, e.g.:

<div id="LibraryLookup" 
  isbns="0875845851 0585368228 0066620694" done="false"/>
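
Building that attribute value amounts to pulling every ISBN out of the response text and joining the matches with spaces. The following sketch shows the idea. The toIsbnList helper, the ISBN10 pattern, and the sample response are assumptions of mine; the markup in the sample is invented and is not meant to represent the actual xISBN response format.

```javascript
// Assumed pattern for a 10-character ISBN: nine digits plus a digit or X.
const ISBN10 = /\d{9}[\dX]/gi;

// Hypothetical helper: collect all ISBNs found in a response, falling
// back to the original ISBN when the response contains none.
function toIsbnList(responseText, fallbackIsbn) {
  const matches = responseText.match(ISBN10); // null when nothing matches
  return matches ? matches.join(' ') : fallbackIsbn;
}

// Invented response text, for illustration only.
const sample =
  '<isbn>0875845851</isbn><isbn>0585368228</isbn><isbn>0066620694</isbn>';

const isbnList = toIsbnList(sample, '0875845851');
const emptyList = toIsbnList('nothing useful here', '0875845851');
```

A space-delimited string is convenient here because an attribute can only hold text, and the handler can recover the list with a single split.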

The update to the isbns attribute raises the event that libraryLookup.changeHandler is registered to watch for. If the target node doesn't have an id attribute of LibraryLookup, it just passes on the event. But if it is a LibraryLookup-related event, the handler extracts the list of ISBNs from the isbns attribute. Then, so long as we're not yet done, it calls libraryLookup.doLookup for each ISBN in the list.
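
The whole sequence can be sketched without a browser or a network. Everything in this example is a stand-in: the catalog and related tables are invented data, the node object imitates the hidden div, and the pending queue with its flush function imitates the way an asynchronous callback only runs after the calling code has returned. It is meant to show the order of events, not to mirror the script line for line.

```javascript
// Invented data standing in for the library catalog and the xISBN service.
const catalog = new Set(['0066620694']);
const related = { '0875845851': ['0875845851', '0585368228', '0066620694'] };

// Deferred "responses" wait here until flush() delivers them.
const pending = [];
function defer(callback) { pending.push(callback); }
function flush() { while (pending.length) pending.shift()(); }

// Plain-object stand-in for the hidden div, with one change listener.
const node = {
  attrs: { isbns: '', done: 'false' },
  listener: null,
  getAttribute(name) { return this.attrs[name]; },
  setAttribute(name, value) {
    this.attrs[name] = value;
    if (this.listener) this.listener({ target: this, attrName: name });
  }
};

const found = [];

function doLookup(isbn) {
  defer(function () {
    if (node.getAttribute('done') === 'true') return; // an earlier hit won
    if (catalog.has(isbn)) {
      found.push(isbn);
      node.setAttribute('done', 'true');
    }
  });
}

function xisbn(isbn) {
  defer(function () {
    const list = related[isbn] || [isbn];
    node.setAttribute('isbns', list.join(' ')); // this wakes the listener
  });
}

// The listener reacts only to the isbns attribute and fans out the lookups.
node.listener = function (e) {
  if (e.attrName !== 'isbns') return;
  e.target.getAttribute('isbns').split(' ').forEach(doLookup);
};

doLookup('0875845851');   // first try the ISBN from the page
flush();                  // the response arrives: no copy found
if (node.getAttribute('done') === 'false') {
  xisbn('0875845851');    // fall back to the related editions
  flush();                // xISBN answers, then the three lookups answer
}
```

Notice that the done test only means something after the first response has been delivered, which is exactly the ordering problem that the attribute-change event is there to solve.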

This technique involves a strange but intriguing use of the DOM. It becomes, in effect, a transient XML database. There are immediate as well as potential long-range benefits to this approach. For starters, it:

In the long run I'm wondering whether the DOM, seen as a general-purpose in-memory XML database, will be able to connect to persistent XML data stores. Of course this is another version of the Alchemy question I keep harping on.

I'm also getting really curious about IE7. As popular as Firefox has become, applications that require it target a small slice of the pie. When you exploit Greasemonkey as I do here, or advanced CSS capabilities in Firefox 1.5 as I do in the infoworld explorer, you restrict your audience to an even smaller slice of that pie.

For a long time, it looked as though IE was frozen and would not implement any more advanced web standards. But thanks to the AJAX juggernaut, there's a thaw underway. Last week, for example, Microsoft's Sunava Dutta announced that IE7 will support XMLHTTPRequest natively, rather than by way of ActiveX, which opens the door to lots of stuff that otherwise would be precluded by policies banning ActiveX. Dare Obasanjo's comment was wonderful:

"I wonder if anyone else sees the irony in Internet Explorer copying features from Firefox which were originally copied from IE?"

It's ironic indeed, since Microsoft of course invented DHTML in the first place, but it's also hopeful. Other things I'm hoping for in IE7:

It would be great if the IE7 developers could publish a roadmap indicating which of these and other advanced web standards will be supported in IE7, along with the level of support if it's partial. Even more wonderful would be if all the browser vendors synchronized their test harnesses to a common vocabulary so we could all sort out, in a sane way, what the cross-browser reach of a given application would at least theoretically be.

Well, it never hurts to ask...

Anyway, here's the script. If you adapt it for your OPAC, let me know. The coolest way to let me know would be to tag it in del.icio.us (or elsewhere) with librarylookup, greasemonkey, and xisbn. Similarly, if you adapt Thursday's wishlist notifier, it would be cool to tag it with librarylookup, wishlist, and notification. That'll make it easy to collect a range of implementations and consider how best to generalize them.


// ==UserScript==
// @name        LibraryLookup
// @namespace   http://jonudell.net/udell/2006-01-30-further-adventures-in-lightweight-service-composition.html
// @description Check availability in Keene libraries
// @include     http://*.amazon.*
// ==/UserScript==
 
(
function()
{
var libraryQuery = 'http://ksclib.keene.edu/search/i=';
var libraryName = 'Keene';
var libraryAvailability = /AVAILABLE/;
var libraryDueBack = /DUE (\d{2}\-\d{2}\-\d{2})/;
 
var xisbnQuery = 'http://labs.oclc.org/xisbn/';
 
// The final character of an ISBN is a digit or X ('|' is not needed inside a character class)
var isbnREplain = /(\d{7,9}[\dX])/ig;
var isbnREdelimited = /\/(\d{7,9}[\dX])\//;
 
//-- from sam stephenson's http://prototype.conio.net/ --
function $() 
  {
  var elements = new Array();
  for (var i = 0; i < arguments.length; i++) 
    {
    var element = arguments[i];
    if (typeof element == 'string')
      element = document.getElementById(element);
    if (arguments.length == 1) 
      return element;
    elements.push(element);
    }
  return elements;
  }
//-- thanks, sam! --
 
var libraryLookup = 
  {
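  changeHandler: function(e)
    {
    var node = e.target;
    if ( node.getAttribute('id') != 'LibraryLookup')
      { return }
    // React only to updates of the isbns attribute, not to 'done'
    if ( e.attrName != 'isbns' )
      { return }
    var isbns = node.getAttribute('isbns');
    isbns = isbns.split(' ');
    for ( var i = 0; i < isbns.length; i++ )
      {
      // Stop dispatching lookups once a copy has been found
      if ( $('LibraryLookup').getAttribute('done') == 'true' )
        { return }
      libraryLookup.doLookup(isbns[i]);
      }
    },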
 
  createHiddenDiv: function()
    {
    if ( document.getElementById('LibraryLookup') != null )
      {
      return;
      }
    var div = document.createElement('div');
    div.setAttribute('id','LibraryLookup');
    div.setAttribute('isbns','');
    div.setAttribute('done','false' ); 
    document.body.insertBefore(div, document.body.firstChild);
    },  
 
  insertLink: function(isbn, hrefTitle, aLabel, due)
    {
    var div = origTitle.parentNode;
    var title = origTitle.firstChild.nodeValue;
    var newTitle = document.createElement('b');
    newTitle.setAttribute('class','sans');
    var titleText = document.createTextNode(title);
    newTitle.appendChild(titleText);
    var sp = document.createTextNode(' ');
    var link = document.createElement('a');
    link.setAttribute ( 'title', hrefTitle );
    link.setAttribute('href', libraryQuery + isbn);
    var label = document.createTextNode( aLabel );
    link.appendChild(label);
    div.insertBefore(newTitle, origTitle);
    div.insertBefore(sp, origTitle);
    div.insertBefore(link, origTitle);
    div.removeChild(origTitle);
    },
             
  xisbn: function(isbn)
    {
    GM_xmlhttpRequest
      (
        {
        method:  'GET',
        url:     xisbnQuery + isbn,
        onload:  function(results)
          {
          var page = results.responseText;
          var isbns = page.match(isbnREplain);
          var isbnList = '';
          // match() returns null when the response contains no ISBNs
          if ( isbns && isbns.length > 1)
            {
            isbnList = isbns.join(' ');
            }
          else
            {
            isbnList = isbn;
            }
          $('LibraryLookup').setAttribute('isbns',isbnList);
          }
        }
      );
    },
 
  doLookup: function(isbn)
    {
    GM_xmlhttpRequest
      (
        {
        method:  'GET',
        url:     libraryQuery + isbn,
        onload:  function(results)
          {
          var page = results.responseText;
          if ( libraryAvailability.test(page) )
            {
            $('LibraryLookup').setAttribute('done','true');
            libraryLookup.insertLink
              (
              isbn,
              "On the shelf now!",
              "Hey! It's available in the " + 
                 libraryName + " Library!"
              );
            }
          // else-if: insertLink replaces the title node, so it must run only once
          else if ( libraryDueBack.test(page) )
            {
            $('LibraryLookup').setAttribute('done','true');
            var due = page.match(libraryDueBack)[1];
            libraryLookup.insertLink
              (
              isbn,
              "Due back " + due,
              "Due back at the " + libraryName + 
                " Library on " + due
              );
            }
          }
        }
      )
    }
  }
 
try 
  {
  var isbn = location.href.match(isbnREdelimited)[1];
  }
catch (e)
  { return; }
 
var origTitle = document.evaluate
  (
    "//b[@class='sans']", 
    document,
    null, 
    XPathResult.FIRST_ORDERED_NODE_TYPE, null
  ).singleNodeValue;
 
if ( ! origTitle )
  { return; }
 
try
  {
  document.addEventListener("DOMAttrModified", 
    libraryLookup.changeHandler, false);
  }
catch (e) 
  { alert (e) }
 
try
  {  libraryLookup.createHiddenDiv() }
catch (e)
  { alert(e) }
 
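libraryLookup.doLookup(isbn);
 
// Note: GM_xmlhttpRequest is asynchronous, so the lookup above has not
// returned when this test runs. 'done' is still 'false' at this point,
// which means the xISBN request is always issued.
if ( $('LibraryLookup').getAttribute('done') == 'false' )
  {
  libraryLookup.xisbn(isbn);
  }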
 
}
)();

Former URL: http://weblog.infoworld.com/udell/2006/01/30.html#a1378