Meme tracking with Greasemonkey

A couple of months ago, I charted the flow of the ACLU Pizza movie through the blogosphere, using data from Bloglines and del.icio.us. When I visit that blog entry today, I see this notation in the upper left corner of the page:

bloglines: 11 delicious: 2

In other words, the page has been cited 11 times on Bloglines and twice on del.icio.us; the links go to the details.

This tracking data is inserted by a Greasemonkey script, which is a straightforward extension of two bookmarklets (del.icio.us, bloglines) that I've been using for quite a while now.
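As a rough illustration of the idea, and not the posted script itself, here is a minimal sketch of how the notation could be assembled once the counts are known. The lookup URLs are placeholders (the services' real URL formats aren't given here), and the names SERVICES, buildBadge and badgeText are hypothetical.

```javascript
// Hypothetical lookup endpoints: placeholders, not the services' real URLs.
const SERVICES = [
  { name: "bloglines", lookup: "https://example.com/bloglines-citations?url=" },
  { name: "delicious", lookup: "https://example.com/delicious-citations?url=" },
];

// Build one { label, href } entry per service for the given page.
// A missing count is shown as 0.
function buildBadge(pageUrl, counts) {
  return SERVICES.map((svc) => ({
    label: svc.name + ": " + (counts[svc.name] || 0),
    href: svc.lookup + encodeURIComponent(pageUrl),
  }));
}

// Join the labels into the one-line notation shown in the page corner.
function badgeText(entries) {
  return entries.map((e) => e.label).join(" ");
}

const entries = buildBadge("http://example.com/post?id=1", {
  bloglines: 11,
  delicious: 2,
});
console.log(badgeText(entries)); // bloglines: 11 delicious: 2
```

Each entry's href is what the "links go to the details" part would point at; a real script would wrap each label in an anchor and position the result in the corner of the page.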

I toyed with the notion of presenting the data as sparklines (bloglines: 29 delicious: 78) -- Edward Tufte's "intense, simple, word-sized graphics" -- but decided that would require an unreasonable amount of page-fetching and HTML-reverse-engineering.

So for now it's just raw counts, and they're fascinating. Although the script I've posted runs only on InfoWorld pages, I have to admit that the version I'm using runs on every page I read, creating a realtime display that I find much more useful than what the generic toolbars (Alexa, Google) offer.

Although it's pretty straightforward to write these Greasemonkey scripts, there are two aspects of the job that feel antiquated. One is groveling around inside Web pages -- in this case, the Bloglines and del.icio.us citation pages -- using regular expressions. The other is groveling around inside the DOM (document object model) of the page into which you're inserting instrumentation.
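To make the first kind of groveling concrete, here is a sketch of regex-based count extraction. The sample HTML is invented for illustration; the actual markup of the Bloglines and del.icio.us citation pages isn't shown here, and extractCount is a hypothetical helper name.

```javascript
// Pull the first integer that directly precedes a marker word out of HTML.
// Assumes markerWord contains only letters (it is not regex-escaped).
// Returns 0 when nothing matches, so a page redesign degrades quietly
// instead of throwing.
function extractCount(html, markerWord) {
  const pattern = new RegExp("(\\d[\\d,]*)\\s+" + markerWord, "i");
  const match = pattern.exec(html);
  return match ? parseInt(match[1].replace(/,/g, ""), 10) : 0;
}

// Invented sample markup, standing in for a fetched citations page.
const sampleHtml = '<div class="hdr"><b>11 References</b> to this page</div>';
console.log(extractCount(sampleHtml, "references")); // 11
```

The fragility is the point: the pattern depends on the number and the marker word sitting next to each other in the page text, so any change to the page layout silently turns the count into 0.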

I think there's some technology (bloglines: 368 delicious: 23) floating around that could help here.

Update: Erik Kastner notes that the Greasemonkey restriction on receiving XML is a security precaution. Simon Willison's excellent questions about etiquette and privacy have convinced me to throttle the script back to just InfoWorld pages for now.

Greasemonkey raises tough issues. It's clearly something of enormous value that we should, as Simon suggests, handle with caution.

Former URL: http://weblog.infoworld.com/udell/2005/04/11.html#a1212