Dynamic categories

A while back I stopped assigning the items I post here to categories. It wasn't because I couldn't be bothered to do the categorization. Quite the contrary, I'm really interested in achieving that result, and more than willing to put some effort into it. But, although I'm generally a huge proponent of the publishing technique I call static serving of dynamically-generated pages, it increasingly seemed like the wrong way to deal with categories.

Lately it's becoming clear how the XPath search technology I've been working with will enable a fully dynamic approach to categories. For example, after posting yesterday's item, it struck me that the two labels I'd have wanted to attach to it were books and AV clips. So I added these two queries to the list of canned queries on the search page:

books: //p[contains(.//a/@href,'amazon.com') or contains(.//a/@href,'allconsuming')]

AV clips: //p[contains(.//a/@href,'.mp3') or contains(.//a/@href,'.wav') or contains(.//a/@href,'.mov') or contains(.//a/@href,'.ram')]

Each of these queries finds yesterday's item (and this one too, actually). Each also forms a result page that could serve as a category page. There are a bunch of other queries that haven't been written down yet, but that implicitly categorize the same item in other ways. For example: Doc Searls quotations. Or Jeremy Rifkin's The Age of Access. Query. Gotta love it.
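For readers who want to try the idea, here is a minimal sketch of the two canned queries in Python. It is not the code behind the search page. The standard library's ElementTree supports only a subset of XPath and has no contains(), so the substring test is done in plain Python instead. The sample entry, its URLs, and the names CATEGORIES, paragraphs_in_category, and categorize are all made up for illustration. One difference worth knowing: in XPath 1.0, contains() applied to a node-set looks only at the first node, whereas this sketch checks every link in a paragraph.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment standing in for one blog entry.
ENTRY = """<body>
<p>I just read <a href="http://www.amazon.com/example-book">a book</a> on this.</p>
<p>Here is <a href="http://example.com/clips/talk.mp3">a clip</a> of the talk.</p>
<p>No links in this paragraph.</p>
</body>"""

# Category name -> substrings to look for in link targets,
# taken from the two canned queries.
CATEGORIES = {
    "books": ["amazon.com", "allconsuming"],
    "AV clips": [".mp3", ".wav", ".mov", ".ram"],
}


def paragraphs_in_category(root, needles):
    """Return <p> elements containing a link whose href includes any needle."""
    matches = []
    for p in root.iter("p"):
        hrefs = [a.get("href", "") for a in p.iter("a")]
        if any(needle in href for href in hrefs for needle in needles):
            matches.append(p)
    return matches


def categorize(xhtml):
    """Map each category name to the matching paragraphs of one entry."""
    root = ET.fromstring(xhtml)
    return {
        name: paragraphs_in_category(root, needles)
        for name, needles in CATEGORIES.items()
    }


results = categorize(ENTRY)
for name, paragraphs in results.items():
    print(name, len(paragraphs))
```

Adding a category here means adding one entry to CATEGORIES, which is the point: the label is a query run at search time rather than something assigned when the item is posted.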

I also added some instrumentation to the search page that reports the number of entries searched (213, as of this one), and the date of the earliest entry searched (April 2003). Here are some next steps:

I'm still deciding whether to stick with Python's mini-httpd (BaseHTTPServer), or switch to something else. But here's a larger issue to consider. Most bloggers don't have the ability to maintain any non-standard server-side infrastructure. So if this approach is going to scale, it can't require that. I've been thinking about this for a while. It ties back to RSS. Any feed that includes well-formed XHTML content can deliver that content to a search service. So Technorati, or Feedster, or another service that's already in the business of aggregating and searching feeds could also offer XPath (or ultimately XQuery) services. I would love to see that happen.
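To make the feed idea concrete, here is a rough sketch of what a search service might do with a feed that carries XHTML content. It assumes an RSS 2.0 feed in which each item includes a well-formed body element in the XHTML namespace. The feed text, its URLs, and the search_feed function are hypothetical, and the substring match again stands in for a real XPath query. It also reports the number of entries searched, in the spirit of the instrumentation mentioned above.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical RSS 2.0 feed whose items carry a well-formed XHTML <body>.
FEED = """<rss version="2.0">
<channel>
<title>Example weblog</title>
<item>
<title>Entry one</title>
<body xmlns="http://www.w3.org/1999/xhtml">
<p>See <a href="http://example.com/audio/interview.mp3">the interview</a>.</p>
</body>
</item>
<item>
<title>Entry two</title>
<body xmlns="http://www.w3.org/1999/xhtml">
<p>Nothing to hear here.</p>
</body>
</item>
</channel>
</rss>"""


def search_feed(feed_xml, needles):
    """Return (entries searched, titles of items with a matching link)."""
    root = ET.fromstring(feed_xml)
    items = root.findall("./channel/item")
    hits = []
    for item in items:
        body = item.find("{%s}body" % XHTML_NS)
        if body is None:
            continue  # this item has no XHTML payload to search
        hrefs = [a.get("href", "") for a in body.iter("{%s}a" % XHTML_NS)]
        if any(needle in href for href in hrefs for needle in needles):
            hits.append(item.findtext("title"))
    return len(items), hits


searched, titles = search_feed(FEED, [".mp3", ".wav", ".mov", ".ram"])
print(searched, titles)
```

Nothing in this sketch runs on the blogger's own server. The feed is the only thing the blogger has to publish, which is what would let an aggregator take on the search work.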


Former URL: http://weblog.infoworld.com/udell/2004/01/15.html#a887