Suppose that we bloggers, collectively, wanted to migrate toward HTML coding and CSS styling conventions that would make our content more interoperable. Since none of us is starting from a clean slate, we'd need to analyze current practice. Well, now we can. Here, for example, is a concordance of use cases for HTML elements with class attributes, drawn from the database I'm building:
<a class="Troll">
<a class="listLinkLrg">
<a class="nodelink">
<a class="offlink">
<a class="regularArticleU">
<a class="weblogItemTitle">
<blockquote class="posts">
<div class="Section1">
<div class="active1">
<div class="blogtitle">
<div class="caption">
<div class="comment">
<div class="date">
<div class="inlineimage">
<div class="node">
<div class="personquote">
<div class="posts">
<li class="MsoNormal">
<p class="ArticleBody">
<p class="MsoNormal">
<p class="blogtitle">
<p class="code">
<p class="editorial">
<p class="imagelink">
<p class="posts">
<p class="q">
<p class="text">
<p class="times">
<span class="artText">
<span class="bodytext">
<span class="byline">
<span class="closed">
<span class="imagelink">
<span class="nxml-attribute-local-name">
<span class="nxml-attribute-value">
<span class="nxml-attribute-value-delimiter">
<span class="nxml-element-local-name">
<span class="nxml-tag-delimiter">
<span class="nxml-tag-slash">
<span class="nxml-text">
<span class="o">
<span class="ofp">
<span class="rss:item">
<span class="storyHead">
<span class="text">
<span class="title">
<span class="topstoryhead">
<ul class="noindent">
With only a few days' worth of accumulated content, I wouldn't dare to venture any recommendations about these use cases. But as the picture develops over time, we might start to see opportunities for convergence.
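For readers who want to try this on their own content, here is a minimal sketch of how a concordance like the one above might be produced. It is not the code behind the database. It assumes the pages have already been converted to well-formed XML, and it uses Python's standard `xml.etree.ElementTree` module. The function names `class_concordance` and `format_concordance` are hypothetical, chosen only for illustration. Walking every element and keeping those with a class attribute is equivalent to the XPath query `//*[@class]`.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Drop a namespace prefix such as '{http://www.w3.org/1999/xhtml}p'."""
    return tag.rsplit("}", 1)[-1]


def class_concordance(documents):
    """Count (element name, class value) pairs across well-formed XML strings.

    Assumption: each document is already well-formed XML. The whole class
    attribute value is kept as one string, matching the listing above.
    """
    counts = Counter()
    for doc in documents:
        root = ET.fromstring(doc)
        # Equivalent to the XPath //*[@class]: visit every element,
        # including the root, and keep those with a class attribute.
        for element in root.iter():
            css_class = element.get("class")
            if css_class is not None:
                counts[(local_name(element.tag), css_class)] += 1
    return counts


def format_concordance(counts):
    """Render each distinct pair as an opening tag, sorted by element then class."""
    return [f'<{name} class="{css_class}">' for name, css_class in sorted(counts)]


if __name__ == "__main__":
    # Hypothetical sample input, standing in for a day's worth of blog items.
    sample = [
        '<html><body><p class="posts">one</p><div class="date">x</div></body></html>',
        '<html><body><p class="posts">two</p><span class="title">y</span></body></html>',
    ]
    for line in format_concordance(class_concordance(sample)):
        print(line)
```

Because the counts are kept alongside the pairs, the same data could later show which conventions are most widely shared, which is where any case for convergence would start.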
Update: I've been hoping for some external validation of this approach, and Giulio Piancastelli provides it today. As part of a much longer posting with lots of detailed technical analysis of RDF-oriented techniques, he writes:
What Jon is searching for, I think, is a good balance between the cost of providing metadata and the benefits gained by working on the provided metadata, while trying not to entirely move away from the web world as we know it. In fact, this is probably the most important characteristic of Jon's experiment: he is working with what he is able to find right now, that is lots of HTML documents, which can be converted to be well-formed XML quite easily, and then searched by means of XPath. While these are ubiquitous technologies, it's difficult to find RDF files spreaded around as such: proving that the RDF world is query-enabled, stating that the right place where to put metadata are RDF files because you can probably get higher quality and more complete results is useless if there are little or no data to query.
From my personal perspective, I see those two worlds, one working with XML and XPath, the other messing around with RDF and RDQL, still very far from each other. Jon's experiment is helping us to become conscious of the fact we already are on a metadata path as far as web content is concerned: XML and XPath are probably the first steps in this journey, leading us to a more semantic web augmented with technologies which nowadays seems not to be successful, but that will hopefully prove to be useful when more complex needs arise. We can only hope the virtuous cycle will start to spin soon.
[Through the blogging-glass]
Amen. Thanks, Giulio!
Former URL: http://weblog.infoworld.com/udell/2004/01/31.html#a903