Rolling our own OPML mashups

My career as an amateur explorer of social networks dates back to 2002, when the channelroll widget that I introduced into the blogosphere became popular enough to enable me to do some data mining. More recently I've mined Bloglines, and now the Share Your OPML site affords similar opportunities.

None of these mechanisms, however, is friendly to ad-hoc query. Here are five examples of the kinds of questions I'd like to ask and answer:

  1. Who recently added my feed?
  2. Who recently dropped my feed?
  3. What clusters (interest groups) emerge from the data?
  4. Which clusters are least like mine?
  5. Who are the weak ties between my clusters and foreign clusters?
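
With the raw subscription lists in hand, the first two questions reduce to a set difference between two snapshots, and the clustering questions can start from a simple overlap measure between lists. Here is a minimal sketch in Python, assuming each snapshot is a dictionary that maps a user name to that user's OPML text. The function names, user names, and feed URLs are all invented for illustration; nothing here reflects an actual Share Your OPML interface.

```python
import xml.etree.ElementTree as ET

def feed_urls(opml_text):
    """Return the set of xmlUrl values found in one OPML subscription list."""
    root = ET.fromstring(opml_text)
    # iter() walks nested outlines too, so folders of feeds are covered.
    return {o.attrib["xmlUrl"] for o in root.iter("outline") if "xmlUrl" in o.attrib}

def subscribers(snapshot, feed):
    """Users in a snapshot (user -> OPML text) whose list includes the feed."""
    return {user for user, opml_text in snapshot.items() if feed in feed_urls(opml_text)}

def added_and_dropped(old, new, feed):
    """Compare two snapshots: who picked up the feed, and who let it go."""
    before, after = subscribers(old, feed), subscribers(new, feed)
    return after - before, before - after

def jaccard(a, b):
    """Overlap of two subscription sets: 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical sample data, built by a small helper for brevity.
def opml(*urls):
    outlines = "".join(f'<outline text="feed" xmlUrl="{u}"/>' for u in urls)
    return f'<opml version="1.1"><body>{outlines}</body></opml>'

MY_FEED = "http://example.org/me.xml"
last_week = {
    "alice": opml(MY_FEED, "http://example.org/a.xml"),
    "bob": opml("http://example.org/b.xml"),
}
this_week = {
    "alice": opml("http://example.org/a.xml"),
    "bob": opml(MY_FEED, "http://example.org/b.xml"),
}

added, dropped = added_and_dropped(last_week, this_week, MY_FEED)
print(added, dropped)  # {'bob'} {'alice'}
```

A pairwise jaccard score over everyone's feed_urls sets would be one possible starting point for questions 3 and 4, though a real clustering pass would need more than this.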

For years I've been writing scripts that scurry around, follow links, scrape pages, and abuse servers in order to discover these kinds of things. It would be nice to be able to get hold of the data and do this work more efficiently.

The mission statement of Share Your OPML reads:

  "The purpose of this site is to gather a community of subscription lists, in OPML format, and aggregate them in interesting ways."

Now that we've shared our OPML, will SYO share it back so we can create and contribute our own data mashups?

