The year in tags

Today's experimental screencast was inspired by the Juice Analytics guys. My effort to visualize the tags I've assigned to my blog entries this year won't win a Tufte award, but it was instructive nonetheless.

The procedure was as follows:

1. Grab an XML dump of my del.icio.us tags. Tool: curl.

2. Count the frequency of tag use by month, excluding singletons, and emit comma-separated text. Tool: Python.

3. Create a pivot table and chart. Tool: Excel.

4. Add a month slider to the chart; reorient and scale the tags. Tool: VBA.

5. Produce as a screencast. Tool: Camtasia Studio.

6. Tweak the FLV metadata. Tool: flvmdi.
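
Step 2 is the one piece of the pipeline that is easy to show in a few lines. What follows is a minimal sketch, not the script I actually ran. It assumes the dump is a `<posts>` element containing `<post>` elements, each with a space-separated `tag` attribute and an ISO-style `time` attribute. It also assumes that "singleton" means a tag used only once in the whole year. The function names and the sample data are made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import Counter


def tag_counts_by_month(xml_text, year="2005"):
    """Return {(month, tag): count} for one year, dropping singleton tags.

    Assumption: each <post> has tag="a b c" and time="YYYY-MM-DDThh:mm:ssZ".
    Assumption: a singleton is a tag used only once across the whole year.
    """
    root = ET.fromstring(xml_text)
    by_month = Counter()  # (month, tag) -> uses in that month
    totals = Counter()    # tag -> uses across the year
    for post in root.iter("post"):
        when = post.get("time", "")
        if not when.startswith(year):
            continue
        month = when[:7]  # "YYYY-MM"
        for tag in post.get("tag", "").split():
            by_month[(month, tag)] += 1
            totals[tag] += 1
    return {key: n for key, n in by_month.items() if totals[key[1]] > 1}


def to_csv(counts):
    """Emit month,tag,count rows, ready for an Excel pivot table."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["month", "tag", "count"])
    for (month, tag), n in sorted(counts.items()):
        writer.writerow([month, tag, n])
    return out.getvalue()


# Hypothetical sample data, standing in for the real dump.
SAMPLE = """<posts user="example">
  <post href="http://example.com/a" tag="screencasting excel" time="2005-11-03T10:00:00Z"/>
  <post href="http://example.com/b" tag="screencasting svg" time="2005-12-01T09:30:00Z"/>
  <post href="http://example.com/c" tag="screencasting" time="2005-12-15T14:45:00Z"/>
</posts>"""

if __name__ == "__main__":
    print(to_csv(tag_counts_by_month(SAMPLE)), end="")
```

The long month/tag/count layout is deliberate: one row per observation is the shape a pivot table wants, with months and tags as the two fields to pivot on.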

Too many moving parts, clearly. And even within each of the tool domains, much latent capability is inaccessible on the surface. Commenting on the Juice Analytics example, one reader said:

"I am a regular user of Excel and I don't really have a handle on when and how to use pivot tables, much less how to glue controls to a spreadsheet using VB."

Me too. Without the Gemignanis' screencast to remind me of the slider capability, and show me how to deploy it, I'd never have gotten to first base. Of course the initial data capture and transformation, which was easy for me just because I'm fluent with XML, might have similarly thwarted some expert Excel users.

Customizing the chart with VBA was no cakewalk for me either. I had to iterate through a number of techniques -- involving axis labels, data labels, or both -- before I settled on a decent compromise. As I'm always reminded whenever I dip my toe into the waters of Office development, those object models are hard to wrestle with. That's true despite the fact that you can record lots of interactive behavior into macros, and then use the generated code as a guide.

Once I settled on a treatment of the data series and its labels, I had what I wanted: a month-by-month visualization controlled by a slider, with tag frequency represented both on the Y axis and (as is customary for tag clouds) with font scaling. But it was suboptimal in two ways. First, all the per-datapoint computation precludes smooth animation. You can watch this movie in Excel, but you wouldn't want to.
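
The font scaling itself is simple arithmetic. The hard part was getting Excel's object model to apply it to each label. Here is a rough sketch of just the arithmetic, in Python rather than the VBA I used. It assumes a linear mapping from a tag's count to a point size. The 8-to-28-point range, the function name, and the sample counts are all illustrative assumptions.

```python
def font_size(count, max_count, min_pt=8.0, max_pt=28.0):
    """Linearly map a count in 1..max_count to a size in min_pt..max_pt.

    Assumption: linear scaling; the point-size range is arbitrary.
    """
    if max_count <= 1:
        return min_pt  # nothing to scale against
    share = (count - 1) / (max_count - 1)
    return min_pt + share * (max_pt - min_pt)


# Hypothetical counts for one month.
month = {"screencasting": 9, "excel": 3, "svg": 1}
peak = max(month.values())
sizes = {tag: round(font_size(n, peak), 1) for tag, n in month.items()}
```

Moving the slider means recomputing this for every data point in the newly selected month.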

And second, of course, you might not have Excel. To communicate this visualization I wanted to be able to embed it on a web page and make it almost universally accessible. Hence the screencast. In this case I used a beta version of Camtasia 3.1, which adds FLV encoding, although I've also been using ffmpeg to do the same thing. In both cases, I've had to experiment with frame-rate and key-frame settings in order to make scrubbing (scrolling) work properly.

The end result does what I intended. And it's noteworthy that the screencast can present Excel's own animation better than Excel can. (For example, try scrubbing backward and forward once the video has played through.) Still, the effect isn't as interesting or useful as I'd hoped. Not only because of the obvious problems -- compression artifacts, text truncation -- but because I'm sure there are better ways to explore this dataset.

If I sat down with a sketchpad I could probably come up with lots of ways to visualize and animate this data. But our tools constrain us severely. Excel's canned charts aren't very open-ended. And while there's useful widgetry out in the wild -- TouchGraph and TreeMap for example -- these are heavyweight components that aren't accessible to civilians, and don't support the interactive discovery of new visual styles.

We need to accelerate that process of discovery by one or two orders of magnitude. In last week's interview with Brendan Eich, Steve Gillmor asked why Firefox's inclusion of SVG matters. This is why. We need an environment that's open to users and developers, that fully embraces web standards and XML, that is dynamically scriptable, that deals with text, images, and vector graphics in the same domain, and that is tuned for rapid creation and wide propagation of memes. Firefox was all that except graphics, and now that hole's been plugged.

Redoing today's exercise in the CANVAS- and SVG-enabled Firefox 1.5 would be interesting. Building an environment on top of Firefox 1.5 that enabled people to explore other visualization possibilities would be really interesting!


Former URL: http://weblog.infoworld.com/udell/2005/12/20.html#a1357