Testing for Windows rot

It's nice to see the New York Times mentioning Ward Cunningham as the father of Wiki. I wonder, though, whether another of Ward's efforts -- Extreme Programming, and in particular his advocacy of test-driven software development -- might not ultimately affect more people's lives.

The column in which I interviewed Ward attracted a lot of notice. Bill de hÓra wrote:

Stop what you're doing. Ward Cunningham is quite possibly the most vital actor and thinker on software development over the last ten years. [Bill de hÓra]

Tim Bray wrote:

A couple of [Ward's] remarks have been creating rumbling echoes that won't die down in the back of my brain. [ongoing]

One of the things Ward told me about in that interview, which I omitted from my writeup -- because, frankly, I didn't really get the significance of it at the time -- is the Fit framework. The O'Reilly Network has a good article about Fit. You can see it in action at the Fit Wiki, about which Ward writes, in his typically direct way:

This site is about tests that people can read. Here is a sample. Green is good.

Fit uses the simple construct of an HTML table on a Web page to synchronize the activities of testers and developers. There is art involved on both sides. The tester's art is to represent conditions and expected outcomes in tabular form. It turns out that all kinds of software behavior can be represented in this way, but doing so effectively takes thought, skill, and experience. The developer's art is to write the "fixtures" that map between the tabular representation and the code under test. Also a subtle game. In the end, the magical effect is this: the same HTML table that represents the tests becomes the dashboard that displays test results.
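To make the table-plus-fixture idea concrete, here is a minimal sketch in Python. It is not the Fit API. The table contents, the `divide` function, and the names `division_fixture` and `run_table` are all hypothetical, and the convention that a trailing "()" marks a computed column is an assumption made for this example. What it shows is the round trip: a table of inputs and expected outcomes goes in, and the same table comes back out as HTML with its result cells colored.

```python
from html import escape

# Hypothetical test table: the first row names the columns; a trailing "()"
# marks the column whose value is computed by the code under test.
TABLE = [
    ["numerator", "denominator", "quotient()"],
    ["1000", "10", "100.0"],
    ["50", "4", "12.5"],
    ["100", "4", "24.0"],  # deliberately wrong expectation, to show a red cell
]

def divide(numerator, denominator):
    """Stand-in for the code under test."""
    return numerator / denominator

def division_fixture(inputs):
    """Fixture: map one row of table text onto a call to the code under test."""
    return str(divide(float(inputs["numerator"]), float(inputs["denominator"])))

def run_table(table, fixture):
    """Run each row; return the table as HTML with result cells colored, plus a failure count."""
    header, *rows = table
    input_names = [name for name in header if not name.endswith("()")]
    html = ["<table>",
            "<tr>" + "".join(f"<th>{escape(h)}</th>" for h in header) + "</tr>"]
    failures = 0
    for cells in rows:
        row = dict(zip(header, cells))
        actual = fixture({name: row[name] for name in input_names})
        out = []
        for name, expected in row.items():
            if not name.endswith("()"):
                # Input cell: echoed back unchanged.
                out.append(f"<td>{escape(expected)}</td>")
            elif actual == expected:
                # Green is good.
                out.append(f'<td style="background:#cfc">{escape(expected)}</td>')
            else:
                failures += 1
                out.append(f'<td style="background:#fcc">{escape(expected)} expected, '
                           f'{escape(actual)} actual</td>')
        html.append("<tr>" + "".join(out) + "</tr>")
    html.append("</table>")
    return "\n".join(html), failures

report, failures = run_table(TABLE, division_fixture)
print(report)
```

The tester only ever edits `TABLE`, and the developer only ever edits the fixture. Neither has to read the other's half to understand the colored report.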

[Image: dry rot]
What got me thinking about this, again, was this Scientific American article on self-repairing computers. A couple of days ago, while wasting an afternoon undoing some rot that had crept into a Windows XP installation, I wished I could fast-forward to that happy world of micro-rebooting and rapid recovery. But when I stop to think about it, Windows rot isn't really about catastrophic failure; it's about, well, rot. I love this description:

The problem with WinRot is that its a process that just seems to "happen" over a period of time. There's no warning, no messages in the event log, no "Windows would like to rot now. Is this ok? Yes/No" dialog. Nothing. [Jim O'Halloran's Weblog]

In my case, two bizarre symptoms appeared on the same day: a glitch in Radio, and another involving MSIE and Amazon.

I surmised these two oddities were related to a slew of software installations I'd done recently, so I began playing the System Restore roulette game. Pick a day, revert to that day. Problem solved? If yes, assess collateral damage. If no, pick an earlier day and repeat. After three tries I was back 12 days, and the Radio glitch was solved. But not the MSIE/Amazon glitch. Applying the most recent MSIE patch did solve that one too, for reasons I'll never know. Then came the collateral damage. Office 2003 apps got unregistered, as did the SpamBayes add-in for Outlook 2000. My Jython-based email searcher broke because of a rollback to a previous CLASSPATH. And there were a few more things like that.

Probably it was software installation or uninstallation that caused the two symptoms of rot. Maybe not. Either way, lack of immediate feedback made recovery much worse. System Restore is actually a darned useful utility, but if you don't catch the cancer early you're in for a painful treatment. Problem was, the disease attacked functions that I don't use every day, or even every week.

I wonder if regular testing of installed applications could enable early detection. And if so, I wonder how that might practically be done.
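One way it might be done, sketched in Python under loud assumptions: keep a list of checks, each pairing a command that exercises some installed function with text its output should contain, and run the whole list on a schedule. Everything here is hypothetical, including the names `CHECKS`, `run_check`, and `run_all`. The two sample entries only exercise the Python interpreter itself so that the sketch runs anywhere; real entries would invoke whichever applications you depend on, and GUI applications would need some scripting hook that this sketch does not address.

```python
import subprocess
import sys

# Hypothetical checks: (name, command to run, text the output should contain).
# These samples only poke the Python interpreter so the sketch runs anywhere.
CHECKS = [
    ("python starts", [sys.executable, "-c", "print('ok')"], "ok"),
    ("json module loads",
     [sys.executable, "-c", "import json; print(json.dumps([1]))"], "[1]"),
]

def run_check(name, command, expected, timeout=30):
    """Run one command; pass if it exits 0 and its output contains `expected`."""
    try:
        result = subprocess.run(command, capture_output=True, text=True,
                                timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # The program is missing, cannot be launched, or hung.
        return name, False, f"could not run: {exc}"
    if result.returncode != 0:
        return name, False, f"exit code {result.returncode}"
    if expected not in result.stdout:
        return name, False, f"output lacked {expected!r}"
    return name, True, "ok"

def run_all(checks):
    """Run every check and return a list of (name, passed, detail) tuples."""
    return [run_check(*check) for check in checks]

if __name__ == "__main__":
    for name, passed, detail in run_all(CHECKS):
        print(f"{'PASS' if passed else 'FAIL'}  {name}: {detail}")
```

Run daily, a script like this would at least narrow the question "which day did it break?" to a single restore point, which is exactly the feedback the roulette game lacks.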

Yes, I do also use Mac OS X. And while I can't confirm rot, I do detect questionable odors. In the final analysis, any complex system can benefit from regular and disciplined verification.

Former URL: http://weblog.infoworld.com/udell/2003/05/20.html#a695