Structured change detection

Consider two versions of a Word document saved as XML. There are "structured diff" tools that can map the changes at an intermediate level, in terms of XML elements. For example, IBM's alphaWorks site offers the XML Diff and Merge Tool for Java, while Microsoft's GotDotNet site offers XML Diff and Patch for .NET. Both of these free tools can track element-level changes. To get a sense of what's possible, check out Monsell EDM's online demo of its DeltaXML technology. The demo compares two subtly different versions of a complex graphic -- the standard SVG (Scalable Vector Graphics) "tiger" benchmark -- and animates the differences between the two. It's stunningly cool.
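To make "element-level change" concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. It is not how any of the tools named above work. The function name diff_elements and the path notation are made up for illustration, and children are matched purely by position, a naive assumption that a real structured diff tool would replace with smarter matching.

```python
import xml.etree.ElementTree as ET
from itertools import zip_longest


def diff_elements(old, new, parent="", index=0):
    """Return human-readable change records for two XML elements.

    Hypothetical helper: either argument may be None, meaning the
    element exists on only one side of the comparison.
    """
    tag = (new if new is not None else old).tag
    here = f"{parent}/{tag}[{index}]"
    if old is None:
        return [f"added {here}"]
    if new is None:
        return [f"removed {here}"]
    if old.tag != new.tag:
        return [f"replaced {parent}/{old.tag}[{index}] with {new.tag}"]

    changes = []
    # Attribute-level changes on the same element.
    for name in sorted(set(old.attrib) | set(new.attrib)):
        before, after = old.attrib.get(name), new.attrib.get(name)
        if before != after:
            changes.append(f"attribute {here}@{name}: {before!r} -> {after!r}")
    # Leading text content, ignoring surrounding whitespace.
    old_text = (old.text or "").strip()
    new_text = (new.text or "").strip()
    if old_text != new_text:
        changes.append(f"text {here}: {old_text!r} -> {new_text!r}")
    # Naive assumption: children are paired by position only.
    for i, (a, b) in enumerate(zip_longest(list(old), list(new))):
        changes.extend(diff_elements(a, b, here, i))
    return changes
```

Called on two parsed documents, for instance diff_elements(ET.fromstring(old_xml), ET.fromstring(new_xml)), it reports changes in terms of elements, attributes, and text rather than lines. Because matching is positional, inserting one element near the top makes everything after it look changed, and text that follows a child element (its "tail") is ignored. Both are shortcuts taken to keep the sketch short.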

As XML becomes the standard way to represent prose, graphics, and other content, we should expect such change visualization to become routine. What about code? It has sections, subsections, and paragraphs, too. XML isn't -- and probably shouldn't be -- the primary way we read and write code. But the underlying abstract syntax tree has structure that can -- and arguably should -- help us see and comprehend the code's evolution. [Full story at]
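As a rough illustration of what syntax-tree-aware change detection for code could look like, here is a small Python sketch built on the standard ast module. The helper names top_level_defs and structural_changes are hypothetical, and the approach (comparing a dump of each top-level definition) is just one simple assumption about how such a comparison might work, not a description of any existing tool.

```python
import ast


def top_level_defs(source):
    """Map each top-level function or class name to a dump of its syntax tree."""
    tree = ast.parse(source)
    return {
        # ast.dump leaves out line and column positions by default,
        # so layout-only edits produce identical dumps.
        node.name: ast.dump(node)
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    }


def structural_changes(old_source, new_source):
    """Report which top-level definitions were added, removed, or changed."""
    old_defs = top_level_defs(old_source)
    new_defs = top_level_defs(new_source)
    report = []
    for name in sorted(set(old_defs) | set(new_defs)):
        if name not in old_defs:
            report.append(("added", name))
        elif name not in new_defs:
            report.append(("removed", name))
        elif old_defs[name] != new_defs[name]:
            report.append(("changed", name))
    return report
```

Since the comparison runs on the parsed tree rather than on lines of text, edits that touch only comments or whitespace do not register, while a changed expression inside a function marks that function as changed. Docstrings are part of the tree, so editing one does count as a change in this sketch.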

Ordinarily readers call me on stuff like this, but for once I get a chance to beat them to the punch. This column certainly should have mentioned that Subversion, the open source project that aims to replace CVS, reached its 1.0 release last week. It looks really good, and I'm investing some time in learning how to deploy and use it.

Subversion's support for copying and renaming files and directories aims to reduce one of CVS's worst points of friction. Since I work with lots of XML data -- including just about everything I write -- I'm also eager to try plugging in some structured diff programs.
