Data formats for digital democracy: XML vs CSV

As a first experiment, I grabbed the DCStat reported-crime feed for November, sucked it into Excel 2003, consolidated incidents by day, pivoted them on type of offense (homicide, burglary), and exported the result as a CSV (comma-separated values) file that Swivel could import. [Full story at InfoWorld.com]
Here's one of those pivot tables in Swivel. The auto-generated charts don't do much for this style of dataset. But the point of this week's column is that just publishing a named dataset, along with pointers to the raw data, is inherently valuable.
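
For readers who would rather script the consolidate-and-pivot step than do it in Excel, here is a minimal sketch using only the Python standard library. The record layout, the field names (date, offense), and the sample values are assumptions made up for illustration; they are not taken from the DCStat feed.

```python
import csv
import io
from collections import Counter

# Hypothetical incident records. Field names and values are invented
# for illustration and do not come from the DCStat feed.
incidents = [
    {"date": "2006-11-01", "offense": "burglary"},
    {"date": "2006-11-01", "offense": "burglary"},
    {"date": "2006-11-01", "offense": "homicide"},
    {"date": "2006-11-02", "offense": "burglary"},
]


def pivot_by_day(records):
    """Count incidents per day and offense type.

    Returns a header row and one data row per day, with one column
    per offense type (a simple pivot table).
    """
    counts = Counter((r["date"], r["offense"]) for r in records)
    days = sorted({r["date"] for r in records})
    offenses = sorted({r["offense"] for r in records})
    header = ["date"] + offenses
    # Counter returns 0 for (day, offense) pairs that never occurred.
    rows = [[day] + [counts[(day, o)] for o in offenses] for day in days]
    return header, rows


def to_csv(header, rows):
    """Serialize the pivot table as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


header, rows = pivot_by_day(incidents)
pivot_csv = to_csv(header, rows)
print(pivot_csv)
```

The resulting text is the kind of plain, one-row-per-day table that a CSV importer can read without further massaging.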

I imported the same data into Dabble DB, where it's very easy to use grouping and filtering to make views like this one. Again, the point is that the views are sharable on the web. Also, in this case, invited collaborators can tweak them.

Going through this exercise, I was struck by the distance between DCStat's namespace-rich XML formats and the CSV format that web apps like Swivel and Dabble DB want to read and write. I happen to know how to use the XML Maps feature of Excel 2003 to shred an XML file, but I doubt many Excel users have ever done that. To enable ordinary citizens to explore this data, DCStat might want to offer a common-denominator CSV format in addition to the XML flavors.
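
To make that distance concrete, here is a rough sketch of shredding a namespaced XML feed into CSV with Python's standard library instead of Excel's XML Maps. The sample document, its namespace URIs, and its element names (entry, reportdate, offense) are hypothetical stand-ins; the real DCStat schema is not reproduced here.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical feed. The namespaces and element names are invented for
# illustration and are NOT the actual DCStat schema.
SAMPLE_XML = """
<feed xmlns="http://example.org/feed" xmlns:c="http://example.org/crime">
  <entry>
    <c:reportdate>2006-11-01</c:reportdate>
    <c:offense>burglary</c:offense>
  </entry>
  <entry>
    <c:reportdate>2006-11-02</c:reportdate>
    <c:offense>homicide</c:offense>
  </entry>
</feed>
""".strip()

# Prefix-to-URI map used when searching. The prefixes are local to this
# script and need not match the ones used in the document.
NS = {"f": "http://example.org/feed", "c": "http://example.org/crime"}
FIELDS = ["reportdate", "offense"]


def xml_to_csv(xml_text):
    """Flatten each entry element into one CSV row."""
    root = ET.fromstring(xml_text)
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(FIELDS)
    for entry in root.findall("f:entry", NS):
        # A missing child element becomes an empty cell.
        writer.writerow(
            [entry.findtext("c:" + name, default="", namespaces=NS) for name in FIELDS]
        )
    return buf.getvalue()


csv_text = xml_to_csv(SAMPLE_XML)
print(csv_text)
```

Even in this toy case, the reader has to know both namespace URIs and the nesting of the elements before a single row comes out, which is the sort of hurdle a ready-made CSV download would remove.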


Former URL: http://weblog.infoworld.com/udell/2006/12/13.html#a1578