A lonely job

I have a superpower that enables me to do battle with the evil of data lock-in. I can't leap tall buildings or crush lumps of coal into diamonds, but when I look at the barriers that divide one data format from another, they seem hardly to exist. For me, data transformation is almost an autonomic reflex, like breathing.

But PDF? Please, oh please, don't make me dig the data out of those PDF files. [Full story at InfoWorld.com]

While I was offline for a couple of days, my tale of data liberation garnered some sympathy. "PDF2XLS," Tim Bray said, "that's deeply sick." Indeed. I should mention that the villain of the piece was USAA, an otherwise pretty good financial services company that I hope will find its way into the 21st century when it comes to making its customers' data usefully available.

The hero of the piece -- apart from emacs, that is, whose regular-expression search-and-replace function got me where I needed to go -- was Investintech's Able2Extract. I've only used that program on a trial basis. If I'd spent more time with it, I could probably have gotten it to produce something closer to the final result I needed. But then, I'd rather not have to spend more time on such nonsense.

Of course that's just wishful thinking. I'll probably be fighting data lock-in for the rest of my life. I can't disavow my superpower. So long as data yearn to be set free, I'll be coming to the rescue, my utility belt festooned with regexes and parsers and scripts. It's a lonely job, but somebody's gotta do it.


Former URL: http://weblog.infoworld.com/udell/2006/04/13.html#a1426