Podcast transcription, revisited

Back in May I reviewed the CastingWords transcription service as applied (recursively) to the podcast in which Nathan McFarland, Ben Hill, and I discussed how technologies such as MTurk and Mycroft are being used to distribute and coordinate work. Recently I bit the bullet and submitted the URL of my podcast feed to CastingWords along with this order for transcription:

42 cents per minute, 620 minutes of audio, 260 dollars. That's just astoundingly cost-effective, and the quality of the results is excellent. I submitted the order on August 3, and the work was done on August 9.

So I'm going to start working through the backlog in spare moments, and publishing the transcripts as I can. This morning I went through the first podcast in the series, with Gary McGraw, and published this transcript. It looks like it'll take me about an hour per episode to tweak these up to a standard I'm comfortable with. The process involves some word-level corrections, some rewriting, and some fact-checking -- though it's clear that the transcribers have already done a lot of that. In this episode, for example, the transcriber went the extra mile to correctly identify Det Norske Veritas. Impressive!

It was nice to revisit this interview with Gary McGraw, whom, coincidentally, I finally met face-to-face at a music gathering this past weekend. Our interview was a great conversation that hasn't been heard as much as the more recent episodes, because it took a while for the series to build momentum. My hope is that the availability of the transcripts will yield two benefits. First, they'll meet the needs of folks who have no time for, or interest in, listening to audio. Second, they'll be a means of discovery for folks who would prefer to listen. The other day, for example, I had the option to read or listen to the talks given at the 30th anniversary celebration of The Selfish Gene. Although reading would have been faster, the only time I had available was exercise time, so I chose to listen instead.

Update: Heribert Slama, who is Swiss and whose first language is German, adds:

Transcripts are a big help for non-native speakers (readers) of English while listening to the podcast. Colloquial speech is often real fast and pronounced rather casually. The listener must work hard and still misses some points; a transcript makes listening a pleasure;-)

Former URL: http://weblog.infoworld.com/udell/2006/08/15.html#a1506