MP3 sound bites

In the inaugural column of this series on hypermedia, I mentioned an MP3 clipping service I wrote to enable quotation of sound bites. Before I explain how it works, let's review why it exists. Audio content -- and of particular interest to me, spoken-word audio content -- is flourishing. In the tech world, Doug Kaye's ITConversations web site is a great example. It features audio interviews with IT personalities, as well as recorded speeches from conferences -- including the recent O'Reilly Open Source Convention. Kaye's audio engineering credentials are impeccable, but nowadays anyone can pick up a microphone and speak into an MP3 file. Today, for example, I listened to Dave Winer's thoughts on the business model for Wi-Fi and blogs, recorded while he was driving northward in Wisconsin. In my own journalistic work, I increasingly record and post audio interviews.

Although the amount of audio content keeps growing, the time available for listening remains constant. Until and unless we achieve a radical breakthrough in speech-to-text translation -- and I'm not holding my breath -- we'll need to find another way to make audio content more granular, and easier to consume selectively. [Full story at O'Reilly Network]

I've been using the service described here for a while now. For this column, the second in a planned series on hypermedia, I rewrote and published the code in hopes that others will be inspired to help move the project forward.

One noteworthy contribution comes from Scott Reynen: an AppleScript that shows how to drive the Real player on OS X. Scott wonders how reliably the /ramgen directory can be assumed to be present, and how AppleScript might be used to capture both the beginning and end of a clip. I don't know the answers to these questions, but maybe someone else does.

I've written about Rich Persaud's clipping service before, but it deserves another mention. My encapsulation of MP3 downloadables and his encapsulation of Real, QuickTime, and WinMedia streams should really look like the same thing.

I'm not sure how Macromedia's SWF format fits in, and neither is John Dowdell, though I guess the Flash Communication Server may hold some clues.

Here's a sketch of the scenario I envision. A canonical syntax, like the one defined by Rich's service, is available for both AV streams and AV downloadables. The service is distributed to a set of hosts on the Net, which hypermedia authors treat as interchangeable mirrors. Media players support precise selection of clip bounds, are aware of the canonical syntax, and produce it (relative to a default mirror) when a selection is made. The likes of Bloglines, Technorati, and Feedster treat all mirrored instances of a clip URL as the same URL for the purposes of conversation assembly. Wherever a clip URL appears, the link to the unabridged AV stream or file is also made available -- i.e., the clip is "transcluded" from an original context that remains accessible.
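
To make the mirror idea concrete, here's a minimal Python sketch of how an aggregator might treat mirrored clip URLs as one URL. Everything specific in it is an assumption made for illustration: the /clip path, the url, start, and end query parameters, and the example.org and example.com hosts are hypothetical, and are not the syntax of Rich's service or of mine. The only point is that if the clip's identity lives in the source address plus the time bounds, then the mirror's hostname can be ignored when grouping.

```python
from collections import defaultdict
from urllib.parse import urlencode, urlsplit, parse_qs


def make_clip_url(mirror, source, start, end):
    """Build a clip URL on a given mirror (hypothetical syntax, not a real service's)."""
    query = urlencode({"url": source, "start": start, "end": end})
    return f"http://{mirror}/clip?{query}"


def clip_key(clip_url):
    """Return (source, start, end) for a clip URL, ignoring which mirror serves it."""
    query = parse_qs(urlsplit(clip_url).query)
    source = query["url"][0]          # the unabridged AV file or stream
    start = float(query["start"][0])  # clip bounds, in seconds
    end = float(query["end"][0])
    return (source, start, end)


def group_mirrored(clip_urls):
    """Group clip URLs so that all mirrored instances of a clip share one entry."""
    groups = defaultdict(list)
    for clip_url in clip_urls:
        groups[clip_key(clip_url)].append(clip_url)
    return dict(groups)


if __name__ == "__main__":
    talk = "http://media.example.com/talk.mp3"
    urls = [
        make_clip_url("mirror-a.example.org", talk, 90, 135),
        make_clip_url("mirror-b.example.org", talk, 90, 135),
        make_clip_url("mirror-a.example.org", talk, 300, 320),
    ]
    for (source, start, end), instances in group_mirrored(urls).items():
        # The source URL is always recoverable, so the original context stays accessible.
        print(f"{source} [{start}-{end}s]: {len(instances)} mirrored instance(s)")
```

Because the key carries the source address, a reader of any clip can always get back to the unabridged file, which is the "transclusion" property described above. A real scheme would also have to decide how to treat clips whose bounds differ by a fraction of a second; this sketch treats them as different clips.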

Former URL: http://weblog.infoworld.com/udell/2004/09/07.html#a1070