Dragon NaturallySpeaking

Like many who pound keyboards more than is healthy, I struggle with RSI (repetitive stress injury). When it started, about seven years ago, I began looking into voice recognition as a way to help minimize my keystroke output. At the time I concluded that the state of the art -- in both hardware and software -- wasn't where it needed to be. At intervals since then I've repeated the experiment and come to the same conclusion. But at some point, a series of incremental gains adds up to a tipping point. And for me, at least, voice recognition may finally be tipping.

This week I installed the latest version, Dragon NaturallySpeaking 8, and captured a 7-minute screen video of my initial experience with the product:

Introduction to NaturallySpeaking: Windows Media, Flash, QuickTime

(You'll hear some background noise. That's because I couldn't figure out how to share the headset microphone between Dragon and Windows Media Encoder, so I used a separate external microphone to capture the voiceover.)

It's been a couple of years since I tried dictation, so what you're seeing in that video is basically a new user of the product learning not only how to dictate but also how to edit with voice commands. Training, prior to this video, was minimal. I spent a few minutes reading a prepared text. But I declined the offer to have Dragon absorb samples of my writing, and just dove right in. The result was, by far, the best out-of-the-box experience I've ever had with this technology.

Subsequently I let Dragon read all of the weblog postings I've written in the last couple of years. Following that, I dictated and sent an email message, reasonably efficiently and in a completely hands-free manner. That's something I've never done before.

Will I become a regular user? Probably not. I've learned to manage my RSI problem with a regimen of stretching and exercise. I can still produce correct copy much faster with my fingers than with my voice, and much of the editing I do involves constructs (XHTML markup, programming-language punctuation) that aren't (yet) open to voice command. Still, it's great to know that if I want to give my hands a rest now and then, there's an alternate way to produce prose.

Twenty years ago, for a master's thesis, I interviewed one of the original researchers in the field of natural language processing. Expect no breakthroughs, he said: "It's more about perspiration than inspiration." He was right. It's been a ground game: three yards, a cloud of dust, three more yards and another cloud of dust. But when I look back I can see the distance we've come. And when I look forward, I anticipate similar gains.


Former URL: http://weblog.infoworld.com/udell/2004/11/04.html#a1108