Talking to Julie

A couple of months ago I called Amtrak to book a train and was astounded by the performance of "Julie," the IVR (interactive voice response) agent at 1-800-USA-RAIL. My colleague Tom Yager had the same reaction to the unnamed IVR agent who handles flight, schedule, and gate info for American Airlines at 1-800-223-5436.

Here's part of my conversation with Julie. It was captured in a roundabout fashion, so it sounds kind of rough. I talked to Julie on a Vonage DigitalVoice IP phone, recorded the call on an analog tape recorder, and then, because I couldn't find a patch cord, used an external microphone to capture the .wav file. Still, it's clear that Julie is pretty darned effective. There's nothing revolutionary here, just incremental refinement of techniques that have been around for a long time. But you get the sense that differences in degree are starting to add up to a difference in kind.

I don't know if Julie is now using Web services to look up schedules and book reservations, but inevitably she will. If she's a client of those services, and so is the website, why not unify development of apps for both styles of interface? The Microsoft .NET speech API, now in beta, takes a step in that direction. The idea is that your ASP.NET application can be accessed GUI-less from a phone, or through a browser, in which case the GUI can support and streamline the IVR. As the docs explain, though, the development sequence is:

  1. Build the application for voice-only.

  2. Debug the voice-only application.

  3. Extend the application to support multimodal clients.

  4. Debug the completed application.
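
To make the "one service, two styles of interface" idea concrete, here's a minimal sketch in Python. It is not the .NET speech API, and nothing in it comes from Amtrak: the schedule data and the names find_departures, render_voice, and render_html are all hypothetical. The point is only that the voice rendering can be built and debugged first, with the GUI rendering layered onto the same lookup afterward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Departure:
    train: str
    origin: str
    destination: str
    departs: str  # 24-hour "HH:MM"


# Hypothetical in-memory schedule standing in for a real schedule service.
SCHEDULE = [
    Departure("Train 101", "Boston", "New York", "08:15"),
    Departure("Train 103", "Boston", "New York", "13:40"),
    Departure("Train 205", "New York", "Washington", "09:05"),
]


def find_departures(origin, destination):
    """Shared lookup: both front ends call this, so the logic lives in one place."""
    return [
        d for d in SCHEDULE
        if d.origin.lower() == origin.lower()
        and d.destination.lower() == destination.lower()
    ]


def render_voice(departures):
    """Voice-only front end: everything has to fit in one spoken sentence."""
    if not departures:
        return "I'm sorry, I didn't find any trains for that trip."
    count = len(departures)
    noun = "train" if count == 1 else "trains"
    times = " and ".join(d.departs for d in departures)
    return f"I found {count} {noun}, departing at {times}."


def render_html(departures):
    """GUI front end, added later: the same data shown as a table."""
    rows = "".join(
        f"<tr><td>{d.train}</td><td>{d.departs}</td></tr>" for d in departures
    )
    return f"<table>{rows}</table>"


if __name__ == "__main__":
    trip = find_departures("boston", "new york")
    print(render_voice(trip))
    print(render_html(trip))
```

Neither renderer knows where the schedule comes from, so swapping the in-memory list for a real Web service call would touch only find_departures.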

Today it would be quite unusual for an ordinary Web application to target Julie. But as she keeps improving, that could change. It's interesting to imagine what effect broader awareness of IVR could have. The limitations of voice UI force IVR developers to be very explicit about grammars and scenarios. A dose of that discipline wouldn't be a bad thing for a lot of GUI and Web applications.
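
Here's a rough sketch, again in Python, of what being explicit about grammars and scenarios might look like. Everything in it is an assumption made for illustration: the phrase list, the station codes, and the functions normalize, match_city, and next_prompt are hypothetical, and a real IVR platform would express the grammar in its own format. What matters is that every accepted phrasing is written down, and the dialog's next step is a function of which slots are still empty.

```python
import re

# Hypothetical grammar: every phrasing the application accepts is listed
# explicitly and mapped to an illustrative station code.
CITY_GRAMMAR = {
    "boston": "BOS",
    "new york": "NYP",
    "new york city": "NYP",
    "washington": "WAS",
    "washington d c": "WAS",
}


def normalize(utterance):
    """Lowercase, turn punctuation into spaces, and collapse whitespace."""
    letters_only = re.sub(r"[^a-z ]", " ", utterance.lower())
    return " ".join(letters_only.split())


def match_city(utterance):
    """Return a station code if the utterance is in the grammar, else None."""
    return CITY_GRAMMAR.get(normalize(utterance))


def next_prompt(slots):
    """Scenario: ask for the first unfilled slot in a fixed order, then confirm."""
    if slots.get("origin") is None:
        return "What city are you leaving from?"
    if slots.get("destination") is None:
        return "What city are you going to?"
    return f"Okay, {slots['origin']} to {slots['destination']}. Is that right?"


if __name__ == "__main__":
    slots = {}
    print(next_prompt(slots))
    slots["origin"] = match_city("Boston")
    print(next_prompt(slots))
    slots["destination"] = match_city("Washington, D.C.")
    print(next_prompt(slots))
```

An utterance that isn't in the grammar comes back as None, so the same question gets asked again. A GUI form rarely has to spell out its accepted inputs and its recovery path this plainly.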
