October 1, 2008
By Moshe Yudkowsky President - Disaggregate
Industry View

The Creepiness Factor

The voice user interface (VUI) world divides into two basic camps: dentists and entertainers. Dentists view the dialogue between the user and the system as something that both parties want to get over with as soon as possible. Entertainers view the interaction as a miniperformance staged for the benefit of the user.

I come down squarely on the side of the dentists. I don’t call interactive voice response systems because I’m bored. I call because I have a specific transaction in mind, and the faster and more efficient the transaction the better. I don’t want to hear about special offers, I don’t want to take surveys, and I don’t want to enter the same information over and over again. And I particularly don’t appreciate calls that begin with "You can access this information via our Web site" because I wouldn’t have made the phone call in the first place if I had an Internet connection.

But I admit that I do see some sense in regarding the dialogue as entertainment—specifically, as a magic show. Magicians misdirect the audience’s attention through bright objects, hand motions, and pretty girls; the audience never notices the wires, hidden boxes, and subtle hand motions that cause the volunteer from the audience to pick the one card out of the deck that makes the trick work. The same holds true for VUIs: A good user interface subtly tries to force the caller to use words in the system’s vocabulary. The other goal is to make the dialogue appear as smart as one that comes from a human agent.

But I’m a dentist and not an entertainer because systems that are too clever tend to become creepy. One demonstration I heard a few years back had the oddest set of prerecorded prompts that I have ever heard. If you ordered a pair of boots, the system would say, "Oh, those are great. My nephews really like those boots." I can cope with the idea that salespeople will lie, but a company that goes to the trouble of creating prerecorded lies? That’s just plain creepy.

For another example, read a revealing article by Ben Fry called "Talking to the Machine" on the Creative Loafing blog. Here’s an interesting bit of Fry’s description of his conversation with a male voice—which he calls "he (it)" because the dialogue was so lifelike—just after Fry declined to purchase any additional services: "He (it) paused, and in a barely concealed dejected tone, asked me why I wasn’t interested. I told him (it) I don’t need phone service because I only use a cell phone and I had already called and placed an order to transfer my cable service."

Doesn’t this strike you as creepy? Not only has Fry almost lost sight of the idea that he’s talking to a machine, but he argues with the IVR system. At the end: "After his (its) last feeble effort to help me with some additional services, he (it) ended the call by telling me to have a nice day. ‘You too, man,’ I replied before I could even stop myself."

That last sentence will make the VUI designer proud, and well it should, because the dialogue certainly met its marketing goals. On the other hand, Fry seems to think in retrospect that the entire experience was creepy.

I Like It

I’m not against personalization; in fact, I’m a strong advocate of it. If I call your IVR system every day and I always choose option 3, I would not mind if the system offered me option 3 by default.

Bruce Balentine of Enterprise Integration Group has done plenty of research showing that callers like a brisk, goal-directed dialogue that includes normal verbal courtesies, but they don’t like dialogues that attempt to be their best friends. Too much personalization can lead to trouble. If the system seems omniscient, users may find the experience too creepy. For example, when a New York brokerage firm installed caller ID several years ago, it started to greet callers by name immediately: "Good morning, Mr. Jones. How are you today?" Callers found this so disconcerting that the firm instructed its brokers to pretend they didn’t know who was calling.

Balentine also told me about new dialogues that use specific personal information, such as place of birth or previous residences, to verify identities. In postcall interviews, callers use the exact word "creepy" to describe the dialogue.

So how do you become a good dentist? At Bell Labs, I took a blue pencil to extraneous words in a dialogue to make the interaction more brisk. I justified my actions by claiming—without factual support; I just made the number up—that for each second I lopped off the dialogue, AT&T would save $1 million a month in connection charges. Internal customers accepted my reasoning, and the end product was quite successful. I was wrong about the cost savings; it wasn’t $1 million per month, it turned out to be only $1 million per year.

Moshe Yudkowsky, Ph.D., is president of Disaggregate Consulting and author of The Pebble and the Avalanche: How Taking Things Apart Creates Revolutions. He can be reached at speech@pobox.com.
