Nuance Plans a (Truly) Hands-Free Phone
If Nuance Communications has its way, in a year or two, mobile phone users will never have to touch their devices again. The company is working with several chip makers and mobile handset manufacturers on a persistent, low-power way for phones to continuously listen for voice commands from their users.
The phone will be able to detect that its owner is speaking, wake itself up, and perform the requested task, such as sending or receiving text or email messages, searching the Web, dialing a contact, or posting to Twitter. All this could happen even when the phone is dormant—in power-saving "sleep" mode.
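To make the idea concrete, here is a minimal sketch, in Python, of the flow described above: the device notices that its owner is speaking, wakes, and dispatches the request to a task. Every name and rule in it is a hypothetical stand-in for illustration, not Nuance's design.

```python
# A minimal, hypothetical sketch of the flow the article describes:
# detect that the owner is speaking, wake, and run the requested task.
# Nothing here reflects Nuance's actual implementation.

from dataclasses import dataclass


@dataclass
class Utterance:
    speaker_id: str  # who spoke, as a voice-biometric system might label it
    text: str        # what a speech recognizer transcribed


def handle_if_owner(utterance: Utterance, owner_id: str) -> str:
    """Act only when the phone's owner is speaking; otherwise stay dormant."""
    if utterance.speaker_id != owner_id:
        return "ignored (not the owner)"
    command = utterance.text.lower()
    # Dispatch to the kinds of tasks the article lists.
    if command.startswith("text "):
        return "sending text: " + command[len("text "):]
    if command.startswith("search "):
        return "searching the web for: " + command[len("search "):]
    if command.startswith("call "):
        return "dialing contact: " + command[len("call "):]
    return "no matching task"


# Simulated events; on a real phone a low-power listener and speech engine
# would produce these.
print(handle_if_owner(Utterance("owner", "Text Anna I'm running late"), "owner"))
print(handle_if_owner(Utterance("stranger", "Call the bank"), "owner"))
```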
Vlad Sejnoha, Nuance's chief technology officer, says this capability is part of a larger movement that has already begun, in which handsets and speech processors are being endowed "with a lot of sense and contextual awareness."
He notes that this same movement is elevating speech from just another add-on in the phone's technology stack to the central interface. This started, he says, with Apple's Siri virtual assistant application, for which Nuance has long been rumored to provide the speech engine.
"Consumers have shown a great interest in these speech interfaces," Sejnoha states. "We clearly see that with Siri."
But the technology can go much further, he says. "You should be able to tell your device what you want it to do without having to pick it up and scroll through several screens to find the app you need. The phone, in one shot, should be able to recognize that it's being spoken to and that it is the right person speaking to it, and then do the job or task requested."
Such an app would represent a big step in making mobile devices more user-friendly, Sejnoha points out. "Simply activating these devices can be a real problem for the user," especially in situations like driving.
To make all this happen will be a challenge, he acknowledges, partly because of all the components that will have to come together seamlessly. First, the phone will need full speech recognition and speech synthesis capabilities, but it will also require natural language understanding, voice biometrics, and even a level of artificial intelligence.
"The system would need a sense of judgment," Sejnoha states.
More challenging will be the issue of power consumption. Having the phone constantly in some kind of "listening" mode will quickly drain the battery unless a low-power solution is found. "We would need something to control what part of the phone is active and what part remains dormant," Sejnoha says. "It would have to be tied into the [phone's] hardware so that it only runs what it needs."
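The following toy example illustrates that duty-cycling idea: only a tiny trigger check stays resident, and the heavyweight recognizer is created only after a trigger fires. The classes and the trigger condition are invented for the example; on a real handset the always-on piece would live on dedicated low-power hardware, not the main processor.

```python
# A toy illustration of duty-cycling: keep only a cheap trigger check
# resident, and bring the expensive recognizer up only when it is needed.
# The classes and the trigger condition are invented for this example.

class TinyTriggerDetector:
    """Always-on, cheap check; on real hardware this would run on a DSP."""
    def triggered(self, frame: bytes) -> bool:
        return len(frame) > 0 and frame[0] == 0xFF  # placeholder condition


class FullRecognizer:
    """Expensive component; instantiate only after a trigger fires."""
    def transcribe(self, audio: bytes) -> str:
        return "<transcript of " + repr(audio) + ">"


def listen(frames):
    detector = TinyTriggerDetector()
    recognizer = None                      # dormant until needed
    for frame in frames:
        if not detector.triggered(frame):
            continue                       # most of the time, do almost nothing
        if recognizer is None:
            recognizer = FullRecognizer()  # wake the heavy path lazily
        print(recognizer.transcribe(frame))


listen([b"\x00\x00", b"\xff\x01", b"\x00"])  # only the second frame wakes the phone
```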
Interoperability, or building a chipset or software bundle that works across every available device, mobile operating system, and network, is another issue, but one Sejnoha says can be overcome. "Certain algorithms will be reusable, but some will have to be adapted for each specific device," he says.
Security would be a concern unless voice identification is incorporated into the application to keep the phone from handing its data to anyone who asks for it. For that, Nuance has a number of voice security applications, developed in-house or gained through acquisitions.
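A hedged sketch of that kind of voiceprint check: compare an embedding of the utterance against the enrolled owner's and accept only above a similarity threshold. The vectors and threshold are toy values chosen for illustration and say nothing about how Nuance's products actually score a match.

```python
# A toy voiceprint check: accept an utterance only if its embedding is
# close enough to the enrolled owner's. The vectors and threshold are
# made-up values for illustration only.

import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def accept(enrolled_voiceprint, utterance_embedding, threshold=0.8):
    """Hand over data only if the voice plausibly matches the owner's."""
    return cosine(enrolled_voiceprint, utterance_embedding) >= threshold


owner = [0.9, 0.1, 0.3]      # enrolled voiceprint (toy numbers)
caller = [0.88, 0.15, 0.28]  # close match -> accepted
stranger = [0.1, 0.9, 0.2]   # different voice -> rejected
print(accept(owner, caller), accept(owner, stranger))  # True False
```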
At a more advanced stage, the phone could even be proactive, notifying the user when a new email or text message arrives, for example. "There's so much information out there that is difficult to find and manipulate," Sejnoha says. "Speech interfaces can make it so much easier."
"There certainly is a broad range of applicability," he says, noting that such technology could be used as much in business scenarios as for personal tasks.
But even if Nuance and its development partners get past those other hurdles, there will still be privacy fears among a large segment of the population who might be concerned about their phones listening to their every word. The phone manufacturers and carriers that eventually deploy the technology will need to assure their customers that the phone is only listening for specific prompts and not "taking notes" on every conversation that goes on around it.
Sejnoha doesn't see this as a stumbling block. "We're already starting to transition from speech-based IVRs to more full-featured virtual assistants that can do this sort of thing," he states.