The Internet of Things: A New Evolution for Speech Technology
“The ideal Internet of Things would be controllable by multiple people. The challenges to that are accurate biometrics and voice identification, fast switching and data collection between multiple user profiles, and judicious withholding of sensitive information,” he says. The desired scenario, he adds, is one in which an interface could not only detect which users are within earshot of the intelligent assistant but also accept commands from all of them in the context of a single conversation, and withhold any privileged information if it is requested by an unauthorized user, even if the conversational logic prompts a voice response. In these situations, the ideal IoT would allow for a seamless transition between a speech interface and a touch screen or other non-voice device, all without losing the context of the conversation. It is a very tall order indeed, and one that Melfi thinks will, for now, impede the utility of speech as a UI for the Internet of Things.
“People who want to use their intelligent assistants for financial services like personal banking are going to run into issues of verification or exposure of personal information,” Melfi says. “And that’s either going to limit what speech can do for the IoT or keep the IoT from being a many-to-many usership. In the meantime, there will just be some things it’s better to do on a screen.”
A Quantum Leap Through the IoT
Experts remain energized by the emerging markets. “The Internet of Things is an amazing opportunity for speech,” Dahl says.
“The exciting thing,” Bouzid says, “is that enterprises are becoming aware that these interfaces are a serious channel.” Bouzid predicts that the IoT will be as dominant as mobile devices in the next three years and will serve as an excellent conduit for customer service automation that is requested by users rather than forced on them. That, in his view, will make the IoT home environment a go-to option for CRM and lead to an increase in IoT-based applications that augment speech recognition.
In the midst of all of this strategizing, speech is poised for a quantum leap in recognition, AI, and context. A more open developers’ market and deeper penetration of the IoT mean that speech is about to have an unprecedented role in the daily lives of people across a wide demographic. As speech technology engages with this massive new user base, a corresponding spike will occur in the data necessary to train AI and fuel machine learning, improving performance and recognition and broadening dictionaries for a small group of immense servers and deep neural networks designed to provide optimal intelligent assistance through speech.
Users and businesses alike can expect massive advancements in humanlike intelligent assistance and further improvements to speech recognition and transcription performance as the data pours in. Further proprietary entrenchment or database partnerships for the data collected will likely shape the advancements and determine the industry’s leaders. And it may well be that the next few years will see a race between major consumer platforms like Apple and Amazon that is less about which will innovate most in IoT and speech technology than about which will market itself well enough to collect the superior data. Operating calmly beside them will likely still be nimble companies like Melfi’s VoiceBox Technologies, which will seek to serve a smaller but still substantial fraction of an ever-growing market through innovation and development.
Tye Pemberton is a freelance writer based in Linwood, N.J. He can be reached at tyepemberton@gmail.com.