Top Trends in Speech Technology for 2018
Convergence of AI and Speech Technology
Another area where industry watchers are training their focus is the continued integration of artificial intelligence (AI) with speech technology. Expect both machine learning (ML) and natural language understanding (NLU) to play their part. Miller says, “With machine learning, we can train these conversational voice-first tools to be more elegant and effective over time, just through usage.” Relatedly, NLU maps human utterances to meaning. “And that corpus of utterances just gets bigger over time,” says Miller.
Michels believes that analytics solutions that listen to conversations will be more proactive and gain more insights going forward. He says, “Being able to talk to machines doesn’t interest me. It’s about how the ability to do that enables better communications.” An example he cites is a telecom provider receiving a flurry of customer inquiries about an outage. “Analytical tools could pick up the word ‘outage’ from calls and see what area codes people are calling from,” Michels says. “Then they could be more proactive in troubleshooting the problem.”
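The outage scenario Michels describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any vendor's implementation: it assumes calls have already been transcribed to text, that caller numbers are 10-digit North American numbers, and the function name `outage_hotspots` and the sample data are hypothetical.

```python
import re
from collections import Counter


def outage_hotspots(calls, keyword="outage", min_calls=2):
    """Count calls mentioning `keyword`, grouped by caller area code.

    `calls` is an iterable of (caller_number, transcript) pairs.
    Assumption: numbers are 10-digit North American numbers,
    optionally prefixed with the country code 1.
    """
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    counts = Counter()
    for number, transcript in calls:
        if not pattern.search(transcript):
            continue  # this call never mentions the keyword
        digits = re.sub(r"\D", "", number)  # keep digits only
        if len(digits) == 11 and digits.startswith("1"):
            digits = digits[1:]  # drop the leading country code
        if len(digits) == 10:
            counts[digits[:3]] += 1  # first three digits = area code
    # Report only area codes with enough matching calls, busiest first.
    return [(code, n) for code, n in counts.most_common() if n >= min_calls]


# Hypothetical sample data for illustration only.
sample_calls = [
    ("+1 (212) 555-0101", "My internet is down, is there an outage?"),
    ("212-555-0102", "Reporting an OUTAGE in my building"),
    ("415-555-0103", "I have a billing question"),
    ("415-555-0104", "Is this an outage or my router?"),
]

print(outage_hotspots(sample_calls))
```

A real analytics product would work on live audio and far messier language, but the principle is the same: spot the keyword, group by where callers are, and flag the areas where mentions cluster.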
Another example of the AI/voice tech convergence comes from Dialpad, which operates the UberConference service for business conference calls. Dialpad announced in May that it had acquired real-time speech recognition provider TalkIQ. That means UberConference users will be seeing real-time call transcription and sentiment analysis; a Smart Notes feature that automatically takes notes on the conversation; and even contextually sensitive coaching via real-time feedback and recommended responses to customer questions, helping sales and support representatives respond to questions about pricing, new features, and the competition.
Yudkowsky sees the convergence of AI and speech broadening out to encompass other datasets to create more robust services. He offers the example of a non-tech-savvy grandparent and a digital picture frame: “With speech recognition, the frame could become interactive. You could say, ‘Please show me pictures of the grandkids when they were here last winter.’” Voice input, facial recognition technologies, and date stamps could work together to serve up the pictures that Grandma and Grandpa want to show off, without anyone ever touching a keyboard or a mouse.
Michels, for his part, believes that “convergence” isn’t quite the right characterization. “It’s over,” he says. “All speech tech is AI-enabled already.”
Sentiment Analysis Gets Smarter
The same AI that is listening hard for the word “outage” is getting better and better at gauging the state of mind of the person saying it. The acoustic qualities of an individual’s voice can tell more about the real meaning of a conversation than the words spoken, and tools are getting better at extracting that meaning for innovative purposes.
For instance, VoiceVibes Inc. offers a cloud-based solution that lets public speakers practice their delivery and track their improvement. Its technology uses voice analytics, predictive models, and machine learning to analyze specific features of voice patterns. By examining spoken patterns and habits, it estimates how an audience might perceive the message, flagging the dreaded “upspeak,” perhaps, which can make listeners perceive that the speaker isn’t confident about the subject.
Yudkowsky provides another example. “Let’s say you order flowers for your spouse, and Alexa does both voice and ‘affect’ recognition”—i.e., senses your mood. “Are you buying for your spouse because it’s your anniversary? Because you had a fight and are trying to make up?” The suggestions that Alexa serves up may change based on the qualities she hears in the voice asking the question.
Privacy Concerns
And that leads directly to the last speech tech trend: concerns around privacy. Says Yudkowsky, “People think that Alexa is a ‘personal shopping assistant’”—one that’s smart enough to ascertain that you’re buying those flowers in a spirit of love and celebration, or of guilt and remorse. But, he cautions, “if you think Alexa is going to keep that information to herself, you’re crazy. She’s not a ‘personal shopping assistant.’ She’s a shopkeeper’s assistant, and she doesn’t work for you.” If you think shoppers are creeped out by the nagging, persistent ads that track them across online experiences, just wait until they start getting served advertisements for books about anger management after Siri hears them lose their temper.
As speech technology gets cheaper and easier to access, “it’s now possible to put a police state on a single CD,” says Yudkowsky. He says that cities with limited budgets for police resources could, relatively cheaply, combine existing applications like facial recognition technology, tools for gait analysis, license plate recognition software, and, of course, speech tech applications. “There’s a dark side of the technology to consider. We don’t have a code of ethics for people who do speech technology, and we never will.”
In May, the rollout of the General Data Protection Regulation (GDPR) in the European Union sent ripples across the pond in the form of mountains of opt-in notices from businesses operating in Europe. The notices came with plain-language explanations of how companies protect consumers’ privacy, potentially reminding U.S. consumers of the ways in which U.S.-based tech companies are failing to make or keep data privacy promises. As headline news about data breaches becomes a fixture of the news cycle, it’s not a stretch to predict that consumers will demand more from vendors of all kinds in terms of data transparency and privacy protection.
Miller agrees, noting, “There is paranoia about Alexa hearing everything. That vector doesn’t go away; we just have to deal with it.”