Improving Conversational Virtual Assistants with Natural Language Processing
How many times have you asked Siri to do something and she failed? Or maybe you’re on Android, and after a few failed attempts with Google Assistant you finally picked up your smartphone and used your thumbs to finish the task yourself? It’s a frustrating experience, but many of the recent advances in natural language processing (NLP)—along with those in surrounding technologies—are boosting the quality of conversational virtual assistants such as Siri and Alexa, as well as those in the enterprise space.
A Look Back at the Challenges of Conversational Virtual Assistants
Much of the key underlying technology of conversational virtual assistants—primarily deep neural networks and machine learning—has been around for a long time. But its cost and resource demands have historically been too much for many companies to bear. “It’s been too expensive and it’s taken too much computing power to do it,” says William Meisel, president of TMA Associates in Tarzana, Calif., who wrote his Ph.D. dissertation on deep neural networks. “Once you created the model, you couldn’t run it inexpensively or fast enough.” Those roadblocks kept the technology out of reach for many. But that has changed in recent years, and the technology’s growing accessibility to a broader range of organizations is opening virtual assistants up to new innovations and new business models.
With virtual assistants often divided into two types—consumer and enterprise—many of the challenges and advancements in performance and functionality differ along those lines. Consumers interacting with their devices don’t always require a conversation, for example; they often give one-way commands to play a song or dial a phone number. “If I’m calling an enterprise, I might have just loaded some software and my screen went blue,” says Jay Wilpon, senior vice president of natural language research at Interactions in Franklin, Mass.
Conversations at the enterprise level are typically going to be much more advanced than asking for the capital of New York. “Even though it seems as though the consumer virtual assistants are doing a lot, the complexity of the applications or the rigor that’s needed at the enterprise level isn’t necessarily there,” Wilpon says. Consumers at home may be mildly annoyed if they have to repeat a command, but if they’re calling into a support desk and the virtual assistant can’t engage in a complex conversation to solve their problem, they can quickly become irate. In sectors where the technology is important to the bottom line, it’s likely that a good customer experience has been given higher priority over the years.
Traditionally, one of the biggest problems with virtual assistants was related to speech recognition. That component has improved tremendously, and Deborah Dahl, principal at Conversational Technologies in Plymouth Meeting, Pa., says that Siri’s current performance provides a good example of those advancements while also highlighting where there’s still room for improvement. “It shows what the speech recognition result was on the screen, and it almost always recognizes it,” she explains. Unfortunately, there continue to be plenty of opportunities for the system to do the wrong thing. Because she lives in a town called Plymouth Meeting, Dahl says that when she tries to find a local pizza place, Siri always wants to set up a meeting rather than find a restaurant. “The speech recognition is perfect, but it doesn’t understand what I was really asking about,” she says. It’s just one area where conversational virtual assistants still struggle.
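To see why recognition can succeed while understanding fails, consider a deliberately naive, hypothetical keyword-based intent router (real assistants use statistical models, but the brittleness Dahl describes is the same):

```python
# Hypothetical, oversimplified intent router for illustration only.
# The trigger word "meeting" fires even when it is part of a place
# name, so a perfectly transcribed request gets the wrong intent.

def route_intent(utterance: str) -> str:
    text = utterance.lower()
    if "meeting" in text:
        return "create_calendar_event"  # matches "Plymouth Meeting," too
    if "pizza" in text or "restaurant" in text:
        return "find_restaurant"
    return "fallback"

print(route_intent("find a pizza place near Plymouth Meeting"))
# Prints "create_calendar_event": the transcription was perfect,
# but the system missed what the user was really asking about.
```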
NLP Moves Ahead
Despite some remaining challenges, the programming that underpins conversational virtual assistants has made big gains. Apple, Google, and Amazon have people reviewing the queries users pose to see where improvements can be made. “They’re saying, ‘Now we need to fix this and that,’” Dahl says. “They’re constantly adding new layers of knowledge and background knowledge, and there are online resources they can access under the cover.” Other improvements are also making their way into the user experience. “Amazon recently introduced a thing called fallback intent, where if a user says something that Alexa’s skill doesn’t know how to handle, it will use that fallback intent,” Dahl explains. The user may then receive a response along the lines of “I don’t know how to answer that,” rather than the system leaping to a conclusion that isn’t justified by what the user said. Dahl believes a good bit of human intervention goes on behind the scenes to ensure these “I don’t know” scenarios are examined and, when possible, added to the virtual assistant’s growing knowledge base and capabilities.
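In Amazon’s developer tooling, this feature is the built-in AMAZON.FallbackIntent, which a skill can handle explicitly. Here is a minimal sketch of such a handler, assuming the Alexa Skills Kit SDK for Python; the response wording is illustrative, not Amazon’s:

```python
# Sketch of a fallback handler for an Alexa skill, using the ASK SDK
# for Python. AMAZON.FallbackIntent fires when the user's utterance
# doesn't match any intent the skill's interaction model defines.
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.handler_input import HandlerInput
from ask_sdk_core.utils import is_intent_name
from ask_sdk_model import Response


class FallbackIntentHandler(AbstractRequestHandler):
    def can_handle(self, handler_input: HandlerInput) -> bool:
        return is_intent_name("AMAZON.FallbackIntent")(handler_input)

    def handle(self, handler_input: HandlerInput) -> Response:
        # Admit ignorance instead of guessing at an unjustified intent.
        speech = "Sorry, I don't know how to help with that yet."
        reprompt = "You can ask me what I can do."
        return (
            handler_input.response_builder
            .speak(speech)
            .ask(reprompt)
            .response
        )
```

Those fallback turns can then be logged and reviewed, which is exactly the kind of behind-the-scenes human triage Dahl describes.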
Consumers accustomed to the virtual assistants accessed through telephone trees certainly enjoy a better experience today. “People don’t like that technology in general,” says Tom Hebner, worldwide leader of the cognitive innovation group at Nuance Communications. “It feels like it’s in your way.” Users learned not to say much to those systems because the earliest versions didn’t perform all that well. “Virtual assistants allow people to play again,” Hebner says of the latest generation of platforms. Consumers are increasingly interested in seeing what the systems can do and what they can understand. Because the edges of the technology have been pushed out, users are experimenting with longer phrases and more complex requests. In addition, most virtual assistants are becoming adept at understanding a wide range of accents and dialects, giving people the freedom to test the technology’s boundaries and a real sense of what it can accomplish.
Virtual Assistants Benefit from Advancements Outside NLP, Too
In many cases, the user experience has evolved as a result of improvements in the technologies that surround and support NLP. Because Siri ran on smartphones and received speech over data networks, it worked with better audio than virtual assistants had enjoyed in the past. “The performance of Siri was better because of cleaner audio,” Hebner says. “Alexa pushed that even further with far-range microphones in their device.” Even users with a house full of chatty dinner guests can ask one of these home-based platforms to play a song, and its microphones will, more often than not, allow the system to interpret the command correctly. “So the biggest advancements that are not talked about is the quality of the audio is better into the NLP software, and that gave the perception that the software was much better,” Hebner says.
More powerful tools have also become available, making the technology around natural language processing far better. Much of what used to require the skills of a speech scientist is now built into these tools. “Large technology companies have these builders you can use to build a natural language model,” Hebner says. “Accessibility has become a lot higher.” The explosion of cloud technology has also democratized much of the horsepower that goes into high-functioning conversational virtual assistants. “It used to be you had to be a large company to be able to afford that, but now with cloud computing, it’s cheaper and it’s easier,” Hebner says, adding that the level playing field brought by the cloud is one reason advancements often occur in the enterprise space before moving to consumers.
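As a rough illustration of what such a builder assembles behind the scenes, here is a minimal intent-classification model sketched with scikit-learn. The library choice, utterances, and intent labels are all assumptions for illustration; commercial builders use their own proprietary stacks and vastly more training data.

```python
# Minimal sketch of an intent classifier of the kind hosted NLU
# builders generate from example utterances. All utterances and
# intent labels here are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

examples = [
    ("play some jazz", "play_music"),
    ("put on my running playlist", "play_music"),
    ("call mom", "make_call"),
    ("dial the office", "make_call"),
    ("my screen went blue after the install", "tech_support"),
    ("the software crashed on startup", "tech_support"),
]
texts, intents = zip(*examples)

# Bag-of-words features feeding a linear classifier: cheap to train
# and to serve, which is part of why cloud hosting made this routine.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(list(texts), list(intents))

print(model.predict(["my laptop is showing a blue screen"]))
# Likely ['tech_support'], given the overlapping vocabulary.
```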