November 27, 2019
By James A. Larson, program co-chair, SpeechTEK 2021

Q&A: David Morand on Integrating a Contextual AI Assistant with VoiceXML

Amazon and Google may have made it easier to develop speech applications without VoiceXML, but the technology is still widespread in the industry, especially in large contact centers. If you need a dialogue engine that will allow you to develop next-gen conversational IVR applications while being compatible with the VoiceXML standard, you need to hear what David Morand, Senior Software Developer, Nu Echo, has to say.

Q: What is the essence of your presentation, "Integrating a contextual AI assistant with VoiceXML," at the SpeechTEK Conference, April 27-29, in Washington, D.C.?

A: We have explored the dialogue management capabilities of the open-source AI assistant solution offered by Rasa and its integration with the VoiceXML standard. We will describe the rationale behind the Rasa-VoiceXML solution, the use cases that were explored, and how we leveraged both deterministic dialogue strategies and machine learning capabilities. We will also cover how we integrated its open-source components with VoiceXML, including support for multilingualism, audio file management, and speech grammars. Deployment and testing strategies will also be presented.

Q: What is the context and why is it important?

A: While many options are available to develop speech applications without VoiceXML (Amazon Lex and Google Dialogflow come to mind), VoiceXML is still widely used in the industry, especially in the large contact centers that are our primary customers. We needed a dialogue engine that would allow us to develop next-gen conversational IVR applications while being compatible with the VoiceXML standard.

Q: VoiceXML has been widely used for many years. What did you feel were the shortcomings of VoiceXML?

A: VoiceXML's biggest shortcoming is the lack of control structures, which makes it unsuitable as a single tool for developing complex IVR applications. That's why for a long time we used VoiceXML as an intermediate language to access ASR, TTS, telephony, and media resources on top of a Java application development framework (Rivr), which was open-sourced several years ago.
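
To illustrate the idea of treating VoiceXML as an intermediate language, here is a minimal Python sketch in which the branching logic lives in the host program and each dialogue turn is emitted as a small VoiceXML document. It is not Rivr or Nu Echo code. The function names, grammar paths, the `/turn` URL, and the `state` dictionary are all hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def _new_document():
    """Create an empty <vxml> root with one <form> and return both."""
    vxml = ET.Element("vxml", {"version": "2.1", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form")
    return vxml, form


def build_question(prompt_text, grammar_uri, submit_uri):
    """Emit a one-field VoiceXML page: play a prompt, recognize, post back."""
    vxml, form = _new_document()
    field = ET.SubElement(form, "field", {"name": "answer"})
    ET.SubElement(field, "prompt").text = prompt_text
    ET.SubElement(field, "grammar",
                  {"src": grammar_uri, "type": "application/srgs+xml"})
    filled = ET.SubElement(field, "filled")
    ET.SubElement(filled, "submit", {"next": submit_uri, "namelist": "answer"})
    return ET.tostring(vxml, encoding="unicode")


def build_goodbye(prompt_text):
    """Emit a final page that plays a prompt and ends the session."""
    vxml, form = _new_document()
    block = ET.SubElement(form, "block")
    ET.SubElement(block, "prompt").text = prompt_text
    ET.SubElement(block, "exit")
    return ET.tostring(vxml, encoding="unicode")


def next_turn(state):
    """Control flow lives here, in the host language, not in VoiceXML."""
    # `state` is a hypothetical per-call dictionary kept by the application.
    if state.get("account") is None:
        return build_question("Please say your account number.",
                              "grammars/account.grxml", "/turn")
    if state.get("confirmed") is None:
        return build_question(f"You said {state['account']}. Is that correct?",
                              "grammars/yesno.grxml", "/turn")
    return build_goodbye("Thank you. Goodbye.")


if __name__ == "__main__":
    print(next_turn({}))
```

In this arrangement the VoiceXML browser only plays prompts and runs recognition, while conditions, loops, and state stay in ordinary application code.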

Q: Why did you select Rasa for integration with VoiceXML?

A: Rasa offers us a great mix of ready-to-use tools, dialogue strategies (ML and deterministic), and deployment recipes to quickly create working intelligent assistants. Most importantly, it is fully extensible and customizable, which allowed our engineering team to adapt it to support VoiceXML and the specificities of speech applications using Python.

Q: What are the challenges of integrating Rasa with VoiceXML?

A: Incorporating foreign VoiceXML concepts in Rasa required some outside-the-box thinking. Also, properly managing multilingualism and dynamic audio concatenation was a bit tricky. For the details, you will have to come to the presentation.
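
The interview does not describe how Nu Echo handles these two problems, so the following is only a generic sketch of what language-aware audio concatenation can look like when VoiceXML prompts are generated from Python. The `TEMPLATES` table, the `audio/` directory layout, the clip names, and the `balance_prompt` function are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: audio/<language>/<clip>.wav
AUDIO_ROOT = "audio"

# Hypothetical per-language clip sequences; "{amount}" marks the dynamic slot.
TEMPLATES = {
    "en-US": ["your_balance_is", "{amount}", "dollars"],
    "fr-CA": ["votre_solde_est_de", "{amount}", "dollars"],
}


def balance_prompt(language, amount):
    """Build a <prompt> that plays a sequence of prerecorded clips."""
    # Fill the dynamic slot with a number clip such as "num_42".
    clips = [part.format(amount=f"num_{amount}")
             for part in TEMPLATES[language]]
    prompt = ET.Element("prompt", {"xml:lang": language})
    for clip in clips:
        # One <audio> element per clip; the browser plays them in order.
        ET.SubElement(prompt, "audio",
                      {"src": f"{AUDIO_ROOT}/{language}/{clip}.wav"})
    return ET.tostring(prompt, encoding="unicode")


if __name__ == "__main__":
    print(balance_prompt("fr-CA", 42))
```

Keeping the clip order in a per-language table lets each language use its own wording and word order without changing the code that assembles the prompt.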

To see presentations by David Morand and other speech technology experts, register to attend the SpeechTEK Conference.
