
Q&A: Sam Ringer Says the Revolution Is Coming — the Medium-Term Future of AI and ML


Current AI and machine learning (ML) technologies are starting to change the way we build and innovate. However, the power of our current ML technologies is not fixed. Sam Ringer, machine learning engineer at Speechmatics, will explore where ML is at the moment and where it is heading in the next 10 years in his session “The Revolution Is Coming: The Medium-term Future of AI and ML” at SpeechTEK 2020. How are its current use cases different from those we can expect to see in the future? How will increased technological leverage change how we do business? He’ll answer these questions and more, but first, he answered a few of our questions.

Q: How does ML make it possible to develop better speech applications than traditional technology?

A: ML allows you to leverage compute and data to learn patterns instead of having to define the patterns yourself. This scales far better than human heuristics can.

Q: Some developers feel that it is difficult to generate meaningful error messages when an ML system fails. How can this problem be overcome?

A: This is very much still an open problem, and there are currently no standard best practices. There are, however, some early solutions, which include but are not limited to constantly checking the following: training vs. validation loss, overfitting ability, gradient norms, deep data visualisation, and recurring failure cases at test time.
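Editor's illustration: two of the checks Ringer lists, training vs. validation loss and gradient norms, can be sketched in a few lines of Python. This is a minimal sketch, not a method described in the interview. The function names, the `gap_ratio` and `max_grad_norm` thresholds, and the warning wording are all hypothetical and would need tuning for any real system.

```python
import math


def global_grad_norm(grads):
    """L2 norm over every gradient value, given a list of per-parameter lists."""
    return math.sqrt(sum(g * g for layer in grads for g in layer))


def training_warnings(train_losses, val_losses, grad_norm,
                      gap_ratio=1.5, max_grad_norm=100.0):
    """Return human-readable warnings for common training failure signs.

    gap_ratio and max_grad_norm are illustrative thresholds (assumptions),
    not recommended values.
    """
    warnings = []

    # Check 1: validation loss far above training loss suggests overfitting.
    if val_losses[-1] > gap_ratio * train_losses[-1]:
        warnings.append("validation loss far above training loss")

    # Check 2: the two curves moving apart between the last two epochs.
    if (len(train_losses) >= 2 and len(val_losses) >= 2
            and val_losses[-1] > val_losses[-2]
            and train_losses[-1] < train_losses[-2]):
        warnings.append("validation loss rising while training loss falls")

    # Check 3: a non-finite or very large gradient norm suggests instability.
    if not math.isfinite(grad_norm) or grad_norm > max_grad_norm:
        warnings.append("gradient norm is non-finite or very large")

    return warnings


if __name__ == "__main__":
    norm = global_grad_norm([[3.0], [4.0]])
    for message in training_warnings([1.0, 0.5], [1.0, 1.2], norm):
        print("WARNING:", message)
```

Run after each epoch, a helper like this turns silent degradation into explicit messages, which is one modest way to approach the error-reporting problem raised in the question.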

Q: ML requires large volumes of sample data to train the ML engine. Where does the training data come from?

A: Training data typically comes in two forms: labelled and unlabelled. Until very recently, only labelled data was used to train ML systems at scale. In the case of automatic speech recognition (ASR), data comes from a variety of sources, but all of it must be hand-labelled.

Q: How much training data is required? How do you know when you have enough training data?

A: Normally thousands of hours of labelled data. You can never really have too much training!

Q: Are there techniques to avoid using large amounts of training data?

A: Yes: meta-learning and self/semi-supervised learning. These areas have only shown promise recently and still have lots of open questions to be solved.

Q: If a trained ML system needs to support a new class of queries, must the ML system be retrained?

A: It depends. If the new query is within a similar domain to the queries your system was trained on, then you can use transfer learning to do a decent chunk of the heavy lifting for you.
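Editor's illustration: the idea behind transfer learning is to keep what an existing model has already learned and train only a small new part on the new task. The toy sketch below assumes a frozen feature extractor, `pretrained_features`, standing in for a real pretrained network, and fits only a new linear "head" on top of it. Every name and number here is hypothetical, and the example is far simpler than anything used in speech systems.

```python
def pretrained_features(x):
    """Stand-in for a frozen, pretrained encoder (an assumption for this sketch)."""
    return [x, x * x]


def fit_head(examples, lr=0.05, epochs=1000):
    """Train only a new linear head on top of the frozen features.

    Uses plain per-example gradient descent on squared error.
    """
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for x, target in examples:
            feats = pretrained_features(x)  # the encoder is never updated
            pred = sum(w * f for w, f in zip(weights, feats)) + bias
            err = pred - target
            weights = [w - lr * err * f for w, f in zip(weights, feats)]
            bias -= lr * err
    return weights, bias


def predict(weights, bias, x):
    """Apply the frozen encoder followed by the trained head."""
    feats = pretrained_features(x)
    return sum(w * f for w, f in zip(weights, feats)) + bias


if __name__ == "__main__":
    # A small "new task" data set: target = 2 * x^2 + 1.
    data = [(x, 2 * x * x + 1) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
    w, b = fit_head(data)
    print(predict(w, b, 0.8))
```

Only three numbers are trained here, which is why a handful of examples is enough. The saving comes from the new task being close enough to the old one that the existing features remain useful, as Ringer's answer notes.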

Q: ML improvements in accuracy often simply depend on supplying larger amounts of training data. What other techniques do developers have for improving the accuracy of their applications?

A: There are hundreds, if not thousands, of ways of boosting accuracy. However, nearly all are hacky solutions. If your learning process is strong enough, then you shouldn’t have to hack your way to better results.

To see presentations by Ringer and other speech technology experts, register to attend the SpeechTEK Conference.
