Video: Benchmarking Voice Assistants, Pt. 8: Conclusions
Learn more about intelligent assistants at the next SpeechTEK conference.
Read the complete transcript of this clip:
Ronald Schmelzer: The whole point of this was to find out kind of where the edge of their intelligence is. So obviously we were hoping for and we were counting on a significant amount of failure, otherwise our questions would have been too easy. So, yes, bottom line, these smart devices are not particularly smart. But here's the crazy thing. So we published this,
Kathleen Walch: In July of 2018.
Ronald Schmelzer: July 2018. And it got a bunch of press because we have videos obviously and we asked it some really funny questions. And within two weeks, I think it was, we would like to think it was coincidence but maybe this wasn't coincidence. Amazon Alexa announced a new, Amazon announced a new thing, which is called the, I forgot the name of the service, where they would basically come back and, for a question that they could not answer, they would come back later and say we have the new answers for you. It's when Amazon was unsure.
Kathleen Walch: Mm hm.
Ronald Schmelzer: If it gave you a response that was a category zero response. And maybe we helped nudge them to do this, because obviously they're trying to minimize frustration. Since then, a lot of these devices have gotten a lot better at a lot of these questions, right?
Kathleen Walch: Right, so we can tell that they're continuing to iterate on it. Another question that we did not show but we did ask was blending of colors. So, what color is red plus blue? And back when we asked this, I don't think any device was able to understand what color you get if you mixed red plus blue. Since then, they now, I think at least Google Home and Alexa can get that question correct. So they are learning and they are iterating. So some of these questions, if you do ask, you know and you choose to do this benchmark at home, you might get different answers because they are updating it and constantly learning. But in general, we've still found that they all have a ways to go.
Ronald Schmelzer: Right, so I think the biggest thing obviously for those of you that are used to these types of devices and also the AI machine learning, you know that really what we're testing is not the device. I mean the device is just basically the gateway to accessing the AI machine learning technology that's really sitting in the cloud. And we're really testing how intelligent is Amazon, Google, Microsoft, Siri, etcetera's cloud intelligence capabilities. And they're basically getting better and better but I think what we were trying, I think the thing that we highlighted to them is that, like, the key of intelligence is not just the ability to understand speech and to create understandings of text and, when you're building an app, to create these intents that kind of key against these capabilities of speech, but to provide that higher layer, what we call machine reasoning. And machine reasoning is beyond just machine learning. Because machine learning is taking patterns and data and deriving insights from data and then using those insights, the experience, the encoded insights to do future things. But machine reasoning requires the connection of learnings together to derive some sort of additional insight. And this is what's new for those of you that have been in this industry for a while, this is what we've been trying to do with knowledge graphs and ontologies and all this sort of stuff. And this is, I would say this is kind of the cutting edge area of research.
Kathleen Walch: Mm hm.
Ronald Schmelzer: That when Amazon announced that they have 10,000 people working in their Alexa division, which was a huge surprise to us. To hear that,
Kathleen Walch: They announced that about a year ago. And then recently within the past month it came out that there were humans listening in on conversations and parts of conversations. So we said, uh huh, is that where your 10,000 employees are going?
Ronald Schmelzer: Yeah, so we got a little bit of heat because they got upset with us for basically implying that there was what's called pseudo-AI. Which is that there's a human, like The Wizard of Oz, there's a human behind the curtain. And there's actually a human moving the things. And they're like, no, we guarantee there are no humans actually doing any of these activities. But we are using all of these people to improve the model. And to improve the recognition. We're like, okay, we'll give you some credit for it but over time it's like, over time that has to,
Kathleen Walch: Yeah, the humans need to become less and less.
Related Articles
Cognilytica Analysts Kathleen Walch & Ronald Schmelzer discuss the seventh test in their benchmarking project, testing the intelligent assistants' understandings of slang and colloquialisms in this clip from their presentation at SpeechTEK 2019.
20 Dec 2019
Cognilytica Analysts Kathleen Walch & Ronald Schmelzer discuss the fifth and sixth tests in their benchmarking project, testing the intelligent assistants' emotional IQ and common sense in this clip from their presentation at SpeechTEK 2019.
13 Dec 2019
Cognilytica Analysts Kathleen Walch & Ronald Schmelzer discuss the fourth test in their benchmarking project, testing the intelligent assistants' handling of reasoning and logic in this clip from their presentation at SpeechTEK 2019.
06 Dec 2019
Cognilytica Analysts Kathleen Walch & Ronald Schmelzer discuss the third test in their benchmarking project, testing the intelligent assistants' grasp of cause and effect in this clip from their presentation at SpeechTEK 2019.
29 Nov 2019
Cognilytica Analysts Kathleen Walch & Ronald Schmelzer discuss the second test in their benchmarking project, testing the intelligent assistants' handling of comparison questions in this clip from their presentation at SpeechTEK 2019.
22 Nov 2019
Cognilytica Analysts Kathleen Walch & Ronald Schmelzer discuss the first test in their benchmarking project, testing the intelligent assistants' grasp of basic concepts in this clip from their presentation at SpeechTEK 2019.
15 Nov 2019
Cognilytica Analysts Kathleen Walch & Ronald Schmelzer introduce the Cognilytica Voice Assistant Benchmark for testing the intelligence of devices such as Alexa, Siri, Google Home, and Cortana in this clip from their presentation at SpeechTEK 2019.
08 Nov 2019