Sensory Releases a New SDK for the Apple iPhone
Today, Sensory released a software development kit (SDK) for the Apple iPhone platform.
The new release is part of the FluentSoft Speech Recognition SDK and allows developers to build speaker-independent command-and-control applications. The recognizer uses a proprietary text-based phonetic engine and resides onboard the device. It requires no connection to a cloud server to function, and as such could even be used on the iPod touch, which is essentially the same device as the iPhone but without mobile telephony.
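Sensory has not published the SDK’s interfaces, but the command-and-control model described here, a fixed set of phrases matched entirely on the device with no network round trip, can be sketched in a few lines. The Swift below is purely illustrative; the CommandSet type and its members are assumptions, not the FluentSoft API.

```swift
import Foundation

// Illustrative sketch only: a command-and-control recognizer exposes a fixed
// set of phrases and maps each utterance to one of them, entirely on-device.
// None of these types come from the FluentSoft SDK.
struct CommandSet {
    let phrases: [String]           // the active command vocabulary
    let onMatch: (String) -> Void   // invoked when an utterance matches

    // Stand-in for the engine's matcher: a real phonetic engine scores
    // acoustic input against phonetic models, not strings against strings.
    func handle(utterance: String) {
        if let hit = phrases.first(where: {
            $0.caseInsensitiveCompare(utterance) == .orderedSame
        }) {
            onMatch(hit)
        }
    }
}

let commands = CommandSet(
    phrases: ["play", "pause", "next track"],
    onMatch: { phrase in print("recognized command: \(phrase)") }
)
commands.handle(utterance: "Next Track")  // prints: recognized command: next track
```

Because the matching happens locally, nothing in this loop depends on connectivity, which is what makes the iPod touch a viable target.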
Explaining the need for an embedded engine, Todd Mozer, president and CEO of Sensory, says, “There are a lot of people who don’t want to be connected for a variety of reasons. One issue is they don’t want to send data to a server because they think that they lose control.”
The release follows similar SDKs already available for the Symbian and Windows Mobile platforms. The iPhone iteration was driven by customer demand rather than released preemptively. “We had enough critical mass that we decided to make the investment,” Mozer says.
“In many respects, Sensory is playing a numbers game,” he adds. “If we get enough developers doing enough interesting and unique things, some of them are going to be hits and some of them are going to be flops. We’re not going to be doing the applications ourselves. We’re not taking on that investment ourselves. We’re just focusing on what we’re good at—which is the speech end.”
The SDK already has six partners using it, though most have not yet been made public. Thus far, Sensory has been selective in choosing partners with which to build applications, aiming for a “disproportionate number of reasonable sellers.”
For now, Mozer says that vocabularies are limited to about 1,000 to 2,000 words in an active set. The iPhone isn’t capable of processing more without a substantial slowdown in response time.
“If it’s going to take more than a couple of seconds to respond, it’s going to be so unwieldy that people won’t want it,” Mozer explains.
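A developer working inside that limit would likely partition a larger vocabulary into contexts and keep only one subset active at a time. This is a minimal sketch of that pattern, assuming a roughly 2,000-word cap taken from Mozer’s figure; nothing here is SDK API.

```swift
// Illustrative only: keep the active recognition set under the engine's
// practical limit by loading one context's vocabulary at a time.
// The 2,000-word cap comes from the figure quoted above; the rest is assumed.
let activeSetLimit = 2_000

let vocabularies: [String: [String]] = [
    "music":    ["play", "pause", "next track", "previous track"],
    "contacts": ["call", "text", "email"],
]

func activeSet(for context: String) -> [String] {
    let words = vocabularies[context] ?? []
    // Truncate rather than exceed the limit; a real application would pick
    // which words survive based on frequency or priority.
    return Array(words.prefix(activeSetLimit))
}

print(activeSet(for: "music"))  // ["play", "pause", "next track", "previous track"]
```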
Although it works from an embedded vocabulary, the system can also add to it. VoiceActivation, the SDK’s first publicly announced partner, automatically builds a vocabulary out of the contacts stored in the phone’s directory, making them recognizable to the engine.
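VoiceActivation’s implementation has not been described in any more detail, but the general pattern, reading contact names off the device and handing them to the recognizer as a dynamic vocabulary, might look like the sketch below. The Contacts calls are real Apple API (and require user authorization); the final hand-off to the engine is a placeholder, since the FluentSoft interface is not public.

```swift
import Contacts

// Illustrative sketch: build a dynamic recognition vocabulary from the
// phone's contact names. The Contacts framework calls are genuine Apple API;
// feeding the names to the speech engine is a placeholder, since the
// FluentSoft interface is not public.
func buildContactVocabulary() throws -> [String] {
    let store = CNContactStore()
    let keys = [CNContactGivenNameKey, CNContactFamilyNameKey] as [CNKeyDescriptor]
    let request = CNContactFetchRequest(keysToFetch: keys)

    var names: [String] = []
    try store.enumerateContacts(with: request) { contact, _ in
        let name = "\(contact.givenName) \(contact.familyName)"
            .trimmingCharacters(in: .whitespaces)
        if !name.isEmpty { names.append(name) }
    }
    return names
}

// let vocabulary = try buildContactVocabulary()
// engine.addToVocabulary(vocabulary)  // hypothetical SDK call
```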
Sensory is also working on a new technology approach, called “Large Vocabulary Based System,” that uses RAM (often the most constraining factor in a system) more effectively. Gains in effective memory use should allow for vocabularies in the tens of thousands of words in embedded applications.
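To see why RAM rather than CPU tends to be the ceiling, a back-of-envelope estimate helps; every number in this sketch is an assumption for illustration, not a Sensory figure.

```swift
import Foundation

// Back-of-envelope only: all figures below are assumptions. If each
// vocabulary entry needs roughly 100 bytes of phonetic data, a 50,000-word
// vocabulary ("tens of thousands") needs about 5 MB of RAM, which is why
// per-entry memory use, not CPU, caps embedded vocabulary size.
let bytesPerEntry = 100          // assumed phonetic-model footprint per word
let vocabularySize = 50_000      // "tens of thousands" of words
let megabytes = Double(bytesPerEntry * vocabularySize) / 1_048_576
print(String(format: "%.1f MB", megabytes))  // prints: 4.8 MB
```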