Giving a Voice to the Developing World
I recently taught two short courses, in Ghana and Kenya, on voice applications for the World Wide Web Foundation’s Mobile Entrepreneurship in Africa program. The students were extremely enthusiastic and full of creative ideas for voice-based businesses. Clearly they understood the potential for voice applications in their countries. In fact, nearly all said they were interested in starting a voice-based business after the class. Our class was based on using the W3C standard for voice applications, VoiceXML. While teaching and working with my students, I was struck by the important role that standards play in creating opportunities in the developing world.
The Web has revolutionized the way we get information and transact business. Locally relevant and global information is available nearly instantly through Web search tools. But this convenience depends on a complex infrastructure of networks, computers, and basics, such as reliable power and literacy, that we take for granted in developed countries. In the developing world, this infrastructure is much less available.
According to Internet World Stats, while 77 percent of the population of North America has access to the Internet, only 11 percent of the population of Africa does. Yet the information on the Web can be just as useful to people in developing countries as it is to people in developed countries. Because mobile phones are becoming ubiquitous, voice applications can bridge this gap. They can enable many more people to benefit from the Web in parts of the world where computers are rare. Efforts such as IBM’s Spoken Web project in India (discussed in the “View from AVIOS” column in the March-April 2011 issue of Speech Technology magazine) and the Mobile Entrepreneurship in Africa program are making it possible for people to access information on the Web using technology as simple as a basic mobile phone.
Voice standards, such as VoiceXML, are key to realizing the goal of bringing the Web to people through mobile phones. There are a number of reasons for that, but they all revolve around the fact that standards offer stability. This is important for any entrepreneur, but especially for small entrepreneurs in the developing world with limited resources. If a vendor changes its business model or is acquired, entrepreneurs who rely on that vendor for a proprietary technology could be left unable to continue their businesses. It may be very difficult for a small entrepreneur to recover when a vendor on whom he is relying is no longer available. In contrast, entrepreneurs interested in building a voice-enabled business can be assured that a standard like VoiceXML, which is supported by many vendors, will not suddenly become unavailable. If they’ve invested their time and resources in learning how to use a standard and develop applications for it, they can still use those skills with another vendor’s products. Standards also encourage a third-party ecosystem to develop that can support the entrepreneurs who are offering solutions. These third parties may themselves be small entrepreneurs who offer services like training, VoiceXML coding, grammar development, and hosting.
Another aspect of stability is that standards like VoiceXML have also been tested by many implementations, which means that they can be expected to work.
Standards also promote the development of open-source implementations, so the cost of entry to standards-based technology is much lower. Entrepreneurs can test the viability of their ideas without a large investment of time and money, using open-source or easily available tools. Although special-purpose integrated development environments (IDEs) are helpful, VoiceXML applications can also be developed with simple text editors. While an actual deployment might require more than an open-source implementation, initial development and testing of VoiceXML documents and speech grammars can be done with simple platforms without requiring a large initial investment. Similarly, books, documentation, and tutorials for standards-based technology are much more widely available than similar material for proprietary approaches.
A possible concern for voice applications in the developing world is the large number of local languages for which there are no existing speech recognizers or text-to-speech (TTS) systems. Many very useful applications, however, can be developed with only recorded prompts, which can be in any language, and dual-tone multifrequency (DTMF) signaling, that is, input from the phone's keypad. In addition, promising techniques are being explored that would enable non-experts to bootstrap small-vocabulary speech recognizers in a new language based on a recognizer for another language.
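To make the recorded-prompt approach concrete, here is a minimal sketch in Python that generates a VoiceXML 2.0 menu driven entirely by audio recordings and dual-tone multifrequency (DTMF) key presses, so no speech recognizer or TTS voice is needed for the caller's language. The function name, the form ids, and the audio file paths are all hypothetical, chosen only for illustration; a real service would record its own prompts and would probably add handling for missing or invalid input.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"  # VoiceXML 2.0 namespace


def build_dtmf_menu(welcome_audio, choices):
    """Return a VoiceXML 2.0 document as a string.

    choices is a list of (dtmf_key, form_id, audio_file) tuples; all
    values passed in below are hypothetical examples.
    """
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})

    # Menu: play a recorded welcome prompt, then wait for a key press.
    menu = ET.SubElement(vxml, "menu", {"id": "main"})
    prompt = ET.SubElement(menu, "prompt")
    ET.SubElement(prompt, "audio", {"src": welcome_audio})
    for key, form_id, _audio in choices:
        ET.SubElement(menu, "choice", {"dtmf": key, "next": "#" + form_id})

    # One form per choice, each playing a single recording.
    for _key, form_id, audio in choices:
        form = ET.SubElement(vxml, "form", {"id": form_id})
        block = ET.SubElement(form, "block")
        answer = ET.SubElement(block, "prompt")
        ET.SubElement(answer, "audio", {"src": audio})

    ET.indent(vxml)  # pretty-print; requires Python 3.9+
    return ET.tostring(vxml, encoding="unicode")


document = build_dtmf_menu(
    "prompts/welcome.wav",
    [("1", "prices", "prompts/market_prices.wav"),
     ("2", "weather", "prompts/weather.wav")],
)
print(document)
```

Because every prompt is a recording, translating the service into another language in this sketch would only mean swapping the audio files; the markup itself stays the same.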
Finally, another value of standards is that they connect with other standards. VoiceXML uses the browser/server paradigm, familiar in the traditional Web, to leverage Web development skills and an infrastructure already in place.
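As a rough illustration of that browser/server paradigm, the sketch below uses only Python's standard library to serve a VoiceXML document over HTTP, much as a Web server would serve an HTML page. A voice browser on a VoiceXML platform would fetch the document and play it to the caller. The handler class, the helper names, the port, and the audio path are assumptions made for this example, not part of any particular platform.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from xml.sax.saxutils import quoteattr

# Media type registered for VoiceXML documents (RFC 4267).
VXML_MIME = "application/voicexml+xml"


def render_vxml(audio_url):
    """Build a one-prompt VoiceXML 2.0 document (hypothetical content)."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">\n'
        '  <form>\n'
        '    <block>\n'
        # quoteattr escapes the URL and adds the surrounding quotes.
        f'      <prompt><audio src={quoteattr(audio_url)}/></prompt>\n'
        '    </block>\n'
        '  </form>\n'
        '</vxml>\n'
    )


class VoiceXMLHandler(BaseHTTPRequestHandler):
    """Answer every GET request with the same VoiceXML document."""

    def do_GET(self):
        body = render_vxml("prompts/welcome.wav").encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", VXML_MIME)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=8080):
    """Create, but do not start, an HTTP server for the handler above."""
    return HTTPServer(("", port), VoiceXMLHandler)
```

Calling `make_server().serve_forever()` would start serving on the assumed port 8080. In practice the document could just as well be generated by whatever Web framework a developer already knows, which is the point of reusing the Web's infrastructure.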
Standards are important as a basis for applications everywhere, but they can be critical in the developing world.
Deborah Dahl, Ph.D., is principal at speech and language consulting firm Conversational Technologies and chair of the World Wide Web Consortium’s Multimodal Interaction Working Group. She can be reached at dahl@conversational-technologies.com.