Speech Technology: Finally, a Competitive Necessity
Innovative technologies that are eventually successful follow a typical timeline.
- The capabilities and markets are overstated in the early stages as struggling start-ups try to make a case to investors and the press, leading to hype and then disappointment.
- The core technology improves, and several early applications are identified where the use of the technology is particularly needed and is cost-effective, providing a base for growing businesses.
- Core technology continues to improve, making possible more marketable applications, but moving out of niche markets raises platform compatibility and integration issues that slow adoption.
- The technology breaks through barriers on two levels: the performance of the core technology exceeds the thresholds required for wide use, and barriers to integration have been largely breached. Enough applications have been fielded that market skepticism and fear of being on the "bleeding edge" begin to fade.
- Growing markets and competition lead to consolidation and lower cost, as companies begin to compete over real market opportunities. Larger companies begin to flex their marketing muscle and increase customer awareness. The general press is often slow to recognize this stage because of over-optimistic past coverage.
- The technology becomes a competitive necessity in some applications and markets, leading to solid growth and widespread adoption. The speed at which this stage expands often makes it seem to those outside the business as if the technology has exploded onto center stage.
Having gone through most of these stages, speech technology is moving through the fifth stage and entering the sixth. It is becoming a competitive necessity in many markets, driven by real advantages and business requirements. A few major trends:
Targeted Marketing, Call Centers, and Speech Technology
Marketing is undergoing a paradigm shift as conventional mass marketing becomes more difficult and fragmented. The growing profits of companies like Google and Yahoo signal the trend toward targeted marketing, where the consumer receives a message relevant to them and to their current buying interests. Both of those companies have hired speech experts, and it is likely that some company will launch a voice business directory assistance service or a broader telephone voice search and information service. Such "voice portals" will increase calls to conventional call centers, simply because call center numbers will be easier to find.
The long-term impact will, however, be much greater than a minor boost in call volume. The trend is certain to focus top management on the marketing and branding opportunities implicit in a voluntary call from a customer. At a minimum, call centers' priorities and budgets will begin to reflect a transition from being viewed as a necessary but regrettable cost center to being seen as a customer relationship enhancement center. The objective of making each call as short as possible will change to making the most of each call.
Adding to the call volume, the use of Voice over IP (VoIP) technology on PCs for customer service will grow, perhaps with the initial focus on click-to-call interactions. (Click-to-call was cited by eBay as one of the motivations for its purchase of Skype, a VoIP provider.) As phone services get less expensive (or free), calls to a company will become more comparable to Web visits: numerous and encouraged. Speech recognition will be required to handle the increased volume of calls economically and provide round-the-clock service while maintaining customer satisfaction; agents will be reserved for the most difficult tasks.
The STM Buyers' Guide lists the companies that can help meet the challenge that higher volumes and new attitudes toward call centers will create.
Bill Meisel
- Contact centers: Almost all contact centers that upgrade telephone solutions today include speech capabilities or a platform that can support adding speech. Vendors realize that speech solutions can differentiate platforms that would otherwise become commodities. They emphasize speech features, development tools, and lifecycle management in their marketing. Customers see early adopters saving money while maintaining or increasing customer satisfaction, and they now see the risk in not adopting the new technology. The voice user interfaces of most early adoptions have followed the model of the touchtone menu, while adding mundane functions that touchtone couldn't handle and that agents would rather not handle. Potentially explosive growth in call center volume, pushed by the trend toward targeted marketing described in the sidebar, will require automation of more transactions.
- Mobility applications: Wireless connectivity is making it possible to do many things away from one's office or home. In the automobile, speech recognition and text-to-speech synthesis can allow hands to remain on the wheel and eyes on the road while controlling electronics in the vehicle, making calls, or getting information such as driving directions. In wireless phones, speech technology can compensate for the small screen and keypad. For field workers, speech technology can turn a wireless phone into a productivity tool, delivering instructions and allowing data entry away from the office. In some specialized mobile applications, such as warehouse stock picking, a speech interface is becoming a required feature.
- Unified communications: Allowing access to email and other communications by phone using speech technology also makes workers more mobile. Generally, the voice user interface makes it possible to use all the features of today's complex communications without carrying around a manual.
- Health care: Many good things result from using modern computer technology in health care management. Speech technology can help create the electronic medical record by letting medical professionals dictate reports without the costs and delays of full manual transcription, resulting in better patient care at a lower cost. The use of speech technology in medical settings is growing rapidly, led by familiar dictation protocols, but with the beginnings of multimodal interactions using Tablet PCs and other portable devices.
- Entertainment: Toys and games are beginning to use speech recognition creatively, in part because today's chips and speech technology can support more interesting interactions. A recently introduced speech-interactive toy shows signs of being a big holiday hit. Signs are also growing that a speech-enabled program guide and remote control will be a competitive advantage in the many-channel and programs-on-demand world.
- Audio Web search: With audio blogs and other audio/video material becoming increasingly available on the Web, voice search services are already using speech recognition to find material with specific content and to jump to the specific part of the material that contains that content.
- Directory assistance and voice portals: Some directory assistance services already are automated with speech recognition. Using that technology to provide voice search by phone for businesses, services, and products—the proverbial "voice portal"—may be a natural extension for Web and phone companies.
- Your idea! Create your own trend.
Core Technologies
Once you determine that you need speech technology, you have more choices than ever before. The core technologies available to you include:
- Speech recognition: Interpreting what a person said or representing it as text;
- Text-to-speech: Speaking text from sources such as email or a database, or even using a synthetic voice to make it possible to generate prompts dynamically;
- Speaker verification: Authenticating a claim of identity using the biometric qualities of a voice, often over the telephone and supplemented by other information about the caller;
- Audio scanning and analytics: Finding parts of an audio file (attached to a video or otherwise) that contain specific content, and, in some cases, summarizing this information as business intelligence.
These technologies vary in the platform on which the software runs, support for standards, and the development environments available, among other significant differences. Speech technologies are available in server configurations for telephone applications or for sharing across an enterprise. They can also be standalone, running as a PC "desktop" application or as "embedded" software on a small device.
Buyers' Guide
The Buyers' Guide in this issue of Speech Technology Magazine outlines the way the marketplace has defined the practical use of speech technology. The categories provide a quick outline that can help you find the right vendor or vendors to support your needs.
Vendors are a good source of information as well as a resource for specific bids. They will, of course, tell you why their solution is best, but the criteria they use to do so can be useful in sorting out what is important in your specific case. Settling on the most important criteria for your application will make the process simpler than it appears. The result will justify the effort.
View the 2006 Speech Technology Magazine Buyer's Guide