Doctor, Doctor
Two major speech technology corporations - Lernout & Hauspie and Philips Electronics NV - recently made big waves in the industry with their purchases of medical transcription businesses. Philips Electronics NV, headquartered in the Netherlands, paid about $1.2 billion (according to the Wall Street Journal) to acquire 60 percent of MedQuist Inc., the largest medical-transcription service firm in the United States. The Journal also reported that at the time of the purchase, MedQuist Inc. had a market value of about $1.5 billion and that the deal brings approximately 8,000 transcriptionists into the Philips fold. In July, L&H completed its acquisition of Dictaphone Corp. in a deal that totaled $450 million. L&H purchased Dictaphone's outstanding stock for approximately 4.7 million shares of L&H common stock, and L&H assumed or paid off $430 million of Dictaphone debt and other obligations, according to an L&H press release. According to market analysts, the medical dictation market is the largest vertical market in the speech technology industry, and these purchases signify a move by speech companies to direct attention to properties serving those vertical markets.
Speech Technology Magazine took note of these two developments and interviewed executives with Philips Electronics NV and Lernout & Hauspie to determine what the acquisitions mean to the industry.
Founded in 1987 and headquartered in Belgium, L&H provides an array of consumer, business and industry offerings in automatic dictation, translation, sound compression, voice synthesis and industrial documentation. L&H's products and services originate from four core technologies: automatic speech recognition, text-to-speech, digital speech and music compression, and text-to-text (translation).
Rob Schwager is president of the L&H Healthcare Solutions Group and is responsible for the overall management of L&H Dictaphone's Healthcare and Commercial markets business units, including sales, marketing, business strategy, product development and engineering. Schwager joined Dictaphone in 1978 in the field sales organization, where he held various sales management positions. He spent approximately 10 years in the field organization and 11 years in headquarters positions. He lived in the United Kingdom while establishing newly organized operations in the UK, Switzerland and Germany. As a member of the senior management team and an equity investor, he played a key role in the leveraged buyout of Dictaphone from Pitney Bowes in 1995. Mr. Schwager holds a B.S. in psychology from the University of Wisconsin, where he has also completed some graduate work in psychology.
Philips Speech Processing focuses on speech recognition, natural dialogue and language understanding technologies. Philips' speech technology is used for a variety of applications, such as directory assistance, customer care, banking, travel, auto attendants, speech portals and white and yellow pages automation. Philips has more than 40 years of experience in the development and marketing of speech products and, in 1993, developed the first commercially available PC-based natural, continuous speech recognition engine for speech-to-text applications.
Cesar Vohringer has been the CEO of Philips Speech Processing since March 2000. He is a native of Brazil and an experienced Philips manager who joined Philips Speech Processing from Philips' consumer electronics product division. For more than two years, Vohringer was a member of the management team of the television business group, responsible for strategic marketing and business strategy formulation for the television business worldwide and for the portfolio of new television products and projects. Additionally, his responsibilities have included managing the Global Business Creation Unit's performance on projects (specifically regarding its time to market and costs). He also chaired a team representing numerous Philips product divisions that implemented the Philips High Volume Electronics strategy. Vohringer's previous functions with Philips Consumer Electronics include global program manager for Business Group Television from 1993 to 1997 and development manager of the Singapore TV Laboratory from 1987 to 1993. He is based in Vienna, Austria, at Philips Speech Processing headquarters.
sT What do you see as the significance of the purchases of MedQuist and Dictaphone by Philips and L&H, respectively, as they relate to the overall dictation market?
ROB SCHWAGER, Lernout & Hauspie: I believe they point to the sizable business opportunity derived from automating the production of medical reporting. The process of producing millions of medical reports each day has gone largely unchanged for many years. L&H has established an "end-to-end" offering that combines advanced technology, field deployment resources and transcription services, and that will significantly alter and improve the process of creating, analyzing and managing the vast amounts of clinical data locked in today's text-based medical reports. By combining system workflow platforms, health care system IT integration capabilities, Web/Application Service Provider technology, physician-oriented mobile devices/applications, speech recognition and natural language processing, vast labor efficiencies can be achieved in a market sector that, according to our research, is spending upwards of $10 billion per year just to produce "one time use" paper output. The view of L&H is that significant value exists in extracting medical facts or data from these huge volumes of narrative text-based reports. Speech technology combined with NLP will help the medical market cross the chasm of converting unstructured voice input into structured, datatized, reusable output. The acquisition of Dictaphone Corp. points to the importance of having a sizable customer base, a field sales and field engineering organization and an established set of workflow "pipes" already laid in the marketplace.
CESAR VOHRINGER, Philips Electronics: The recent purchases by speech technology vendors document the consolidation process in the speech industry, which is moving from a market creation phase into a maturity phase. Philips' MedQuist investment strengthens PSP's leading position in the professional dictation segment and puts us in an excellent position to best serve the medical dictation and transcription market in the U.S. The profitability of the MedQuist operation will be further enhanced through implementation of the latest Philips speech recognition technology in their applications. We will expand this business both geographically and into other market segments (e.g., legal, insurance).
sT The medical dictation market has now surpassed the legal market to become the largest vertical niche for speech technology products. Do you see other vertical markets that are ripe for ingress by speech product providers?
Schwager: Certainly, several additional markets that involve the high-volume production of narrative reports are great candidates for the speech and NLP-based workflow systems being applied to the medical market. The law enforcement and insurance industries are two good examples. Police departments have very significant reporting demands as well as the need to extract data from these largely narrative reports. Both industries also point to the need to provide these workflow and production services on mobile platforms, since police and insurance claims workers are largely mobile professionals.
Vohringer: The dictation market is ripe wherever professionals rely on the dictation metaphor to be productive. Professionals such as physicians of all sorts, lawyers and insurance agents (e.g., claim adjusters) have historically used dictation as an efficiency tool. Their value-add resides in creating the content of a report. And as you speak seven times faster than you can write, dictation represents the optimal solution for these people. Digital dictation, speech recognition and the integration of speech, images and data in an IT infrastructure can deliver to organizations productivity and efficiency gains that are simply striking. In many other vertical markets, like customer care, directory assistance and banking, cost containment programs, competitive pressure and the need to improve the customer service level are all important business drivers speeding up the deployment and implementation of speech recognition solutions.
sT What factors contributed to your company's decision to enter the medical dictation market?
Schwager: L&H has been in the medical market for some time. It is, without a doubt, the biggest vertical market for speech recognition and the one with the strongest characteristics for mass-market adoption. This is driven by the vast amounts of money being spent on medical transcription - measured in billions of dollars each year. L&H's acquisition of Dictaphone Corp. underscored the importance of providing not just "technology" but rather an end-to-end solution that embeds the technology in a fluid and user-friendly workflow process which complements the current practice patterns of physicians and health information professionals. Speech within the health care arena must deal with the issue of physicians being very resistant to practice pattern change. They are under tremendous pressure to see more patients and to do so in the most efficient manner possible. For this reason, speech recognition will most easily be deployed as a background process connected to our many voice capture servers in 3,000 hospitals. The audio files will be post-processed by speech recognition servers, which will produce a draft for either the medical transcriptionist or the physician to finalize. It's a process, not a discrete technology.
Vohringer: For more than 40 years Philips has been the principal force in delivering speech technologies to users around the world. Creating new innovations to improve the way the world communicates with machines and with each other has always been at the forefront of the company's mission. The Philips Speech Processing business unit evolved from Philips Dictation Systems. We are the world market leader in professional dictation products, with a world market share of about 32 percent. Our focus on heavily dictation-oriented markets such as medical is therefore nothing new; it is a core market segment we have been serving for decades. Innovation led Philips to invent the minicassette, designed specifically for dictation, to develop professional tape-based and digital dictation and transcription products, and to become the first company to develop a commercially available natural continuous speech recognition engine that allowed spoken words to be converted into text on a PC. In 1994, with the SP6000, Philips launched the first client/server-based continuous speech recognition system for the radiology market. Our background in the dictation business and our understanding of the dictation habits and workflow procedures of our client base allowed us to develop solutions that applied the latest technology in a way that did not force dictation users to change the way they speak and work. Professional speech input and transcription devices, workgroup solutions, specialized contexts (vocabularies and language models) for specific professions, and state-of-the-art speech recognition technology are the ingredients of successful installations and of the endorsement of our solutions by the medical industry. The acquisition of MedQuist reinforces our commitment to the medical dictation market.
sT Is there now, or will there be, some utilization of a marriage between medical dictation and the Internet? If so, how is that happening or how will it happen?
Schwager: L&H is in the process of creating an ASP delivery center which utilizes Web technology to transport audio files to speech recognition servers and return draft documents either to the hospital staff/physicians or to L&H's outsourced transcription service. The marriage of Internet-based ASP business models and advanced technology such as automatic speech recognition and NLP will also allow L&H to reach "down market" to the physician practice market and offer these same end-to-end solutions for lower volume medical reporting environments. This same ASP business model will be used to further process the text reports with NLP to provide clinical coding, which is increasingly being required by the federal government.
Vohringer: Absolutely. The Internet has become a prime source of reference for the medical industry. It is a platform that will become an integral part of dictation and transcription solutions.
sT How pervasive has speech become in the medical records industry?
Schwager: It has already emerged from the immature technology phase to the commercial acceptance phase within selected domains, such as Radiology and Pathology. L&H PowerScribe is being successfully used by over 500 physicians in very high volume environments. By offering ASR servers on either a client/server or an ASP basis as an "extension" to L&H Dictaphone Enterprise Express systems (which are currently being used by hundreds of thousands of physicians each day), we are about ready to see ASR move into the maturity phase of mass market application, unmatched by any other vertical market or supplier.
Vohringer: Speech has always been pervasive in the medical records industry. The U.S. marketplace has developed fastest in this respect. While tape-based dictation products were replaced by digital dictation solutions to speed up the transcription process and address the growing security needs in medical institutions, speech recognition is now paving the way for the next revolution: Speech is becoming the prime user interface for accessing information, commanding applications and generating documents. In the medical field, speech recognition-based dictation solutions will become part of virtually all applications.
sT The members of the speech technology business community seem to think that speech as a business is about to undergo an explosion of sorts onto the public consciousness. Do you agree, and why?
Schwager: Yes, as stated in the above remarks, this will happen first in the medical market as we design speech into the workflow platform and properly support it with the appropriate system implementation resources - integration, physician training, localization of language models, etc.
Vohringer: Speech technology has been associated for a long time with science fiction. The speech industry has continuously developed this technology to very high standards. However, the expectations of users were much higher and did not take the given restrictions and limitations into account. Addressing the PC mass market with speech recognition solutions for users who are not used to talking to their computers, and requiring them to learn a new skill, namely dictating, is a huge task and has not reached its goal to date. In the business-to-business environment, however, adoption of speech recognition represents a striking business case and competitive weapon. The technology is mature; certain professional market segments are already climbing the maturity curve. The explosion in the B2B sector is happening, and public awareness will rise rapidly as more and more services (e.g., in telecom directory assistance, voice portals, unified messaging) become speech enabled and consumer products with embedded speech recognition technology (e.g., mobile phones) flood the mass market by the millions.
sT Handheld, mobile connectivity devices are the market rage right now. How important is the medical dictation device technology and experience as a component for developing these handhelds?
Schwager: This is very important since the physician is a mobile professional. Speech-enabled mobile devices not only facilitate draft production of text, but provide the crucial Speech-based User Interface needed to address the obstacle of small visual interfaces currently available on mobile devices. Command and control functions by speech recognition, text-to-speech functions and Intelligent Content Management will be key for successful deployment of L&H's health care strategies to the time-crunched and practice-pattern physician professional.
Vohringer: The speech input devices are of utmost importance, as they have to respond adequately to user requirements. Our know-how in the dictation device field allowed us to develop products that combine technological innovations with the right user interface, building on the working habits of our customer base. As mobility and Internet access are becoming increasingly important to users, we, as a vendor of dictation products and speech recognition software, will continue to work on integral solutions to address current and new needs of our customers. We will make them more productive and enable them to do their jobs better and as conveniently as possible. We have all the competencies in house and are best prepared to keep innovating in the dictation industry in the future.
sT With respect to your specific business, what do you see happening in the way of product development and market targeting in the near term? What about in the future?
Schwager: The convergence of speech recognition, NLP fact extraction (referred to by L&H as Clinical Language Understanding), mobile technology and clinical information Web portal services are all key technology pillars supporting L&H's health care vision. Our goal is to create tremendous productivity in a vertical market that sees several billion dollars in administrative waste per year and, at the same time, to use this same technology to create data out of text, allowing health care providers and non-providers such as the pharmaceutical and insurance industries to do things which previously could not be done at any cost.
Vohringer: As a global speech technology vendor, we are addressing all speech market segments, be it dictation, telecommunications or speech-enabled devices with our technology embedded in them. With respect to the professional dictation market, we will continue our strategy to remain the leading dictation company worldwide with a product portfolio of hardware, software and services (MedQuist), uniquely addressing the needs of our customer base. This includes the development of additional market-specific contexts in the medical, legal and insurance fields for various regions and languages. With mobile devices becoming universal speech devices for dictation, telephony and access points to the Internet, we will come up with appropriate solutions for users in various contexts of use, but primarily where they have busy hands and busy eyes.
Gary Moyers is the executive editor of Speech Technology Magazine.