The 2008 Speech Luminaries

Webster’s Dictionary defines a luminary as "a person who inspires or influences others, especially one prominent in a particular sphere." In the speech sphere, those named as luminaries this year by the editors of Speech Technology magazine have been inspiring and influencing others for years. Together they have more than 80 years of experience in corporate, organizational, and thought leadership. Their efforts have led to numerous patents and countless innovations in speech platform development and standards creation. They have also, during their careers, had a hand in advancing the causes of their respective companies and the industry as a whole, and have even appealed to the academic community for insight and inspiration. To be sure, the industry would not be where it is today without their help.

Man of Standards: Voxeo's Daniel Burnett

Daniel Burnett has just one year under his belt at Voxeo, but make no mistake: He’s hardly a newcomer to the speech industry. In fact, Burnett has had a big hand in creating and writing more than a dozen major speech standards in use today.

Regarded as one of the most effective standards leaders in the World Wide Web Consortium (W3C), Burnett received his Ph.D. in computer science from the Center for Spoken Language Research at Oregon’s OGI School of Science and Engineering in 1997. He spent two years in Austin, Texas, as a senior member of SBC Labs’ technical staff, working on a variety of speech-recognition system prototypes, before heading west in 1999 to join Nuance Communications’ Dialog Research Group. But it wasn’t until six months later that he truly struck gold.

That was when Burnett attended one of the first W3C Voice Browser Working Group meetings. He became immediately—and deeply—involved, all to the speech industry’s benefit. During the nine years that have followed, Burnett was chief editor of Speech Synthesis Markup Language versions 1.0 and 1.1; editor of VoiceXML 2.0, 2.1, and 3.0 and Media Resource Control Protocol v2; author of the Extensible Multi-Modal Annotation specification, Pronunciation Lexicon Specification, and State Chart XML; and a contributor to the Speech Recognition Grammar Specification, Semantic Interpretation for Speech Recognition, and the multimodal framework documents of the Multimodal Interaction Working Group. Colleagues note his "ability to build consensus and craft specification wording that precisely captures the consensus of a decision."

Amid his W3C involvement, in 2005 Burnett moved to Vocalocity, where he was product manager for the company’s Voice Browser offering. The following year he bounced back to the "new Nuance" (post-ScanSoft merger) for a more standards-focused engineering position, then landed at Voxeo as director of speech technologies. He is charged with not only continuing his standards work, but also improving the use of speech-recognition technology within Voxeo itself. "If it has to do with speech recognition, I’m the go-to person," he says.

Talk of the Town: Yap's Igor Jablokov

Multitasking may be alive and well, but Igor Jablokov didn’t like the idea of his younger sister—or any typical teenager—driving while text messaging. The dangerous combination served as the catalyst for another idea: an automated speech-recognition platform that could convert voice to text via mobile phones.

In 2006, Jablokov partnered with his brother, Victor, to launch Charlotte, N.C.-based Yap, which recently caught the attention of venture capitalists to the tune of $6.5 million. Yap’s technology not only translates a user’s voice to written words to send text messages, but it can also post to the Facebook and Twitter social networks and search the Web via Google, among other Web services. To be sure, even those without the big bucks have taken notice of Jablokov’s company: For example, last November Yap was named Early Stage Company of the Year by the North Carolina Technology Association. It was also among 40 start-ups selected out of a pool of 700 to demonstrate its technology at the 2007 TechCrunch40 conference, and was one of two winners in the voice platform category at Under the Radar: Mobility, which recognizes start-ups.

Prior to launching privately held Yap, Jablokov spent close to a decade at IBM, where he helped drive awareness for voice search using multimodal interaction. Considered by his colleagues to be a forward-thinker and passionate leader, Jablokov was program director for WebSphere-based multimodal and voice portal initiatives, heading a worldwide team focused on developing advanced speech technologies for IBM’s On Demand computing vision. During Jablokov’s tenure, he worked on the technologies that went into General Motors’ OnStar car security system and Honda’s navigational systems. He also spent time in IBM’s Pervasive Computing, PartnerWorld, Global Industries, and Microelectronics divisions.

Between two posts as the VoiceXML Forum’s director, Jablokov served as chairman and was also a business mentor for IBM’s Extreme Blue incubation and internship program. Jablokov received a degree in computer engineering from Penn State in 1997, and an MBA from the University of North Carolina in 2000.

Next-Gen Thinker: Bill Scholz

Bill Scholz, Ph.D., has led a long and distinguished career in speech technology and is now inspiring the next generation of industry players. The good news is, he’s showing no signs of slowing down.

Scholz’s 30-year tenure has had plenty of highlights. An internationally recognized industry leader in speech technology (not to mention one of Speech Technology magazine’s proclaimed top 10 experts), Scholz spent two decades at Unisys, from 1987 to 2007, where he co-founded the company’s Speech and Natural Language initiative as architect director for the company’s Voice and Business Mobilization Solutions Engineering division. He managed the design and development of the NL Speech Assistant (NLSA), an integrated suite of tools for building speech and dialogue applications, all the while traveling the world to speak at conferences, contributing articles about speech and service creation technologies to numerous trade journals, and participating in the World Wide Web Consortium’s Voice Browser Working Group, the SALT Forum, and the VoiceXML Forum. Additionally, as part of Unisys’ Defense Systems group, Scholz focused on artificial intelligence, knowledge-base/database integration, and advanced software technology.

Prior to Unisys, Scholz led software development at Rabbit Software, following roughly six years at Kulicke & Soffa Industries, where he designed a multitasking operating system and process control software for custom precision robots. His roots are in academia: Scholz earned a doctorate in cognitive psychology from Indiana University in the early 1970s, then went on to teach for much of the remaining decade.

Let’s fast-forward post-Unisys. Last year Scholz created his own consultancy, Strafford, Pa.-based NewSpeech Solutions, to focus on the planning, analysis, design, development, and deployment of speech technology solutions. He is also president of the Applied Voice Input/Output Society (AVIOS), where he is working to spread the word about speech technology among the community of researchers, developers, and users already entrenched in the industry. But here’s where Scholz truly comes full circle: Part of his mission is also outbound, dedicated to attracting tomorrow’s speech leaders who, in his words, "will lead the speech industry forward into new areas." One example: Under Scholz, AVIOS has spearheaded an annual student contest in which aspiring developers create speech applications that are judged by industry experts.

With Scholz leading the charge, the future of speech technology is assured.

Aligning Objectives: SpeechCycle's Roberto Pieraccini

If you build it, they will come. Right? Perhaps. Seeking to bridge the gap between what users want and what speech developers create, Roberto Pieraccini, Ph.D., has been working diligently to align the efforts of the speech research community with the needs of the commercial sector.

Who better to do so? With a 25-plus-year career steeped in both research and technology, Pieraccini knows firsthand that those focused on advancing speech technology are not always aware of market conditions that could lead to breakthroughs in speech performance, usability, and deployment. Likewise, businesses selling speech solutions don’t always know about the latest body of research that could bolster revenue. In an effort to get all of the parties talking, Pieraccini, who’s chairing the Speech and Language Technical Committee of the Institute of Electrical and Electronics Engineers (IEEE) Signal Processing Society, organizes conferences, workshops, and sessions, focusing their discussions on market realities.

Pieraccini also endeavors to get everyone on the same page as chief technology officer at SpeechCycle, which recently introduced the Caller Experience Index (CEI). In true Pieraccini style, CEI blends qualitative and quantitative assessment of a speech application to accurately evaluate the caller experience. He has also established research partnerships with several European universities to develop or improve commercial speech recognition applications, and was instrumental in crafting SpeechCycle’s High Definition Statistical Language Models, which recognize specific caller issues. Along those lines, Pieraccini has had three papers published in the past 12 months related to automated technical support, though that’s really just the tip of his article iceberg: Pieraccini has had more than 110 technical articles published about speech recognition, language modeling, language understanding, and dialogue, plus he writes a monthly blog called Dialing for Dialog. He is also recognized as a contributor to more than 10 patents.

Prior to joining SpeechCycle (then Tell-Eureka) in 2005, Pieraccini managed the IBM T.J. Watson Research Center in Yorktown Heights, N.Y. Before that, he led the Natural Dialog team at SpeechWorks International (now Nuance), following nearly a decade employed by Bell Laboratories and AT&T Shannon Laboratories. Pieraccini hails from Italy, where he worked on speech recognition algorithms at Centro Studi E Laboratori Telecomunicazioni, a Loquendo spin-off that is now Telecom Italia Labs. Pieraccini earned a doctorate in electrical engineering from Italy’s Universita degli Studi di Pisa in 1980.
