June 13, 2022
By Paul Korzeniowski technology writer
Features

Vertical Markets Spotlight: Speech in Government

Government agencies in the United States employ more than 24 million people, creating a workforce that is much larger than the 1 million or so employed by Wal-Mart, the largest private-sector employer in the country. Like businesses, government agencies are trying to streamline operations and remove friction from their workflows by leveraging emerging speech solutions. As a result, agencies have found a number of applications for the technology, helping police, social workers, and healthcare professionals do their jobs.

Many government sectors have comprehensive reporting requirements, forcing employees to spend huge parts of their workdays creating documents. Speech-to-text solutions transform the work by simplifying data input.

This is especially true in law enforcement, which remains a very paper-intensive service. At the scene of a crime or an accident, officers are required to document what they see and who was involved. That information then passes through various channels: supervisors, lawyers, and courts.

While proper documentation is critically important, data input is not an area of expertise for most police personnel. In some cases, officers can spend an hour or more typing a single incident report. In fact, research found that 38 percent of officers spend three to four hours each day on reporting, and about another 17 percent allocate four or more hours each week. For police sergeants, paperwork consumes up to 45 percent of the workday.

Heavy documentation demands delay the filing of reports, limit community visibility, and even put officers at risk. Rather than leaving officers to struggle with typing, speech recognition systems improve the process by replacing keyboard entry with voice input.

The change has many potential benefits. It speeds up data entry into computer-aided dispatch and records management systems. Reports that sometimes took a day or a day and a half to complete can now be done in less than 45 minutes.

Such capabilities can streamline operations in the following ways:

• Police perform everyday tasks, like license plate lookups, by voice.

• Departments eliminate paperwork backlogs.

• Police save money by eliminating transcription costs.

• Report details and accuracy improve because officers are no longer rushed and invest their time in editing rather than inputting information.

• Police spend more time on the street and less at their desks and in cars typing.

The pandemic forced police departments, like organizations in other sectors, to revamp their workflows. Police investigators in Watertown, Mass., implemented speech recognition technology to address COVID-19 social distancing requirements. With the new system, suspects stayed in the back seat of the cruiser and made cell phone calls to their defense attorneys. A conference call was then set up at an agreed time with the judge, the prosecuting attorney, and the defense attorney, and the suspects were arraigned in the car rather than in the station house.

Adopting speech technology also makes police work more attractive. During the past few years, the profession has had a hard time attracting new recruits and holding on to veteran officers. Documentation requirements are one drag on interest: 81 percent of police officers said paperwork was a big contributor to burnout.

Finally, and perhaps most importantly, safety improves. “Traditionally, police would be hunched over in the squad car trying to type information in a timely manner,” explained Ed McGuiggan, vice president and general manager of Dragon Professional at Nuance Communications, which has 400 U.S. police departments using its system. “With a speech recognition system and a power mic, their hands are free, so they keep their heads up and are aware of everything happening in their surroundings.”

Paperwork Weighs Down Social Services

Social services is another government area that relies heavily on paperwork. In family courts, for example, if an item is not documented beforehand in case notes, it is treated as though it did not occur. Processes require that individuals with many job titles (from family crisis therapists to regional administrators) in a hodgepodge of departments (including investigations, treatment, and permanency) document case information as it makes its way through the system. Caseworkers must complete items such as initial interviews, risk assessments, and treatment plans for each of their clients. Because of onerous workloads, many spend time in the evening typing up the paper notes or voice memos they made during the workday.

With a speech-to-text solution, the reports can be dictated in the moment. When they leave a meeting, employees can dictate their notes as they drive to the next appointment. The speech systems also help to prepare reports, dictate court updates to relevant parties, and compose emails.

Wireless microphones are another convenience, particularly as social workers spend much of their day on the road visiting client sites. Because there is less rushing, information is often fresher and more comprehensive, and reports take as much as 75 percent less time to prepare than with legacy applications.

Speech Offloads 311 Calls

Speech recognition is also helping to improve interactions with 311 citizen information lines. These lines provide a quick and simple way for constituents to report problems or ask questions without tying up emergency lines or going through often frustrating municipal channels.

In some municipalities, 311 calls connect to the same contact centers used for 911 calls. 311 calls are assigned a secondary priority and answered only when no 911 calls are waiting. This approach extends call center usage but ensures that emergency calls are answered quickly and have highest priority.
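The shared-queue arrangement described above can be sketched as a simple priority queue. This is only an illustration, not how any particular contact center platform works: the class name, the priority values, and the sample callers are all hypothetical.

```python
import heapq
import itertools

# Hypothetical priority levels: a lower number is answered first.
PRIORITY = {"911": 0, "311": 1}


class SharedCallQueue:
    """Toy model of one contact center serving both 911 and 311 lines."""

    def __init__(self):
        self._heap = []
        # Arrival counter breaks ties so calls on the same line stay first come, first served.
        self._arrival = itertools.count()

    def add_call(self, line, caller):
        heapq.heappush(self._heap, (PRIORITY[line], next(self._arrival), line, caller))

    def next_call(self):
        """Return (line, caller) for the next call to answer, or None if no calls wait."""
        if not self._heap:
            return None
        _, _, line, caller = heapq.heappop(self._heap)
        return line, caller


queue = SharedCallQueue()
queue.add_call("311", "pothole report")
queue.add_call("911", "traffic accident")
queue.add_call("311", "council meeting question")

order = []
while (call := queue.next_call()) is not None:
    order.append(call)
print(order)
```

In this sketch the 911 call is answered first even though it arrived second, and the two 311 calls follow in arrival order.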

Speech recognition is often coupled with intelligent call routing. In that case, contact centers offload routine inquiries, like the date and time of the next city council meeting, from agents to less expensive and more efficient automated systems.
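As a rough illustration of that kind of routing, the sketch below takes the text produced by a speech recognizer and decides whether an automated system or a live agent should handle the call. Real routing engines are far more sophisticated; the topic names, keywords, and function name here are assumptions made up for the example.

```python
# Hypothetical routine topics and the keywords that signal them.
ROUTINE_TOPICS = {
    "council_meeting": ("council meeting",),
    "trash_pickup": ("trash", "garbage", "recycling"),
}


def route_call(transcript):
    """Return ("automated", topic) for a routine inquiry, else ("agent", None)."""
    text = transcript.lower()
    for topic, keywords in ROUTINE_TOPICS.items():
        if any(keyword in text for keyword in keywords):
            return ("automated", topic)
    # Anything the keyword list does not recognize goes to a person.
    return ("agent", None)


print(route_call("When is the next city council meeting?"))
print(route_call("My neighbor's tree fell on my fence"))
```

The first caller would be handed to the automated system, while the second, which matches no routine topic, would go to an agent.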

Speech Streamlines Healthcare Processes

Government plays a central role in healthcare in many countries. Turkey, for example, has a national healthcare system, and an important element is scheduling appointments for the country’s 87.4 million citizens. The Ministry of Health supports multimodal (call, text, email, web) appointment scheduling, but most adults prefer to call to set a date. Consequently, almost half of all appointments are entered via its call center, which is operated by AssisTT, the contact center arm and wholly owned subsidiary of Türk Telekom. A staff of 4,250 handles 166 million calls yearly.

AssisTT has been using Verint voice technology to offload some calls from live operators to automated systems. The transition has its challenges. “Whenever you work with the government, everything has to be done by the book and follow the guidelines established in the contract,” explains Izzet Erten, chief technology officer for AssisTT.

When the system was installed in 2019, the government set a high mark (90 percent) for transcription accuracy. Meeting that number was difficult because the system was new; the solution was not tuned for Turkish, which has a number of local dialects; and input from elderly patients was difficult to understand because their speaking abilities were waning. AssisTT and Verint worked together to raise accuracy into the 90 percent range.
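The article does not say how the 90 percent target was measured. One common yardstick for transcription quality is word accuracy, defined as 1 minus the word error rate, where errors are the substitutions, insertions, and deletions needed to turn the transcript into a reference text. The sketch below shows that calculation under that assumption; the function name and the English sample sentences are hypothetical.

```python
def word_accuracy(reference, hypothesis):
    """Return 1 - WER, with WER computed by word-level edit distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return 1 - prev[-1] / len(ref)


reference = "i need an appointment with a heart doctor on monday"
transcript = "i need an appointment with a heart doctor on sunday"
print(round(word_accuracy(reference, transcript), 2))
```

Here one wrong word out of ten gives a word accuracy of about 0.9, which would sit right at a 90 percent threshold.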

More Rungs to Climb

While government is finding more applications for speech technology, vendors must clear more hurdles to deploy the technology in the sector than in many others. The first is identifying the potential buyer, not always an easy task.

Government use cases are often unique and do not follow the cookie-cutter approach seen in the commercial market. Therefore, these systems require special skills and a lot of integration work. “We work with partners who specialize in working directly with governments in countries like Singapore and Australia,” explains Daniel Ziv, vice president of speech and text analytics, global product strategy, at Verint.

Systems might be installed for the first time, so training is needed. Employees might require a handful of hours to understand the basics, such as connecting and positioning a microphone; building a user profile; reading the acoustic training text; and using a digital voice recorder.

Finding funding can also be a tedious process. Governments are constantly under pressure to lower costs and operate more efficiently. In addition, buying cycles tend to be long. “It may take close to one year to close a deal, depending on where the agency is in the budgeting cycle when the process begins,” explains Nuance’s McGuiggan. Funding can also be turbulent. In certain cases, projects are scrapped late in the process because of budget shortages and changes in administration.

While it is a challenge, selling speech to the government can be lucrative given the size of its workforce. Speech is helping to streamline data input and 311 call routing. Look for the number of use cases to expand as the technology matures, but vendors still face hurdles in finding decision makers and money to further expand its use.

Paul Korzeniowski is a freelance writer who specializes in technology issues. He has been covering speech technology issues for more than two decades, is based in Sudbury, Mass., and can be reached at paulkorzen@aol.com or on Twitter @PaulKorzeniowski.
