Speech Recognition in Healthcare: a Significant Improvement or Severe Headache?

Speech recognition (SR), also known as voice recognition, is the ability of a program to identify words in spoken language and convert them into a machine-readable format.

To better understand how this technology works, it would be helpful to touch upon its major types:  

  • front-end (user-dependent): the words are converted into text in real time, obviating the need for a transcriptionist;
  • back-end (user-independent): the words are recorded in digital form and then processed by a computer, after which the draft text is proofread by an editor.

Not a new word to the tech world

Voice recognition is used in an array of ways, from improving customer service to combating crime. In addition, BCC Research projects that the global market for SR will grow from $104.4 billion in 2016 to $184.9 billion in 2021.
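To put the market figures quoted above into perspective, the implied yearly growth can be worked out with the standard compound annual growth rate formula. The snippet below is only an illustrative calculation on the two numbers cited in this article; the function name is our own and is not taken from the BCC Research report.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Return the constant yearly rate that turns start_value into end_value."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Market size figures quoted in the article, in billions of US dollars.
rate = compound_annual_growth_rate(104.4, 184.9, 2021 - 2016)
print(f"Implied annual growth: {rate:.1%}")
```

On these figures the implied growth works out to roughly 12% per year.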

Healthcare is a sphere where SR has put down deep roots. Long ago, HIMSS called voice recognition an “aggressively” expanding market with a 20% growth rate per year. A KLAS report says that even though 50% of physicians resist adopting the technology, 9 out of 10 hospitals plan to expand SR use.

Considering pros and cons

Any technology that sparks a huge boom due to its considerable advantages often opens the door to new challenges. This article will outline some of the technology's pros as well as the ways to address the challenges of implementing SR in healthcare.

How voice recognition boosts physicians’ productivity

Ben Brown, Vice President of business development and investment services at KLAS Enterprises, is sure that SR technology significantly boosts productivity: “We saw radiologists who adopted speech recognition witness their productivity and competitiveness increase quite a bit.”

Let’s delve into voice recognition’s perks that enable such a productivity boost in healthcare organizations.

Less paperwork

The allure of SR is that it allows physicians to reduce paperwork. With a recording system at hand, doctors can avoid writing or typing numerous patient records, including diagnosis and treatment notes.

They just use portable devices (phones or voice recorders) for dictation and then obtain the needed data in a digital format.

Moreover, if voice recognition solutions are integrated with the existing EHR software, doctors can dictate directly into the PC-based EHR system and won’t have to manually process and handle the resulting data.

Time savings

Without a doubt, it takes less time to dictate than to write or type. By using SR technology, clinicians don’t have to manually prepare endless reports; they get more time for their primary duty: attending to patients. Moreover, dictation software reduces the average time needed to complete a report, thereby contributing to timely decision making.

Ben Brown confirms that time savings are one of SR’s advantages and prefers front-end systems in this regard, as they do not require an editor. “When clinicians do speech recognition on the spot, they actually complete a patient report much quicker than waiting for a transcriptionist to create a document that then must be reviewed, edited, and finalized,” he says.

As contradictory as it sounds, back-end SR can also have an advantage over front-end SR in terms of saving time. With such systems, physicians don’t have to verify their records over and over again and manually correct errors. They can rely on transcriptionists and review the document once the corrections are ready.

Improved workflows

Many physicians are sure that voice recognition is able to enhance their workflows, because SR solutions provide automatic queuing of dictations from several users to predefined assistants and selective routing of dictation files.
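The queuing and selective routing described above can be pictured with a small sketch. This is not how any particular SR product works; the class names, the routing-by-specialty rule, and the sample users and file names are all assumptions made up for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Dictation:
    author: str      # physician who dictated (hypothetical user name)
    specialty: str   # attribute used here for selective routing
    audio_file: str  # hypothetical file name


@dataclass
class DictationRouter:
    # Assumed mapping: specialty -> assistant predefined for those reports.
    assignments: dict
    default_assistant: str = "general_pool"
    queues: dict = field(default_factory=dict)

    def submit(self, dictation):
        """Queue a dictation for the assistant predefined for its specialty."""
        assistant = self.assignments.get(dictation.specialty, self.default_assistant)
        self.queues.setdefault(assistant, deque()).append(dictation)
        return assistant

    def next_for(self, assistant):
        """Return the oldest waiting dictation for an assistant, or None."""
        queue = self.queues.get(assistant)
        return queue.popleft() if queue else None


router = DictationRouter({"radiology": "editor_a", "cardiology": "editor_b"})
router.submit(Dictation("dr_lee", "radiology", "lee_001.wav"))
router.submit(Dictation("dr_kim", "radiology", "kim_001.wav"))
router.submit(Dictation("dr_roy", "dermatology", "roy_001.wav"))
print(router.next_for("editor_a").author)  # dr_lee
```

Each assistant gets a first-in, first-out queue, so dictations from several physicians reach the right editor in the order they were submitted, and anything without a predefined assistant falls back to a shared pool.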

Beyond that, front-end systems are trained to recognize users, their accents, and previous corrections, thus “learning” common speech patterns.

Nick van Terheyden, MD, Chief Medical Officer at NTT DATA, Inc., is a keen supporter of such user-dependent systems. He is sure that with a 90% accuracy rate they improve the overall workflow. “The software is very sophisticated and can distinguish among accents and specialties, information that becomes part of the provider’s profile. With that data and a limited amount of training, physicians can produce very accurate documents.”

Responding to challenges

Although SR technology is far from flawless and creates a number of challenges, many of them are easy to address.

Ambient noise

Unfortunately, hospitals are noisy places, with patients, family members, and medical assistants all contributing. This noise degrades the recording and makes SR tools prone to errors. In this case, noise-reducing microphones may solve the problem.

Variable quality

Gary David, PhD, an associate professor of sociology at Bentley University in Waltham, studied how front-end SR influences physicians’ work. He notes that without a medical transcriptionist, physicians become responsible for assuring the quality of records.

“These programs still make mistakes and, in my research, I found that physicians would sometimes let small errors go -- substituting ‘he’ for ‘she,’ for instance. At times, the errors can be very difficult to identify.”

Thus, according to David, it is not a good idea to task physicians with controlling quality. Instead, it’s wiser to use back-end SR, relying on transcription service providers.

Heavy accents

Here medical experts’ opinions differ. Nick van Terheyden thinks front-end systems are sophisticated enough to recognize accents. However, Brown’s studies show that the recognition of accents and words with more than one meaning is still the holy grail of voice recognition.

It may help for doctors to enunciate more clearly or to turn to back-end systems that claim to use thousands of samples, which improves the chances of handling a given accent.

Power outages

When working with any type of device or software, there’s a risk of losing important data. Therefore, it’s essential to implement a robust backup strategy, provide autosave functionality, and use UPS systems.
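As an illustration of the autosave idea, the sketch below writes a draft to a temporary file and then swaps it into place, so that a crash or power cut in the middle of a save does not leave a half-written report behind. The function name and file names are hypothetical, and a real SR or EHR product may handle this quite differently.

```python
import os
import tempfile


def autosave_draft(text, path):
    """Save a draft atomically: readers see the old or the new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory first.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)  # atomic swap on the same filesystem
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Calling `autosave_draft(current_text, "report_draft.txt")` at regular intervals during dictation would keep the most recent complete draft on disk; this complements backups and UPS systems rather than replacing them.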

Other issues

There are some other things to consider before adopting SR technology in healthcare institutions.

Transcription errors. This issue is relevant for both front- and back-end systems. On the one hand, physicians may be extremely busy and fail to pay close attention to small errors (according to Gary David’s research). On the other hand, transcription service providers may also lose sight of key details.

In any case, humans tend to err, and the same mistakes may occur when data is entered manually.

Poor grammar becomes a critical issue in the case of back-end SR, when the system doesn’t recognize a particular user and their common mistakes, forcing an editor to double-check.

Cost. Voice recognition systems are quite expensive to set up. Moreover, the cost is higher in the case of back-end SR, as hospitals need to employ additional staff, such as transcriptionists or editors. However, some healthcare facilities that acquired automatic medical transcription software report notable financial savings.

Adoption challenges. There are other factors that may influence successful voice recognition adoption, such as physicians’ age or speech disorders. For example, researchers acknowledge that younger physicians are more willing and faster to adopt the technology than established practitioners are.

Security is one more thing that requires careful attention. It’s paramount that the chosen SR solution is HIPAA-compliant.

Each type of voice recognition has its own pros and cons. David and van Terheyden agree this technology can both ease and complicate clinicians’ lives. That’s why they believe that “it’s key that we give physicians the choice to optimize workflow, whichever system they prefer.”

Key takeaways

In this article, we have tried to tackle the pros and cons of SR implementation in hospitals and answer the question of whether this technology represents a significant step forward or creates more problems.

Relying on some stats and medical experts’ findings, we have discovered an array of voice recognition’s positives, such as better productivity and improved work processes. However, after a more detailed examination, we have revealed a number of challenges.

Some of these challenges, such as noise or power outages, are quite small and can be easily addressed. But others, such as accent recognition or poor grammar, need particular attention from experts in custom medical software development. How developers overcome these challenges will determine the future of SR technology.


Produced from materials originally authored by Yana Yelina, a Technical Copywriter at Oxagile.

Last updated: Aug 21, 2017 at 9:47 AM
