Voice Recognition and Screen Readers to Assist Students
Voice Recognition Software Helps Students
THE FUNDAMENTALS OF VOICE RECOGNITION
1. What is voice recognition technology?
Voice recognition is a computer application that lets people control a computer by speaking to it. In other words, rather than using a keyboard to communicate with the computer, the user speaks commands into a microphone (usually on a headset) that is connected to a computer.
By speaking into the microphone, users can do two things. First, they can tell their computers to execute commands such as opening a document, saving changes, deleting a paragraph, or even moving the cursor--all without touching a key. Second, users can write using voice recognition in conjunction with a standard word processing program. When users speak into the microphone, their words appear on the computer screen in the word processor, ready for revision and editing.
2. How does voice recognition work?
First, to operate a computer through voice, the user must learn how to dictate in a word-by-word manner known as "discrete speech." In other words, the computer cannot recognize individual words if they are spoken the way people usually speak--in fluent sentences or "continuous speech." Next, the user must "teach" the system to recognize his or her voice through a combination of training and usage. We all pronounce individual words in different ways, and voice recognition software cannot simply recognize everyone's voice right off the bat.
As the user speaks to the system, the software creates a user-specific voice file that stores detailed information about his or her voice qualities and pronunciations. The system uses this information to make its best guess at each word as it is dictated.
The process of "familiarizing" the voice recognition software with an individual voice takes time. A user who takes the time to train and use the system properly builds a strong, accurate voice file, and the system will then supply the correct word most of the time. However, the system will never achieve a 100% accuracy rate in all situations. Sometimes the computer suggests the wrong word, and the user must then stop and correct it.
3. What happens when the computer does not recognize a dictated word correctly?
Because the computer "knows" that it occasionally makes mistakes, it generates a list of alternative words alongside each word it offers as its best guess. In some voice recognition programs, this list appears in a suggestion window on the screen, and its contents change with each dictated word. If the desired word appears in the list, the user can correct a mistake by choosing it.
If the correct word is not in this list of alternatives, the user can spell it aloud, letter by letter, or begin typing the letters on the keyboard. The computer will use this information to predict the right word.
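The correction procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular product works: the function names, the scores, and the small vocabulary are all hypothetical, and a real system would derive its scores from the user's voice file.

```python
from typing import Dict, List


def rank_alternatives(scores: Dict[str, float], top_n: int = 5) -> List[str]:
    """Order candidate words from best guess to worst (hypothetical scores)."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_n]]


def narrow_by_spelling(vocabulary: List[str], spelled: str) -> List[str]:
    """Keep only the words that start with the letters spelled or typed so far."""
    prefix = spelled.lower()
    return sorted(w for w in vocabulary if w.lower().startswith(prefix))


# Hypothetical scores for one dictated word; the first entry is the best guess,
# and the rest would fill the suggestion window.
scores = {"write": 0.41, "right": 0.38, "rite": 0.12, "ride": 0.09}
alternatives = rank_alternatives(scores, top_n=3)

# If the desired word is not listed, the user spells it letter by letter,
# and each new letter shrinks the set of possible words.
vocabulary = ["right", "rite", "ride", "write", "wrist", "wrote"]
after_two_letters = narrow_by_spelling(vocabulary, "wr")
after_three_letters = narrow_by_spelling(vocabulary, "wri")
```

In this sketch the best guess is simply the first item in the ranked list, and every additional spelled letter leaves fewer words to choose from.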
Keep in mind that every system works a little bit differently, but the essential procedures described above are ones that all systems rely on to some degree.
4. What exactly constitutes a voice recognition system?
A voice recognition system is made up of a computer with system software, voice recognition software, a microphone, and usually a sound card. To use voice recognition for word processing, a word processing program is also needed. Each software program has different hardware requirements, but generally speaking a relatively powerful computer is needed--typically one with a Pentium or a very fast 486-based CPU and at least 16 MB of RAM.
In general, the voice recognition software itself is built on three parts: a large electronic dictionary (e.g., a 150,000-word dictionary from a publisher such as Merriam-Webster), a smaller active dictionary that reflects the user's own usage, and a voice model.
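One way to picture how the two dictionaries might relate is the short Python sketch below. The text does not say how real products combine these parts, so the class, its methods, and the rule "check the active dictionary first" are assumptions made for illustration; the voice model is left out entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Vocabulary:
    """Hypothetical pairing of a large dictionary with a small active one."""
    large_dictionary: Set[str]
    # Active dictionary: word -> number of times this user has dictated it.
    active_dictionary: Dict[str, int] = field(default_factory=dict)

    def record_use(self, word: str) -> None:
        """Note that the user dictated a word, so it counts as active."""
        word = word.lower()
        self.large_dictionary.add(word)
        self.active_dictionary[word] = self.active_dictionary.get(word, 0) + 1

    def lookup_order(self, prefix: str) -> List[str]:
        """List matching words, the user's most-used words first."""
        prefix = prefix.lower()
        active = [w for w in self.active_dictionary if w.startswith(prefix)]
        active.sort(key=lambda w: (-self.active_dictionary[w], w))
        rest = sorted(
            w for w in self.large_dictionary
            if w.startswith(prefix) and w not in self.active_dictionary
        )
        return active + rest


vocab = Vocabulary(large_dictionary={"there", "their", "theory", "thermal"})
vocab.record_use("their")
vocab.record_use("their")
vocab.record_use("there")
order = vocab.lookup_order("the")
```

Here the words the user has actually dictated come ahead of the rest of the large dictionary, which is one plausible reading of an active dictionary that "reflects the user's own usage."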
How screen readers work
Screen readers are audio interfaces. Rather than displaying web content visually for users in a "window" or screen on the monitor, screen readers convert text into synthesized speech so that users can listen to the content. Sighted users usually have a hard time imagining having to always rely on an audio interface because their world is so highly visual. The experience is completely different, to be sure. The miracle is that the option for an audio interface even exists at all. Without screen readers, people who are blind would need to rely on other individuals to read the content out loud to them. The technology makes independent access to information possible for a population that would otherwise always need the support and assistance of others.
Screen readers do not read web content quite like human beings do. The voice may sound somewhat robotic and monotone. In addition, experienced users often like to speed up the reading rate to 300 words per minute or more, which is more than the inexperienced listener can easily understand. In fact, when many people hear a screen reader for the first time, at the normal rate of about 180 words per minute, they complain that it reads too quickly. It takes time to get used to a screen reader, but the interesting thing is that once users get used to it, they can race through content at speeds that can amaze sighted individuals.
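To make the difference between those two reading rates concrete, here is a small Python calculation. The 1,800-word article length is a made-up figure chosen for easy arithmetic; the rates of 180 and 300 words per minute come from the paragraph above.

```python
def listening_minutes(word_count: int, words_per_minute: int) -> float:
    """Minutes needed to hear a text read aloud at a given rate."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    return word_count / words_per_minute


article_words = 1800  # hypothetical article length

normal_rate_minutes = listening_minutes(article_words, 180)  # 10.0 minutes
fast_rate_minutes = listening_minutes(article_words, 300)    # 6.0 minutes
minutes_saved = normal_rate_minutes - fast_rate_minutes      # 4.0 minutes
```

At the faster rate the same article takes six minutes instead of ten, which adds up quickly for someone who reads by ear all day.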
Two of the most common screen readers are JAWS, by Freedom Scientific, and Window-Eyes, by GW Micro. These programs can read not only web content but also the Windows operating system, word processing programs, and other software. Mac and iOS devices come with VoiceOver. There are many other types of screen readers available.