A speech recognition system having a user interface that provides both visual
and auditory feedback to a user. The user interface includes an audio sound or speech
generator that produces three distinct sounds: an "on" sound signifying that the
speech recognition system is on and actively awaiting vocal input; an "off" sound
indicating that the speech recognition system is off and in a sleep mode; and a
"confirm" sound noting that an utterance has been recognized. The "on" sound is
triggered by a key "wake up" command or by depression of a button. Once awake, the
speech recognition engine expects to receive an utterance within a predetermined
response time. The "confirm" sound signals the start of the response time. If the
response time lapses before a recognizable utterance is entered, the "off" sound
is played. The user interface further includes a visual component in the form of
a count graphic that changes as the response period elapses. In one implementation,
the count graphic is a progress bar that counts down or shortens in proportion
to the passage of the response period. When the response time runs out, the progress
bar disappears entirely. On the other hand, if the speech engine recognizes an
utterance within the response period, the user interface plays the "confirm" sound
and restarts the countdown graphic. The user interface may also change the color
of the graphic elements briefly to reflect a correct voice entry.
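
The feedback behavior described above can be illustrated with a minimal sketch. It is not the claimed implementation: the class name `FeedbackUI`, its method names, the 4-second response time, and the text-based progress bar are all hypothetical, chosen only to show how the "on", "off", and "confirm" sounds and the countdown graphic might relate to one another. Sounds are recorded as strings in a list rather than played.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FeedbackUI:
    """Hypothetical model of the on/off/confirm feedback loop."""
    response_time: float = 5.0   # assumed value; the source specifies none
    awake: bool = False
    remaining: float = 0.0
    events: List[str] = field(default_factory=list)  # sounds "played" so far

    def wake(self) -> None:
        """Wake-up command or button press: play "on", start the countdown."""
        self.awake = True
        self.remaining = self.response_time
        self.events.append("on")

    def tick(self, dt: float) -> None:
        """Advance time; play "off" and go to sleep when the time lapses."""
        if not self.awake:
            return
        self.remaining = max(0.0, self.remaining - dt)
        if self.remaining == 0.0:
            self.awake = False
            self.events.append("off")

    def utterance_recognized(self) -> None:
        """Recognized utterance: play "confirm" and restart the countdown."""
        if not self.awake:
            return
        self.remaining = self.response_time
        self.events.append("confirm")

    def progress_bar(self, width: int = 10) -> str:
        """Bar whose length is proportional to the remaining response time."""
        if not self.awake:
            return ""  # the bar disappears entirely once the system sleeps
        filled = round(width * self.remaining / self.response_time)
        return "#" * filled


ui = FeedbackUI(response_time=4.0)
ui.wake()
ui.tick(2.0)
print(ui.progress_bar(8))   # half the period has passed: "####"
ui.utterance_recognized()
print(ui.progress_bar(8))   # countdown restarted: "########"
ui.tick(4.0)
print(ui.events)            # ['on', 'confirm', 'off']
```

Passing elapsed time into `tick` instead of reading a real clock keeps the sketch deterministic; an actual system would presumably drive the countdown from a timer and the recognition engine's callbacks.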