A method and apparatus are disclosed for validating agreement between
textual and spoken representations of words. A voice input verification
process monitors a conversation between an agent and a caller to validate
the textual entry of the caller's spoken responses or the agent's spoken
delivery of a textual script (or both). The audio stream corresponding to
the conversation between the agent and the caller is recorded, and the
textual information that the agent enters into a workstation is
evaluated. Speech recognition technology is applied to the recent audio
stream to determine whether the words entered by the agent can be found
in it. The grammar employed by the speech
recognizer can be based, for example, on properties of the spoken words
or the type of field being populated by the agent. If there is a
discrepancy between what was entered by the agent and what was recently
spoken by the caller, the agent can be alerted and the error can
optionally be corrected.
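
The comparison step described above can be illustrated with a minimal
sketch. It assumes the recent audio has already been turned into a text
transcript by some speech recognizer (not shown), and it stands in for
the field-dependent grammar with a simple field-type switch. The names
`tokenize`, `verify_entry`, and the `"numeric"` field type are
hypothetical and are not taken from the disclosure.

```python
import re

# Hypothetical mapping used to normalise spoken digits for numeric fields.
_DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}


def tokenize(text, field_type="text"):
    """Lower-case the text and split it into comparable tokens.

    For a "numeric" field, spoken digit words and typed digits are both
    reduced to single-digit tokens; every other word is dropped.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if field_type != "numeric":
        return tokens
    digits = []
    for tok in tokens:
        if tok in _DIGIT_WORDS:
            digits.append(_DIGIT_WORDS[tok])
        elif tok.isdigit():
            digits.extend(tok)  # "4155" -> "4", "1", "5", "5"
    return digits


def verify_entry(entered, recent_transcript, field_type="text"):
    """Return the entered tokens that were NOT found in the transcript.

    An empty list means the entry agrees with what was recently spoken.
    """
    entered_tokens = tokenize(entered, field_type)
    heard_tokens = tokenize(recent_transcript, field_type)
    if field_type == "numeric":
        # Digits must appear in order as one contiguous run.
        n = len(entered_tokens)
        found = any(
            heard_tokens[i:i + n] == entered_tokens
            for i in range(len(heard_tokens) - n + 1)
        )
        return [] if found else entered_tokens
    heard = set(heard_tokens)
    return [tok for tok in entered_tokens if tok not in heard]


if __name__ == "__main__":
    transcript = "my name is john smith and the code is four one five five"
    for value, kind in [("Jon Smith", "text"), ("4155", "numeric")]:
        missing = verify_entry(value, transcript, kind)
        if missing:
            print(f"Alert agent: {value!r} not heard (missing {missing})")
        else:
            print(f"{value!r} matches the recent audio")
```

A non-empty result is the point at which the agent would be alerted. A
real system would compare against recognizer hypotheses and confidence
scores rather than a single exact transcript.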