A visual interface to an interactive voice response (IVR) system is
provided to allow an interaction between a caller and the IVR system to be
visually monitored. A visual representation of an audio communication
with an agent is generated based on an IVR script. The commands in the
IVR script can be mapped to the visual representation. One or more fields
in the visual representation can be populated with utterances of the
caller. The agent can optionally review or update a field in the visual
representation that has been populated with an utterance. The agent can
also optionally alter the flow of the IVR script or intervene in the audio
communication.
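
By way of illustration only, the mapping described above might be sketched as follows. The script format (a list of dictionaries with "type", "name", and "text" keys), the class names, and the helper functions are all hypothetical and are not taken from any particular IVR platform; the sketch assumes that each prompting command becomes one field in the visual representation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FormField:
    """One field of the visual representation (hypothetical structure)."""
    label: str
    value: Optional[str] = None      # utterance of the caller, once captured
    edited_by_agent: bool = False    # set when the agent corrects the value


@dataclass
class VisualForm:
    fields: dict[str, FormField] = field(default_factory=dict)
    current: Optional[str] = None    # field most recently populated


def build_form(script: list[dict]) -> VisualForm:
    """Map each 'prompt' command of the script to a field; skip other commands."""
    form = VisualForm()
    for command in script:
        if command["type"] == "prompt":
            form.fields[command["name"]] = FormField(label=command["text"])
    return form


def record_utterance(form: VisualForm, name: str, utterance: str) -> None:
    """Populate a field with an utterance of the caller."""
    form.fields[name].value = utterance
    form.current = name


def agent_update(form: VisualForm, name: str, corrected: str) -> None:
    """Let the agent overwrite a populated field."""
    target = form.fields[name]
    target.value = corrected
    target.edited_by_agent = True


# Assumed example script; the command vocabulary is invented for this sketch.
script = [
    {"type": "say", "text": "Welcome."},
    {"type": "prompt", "name": "account", "text": "Account number"},
    {"type": "prompt", "name": "amount", "text": "Payment amount"},
]

form = build_form(script)
record_utterance(form, "account", "one two three")
agent_update(form, "account", "123")
```

In this sketch the agent's correction replaces the recognized utterance and is flagged, so that a reviewer can tell which values came from the caller and which were entered by the agent. Altering the flow of the script or intervening in the audio communication is not modeled here.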