An approach for monitoring interaction between individuals engaged in a
communication session is disclosed. The individuals are described herein
as a customer service representative and a customer, and the
communication session is conducted over a communication network. Audio data
embodying the communication session is copied and stored to a media file
in conjunction with video data captured by a video capture device
monitoring the customer service representative. The media file is a data
structure in which the audio data and the video data are stored in
segmented fashion. Each segment of audio data is associated with a
segment of video data based on a common time reference, thereby providing
synchronized documentation of the communication session. The media file
is stored in a database and is available to a supervisor, who uses a
server computer to monitor the communication session for quality
assurance or other evaluation purposes.
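
The segmented media file described above might be modeled as in the
following minimal sketch. It is illustrative only and is not the disclosed
implementation: the names MediaFile, add_audio, add_video and
synchronized_segments are hypothetical, and the use of a millisecond
offset as the common time reference is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MediaFile:
    """Hypothetical container holding audio and video in segmented fashion."""

    session_id: str
    # Both maps are keyed by the segment start time in milliseconds.
    # The millisecond offset is an assumed form of the common time reference.
    audio: Dict[int, bytes] = field(default_factory=dict)
    video: Dict[int, bytes] = field(default_factory=dict)

    def add_audio(self, start_ms: int, payload: bytes) -> None:
        """Store one audio segment under its start time."""
        self.audio[start_ms] = payload

    def add_video(self, start_ms: int, payload: bytes) -> None:
        """Store one video segment under its start time."""
        self.video[start_ms] = payload

    def synchronized_segments(self) -> List[Tuple[int, bytes, bytes]]:
        """Pair each audio segment with the video segment that shares its
        start time, returned in chronological order."""
        shared_times = sorted(self.audio.keys() & self.video.keys())
        return [(t, self.audio[t], self.video[t]) for t in shared_times]


# Usage: segments may arrive in any order; pairing relies only on the time key.
media = MediaFile(session_id="call-001")
media.add_audio(0, b"audio-0")
media.add_audio(1000, b"audio-1")
media.add_video(1000, b"video-1")
media.add_video(0, b"video-0")
pairs = media.synchronized_segments()
```

Keying both segment maps by the same time value keeps the association
implicit in the data structure, so a supervisor's playback tool could read
matching audio and video segments without a separate index.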