A method and system for processing video signals are disclosed. The system receives
a video signal, a first audio signal containing an annotation, and a second audio
signal containing environmental sounds corresponding to the video signal. In one
embodiment, the system uses the first audio signal to generate searchable annotations
corresponding to the video signal and the second audio signal.
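
As an illustration only, the association between the annotation channel and the recorded material could be sketched as follows. The sketch assumes the first audio signal has already been converted to timestamped text by some speech recognizer, which is not shown; the names `Annotation`, `AnnotatedRecording`, `add_annotation`, and `search`, as well as the file paths, are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Annotation:
    """Text derived from the first (annotation) audio signal, with its time span."""
    start_s: float
    end_s: float
    text: str


@dataclass
class AnnotatedRecording:
    """Hypothetical container tying annotations to the video and the second audio signal."""
    video_path: str
    ambient_audio_path: str
    annotations: List[Annotation] = field(default_factory=list)

    def add_annotation(self, start_s: float, end_s: float, text: str) -> None:
        # Assumption: `text` comes from a speech-recognition step that is out of scope here.
        self.annotations.append(Annotation(start_s, end_s, text))

    def search(self, keyword: str) -> List[Annotation]:
        # Case-insensitive substring match over the annotation text.
        needle = keyword.lower()
        return [a for a in self.annotations if needle in a.text.lower()]


recording = AnnotatedRecording("clip.mp4", "ambient.wav")
recording.add_annotation(0.0, 4.5, "Goal scored by number seven")
recording.add_annotation(10.0, 12.0, "Crowd cheering after penalty")
hits = recording.search("goal")
```

Each hit carries a start and end time, so a matching annotation can be used to locate the corresponding portion of the video signal and the second audio signal.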