A system and process are presented for highlighting the current speaker,
on an ongoing basis, in each frame of a low frame-rate video of an event
attended by multiple people. In general, this is accomplished by
periodically identifying the attendee who is currently speaking, at a
rate substantially faster than the video frame rate, and updating each
frame of the video to highlight the current speaker. More particularly, an A/V
source provides a client computing device with a video stream that includes
delta frames interspersed between the frames of the low frame-rate video.
The full video frames act as keyframes, and the delta frames provide the
changes needed to modify the last displayed version of the last keyframe
to highlight just the region associated with the location of a current
speaker. Because the highlighting is carried in the stream itself, the
client device can operate as a standard A/V rendering and display unit.
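
The keyframe and delta-frame arrangement described above can be illustrated
with a minimal sketch. It is not the claimed implementation: the names
KeyFrame, DeltaFrame, Patch, Client and make_delta are hypothetical, frames
are assumed to be small grayscale pixel grids, and highlighting is assumed
to be a simple brightness boost. The source builds each delta from the last
keyframe, and the client merely applies whatever frame arrives.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union

Frame = List[List[int]]             # grayscale pixels, row-major (assumed format)
Region = Tuple[int, int, int, int]  # (top, left, height, width), hypothetical


@dataclass
class KeyFrame:
    pixels: Frame  # a full frame of the low frame-rate video


@dataclass
class Patch:
    top: int
    left: int
    pixels: Frame  # replacement pixels for one rectangular region


@dataclass
class DeltaFrame:
    patches: List[Patch]  # changes relative to the last displayed frame


def make_delta(key: Frame, old: Optional[Region], new: Region,
               boost: int = 50) -> DeltaFrame:
    """Source side: restore the previous speaker's region, brighten the new one."""
    patches = []
    if old is not None:
        t, l, h, w = old
        # Un-highlight by sending the original keyframe pixels for that region.
        patches.append(Patch(t, l, [key[y][l:l + w] for y in range(t, t + h)]))
    t, l, h, w = new
    # Highlight (assumed here to mean a clamped brightness boost).
    bright = [[min(255, p + boost) for p in key[y][l:l + w]]
              for y in range(t, t + h)]
    patches.append(Patch(t, l, bright))
    return DeltaFrame(patches)


class Client:
    """Client side: a plain renderer with no knowledge of who is speaking."""

    def __init__(self) -> None:
        self.displayed: Optional[Frame] = None

    def render(self, frame: Union[KeyFrame, DeltaFrame]) -> Frame:
        if isinstance(frame, KeyFrame):
            # A keyframe replaces the whole display.
            self.displayed = [row[:] for row in frame.pixels]
        else:
            if self.displayed is None:
                raise ValueError("delta frame received before any keyframe")
            # A delta frame overwrites only the patched regions.
            for patch in frame.patches:
                for dy, row in enumerate(patch.pixels):
                    y = patch.top + dy
                    self.displayed[y][patch.left:patch.left + len(row)] = row
        return self.displayed
```

In this sketch, several delta frames can follow a single keyframe, so the
highlighted region can move between speakers without the source sending a
new full frame.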