A method for non-causal speaker selection is provided. In accordance with a particular
embodiment of the present invention, the method includes receiving a plurality of
video streams at a multipoint control unit, each of the plurality of video streams
being associated with a respective endpoint of a multipoint conference. A plurality
of audio streams may also be received at the multipoint control unit, and each
audio stream may be associated with a respective one of the video streams. The
audio streams are buffered in respective audio buffers, and the video streams are
buffered in respective video buffers. First video data is copied from the video
buffers to obtain a low latency video stream for distribution to active conference
participants. In a particular embodiment, second video data may be copied from
the video buffers to obtain a high latency video stream for distribution to passive
conference participants, the high latency video stream being delayed in time with
respect to the low latency video stream.
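
By way of illustration only, the buffering and dual-latency copying described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names Frame, StreamBuffer and copy_at, the millisecond timestamps, the buffer capacity and the 200 ms delay are all hypothetical and do not come from the disclosure. Only one video buffer is shown; the audio buffers and the per-endpoint association are omitted.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Frame:
    """One buffered unit of video data (hypothetical representation)."""
    timestamp_ms: int
    payload: bytes


@dataclass
class StreamBuffer:
    """Bounded buffer holding the most recent frames of one stream."""
    capacity: int = 64  # assumed size, not taken from the disclosure
    frames: deque = field(default_factory=deque)

    def push(self, frame: Frame) -> None:
        # Append the newest frame and discard the oldest ones beyond capacity.
        self.frames.append(frame)
        while len(self.frames) > self.capacity:
            self.frames.popleft()

    def copy_at(self, now_ms: int, delay_ms: int):
        """Return the newest frame that is at least delay_ms old, or None.

        Frames are immutable, so handing out the same object to several
        readers is equivalent to copying the data out of the buffer.
        """
        target = now_ms - delay_ms
        for frame in reversed(self.frames):
            if frame.timestamp_ms <= target:
                return frame
        return None


# Fill one video buffer with frames stamped 0, 100, 200, 300 and 400 ms.
video = StreamBuffer()
for t in range(0, 500, 100):
    video.push(Frame(t, f"frame@{t}".encode()))

now = 400
low = video.copy_at(now, delay_ms=0)     # first video data: low latency path
high = video.copy_at(now, delay_ms=200)  # second video data: high latency path
```

In this sketch both reads draw on the same buffer and differ only in the read offset, so the high latency stream trails the low latency stream by the assumed delay.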