In a multimedia communications system (100) that supports conference calls
that include an audio portion and a video portion, a primary video image is selected
from a plurality of video images based on an amount of audio data generated. The
amount of audio data is determined by counting the number of audio packets or by
counting the number of audio samples in the audio packets (204). A dominant audio
participant is selected if the difference in the amount of audio data between
participants exceeds a predetermined threshold (206). If the difference does not
exceed the predetermined threshold (206), the dominant audio participant may be
determined by comparing the loudness or volume of each audio participant (207,
212). The primary video image is selected to correspond to the dominant audio
participant (208, 214). The primary video image is then held for a predetermined
period of time before it is allowed to change (210, 216).
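
The selection logic described above can be illustrated with a minimal sketch. It is not the claimed implementation. The names AudioStats, dominant_participant and PrimaryVideoSelector are hypothetical, and the sketch assumes that the compared difference is the one between the two participants with the most audio data, that loudness is an average over the same measurement window, and that the hold period restarts each time a selection is made.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AudioStats:
    amount: int      # audio packets (or samples) counted in the window (204)
    loudness: float  # assumed: average loudness over the same window


def dominant_participant(stats: Dict[str, AudioStats],
                         threshold: int) -> Optional[str]:
    """Pick the dominant audio participant, or None if nobody is present."""
    if not stats:
        return None
    ranked = sorted(stats, key=lambda p: stats[p].amount, reverse=True)
    if len(ranked) == 1:
        return ranked[0]
    top, runner_up = ranked[0], ranked[1]
    # Step 206: does the leader exceed the runner-up by more than the threshold?
    if stats[top].amount - stats[runner_up].amount > threshold:
        return top
    # Steps 207, 212: otherwise fall back to comparing loudness.
    return max(stats, key=lambda p: stats[p].loudness)


class PrimaryVideoSelector:
    """Holds the primary video image for a fixed period after each selection."""

    def __init__(self, threshold: int, hold_seconds: float) -> None:
        self.threshold = threshold
        self.hold_seconds = hold_seconds
        self.primary: Optional[str] = None
        self._locked_until = 0.0

    def update(self, stats: Dict[str, AudioStats], now: float) -> Optional[str]:
        # Steps 210, 216: no change is possible while the hold period runs.
        if self.primary is not None and now < self._locked_until:
            return self.primary
        candidate = dominant_participant(stats, self.threshold)
        if candidate is not None:
            # Steps 208, 214: the primary image follows the dominant participant.
            self.primary = candidate
            self._locked_until = now + self.hold_seconds
        return self.primary
```

In use, a caller would invoke update once per measurement window with the latest counts and a monotonic timestamp. The returned identifier names the participant whose video should be shown as the primary image.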