A system and method that facilitate object tracking are provided. The system includes
an audio model that receives at least two audio input signals and a video model
that receives a video input. The audio model and the video model each employ a
probabilistic generative model, and the two models are combined to facilitate object
tracking. An expectation-maximization (EM) algorithm can be employed to adapt the
trainable parameters of the audio model and the video model.
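
The combination of the two generative models and the EM update can be illustrated with a minimal sketch. The abstract does not specify the form of either model, so everything below is an assumption made for illustration: the object has a hidden horizontal position x on a discrete grid, the audio observation is a time delay tau between the two audio input signals modelled as Gaussian around alpha * x + beta, and the video observation is a noisy position v modelled as Gaussian around x. The names `simulate`, `posterior`, `em_step`, `fit`, and `track`, and the parameters alpha, beta, var_a, and var_v, are hypothetical and are not taken from the described system.

```python
import math
import random

def log_gauss(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def posterior(tau, v, params, grid):
    """E-step for one frame: p(x | tau, v) over the grid, plus log p(tau, v).

    Assumed model: uniform prior on x, audio delay tau ~ N(alpha*x + beta, var_a),
    video position v ~ N(x, var_v).
    """
    alpha, beta, var_a, var_v = params
    log_prior = -math.log(len(grid))
    lw = [log_prior + log_gauss(tau, alpha * x + beta, var_a) + log_gauss(v, x, var_v)
          for x in grid]
    m = max(lw)  # subtract the maximum before exponentiating to avoid underflow
    w = [math.exp(l - m) for l in lw]
    total = sum(w)
    return [wi / total for wi in w], m + math.log(total)

def log_likelihood(frames, params, grid):
    """Total log-likelihood of all (tau, v) frames under the assumed model."""
    return sum(posterior(tau, v, params, grid)[1] for tau, v in frames)

def em_step(frames, params, grid):
    """One EM iteration: posteriors over x, then closed-form parameter updates."""
    posts = [posterior(tau, v, params, grid)[0] for tau, v in frames]
    n = float(len(frames))
    sx = sxx = st = sxt = 0.0
    for (tau, _), q in zip(frames, posts):
        st += tau
        for qi, x in zip(q, grid):
            sx += qi * x
            sxx += qi * x * x
            sxt += qi * x * tau
    # M-step for the audio model: posterior-weighted least squares of tau on x.
    alpha = (n * sxt - sx * st) / (n * sxx - sx * sx)
    beta = (st - alpha * sx) / n
    # M-step for both noise variances: posterior-weighted mean squared residuals.
    ra = rv = 0.0
    for (tau, v), q in zip(frames, posts):
        for qi, x in zip(q, grid):
            ra += qi * (tau - alpha * x - beta) ** 2
            rv += qi * (v - x) ** 2
    return (alpha, beta, max(ra / n, 1e-6), max(rv / n, 1e-6))

def fit(frames, grid, params=(0.3, 0.0, 1.0, 4.0), n_iter=40):
    """Run a fixed number of EM iterations from an arbitrary initial guess."""
    for _ in range(n_iter):
        params = em_step(frames, params, grid)
    return params

def track(frames, params, grid):
    """Most probable grid position per frame under the combined model."""
    out = []
    for tau, v in frames:
        q, _ = posterior(tau, v, params, grid)
        out.append(grid[q.index(max(q))])
    return out

def simulate(n_frames, alpha, beta, seed=0):
    """Synthetic (tau, v) frames from the assumed model, for illustration only."""
    rng = random.Random(seed)
    frames = []
    for _ in range(n_frames):
        x = rng.uniform(0.0, 20.0)
        frames.append((alpha * x + beta + rng.gauss(0.0, 0.1), x + rng.gauss(0.0, 0.5)))
    return frames
```

In this sketch, calling `fit(simulate(300, 0.5, -1.0), [float(i) for i in range(21)])` returns adapted values for alpha, beta, var_a, and var_v, and `track` then reports a position estimate per frame. The posterior is the normalized product of the audio and video likelihoods, which is one simple way the two models can be combined; an actual embodiment may use richer models of the raw signals.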