A video is acquired of a scene. Each pixel in each frame of the video is
represented by multiple of layers. Each layer includes multiple Gaussian
distributions. Each Gaussian distribution includes a mean and a
covariance. The covariance is an inverse Wishart distribution. Then, the
layers are updated for each frame with a recursive Bayesian estimation
process to construct a model of the scene. The model can be used to
detect foreground and background pixels according to confidence scores of
the layers.