Human gestures are detected and/or tracked from a pair of digital video
images. The pair of images may be used to produce a set of observation
vectors that give the three-dimensional position of a subject's upper
body. The likelihood of each observation vector representing an upper
body component may be determined. Initialization of a model for
detecting and tracking gestures may include a set of assumptions
regarding the initial position of the subject in a set of foreground
observation vectors.
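
The two central steps above, deriving three-dimensional observation vectors from an image pair and scoring each vector against upper body components, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the source: the pinhole stereo back-projection, the isotropic Gaussian likelihood, the function names, the camera parameters, and the component positions and spreads are all hypothetical.

```python
import math


def disparity_to_point(u, v, disparity, focal_px, baseline_m, cx, cy):
    """Back-project a pixel with known stereo disparity to a 3D point.

    Assumes a rectified pinhole stereo pair (an illustrative assumption).
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    z = focal_px * baseline_m / disparity   # depth from disparity
    x = (u - cx) * z / focal_px             # horizontal offset from the optical axis
    y = (v - cy) * z / focal_px             # vertical offset from the optical axis
    return (x, y, z)


def gaussian_likelihood(point, mean, sigma):
    """Isotropic 3D Gaussian density of `point` for a component centred at `mean`."""
    sq_dist = sum((p - m) ** 2 for p, m in zip(point, mean))
    norm = (2.0 * math.pi * sigma ** 2) ** 1.5
    return math.exp(-sq_dist / (2.0 * sigma ** 2)) / norm


def component_likelihoods(point, components):
    """Return the likelihood of one observation vector under each component."""
    return {
        name: gaussian_likelihood(point, mean, sigma)
        for name, (mean, sigma) in components.items()
    }


# Hypothetical component models: (mean position in metres, spread in metres).
COMPONENTS = {
    "head": ((0.0, -0.3, 2.0), 0.10),
    "torso": ((0.0, 0.1, 2.0), 0.20),
    "hand": ((0.4, 0.0, 1.7), 0.08),
}

# One observation vector from a hypothetical foreground pixel.
point = disparity_to_point(
    u=320, v=200, disparity=30.0,
    focal_px=600.0, baseline_m=0.1, cx=320.0, cy=240.0,
)
likelihoods = component_likelihoods(point, COMPONENTS)
best = max(likelihoods, key=likelihoods.get)
print(point, best)
```

In this sketch the component means play the role of the initialization assumptions about where the subject initially stands; a tracker would then update them from frame to frame rather than keep them fixed.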