A method detects objects in a scene over time. Sets of time-aligned
features are extracted from multiple signals representing a scene over
time; each signal is acquired using a different modality. Each set of
time-aligned features is arranged as a vector in a matrix, and a first
transform is applied to that matrix to produce a compressed matrix. A second
transform is applied to the compressed matrix to extract spatio-temporal
profiles of objects occurring in the scene.
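
The pipeline described above can be illustrated with a minimal sketch. The abstract does not name either transform, so the choices here are assumptions made only for illustration: the first transform is taken to be a whitening PCA computed by SVD, and the second is taken to be a small FastICA-style rotation. The function names (stack_features, compress, fast_ica), the one-column-per-frame layout, and the synthetic two-modality data are likewise hypothetical.

```python
import numpy as np

def stack_features(feature_sets):
    """Stack per-modality features so each column is one time-aligned vector.
    Each entry has shape (n_features_m, n_frames)."""
    if len({f.shape[1] for f in feature_sets}) != 1:
        raise ValueError("feature sets must share the same number of frames")
    return np.vstack(feature_sets)

def compress(X, k):
    """First transform (assumed): whitening PCA via SVD, keeping k components."""
    mean = X.mean(axis=1, keepdims=True)
    U, s, _ = np.linalg.svd(X - mean, full_matrices=False)
    W = (U[:, :k] / s[:k]).T * np.sqrt(X.shape[1])  # shape (k, n_features)
    return W @ (X - mean), W

def sym_decorrelate(W):
    """Return (W W^T)^(-1/2) W, which has orthonormal rows."""
    vals, vecs = np.linalg.eigh(W @ W.T)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T @ W

def fast_ica(Z, n_iter=200, tol=1e-6, seed=0):
    """Second transform (assumed): symmetric FastICA with a tanh nonlinearity.
    Z must already be whitened; returns an orthogonal unmixing matrix."""
    k, n = Z.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((k, k)))
    for _ in range(n_iter):
        g = np.tanh(W @ Z)
        g_prime_mean = (1.0 - g ** 2).mean(axis=1, keepdims=True)
        W_new = sym_decorrelate(g @ Z.T / n - g_prime_mean * W)
        done = np.max(np.abs(np.abs(np.diag(W_new @ W.T)) - 1.0)) < tol
        W = W_new
        if done:
            break
    return W

# Synthetic scene (hypothetical data): two objects that switch on and off,
# observed through a 4-feature "audio" signal and a 6-feature "video" signal.
rng = np.random.default_rng(1)
n_frames = 500
t = np.arange(n_frames)
sources = np.vstack([
    (np.sin(2 * np.pi * t / 50) > 0).astype(float),
    ((t % 70) < 20).astype(float),
])
audio = rng.standard_normal((4, 2)) @ sources + 0.01 * rng.standard_normal((4, n_frames))
video = rng.standard_normal((6, 2)) @ sources + 0.01 * rng.standard_normal((6, n_frames))

X = stack_features([audio, video])      # (10, n_frames) feature matrix
Z, W_pca = compress(X, k=2)             # compressed matrix, (2, n_frames)
R = fast_ica(Z)                         # rotation applied to the compressed matrix
temporal = R @ Z                        # per-object activity over time
spatial = np.linalg.pinv(R @ W_pca)     # per-object weights over the stacked features
```

In this sketch, each row of temporal is a candidate temporal profile and each column of spatial shows how strongly that candidate loads on the audio and video features. Components recovered this way come back in arbitrary order and sign, so they would need to be matched to objects in a separate step.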