A system and method for automatic scale selection in real-time image and video
processing and computer vision applications. In one aspect, a non-parametric variable
bandwidth mean shift technique, which is based on adaptive estimation of a normalized
density gradient, is used for detecting one or more modes in the underlying data
and clustering the underlying data. In another aspect, a data-driven bandwidth
(or scale) selection technique is provided for the variable bandwidth mean shift
method, which estimates for each data point the covariance matrix that is the most
stable across a plurality of scales. The methods can be used for detecting modes
and clustering data for various types of data such as image data, video data speech
data, handwriting data, etc.