Data compression techniques are provided that are particularly
applicable to high-dimensional data. The invention uses a hierarchical
partitioning approach in
conjunction with a subspace sampling methodology that is sensitive to
the subject data set. The combination of hierarchical partitioning and
subspace sampling makes the overall data compression process
very effective. The data compression process provides a much more
compact representation than traditional dimensionality reduction
techniques while also providing hard bounds on the approximation
error. In addition, the data compression process of the invention
realizes a compression factor that improves with increasing database
size.
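
The general idea can be illustrated with a minimal sketch. It is not the claimed method; it is a simplified illustration under stated assumptions: each subspace is one-dimensional (a line through two points sampled from the data), partitioning is a median split along the widest coordinate, and a partition is replaced by its projections only when every point lies within a caller-chosen error bound eps. All names (project, sample_line, compress, reconstruct, eps, trials, min_size) are hypothetical.

```python
import random


def project(point, anchor, direction):
    """Return (t, err): coordinate of point along the line and its residual distance."""
    diff = [p - a for p, a in zip(point, anchor)]
    t = sum(x * d for x, d in zip(diff, direction))
    err = sum((x - t * d) ** 2 for x, d in zip(diff, direction)) ** 0.5
    return t, err


def sample_line(items, rng):
    """Sample two data points; return the line through them, or None if they coincide."""
    (_, a), (_, b) = rng.sample(items, 2)
    d = [y - x for x, y in zip(a, b)]
    norm = sum(x * x for x in d) ** 0.5
    if norm == 0.0:
        return None
    return a, [x / norm for x in d]


def compress(items, eps, rng, trials=8, min_size=4):
    """Compress (id, point) pairs into leaf nodes with per-point error <= eps."""
    if len(items) >= 2:
        best = None
        # Subspace sampling (assumed form): try a few data-derived lines,
        # keep the one with the smallest worst-case residual.
        for _ in range(trials):
            line = sample_line(items, rng)
            if line is None:
                continue
            anchor, direction = line
            proj = [project(p, anchor, direction) for _, p in items]
            worst = max(err for _, err in proj)
            if best is None or worst < best[0]:
                best = (worst, anchor, direction, [t for t, _ in proj])
        # Accept the line only if the hard error bound holds for every point.
        if best is not None and best[0] <= eps:
            _, anchor, direction, coords = best
            return [{"ids": [i for i, _ in items], "anchor": anchor,
                     "dir": direction, "coords": coords}]
    # Small partitions that fit no sampled line are stored exactly.
    if len(items) <= max(min_size, 1):
        return [{"ids": [i for i, _ in items], "raw": [p for _, p in items]}]
    # Hierarchical partitioning (assumed form): median split on the widest coordinate.
    dims = range(len(items[0][1]))
    widest = max(dims, key=lambda j: max(p[j] for _, p in items)
                 - min(p[j] for _, p in items))
    ordered = sorted(items, key=lambda item: item[1][widest])
    mid = len(ordered) // 2
    return (compress(ordered[:mid], eps, rng, trials, min_size)
            + compress(ordered[mid:], eps, rng, trials, min_size))


def reconstruct(nodes):
    """Rebuild an approximate point for every id."""
    out = {}
    for node in nodes:
        if "raw" in node:
            out.update(zip(node["ids"], node["raw"]))
        else:
            for i, t in zip(node["ids"], node["coords"]):
                out[i] = [a + t * d for a, d in zip(node["anchor"], node["dir"])]
    return out
```

In this sketch a partition that fits a sampled line stores one anchor, one direction, and a single scalar per point instead of a full coordinate vector per point, and every reconstructed point is within eps of its original by construction. A call such as compress(list(enumerate(points)), 0.1, random.Random(0)) followed by reconstruct on the result exercises the full round trip.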