A method, system and computer program product are provided for scaling, or
dimensionally reducing, multi-dimensional data sets in a manner that scales
well to large data sets. The invention scales multi-dimensional data sets by
determining one or more non-linear functions between a sample of points
from the multi-dimensional data set and a corresponding set of
dimensionally reduced points. Thereafter, these one or more non-linear
functions are used to non-linearly map additional points. The additional
points may be members of the original multi-dimensional data set or may
be new, previously unseen points. In an embodiment, the one or more
non-linear functions relating the sample of points from the
multi-dimensional data set to the corresponding set of dimensionally
reduced points are determined by a self-learning system such as a neural
network. The additional points are mapped using the self-learning system
in a feed-forward/predictive manner.
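
The workflow summarized above can be illustrated with a minimal sketch in Python. It is not the claimed implementation. It rests on several assumptions that do not come from the text: classical multidimensional scaling stands in for whatever technique produces the dimensionally reduced sample, a small one-hidden-layer network trained by plain gradient descent stands in for the self-learning system, and the data are synthetic. The names classical_mds and train_mlp, and all sizes and hyperparameters, are hypothetical.

```python
import numpy as np

def classical_mds(X, out_dim=2):
    """Reduce the rows of X to out_dim dimensions (stand-in reduction step)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)   # squared distances
    n = X.shape[0]
    J = np.eye(n) - np.full((n, n), 1.0 / n)           # centering matrix
    B = -0.5 * J @ d2 @ J                              # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:out_dim]             # largest eigenvalues first
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))

def train_mlp(X, Y, hidden=16, epochs=3000, lr=0.02, seed=0):
    """Fit a one-hidden-layer tanh network mapping X to Y; return the mapping."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, Y.shape[1]))
    b2 = np.zeros(Y.shape[1])
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                       # hidden activations
        G = 2.0 * (H @ W2 + b2 - Y) / len(X)           # gradient of the squared error
        GH = (G @ W2.T) * (1.0 - H ** 2)               # backpropagate through tanh
        W2 -= lr * (H.T @ G)
        b2 -= lr * G.sum(axis=0)
        W1 -= lr * (X.T @ GH)
        b1 -= lr * GH.sum(axis=0)
    # Feed-forward pass only: no further learning takes place here.
    return lambda Z: np.tanh(Z @ W1 + b1) @ W2 + b2

# Synthetic multi-dimensional data set (assumption: 1000 points, 5 dimensions).
rng = np.random.default_rng(42)
data = rng.normal(size=(1000, 5))

# 1. Take a sample of points and dimensionally reduce only that sample.
sample_idx = rng.choice(len(data), size=200, replace=False)
sample = data[sample_idx]
reduced_sample = classical_mds(sample, out_dim=2)

# 2. Learn a non-linear function from the sample to its reduced coordinates.
mapper = train_mlp(sample, reduced_sample)

# 3. Map additional points, including the rest of the data set, predictively.
reduced_all = mapper(data)
fit_error = float(np.mean((mapper(sample) - reduced_sample) ** 2))
```

In this sketch the costly pairwise-distance computation is carried out only on the 200 sampled points, while each remaining point costs a single forward pass through the network. That is the sense in which an approach of this kind can scale to large data sets.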