A novel method is employed for collecting optimizer statistics for
optimizing database queries by gathering feedback from the query
execution engine about the observed cardinality of predicates and
constructing and maintaining multidimensional histograms. This makes use
of the correlation between data columns without employing an inefficient
data scan. The maximum entropy principle is used to approximate the true
data distribution by a histogram distribution that is as "simple" as
possible while being consistent with the observed predicate
cardinalities. Changes in the underlying data are readily adapted to,
automatically detecting and eliminating inconsistent feedback information
in an efficient manner. The size of the histogram is controlled by
retaining only the most "important" feedback.