A sampling infrastructure/scheme that supports flexible, efficient,
scalable and uniform sampling is disclosed. A sample is maintained in a
compact histogram form while the sample footprint stays below a specified
upper bound. If, at any point, the sample footprint exceeds the upper
bound, then the compact representation is abandoned and the sample is
purged to obtain a subsample. The histogram of the purged subsample is expanded to
a bag of values while sampling remaining data values of the partitioned
subset. The expanded purged subsample is converted to a histogram and
uniform random samples are yielded. The sampling scheme retains the
bounded footprint property and, to a partial degree, the compact
representation of the Concise Sampling scheme, while ensuring statistical
uniformity. Samples from at least two partitioned subsets are merged on
demand to yield uniform merged samples of the combined partitions, wherein the
merged samples also maintain the histogram representation and bounded
footprint property.
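
The scheme summarized above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the names hybrid_sample, merge_samples, footprint and max_slots are hypothetical, the footprint cost model (one slot for a singleton value, two slots for a (value, count) pair) is an assumption, reservoir sampling is assumed for the purge and for the values that follow it, and the merge step uses a hypergeometric split, which is one standard way to combine two uniform samples and is not stated in the text above.

```python
import random
from collections import Counter


def footprint(hist):
    """Assumed cost model: a singleton uses 1 slot, a (value, count) pair uses 2."""
    return sum(1 if count == 1 else 2 for count in hist.values())


def _reservoir_step(bag, item, position, capacity, rng):
    """One step of reservoir sampling; position is the 1-based index of item."""
    if len(bag) < capacity:
        bag.append(item)
    else:
        j = rng.randrange(position)  # uniform integer in [0, position)
        if j < capacity:
            bag[j] = item


def hybrid_sample(values, max_slots, rng=None):
    """Return (histogram, number of values seen) for one partition."""
    rng = random.Random() if rng is None else rng
    hist, bag = Counter(), None
    slots = seen = 0
    for v in values:
        seen += 1
        if bag is not None:
            # Bag phase: keep a uniform sample of exactly max_slots values.
            _reservoir_step(bag, v, seen, max_slots, rng)
            continue
        # Histogram phase: every value seen so far is retained exactly.
        hist[v] += 1
        if hist[v] <= 2:  # new singleton, or singleton turning into a pair
            slots += 1
        if slots > max_slots:
            # Purge: subsample the histogram's values into a bounded bag.
            bag = []
            for pos, x in enumerate(hist.elements(), start=1):
                _reservoir_step(bag, x, pos, max_slots, rng)
    # Convert the bag back to histogram form before yielding the sample.
    return (hist if bag is None else Counter(bag)), seen


def merge_samples(hist1, size1, hist2, size2, k, rng=None):
    """Merge two uniform samples into a uniform sample of k values.

    size1 and size2 are the numbers of values in the source partitions.
    """
    rng = random.Random() if rng is None else rng
    bag1, bag2 = list(hist1.elements()), list(hist2.elements())
    if k > min(len(bag1), len(bag2)):
        raise ValueError("k must not exceed either sample size")
    # Decide how many of the k values come from partition 1 by simulating
    # k draws without replacement from the two partitions combined.
    take1, rem1, rem2 = 0, size1, size2
    for _ in range(k):
        if rng.randrange(rem1 + rem2) < rem1:
            take1 += 1
            rem1 -= 1
        else:
            rem2 -= 1
    return Counter(rng.sample(bag1, take1)) + Counter(rng.sample(bag2, k - take1))
```

For example, hybrid_sample(range(1000), max_slots=20) returns a histogram holding 20 values together with the count 1000, whereas a short input with few distinct values, such as [1, 1, 2, 2, 2], is returned as an exact histogram because its footprint never exceeds the bound. Under the assumed cost model a bag of n values never needs more than n slots once it is converted back to a histogram, which is why the purged sample size is set equal to max_slots in this sketch.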