A system and method utilizing random sampling for partition analysis on very large
databases. The method utilizes a random sampling algorithm that provides results
accurate to within a few percentage points for large homogeneous databases. The
accuracy is not affected by the size of the database and is determined primarily
by the size of the sample. The system and method for approximate partition analysis
reduces the time required for an analysis to a fraction of the time required for
an exact analysis, which in turn permits more frequent and timely analyses of
database partition sizes.
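
The claim that accuracy depends on the sample size rather than the database size can be illustrated with a minimal sketch. The abstract does not specify the sampling algorithm, so everything here is an assumption made for illustration: range partitioning on a sortable key, uniform sampling without replacement from keys held in memory, and a normal-approximation error bound. The function and parameter names (estimate_partition_fractions, margin_of_error, boundaries, sample_size) are hypothetical.

```python
import bisect
import math
import random


def estimate_partition_fractions(keys, boundaries, sample_size, seed=0):
    """Estimate the fraction of rows falling in each range partition.

    Assumption: `boundaries` is a sorted list of split points, and
    partition i holds keys with boundaries[i-1] <= key < boundaries[i].
    """
    rng = random.Random(seed)
    # Uniform random sample without replacement (assumed sampling scheme).
    sample = rng.sample(keys, sample_size)
    counts = [0] * (len(boundaries) + 1)
    for key in sample:
        # bisect_right maps a key to the index of its partition.
        counts[bisect.bisect_right(boundaries, key)] += 1
    return [count / sample_size for count in counts]


def margin_of_error(fraction, sample_size, z=1.96):
    """Approximate 95% margin of error for an estimated fraction.

    Normal approximation to the binomial; the database size does not
    appear in the formula, only the sample size.
    """
    return z * math.sqrt(fraction * (1.0 - fraction) / sample_size)


if __name__ == "__main__":
    keys = list(range(1_000_000))        # stand-in for a large table's keys
    boundaries = [250_000, 600_000]      # exact fractions: 0.25, 0.35, 0.40
    estimates = estimate_partition_fractions(keys, boundaries, 10_000)
    for index, fraction in enumerate(estimates):
        bound = margin_of_error(fraction, 10_000)
        print(f"partition {index}: {fraction:.3f} +/- {bound:.3f}")
```

Under these assumptions, a sample of 10,000 rows bounds the error at roughly one percentage point per partition whether the table holds a million rows or a billion, which is consistent with the accuracy described above. The bound relies on the sample being representative, which is why the homogeneity of the database matters.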