Relational database applications such as index selection, histogram tuning,
approximate query processing, and statistics selection are known to benefit from
leveraging workloads. Often these applications are presented with a large workload,
i.e., a large set of SQL DML statements, as input. A key factor affecting the scalability
of such applications is the size of the workload. The invention concerns workload
compression, which helps improve the scalability of such applications. The exemplary
embodiment is broadly applicable to a variety of workload-driven applications,
while allowing for incorporation of application-specific knowledge. The process
is described in detail in the context of two workload-driven applications: index
selection and approximate query processing.