One embodiment of the present invention provides a system that computes a
distance metric between computer system workloads. During operation, the
system receives a dataset containing metrics that have been collected for
a number of workloads of interest. Next, the system uses splines to
define bases for a regression model which uses a performance indicator y
as a response and uses the metrics (represented by a vector x) as
predictors. The system then fits the regression model to the dataset
using a penalized least squares (PLS) criterion to obtain functions
$f_1, \ldots, f_P$, which are smooth univariate functions of
individual metrics that add up to the regression function $f$, such that
$y = f(x) + \epsilon = \sum_{p=1}^{P} f_p(x_p) + \epsilon$, wherein $\epsilon$
represents noise. Finally, the system uses the fitted regression function
to define the distance metric.
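
The following is a minimal sketch of the steps above, not the claimed implementation. It assumes a truncated linear spline basis with knots at quantiles, a ridge-type penalty on the spline coefficients as the PLS criterion, and a distance equal to the sum of absolute differences between the fitted component functions of two workloads. The basis, the penalty, the distance formula, and all names (spline_basis, fit_additive_pls, components, workload_distance, n_knots, lam) are illustrative assumptions that do not come from the text.

```python
import numpy as np


def spline_basis(x, knots):
    """Truncated linear spline basis for one metric: [x, (x-k1)+, ..., (x-kK)+]."""
    return np.column_stack([x] + [np.maximum(x - k, 0.0) for k in knots])


def fit_additive_pls(X, y, n_knots=5, lam=1.0):
    """Fit y ~ intercept + sum_p f_p(x_p) by penalized least squares.

    X has one row per observation and one column per metric.
    """
    n, P = X.shape
    # Assumption: interior knots placed at evenly spaced quantiles of each metric.
    probs = np.linspace(0.0, 1.0, n_knots + 2)[1:-1]
    knots = [np.quantile(X[:, p], probs) for p in range(P)]
    blocks = [spline_basis(X[:, p], knots[p]) for p in range(P)]
    B = np.column_stack([np.ones(n)] + blocks)

    # Assumption: penalize only the truncated-term coefficients, leaving the
    # intercept and the linear terms unpenalized.
    pen = [0.0]
    for _ in range(P):
        pen += [0.0] + [1.0] * n_knots
    D = np.diag(pen)

    # Minimize ||y - B beta||^2 + lam * beta' D beta via the normal equations.
    beta = np.linalg.solve(B.T @ B + lam * D, B.T @ y)
    return {"beta": beta, "knots": knots, "n_knots": n_knots}


def components(model, x):
    """Evaluate the fitted univariate functions f_1(x_1), ..., f_P(x_P)."""
    x = np.asarray(x, dtype=float)
    width = model["n_knots"] + 1
    out, start = [], 1  # index 0 of beta is the intercept
    for p, k in enumerate(model["knots"]):
        b = spline_basis(x[p:p + 1], k)  # shape (1, width)
        out.append((b @ model["beta"][start:start + width])[0])
        start += width
    return np.array(out)


def workload_distance(model, u, v):
    """Assumed distance: sum over metrics of |f_p(u_p) - f_p(v_p)|."""
    return float(np.sum(np.abs(components(model, u) - components(model, v))))
```

Under this assumed definition, two workloads that differ only in a metric with little influence on the performance indicator end up close together, while a difference in an influential metric moves them apart.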