One embodiment of the present invention provides a system that performs
thread migration within an array of computing nodes, wherein computing
nodes in the array contain central processing units (CPUs) and/or
memories. During operation, the system identifies CPUs within the array
of computing nodes that are available to accept a given thread. For each
available CPU, the system computes an average communication distance
between the CPU and the memories that are accessed by the given thread.
Next, the system determines whether to move the given thread to an
available CPU based on the average communication distance for the
available CPU.
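
The selection step described above can be illustrated with a minimal sketch. The abstract does not fix the array topology, the distance metric, or the decision rule, so the following are assumptions made for illustration only: nodes sit on a two-dimensional grid, communication distance is the Manhattan hop count between node coordinates, the average is an unweighted mean over the memories the thread accesses, and the thread is moved only when an available CPU has a strictly smaller average distance than the current one. The names hop_distance, average_distance, and choose_cpu are hypothetical.

```python
from statistics import mean
from typing import List, Tuple

# Assumption: each node is identified by its (row, col) position in a 2D grid.
Coord = Tuple[int, int]


def hop_distance(a: Coord, b: Coord) -> int:
    """Manhattan hop count between two nodes (assumed distance metric)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def average_distance(cpu: Coord, memories: List[Coord]) -> float:
    """Unweighted mean distance from a CPU to the memories a thread accesses."""
    return mean(hop_distance(cpu, m) for m in memories)


def choose_cpu(current_cpu: Coord,
               available_cpus: List[Coord],
               memories: List[Coord]) -> Coord:
    """Return the CPU the thread should run on.

    Assumed rule: migrate to the available CPU with the smallest average
    distance, but only if it is strictly closer than the current CPU.
    """
    if not available_cpus or not memories:
        return current_cpu  # nothing to compare against, so stay put
    best = min(available_cpus, key=lambda c: average_distance(c, memories))
    if average_distance(best, memories) < average_distance(current_cpu, memories):
        return best
    return current_cpu


if __name__ == "__main__":
    memories = [(0, 0), (0, 1)]          # memories accessed by the thread
    available = [(0, 2), (2, 2)]         # CPUs able to accept the thread
    print(choose_cpu((3, 3), available, memories))
```

In this sketch a thread running at (3, 3) is moved to the available CPU at (0, 2), whereas a thread already running at (0, 0) stays in place because no available CPU is closer on average. An actual embodiment might instead weight each memory by access frequency or apply a migration-cost threshold; the abstract leaves those choices open.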