A method for balancing the load of a parallel processing system having
parallel processing elements (PEs) linked serially in a line with first
and second ends, wherein each of the PEs has a local number of tasks
associated therewith, the method comprising determining a total number of
tasks present on the line; notifying each of the PEs of the total number
of tasks, calculating a local mean number of tasks for each of the PEs,
and calculating a local deviation for each of the PEs. The method also
comprises determining a first local cumulative deviation for each of the
PEs, determining a second local cumulative deviation for each of the PEs,
and redistributing tasks among the PEs in response to the first local
cumulative deviation and the second local cumulative deviation.