An array processor includes processing elements (00, 01, 02, 03, 10, 11,
12, 13, 20, 21, 22, 23, 30, 31, 32, 33) arranged in clusters (e.g., 44,
46, 48, 50) to form a rectangular array (40). Inter-cluster communication
paths (88) are mutually exclusive. Due to the mutual exclusivity of the
data paths, communications between the processing elements of each
cluster may be combined in a single inter-cluster path, thus eliminating
half the wiring required for the path. Unlike in conventional torus
arrays, the length of the longest communication path is not directly
determined by the overall dimension of the array. Rather, the longest
communication path is limited by the inter-cluster spacing. Transpose
elements of an N×N torus may be combined in clusters and
communicate with one another through intra-cluster communications paths.
Transpose operation latency is eliminated in this approach. Each PE may
have a single transmit port (35) and a single receive port (37). Thus,
the individual PEs are decoupled from the array topology.
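
As an illustration of how transpose elements can share a cluster, the following minimal sketch groups the PEs of an N×N array so that PE (i, j) and its transpose PE (j, i) always land in the same cluster. The grouping rule used here, (i + j) mod N, is an assumption chosen for illustration; the abstract does not state the rule, and the names `cluster_of` and `build_clusters` are hypothetical.

```python
def cluster_of(i, j, n):
    """Hypothetical cluster index for PE (i, j) in an n x n array.

    Assumed rule: (i + j) mod n. Because i + j == j + i, a PE and its
    transpose partner always receive the same cluster index.
    """
    return (i + j) % n


def build_clusters(n):
    """Group all PEs of an n x n array by the assumed cluster rule."""
    clusters = {c: [] for c in range(n)}
    for i in range(n):
        for j in range(n):
            clusters[cluster_of(i, j, n)].append((i, j))
    return clusters


# Example: the 4x4 array from the abstract (PEs 00 through 33).
clusters = build_clusters(4)
for index, members in clusters.items():
    labels = ["%d%d" % (i, j) for i, j in members]
    print("cluster", index, labels)
```

Under this assumed rule a 4×4 array yields four clusters of four PEs each, and a transpose exchange between PE (i, j) and PE (j, i) never has to leave its cluster.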