An N-port memory architecture is disclosed that stores multi-dimensional
arrays so that: (1) N contiguous elements in a row can be accessed
without blocking, (2) N contiguous elements in a column can be accessed
without blocking, (3) some N-element two-dimensional sub-arrays can be
accessed without blocking, and (4) all N/2-element two-dimensional
sub-arrays can be accessed without blocking. The architecture is further
modified so that these properties hold while any element can be
accessed on any data port. The architecture is particularly advantageous
for loading data into, and unloading data from, the vector registers of a
single-instruction, multiple-data (SIMD) processor, such as one used for
video decoding.
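
The abstract does not give the address-to-bank mapping, so the following is only an illustrative sketch of how non-blocking row, column, and sub-array access can arise in a banked memory. It assumes N = 4 single-port banks and a classical skewed-storage scheme in which each row is rotated by the bit-reversed row index; this mapping, and the names bank_of, is_conflict_free, row_run, column_run, and block, are hypothetical and are not taken from the disclosure. The sketch covers properties (1) through (3) only and does not model the routing of any element to any data port.

```python
N = 4  # assumed number of ports/banks (power of two)


def bit_reverse(value, bits):
    """Reverse the lowest `bits` bits of `value`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def bank_of(row, col, n=N):
    """Hypothetical mapping: rotate each row by the bit-reversed row index."""
    bits = n.bit_length() - 1
    skew = bit_reverse(row % n, bits)
    return (col + skew) % n


def is_conflict_free(cells, n=N):
    """True if every (row, col) cell falls in a different bank."""
    banks = [bank_of(r, c, n) for r, c in cells]
    return len(set(banks)) == len(banks)


def row_run(row, col, n=N):
    """N contiguous elements in a row, starting at (row, col)."""
    return [(row, col + i) for i in range(n)]


def column_run(row, col, n=N):
    """N contiguous elements in a column, starting at (row, col)."""
    return [(row + i, col) for i in range(n)]


def block(row, col, height, width):
    """A height-by-width two-dimensional sub-array anchored at (row, col)."""
    return [(row + i, col + j) for i in range(height) for j in range(width)]


if __name__ == "__main__":
    print(is_conflict_free(row_run(3, 5)))      # row access
    print(is_conflict_free(column_run(1, 2)))   # column access
    print(is_conflict_free(block(0, 0, 2, 2)))  # 2x2 block on an even row
    print(is_conflict_free(block(1, 0, 2, 2)))  # 2x2 block on an odd row
```

Under this assumed mapping, every run of four row or column elements touches four different banks, and a 2x2 block is conflict-free when it starts on an even row but not when it starts on an odd row, which mirrors the "some N-element sub-arrays" wording of property (3).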