Parallelism in a processor is exploited to permute a data set based on bit
reversal of indices associated with data points in the data set. Permuted
data can be stored in a memory having entries arranged in banks, where
entries in different banks can be accessed in parallel. A destination
location in the memory for a particular data point from the data set is
determined based on the bit-reversed index associated with that data
point. The bit-reversed index can be further modified so that at least
some of the destination locations determined by different parallel
processes are in different banks, allowing multiple points of the
bit-reversed data set to be written in parallel.
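
The bank-aware placement described above can be illustrated with a minimal sketch. The abstract does not say how the bit-reversed index is modified, so the rotation used here is only one assumed possibility, not the disclosed method. Further assumptions: the data set size n and the number of banks are powers of two with n at least the square of the number of banks, the bank of an address is the address modulo the number of banks, and there is one parallel process per bank, each handling one of a group of consecutive source indices. The names bit_reverse, skewed_address, and banks_per_group are hypothetical.

```python
def bit_reverse(index: int, bits: int) -> int:
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def skewed_address(rev: int, n: int, num_banks: int) -> int:
    """Assumed modification: rotate the bank of a bit-reversed index.

    Memory is viewed as rows of `num_banks` entries. The column (bank) of
    `rev` is rotated by the top log2(num_banks) bits of `rev`, which are
    the bits that differ between processes working on consecutive indices.
    The row is unchanged, so the mapping is still one-to-one.
    """
    row, col = divmod(rev, num_banks)
    top = rev // (n // num_banks)  # value in range(num_banks)
    return row * num_banks + (col + top) % num_banks


def banks_per_group(n: int, num_banks: int, skew: bool) -> list:
    """List the target bank of every write, grouped by parallel step.

    Each group holds `num_banks` consecutive source indices, one per
    simulated parallel process.
    """
    bits = n.bit_length() - 1  # log2(n) for a power-of-two n
    groups = []
    for start in range(0, n, num_banks):
        banks = []
        for lane in range(num_banks):
            rev = bit_reverse(start + lane, bits)
            addr = skewed_address(rev, n, num_banks) if skew else rev
            banks.append(addr % num_banks)
        groups.append(banks)
    return groups


if __name__ == "__main__":
    # Without the modification, each group collides in a single bank.
    print(banks_per_group(64, 4, skew=False)[0])
    # With the assumed rotation, each group spans all banks.
    print(banks_per_group(64, 4, skew=True)[0])
```

Under these assumptions, consecutive source indices differ only in their low bits, so their bit-reversed indices differ only in their high bits and fall in the same bank. Rotating the bank by those high bits spreads each group across all banks. A reader of the permuted data would have to apply the same address mapping.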