A memory accelerator module buffers program instructions and/or data for
high-speed access using a deterministic access protocol. The program
memory is logically partitioned into "stripes", or cyclically
sequential partitions, and the memory accelerator module includes a
latch that is associated with each partition. When a particular partition
is accessed, it is loaded into its corresponding latch, and the
instructions in the next sequential partition are automatically
pre-fetched into their corresponding latch. In this manner, a
sequential-access process exhibits a known, deterministic access time,
because the pre-fetched instructions from the next partition are already
in the latch when the program sequences to them. Previously
accessed blocks remain in their corresponding latches until the pre-fetch
process "cycles around" and overwrites the contents of each
sequentially-accessed latch. In this manner, the memory-access
performance of a loop is determined solely by the size of the loop. A
loop below a given size executes without overwriting any of its own
latches, and therefore incurs no memory-access delays as it repeatedly
executes instructions held in the latches. A loop above that size
overwrites latches that contain portions of the loop, and therefore
requires those latches to be re-loaded on each iteration. Because the
pre-fetch is automatic, and determined solely by the currently accessed
instruction, the complexity and overhead associated with this memory
acceleration are minimal.
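The striping, pre-fetch, and loop behavior described above can be sketched as a small simulation. The stripe size, latch count, and loop sizes below are illustrative assumptions, not values taken from the text; the model counts a "miss" whenever an accessed block is not already in its latch:

```python
# Hypothetical simulation of the striped-memory accelerator described above.
# STRIPE_WORDS and NUM_STRIPES are assumed, illustrative parameters.
STRIPE_WORDS = 4      # words per partition (stripe)
NUM_STRIPES = 8       # number of cyclically sequential partitions/latches

class Accelerator:
    def __init__(self):
        # latches[i] records which block of program memory is currently
        # held in the latch for stripe i (None = latch empty).
        self.latches = [None] * NUM_STRIPES
        self.miss_count = 0

    def fetch(self, addr):
        block = addr // STRIPE_WORDS     # block of program memory accessed
        stripe = block % NUM_STRIPES     # latch that serves this block
        if self.latches[stripe] != block:
            self.miss_count += 1         # memory-access delay incurred
            self.latches[stripe] = block
        # Automatic pre-fetch: load the next sequential block into its
        # own latch, possibly overwriting a previously latched block.
        nxt = block + 1
        self.latches[nxt % NUM_STRIPES] = nxt

def run_loop(loop_words, iterations):
    """Repeatedly execute a straight-line loop of `loop_words` words."""
    acc = Accelerator()
    for _ in range(iterations):
        for addr in range(loop_words):
            acc.fetch(addr)
    return acc.miss_count

# A loop smaller than the total latch capacity stays resident after the
# first pass, so later iterations incur no misses.
small = run_loop(loop_words=16, iterations=10)
# A loop larger than the latch capacity overwrites its own blocks as the
# pre-fetch cycles around, so each iteration re-loads a latch.
large = run_loop(loop_words=64, iterations=10)
print(small, large)  # → 1 10
```

In this sketch the small loop misses only once (the cold start), while the large loop misses once per iteration at the wrap-around point, matching the claim that loop performance depends solely on loop size.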