A processor comprises an instruction cache that stores a cache line of
instructions and an execution engine for executing the instructions, along
with a buffer to store a plurality of entries. A first logic circuit
divides the cache line into instruction bundles, each of which is
written into an entry of the buffer. A second logic circuit reads out a
number of consecutive instruction bundles from the buffer for dispersal to
the execution engine, so as to optimize speculative fetching and maximize
instruction supply to the execution resources of the processor.
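
The arrangement described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware: the 16-byte bundle size, the 8-entry capacity, the stall-when-full policy, and the names `BundleBuffer`, `write_line`, and `read_bundles` are all hypothetical and do not come from the text. `write_line` stands in for the first logic circuit and `read_bundles` for the second.

```python
from collections import deque

BUNDLE_BYTES = 16    # assumed bundle size; the source does not specify one
BUFFER_ENTRIES = 8   # assumed buffer capacity; the source does not specify one


class BundleBuffer:
    """Toy model of a buffer sitting between an instruction cache and an execution engine."""

    def __init__(self, capacity=BUFFER_ENTRIES):
        self.capacity = capacity
        self.entries = deque()  # oldest bundle sits at the left end

    def free_entries(self):
        return self.capacity - len(self.entries)

    def write_line(self, cache_line):
        """Model of the first logic circuit: split a cache line into bundles
        and write each bundle into its own buffer entry.

        Returns the number of bundles written, or 0 if the line does not fit
        (an assumed stall policy: the whole line is rejected).
        """
        if len(cache_line) % BUNDLE_BYTES != 0:
            raise ValueError("cache line must hold a whole number of bundles")
        bundles = [
            cache_line[i:i + BUNDLE_BYTES]
            for i in range(0, len(cache_line), BUNDLE_BYTES)
        ]
        if len(bundles) > self.free_entries():
            return 0
        self.entries.extend(bundles)
        return len(bundles)

    def read_bundles(self, count):
        """Model of the second logic circuit: read out up to `count`
        consecutive bundles, oldest first, for dispersal."""
        n = min(count, len(self.entries))
        return [self.entries.popleft() for _ in range(n)]
```

In this model the fetch side can keep calling `write_line` while the buffer has room, and the execution side drains it independently with `read_bundles`, which is the decoupling the buffer is meant to provide.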