A processor (e.g., a co-processor) capable of executing instructions
sequentially comprises at least two functional hardware resources. When
two instructions are consecutive in program order and are executed
on two separate functional hardware resources, the execution of the two
instructions may be parallelized if the two instructions are within a
hardware loop. The processor may thus implement a multiply and
accumulate process in an efficient manner by performing the multiply
instructions concurrently with the add instructions (which require fewer
cycles to complete than the multiply instructions).
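
The benefit of overlapping the two instruction types can be illustrated
with a simple cycle-count model. The sketch below is illustrative only
and rests on assumptions not stated above: one multiplier and one adder,
hypothetical latencies (mul_cycles, add_cycles), a multiply that does
not depend on the accumulator, and an add for a given iteration that
starts only once its own multiply has finished. Under those assumptions
the add of one iteration runs while the multiply of the next iteration
is in progress.

```python
def sequential_cycles(n_iters, mul_cycles, add_cycles):
    """Cycles when each multiply and add runs strictly one after another."""
    return n_iters * (mul_cycles + add_cycles)


def overlapped_cycles(n_iters, mul_cycles, add_cycles):
    """Cycles when the add of iteration i may overlap the multiply of i+1.

    Hypothetical model: one multiplier and one adder, each handling one
    instruction at a time; an add waits for its own multiply result.
    """
    mul_free = 0  # cycle at which the multiplier is next available
    add_free = 0  # cycle at which the adder is next available
    for _ in range(n_iters):
        # The multiply issues as soon as the multiplier is free.
        mul_done = mul_free + mul_cycles
        mul_free = mul_done
        # The add needs the product and a free adder.
        add_start = max(mul_done, add_free)
        add_free = add_start + add_cycles
    return add_free


if __name__ == "__main__":
    # Assumed latencies: 3-cycle multiply, 1-cycle add, 4 loop iterations.
    print(sequential_cycles(4, 3, 1))
    print(overlapped_cycles(4, 3, 1))
```

In this model, when the add is no slower than the multiply, the
overlapped total reduces to n_iters * mul_cycles + add_cycles, so only
the final add remains exposed after the last multiply.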