In lieu of branch prediction, a merged fetch-branch unit operates in parallel
with the decode unit within a processor. Upon detection of a branch instruction
within a group of one or more fetched instructions, any instructions preceding
the branch are marked as regular instructions, the branch instruction is marked as
such, and any instructions following the branch are marked as sequential instructions.
Within two cycles, sequential instructions following the last fetched instruction
are retrieved and marked, target instructions beginning at the branch target address
are retrieved and marked, and the branch is resolved. Either the sequential or
target instructions are then dropped depending on the branch resolution, incurring
a fixed one-cycle branch penalty.
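
The marking and drop steps described above can be illustrated with a minimal software sketch. It models only the bookkeeping, not the cycle timing or the hardware itself. The names `Mark`, `Marked`, `is_branch`, `mark_group`, and `resolve` are hypothetical, as is the convention that branch mnemonics begin with "b"; none of them come from the original text.

```python
from dataclasses import dataclass
from enum import Enum


class Mark(Enum):
    REGULAR = "regular"
    BRANCH = "branch"
    SEQUENTIAL = "sequential"
    TARGET = "target"


@dataclass(frozen=True)
class Marked:
    text: str
    mark: Mark


def is_branch(text: str) -> bool:
    # Assumption for illustration only: branch mnemonics start with "b".
    return text.split()[0].startswith("b")


def mark_group(group):
    """Mark a fetched group: regular before the branch, sequential after it."""
    marked = []
    seen_branch = False
    for text in group:
        if seen_branch:
            marked.append(Marked(text, Mark.SEQUENTIAL))
        elif is_branch(text):
            marked.append(Marked(text, Mark.BRANCH))
            seen_branch = True
        else:
            marked.append(Marked(text, Mark.REGULAR))
    return marked


def resolve(marked, next_sequential, target_instrs, taken):
    """Append both speculative paths, then drop the one the branch rules out."""
    stream = list(marked)
    stream += [Marked(s, Mark.SEQUENTIAL) for s in next_sequential]
    stream += [Marked(t, Mark.TARGET) for t in target_instrs]
    # A taken branch discards the sequential path; otherwise the target path.
    drop = Mark.SEQUENTIAL if taken else Mark.TARGET
    return [m for m in stream if m.mark is not drop]
```

For example, marking the group `["add r1, r2", "beq r1, L", "sub r3, r4"]` tags the three instructions as regular, branch, and sequential. Calling `resolve` with `taken=True` then keeps the regular instruction, the branch, and the target instructions, while every sequential instruction is dropped.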