A pipelined processor includes a branch acceleration technique that is
based on an improved branch cache. The improved branch cache minimizes or
eliminates delays caused by branch instructions, especially
data-dependent unpredictable branches. The improved branch cache avoids
stalls by supplying the data to be inserted into the pipeline stages
that would otherwise stall when a branch is taken. Special
architectural features and control structures minimize the amount of
information that must be cached, both by caching only selected types of
branches and by using available cycles that would otherwise be wasted.
The improved branch
cache supplies the missing information to the pipeline in place of
the discarded instructions, completely eliminating the pipeline stall.
This technique improves performance, especially in real-time code that
must evaluate data-dependent conditions and branch accordingly.
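
The effect described above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the claimed implementation: it assumes the cached "missing information" is the first few instructions at the branch target, that an entry is filled the first time a branch is taken, and that a taken branch otherwise costs a fixed number of lost slots. The names `simulate`, `BUBBLE`, and `penalty` are hypothetical, and the selection of which branch types to cache is not modeled.

```python
BUBBLE = "bubble"  # marks a pipeline slot lost to a stall


def simulate(program, branches, outcomes, penalty, use_cache):
    """Return the per-cycle contents of one pipeline stage (toy model).

    program   : list of instruction names, indexed by address
    branches  : {branch address: target address}
    outcomes  : booleans, one per executed branch (True = taken)
    penalty   : slots lost after a taken branch (assumed fixed)
    use_cache : enable the hypothetical branch cache
    """
    cache = {}  # branch address -> first `penalty` instructions at the target
    stream = []
    outcomes = iter(outcomes)
    pc = 0
    while pc < len(program):
        stream.append(program[pc])
        # Treat the branch as not taken once the outcome list runs out.
        if pc in branches and next(outcomes, False):
            target = branches[pc]
            if use_cache and pc in cache:
                # Hit: cached instructions fill the slots that would stall.
                stream.extend(cache[pc])
                # Fetch resumes just past the instructions already supplied.
                pc = target + len(cache[pc])
            else:
                # Miss or no cache: the discarded slots become bubbles.
                stream.extend([BUBBLE] * penalty)
                if use_cache:
                    # Assumed fill policy: record the target's first instructions.
                    cache[pc] = program[target:target + penalty]
                pc = target
        else:
            pc += 1
    return stream


# A short loop whose backward branch at address 3 is taken twice, then falls through.
program = ["load", "add", "cmp", "br_back", "store"]
branches = {3: 0}
outcomes = [True, True, False]

without_cache = simulate(program, branches, outcomes, penalty=2, use_cache=False)
with_cache = simulate(program, branches, outcomes, penalty=2, use_cache=True)
print("bubbles without cache:", without_cache.count(BUBBLE))
print("bubbles with cache:   ", with_cache.count(BUBBLE))
```

In this toy model both runs execute the same useful instructions in the same order; only the first taken branch still pays the penalty when the cache is enabled, because that is when the entry is filled. The sketch also assumes that the cached target instructions contain no branches themselves.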