The present invention provides a multithreaded processor, such as
a network processor, including a thread interleaver that implements
fine-grained thread decisions to avoid underutilization of instruction
execution resources in spite of large communication latencies. In an
upper pipeline, an instruction unit determines an instruction fetch
sequence responsive to an instruction queue depth on a per thread basis.
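
As an illustration only, a depth-responsive fetch decision might be sketched as follows. The policy shown (fetch for the thread whose instruction queue is shallowest, skipping full queues) and the names `next_fetch_thread`, `queue_depths`, and `capacity` are assumptions made for this sketch; the text states only that the fetch sequence is responsive to per-thread instruction queue depth.

```python
def next_fetch_thread(queue_depths, capacity):
    """Return the id of the thread to fetch for next, or None if every queue is full.

    queue_depths[i] is the number of instructions currently queued for thread i.
    Assumed policy (not taken from the source): prefer the shallowest queue,
    breaking ties in favor of the lowest thread id.
    """
    # Only threads with room left in their queue are candidates for a fetch.
    candidates = [
        (depth, tid) for tid, depth in enumerate(queue_depths) if depth < capacity
    ]
    if not candidates:
        return None
    # min() on (depth, tid) tuples picks the lowest depth, then the lowest id.
    return min(candidates)[1]
```

For example, with four threads and a queue capacity of 4, `next_fetch_thread([3, 1, 4, 1], 4)` selects thread 1: threads 1 and 3 tie for the shallowest queue, and the lower id wins under the assumed tie-break.
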
In a lower pipeline, a thread interleaver determines a thread interleave
sequence responsive to thread conditions including thread latency
conditions. The thread interleaver selects threads using a two-level
round robin arbitration. A thread latency signal is asserted in response
to a thread latency such as a thread stall, cache miss, or interlock.
During the subsequent one or more clock cycles, the affected thread is
ineligible for arbitration. In one embodiment, other thread conditions,
such as local priority, global stalls, and late stalls, also affect
selection decisions.
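
The arbitration described above can be sketched in software as follows. This is a minimal model under stated assumptions, not the claimed hardware: the text does not define the two levels, so the sketch assumes the first level rotates among thread groups and the second rotates among the threads within the chosen group. The class name `TwoLevelRoundRobin`, the grouping, and the `stalled` set (thread ids whose latency signal is currently active) are hypothetical.

```python
class TwoLevelRoundRobin:
    """Software model of a two-level round robin arbiter (assumed structure)."""

    def __init__(self, groups):
        # groups: list of lists of thread ids (hypothetical grouping).
        self.groups = groups
        self.group_ptr = 0                     # first-level pointer: next group to try
        self.thread_ptrs = [0] * len(groups)   # second-level pointers, one per group

    def select(self, stalled):
        """Return the next eligible thread id, or None if every thread is stalled.

        stalled: set of thread ids with an active latency signal; these
        threads are skipped, i.e. ineligible for arbitration this cycle.
        """
        n_groups = len(self.groups)
        for g_off in range(n_groups):
            g = (self.group_ptr + g_off) % n_groups
            members = self.groups[g]
            for t_off in range(len(members)):
                i = (self.thread_ptrs[g] + t_off) % len(members)
                tid = members[i]
                if tid not in stalled:
                    # Advance both pointers past the winner so the next
                    # selection starts from its successors.
                    self.thread_ptrs[g] = (i + 1) % len(members)
                    self.group_ptr = (g + 1) % n_groups
                    return tid
        return None
```

With groups `[[0, 1], [2, 3]]` and no stalls, successive calls to `select(set())` yield threads 0, 2, 1, 3. If threads 0 and 1 are stalled, the arbiter skips their group and issues from threads 2 and 3 instead, which is how execution resources stay busy while other threads wait out a latency.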