The present invention provides a method, a computer program product, and
an apparatus for blocking a thread at dispatch in a multi-thread
processor for fine-grained control of thread performance. Multiple
threads share a pipeline within the processor, so a long latency
condition for an instruction on one thread can stall all of the threads
that share the pipeline. A dispatch-block signaling instruction blocks
the thread containing the long latency condition at dispatch. The length
of the block matches the length of the latency, so the pipeline can
dispatch instructions from the blocked thread after the long latency
condition is resolved. In one embodiment, the dispatch-block signaling
instruction is a modified OR instruction, and in another embodiment it is
a NOP instruction. By blocking one thread at dispatch, the processor can
dispatch instructions from the other threads during the block.
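
The dispatch behavior described above can be illustrated with a small software model. The following is a minimal sketch, not the claimed hardware: it assumes a single shared pipeline that dispatches at most one instruction per cycle in round-robin order, and it represents the dispatch-block signaling instruction as a hypothetical tuple ("block", n) that keeps its own thread out of dispatch for n cycles. The function name simulate, the thread names, and the instruction labels are all illustrative assumptions.

```python
from collections import deque


def simulate(threads, max_cycles=100):
    """Toy round-robin dispatcher with per-thread dispatch blocking.

    threads: dict mapping a thread name to its list of instructions.
    An instruction is either a string (an ordinary operation) or the
    hypothetical tuple ("block", n), which blocks that thread at
    dispatch for the n cycles that follow it.
    Returns a list of (cycle, thread_name, instruction); an idle cycle
    is recorded as (cycle, None, None).
    """
    queues = {name: deque(instrs) for name, instrs in threads.items()}
    blocked_until = {name: 0 for name in threads}  # first cycle the thread may dispatch
    order = list(threads)
    next_idx = 0
    trace = []

    for cycle in range(max_cycles):
        if not any(queues.values()):
            break  # every thread has run out of instructions
        # Scan the threads in round-robin order and skip blocked ones.
        for offset in range(len(order)):
            name = order[(next_idx + offset) % len(order)]
            if queues[name] and blocked_until[name] <= cycle:
                instr = queues[name].popleft()
                if isinstance(instr, tuple) and instr[0] == "block":
                    # Only the issuing thread is blocked; others keep dispatching.
                    blocked_until[name] = cycle + 1 + instr[1]
                trace.append((cycle, name, instr))
                next_idx = (next_idx + offset + 1) % len(order)
                break
        else:
            trace.append((cycle, None, None))  # no thread was eligible
    return trace


if __name__ == "__main__":
    program = {
        "T0": ["load", ("block", 3), "use"],  # "use" waits on the slow "load"
        "T1": ["a", "b", "c", "d"],
    }
    for cycle, name, instr in simulate(program):
        print(cycle, name, instr)
```

In this model, thread T0 issues a three-cycle block after its load, thread T1 fills the three cycles that T0 would otherwise have stalled, and T0 resumes dispatch with "use" once the block expires. Choosing n to match the expected latency mirrors the statement that the length of the block matches the length of the latency.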