The invention provides a processor with two or more parallel instruction paths
for processing instructions. The instruction paths may be implemented with a plurality
of cores on a common die. Instructions of the invention are preferably processed
within a bundle of two or more instructions of a common program thread; and each
of the instruction paths preferably forms a cluster to process bundled instructions.
Each of the instruction paths has an array of pipelined execution units. Initially,
two or more of the parallel instruction paths process the same program thread
(one or more bundles) through the execution units, but with different optimization
characteristics set for each path. Assessment logic monitors the processing of
the initial program thread through the execution units and selects the heuristics
defining which path is in the lead. The other instruction paths are then reallocated,
or synchronized, with the optimization characteristics of the lead instruction
path, or with similarly optimized characteristics, to process other bundles of
the program thread; preferably, the lead path continues processing of the initial
thread without being disturbed. For other program threads, the process may repeat:
like bundles are processed through multiple instruction paths to identify the preferred
heuristics, and the multiple instruction paths are then synchronized to the optimized
characteristics to improve per-thread performance.
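
The select-then-synchronize scheme described above can be illustrated with a minimal software sketch. It is not the claimed hardware. The names `InstructionPath`, `simulated_cycles`, `select_lead`, and `synchronize`, the `prefetch` and `branch_predict` settings, and the toy cycle-cost model are all hypothetical and chosen only for illustration. The sketch also assumes that the lead path is the one that finishes the trial bundle in the fewest simulated cycles.

```python
from dataclasses import dataclass


@dataclass
class InstructionPath:
    name: str
    settings: dict  # hypothetical optimization characteristics, e.g. {"prefetch": True}


def simulated_cycles(bundle, settings):
    """Toy cost model (an assumption, not from the source): cycles for one bundle."""
    cycles = 0
    for op in bundle:
        cycles += 1  # base cost per instruction
        if op == "load" and not settings.get("prefetch", False):
            cycles += 3  # assumed stall when prefetching is off
        if op == "branch" and not settings.get("branch_predict", False):
            cycles += 2  # assumed penalty when prediction is off
    return cycles


def select_lead(paths, bundle):
    """Stand-in for the assessment logic: the lead is the path with the fewest cycles."""
    return min(paths, key=lambda p: simulated_cycles(bundle, p.settings))


def synchronize(paths, lead):
    """Copy the lead path's settings to every other path; the lead is left untouched."""
    for path in paths:
        if path is not lead:
            path.settings = dict(lead.settings)


if __name__ == "__main__":
    # Two paths start with different optimization characteristics.
    paths = [
        InstructionPath("A", {"prefetch": True, "branch_predict": False}),
        InstructionPath("B", {"prefetch": False, "branch_predict": True}),
    ]
    first_bundle = ["load", "load", "alu", "branch"]  # same bundle run on every path
    lead = select_lead(paths, first_bundle)
    synchronize(paths, lead)
    print(lead.name, [p.settings for p in paths])
```

In this sketch the lead path keeps its own settings object, mirroring the preference that the lead path continue processing undisturbed, while the other paths receive copies that could later be adjusted independently for a different thread.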