Methods and products for processing a software kernel of instructions are
disclosed. The software kernel has stages representing a loop nest. The
software kernel is processed by: (1) partitioning iterations of an
outermost loop into groups, each group representing a set of iterations
of the outermost loop; (2) running the software kernel and rotating a
register file for each stage of the software kernel preceding an
innermost loop, to generate code that prepares for filling and executing
instructions in software pipelines for a current group; (3) running the
software kernel for each stage of the software kernel in the innermost
loop, to generate code that fills the software pipelines for the current
group, with the register file being rotated after at least one run of
the software kernel for the innermost loop; and (4) repeatedly running
the software kernel to unroll inner loops, to generate code that further
fills the software pipelines for the current group.
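
The ordering of the steps summarized above can be illustrated with a
small sketch. It is not the disclosed implementation: the names Emitter,
partition and generate, the parameters group_size, stages_before_inner,
inner_stages and inner_trip, and the event log are all hypothetical and
introduced only for illustration. The sketch further assumes that the
register file is rotated after every run in the innermost-loop fill
phase and is not rotated during the unrolling phase; the text above
requires only a rotation after at least one such run.

```python
from dataclasses import dataclass, field


@dataclass
class Emitter:
    """Hypothetical recorder of code-generation events (not from the source)."""
    log: list = field(default_factory=list)
    rotations: int = 0

    def run_kernel(self, group: int, stage: int) -> None:
        # Stands in for "running the software kernel" for one stage.
        self.log.append(("run", group, stage))

    def rotate(self, group: int) -> None:
        # Stands in for rotating the register file.
        self.rotations += 1
        self.log.append(("rotate", group))


def partition(n_outer: int, group_size: int) -> list:
    """Split outermost-loop iterations 0..n_outer-1 into consecutive groups."""
    return [list(range(start, min(start + group_size, n_outer)))
            for start in range(0, n_outer, group_size)]


def generate(n_outer, group_size, stages_before_inner, inner_stages, inner_trip):
    """Walk the groups and record the run/rotate sequence for each one."""
    em = Emitter()
    groups = partition(n_outer, group_size)
    first_inner = stages_before_inner
    last_inner = stages_before_inner + inner_stages
    for g in range(len(groups)):
        # Stages preceding the innermost loop: run the kernel, then rotate.
        for s in range(first_inner):
            em.run_kernel(g, s)
            em.rotate(g)
        # Innermost-loop stages: fill the pipelines.
        # Assumption: rotate after every run of this phase.
        for s in range(first_inner, last_inner):
            em.run_kernel(g, s)
            em.rotate(g)
        # Unrolling: repeat the innermost-loop stages for the remaining
        # inner iterations. Assumption: no rotation in this phase.
        for _ in range(inner_trip - 1):
            for s in range(first_inner, last_inner):
                em.run_kernel(g, s)
    return groups, em


if __name__ == "__main__":
    groups, em = generate(n_outer=10, group_size=4,
                          stages_before_inner=2, inner_stages=3, inner_trip=2)
    print(groups)
    print(em.rotations, len(em.log))
```

With ten outermost iterations and a group size of four, the sketch forms
three groups, the last holding the two leftover iterations. Each group
is then walked through the same prepare, fill and unroll sequence.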