A method for optimizing a software-pipelineable loop in a software code is provided.
The loop comprises one or more pipelined stages and one or more loop operations.
The method comprises evaluating an initiation interval time (IN) for a pipelined
stage of the loop. A loop operation time latency (Tld) is determined, and a number
of loop operations (Np) to peel from the pipelined stages is then determined based
on IN and Tld.
The loop operation is peeled Np times and copied before the loop in the software
code. A vector of registers is allocated, and the results of the peeled loop operations
and a result of the original loop operation are assigned to the vector of registers.
Memory addresses for the results of the peeled loop operations and original loop
operation are also assigned.
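
The steps above can be illustrated with a minimal sketch. It rests on assumptions that the text does not state. The rule Np = ceil(Tld / IN) is assumed here as one plausible way to derive Np from IN and Tld. The names `peel_count` and `run_peeled` are hypothetical. The vector of registers is modeled as a Python list used as a rotating buffer, and memory addresses are modeled as list indices. The loop operation being peeled is taken to be a load.

```python
import math


def peel_count(t_ld: int, ii: int) -> int:
    """Assumed rule (not from the source): peel enough copies of the
    loop operation to cover its latency, Np = ceil(Tld / IN)."""
    return math.ceil(t_ld / ii)


def run_peeled(data, t_ld, ii):
    """Sum 2 * data[i], with the load peeled Np times ahead of the loop."""
    n_p = min(peel_count(t_ld, ii), len(data))

    # Vector of registers: Np slots for the peeled loads plus one slot
    # for the load that stays in the loop body.
    regs = [None] * (n_p + 1)

    # Peeled copies of the load, placed before the loop.
    for k in range(n_p):
        regs[k] = data[k]

    total = 0
    for i in range(len(data)):
        # The original load now runs Np iterations ahead; its address
        # (here, an index) is shifted by Np.
        if i + n_p < len(data):
            regs[(i + n_p) % (n_p + 1)] = data[i + n_p]
        # The consumer reads the value loaded Np iterations earlier.
        total += regs[i % (n_p + 1)] * 2
    return total
```

With `t_ld=6` and `ii=2`, the sketch peels three loads, and `run_peeled([1, 2, 3, 4, 5], 6, 2)` gives the same sum as the unpeeled loop. The extra register slot keeps the in-loop load from overwriting a value that has not yet been consumed.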