A system for performing matrix operations utilizes a processor, memory, and a
matrix operation manager. The processor has a memory cache. The memory is external to
the processor and stores first and second matrices. The matrix operation manager
is configured to mathematically combine the first matrix with the second matrix
utilizing a hoisted matrix algorithm for hoisting values of the first matrix, and
the hoisted matrix algorithm has an outer loop and an inner loop that is performed
to completion for each iteration of the outer loop. For each iteration of the
outer loop, and before performing the inner loop, the matrix operation manager is
configured to load to the cache, and to write to a contiguous portion of the
memory, the values from the first matrix that are to be combined with values from
the second matrix via performance of the inner loop.
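
The loop structure described above can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes that the "mathematical combination" is an ordinary matrix product, that both matrices are stored as flat column-major lists, and that the values hoisted per outer iteration are one row of the first matrix. The names hoisted_matmul, hoisted, a, b, m, k, and n are hypothetical and chosen only for illustration. Python does not expose processor cache behavior, so the sketch shows only the order of operations (copy to a contiguous buffer first, then run the inner loop to completion).

```python
def hoisted_matmul(a, b, m, k, n):
    """Multiply an m x k matrix a by a k x n matrix b.

    Assumption for this sketch: a, b, and the result are flat lists
    in column-major order, so element (row, col) of a is a[row + col * m].
    """
    c = [0.0] * (m * n)      # result matrix, column-major
    hoisted = [0.0] * k      # contiguous scratch buffer, reused each outer iteration

    for i in range(m):       # outer loop: one row of the first matrix per iteration
        # Hoisting step: row i of a is strided (stride m) in column-major
        # storage, so copy it into the contiguous buffer before the inner loop.
        for p in range(k):
            hoisted[p] = a[i + p * m]

        # Inner loop: runs to completion for this outer iteration, reading
        # the first matrix only through the contiguous hoisted buffer.
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += hoisted[p] * b[p + j * k]
            c[i + j * m] = acc

    return c


if __name__ == "__main__":
    # [[1, 2], [3, 4]] times [[5, 6], [7, 8]], both given column-major.
    print(hoisted_matmul([1, 3, 2, 4], [5, 7, 6, 8], 2, 2, 2))
```

In this sketch the copy costs k strided reads per outer iteration, while the inner loop then performs n * k unit-stride reads of the buffer; that trade is the motivation usually given for hoisting strided operands into contiguous storage.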