A system for binary code instrumentation to reduce effective memory
latency comprises a processor and memory coupled to the processor. The
memory comprises program instructions executable by the processor to
implement a code analyzer configured to analyze an instruction stream of
compiled code executable at an execution engine to identify, for a given
memory reference instruction in the stream that references data at a
memory address calculated during an execution of the instruction stream,
an earliest point in time during the execution at which sufficient data
is available at the execution engine to calculate the memory address. The
code analyzer generates an indication of whether the given memory
reference instruction is suitable for a prefetch operation based on a
difference in time between the earliest point in time and a time at which
the given memory reference instruction is executed during the execution.
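
The analysis described above can be illustrated with a minimal sketch. It makes several assumptions that the text does not state: the instruction stream is modeled as a simplified trace of records with a cycle number, addresses are formed only from registers, a register's value counts as available at the cycle of the instruction that writes it, and suitability is decided by comparing the time difference against a fixed threshold. The names Instr, prefetch_candidates, and min_slack are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """One entry of a simplified, hypothetical execution trace."""
    cycle: int                       # cycle at which the instruction executes
    dest: Optional[str] = None       # register written, if any
    addr_regs: Tuple[str, ...] = ()  # registers forming the memory address
                                     # (empty if not a memory reference)


def prefetch_candidates(trace, min_slack):
    """Return (index, earliest_cycle, slack, suitable) per memory reference.

    earliest_cycle is the first cycle at which every address register holds
    its final value; slack is the gap between that cycle and the cycle at
    which the memory reference executes. The threshold test is an assumed
    decision rule, not one given in the text.
    """
    last_write = {}  # register -> cycle at which its current value was produced
    results = []
    for index, ins in enumerate(trace):
        if ins.addr_regs:
            # Registers never written in the trace are assumed ready at cycle 0.
            earliest = max(last_write.get(reg, 0) for reg in ins.addr_regs)
            slack = ins.cycle - earliest
            results.append((index, earliest, slack, slack >= min_slack))
        # Record the write after the address check, so a load that overwrites
        # its own address register is analyzed against the old value.
        if ins.dest is not None:
            last_write[ins.dest] = ins.cycle
    return results


trace = [
    Instr(cycle=1, dest="r1"),
    Instr(cycle=2, dest="r2"),
    Instr(cycle=3, dest="r3"),
    Instr(cycle=40, dest="r4", addr_regs=("r1", "r2")),  # address known early
    Instr(cycle=41, dest="r5", addr_regs=("r4",)),       # address known late
]
report = prefetch_candidates(trace, min_slack=20)
```

In this example the first memory reference has its address operands ready at cycle 2 but executes at cycle 40, so the large gap marks it as a prefetch candidate. The second depends on a value produced one cycle earlier, which leaves too little time for a prefetch to hide any latency.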