A system and methods are provided for inserting probe points into an
executing program and measuring the time spent, or some other
performance metric (e.g., instructions executed, cache misses, memory
addresses accessed), in traversing code paths from one probe point to
any other probe point. One method is implemented by inserting N probes. Each probe
has a corresponding function configured to: retrieve the identifier and
timestamp of the previous probe executed, calculate the time spent
traversing the path from the previous probe to the current probe, and
update a matrix of N×N elements, wherein each element corresponds
to a path from one probe to another probe. After completion of the
program, this matrix is useful for identifying code paths that are
bottlenecks and hence candidates for optimization.
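
By way of illustration only, the probe function and the N×N matrix
described above might be sketched in Python as follows. The names
PathProfiler, probe and hottest_paths are hypothetical and do not appear
in the disclosure. The separate traversal-count matrix and the
injectable clock are likewise assumptions, added so that the example can
be exercised deterministically. Elapsed time is used as the metric, but
any monotonically increasing counter could be substituted for the clock.

```python
import time


class PathProfiler:
    """Hypothetical sketch: accumulate a metric per probe-to-probe path."""

    def __init__(self, n_probes, clock=time.perf_counter_ns):
        self.n = n_probes
        self.clock = clock  # assumption: any monotonic counter may be used
        # total[i][j] holds the accumulated metric for the path i -> j.
        self.total = [[0] * n_probes for _ in range(n_probes)]
        # count[i][j] holds how often the path i -> j was traversed
        # (an illustrative addition, not stated in the abstract).
        self.count = [[0] * n_probes for _ in range(n_probes)]
        self.prev_id = None  # identifier of the previous probe executed
        self.prev_ts = None  # timestamp of the previous probe executed

    def probe(self, probe_id):
        """Function associated with each inserted probe."""
        now = self.clock()
        if self.prev_id is not None:
            # Charge the elapsed metric to the path previous -> current.
            self.total[self.prev_id][probe_id] += now - self.prev_ts
            self.count[self.prev_id][probe_id] += 1
        # The current probe becomes the "previous" probe for the next call.
        self.prev_id = probe_id
        self.prev_ts = now

    def hottest_paths(self, top=3):
        """Return (total, count, src, dst) tuples, largest total first."""
        paths = [
            (self.total[i][j], self.count[i][j], i, j)
            for i in range(self.n)
            for j in range(self.n)
            if self.count[i][j]
        ]
        paths.sort(reverse=True)
        return paths[:top]
```

In such a sketch, a call such as profiler.probe(k) would be placed at
the k-th probe point, and hottest_paths would be consulted after the
program completes to rank candidate bottlenecks.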