A system includes a data processing core coupled to a system memory
employing error correction code (ECC) circuitry. The core includes an
indicator that records when a correctable system memory error occurs and
which address is associated with the error. A watchdog timer is instantiated on
a system management device. Periodically, the timer prompts the
management device to interrupt the processor and poll the error indicator
to determine whether a memory error has been detected. If an error is
detected, the corresponding physical memory address is recorded. If a
predetermined number of consecutive errors associated with a single
memory address or range of addresses occurs, an alert is issued. In one
embodiment, the error indicator is polled infrequently at first. As
additional errors are detected, the polling frequency increases. At
higher polling frequencies, the system may require a greater number of
consecutive errors before taking additional action.
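
The adaptive polling scheme described above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The class name EccErrorMonitor, the POLL_LEVELS schedule, the 4 KiB range granularity, and the choice to reset the count after an alert are all assumptions made for illustration. The sketch models only the bookkeeping done after each poll. Reading the hardware error indicator and arming the watchdog timer are left to the caller.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical schedule: (polling interval in seconds, consecutive errors required).
# Shorter intervals are paired with higher thresholds, as the text suggests.
POLL_LEVELS = [(3600.0, 2), (600.0, 4), (60.0, 8)]
RANGE_SHIFT = 12  # assumption: addresses are grouped into 4 KiB ranges


@dataclass
class EccErrorMonitor:
    level: int = 0                     # index into POLL_LEVELS
    last_range: Optional[int] = None   # address range seen on the previous poll
    consecutive: int = 0               # consecutive polls with an error in that range
    alerts: List[int] = field(default_factory=list)

    @property
    def interval(self) -> float:
        """Current polling interval in seconds."""
        return POLL_LEVELS[self.level][0]

    @property
    def threshold(self) -> int:
        """Consecutive errors required at the current polling level."""
        return POLL_LEVELS[self.level][1]

    def poll(self, error_address: Optional[int]) -> bool:
        """Process one poll result; return True if an alert was issued.

        error_address is the recorded physical address, or None if the
        error indicator showed no correctable error since the last poll.
        """
        if error_address is None:
            # A clean poll breaks the run of consecutive errors.
            # The source does not say whether the polling frequency decays,
            # so this sketch leaves the level unchanged.
            self.last_range = None
            self.consecutive = 0
            return False

        addr_range = error_address >> RANGE_SHIFT
        if addr_range == self.last_range:
            self.consecutive += 1
        else:
            self.last_range = addr_range
            self.consecutive = 1

        # Each detected error moves to the next, more frequent polling level.
        self.level = min(self.level + 1, len(POLL_LEVELS) - 1)

        if self.consecutive >= self.threshold:
            self.alerts.append(addr_range << RANGE_SHIFT)  # base address of the range
            self.consecutive = 0
            return True
        return False
```

In use, the management device would call poll() each time the watchdog timer fires and then re-arm the timer with the updated interval value. Because the threshold is read after the level is raised, a run of errors that drives polling to the fastest level must reach that level's higher count before an alert is issued.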