A method detects, manages, and reports uncorrectable data errors
among a plurality of clusters of symmetric multiprocessors (SMPs).
The method allows merging of
newly detected errors, including memory, cache, control, address, and
interface errors, into existing error status. The error status can be
distributed in several formats, including separate status
signals, special uncorrectable error (UE) ECC codewords, encoded data
patterns, parity error injection, and response codepoints. The error
status is also available for logging and analysis while the machine is
operating, allowing for recovery and component failure isolation as soon
as the errors are detected without stopping the machine.
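
The merging of newly detected errors into existing error status can be illustrated with a minimal software sketch. It assumes the status is modeled as a set of sticky bit flags, one per error type, with each new error ORed into the accumulated status and logged without halting anything. The names ErrorSource, ErrorStatusRegister, and report are hypothetical and chosen for illustration only; the hardware formats listed above (UE ECC codewords, parity injection, response codepoints) are not modeled here.

```python
from enum import Flag


class ErrorSource(Flag):
    """Hypothetical error types, one bit each (names are illustrative)."""
    NONE = 0
    MEMORY = 1
    CACHE = 2
    CONTROL = 4
    ADDRESS = 8
    INTERFACE = 16


class ErrorStatusRegister:
    """Accumulates error status; earlier errors are never overwritten."""

    def __init__(self):
        self.status = ErrorSource.NONE
        self.log = []  # (cluster_id, source) entries, readable at any time

    def report(self, source, cluster_id):
        # Merge the new error into the existing status with a bitwise OR,
        # so previously recorded error types stay set.
        self.status |= source
        self.log.append((cluster_id, source))

    def has_error(self):
        return self.status != ErrorSource.NONE
```

In this sketch, reporting a memory error from one cluster and an interface error from another leaves both bits set in the status, and the log keeps the originating cluster of each error for later isolation.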