An electronic document is parsed to remove irrelevant text and to identify
the significant elements of the retained text. The elements are assigned
scores representing their significance to the topical content of the
document. A matrix of element pairs is constructed in which each node
holds the result of one or more functions of the scores and other
attributes of the paired elements. The resulting matrix is a compact
representation of topical content that affords high precision in
information retrieval applications that depend on measuring the
relatedness of topical content.
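
The pipeline described above (parse, score, build a pair matrix, compare) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the source: the stop list, relative frequency as the significance score, the product of two scores as the pairing function, and cosine similarity as the relatedness measure. The "other attributes" of the paired elements are left out entirely.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stop list standing in for "irrelevant text".
STOPWORDS = {"a", "an", "the", "of", "to", "and", "in", "is", "are", "that", "on"}


def parse(text):
    """Return the retained elements: lowercased words not in the stop list."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def score(elements):
    """Assumed scoring: relative frequency of each element in the document."""
    counts = Counter(elements)
    total = sum(counts.values())
    return {e: c / total for e, c in counts.items()}


def pair_matrix(scores):
    """Sparse matrix keyed by sorted element pairs.

    Each node holds an assumed pairing function: the product of the two scores.
    """
    return {(a, b): scores[a] * scores[b] for a, b in combinations(sorted(scores), 2)}


def represent(text):
    """Run parse -> score -> pair matrix for one document."""
    return pair_matrix(score(parse(text)))


def relatedness(m1, m2):
    """Assumed measure: cosine similarity between two sparse pair matrices."""
    dot = sum(v * m2[k] for k, v in m1.items() if k in m2)
    n1 = sum(v * v for v in m1.values()) ** 0.5
    n2 = sum(v * v for v in m2.values()) ** 0.5
    return dot / (n1 * n2) if n1 and n2 else 0.0


if __name__ == "__main__":
    a = represent("The cat sat on the mat. The cat slept.")
    b = represent("A cat slept on a mat.")
    print(sorted(a))          # element pairs that make up the matrix
    print(relatedness(a, b))  # value in [0, 1]; higher means more related
```

Storing the matrix as a dictionary keyed by sorted pairs keeps it sparse and symmetric by construction, which is one way to read "compact representation"; a dense array indexed by a shared vocabulary would work equally well for small vocabularies.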