A term-by-document (or part-by-collection) matrix can be used to index
documents (or collections) for information retrieval applications.
Reducing the rank of the indexing matrix can further reduce the
complexity of information retrieval. A method for index matrix rank
reduction can involve computing a singular value decomposition of the
matrix and then retaining those singular values that correspond to the
expected singular values of multiple topics. The expected singular values
corresponding to a topic can be determined using the roots of a specially
formed characteristic polynomial. The coefficients of this characteristic
polynomial can be obtained by computing the determinants of a Gram matrix
of term (or part) probabilities, by a method of recursion, or by a method
of recursion further weighted by the probability of document (or
collection) lengths.
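
The rank-reduction step can be illustrated with a minimal sketch. It assumes NumPy, a small made-up term-by-document matrix, and a fixed cutoff named threshold that stands in for the topic-based expected singular values; the function name reduce_rank and the cutoff rule are hypothetical and are not the selection method described above, which relies on the roots of the characteristic polynomial.

```python
import numpy as np


def reduce_rank(A, threshold):
    """Truncate the SVD of A, keeping singular values above `threshold`.

    `threshold` is a hypothetical stand-in for the expected singular
    values of the topics; the text derives those from polynomial roots.
    """
    # Thin SVD: A = U @ diag(s) @ Vt, with s sorted in descending order.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Number of singular values that pass the cutoff.
    k = int(np.count_nonzero(s > threshold))
    return U[:, :k], s[:k], Vt[:k, :]


# Illustrative 4-term by 3-document count matrix (made-up data).
A = np.array([
    [2.0, 0.0, 1.0],
    [0.0, 3.0, 0.0],
    [1.0, 0.0, 2.0],
    [0.0, 0.0, 0.0],
])

U_k, s_k, Vt_k = reduce_rank(A, threshold=2.0)

# Reduced-rank approximation of the indexing matrix.
A_k = U_k @ np.diag(s_k) @ Vt_k
```

Here the singular values of A are 3, 3, and 1, so a cutoff of 2.0 keeps two of them and A_k is a rank-2 approximation. In the method described above, the cutoff would be replaced by a comparison against the expected singular values computed per topic.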