Text summarizers using relevance measurement technologies and latent
semantic analysis techniques provide accurate and useful summarization of
the contents of text documents. Generic text summaries may be produced by
ranking and extracting sentences from original documents; broad coverage
of document content and decreased redundancy may simultaneously be
achieved by constructing summaries from sentences that are highly ranked
and different from each other. In one embodiment, conventional
Information Retrieval (IR) techniques may be applied to perform the
summarization: relevance measurement, sentence selection, and term
elimination may be repeated in successive iterations. In another
embodiment, a singular value decomposition technique may be applied to a
terms-by-sentences matrix such that all the sentences from the document
may be projected into the singular vector space; a text summarizer may
then select, as part of the text summary, the sentences having the
largest index values along the most important singular vectors.
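
The iterative embodiment (relevance measurement, sentence selection, term elimination) may be illustrated by the following minimal sketch. It is not the claimed implementation: the function names `tokenize` and `summarize_by_relevance`, the use of raw term counts without stop-word removal, and the inner product as the relevance measure are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase alphabetic words, no stop-word removal.
    return re.findall(r"[a-z]+", text.lower())


def summarize_by_relevance(sentences, num_sentences):
    """Pick sentences by repeated relevance scoring and term elimination."""
    # Term-frequency vector for each sentence and for the whole document.
    sent_vecs = [Counter(tokenize(s)) for s in sentences]
    doc_vec = Counter()
    for vec in sent_vecs:
        doc_vec.update(vec)

    def relevance(i):
        # Assumed relevance measure: inner product of sentence and document vectors.
        return sum(count * doc_vec[term] for term, count in sent_vecs[i].items())

    remaining = set(range(len(sentences)))
    chosen = []
    while remaining and len(chosen) < num_sentences:
        # Relevance measurement and sentence selection (ties go to the earlier sentence).
        best = max(sorted(remaining), key=relevance)
        if relevance(best) == 0:
            break  # every remaining term has already been covered
        chosen.append(best)
        remaining.remove(best)
        # Term elimination: drop the selected sentence's terms from the document
        # vector so later iterations favor content not yet in the summary.
        for term in sent_vecs[best]:
            doc_vec.pop(term, None)
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]
```

Because the terms of each selected sentence are removed before the next pass, a sentence that merely repeats already-covered content scores low in later iterations, which is how the sketch trades off coverage against redundancy.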
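
The singular value decomposition embodiment may be sketched along the following lines, assuming NumPy is available. This is an illustration under stated assumptions rather than the claimed implementation: the function name `summarize_by_svd`, the unweighted term counts in the matrix, and the use of absolute index values (to sidestep the arbitrary sign of singular vectors returned by a numerical SVD) are choices made here, not details taken from the text above.

```python
import re

import numpy as np


def summarize_by_svd(sentences, num_sentences):
    """Pick one sentence per leading singular vector of the terms-by-sentences matrix."""
    token_lists = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    vocab = sorted({t for tokens in token_lists for t in tokens})
    row_of = {term: i for i, term in enumerate(vocab)}

    # Terms-by-sentences matrix: entry (i, j) counts term i in sentence j.
    matrix = np.zeros((len(vocab), len(sentences)))
    for j, tokens in enumerate(token_lists):
        for term in tokens:
            matrix[row_of[term], j] += 1.0

    # Rows of vt are the right singular vectors, ordered by decreasing
    # singular value; column j of vt holds sentence j's index values.
    _, _, vt = np.linalg.svd(matrix, full_matrices=False)

    chosen = []
    for k in range(min(num_sentences, vt.shape[0])):
        # Assumption: compare magnitudes, since the sign of each vector is arbitrary.
        scores = np.abs(vt[k])
        for i in chosen:
            scores[i] = -1.0  # never pick the same sentence twice
        chosen.append(int(np.argmax(scores)))
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]
```

Taking one sentence per singular vector, in order of decreasing singular value, means each selected sentence represents a different dominant direction of the projected space.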