Facts are extracted from electronic documents by recognizing factual
descriptions, using a fact-word table whose entries are matched against the
words of the electronic documents. The words of those factual descriptions
may be tagged with the
appropriate part of speech. More detailed analysis is then performed on
those factual descriptions, rather than on the entire electronic
document, and particularly on the text in the neighborhood of the
fact-word matches. The analysis may involve identifying the linguistic
constituents of each phrase and determining the role of each as either subject or
object. Exclusion rules may be applied to eliminate those phrases
unlikely to be part of facts, the exclusion rules being based in part on
the linguistic constituents. Scoring rules may be applied to the remaining
phrases, and for those phrases having a score in excess of a threshold,
the corresponding sentence part, whole sentence, paragraph, or other
document portion may be presented as representing one or more facts.
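
The flow described above (fact-word matching, analysis restricted to the
neighborhood of each match, exclusion rules, scoring, and a threshold) can be
illustrated with a minimal Python sketch. Everything named here is a
hypothetical stand-in and not part of the described system: the FACT_WORDS
table and its weights, the HEDGES list used as a single exclusion rule, the
window size, the scoring formula, and the threshold are all assumptions chosen
for illustration. Part-of-speech tagging and true constituent analysis are
omitted; the presence of tokens before and after the fact word serves only as
a crude proxy for a subject and an object.

```python
import re

# Hypothetical fact-word table: word -> weight (values are illustrative only).
FACT_WORDS = {"held": 1.0, "ruled": 1.0, "signed": 0.8, "filed": 0.8}

# Hypothetical exclusion vocabulary: hedged phrases rarely state facts.
HEDGES = {"maybe", "perhaps", "probably"}


def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase word tokens; a real system would also tag parts of speech."""
    return re.findall(r"[A-Za-z']+", sentence.lower())


def neighborhoods(tokens, window=3):
    """Yield (fact_word, offset, phrase) for each fact-word match.

    `phrase` is the slice of tokens within `window` positions of the match,
    and `offset` is the position of the fact word inside that slice.
    """
    for i, tok in enumerate(tokens):
        if tok in FACT_WORDS:
            start = max(0, i - window)
            yield tok, i - start, tokens[start:i + window + 1]


def excluded(phrase):
    """Illustrative exclusion rule: drop phrases containing a hedge word."""
    return any(tok in HEDGES for tok in phrase)


def score(fact_word, offset, phrase):
    """Illustrative score: table weight plus bonuses for surrounding tokens."""
    has_left = offset > 0                  # crude proxy for a subject
    has_right = offset < len(phrase) - 1   # crude proxy for an object
    return FACT_WORDS[fact_word] + 0.5 * has_left + 0.5 * has_right


def extract_facts(text, threshold=1.5):
    """Return the sentences whose best phrase score exceeds the threshold."""
    facts = []
    for sentence in split_sentences(text):
        tokens = tokenize(sentence)
        best = max(
            (score(w, off, p)
             for w, off, p in neighborhoods(tokens)
             if not excluded(p)),
            default=0.0,
        )
        if best > threshold:
            facts.append(sentence)
    return facts


if __name__ == "__main__":
    sample = (
        "The court held that the contract was void. "
        "Maybe the judge ruled too quickly. "
        "It was a sunny day."
    )
    for fact in extract_facts(sample):
        print(fact)
```

In this sketch only the whole sentence is presented; returning the matched
phrase or the enclosing paragraph instead would be an equally valid choice
under the description above.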