A knowledge-based document analysis system and method for identifying and
decomposing constrained and unconstrained images of documents are
disclosed. Low-level features are extracted from bitonal and grayscale
images. These features are passed to a document classification means,
which forms initial hypotheses about the document class. For constrained
documents, the document analysis system sorts through various models to
determine the exact type of document and then extracts the relevant fields
for character recognition. For unconstrained documents, the document
analysis means uses a blackboard architecture, which includes a knowledge
database and knowledge sources, to create information and hypotheses that
identify and locate relevant fields within the document.
These fields are then sent for optical character recognition.
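
The blackboard flow for unconstrained documents can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the names Blackboard, Hypothesis, find_amount_keyword, find_amount_field and run_blackboard, the "hint" tag on a feature, the feature layout and the confidence values are all hypothetical, chosen only to show knowledge sources posting hypotheses to a shared store until the relevant field is located.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height); assumed layout


@dataclass
class Hypothesis:
    label: str         # what the region is believed to be
    box: Box           # where it is on the page
    confidence: float  # illustrative score, not from the source


@dataclass
class Blackboard:
    """Shared store: extracted low-level features plus posted hypotheses."""
    features: List[dict]
    hypotheses: List[Hypothesis] = field(default_factory=list)

    def find(self, label: str) -> Optional[Hypothesis]:
        return next((h for h in self.hypotheses if h.label == label), None)


KnowledgeSource = Callable[[Blackboard], Optional[Hypothesis]]


def find_amount_keyword(bb: Blackboard) -> Optional[Hypothesis]:
    # Hypothetical source: looks for a line tagged as a "total" keyword.
    if bb.find("amount_keyword"):
        return None
    for f in bb.features:
        if f["kind"] == "text_line" and f.get("hint") == "total":
            return Hypothesis("amount_keyword", f["box"], 0.8)
    return None


def find_amount_field(bb: Blackboard) -> Optional[Hypothesis]:
    # Hypothetical source: needs the keyword hypothesis first, then picks
    # a text line on roughly the same row, to the right of the keyword.
    anchor = bb.find("amount_keyword")
    if anchor is None or bb.find("amount_field"):
        return None
    ax, ay, aw, ah = anchor.box
    for f in bb.features:
        x, y, _, _ = f["box"]
        same_row = abs(y - ay) <= ah
        if f["kind"] == "text_line" and same_row and x >= ax + aw:
            return Hypothesis("amount_field", f["box"], 0.6)
    return None


def run_blackboard(bb: Blackboard, sources: List[KnowledgeSource],
                   max_cycles: int = 10) -> Blackboard:
    # Simple control loop: let every source contribute each cycle and
    # stop once a full cycle posts nothing new.
    for _ in range(max_cycles):
        posted = False
        for source in sources:
            hyp = source(bb)
            if hyp is not None:
                bb.hypotheses.append(hyp)
                posted = True
        if not posted:
            break
    return bb


features = [
    {"kind": "text_line", "box": (40, 300, 60, 12), "hint": "total"},
    {"kind": "text_line", "box": (120, 302, 50, 12)},
    {"kind": "text_line", "box": (40, 20, 200, 16)},
]
bb = run_blackboard(Blackboard(features),
                    [find_amount_field, find_amount_keyword])
# Regions that would be handed on to optical character recognition.
fields_for_ocr = [h.box for h in bb.hypotheses if h.label == "amount_field"]
```

In this sketch the sources are deliberately registered in the "wrong" order: the field locator can only contribute after the keyword hypothesis appears on the blackboard, which is the kind of opportunistic, data-driven cooperation a blackboard architecture is meant to allow.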