The present invention describes a process for enhancing optical
recognition of text in scanned documents. Prior to performing optical
recognition for identification of text in scanned documents, a
preprocessing algorithm identifies locations of noncontiguity in
character strokes. The gaps created by noncontiguous character strokes
are selectively filled with non-white (for example, black) pixels to
improve character recognition. The process may assess noncontiguity on a
bit-by-bit basis or, to reduce the number of operations, on a
byte-by-byte basis.
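
The bit-by-bit and byte-by-byte assessments can be illustrated with a
minimal sketch. The abstract does not state the criterion used to detect
a gap, so the sketch assumes one for illustration only: a gap is a single
white pixel with a black pixel immediately to its left and right on the
same scan line. The function names, the 1 = black convention, and the
packing of eight pixels per byte (most significant bit leftmost) are
likewise assumptions, not details taken from the invention.

```python
def fill_gaps_bitwise(row):
    """Bit-by-bit pass over one scan line.

    row: list of 0/1 ints, 1 = black (assumed convention).
    Assumed gap rule: a white pixel flanked left and right by black.
    """
    out = list(row)
    for i in range(1, len(row) - 1):
        # Test against the original row so a filled pixel never
        # triggers a further fill in the same pass.
        if row[i] == 0 and row[i - 1] == 1 and row[i + 1] == 1:
            out[i] = 1
    return out


def fill_gaps_bytewise(packed):
    """Byte-by-byte pass over the same scan line, 8 pixels per byte.

    packed: bytes, most significant bit = leftmost pixel (assumed).
    Applies the same assumed gap rule to 8 pixels at once.
    """
    out = bytearray(packed)
    n = len(packed)
    for k, cur in enumerate(packed):
        if cur == 0xFF:
            continue  # all black: nothing to fill in this byte
        prev = packed[k - 1] if k > 0 else 0
        nxt = packed[k + 1] if k + 1 < n else 0
        # Left neighbour of every pixel, borrowing the last bit of prev.
        left = (cur >> 1) | ((prev & 1) << 7)
        # Right neighbour of every pixel, borrowing the first bit of nxt.
        right = ((cur << 1) & 0xFF) | (nxt >> 7)
        # Black wherever the pixel was black or both neighbours are.
        out[k] = cur | (left & right)
    return bytes(out)


# Two bytes = 16 pixels: 10110010 10000001
print(fill_gaps_bytewise(bytes([0xB2, 0x81])).hex())  # f381
```

Under these assumptions both functions fill the same pixels; the
byte-wise version replaces eight per-pixel tests with a few shifts and
logical operations per byte, which is one way the operation count could
be reduced.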