For encoding of mixed-mode images containing text and continuous-tone
content, the pixels in the image that form the text content are detected
and separated. Text detection classifies pixels as text or continuous-tone
content by accumulating pixel counts for groups of contiguous,
non-smooth pixels with the same color. Groups whose pixel count exceeds a
threshold are classified as text. To reduce classification errors, the
technique also tests each group for the boundary dimensions and pixel
density characteristic of long straight lines or large borders. It then
searches the neighborhood of groups qualifying as text for pixels of the
same color, so as to also detect isolated text marks such as dots,
accents, or punctuation.
The separated text and continuous-tone content can be encoded separately
for efficient compression while preserving text quality, and the text is
superimposed on the continuous-tone content again at decompression.
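
The detection steps described above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The function name detect_text, every threshold value, the precomputed non_smooth mask, and the exact rule for rejecting lines and borders are assumptions made for illustration only. The sketch groups contiguous, non-smooth pixels of the same color, keeps groups whose pixel count exceeds a threshold, drops oversized groups whose density suggests a solid line or a hollow border, and then searches an expanded bounding box around each remaining group for pixels of the same color.

```python
from collections import deque


def detect_text(image, non_smooth, min_pixels=3, max_side=40,
                min_density=0.2, max_density=0.9, radius=2):
    """Return the set of (row, col) pixels classified as text.

    All thresholds are illustrative assumptions, not values from the source.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    text = set()

    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or not non_smooth[r][c]:
                continue
            # Flood-fill one group of contiguous, non-smooth, same-color pixels.
            color = image[r][c]
            seen[r][c] = True
            queue, group = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                group.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and non_smooth[ny][nx]
                            and image[ny][nx] == color):
                        seen[ny][nx] = True
                        queue.append((ny, nx))

            # Step 1: the group's pixel count must exceed the threshold.
            if len(group) <= min_pixels:
                continue

            # Step 2 (assumed rule): reject oversized groups that are either
            # nearly solid (line-like) or nearly hollow (border-like).
            ys = [p[0] for p in group]
            xs = [p[1] for p in group]
            top, bottom, left, right = min(ys), max(ys), min(xs), max(xs)
            height, width = bottom - top + 1, right - left + 1
            density = len(group) / (height * width)
            oversized = max(height, width) > max_side
            if oversized and (density >= max_density or density <= min_density):
                continue

            # Step 3: collect same-color pixels in the expanded bounding box,
            # which picks up the group itself plus nearby isolated marks.
            for y in range(max(0, top - radius), min(rows, bottom + radius + 1)):
                for x in range(max(0, left - radius), min(cols, right + radius + 1)):
                    if image[y][x] == color:
                        text.add((y, x))
    return text


# Toy example: a letter "i" (stem plus detached dot) and one distant stray pixel.
image = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]
# Assumption: the background (value 0) is smooth, everything else is not.
non_smooth = [[value != 0 for value in row] for row in image]
text = detect_text(image, non_smooth)
print(sorted(text))
```

In this toy image the four-pixel stem qualifies as text on its own, the detached dot is recovered by the neighborhood search, and the distant stray pixel of the same color is left as continuous-tone content. A real encoder would derive the smoothness mask from local gradients and tune the thresholds to the image resolution.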