The present invention reduces the amount of data of a document image
while maintaining the layout of the document image, and suppresses image
quality deterioration when the document image is reproduced.
An original document image is input as multivalued
image data (original image data) from an image input unit. The
multivalued image data is binarized by a binary image output unit. Then,
layout analysis is performed based on the binary image data. For text
areas, character recognition is performed on the binary image, and the
recognition data is output. For non-text areas, the binary image data is
used without further processing. For multivalued images, e.g.,
photographs or the like, the image data is compressed under appropriate
conditions and stored.
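
The per-area handling described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the names `Region`, `binarize`, `pack_bits`, and `encode_document` are hypothetical, the fixed threshold of 128 is an assumption, the layout analysis result is assumed to be supplied as a list of labelled rectangles, the character recognizer is assumed to be a caller-supplied function, and lossless `zlib` compression merely stands in for whatever compression condition is actually chosen for photographs.

```python
import zlib
from dataclasses import dataclass
from typing import Callable, Dict, List

Image = List[List[int]]  # rows of 8-bit gray levels, 0 (black) to 255 (white)


@dataclass
class Region:
    """A rectangle assumed to be produced by a separate layout analysis step."""
    kind: str  # "text", "non_text", or "multivalued" (hypothetical labels)
    top: int
    left: int
    height: int
    width: int


def binarize(image: Image, threshold: int = 128) -> Image:
    """Map each pixel to 1 (dark) or 0 (light) using a fixed threshold."""
    return [[1 if px < threshold else 0 for px in row] for row in image]


def crop(image: Image, r: Region) -> Image:
    """Cut the rectangle described by r out of the image."""
    return [row[r.left:r.left + r.width] for row in image[r.top:r.top + r.height]]


def pack_bits(binary: Image) -> bytes:
    """Pack binary pixels 8 per byte, MSB first; each row is padded to a whole byte."""
    out = bytearray()
    for row in binary:
        for i in range(0, len(row), 8):
            chunk = row[i:i + 8]
            byte = 0
            for bit in chunk:
                byte = (byte << 1) | bit
            out.append(byte << (8 - len(chunk)))
    return bytes(out)


def encode_document(
    image: Image,
    regions: List[Region],
    recognize: Callable[[Image], str],
    threshold: int = 128,
) -> List[Dict[str, object]]:
    """Store each region in the form suited to its kind, keeping its position."""
    binary = binarize(image, threshold)
    records = []
    for r in regions:
        if r.kind == "text":
            # Text areas: keep only the recognition result.
            payload = recognize(crop(binary, r)).encode("utf-8")
        elif r.kind == "non_text":
            # Non-text areas: keep the binary pixels as they are.
            payload = pack_bits(crop(binary, r))
        elif r.kind == "multivalued":
            # Photographs and the like: compress the original gray levels.
            raw = bytes(px for row in crop(image, r) for px in row)
            payload = zlib.compress(raw, 6)
        else:
            raise ValueError(f"unknown region kind: {r.kind}")
        records.append({"region": r, "payload": payload})
    return records
```

Because every record keeps the rectangle it came from, the layout can be rebuilt on reproduction by placing each decoded payload back at its original position.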