A method and computer program product are disclosed for recognizing italic
text in an optical character recognition system. A plurality of digital
images of alphanumeric characters are created from a block of text, each
comprising a plurality of rows of digital pixels. The digital images are
preprocessed such that each image is normalized to equal size and the
pixels within each image have a first value, such as "black," or a second
value, such as "white." The position of the left-most pixel with a first
value in each row is determined for each image. The position of each
left-most pixel with a first value is recorded as an ordered pair
including the row number and ordinal position within the row. A best-fit
line and a corresponding slope are calculated for the ordered pairs via
linear regression. The calculated slope is then compared to a
predetermined threshold to determine whether the text is italic.
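
The steps above can be illustrated with a minimal Python sketch. It is not the disclosed implementation. It assumes each preprocessed image is a list of rows in which 1 encodes the first value ("black") and 0 the second ("white"), that rows are numbered from the top, and that rows containing no black pixel are skipped. The function names, the threshold value of 0.2, and the sign convention (a right-leaning glyph produces a negative slope, so the negated slope is compared to the threshold) are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

BLACK = 1  # assumed encoding of the "first value"; 0 is the "second value"


def leftmost_pairs(image: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (row number, column position) of the left-most black pixel per row.

    Rows with no black pixel are skipped (an assumption; the abstract does
    not say how empty rows are handled).
    """
    pairs = []
    for row, pixels in enumerate(image):
        for col, value in enumerate(pixels):
            if value == BLACK:
                pairs.append((row, col))
                break
    return pairs


def regression_slope(pairs: List[Tuple[int, int]]) -> float:
    """Least-squares slope of column position regressed on row number."""
    n = len(pairs)
    if n < 2:
        return 0.0  # not enough points to fit a line
    mean_row = sum(r for r, _ in pairs) / n
    mean_col = sum(c for _, c in pairs) / n
    sxx = sum((r - mean_row) ** 2 for r, _ in pairs)
    sxy = sum((r - mean_row) * (c - mean_col) for r, c in pairs)
    return sxy / sxx if sxx else 0.0


def looks_italic(image: List[List[int]], threshold: float = 0.2) -> bool:
    """Compare the fitted slope to a threshold (hypothetical value and sign).

    With rows numbered from the top, the left edge of a right-leaning glyph
    moves left as the row number grows, so its slope is negative.
    """
    slope = regression_slope(leftmost_pairs(image))
    return -slope > threshold


# A glyph whose left edge shifts one pixel left per row, and an upright one.
slanted = [
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
]
upright = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
]
```

Calling `looks_italic(slanted)` and `looks_italic(upright)` applies the whole chain to each sample glyph. In practice the threshold would have to be tuned on real character images, since upright glyphs such as "A" or "V" also have sloped left edges.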