An apparatus and a concomitant method for detecting and recognizing text information
in captured imagery are disclosed. The present method transforms the image of the text to a
normalized coordinate system before performing optical character recognition (OCR), thereby yielding more robust
recognition performance. The present invention also combines OCR results from multiple
frames, in a manner that takes the best recognition results from each frame and
forms a single result that can be more accurate than the results from any of the
individual frames.
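
The multi-frame combination described above can be illustrated with a minimal sketch. It is not the disclosed method: it assumes each frame's OCR output is a list of (character, confidence) pairs already aligned position by position, and it fuses them by confidence-weighted voting, a stand-in technique chosen for illustration. The function name combine_frame_results and the sample confidence values are hypothetical.

```python
from collections import defaultdict


def combine_frame_results(frames):
    """Fuse per-frame OCR output into one string by confidence-weighted voting.

    `frames` is a list of per-frame results; each result is a list of
    (character, confidence) pairs, one pair per character position.
    Assumption: the frames are already aligned, so position i in every
    frame refers to the same character in the scene.
    """
    if not frames:
        return ""
    length = max(len(frame) for frame in frames)
    combined = []
    for pos in range(length):
        # Sum the confidence each candidate character receives at this position.
        scores = defaultdict(float)
        for frame in frames:
            if pos < len(frame):
                char, conf = frame[pos]
                scores[char] += conf
        # Keep the candidate with the highest total confidence.
        combined.append(max(scores, key=scores.get))
    return "".join(combined)


# Hypothetical OCR output for three frames of the same sign; each frame
# misreads a different character.
frame_1 = [("S", 0.9), ("T", 0.8), ("0", 0.4), ("P", 0.9)]
frame_2 = [("5", 0.3), ("T", 0.9), ("O", 0.7), ("P", 0.8)]
frame_3 = [("S", 0.8), ("I", 0.2), ("O", 0.6), ("P", 0.9)]

print(combine_frame_results([frame_1, frame_2, frame_3]))  # prints STOP
```

In this example no individual frame reads "STOP" correctly ("ST0P", "5TOP", "SIOP"), yet the combined result does, which is the sense in which a fused result can be more accurate than any single frame.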