A pronunciation information generating unit (150) generates a plurality
of letters or characters inferred from the results of letter/character
recognition of an image photographed by a CCD camera (20), a plurality
of kana readings inferred from those letters or characters, and the way
of pronunciation corresponding to each kana reading. The plural readings
obtained are then matched against the user's pronunciation acquired by a
microphone (23) to specify one kana reading and its way of pronunciation
(reading) from among the plural generated candidates.
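
The matching step described above can be illustrated with a minimal sketch. It is not the method of the disclosure: it assumes, purely for illustration, that each candidate's way of pronunciation and the user's utterance are both available as romanized phoneme strings, and it uses Levenshtein edit distance as a stand-in for acoustic matching. The names `Candidate`, `edit_distance`, and `select_reading`, and the sample data, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    characters: str      # text inferred from character recognition
    kana: str            # kana reading inferred from the characters
    pronunciation: str   # assumed romanized phoneme string (hypothetical format)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def select_reading(candidates, heard):
    """Return the candidate whose pronunciation is closest to the heard string."""
    if not candidates:
        raise ValueError("no candidates to match")
    return min(candidates, key=lambda c: edit_distance(c.pronunciation, heard))


# Hypothetical candidates generated from one recognized image region.
candidates = [
    Candidate("北", "きた", "kita"),
    Candidate("北", "ほく", "hoku"),
    Candidate("比", "ひ", "hi"),
]

# Hypothetical phoneme string derived from the microphone input.
best = select_reading(candidates, "kitta")
print(best.characters, best.kana, best.pronunciation)
```

With this sample data, the slightly misheard string "kitta" is one edit away from "kita", so the きた reading is selected. A real system would compare acoustic features rather than strings, but the structure of generating several candidates and letting the utterance pick one is the same.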