To obtain a query for use in information retrieval, a document is scanned.
The resulting text image data define an image of a segment of text in a
first language. Automatic recognition is then performed on at least part
of the text image data to obtain text code data including a series of
element codes. Each element code indicates an element that occurs in the
first language, and the series of element codes defines a set of
expressions that also occur in the first language. Automatic translation
is then performed on a version of the text code data to obtain translation
data indicating a set of counterpart expressions in a second language. The
counterpart expressions are used to automatically obtain query data
defining the query. The query can then be provided to an information
retrieval engine.
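
The stages after recognition can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method as actually implemented. The recognizer output is taken as an already-decoded character string, the lexicon is a tiny hypothetical English-to-French word list, and the Boolean AND/OR query syntax is an assumed format for the retrieval engine. All function names are invented for the example.

```python
import re
from typing import Dict, List

# Hypothetical toy lexicon (first language -> second-language counterparts).
# A real system would use a full bilingual dictionary or translation component.
LEXICON: Dict[str, List[str]] = {
    "information": ["information"],
    "retrieval": ["recherche", "extraction"],
}


def expressions_from_codes(element_codes: str) -> List[str]:
    """Group recognized element (character) codes into word-level expressions."""
    return re.findall(r"[a-z]+", element_codes.lower())


def translate(expressions: List[str],
              lexicon: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each known expression to its counterparts, skipping unknown words."""
    counterparts: Dict[str, List[str]] = {}
    for expr in expressions:
        if expr in lexicon and expr not in counterparts:
            counterparts[expr] = lexicon[expr]
    return counterparts


def build_query(counterparts: Dict[str, List[str]]) -> str:
    """OR together alternative translations; AND together distinct expressions."""
    groups = ["(" + " OR ".join(alts) + ")" for alts in counterparts.values()]
    return " AND ".join(groups)


def query_from_recognized_text(element_codes: str,
                               lexicon: Dict[str, List[str]] = LEXICON) -> str:
    """Run the assumed pipeline: expressions -> counterparts -> query string."""
    return build_query(translate(expressions_from_codes(element_codes), lexicon))


if __name__ == "__main__":
    print(query_from_recognized_text("Information retrieval"))
```

In this sketch, alternative translations of one expression are joined with OR so that translation ambiguity widens the query rather than forcing a single choice, while distinct expressions are joined with AND. Words missing from the lexicon are simply dropped, which is one possible policy among several.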