In an image processing apparatus 20, an input sequence 130 of video
images is processed to determine, in an efficient and accurate manner, the different
positions and orientations at which the images were recorded. A subset of the input
images is selected as keyframes to form a sequence 250 of keyframes. Respective
triples of keyframes having different, non-overlapping positions in the sequence
250 are selected and processed to determine the relative positions and orientations
at which the keyframes in each triple were recorded to form respective sets of
keyframes. The positions and orientations of keyframes between the keyframes in
each triple are then calculated to form expanded sets of keyframes 266, 276,
286. The sets are further expanded by calculating the positions and orientations
of keyframes which lie between sets in the sequence 250. The sets are merged
by calculating the relationship between the coordinate systems in which the positions
and orientations of the keyframes in each set are defined. During the processing,
the positions and orientations calculated for keyframes in a set are adjusted to
optimise the calculated solutions. This is performed in stages, considering at
each stage a different window 270 of the keyframes and performing processing
to minimise the error associated with the keyframes in the window. The window is
moved sequentially through the keyframes so that every keyframe in a set is considered
at least once.
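
The triple selection and the moving window described above can be illustrated with a minimal sketch. It covers only the index bookkeeping, not the calculation of positions and orientations or the error minimisation. The function names and the parameters `half_span`, `gap`, `window_size` and `step` are hypothetical and are not taken from the apparatus; the actual spacing of triples and the actual window size and step may differ.

```python
from typing import List, Tuple


def select_triples(num_keyframes: int, half_span: int, gap: int) -> List[Tuple[int, int, int]]:
    """Pick triples of keyframe indices that occupy non-overlapping
    stretches of the keyframe sequence (hypothetical spacing rule)."""
    if half_span <= 0 or gap < 0:
        raise ValueError("half_span must be positive and gap non-negative")
    triples = []
    start = 0
    # A triple spans indices start .. start + 2 * half_span inclusive.
    while start + 2 * half_span < num_keyframes:
        triples.append((start, start + half_span, start + 2 * half_span))
        # Leave `gap` keyframes between the end of one triple and the next.
        start += 2 * half_span + 1 + gap
    return triples


def window_positions(num_keyframes: int, window_size: int, step: int) -> List[Tuple[int, int]]:
    """Return (start, end) index ranges, end exclusive, for a window moved
    sequentially so that every keyframe falls in at least one window."""
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    if step > window_size:
        raise ValueError("a step larger than the window would skip keyframes")
    if num_keyframes <= 0:
        return []
    if num_keyframes <= window_size:
        return [(0, num_keyframes)]
    windows = []
    start = 0
    while start + window_size < num_keyframes:
        windows.append((start, start + window_size))
        start += step
    # Align the final window with the end of the set so no keyframe is missed.
    windows.append((num_keyframes - window_size, num_keyframes))
    return windows


if __name__ == "__main__":
    print(select_triples(12, half_span=2, gap=1))
    print(window_positions(10, window_size=4, step=2))
```

In this sketch, each window returned by `window_positions` would be handed to whatever error-minimisation routine is used, which adjusts only the keyframes inside that window before the window is moved on.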