A foreground extracting section extracts the foreground of each of the first to
the N-th frames, and a foreground-accumulated-image configuration section configures
a front accumulated image, obtained by overlapping the foregrounds of the first
to the N-th frames as viewed from the future side, and a rear accumulated image, obtained
by overlapping the same foregrounds as viewed from the past side. A learning section uses the front
accumulated image and the rear accumulated image to obtain prediction coefficients
used for predicting the foreground of each frame, and a multiplexer outputs the
prediction coefficients, the front accumulated image, and the rear accumulated
image as the result of encoding the foregrounds of the first to the N-th frames.
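
The accumulation and learning steps described above can be illustrated with a minimal sketch. It rests on several assumptions not stated in the text: the foregrounds are already aligned in a common coordinate system, a non-foreground pixel is marked with `None`, and the prediction is a two-tap linear combination of the co-located front and rear accumulated pixels fitted by least squares. The names `accumulate` and `learn_coefficients` are hypothetical, and the actual learning section may use a different predictor.

```python
from typing import List, Optional, Tuple

Image = List[List[Optional[float]]]  # None marks a pixel with no foreground


def accumulate(foregrounds: List[Image], from_future: bool) -> Image:
    """Overlay aligned foregrounds; the frame nearest the viewer ends up on top."""
    # Seen from the future, the N-th frame is nearest, so paint frames 1..N.
    # Seen from the past, the first frame is nearest, so paint frames N..1.
    order = foregrounds if from_future else foregrounds[::-1]
    height, width = len(order[0]), len(order[0][0])
    canvas: Image = [[None] * width for _ in range(height)]
    for frame in order:
        for y in range(height):
            for x in range(width):
                if frame[y][x] is not None:
                    canvas[y][x] = frame[y][x]
    return canvas


def learn_coefficients(frame: Image, front: Image, rear: Image) -> Tuple[float, float]:
    """Assumed model: frame ~ a * front + b * rear, fitted over foreground pixels."""
    sff = sfr = srr = sft = srt = 0.0
    for y, row in enumerate(frame):
        for x, target in enumerate(row):
            if target is None:
                continue  # only foreground pixels of this frame are predicted
            f, r = front[y][x], rear[y][x]
            sff += f * f
            sfr += f * r
            srr += r * r
            sft += f * target
            srt += r * target
    # Solve the 2x2 normal equations by Cramer's rule.
    det = sff * srr - sfr * sfr
    if abs(det) < 1e-12:
        raise ValueError("degenerate fit: front and rear pixels are collinear")
    a = (sft * srr - sfr * srt) / det
    b = (sff * srt - sfr * sft) / det
    return a, b


# Three one-row frames (N = 3) with a foreground that moves one pixel right.
frames: List[Image] = [
    [[10, 11, None, None]],
    [[None, 20, 21, None]],
    [[None, None, 30, 31]],
]
front = accumulate(frames, from_future=True)   # [[10, 20, 30, 31]]
rear = accumulate(frames, from_future=False)   # [[10, 11, 21, 31]]

# What a multiplexer would output under these assumptions:
# per-frame coefficients plus the two accumulated images.
encoded = {
    "coefficients": [learn_coefficients(f, front, rear) for f in frames],
    "front": front,
    "rear": rear,
}
```

In this sketch the N foregrounds are replaced by two images and N coefficient pairs; a decoder would rebuild each frame's foreground by applying its coefficients to the front and rear accumulated pixels.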