Video frames of original video data are sampled at an arbitrary time interval
and size to obtain thumbnail frames. As thumbnail information concerning
these frames, the frame number of the original video frame corresponding
to each thumbnail frame and the size of each thumbnail frame are described. Further,
scene change information on the original video frames or intra-frame frame change
value information is described together as additional information, yielding temporal/spatial
thumbnail meta-data. The meta-data is associated with the original video
data, and a database is constructed. The meta-data is then used to perform
typical frame display or variable speed reproduction of the original video data. In
this manner, even a device with low CPU capability can perform typical frame display
or variable speed reproduction of compressed and encoded video data
such as MPEG-2, so that the contents of the video can be checked and retrieval performed easily.
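
The flow described above can be illustrated with a minimal sketch. It is not the claimed implementation: the names ThumbnailEntry, ThumbnailMetadata, build_metadata, typical_frames and variable_speed are hypothetical, the scene change positions are assumed to be supplied by some earlier analysis step, and the thumbnail pixels themselves are omitted. A thumbnail is flagged when a scene change falls between the previous sample and the current one, and variable speed reproduction is approximated by showing every n-th thumbnail.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


@dataclass
class ThumbnailEntry:
    frame_number: int           # frame number in the original video
    width: int                  # thumbnail size
    height: int
    scene_change: bool = False  # additional information


@dataclass
class ThumbnailMetadata:
    video_id: str               # associates the meta-data with the original video
    entries: List[ThumbnailEntry] = field(default_factory=list)


def build_metadata(video_id: str, total_frames: int, interval: int,
                   size: Tuple[int, int],
                   scene_changes: Iterable[int] = ()) -> ThumbnailMetadata:
    """Sample every `interval`-th frame and record its number, size and flag."""
    changes = sorted(scene_changes)
    meta = ThumbnailMetadata(video_id)
    prev = -1
    for n in range(0, total_frames, interval):
        # Flag the thumbnail if a scene change occurred in (prev, n].
        flagged = any(prev < c <= n for c in changes)
        meta.entries.append(ThumbnailEntry(n, size[0], size[1], flagged))
        prev = n
    return meta


def typical_frames(meta: ThumbnailMetadata) -> List[int]:
    """Frame numbers of thumbnails that follow a scene change."""
    return [e.frame_number for e in meta.entries if e.scene_change]


def variable_speed(meta: ThumbnailMetadata, step: int) -> List[int]:
    """Frame numbers to show when playing back every `step`-th thumbnail."""
    return [e.frame_number for e in meta.entries[::step]]


# A dictionary stands in for the database keyed by the original video.
database = {}
meta = build_metadata("clip", total_frames=100, interval=10,
                      size=(80, 60), scene_changes=[0, 35, 70])
database[meta.video_id] = meta
```

Because only the stored thumbnails and their meta-data are consulted, neither function in this sketch needs to decode the compressed original video, which is the property that makes the approach suitable for a device with low CPU capability.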