System and method for partitioning a video into a series of semantic units
where each semantic unit relates to a generally complete thematic topic.
A computer-implemented method for partitioning a video into a series of
semantic units, wherein each semantic unit relates to a theme or a topic,
comprises dividing a video into a plurality of homogeneous segments,
analyzing audio and visual content of the video, extracting a plurality
of keywords from the speech content of each of the plurality of
homogeneous segments of the video, and detecting and merging a plurality
of groups of semantically related and temporally adjacent homogeneous
segments into a series of semantic units in accordance with the results
of both the audio and visual analysis and the keyword extraction. The
present invention can be applied to generate tables of contents as well
as index tables for videos, facilitating efficient topic-based video
searching and browsing.
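
The merging step described above can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes each homogeneous segment already carries a speech transcript, replaces the keyword extraction with a simple term-frequency count, and uses Jaccard overlap of keyword sets as a stand-in for semantic relatedness. The audio and visual analysis is not modeled. The names Segment, extract_keywords, jaccard, merge_segments, and the threshold value are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "with", "for", "on"}


@dataclass
class Segment:
    """One homogeneous segment with its speech transcript (assumed given)."""
    start: float
    end: float
    transcript: str


def extract_keywords(text, top_k=5):
    """Return the top_k most frequent non-stopword terms as a set."""
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    counts = Counter(w for w in words if w and w not in STOPWORDS)
    return {w for w, _ in counts.most_common(top_k)}


def jaccard(a, b):
    """Jaccard overlap of two keyword sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def merge_segments(segments, threshold=0.2):
    """Merge temporally adjacent segments whose keywords overlap enough.

    Segments must be in temporal order. Each returned unit is a dict with
    'start', 'end', and the accumulated 'keywords' of its segments.
    """
    units = []
    for seg in segments:
        kw = extract_keywords(seg.transcript)
        if units and jaccard(units[-1]["keywords"], kw) >= threshold:
            # Semantically related to the current unit: extend it.
            units[-1]["end"] = seg.end
            units[-1]["keywords"] |= kw
        else:
            # Topic change: start a new semantic unit.
            units.append({"start": seg.start, "end": seg.end, "keywords": set(kw)})
    return units
```

The resulting list of units, each with a time span and a keyword set, is the kind of structure from which a table of contents or an index table could be built. The threshold controls how readily adjacent segments are merged and would need tuning on real data.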