A method for summarizing a video first detects audio peaks in a sub-sampled audio
signal of the video. Then, motion activity in the video is extracted and filtered.
The filtered motion activity is quantized into a continuous stream of digital pulses,
one pulse for each frame. If the motion activity is greater than a predetermined
threshold, the pulse is one; otherwise, the pulse is zero. Each quantized pulse is
tested with respect to the timing of rising and falling edges. If the pulse meets
the condition of the test, then the pulse is selected as a candidate pulse related
to an interesting event in the video; otherwise, the pulse is discarded. The candidate
pulses are correlated in time with the audio peaks, and patterns between the pulses
and peaks are examined. The correlation patterns segment the video into uninteresting
and interesting portions, which can then be summarized.
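
The steps above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not state the edge-timing test or the correlation pattern, so the pulse-duration bounds (`min_len`, `max_len`) and the peak proximity window (`max_gap`) are hypothetical stand-ins, as are all function names. Audio peaks are assumed to be given already as frame indices.

```python
from typing import List, Tuple

Pulse = Tuple[int, int]  # (rising-edge frame, falling-edge frame, exclusive)


def quantize(activity: List[float], threshold: float) -> List[int]:
    """One binary value per frame: 1 if activity exceeds the threshold, else 0."""
    return [1 if a > threshold else 0 for a in activity]


def find_pulses(bits: List[int]) -> List[Pulse]:
    """Locate runs of ones and return their rising and falling edge positions."""
    pulses: List[Pulse] = []
    start = None
    for i, b in enumerate(bits):
        if b and start is None:
            start = i                      # rising edge
        elif not b and start is not None:
            pulses.append((start, i))      # falling edge
            start = None
    if start is not None:                  # run still high at the last frame
        pulses.append((start, len(bits)))
    return pulses


def select_candidates(pulses: List[Pulse], min_len: int, max_len: int) -> List[Pulse]:
    """Assumed edge-timing test: keep pulses whose duration lies in [min_len, max_len]."""
    return [(r, f) for r, f in pulses if min_len <= f - r <= max_len]


def correlate(candidates: List[Pulse], audio_peaks: List[int], max_gap: int) -> List[Pulse]:
    """Assumed correlation pattern: keep candidates with an audio peak
    no more than max_gap frames outside the pulse."""
    return [
        (r, f)
        for r, f in candidates
        if any(r - max_gap <= p <= f + max_gap for p in audio_peaks)
    ]


# Illustrative data only: per-frame motion activity and audio peak positions.
activity = [0.1, 0.9, 0.8, 0.2, 0.1, 0.7, 0.1, 0.9, 0.9, 0.9, 0.9, 0.2]
bits = quantize(activity, threshold=0.5)
pulses = find_pulses(bits)
candidates = select_candidates(pulses, min_len=2, max_len=10)
interesting = correlate(candidates, audio_peaks=[4, 20], max_gap=1)
print(interesting)  # [(1, 3)]
```

In this sketch, the frame ranges returned by `correlate` mark the interesting portions, and all remaining frames are treated as uninteresting. A real system would replace the duration and proximity rules with whatever edge-timing test and pulse-to-peak patterns the method actually specifies.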