A "music video parser" automatically detects and segments music videos in
a combined audio-video media stream. Detection and segmentation are
achieved by integrating shot boundary detection, video text detection,
and audio analysis to locate the temporal boundaries of each
music video in the media stream. In one embodiment, song identification
information, such as a song name, artist name, or album name,
is automatically extracted from the media stream using video
optical character recognition (OCR). This information is then used in
alternate embodiments for cataloging, indexing and selecting particular
music videos, and for maintaining statistics such as when particular
music videos were played and how many times each music video was
played.
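
The integration of the three cues can be illustrated with a minimal sketch. The text above does not say how the cues are combined, so the rule used here (keep a shot cut only if an audio change lies within a small tolerance of it, then label each resulting segment with the first on-screen caption found inside it) is an assumption made for illustration only. The function names, the `tolerance` parameter, and the sample timestamps are all hypothetical, and the detectors that would produce the input lists are not shown.

```python
from bisect import bisect_left


def fuse_boundaries(shot_cuts, audio_changes, tolerance=1.0):
    """Keep shot cuts lying within `tolerance` seconds of an audio change.

    Assumed fusion rule for illustration; not taken from the source text.
    """
    audio = sorted(audio_changes)
    fused = []
    for cut in sorted(shot_cuts):
        i = bisect_left(audio, cut)
        # The nearest audio change is either just before or just after the cut.
        neighbors = audio[max(i - 1, 0):i + 1]
        if any(abs(cut - a) <= tolerance for a in neighbors):
            fused.append(cut)
    return fused


def segment(boundaries, stream_end):
    """Turn boundary times into (start, end) pairs covering the stream."""
    inner = [b for b in boundaries if 0.0 < b < stream_end]
    points = [0.0] + inner + [stream_end]
    return list(zip(points[:-1], points[1:]))


def label_segments(segments, captions):
    """Attach the first OCR caption (time, text) found inside each segment."""
    ordered = sorted(captions)
    labeled = []
    for start, end in segments:
        text = next((t for when, t in ordered if start <= when < end), None)
        labeled.append((start, end, text))
    return labeled


# Hypothetical detector outputs, in seconds from the start of the stream.
shot_cuts = [12.0, 45.5, 180.2, 240.0, 395.7]
audio_changes = [180.0, 396.0, 500.0]
captions = [(5.0, "Song A"), (185.0, "Song B")]

boundaries = fuse_boundaries(shot_cuts, audio_changes)
videos = label_segments(segment(boundaries, 600.0), captions)
for start, end, title in videos:
    print(start, end, title)
```

Requiring agreement between a visual cut and an audio change is one simple way to suppress the many shot cuts that occur inside a single music video. A real system would likely weigh the cues differently.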