A method and system for detecting certain types of content, such as
advertisements, in a media stream by acoustic means. The method
uses two matching processes to detect and identify repeated content and
then finds the start and end boundaries of that content. The repeated
content serves as the basis for finding non-repeated content (such as
less frequently repeated advertisements), which is typically located
near repeated content and can be evaluated using Gaussian mixture models
(GMMs). The system that implements this method can be used for
advertisement detection and monitoring for traditional media, such as
television and radio, as well as for Internet-based media, such as
streaming video, streaming audio and podcasts. The system can also be
used to detect and identify copyrighted material in Internet traffic.
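
The repeat-detection and boundary-finding steps summarized above can be illustrated with a minimal sketch. It assumes the audio has already been reduced to one integer fingerprint per frame and that repeats match exactly; both are simplifications made here for illustration, and the function name find_repeats and the parameter seed_len are hypothetical. The arrangement shown (a coarse seed lookup followed by boundary extension) is one plausible reading, not necessarily the two matching processes the method itself uses.

```python
from collections import defaultdict


def find_repeats(frames, seed_len=3):
    """Return sorted (start_a, start_b, length) triples for repeated runs.

    frames: sequence of hashable per-frame fingerprints (assumed input).
    seed_len: number of consecutive frames that must match to seed a repeat.
    """
    # Coarse pass: index every window of seed_len consecutive fingerprints.
    index = defaultdict(list)
    for i in range(len(frames) - seed_len + 1):
        index[tuple(frames[i:i + seed_len])].append(i)

    repeats = set()
    for positions in index.values():
        for n, a in enumerate(positions):
            for b in positions[n + 1:]:
                if b < a + seed_len:
                    continue  # skip seeds that overlap each other

                # Fine pass: extend left to find the start boundary.
                start_a, start_b = a, b
                while (start_a > 0 and start_b > a + seed_len
                       and frames[start_a - 1] == frames[start_b - 1]):
                    start_a -= 1
                    start_b -= 1

                # Extend right to find the end boundary.
                end_a, end_b = a + seed_len, b + seed_len
                while (end_b < len(frames) and end_a < start_b
                       and frames[end_a] == frames[end_b]):
                    end_a += 1
                    end_b += 1

                repeats.add((start_a, start_b, end_a - start_a))
    return sorted(repeats)


# Toy stream: the run 10, 11, 12, 13 occurs twice.
stream = [1, 2, 3, 10, 11, 12, 13, 4, 5, 10, 11, 12, 13, 6, 7]
print(find_repeats(stream))
```

In a real system the exact-equality test would be replaced by a tolerance-based acoustic comparison, and the frames between or next to the detected repeats would be the candidates passed on for GMM evaluation.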