A method generates a representation of multimedia content by first
segmenting the multimedia content spatially and temporally to extract
objects. Feature extraction is applied to the objects to produce semantic
and syntactic attributes, relations, and a containment set of content
entities. The content entities are coded to produce directed acyclic
graphs of the content entities, where each directed acyclic graph
represents a particular interpretation of the multimedia content.
Attributes of each content entity are measured, and the measured
attributes are assigned to the corresponding content entity in the
directed acyclic graphs so that the multimedia content can be rank-ordered.
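
The representation described above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The class name `ContentEntity`, the helper functions, the example entities, and the `motion` attribute with its values are all hypothetical and chosen only for illustration. The sketch assumes that containment is modeled as parent-to-child edges and that rank ordering means sorting contained entities by one measured attribute. It relies on the standard-library `graphlib` module (Python 3.9 or later).

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class ContentEntity:
    """One node of the graph: an extracted object (hypothetical structure)."""
    name: str
    attributes: dict = field(default_factory=dict)  # measured attributes
    children: list = field(default_factory=list)    # contained entities


def topological_order(root):
    """Return entity names so that every container precedes its contents.

    Raises graphlib.CycleError if the containment edges are not acyclic.
    """
    graph = {}
    stack = [root]
    while stack:
        entity = stack.pop()
        if entity.name in graph:
            continue  # already visited (an entity may be shared)
        graph[entity.name] = {child.name for child in entity.children}
        stack.extend(entity.children)
    # TopologicalSorter expects node -> predecessors. Passing node -> children
    # yields children first, so the result is reversed to put containers first.
    order = list(TopologicalSorter(graph).static_order())
    order.reverse()
    return order


def rank_children(entity, key):
    """Return the contained entities sorted by a measured attribute, highest first."""
    return sorted(
        entity.children,
        key=lambda child: child.attributes.get(key, 0.0),
        reverse=True,
    )


def build_example():
    """Build a small illustrative graph; all names and values are made up."""
    ball = ContentEntity("ball", {"motion": 0.9})
    player = ContentEntity("player", {"motion": 0.6})
    pitch = ContentEntity("pitch", {"motion": 0.1})
    shot = ContentEntity("shot", {"motion": 0.5}, [player, ball, pitch])
    # "ball" is contained in both "video" and "shot", so this is a DAG, not a tree.
    return ContentEntity("video", {}, [shot, ball])
```

Calling `rank_children(shot, "motion")` on the `shot` entity of this example returns its contained entities ordered by the assumed `motion` measurement. A second graph built over the same entities with different edges would correspond to a different interpretation of the same content.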