A system trains a first model to identify portions of electronic media
streams based on first attributes of the electronic media streams and/or
trains a second model to identify labels for identified portions of the
electronic media streams based on at least one of second attributes of
the electronic media streams, feature information associated with the
electronic media streams, or information regarding other portions within
the electronic media streams. The system inputs an electronic media
stream into the first model, identifies, by the first model, portions of
the electronic media stream, inputs the electronic media stream and
information regarding the identified portions into the second model,
and/or determines, by the second model, human-recognizable labels for the
identified portions.
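
The two-stage flow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the described system: the stream is reduced to a list of per-frame numeric values standing in for an "attribute", the first model is a learned jump threshold, the second model is a nearest-centroid classifier, and the labels "quiet" and "loud" and all function names are hypothetical. The sketch also uses only a single attribute for labeling and omits feature information and information about neighboring portions.

```python
from statistics import mean


def train_boundary_model(streams, boundaries):
    """First model (assumed form): learn a threshold on frame-to-frame jumps."""
    at_boundary, elsewhere = [], []
    for values, cuts in zip(streams, boundaries):
        for i in range(1, len(values)):
            jump = abs(values[i] - values[i - 1])
            (at_boundary if i in cuts else elsewhere).append(jump)
    # Midpoint between the typical boundary jump and the typical non-boundary jump.
    return (mean(at_boundary) + mean(elsewhere)) / 2


def identify_portions(threshold, values):
    """Apply the first model: return (start, end) index pairs, end exclusive."""
    cuts = [i for i in range(1, len(values))
            if abs(values[i] - values[i - 1]) > threshold]
    edges = [0] + cuts + [len(values)]
    return list(zip(edges[:-1], edges[1:]))


def train_label_model(streams, labeled_portions):
    """Second model (assumed form): one centroid of the attribute per label."""
    by_label = {}
    for values, portions in zip(streams, labeled_portions):
        for start, end, label in portions:
            by_label.setdefault(label, []).append(mean(values[start:end]))
    return {label: mean(ms) for label, ms in by_label.items()}


def label_portions(centroids, values, portions):
    """Apply the second model: attach the nearest-centroid label to each portion."""
    labeled = []
    for start, end in portions:
        m = mean(values[start:end])
        label = min(centroids, key=lambda name: abs(centroids[name] - m))
        labeled.append((start, end, label))
    return labeled


# Hypothetical training data: one stream with known boundaries and labels.
train_stream = [0.1, 0.2, 0.1, 0.9, 0.8, 0.9, 0.2, 0.1]
threshold = train_boundary_model([train_stream], [{3, 6}])
centroids = train_label_model(
    [train_stream],
    [[(0, 3, "quiet"), (3, 6, "loud"), (6, 8, "quiet")]],
)

# Inference: the stream goes into the first model, then the stream plus the
# identified portions go into the second model.
new_stream = [0.2, 0.1, 0.8, 0.9, 0.9, 0.1]
portions = identify_portions(threshold, new_stream)
labeled = label_portions(centroids, new_stream, portions)
print(labeled)  # [(0, 2, 'quiet'), (2, 5, 'loud'), (5, 6, 'quiet')]
```

Keeping the two models separate mirrors the abstract's structure: either stage could be replaced independently, for example by a labeler that also looks at the labels of adjacent portions.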