A sound identification apparatus which reduces the chance of a drop in the
identification rate, including: a frame sound feature extraction unit
which extracts a sound feature per frame of an inputted audio signal; a
frame likelihood calculation unit which calculates a frame likelihood of
the sound feature in each frame, for each of a plurality of sound models;
a confidence measure judgment unit which judges a confidence measure
based on the frame likelihood; a cumulative likelihood output unit time
determination unit which determines a cumulative likelihood output unit
time based on the confidence measure; a cumulative likelihood calculation
unit which calculates a cumulative likelihood in which the frame
likelihoods of the frames included in the cumulative likelihood output
unit time are cumulated, for each sound model; a sound type candidate
judgment unit which determines, for each cumulative likelihood output
unit time, a sound type candidate, namely the sound type corresponding to
the sound model that has the maximum cumulative likelihood; a sound type
frequency calculation unit
which calculates the frequency of the sound type candidate; and a sound
type interval determination unit which determines the sound type of the
inputted audio signal and the interval of the sound type, based on the
frequency of the sound type candidate.
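
The chain of units described above can be illustrated with a minimal Python sketch. It is not the apparatus's actual implementation. Several details are assumptions made only for illustration: each sound model is a diagonal Gaussian, the confidence measure is the margin between the two best frame log-likelihoods, a confident frame selects a shorter cumulative likelihood output unit time than an unconfident one, and the frequency is counted over a fixed number of candidates. The names `frame_log_likelihoods`, `confidence`, `identify`, and all parameters are hypothetical. Feature extraction is omitted, so per-frame feature vectors are taken as input.

```python
import math
from collections import Counter


def frame_log_likelihoods(feature, models):
    """Frame log-likelihood of one feature vector under each sound model.

    Assumption: each model is a diagonal Gaussian given as (mean, variance).
    """
    result = {}
    for name, (mean, var) in models.items():
        ll = 0.0
        for x, m, v in zip(feature, mean, var):
            ll += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        result[name] = ll
    return result


def confidence(frame_ll):
    """Hypothetical confidence measure: best minus second-best log-likelihood."""
    top = sorted(frame_ll.values(), reverse=True)
    return top[0] - top[1]


def identify(features, models, short_unit=2, long_unit=4,
             threshold=1.0, window=3):
    """Return a list of (start_frame, end_frame, sound_type) intervals."""
    candidates = []
    cumulative = dict.fromkeys(models, 0.0)
    start = 0
    for i, feature in enumerate(features):
        frame_ll = frame_log_likelihoods(feature, models)
        # Assumption: a confident frame allows a shorter output unit time.
        unit = short_unit if confidence(frame_ll) >= threshold else long_unit
        for name, ll in frame_ll.items():
            cumulative[name] += ll
        last_frame = i == len(features) - 1
        if i - start + 1 >= unit or last_frame:
            # Sound type candidate: model with the maximum cumulative likelihood.
            best = max(cumulative, key=cumulative.get)
            candidates.append((start, i + 1, best))
            cumulative = dict.fromkeys(models, 0.0)
            start = i + 1

    intervals = []
    for w in range(0, len(candidates), window):
        group = candidates[w:w + window]
        # Frequency of each candidate in the group; the most frequent wins.
        winner = Counter(c[2] for c in group).most_common(1)[0][0]
        begin, end = group[0][0], group[-1][1]
        if intervals and intervals[-1][2] == winner:
            # Merge with the previous interval of the same sound type.
            intervals[-1] = (intervals[-1][0], end, winner)
        else:
            intervals.append((begin, end, winner))
    return intervals


# Two hypothetical one-dimensional models and twelve synthetic frames.
models = {"speech": ([0.0], [1.0]), "music": ([5.0], [1.0])}
features = [[0.1]] * 6 + [[4.9]] * 6
print(identify(features, models))
# prints [(0, 6, 'speech'), (6, 12, 'music')]
```

In this sketch, frames that fall clearly near one model produce a large margin, so candidates are emitted every two frames. Ambiguous frames stretch the unit to four frames, so more evidence is accumulated before a candidate is chosen.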