A method of identifying mismatches between acoustic data and a
corresponding transcription, the transcription being expressed in terms of
basic units, comprises the steps of: aligning the acoustic data with the
corresponding transcription; computing a probability score for each
instance of a basic unit in the acoustic data with respect to the
transcription; generating a distribution for each basic unit; tagging, as
mismatches, instances of a basic unit corresponding to a particular range
of scores in the distribution for each basic unit based on a threshold
value; and correcting the mismatches.
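
The distribution and tagging steps above can be illustrated with a minimal sketch. It assumes that alignment and scoring have already produced one (unit, score) pair per instance, that low scores indicate a likely mismatch, and that the threshold is expressed as a fraction of the lowest-scoring instances of each unit. None of these choices are specified in the text; the names tag_mismatches and tail_fraction are hypothetical, and the correction step is not shown.

```python
from collections import defaultdict


def tag_mismatches(instances, tail_fraction=0.1):
    """Return the indices of instances tagged as likely mismatches.

    instances: list of (unit, score) pairs, one per aligned instance of a
        basic unit (for example, a phone label and its log-probability).
    tail_fraction: assumed threshold, the fraction of each unit's
        lowest-scoring instances to tag.
    """
    # Group scores by basic unit to form a per-unit empirical distribution.
    by_unit = defaultdict(list)
    for idx, (unit, score) in enumerate(instances):
        by_unit[unit].append((score, idx))

    tagged = []
    for scored in by_unit.values():
        scored.sort()  # ascending: worst-scoring instances first
        # Size of the low tail for this unit (rounded down).
        n_tail = int(len(scored) * tail_fraction)
        tagged.extend(idx for _, idx in scored[:n_tail])
    return sorted(tagged)


if __name__ == "__main__":
    # Ten instances of unit "AA" (one with a much lower score) and two of "B".
    scores_aa = [-1.0, -1.2, -0.9, -9.5, -1.1, -1.0, -0.8, -1.3, -1.1, -0.95]
    instances = [("AA", s) for s in scores_aa] + [("B", -2.0), ("B", -2.1)]
    print(tag_mismatches(instances, tail_fraction=0.1))
```

Because the tail is computed separately for each unit, a unit whose scores are systematically lower than another's is not penalized as a whole; only instances that are outliers relative to their own unit's distribution are tagged.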