In accordance with the present invention, a speech training system is disclosed. It
uses a microphone to receive audible sounds input by a user into a first computing
device having a program with a database consisting of (i) digital representations
of known audible sounds and associated alphanumeric representations of the known
audible sounds, and (ii) digital representations of known audible sounds corresponding
to mispronunciations resulting from known classes of mispronounced words and phrases.
The method is performed by receiving the audible sounds in the form of the electrical
output of the microphone. A particular audible sound to be recognized is converted
into a digital representation of the audible sound. The digital representation
of the particular audible sound is then compared to the digital representations
of the known audible sounds to determine which of those known audible sounds is
most likely to match the particular audible sound. In response to a determination
of error corresponding to a known
type or instance of mispronunciation, the system presents an interactive training
program from the first computing device to the user to enable the user to correct the mispronunciation.
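
The compare-and-respond steps described above can be illustrated with a minimal sketch. It is not the disclosed implementation. It assumes each digital representation is a short numeric feature vector and that "most likely" means the smallest Euclidean distance. The names KnownSound, best_match, respond, and DATABASE, the feature values, and the error class label "th-fronting" are all hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class KnownSound:
    """One database entry (hypothetical layout, assumed for this sketch)."""
    features: Tuple[float, ...]        # assumed digital representation of the sound
    text: str                          # associated alphanumeric representation
    error_class: Optional[str] = None  # set only for known mispronunciations


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance, standing in for the unspecified comparison measure."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_match(sample: Sequence[float], database: Sequence[KnownSound]) -> KnownSound:
    """Return the known sound closest to the sample."""
    return min(database, key=lambda entry: distance(sample, entry.features))


def respond(sample: Sequence[float], database: Sequence[KnownSound]) -> str:
    """Recognize the sample, or select training if it matches a known mispronunciation."""
    match = best_match(sample, database)
    if match.error_class is not None:
        return f"training: {match.error_class} (target word: {match.text})"
    return f"recognized: {match.text}"


# Toy database: part (i) correct pronunciations, part (ii) a known mispronunciation.
DATABASE = [
    KnownSound((1.0, 0.0, 0.0), "three"),
    KnownSound((0.8, 0.3, 0.0), "three", error_class="th-fronting"),
    KnownSound((0.0, 1.0, 0.0), "ship"),
]

if __name__ == "__main__":
    print(respond((0.95, 0.05, 0.0), DATABASE))  # closest to the correct "three"
    print(respond((0.80, 0.28, 0.0), DATABASE))  # closest to the mispronounced entry
```

In this sketch the correct and mispronounced entries sit in one list so that a single nearest-match search both recognizes the sound and identifies the error class. An actual system would derive the representations from the microphone signal and would likely use a more robust comparison than a raw distance.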