A speech processing system (10) incorporates an analogue-to-digital
converter (16) to digitize input speech signals for Fourier transformation
to produce short-term spectral cross-sections. These cross-sections are
compared with one hundred and fifty reference patterns in a store (34),
the patterns having respective stored sets of formant frequencies assigned
thereto by a human expert. Six stored patterns most closely matching each
input cross-section are selected for further processing by dynamic
programming, which identifies the pattern that best matches the
input cross-section by using frequency-scale warping to achieve alignment.
The stored formant frequencies of the best-matching pattern are modified
by the frequency warping, and the results are used as formant frequency
estimates for the input cross-section. The frequencies are further refined
on the basis of the shape of the input cross-section near the chosen
formants. Formant amplitudes are produced from input cross-section
amplitudes at estimated formant frequencies. The formant frequencies and
amplitudes are used with a computer (25) to provide speech indications or
with a Hidden Markov Model word matcher (24) to provide word recognition.
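
The frequency-scale warping step can be illustrated with a minimal sketch. It is not the patented implementation: the function names (warp_align, warp_formants), the absolute-difference local cost, the three permitted path steps, and the toy eight-bin spectra are all assumptions chosen for illustration. The sketch aligns one reference pattern to an input cross-section by dynamic programming along the frequency axis, then carries the stored formant positions of the reference through the resulting warp.

```python
def warp_align(ref, inp):
    """Align two spectral cross-sections by dynamic programming over
    frequency bins. Returns (total_cost, path), where path is a list of
    (ref_bin, inp_bin) pairs. Local cost and step set are assumptions."""
    n, m = len(ref), len(inp)
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]
    cost[0][0] = abs(ref[0] - inp[0])
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = inf
            if i > 0:
                best = min(best, cost[i - 1][j])
            if j > 0:
                best = min(best, cost[i][j - 1])
            if i > 0 and j > 0:
                best = min(best, cost[i - 1][j - 1])
            cost[i][j] = best + abs(ref[i] - inp[j])

    # Backtrack from the top corner to recover the warping path.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((cost[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((cost[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return cost[n - 1][m - 1], path


def warp_formants(ref_formant_bins, path):
    """Map stored formant bins of the reference onto input bins by
    averaging the input bins that the path pairs with each one."""
    paired = {}
    for r, x in path:
        paired.setdefault(r, []).append(x)
    return [round(sum(paired[b]) / len(paired[b])) for b in ref_formant_bins]


# Toy data (hypothetical): reference peaks at bins 1 and 4,
# input peaks shifted to bins 2 and 6.
reference = [0, 5, 0, 0, 4, 0, 0, 0]
observed = [0, 0, 5, 0, 0, 0, 4, 0]
total_cost, path = warp_align(reference, observed)
estimates = warp_formants([1, 4], path)
print(total_cost, estimates)
```

In a fuller sketch, the same alignment cost would be computed for each of the six shortlisted patterns, and the pattern with the lowest cost would supply the formant frequencies to be warped.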