In a speech synthesis process, micro-segments are cut from acquired
waveform data using a window function. The obtained micro-segments are
re-arranged to implement a desired prosody, and superposed data is
generated by superposing the re-arranged micro-segments, so as to obtain
synthetic speech waveform data. A spectrum correction filter is formed
based on the acquired waveform data. At least one of the waveform data,
micro-segments, and superposed data is corrected using the spectrum
correction filter. In this way, the "blur" of the speech spectrum caused
by the window function applied to obtain the micro-segments is reduced,
and speech synthesis with high sound quality is realized.
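
The process described above can be illustrated with a minimal sketch. It makes several assumptions that are not stated in the abstract: a Hann window centered on each pitch mark, a spectrum correction filter of the formant-emphasis form A(z/beta)/A(z/gamma) derived from linear prediction (LPC) coefficients of the acquired waveform, and correction applied to the superposed data only. All function names, the parameters `half_len`, `order`, `beta`, and `gamma`, and their default values are hypothetical and chosen only for illustration.

```python
import numpy as np

def cut_micro_segments(wave, marks, half_len):
    """Cut Hann-windowed micro-segments centered on each pitch mark."""
    window = np.hanning(2 * half_len + 1)
    padded = np.pad(wave, half_len, mode="constant")
    # In padded coordinates, mark m sits at index m + half_len,
    # so its segment spans indices m .. m + 2 * half_len.
    return [padded[m:m + 2 * half_len + 1] * window for m in marks]

def overlap_add(segments, new_marks, half_len, out_len):
    """Superpose micro-segments re-arranged at new pitch-mark positions."""
    out = np.zeros(out_len + 2 * half_len)
    for seg, m in zip(segments, new_marks):
        out[m:m + 2 * half_len + 1] += seg
    return out[half_len:half_len + out_len]

def lpc_coefficients(wave, order):
    """Autocorrelation-method LPC (Levinson-Durbin); returns a[0..order], a[0] = 1."""
    n = len(wave)
    r = np.correlate(wave, wave, mode="full")[n - 1:n + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a

def spectrum_correction_filter(a, beta, gamma):
    """Assumed filter form A(z/beta) / A(z/gamma); beta < gamma emphasizes formants."""
    k = np.arange(len(a))
    return a * beta ** k, a * gamma ** k

def apply_filter(b, a, x):
    """Direct-form IIR filter: a[0]*y[n] = sum_k b[k]*x[n-k] - sum_{k>=1} a[k]*y[n-k]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y[n] = acc / a[0]
    return y

def synthesize(wave, marks, new_marks, half_len, order=8, beta=0.5, gamma=0.8):
    """Cut, re-arrange, superpose, then correct the superposed data."""
    a = lpc_coefficients(wave, order)               # filter formed from the acquired waveform
    b, a_den = spectrum_correction_filter(a, beta, gamma)
    segments = cut_micro_segments(wave, marks, half_len)
    out_len = new_marks[-1] + half_len + 1
    superposed = overlap_add(segments, new_marks, half_len, out_len)
    return apply_filter(b, a_den, superposed)       # correction step (assumed placement)
```

In this sketch, placing `new_marks` closer together than `marks` raises the pitch and spacing them farther apart lowers it. The abstract permits the correction to be applied to the waveform data or to the micro-segments instead; moving the `apply_filter` call would cover those variants.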