A speech synthesis apparatus, which can embed unchangeable additional
information into synthesized speech without degrading speech quality or
being restricted by frequency band, includes a language processing
unit which generates synthesized speech generation information necessary
for generating synthesized speech in accordance with a language string, a
prosody generating unit which generates prosody information of speech
based on the synthesized speech generation information, and a waveform
generating unit which synthesizes speech based on the prosody
information, in which the prosody generating unit embeds code information
as watermark information in the prosody information of a segment having a
predetermined time duration within a phoneme length including a phoneme
boundary.
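
As a minimal illustrative sketch of the embedding step, the following Python code perturbs a pitch (F0) contour inside a fixed-duration window around each phoneme boundary, one code bit per boundary. The abstract does not specify the modulation, so everything here is an assumption made for illustration: the names Prosody, embed_bits and extract_bits, the 5 ms frame period, the 20 ms window, the +/- 2 Hz shift, and the non-blind extraction that compares against the unmarked contour.

```python
from dataclasses import dataclass
from typing import List

FRAME_MS = 5  # assumed frame period of the prosody contour (hypothetical)


@dataclass
class Prosody:
    f0: List[float]        # fundamental frequency per frame, in Hz
    boundaries: List[int]  # frame indices of phoneme boundaries


def _window(boundary: int, n_frames: int, window_ms: int) -> range:
    """Frames of a fixed-duration segment centred on a phoneme boundary."""
    half = window_ms // FRAME_MS // 2
    return range(max(0, boundary - half), min(n_frames, boundary + half))


def embed_bits(prosody: Prosody, bits: List[int],
               window_ms: int = 20, delta_hz: float = 2.0) -> Prosody:
    """Return a copy of the prosody with one bit embedded per boundary.

    Assumed scheme: raise F0 by delta_hz for a 1 bit, lower it for a 0 bit.
    """
    f0 = list(prosody.f0)
    for bit, boundary in zip(bits, prosody.boundaries):
        shift = delta_hz if bit else -delta_hz
        for i in _window(boundary, len(f0), window_ms):
            f0[i] += shift
    return Prosody(f0, list(prosody.boundaries))


def extract_bits(marked: Prosody, reference: Prosody,
                 window_ms: int = 20) -> List[int]:
    """Recover bits by comparing the marked contour with the unmarked one."""
    bits = []
    for boundary in reference.boundaries:
        frames = _window(boundary, len(reference.f0), window_ms)
        diff = sum(marked.f0[i] - reference.f0[i] for i in frames)
        bits.append(1 if diff > 0 else 0)
    return bits


# Usage: a flat 120 Hz contour with four well-separated phoneme boundaries.
original = Prosody([120.0] * 40, [5, 15, 25, 35])
code = [1, 0, 1, 1]
marked = embed_bits(original, code)
recovered = extract_bits(marked, original)
```

In this sketch the windows must not overlap, and the number of bits should match the number of boundaries; a waveform generating unit would then synthesize speech from the marked contour rather than the original one.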