A method and computer-readable medium are provided that identify prosodic
word boundaries for a text. If the text is unsegmented, it is first
segmented into lexical words. The lexical words are then converted into
prosodic words using an annotated lexicon to divide large lexical words
into smaller words and a model to combine the lexical words and/or the
smaller words into larger prosodic words. The boundaries of the resulting
prosodic words are used to set the prosody of speech synthesized from the text.
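
The flow described above can be illustrated with a minimal sketch. It is not the claimed implementation, and every name in it is hypothetical: the whitespace segmenter stands in for a real word segmenter, ANNOTATED_LEXICON is a one-entry placeholder for the annotated lexicon, and should_merge is a toy length rule standing in for the model that combines words.

```python
from typing import Callable, Dict, List

# Hypothetical annotated lexicon: maps a large lexical word to the smaller
# words it should be divided into. A real lexicon would be far larger.
ANNOTATED_LEXICON: Dict[str, List[str]] = {
    "textnormalization": ["text", "normalization"],
}


def segment(text: str) -> List[str]:
    """Stand-in segmenter (assumption): split on whitespace."""
    return text.split()


def split_large_words(words: List[str], lexicon: Dict[str, List[str]]) -> List[str]:
    """Divide any lexical word listed in the lexicon into its smaller words."""
    result: List[str] = []
    for word in words:
        result.extend(lexicon.get(word, [word]))
    return result


def should_merge(group: List[str], nxt: str) -> bool:
    """Toy stand-in for the combining model (assumption): merge while the
    group stays at or under 6 characters in total."""
    return sum(len(w) for w in group) + len(nxt) <= 6


def merge_words(
    words: List[str], model: Callable[[List[str], str], bool]
) -> List[List[str]]:
    """Combine adjacent words into prosodic words when the model says so."""
    prosodic: List[List[str]] = []
    for word in words:
        if prosodic and model(prosodic[-1], word):
            prosodic[-1].append(word)
        else:
            prosodic.append([word])
    return prosodic


def to_prosodic_words(text: str) -> List[List[str]]:
    """Segment, split large words, then merge into prosodic words."""
    lexical = segment(text)
    smaller = split_large_words(lexical, ANNOTATED_LEXICON)
    return merge_words(smaller, should_merge)


if __name__ == "__main__":
    for prosodic_word in to_prosodic_words("it is a textnormalization test"):
        print(" ".join(prosodic_word))
```

In this sketch each inner list is one prosodic word, so the gaps between inner lists are the prosodic word boundaries that a synthesizer could use when setting prosody.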