On receiving a tagged file, that is, a tagged document, at step S1, the
document processing apparatus at step S2 derives attribute information
for read-out from the tags of the tagged file and embeds that attribute
information to generate a speech read-out file. Then, at step S3, the
document processing apparatus performs processing suited to a speech
synthesis engine, using the generated speech read-out file. At step S4,
the document processing apparatus performs processing in accordance with
operations made by the user through a user interface.
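
The flow of steps S1 to S4 can be illustrated by the following minimal sketch. It is only an illustration under stated assumptions and not the implementation of the apparatus: the tag names, the `[[...]]` marker syntax for embedded attribute information, the attribute values, and all function names are hypothetical, and the speech synthesis engine is represented only by the list of segments prepared for it.

```python
import re

# Hypothetical mapping from tag names to read-out attribute information.
TAG_ATTRIBUTES = {
    "title": "pause=long",
    "p": "pause=short",
    "em": "volume=up",
}


def receive_tagged_file(text):
    """S1: accept the tagged document (here simply a string)."""
    return text


def generate_readout_file(tagged):
    """S2: derive attribute information from the tags and embed it as markers."""
    def embed(match):
        closing, name = match.group(1), match.group(2)
        attr = TAG_ATTRIBUTES.get(name)
        if closing or attr is None:
            return ""  # closing and unknown tags carry no attribute in this sketch
        return "[[" + attr + "]]"

    return re.sub(r"<(/?)(\w+)[^>]*>", embed, tagged)


def prepare_for_engine(readout):
    """S3: convert the read-out file into (attribute, text) segments for an engine."""
    # With one capture group, re.split alternates: text, attr, text, attr, ...
    parts = re.split(r"\[\[(.*?)\]\]", readout)
    segments = []
    leading = parts[0].strip()
    if leading:
        segments.append((None, leading))
    for attr, text in zip(parts[1::2], parts[2::2]):
        segments.append((attr, text.strip()))
    return segments


def handle_user_operation(segments, operation):
    """S4: act on a user operation; only 'play' is modelled, anything else stops."""
    if operation == "play":
        return [text for _, text in segments if text]
    return []


tagged = receive_tagged_file("<title>News</title><p>Rain is <em>likely</em> today.</p>")
readout = generate_readout_file(tagged)
segments = prepare_for_engine(readout)
spoken = handle_user_operation(segments, "play")
```

In this sketch the speech read-out file is kept separate from the engine-specific segments, mirroring the separation between steps S2 and S3, so that a different engine would only require a different version of `prepare_for_engine`.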