In an information processing apparatus or method for presenting multimedia
data, a storage unit holds an object that appears in an image (for example,
a picture, characters, or symbols) and sound data associated with the object.
Metadata of the object is referred to in order to determine an output
parameter for the sound data associated with the object. A sound output
unit then outputs the sound data according to the output parameter, for
example at a sound volume derived from it.
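
The flow described above (stored object, metadata lookup, output parameter, sound output) can be sketched as follows. This is only an illustrative sketch, not the claimed implementation: the `area_ratio` metadata field, the mapping from it to a volume, and all class and function names are assumptions introduced for the example, and the "sound output unit" is reduced to a function that returns a description of what would be played.

```python
from dataclasses import dataclass


@dataclass
class ImageObject:
    """An object held by the storage unit (hypothetical structure)."""
    name: str
    metadata: dict  # assumed field: "area_ratio", fraction of the image covered
    sound: str      # identifier of the associated sound data


def determine_output_parameter(metadata: dict) -> dict:
    """Derive an output parameter from the object's metadata.

    Assumption: a larger object is played louder, so the volume is the
    area ratio clamped to the range [0.0, 1.0].
    """
    area_ratio = float(metadata.get("area_ratio", 0.0))
    volume = min(max(area_ratio, 0.0), 1.0)
    return {"volume": volume}


def output_sound(obj: ImageObject) -> str:
    """Stand-in for the sound output unit: report what would be played."""
    params = determine_output_parameter(obj.metadata)
    return f"play {obj.sound} at volume {params['volume']:.2f}"


if __name__ == "__main__":
    storage = [
        ImageObject("dog", {"area_ratio": 0.25}, "bark.wav"),
        ImageObject("caption", {"area_ratio": 0.05}, "narration.wav"),
    ]
    for obj in storage:
        print(output_sound(obj))
```

Other metadata (position, object type, and so on) could drive other output parameters in the same way; volume is used here only because it is the example the text itself names.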