A server receives an acquisition request for visual content, such as
desired text or image content, from a client terminal via a communication
network, and retrieves, from a database, data of the visual content
requested by the client terminal. Then, by analyzing the visual content,
by referring to an appropriate table or by other means, the server
determines music content or effect tone content to be imparted to the
requested visual content in accordance with the specific substance of
that visual content. Then, the server retrieves data of the
determined music content or effect tone content from a database, and
transmits the data of the visual content, along with the data of the
music content or effect tone content, to the client terminal. On the
basis of the data transmitted by the server, the client terminal can
both visually display the requested visual content and audibly
reproduce the music content or effect tone content. The client terminal
may select appropriate music content or effect tone content from its
own memory and associate the selected content with the visual content
received from the server.
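
The flow described above can be illustrated with a minimal sketch. It assumes a simple keyword table as the "appropriate table" and uses in-memory dictionaries in place of the databases. All names here (VISUAL_DB, AUDIO_DB, KEYWORD_TABLE, handle_request, client_select_audio) and all sample data are hypothetical and chosen only for illustration; the source does not specify how the analysis or the lookup is implemented.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical in-memory stand-ins for the visual-content and audio databases.
VISUAL_DB = {
    "page-1": "Waves rolled onto the beach under a summer sun.",
    "page-2": "Thunder shook the old house all night.",
}
AUDIO_DB = {
    "surf_theme": b"<surf music bytes>",
    "thunder_fx": b"<thunder effect bytes>",
}
# Hypothetical table relating the substance of the text to an audio id.
KEYWORD_TABLE = {
    "beach": "surf_theme",
    "thunder": "thunder_fx",
}


@dataclass
class Response:
    visual: str                # data of the requested visual content
    audio_id: Optional[str]    # id of the determined music/effect tone content
    audio: Optional[bytes]     # data of that content, if any was determined


def determine_audio(visual: str) -> Optional[str]:
    """Return the audio id of the first table keyword found in the text."""
    text = visual.lower()
    for keyword, audio_id in KEYWORD_TABLE.items():
        if keyword in text:
            return audio_id
    return None


def handle_request(content_id: str) -> Response:
    """Server side: retrieve visual content, determine audio, return both."""
    visual = VISUAL_DB[content_id]
    audio_id = determine_audio(visual)
    audio = AUDIO_DB.get(audio_id) if audio_id is not None else None
    return Response(visual, audio_id, audio)


def client_select_audio(visual: str, client_memory: dict) -> Optional[bytes]:
    """Client-side variant: select audio from the client's own memory."""
    audio_id = determine_audio(visual)
    return client_memory.get(audio_id) if audio_id is not None else None
```

In this sketch, handle_request("page-1") returns the beach text together with the data stored under "surf_theme". The client-side variant reuses the same table lookup but reads the audio data from a dictionary standing in for the client terminal's memory, so only the visual content needs to come from the server.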