An image processing apparatus comprises: a voice input portion; a memory
that stores, as voice data for voice assistance, the voices of a plurality
of users input through the voice input portion; a
selection portion that, when information is to be given by voice, selects
from the voice data stored in the memory the voice data to be applied for
a logged-in user; and a voice output portion that outputs voice corresponding to the
selected voice data.
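
The relationship between the memory, the selection portion, and the logged-in user can be illustrated with a minimal sketch. It is only one possible reading of the abstract, not the apparatus's actual implementation: the names `VoiceMemory`, `applied_for`, and `select_voice_data` are hypothetical, voice data is represented as raw bytes, and the fallback to the logged-in user's own recording is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class VoiceMemory:
    """Hypothetical stand-in for the memory that holds users' voice data."""
    # Speaker name -> recorded voice data (raw bytes, purely illustrative).
    recordings: Dict[str, bytes] = field(default_factory=dict)
    # Logged-in user -> speaker whose voice is applied (assumed lookup table).
    applied_for: Dict[str, str] = field(default_factory=dict)

    def store(self, speaker: str, data: bytes) -> None:
        """Store voice data captured by the voice input portion."""
        self.recordings[speaker] = data


def select_voice_data(memory: VoiceMemory, login_user: str) -> Optional[bytes]:
    """Select the voice data to apply for the logged-in user.

    Assumption: if no speaker is registered for the user, the user's own
    recording is tried; None is returned when nothing suitable is stored.
    """
    speaker = memory.applied_for.get(login_user, login_user)
    return memory.recordings.get(speaker)


memory = VoiceMemory()
memory.store("alice", b"alice-voice")
memory.store("bob", b"bob-voice")
memory.applied_for["alice"] = "bob"  # guidance for alice is given in bob's voice

selected = select_voice_data(memory, "alice")
```

In this sketch the voice output portion would simply play back `selected`; how the mapping from a logged-in user to a stored voice is decided is left open here because the abstract does not specify it.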