A method and system for mixing audio to convert a plurality of input voices into
a single output voice are described. The audio mixing system has a decoding
device, an audio mixing device and a frame package unit. The decoding device
partially decodes the input voices, which include a plurality of audio frames, to
acquire audio parameters of the input voices. The audio mixing device then selects
one audio frame of the input voices as a target frame according to the audio
parameters. The frame package unit then packages the target frame so that its
format is identical to the original format of the input voices.
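
The three stages described above (partial decoding, frame selection and packaging) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the description: the frame layout (a one-byte energy field followed by an opaque payload), the selection rule (the frame with the highest energy is chosen), and the names EncodedFrame, partial_decode, select_target, package and mix.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class EncodedFrame:
    """Hypothetical encoded frame: 1-byte energy field, then an opaque payload."""
    raw: bytes


def partial_decode(frame: EncodedFrame) -> int:
    """Read only the assumed energy parameter; the payload is never decoded."""
    return frame.raw[0]


def select_target(frames: Sequence[EncodedFrame]) -> EncodedFrame:
    """Pick one frame among the inputs. 'Highest energy wins' is an assumed rule."""
    return max(frames, key=partial_decode)


def package(frame: EncodedFrame) -> bytes:
    """Re-emit the chosen frame in the same assumed layout as the inputs."""
    energy = partial_decode(frame)
    payload = frame.raw[1:]
    return bytes([energy]) + payload


def mix(streams: Sequence[Sequence[EncodedFrame]]) -> List[bytes]:
    """For each frame interval, select one input frame and package it."""
    return [package(select_target(slot)) for slot in zip(*streams)]
```

In this sketch the output frames have the same layout as the input frames, and no payload is ever fully decoded or re-encoded, which is the property the partial decoding step is meant to provide.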