An auditory scene is synthesized by applying two or more different sets of one or more spatial parameters (e.g., an inter-ear level difference (ILD), inter-ear time difference (ITD), and/or head-related transfer function (HRTF)) to two or more different frequency bands of a combined audio signal, where each different frequency band is treated as if it corresponded to a single audio source in the auditory scene.
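The per-band processing can be pictured with a short sketch. The following Python fragment is only an illustration of the general idea, not an implementation taken from any embodiment described here: it assumes an STFT-based filter bank (via scipy.signal), and the function name synthesize_scene, the band_edges_hz parameter layout, and the ILD/ITD sign conventions are invented for clarity.

```python
# A minimal sketch, assuming an STFT filter bank; names, parameter layout,
# and sign conventions are illustrative assumptions, not prescribed steps.
import numpy as np
from scipy.signal import stft, istft

def synthesize_scene(combined, fs, band_edges_hz, ild_db, itd_s, nperseg=1024):
    """Apply one set of spatial parameters (ILD, ITD) to each frequency band
    of a single combined signal, producing left/right output channels."""
    f, t, X = stft(combined, fs=fs, nperseg=nperseg)
    L = np.zeros_like(X)
    R = np.zeros_like(X)
    for (lo, hi), ild, itd in zip(band_edges_hz, ild_db, itd_s):
        sel = (f >= lo) & (f < hi)       # STFT bins belonging to this band
        g = 10.0 ** (ild / 20.0)         # ILD in dB -> linear L/R amplitude ratio
        # Split the band between the ears according to the ILD, and realise the
        # ITD as a frequency-dependent phase shift (a delay) on the right channel;
        # bins not covered by band_edges_hz simply stay silent in this sketch.
        delay = np.exp(-2j * np.pi * f[sel, None] * itd)
        L[sel, :] = X[sel, :] * g / np.sqrt(1.0 + g**2)
        R[sel, :] = X[sel, :] * delay / np.sqrt(1.0 + g**2)
    _, left = istft(L, fs=fs, nperseg=nperseg)
    _, right = istft(R, fs=fs, nperseg=nperseg)
    return left, right
```

In this sketch each (lo, hi) band is treated as if it came from a single source positioned by its ILD/ITD pair; an HRTF-based variant would replace the gain and phase terms with per-band transfer functions.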
In one embodiment, the combined audio signal corresponds to the combination of two or more different source signals, where each different frequency band corresponds to a region of the combined audio signal in which one of the source signals dominates the others. In this embodiment, the different sets of spatial parameters are applied to synthesize an auditory scene comprising the different source signals.
In another embodiment, the combined audio signal corresponds to the combination of the left and right audio signals of a binaural signal corresponding to an input auditory scene. In this embodiment, the different sets of spatial parameters are applied to reconstruct the input auditory scene. In either case, transmission bandwidth requirements are reduced, because only a single audio signal needs to be transmitted to a receiver configured to synthesize or reconstruct the auditory scene.
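To make the bandwidth point concrete, a hypothetical analysis-side counterpart to the earlier sketch is given below. Again, this is only an illustration: the function name analyze_scene, the band-averaged ILD/ITD estimators, and the choice of summing the two channels to form the combined signal are assumptions made here, not details drawn from the embodiments above.

```python
# A hypothetical analysis-side sketch, paired with synthesize_scene() above;
# the estimators used are deliberately simple and purely illustrative.
import numpy as np
from scipy.signal import stft

def analyze_scene(left, right, fs, band_edges_hz, nperseg=1024):
    """From a binaural input, form the single combined signal to be
    transmitted and estimate one (ILD, ITD) pair per frequency band."""
    combined = 0.5 * (left + right)      # the only audio signal transmitted
    f, _, L = stft(left, fs=fs, nperseg=nperseg)
    _, _, R = stft(right, fs=fs, nperseg=nperseg)
    ild_db, itd_s = [], []
    for lo, hi in band_edges_hz:
        sel = (f >= lo) & (f < hi)
        # ILD: ratio of the band's energy in the two channels, in dB.
        p_left = np.sum(np.abs(L[sel, :]) ** 2) + 1e-12
        p_right = np.sum(np.abs(R[sel, :]) ** 2) + 1e-12
        ild_db.append(10.0 * np.log10(p_left / p_right))
        # ITD: mean inter-channel phase difference converted to a delay at the
        # band centre frequency (a crude estimate, for illustration only).
        cross = np.sum(L[sel, :] * np.conj(R[sel, :]))
        fc = 0.5 * (lo + hi)
        itd_s.append(float(np.angle(cross)) / (2.0 * np.pi * fc) if fc > 0 else 0.0)
    return combined, ild_db, itd_s
```

Only the combined signal plus the short per-band parameter lists would be sent; the receiver could then apply a routine like synthesize_scene above to reconstruct the scene, which is what reduces the number of transmitted audio signals to one.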