A crossbar is implemented within multimedia facilities of a processor to
perform vector permute operations, in which the bytes of a source operand
are reordered in the target output. The crossbar is then reused for other
instructions requiring multiplexing or shifting operations, particularly
those in which the size of additional multiplexers or the size and delay
of a barrel shifter is significant. A wide shift operation, for example,
may be performed with one-cycle latency by the crossbar and one additional
layer of multiplexers or a small barrel shifter. Sharing the crossbar in
this way improves the performance of the instructions that reuse it and
reduces the total area required by the multimedia facility within the
processor.
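
The reuse described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the claimed hardware: the 16-byte vector width, the big-endian byte order, the function names, and the split of a wide shift into a byte-granular crossbar pass followed by a residual shift of 0 to 7 bits are all hypothetical choices made for illustration.

```python
WIDTH = 16  # assumed vector width in bytes (128 bits); not stated in the text


def crossbar(src, sel):
    """Model a byte crossbar: output byte i is src[sel[i]], or 0 if sel[i] is None."""
    assert len(src) == WIDTH and len(sel) == WIDTH
    return bytes(0 if s is None else src[s] for s in sel)


def vector_permute(src, control):
    """Vector permute: the control vector drives the crossbar selects directly."""
    return crossbar(src, [c % WIDTH for c in control])


def wide_shift_left(src, nbits):
    """Shift a WIDTH-byte big-endian value left by nbits (0 <= nbits < 8 * WIDTH)."""
    byte_shift, bit_shift = divmod(nbits, 8)

    # Stage 1: the same crossbar performs the coarse, byte-granular shift.
    # Positions that would read past the end of the source are zero-filled.
    sel = [i + byte_shift if i + byte_shift < WIDTH else None for i in range(WIDTH)]
    coarse = crossbar(src, sel)

    # Stage 2: a small shifter (assumed) handles the remaining 0..7 bits by
    # combining each byte with its less significant neighbour.
    out = bytearray(WIDTH)
    for i in range(WIDTH):
        nxt = coarse[i + 1] if i + 1 < WIDTH else 0
        out[i] = ((coarse[i] << bit_shift) | (nxt >> (8 - bit_shift))) & 0xFF
    return bytes(out)
```

In this model both operations go through the single crossbar function. The only logic specific to the shift is the select generation and the narrow residual stage, which corresponds to the additional layer of multiplexers or small barrel shifter mentioned in the text.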