Methods and software are presented for processing data in a programmable
processor, involving (a) decoding instructions for execution using an
execution unit operable to execute instructions by partitioning data
stored in registers in a register file into multiple data elements, the
instructions selected from an instruction set that includes group
arithmetic instructions and group data handling instructions, (b) in
response to decoding different group data handling instructions,
executing group data handling operations that re-arrange data elements in
different ways, and (c) in response to decoding different group
arithmetic instructions, executing a plurality of different group
floating-point and group integer arithmetic operations, each of which
operates arithmetically on the multiple data elements stored in registers
in the register file to produce a catenated result that is returned to a
register in the register file, wherein the catenated result comprises a
plurality of individual results.
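
As an illustration only, the partition, operate, and catenate pattern described above might be sketched as follows. The 128-bit register width, the helper names (split, catenate, group_add, group_reverse), and the choice of wraparound addition and element reversal as the example operations are assumptions made for this sketch; the abstract does not specify them.

```python
REG_BITS = 128  # assumed register width; the abstract does not fix one


def split(reg: int, elem_bits: int) -> list[int]:
    """Partition a register value into elements, least significant first."""
    mask = (1 << elem_bits) - 1
    return [(reg >> shift) & mask for shift in range(0, REG_BITS, elem_bits)]


def catenate(elems: list[int], elem_bits: int) -> int:
    """Pack elements (least significant first) into one register value."""
    mask = (1 << elem_bits) - 1
    reg = 0
    for i, elem in enumerate(elems):
        # Masking truncates each individual result to its own element width.
        reg |= (elem & mask) << (i * elem_bits)
    return reg


def group_add(a: int, b: int, elem_bits: int) -> int:
    """Hypothetical group integer add: element-wise, wrapping per element."""
    sums = [x + y for x, y in zip(split(a, elem_bits), split(b, elem_bits))]
    return catenate(sums, elem_bits)  # carries never cross element boundaries


def group_reverse(a: int, elem_bits: int) -> int:
    """Hypothetical group data handling op: reverse the element order."""
    return catenate(split(a, elem_bits)[::-1], elem_bits)
```

With 32-bit elements, adding registers holding (1, 2, 3, 4) and (10, 20, 30, 0xFFFFFFFF) yields the catenated result (11, 22, 33, 3): the last element wraps within its own lane without disturbing its neighbours. Passing a different elem_bits value repartitions the same register into a different number of elements.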