A method and apparatus for including in a processor instructions for
performing multiply-add operations on packed data. In one embodiment, a
processor is coupled to a memory. The memory has stored therein a first
packed data and a second packed data. The processor performs operations
on data elements in said first packed data and said second packed data to
generate a third packed data in response to receiving an instruction. At
least two of the data elements in this third packed data store the
results of performing multiply-add operations on data elements in the
first and second packed data.
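
The operation described above can be illustrated with a minimal scalar sketch in C. The abstract does not fix element widths or counts, so the layout here is an assumption chosen for illustration: each source operand holds four signed 16-bit data elements, and the result holds two signed 32-bit data elements, each being the sum of the products of one adjacent pair of source elements. The names packed16x4, packed32x2, and packed_multiply_add are hypothetical and do not come from the source.

```c
#include <stdint.h>

/* Assumed layout (not specified in the source): four signed 16-bit
 * data elements per source operand. */
typedef struct { int16_t e[4]; } packed16x4;

/* Assumed layout: two signed 32-bit data elements in the result. */
typedef struct { int32_t e[2]; } packed32x2;

/* Scalar emulation of one possible packed multiply-add:
 *   r.e[0] = a.e[0]*b.e[0] + a.e[1]*b.e[1]
 *   r.e[1] = a.e[2]*b.e[2] + a.e[3]*b.e[3]
 * A hardware implementation would compute both result elements in
 * response to a single instruction; this loop only models the
 * arithmetic. */
static packed32x2 packed_multiply_add(packed16x4 a, packed16x4 b)
{
    packed32x2 r;
    for (int i = 0; i < 2; i++) {
        /* Each 16x16-bit product always fits in 32 bits. */
        int32_t p0 = (int32_t)a.e[2 * i]     * (int32_t)b.e[2 * i];
        int32_t p1 = (int32_t)a.e[2 * i + 1] * (int32_t)b.e[2 * i + 1];
        /* Add as unsigned so the single corner case where all four
         * inputs are -32768 wraps instead of invoking signed-overflow
         * undefined behavior. Converting the wrapped value back to
         * int32_t is implementation-defined in C. */
        r.e[i] = (int32_t)((uint32_t)p0 + (uint32_t)p1);
    }
    return r;
}
```

Under these assumptions, calling packed_multiply_add on first packed data {1, 2, 3, 4} and second packed data {5, 6, 7, 8} would yield third packed data {17, 53}, since 1*5 + 2*6 = 17 and 3*7 + 4*8 = 53.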