The present invention provides a vector floating point unit (FPU) comprising
a product-terms bus, a summation bus, a plurality of FIFO (first in first out)
registers, a crossbar operand multiplexor, a floating point multiplier,
and a floating point adder. The floating point multiplier and the floating point
adder are disposed between the crossbar operand multiplexor and the product-terms
and summation buses, and are arranged in parallel with each other. The invention also provides
a configuration register and a command register, which give the architecture flexibility
and allow performance to be fine-tuned for a particular application.
The invention performs the multiplication operation and the addition operation
in a pipelined fashion. Once the pipeline is filled, the invention outputs one
multiplication output and one addition output at each clock cycle. The invention
reduces the latency of the pipelined operation and improves the overall system
performance by separating the floating point multiplier from the floating point
adder so that the multiplication operation can be executed separately and independently
of the addition operation.
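
The pipelined behavior described above can be illustrated with a small cycle-level model. This is only a sketch under stated assumptions, not the claimed hardware: the `Pipeline` class, the `run` function, and the latencies of three cycles for the multiplier and two cycles for the adder are hypothetical, and the FIFO registers, crossbar operand multiplexor, buses, configuration register, and command register are not modeled. It shows two independent units that each accept one operand pair per cycle and, once filled, each emit one result per cycle.

```python
from collections import deque


class Pipeline:
    """Fixed-latency pipeline that accepts one operand pair per clock cycle."""

    def __init__(self, op, latency):
        self.op = op
        # One slot per stage; None marks an empty stage (a bubble).
        self.stages = deque([None] * latency, maxlen=latency)

    def tick(self, operands=None):
        """Advance one cycle and return the value leaving the last stage."""
        leaving = self.stages[-1]
        entering = None if operands is None else self.op(*operands)
        # With maxlen set, appendleft discards the oldest (rightmost) slot.
        self.stages.appendleft(entering)
        return leaving


def run(mul_inputs, add_inputs, mul_latency=3, add_latency=2):
    """Drive both units side by side; the latencies are assumed values."""
    mul = Pipeline(lambda a, b: a * b, mul_latency)
    add = Pipeline(lambda a, b: a + b, add_latency)
    mul_feed, add_feed = deque(mul_inputs), deque(add_inputs)
    total_cycles = max(len(mul_inputs) + mul_latency,
                       len(add_inputs) + add_latency)
    trace = []
    for cycle in range(total_cycles):
        # Each unit is clocked independently of the other's operands.
        product = mul.tick(mul_feed.popleft() if mul_feed else None)
        total = add.tick(add_feed.popleft() if add_feed else None)
        trace.append((cycle, product, total))
    return trace


if __name__ == "__main__":
    pairs = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
    for cycle, product, total in run(pairs, pairs):
        print(cycle, product, total)
```

In this model the adder results begin to appear before the multiplier results because neither unit waits on the other, and in the cycles where both pipelines are full the trace contains one product and one sum together.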