A method and apparatus for reducing logic activity in a microprocessor which examines
every instruction before it is executed and determines in advance the minimum appropriate
datapath width (in byte or half-word quantities) necessary to accurately execute
the operation. Achieving this requires two major enhancements to a traditional
microprocessor pipeline. First, extra logic for determining an operation's effective
bit width (the width detection logic WD, potentially an extra pipeline stage) is
introduced between the Decode and Execution stages. Second, the traditional
Execution stage architecture (including a register file RF and an arithmetic logic
unit ALU), instead of being organized as one continuous 32-bit unit, is organized
as a collection of multiple slices, where a slice can be of 8-bit (byte) or
16-bit (half-word) granularity. Each slice can operate independently
of every other slice and includes a portion of the register file, functional unit
and cache memory. Concatenating multiple slices creates
the required full-width processor.
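
The following is a minimal software sketch of the idea, not the disclosed circuit. It assumes unsigned operands, 8-bit slices, a 32-bit datapath and an ADD operation only. The names effective_slices, detect_width and sliced_add are hypothetical, and the rule of reserving one extra slice for a possible carry is an illustrative assumption rather than the method described above.

```python
SLICE_BITS = 8                      # assumed slice granularity (one byte)
NUM_SLICES = 4                      # four slices form the 32-bit datapath
SLICE_MASK = (1 << SLICE_BITS) - 1
WORD_MASK = (1 << (SLICE_BITS * NUM_SLICES)) - 1


def effective_slices(value):
    """Count the low-order slices needed to hold an unsigned value."""
    value &= WORD_MASK
    n = 1
    while n < NUM_SLICES and (value >> (n * SLICE_BITS)) != 0:
        n += 1
    return n


def detect_width(a, b):
    """Decide, before execution, how many slices an ADD must activate."""
    a &= WORD_MASK
    b &= WORD_MASK
    width = max(effective_slices(a), effective_slices(b))
    # If either operand has the top bit of the highest needed slice set,
    # the sum may carry out of it, so conservatively reserve one more slice.
    top_bit = 1 << (width * SLICE_BITS - 1)
    if width < NUM_SLICES and ((a | b) & top_bit):
        width += 1
    return width


def sliced_add(a, b):
    """Add using only the detected slices; return (result, active slices)."""
    a &= WORD_MASK
    b &= WORD_MASK
    width = detect_width(a, b)
    result = 0
    carry = 0
    for i in range(width):          # slices at index >= width stay idle
        shift = i * SLICE_BITS
        s = ((a >> shift) & SLICE_MASK) + ((b >> shift) & SLICE_MASK) + carry
        result |= (s & SLICE_MASK) << shift
        carry = s >> SLICE_BITS     # carry ripples into the next slice
    return result, width


if __name__ == "__main__":
    for a, b in [(0x12, 0x34), (0xFF, 0x01), (0xFFFFFFFF, 0x01)]:
        result, active = sliced_add(a, b)
        print(f"{a:#x} + {b:#x} = {result:#x} using {active} slice(s)")
```

In this sketch a narrow addition such as 0x12 + 0x34 activates a single slice, while operands that fill the full word activate all four, with the result wrapping modulo 2^32 as in a conventional 32-bit adder.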