A cache memory which loads two memory values into two cache lines by
receiving separate portions of a first requested memory value from a
first data bus over a first time span of successive clock cycles and
receiving separate portions of a second requested memory value from a
second data bus over a second time span of successive clock cycles which
overlaps with the first time span. In the illustrative embodiment, a first
input line is used for loading both a first byte array of the first cache
line and a first byte array of the second cache line, a second input line
is used for loading both a second byte array of the first cache line and
a second byte array of the second cache line, and the transmission of the
separate portions of the first and second memory values is interleaved
between the first and second data busses. The first data bus can be one
of a plurality of data busses in a first data bus set, and the second
data bus can be one of a plurality of data busses in a second data bus
set. Two address busses (one for each data bus set) are used to receive
successive address tags that identify which portions of the requested
memory values are being received from each data bus set. For example, the
requested memory values may be 32 bytes each, with the separate portions
of each requested memory value received over four successive cycles, one
8-byte portion per cycle. The cache lines
are spread across different cache sectors of the cache memory, wherein
the cache sectors have different output latencies, and the separate
portions of a given requested memory value are loaded sequentially into
the corresponding cache sectors based on their respective output
latencies. Merge flow circuits responsive to a cache controller receive
the portions of a requested memory value and input those bytes into the
corresponding cache sector.
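
The overlapped loading described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the circuit itself: the names `Beat`, `schedule`, `load_two_lines`, and the `offset` parameter are hypothetical, each data bus is modeled as a list of per-cycle transfers, and the address tag is assumed to be simply the index of the 8-byte portion within the 32-byte value. Sector output latencies, the shared input lines, and the merge flow circuits are not modeled.

```python
from dataclasses import dataclass
from typing import List, Tuple

LINE_BYTES = 32                    # size of each requested memory value (from the example)
BEAT_BYTES = 8                     # portion received on a bus per clock cycle
BEATS = LINE_BYTES // BEAT_BYTES   # four successive cycles per value


@dataclass
class Beat:
    cycle: int    # clock cycle on which this portion arrives
    bus: int      # 0 = first data bus, 1 = second data bus
    tag: int      # assumed address tag: index of the 8-byte portion (0..3)
    data: bytes   # the 8-byte portion itself


def schedule(value: bytes, bus: int, start_cycle: int) -> List[Beat]:
    """Split one 32-byte value into four successive 8-byte beats on one bus."""
    assert len(value) == LINE_BYTES
    return [
        Beat(start_cycle + i, bus, i, value[i * BEAT_BYTES:(i + 1) * BEAT_BYTES])
        for i in range(BEATS)
    ]


def load_two_lines(value_a: bytes, value_b: bytes,
                   offset: int = 1) -> Tuple[bytes, bytes, int]:
    """Load two values into two cache lines over overlapping time spans.

    `offset` (hypothetical) is how many cycles the second transfer starts
    after the first; any offset below BEATS makes the two spans overlap.
    Returns both loaded lines and the total number of cycles used.
    """
    lines = [bytearray(LINE_BYTES), bytearray(LINE_BYTES)]
    beats = schedule(value_a, 0, 0) + schedule(value_b, 1, offset)
    # Process beats in cycle order, the way the cache would observe them.
    for beat in sorted(beats, key=lambda b: (b.cycle, b.bus)):
        start = beat.tag * BEAT_BYTES          # the tag selects the slot in the line
        lines[beat.bus][start:start + BEAT_BYTES] = beat.data
    total_cycles = max(b.cycle for b in beats) + 1
    return bytes(lines[0]), bytes(lines[1]), total_cycles
```

In this model, loading two 32-byte values with the second transfer starting one cycle after the first completes in fewer cycles than the eight that two back-to-back four-cycle transfers would need, which is the benefit of overlapping the two time spans.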