A processor includes at least one instruction execution unit that executes
store instructions to obtain store operations and a store queue coupled
to the instruction execution unit. The store queue includes a queue entry
in which the store queue gathers multiple store operations during a store
gathering window to obtain a data portion of a write transaction directed
to lower-level memory. In addition, the store queue includes dispatch
logic that varies a size of the store gathering window to optimize store
performance for different store behaviors and workloads.
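
The gathering behavior described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware: the names `StoreQueue`, `QueueEntry`, and `LINE_BYTES`, the single-entry queue, the cycle-counted window, and in particular the widen-on-success / narrow-on-failure policy are all hypothetical choices made for illustration. The abstract does not specify how the dispatch logic chooses the window size.

```python
from dataclasses import dataclass, field

LINE_BYTES = 128  # assumed size of one write transaction's data portion


@dataclass
class QueueEntry:
    line_addr: int                             # aligned base address of the entry
    data: dict = field(default_factory=dict)   # byte offset -> stored value
    age: int = 0                               # cycles since the entry was allocated
    gathered: int = 0                          # store operations merged into the entry


class StoreQueue:
    """Single-entry store queue with an adjustable gathering window (in cycles)."""

    def __init__(self, window=4, min_window=1, max_window=16):
        self.window = window
        self.min_window = min_window
        self.max_window = max_window
        self.entry = None
        self.dispatched = []  # (line_addr, data) pairs sent toward lower-level memory

    def store(self, addr, value):
        """Gather a one-byte store operation into the open entry."""
        line = (addr // LINE_BYTES) * LINE_BYTES
        # A store to a different line closes the current entry early.
        if self.entry is not None and self.entry.line_addr != line:
            self._dispatch()
        if self.entry is None:
            self.entry = QueueEntry(line)
        self.entry.data[addr - line] = value
        self.entry.gathered += 1

    def tick(self):
        """Advance one cycle; dispatch once the gathering window has elapsed."""
        if self.entry is None:
            return
        self.entry.age += 1
        if self.entry.age >= self.window:
            self._dispatch()

    def _dispatch(self):
        entry = self.entry
        self.dispatched.append((entry.line_addr, dict(entry.data)))
        # Hypothetical policy: widen the window when gathering paid off,
        # narrow it when the entry held only a single store.
        if entry.gathered > 1:
            self.window = min(self.window * 2, self.max_window)
        else:
            self.window = max(self.window // 2, self.min_window)
        self.entry = None
```

In this model, two stores to the same line followed by four calls to `tick()` produce one dispatched write transaction carrying both bytes, and the window grows from 4 to 8 cycles. A later isolated store shrinks the window back, so a workload with little gathering opportunity is not held in the queue longer than necessary.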