In response to multiple data transfer requests from an application, a data
definition (DD) chain is generated. The DD chain is divided into multiple
DD sub-blocks by determining a bandwidth of channels (BOC) and whether
the BOC is less than the size of the DD chain. If so, the DD chain is
divided by the number of available DMA engines. If not, the DD chain is
divided by an optimum
atomic transfer unit (OATU). If the division yields a remainder, the
remainder is assigned to the last DD sub-block as follows. If the
remainder is less than a predetermined value, the size of the last DD
sub-block is set to the OATU plus the remainder. Otherwise, the size of
the last DD sub-block is set to the remainder. The DD sub-blocks are
subsequently loaded into a
set of available DMA engines. Each of the available DMA engines performs
data transfers on a corresponding DD sub-block until the entire DD chain
has been completed.
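
The division and loading steps described above can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the claimed implementation: the function names `split_dd_chain` and `assign_to_engines`, the parameter names, and the round-robin loading policy are all hypothetical. It further assumes that the BOC and the DD chain size are expressed in the same units (for example, bytes), and that in the engine-division case the per-engine share plays the role of the unit size when a small remainder is folded into the last DD sub-block.

```python
def split_dd_chain(chain_size, boc, num_engines, oatu, threshold):
    """Return a list of DD sub-block sizes whose sum equals chain_size.

    All names are hypothetical. boc and chain_size are assumed to be
    in the same units; threshold is the "predetermined value".
    """
    if chain_size <= 0 or num_engines <= 0 or oatu <= 0:
        raise ValueError("chain_size, num_engines and oatu must be positive")

    if boc < chain_size:
        # BOC is less than the DD chain: divide by the number of engines.
        unit, remainder = divmod(chain_size, num_engines)
        count = num_engines if unit > 0 else 0
    else:
        # Otherwise: divide by the optimum atomic transfer unit (OATU).
        unit = oatu
        count, remainder = divmod(chain_size, oatu)

    blocks = [unit] * count

    if remainder:
        if blocks and remainder < threshold:
            # Small remainder: last sub-block becomes unit + remainder.
            blocks[-1] += remainder
        else:
            # Otherwise the last sub-block is sized to the remainder.
            blocks.append(remainder)
    return blocks


def assign_to_engines(blocks, num_engines):
    """Distribute sub-blocks over engines (round-robin is an assumption)."""
    queues = [[] for _ in range(num_engines)]
    for index, size in enumerate(blocks):
        queues[index % num_engines].append(size)
    return queues


if __name__ == "__main__":
    sizes = split_dd_chain(1003, boc=512, num_engines=4, oatu=128, threshold=16)
    print(sizes)
    print(assign_to_engines(sizes, 4))
```

In this sketch every byte of the chain ends up in exactly one sub-block, so the sub-block sizes always sum to the original chain size regardless of which branch is taken.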