A method, apparatus, and program product checkpoint an application in a
parallel computing system of the type that includes a plurality of hybrid
nodes. Each hybrid node includes a host element and a plurality of
accelerator elements. Each host element may include at least one
multithreaded processor, and each accelerator element may include at
least one multi-element processor. In a first hybrid node from among the
plurality of hybrid nodes, checkpointing the application includes
executing at least a portion of the application in the host element and
in at least one accelerator element and, in response to receiving a command
to checkpoint the application, checkpointing the host element separately
from the at least one accelerator element.
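
The separation described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the names `Element`, `HybridNode`, `run_step`, `snapshot`, and `checkpoint` are hypothetical, the application work is simulated by a running sum, and in-memory dictionaries stand in for whatever checkpoint storage a real system would use. The sketch only shows a host element and its accelerator elements being checkpointed as separate records in response to one checkpoint command.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical stand-in for a host element or an accelerator element."""
    name: str
    state: dict = field(default_factory=dict)

    def run_step(self, step: int) -> None:
        # Simulate executing a portion of the application on this element.
        self.state["last_step"] = step
        self.state["partial_sum"] = self.state.get("partial_sum", 0) + step

    def snapshot(self) -> dict:
        # Deep copy so later execution cannot alter the saved checkpoint.
        return copy.deepcopy(self.state)


@dataclass
class HybridNode:
    """Hypothetical hybrid node: one host element plus accelerator elements."""
    host: Element
    accelerators: list

    def execute(self, step: int) -> None:
        # The application runs in the host and in each accelerator.
        self.host.run_step(step)
        for accelerator in self.accelerators:
            accelerator.run_step(step)

    def checkpoint(self) -> tuple:
        # On a checkpoint command, save the host state and the accelerator
        # states as two separate records (assumption: dicts model storage).
        host_ckpt = {self.host.name: self.host.snapshot()}
        accel_ckpt = {a.name: a.snapshot() for a in self.accelerators}
        return host_ckpt, accel_ckpt


node = HybridNode(Element("host"), [Element("acc0"), Element("acc1")])
for step in range(1, 4):
    node.execute(step)

host_ckpt, accel_ckpt = node.checkpoint()
print(host_ckpt)
print(sorted(accel_ckpt))
```

Keeping the two records separate means, in this sketch, that the host state and the accelerator states could be stored, inspected, or restored independently of one another.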