In certain aspects, the invention features a system and method for
receiving a parent job configured to produce one or more descendant jobs,
and scheduling computation of the parent job on a node computing device
that is one of a plurality of node computing devices of a distributed
computing system. In such aspects, the distributed computing system
further includes a scheduler server configured to selectively reschedule
computation of a job other than a parent job from any one of the
plurality of node computing devices to another of the node computing
devices. Such aspects further include preventing rescheduling of the
parent job unless each of the descendant jobs is completed or terminated.
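
By way of illustration only, the rescheduling restriction described above might be sketched as follows. The names `Job`, `JobState`, `may_reschedule`, and `reschedule` are hypothetical and do not appear in the description. The sketch assumes that each job records its direct descendant jobs and that a job with no descendants is not a parent job.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List


class JobState(Enum):
    RUNNING = auto()
    COMPLETED = auto()
    TERMINATED = auto()


# States in which a descendant job no longer pins its parent job in place.
FINISHED = {JobState.COMPLETED, JobState.TERMINATED}


@dataclass
class Job:
    # Hypothetical record; the field names are assumptions for illustration.
    job_id: str
    node: str
    state: JobState = JobState.RUNNING
    descendants: List["Job"] = field(default_factory=list)


def may_reschedule(job: Job) -> bool:
    """Return True unless the job has a descendant that is still unfinished.

    A job with no descendants is treated as a non-parent job and may always
    be moved (all() over an empty list is True).
    """
    return all(d.state in FINISHED for d in job.descendants)


def reschedule(job: Job, target_node: str) -> bool:
    """Move the job to target_node if permitted; report whether it moved."""
    if not may_reschedule(job):
        return False
    job.node = target_node
    return True
```

In this sketch a parent job stays on its node computing device while any descendant job is running, and becomes movable once every descendant job is completed or terminated. Only direct descendants are checked; an embodiment that tracks deeper generations would need to walk the full tree.
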
In other aspects, the invention features a system and method for
receiving, for computation by a node computing device, a parent job
configured to produce a descendant job, wherein the node computing device
is one of a plurality of node computing devices of a distributed
computing system that also includes a scheduler server. In such aspects,
the distributed computing system creates the descendant job, and the
parent and descendant jobs are scheduled for computation on different
node computing devices.
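
As a non-limiting sketch, placement of a descendant job on a node computing device other than the one computing its parent job might look like the following. The function `pick_node_for_descendant` and its `load` argument are hypothetical. Choosing the least-loaded remaining node is an assumption made for illustration; the description requires only that the parent and descendant jobs be scheduled on different node computing devices.

```python
from typing import Dict, Optional, Sequence


def pick_node_for_descendant(
    parent_node: str,
    nodes: Sequence[str],
    load: Dict[str, int],
) -> Optional[str]:
    """Pick a node for a descendant job, excluding the parent's node.

    Returns None when no other node is available. The least-loaded tie-break
    is an illustrative assumption, not a requirement of the description.
    """
    candidates = [n for n in nodes if n != parent_node]
    if not candidates:
        return None
    # Nodes missing from the load table are treated as idle.
    return min(candidates, key=lambda n: load.get(n, 0))
```

A scheduler server could call such a function when the descendant job is created, passing the node currently computing the parent job as `parent_node`.
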