A method for efficiently managing memory resources in a computer system
having a graphics processing unit on which several processes run
concurrently includes using threads to communicate a request indicating
that a process needs additional memory. If the request indicates
that the requesting process will otherwise be terminated, the other
processes reduce their memory usage to a minimum so that termination is
avoided; if the request instead indicates that the requesting process
will not run optimally, the other processes reduce their memory usage
to 1/N, where N is the total number of running processes. The apparatus
includes a computer
system having a graphics processing unit and processes whose threads
can communicate directly with other threads and with a shared memory
that is part of the operating system memory.
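
The two-level reduction policy described above can be illustrated with a
minimal sketch. The abstract does not say what quantity the 1/N share is
taken of, so the sketch assumes it is 1/N of the total memory being
managed; it also assumes a single minimum footprint shared by all
processes. The names Urgency, target_usage, handle_request, minimum_bytes
and total_bytes are hypothetical and do not come from the source.

```python
from enum import Enum


class Urgency(Enum):
    """Hypothetical labels for the two kinds of request in the abstract."""
    TERMINATION = "termination"  # requester will be terminated without more memory
    SUBOPTIMAL = "suboptimal"    # requester will run, but not optimally


def target_usage(urgency, minimum_bytes, total_bytes, n_processes):
    """Return the memory ceiling each *other* process should shrink to."""
    if n_processes < 1:
        raise ValueError("n_processes must be at least 1")
    if urgency is Urgency.TERMINATION:
        # Termination case: the others drop to their minimum footprint.
        return minimum_bytes
    # Non-optimal case: the others drop to a 1/N share.
    # Assumption: the share is taken of the total managed memory.
    return total_bytes // n_processes


def handle_request(usage, requester, urgency, minimum_bytes, total_bytes):
    """Apply the policy to a {process_id: bytes_in_use} mapping.

    The requester keeps its current usage. Every other process is capped
    at the target and is never asked to grow.
    """
    target = target_usage(urgency, minimum_bytes, total_bytes, len(usage))
    return {
        pid: used if pid == requester else min(used, target)
        for pid, used in usage.items()
    }
```

Under these assumptions, with four running processes and 800 units of
total memory, a non-optimal request caps every other process at 200
units, while a termination request caps them at the configured minimum.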