A reader-writer lock reduces reader and writer overhead by employing lock
structures that are each shared by a group of processors having lower
latencies among themselves. In the illustrated multiprocessor system having a non-uniform
memory access (NUMA) architecture, each processor node has a lock
structure comprising a shared counter and an associated flag for each CPU
group. During a read, a group's counter can be changed only by processors
within that CPU group that are performing a read. This reduces the reader overhead that
otherwise would exist if all processors in the system shared a single
counter. During a write, the shared flag can be changed by a process
running on any processor in the system. The processors in a CPU group are
notified of the write through the shared flag. This reduces the writer
overhead that otherwise would exist if each processor in the system had a
separate flag. The number of CPUs per group can be varied to optimize
performance of the lock in different multiprocessor systems.
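
The arrangement described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the patented implementation: the names GroupedRWLock, _Group, read_acquire, read_release, write_acquire, and write_release are hypothetical, the caller passes its CPU index explicitly, a threading.Condition stands in for the atomic memory operations a real NUMA system would use, and a single mutex that serializes writers is an added assumption the abstract does not mention.

```python
import threading


class _Group:
    """Lock state shared by the CPUs of one group (hypothetical layout)."""

    def __init__(self):
        # Assumption: a condition variable stands in for atomic hardware ops.
        self.cond = threading.Condition()
        self.readers = 0          # shared counter: active readers in this group
        self.writer_flag = False  # shared flag: set while a writer is active


class GroupedRWLock:
    """Reader-writer lock with one counter and one flag per CPU group."""

    def __init__(self, n_cpus, cpus_per_group):
        if n_cpus <= 0 or cpus_per_group <= 0:
            raise ValueError("n_cpus and cpus_per_group must be positive")
        self.cpus_per_group = cpus_per_group
        n_groups = -(-n_cpus // cpus_per_group)  # ceiling division
        self.groups = [_Group() for _ in range(n_groups)]
        # Assumption: writers are serialized by a single global mutex.
        self._writer_mutex = threading.Lock()

    def _group_of(self, cpu):
        return self.groups[cpu // self.cpus_per_group]

    def read_acquire(self, cpu):
        g = self._group_of(cpu)
        with g.cond:
            # Readers look only at their own group's flag and counter.
            while g.writer_flag:
                g.cond.wait()
            g.readers += 1

    def read_release(self, cpu):
        g = self._group_of(cpu)
        with g.cond:
            g.readers -= 1
            if g.readers == 0:
                g.cond.notify_all()  # wake a writer waiting for readers to drain

    def write_acquire(self):
        self._writer_mutex.acquire()
        # Notify every group of the write through its shared flag.
        for g in self.groups:
            with g.cond:
                g.writer_flag = True
        # Wait until the readers already inside each group have left.
        for g in self.groups:
            with g.cond:
                while g.readers > 0:
                    g.cond.wait()

    def write_release(self):
        for g in self.groups:
            with g.cond:
                g.writer_flag = False
                g.cond.notify_all()  # wake readers blocked on the flag
        self._writer_mutex.release()
```

In this sketch a reader touches only its own group's state, while a writer visits one flag per group rather than one per CPU. Setting cpus_per_group to 1 gives every CPU its own flag, which maximizes the writer's work, and setting it to the total CPU count puts all readers on a single counter, so the parameter trades reader contention against writer cost.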