A convolutional neural network is implemented on a graphics processing
unit. The network is then trained through a series of forward and
backward passes, with convolutional kernels and bias matrices modified on
each backward pass according to a gradient of an error function. The
implementation takes advantage of the parallel processing capabilities of
pixel shader units on the GPU, and uses a set of start-to-finish
formulas to program the computations on the pixel shaders. Input to and
output from the program are passed through textures, and a multi-pass
summation process is used when sums are needed across pixel shader unit
registers.
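
As an illustration only, the multi-pass summation mentioned above can be sketched on the CPU in plain Python. The sketch assumes a texture is modeled as a list of rows of floats, and that each pass lets one output texel sum a small fixed number of horizontally adjacent input texels. The function names `reduce_pass` and `multipass_row_sum`, and the reduction factor of 2, are hypothetical choices made for this example and are not taken from the described implementation.

```python
def reduce_pass(texture, factor=2):
    """Simulate one shader pass: each output texel sums up to `factor`
    horizontally adjacent texels of the input texture (assumed layout)."""
    out = []
    for row in texture:
        out_width = (len(row) + factor - 1) // factor  # ceiling division
        out.append([sum(row[x * factor:(x + 1) * factor])
                    for x in range(out_width)])
    return out


def multipass_row_sum(texture, factor=2):
    """Repeat reduction passes until every row holds a single texel.
    Assumes a non-empty texture whose rows all have the same width.
    Returns the per-row sums and the number of passes performed."""
    passes = 0
    while len(texture[0]) > 1:
        texture = reduce_pass(texture, factor)
        passes += 1
    return [row[0] for row in texture], passes


# Hypothetical 2x5 "texture"; the width shrinks 5 -> 3 -> 2 -> 1.
texture = [[1.0, 2.0, 3.0, 4.0, 5.0],
           [0.5, 0.5, 0.5, 0.5, 0.5]]
sums, passes = multipass_row_sum(texture)
print(sums, passes)
```

In this sketch each pass reads only a bounded number of texels per output, which is the reason a sum over a wide row is split across several passes rather than computed in one. On an actual GPU every output texel of a pass would be computed in parallel by the pixel shader units, and the result of one pass would be written to a texture that serves as input to the next.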