Identification of a determinative subset of features from within a large
set of features is performed by training a support vector machine to rank
the features according to classifier weights, where features are removed
to determine how their removal affects the value of the classifier
weights. The features having the smallest weight values are removed and a
new support vector machine is trained with the remaining features. The
process is repeated until a relatively small subset of features remains
that is capable of accurately separating the data into different patterns
or classes. The method is applied to select the smallest number of
genes that are capable of accurately distinguishing between medical
conditions such as cancer and non-cancer.
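
The elimination loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the linear support vector machine is fit by plain full-batch subgradient descent on the regularized hinge loss, features are ranked by squared weight, and exactly one feature is dropped per iteration. The function names `train_linear_svm` and `svm_rfe`, the hyperparameters, and the toy data are all hypothetical and chosen only for the example.

```python
def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Fit a linear SVM by full-batch subgradient descent (assumed trainer).

    Minimizes lam/2 * ||w||^2 + mean hinge loss; labels must be +1 or -1.
    """
    n = len(X)
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularizer
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the hinge term
                for j, xj in enumerate(xi):
                    grad_w[j] -= yi * xj / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_rfe(X, y, n_keep=1):
    """Repeatedly train, drop the feature with the smallest squared weight,
    and retrain on the remaining features until n_keep features are left.

    Returns (kept feature indices, eliminated indices in removal order).
    """
    active = list(range(len(X[0])))
    eliminated = []
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        eliminated.append(active.pop(worst))
    return active, eliminated


# Toy data (made up for illustration): column 0 separates the two classes,
# columns 1 and 2 are small values unrelated to the label.
X = [
    [2.0, 0.1, -0.3], [1.5, -0.2, 0.4], [2.5, 0.3, 0.1], [1.8, -0.1, -0.2],
    [-2.0, 0.2, -0.1], [-1.7, -0.3, 0.3], [-2.2, 0.1, 0.2], [-1.6, -0.1, -0.4],
]
y = [1, 1, 1, 1, -1, -1, -1, -1]

kept, eliminated = svm_rfe(X, y, n_keep=1)
print("kept:", kept, "eliminated:", eliminated)
```

On real gene expression data the number of features is far larger than in this toy set, so removing several low-ranked features per iteration, rather than one, may be a practical variation on this sketch.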