A machine-readable program storage device is provided, tangibly
embodying a program of instructions executable by the machine to perform
method steps for classification of biological tissue by gene expression
profiling. The method steps include providing a training set of gene
expression profiles of known tissue samples, providing a first-layer
strong classifier of the known tissue samples by combining weak
classifiers using boosting, creating two sample sets based on the
first-layer classifier, populating the two sample sets with a next layer
of classifiers based on a previous-layer classifier, organizing the
classifiers in a tree data structure, and outputting the tree data
structure as a probabilistic boosting tree classifier for tissue sample
classification and disease subtype discovery. A multi-class diagnosis
problem is transformed into a two-class diagnosis problem by finding an
optimal feature and dividing the multi-class problem into two classes.
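
The method steps above can be illustrated with a minimal sketch for the two-class case. It is not the claimed implementation. It assumes AdaBoost over decision stumps as the boosting procedure, labels in {-1, +1}, and a margin parameter eps that sends ambiguous samples to both child sample sets. The function names, the dictionary-based node layout, eps, and the toy expression values are all hypothetical and chosen only for illustration.

```python
import math

def stump_vote(x, f, t, pol):
    """Weak classifier: votes pol if feature f exceeds threshold t, else -pol."""
    return pol if x[f] > t else -pol

def train_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = (float("inf"), 0, 0.0, 1)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_vote(row, f, t, pol) != yi)
                if err < best[0]:
                    best = (err, f, t, pol)
    return best

def adaboost(X, y, rounds=5):
    """Combine weak stumps into a strong classifier: a list of (alpha, f, t, pol)."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        raw_err, f, t, pol = train_stump(X, y, w)
        err = min(max(raw_err, 1e-10), 1 - 1e-10)  # clamp to keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, f, t, pol))
        if raw_err < 1e-10:  # a perfect stump: further rounds add nothing
            break
        w = [wi * math.exp(-alpha * yi * stump_vote(row, f, t, pol))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def posterior(model, x):
    """Approximate probability of class +1 from the boosted score H(x)."""
    H = sum(a * stump_vote(x, f, t, pol) for a, f, t, pol in model)
    return 1.0 / (1.0 + math.exp(-2.0 * H))

def build_tree(X, y, depth=0, max_depth=2, eps=0.1):
    """Train a strong classifier at each node, split the samples, and recurse."""
    pos = sum(1 for yi in y if yi == 1)
    node = {"prior": pos / len(y), "model": None, "left": None, "right": None}
    if depth >= max_depth or pos in (0, len(y)):
        return node  # leaf: depth limit reached or the sample set is pure
    node["model"] = adaboost(X, y)
    left, right = [], []
    for row, yi in zip(X, y):
        q = posterior(node["model"], row)
        if q >= 0.5 - eps:  # confident positives and ambiguous samples
            right.append((row, yi))
        if q <= 0.5 + eps:  # confident negatives and ambiguous samples
            left.append((row, yi))
    if left and right:
        node["left"] = build_tree([r for r, _ in left], [c for _, c in left],
                                  depth + 1, max_depth, eps)
        node["right"] = build_tree([r for r, _ in right], [c for _, c in right],
                                   depth + 1, max_depth, eps)
    return node

def predict(node, x):
    """Posterior of class +1, mixing the child estimates by the node posterior."""
    if node["model"] is None:
        return node["prior"]
    q = posterior(node["model"], x)
    if node["left"] is None:  # degenerate split: fall back to the node itself
        return q
    return q * predict(node["right"], x) + (1 - q) * predict(node["left"], x)

# Toy training set: each row is a two-gene expression profile (made-up values).
profiles = [[0.1, 2.0], [0.2, 1.8], [0.3, 2.2], [0.9, 2.1], [1.0, 1.9], [1.1, 2.0]]
labels = [-1, -1, -1, 1, 1, 1]
tree = build_tree(profiles, labels)
print(predict(tree, [1.0, 2.0]) > 0.5)
```

Under these assumptions each internal node holds a boosted strong classifier and each leaf holds the empirical fraction of positive samples, so a query returns a probability, not a hard label. A multi-class problem would first need its classes relabeled into two groups before this sketch applies.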