Methods, computer program products and systems for identifying cellular
constituents in a secondary tissue that serve as surrogate markers for a
target gene expressed in a primary tissue of a species are provided. A
classifier is constructed using the abundances of cellular constituents
in a first plurality of cellular constituents
measured in the secondary tissue in a population. This population
comprises a first and second subgroup. The classifier is based on a
second plurality of cellular constituents that comprises all or a portion
of the first plurality of cellular constituents. The abundance level of each
cellular constituent in the second plurality of cellular constituents
varies between the first and second subgroups. All or a portion of the
population is classified into a plurality of subtypes using the
classifier. Then, one or more cellular constituents that can discriminate
members of the population between a first subtype and a second subtype in
the plurality of subtypes are identified.
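
The steps above can be illustrated with a minimal sketch. It is not the claimed implementation: the difference-of-means filter, the nearest-centroid classifier, the threshold value, the toy abundance values, and all names (differential, nearest_centroid_subtypes, gene_a, gene_b, gene_c) are assumptions introduced only for illustration, and only two subtypes are modeled.

```python
from statistics import mean


def differential(abundance, labels, threshold):
    """Return constituents whose mean abundance differs between
    label 0 and label 1 by more than `threshold` (assumed criterion)."""
    selected = []
    for name, values in abundance.items():
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        if abs(mean(group0) - mean(group1)) > threshold:
            selected.append(name)
    return selected


def nearest_centroid_subtypes(abundance, labels, features):
    """Assign each member to subtype 0 or 1, whichever subgroup centroid
    is closer in squared Euclidean distance over `features`."""
    centroids = {}
    for lab in (0, 1):
        centroids[lab] = [
            mean([v for v, l in zip(abundance[f], labels) if l == lab])
            for f in features
        ]
    subtypes = []
    for i in range(len(labels)):
        point = [abundance[f][i] for f in features]
        dist = {
            lab: sum((p - c) ** 2 for p, c in zip(point, centroids[lab]))
            for lab in (0, 1)
        }
        subtypes.append(min(dist, key=dist.get))
    return subtypes


# Toy abundances measured in the secondary tissue (hypothetical values);
# each list holds one value per member of the population.
abundance = {
    "gene_a": [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],
    "gene_b": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "gene_c": [0.5, 0.4, 0.6, 3.0, 3.2, 2.9],
}
subgroup = [0, 0, 0, 1, 1, 1]  # first subgroup = 0, second subgroup = 1

# Second plurality: constituents whose abundance varies between subgroups.
second_plurality = differential(abundance, subgroup, threshold=1.0)

# Classify the population into subtypes using the classifier.
subtypes = nearest_centroid_subtypes(abundance, subgroup, second_plurality)

# Constituents that discriminate the first subtype from the second.
markers = differential(abundance, subtypes, threshold=1.0)
```

In this toy population the subtypes coincide with the subgroups, so the final step returns the same constituents as the filter; with real measurements the subtypes need not match the subgroups, and any classifier or discrimination test could replace the ones assumed here.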