We used hierarchical clustering to examine gene expression profiles
generated by serial analysis of gene expression (SAGE) in a total of nine
samples of normal lung epithelial cells and non-small cell lung cancers
(NSCLC).
Separation of normal and tumor samples, as well as histopathological
subtypes, was evident using the 3,921 most abundant transcript tags. This
distinction remained when just 115 highly differentially expressed
transcript tags were used. Furthermore, these 115 transcript tags
clustered into groups that were suggestive of the unique biological and
pathological features of the different tissues examined. Adenocarcinomas
were characterized by high-level expression of small airway-associated or
immunologically related proteins, while squamous cell carcinomas
overexpressed genes involved in cellular detoxification or antioxidation.
The messages of two p53-regulated genes, p21(WAF1/CIP1) and
14-3-3σ, were consistently under-expressed in the adenocarcinomas,
suggesting that the p53 pathway itself might be compromised in this
cancer type. Gene expression levels observed by SAGE were consistent with the
results obtained by quantitative real-time PCR or cDNA array analyses
using 43 additional lung tumor and normal samples. Thus, although based on
only a few tissue libraries, the SAGE-derived expression profiles of
non-small cell lung cancer most likely represent an unbiased yet
distinctive molecular signature for human lung cancer.
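
The clustering step described above can be illustrated with a minimal sketch. The abstract does not state the normalization, distance measure, or linkage rule that were used, so the choices below (scaling each library to a common total, a log2 transform, 1 minus Pearson correlation as the distance, and average linkage) are assumptions made only for illustration. The library names and tag counts are hypothetical toy values, not data from the study, and all function names are invented for this example.

```python
import math
from itertools import combinations

def normalize(counts, scale=1000.0):
    """Scale a library to `scale` total tags, then log2-transform (pseudocount 1)."""
    total = sum(counts)
    return [math.log2(c * scale / total + 1.0) for c in counts]

def top_tags(libraries, n):
    """Indices of the n tags with the highest summed count across all libraries."""
    n_tags = len(next(iter(libraries.values())))
    totals = [sum(lib[i] for lib in libraries.values()) for i in range(n_tags)]
    return sorted(range(n_tags), key=lambda i: totals[i], reverse=True)[:n]

def pearson_distance(x, y):
    """Distance between two profiles, defined here as 1 - Pearson correlation."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)

def cluster(profiles, k):
    """Average-linkage agglomerative clustering, stopped when k clusters remain."""
    clusters = [frozenset([name]) for name in profiles]
    dist = {frozenset([a, b]): pearson_distance(profiles[a], profiles[b])
            for a, b in combinations(profiles, 2)}

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        pairs = [dist[frozenset([a, b])] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    while len(clusters) > k:
        # Merge the two closest clusters.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters

# Hypothetical toy tag counts: four libraries, six transcript tags each.
libraries = {
    "normal_1": [50, 45, 5, 4, 20, 1],
    "normal_2": [48, 50, 6, 3, 22, 0],
    "tumor_1": [5, 6, 60, 55, 21, 1],
    "tumor_2": [4, 8, 52, 58, 19, 0],
}

keep = top_tags(libraries, 5)  # restrict to the most abundant tags
profiles = {name: [normalize(counts)[i] for i in keep]
            for name, counts in libraries.items()}
result = cluster(profiles, k=2)

for members in result:
    print(sorted(members))
```

With these toy counts, the two normal and the two tumor libraries are expected to fall into separate clusters. For real SAGE data, an established routine such as scipy.cluster.hierarchy.linkage would normally replace the hand-written loop.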