A method for making a model for the folded structure of a set of proteins
from an evolutionary analysis of a set of aligned homologous protein
sequences was claimed in Ser. No. 07/857,224. The instant application
concerns methods for using these models. The first method is used to
confirm or deny a hypothesis that two proteins are homologous, and
comprises comparing a predicted structure model for one family of
proteins with a predicted structure model for a second family of proteins,
or an experimental structure for the second family, and deducing the
presence or absence of homology based on the presence or absence of
structural similarity flanking key residue motifs in the polypeptide
sequence. The second method identifies mutations during the divergent
evolution of a protein sequence that are potentially adaptive by
identifying episodes during the divergent evolution of a family of
proteins where there is a high absolute rate of amino acid substitution,
or a high ratio of non-silent substitutions to silent substitutions.
Amino acid substitutions occurring during such an episode are likely to be
adaptive. The third is a method for identifying specific in vitro
properties of the protein that are likely to play a physiological role in
vivo in an organism. This method involves synthesizing in the laboratory
proteins having the reconstructed amino acid sequences of a protein before
and after a period of rapid sequence evolution that characterizes adaptive
substitution, measuring the in vitro properties of the protein before the
episode of rapid sequence evolution, and then measuring the in vitro
properties of the protein after the episode of rapid sequence evolution.
The in vitro behaviors that remained unchanged through this episode are
not likely to have adaptive significance physiologically. The in vitro
behaviors that changed through this episode are likely to have adaptive
significance physiologically. The fourth concerns a method for organizing
genome-sized sequence databases.
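
The second method described above, flagging episodes with a high ratio of non-silent to silent substitutions, might be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed procedure: the `Branch` type, the branch names, the substitution counts, the pseudocount, and the threshold of 1.0 are all hypothetical, and the raw-count ratio used here is not normalized by the number of available non-silent and silent sites.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    """One branch of an evolutionary tree (hypothetical record layout)."""
    name: str
    nonsilent: int  # substitutions that change the encoded amino acid
    silent: int     # substitutions that leave the amino acid unchanged


def nonsilent_to_silent_ratio(branch: Branch, pseudocount: float = 0.5) -> float:
    # The pseudocount is an assumption added to avoid division by zero
    # on branches with no silent substitutions.
    return (branch.nonsilent + pseudocount) / (branch.silent + pseudocount)


def flag_candidate_episodes(branches, ratio_threshold: float = 1.0):
    # Return the names of branches whose ratio exceeds the (assumed) threshold.
    return [
        b.name
        for b in branches
        if nonsilent_to_silent_ratio(b) > ratio_threshold
    ]


# Made-up counts for illustration only.
branches = [
    Branch("ancestor_A -> ancestor_B", nonsilent=3, silent=20),
    Branch("ancestor_B -> ancestor_C", nonsilent=18, silent=6),
    Branch("ancestor_C -> modern", nonsilent=4, silent=15),
]

candidates = flag_candidate_episodes(branches)
print(candidates)
```

With these invented counts, only the middle branch is flagged as a candidate episode of adaptive substitution. The sequences reconstructed at the start and end of such a branch would be the ones to synthesize and compare under the third method.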