This application claims processes and compositions that enable discovery
of single nucleotide polymorphisms (SNPs) and other sequence variation
that distinguishes two essentially identical sequences, one a reference, the
other a target, as well as SNPs discovered using these processes and
compositions. The inventive process comprises preparation of four sets of
primers, "T-extendable", "A-extendable", "C-extendable", and
"G-extendable". These primers, when templated on a reference genome, add
(respectively) T, A, C, and G to their 3'-ends. The invention also
comprises a step where these primer sets are separately bound to
complementary sequences on target DNA and, once bound, prime extension
reactions using target DNA as the template. If the target DNA directs
incorporation of the same nucleotide as the reference DNA, then the T-,
A-, C-, and G-extendable primers are extended (respectively) by T, A, C,
and G. The architecture of the process distinguishes the products of these
extensions from the products formed when not-T, not-A, not-C, and not-G ("3N"
or "3", to indicate the other three nucleotides) are added instead. Thus,
this process discovers differences between the target and reference DNA
at the site queried by the primer extension reaction. The distinction
makes the two kinds of products either separable or differentially
extendable. This distinction is used to disregard products that added T,
A, C, and G and to identify the sequence(s) of primers that added not-T,
not-A, not-C, and not-G. Further and optionally, information from these
sequences identifies loci of the SNPs in an in silico genome.
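
The logic of the process can be illustrated with a minimal in silico sketch. It is not the claimed laboratory procedure. It assumes that the reference and target are aligned strings of equal length, that each primer is written with the same sequence as the strand being extended (so the base it adds equals the next base of the template as written), and that a primer binds only on an exact match. The function names and the primer length `k` are hypothetical, introduced only for illustration.

```python
from collections import defaultdict


def build_primer_sets(reference, k):
    """Group length-k primers by the base the reference directs them to add.

    Returns a dict mapping 'T', 'A', 'C', 'G' to lists of (position, primer),
    where position is the index of the queried site in the reference.
    """
    sets = defaultdict(list)
    for i in range(k, len(reference)):
        primer = reference[i - k:i]
        sets[reference[i]].append((i, primer))
    return sets


def discover_snps(reference, target, k):
    """Report primers that add a different base on the target than on the reference.

    Each hit is (position, primer, expected_base, added_base). Primers that
    add the expected base are disregarded, mirroring the abstract's
    distinction between X-added and not-X-added products.
    """
    hits = []
    for expected, primers in build_primer_sets(reference, k).items():
        for pos, primer in primers:
            # Simplified binding rule (assumption): exact match at the same site.
            if target[pos - k:pos] != primer:
                continue
            added = target[pos]  # base the target template directs
            if added != expected:
                hits.append((pos, primer, expected, added))
    return sorted(hits)
```

Calling `discover_snps(reference, target, k)` returns only the primers whose extension on the target differed from the reference. The position in each hit stands in for the locus that the primer sequence would identify in an in silico genome.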