Sequence-based XML indexing aims at avoiding expensive join operations in
query processing. It transforms structured XML data into sequences so
that a structured query can be answered holistically through subsequence
matching. The problem of query equivalence with respect to this
transformation is addressed herein, and a performance-oriented principle
for sequencing tree structures is introduced. With query
equivalence, XML queries can be performed through subsequence matching
without join operations, post-processing, or other special handling for
problems such as false alarms. A class of sequencing methods is
identified for this purpose, and a novel subsequence matching algorithm
that observes query equivalence is presented. Also introduced is a
performance-oriented principle to guide the sequencing of tree
structures. For any given XML dataset, the principle finds an optimal
sequencing strategy according to the dataset's schema and data
distribution; a novel method that realizes this principle is thus
presented herein.
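
For illustration only, the following Python sketch shows one simple way to turn a tree into a sequence and to answer a query by subsequence matching. The tuple-based tree representation, the encoding (a preorder list of (label, path-to-parent) pairs), and the names `sequence` and `is_subsequence` are assumptions made for this example; it is not the sequencing method or the matching algorithm presented herein. The naive matcher is included because it exhibits a false alarm of the kind mentioned above: the last query is reported as a match even though no single B node in the document has both a C child and a D child.

```python
from typing import List, Tuple

# Hypothetical tree representation: (label, [child, child, ...]).
Tree = Tuple[str, list]
# Hypothetical sequence item: (label, path from the root to the node's parent).
Item = Tuple[str, str]


def sequence(tree: Tree, prefix: str = "") -> List[Item]:
    """Preorder traversal emitting (label, parent path) pairs."""
    label, children = tree
    items = [(label, prefix)]
    for child in children:
        items.extend(sequence(child, prefix + "/" + label))
    return items


def is_subsequence(query: List[Item], data: List[Item]) -> bool:
    """True if the query items occur in data in order, gaps allowed."""
    remaining = iter(data)
    # `item in remaining` advances the iterator past the first match,
    # so later query items can only match later data items.
    return all(item in remaining for item in query)


# Document: A has two B children, one with child C and one with child D.
doc = ("A", [("B", [("C", [])]), ("B", [("D", [])])])
# Path query A/B/D, which the document really contains.
path_query = ("A", [("B", [("D", [])])])
# Twig query: a single B with both C and D, which the document does NOT contain.
twig_query = ("A", [("B", [("C", []), ("D", [])])])

print(sequence(doc))
print(is_subsequence(sequence(path_query), sequence(doc)))  # True, a correct match
print(is_subsequence(sequence(twig_query), sequence(doc)))  # True, but a false alarm
```

A matcher that observes query equivalence would have to reject the twig query here without a separate post-processing pass over the document.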