A method and a computer system for indexing and searching the data content
of nested field records, such as those in Extensible Markup Language
(XML). The system includes an indexing and searching engine that
constructs an improved full-text search index on the input XML data and
then performs searches using the index. The system supports exact matches
and partial matches using a wildcard character. The method transforms the
input XML data into a form that encodes the structural information of the data
by suffixing each word with its corresponding field qualifiers or an
equivalent numerical pattern thereof. The resulting encoded words are
then stored in a full-text index structure. Various types of full-text
search may then be performed on the index. One alternative embodiment combines string
matching and numeric or integer pattern matching to identify a particular
word in a particular field. The portion of the word without field
qualifiers is matched as a string against the words in the index, and the
pattern of numerals representing the word's field qualifiers is matched
against the numeral patterns stored with those indexed words. Evaluation
of complex field criteria is thereby reduced to simpler and faster numeric
matching.
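
The encoding and the combined string and numeric matching described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the `#` separator, the dotted numeral suffix, the per-tag integer numbering, and the function names `build_index` and `search` are all assumptions made for illustration, and a linear scan over a dictionary stands in for a real full-text index structure.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict
from fnmatch import fnmatchcase

SEP = "#"  # assumed separator between a word and its numeric field suffix


def build_index(xml_text):
    """Index the words of each record (each child of the root element).

    Returns (index, field_ids). index maps an encoded word such as
    'smith#1.3.4' to the set of record numbers containing it; field_ids
    maps each tag name to the integer that stands for it in the suffix.
    """
    field_ids = {}
    index = defaultdict(set)

    def walk(elem, pattern, record_no):
        # Assign the next free integer to a tag the first time it is seen.
        fid = field_ids.setdefault(elem.tag, len(field_ids) + 1)
        pattern = pattern + (fid,)
        suffix = ".".join(str(n) for n in pattern)
        # Only elem.text is indexed; tail text after child elements is
        # ignored to keep the sketch short.
        for word in re.findall(r"\w+", (elem.text or "").lower()):
            index[word + SEP + suffix].add(record_no)
        for child in elem:
            walk(child, pattern, record_no)

    root = ET.fromstring(xml_text)
    for record_no, record in enumerate(root):
        walk(record, (), record_no)
    return index, field_ids


def search(index, field_ids, word_glob, field_path):
    """Return record numbers where a word matching word_glob occurs in a
    field whose qualifier chain ends with field_path (a list of tag names).

    '*' in word_glob is the wildcard. The word part is matched as a string;
    the field part is matched by comparing tuples of integers.
    """
    try:
        wanted = tuple(field_ids[tag] for tag in field_path)
    except KeyError:
        return set()  # unknown field name: nothing can match
    hits = set()
    for encoded, records in index.items():
        word, _, suffix = encoded.partition(SEP)
        if not fnmatchcase(word, word_glob.lower()):
            continue
        pattern = tuple(int(n) for n in suffix.split("."))
        # Numeric match: the last len(wanted) numerals must equal wanted.
        if pattern[len(pattern) - len(wanted):] == wanted:
            hits |= records
    return hits


SAMPLE = """
<library>
  <book><title>Search Engines</title><author><last>Smith</last></author></book>
  <book><title>Smith of Wootton</title><author><last>Tolkien</last></author></book>
</library>
"""

index, field_ids = build_index(SAMPLE)
print(search(index, field_ids, "smith", ["author", "last"]))  # {0}
print(search(index, field_ids, "smi*", ["title"]))            # {1}
```

In this sketch the word "smith" is stored twice, as `smith#1.3.4` (inside `book/author/last`) and as `smith#1.2` (inside `book/title`), so a field-restricted query only has to compare short integer tuples once the string part has matched. Note that `fnmatchcase` also treats `?` and `[...]` as special characters, which goes beyond the single wildcard character the abstract mentions.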