A method for automatically generating semantically valid XPath expressions
in a computer system is provided. The method includes populating an
instance of a sequence-type model by organizing XML data into a
hierarchical structure consistent with the sequence-type model. The
method also includes priming the instance of the sequence-type model to
remove ambiguities and redundancies, while retaining semantic validity of
the instance of the sequence-type model. The method further includes
scanning the instance of the sequence-type model to identify one or more
location paths that match a search pattern, where an initial scan
originates at a root of the hierarchical structure and subsequent scans
originate from a termination point of a prior scan to incrementally
search for location steps by searching along XPath axes. The method
additionally includes determining whether a sequence type at each
location step matches the search pattern and outputting a result as a
semantically valid XPath expression.
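
The steps summarized above can be illustrated with a minimal sketch. It is not the claimed implementation: the names TypeNode, populate, prime, and scan are hypothetical, the "sequence-type model" is reduced to a tree of element names, priming is reduced to merging same-named siblings, the search pattern is a plain element name, and only the child axis is searched.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class TypeNode:
    """One node of the simplified model: an element name plus child nodes."""
    name: str
    children: list = field(default_factory=list)


def populate(elem):
    """Organize XML data into a hierarchy, one TypeNode per element."""
    return TypeNode(elem.tag, [populate(child) for child in elem])


def prime(node):
    """Merge same-named siblings so each distinct path appears only once."""
    merged = {}
    for child in node.children:
        slot = merged.setdefault(child.name, TypeNode(child.name))
        slot.children.extend(child.children)
    node.children = [prime(child) for child in merged.values()]
    return node


def scan(node, pattern, prefix=""):
    """Yield location paths whose final step matches the pattern.

    Each recursive call resumes from the node where the previous step
    ended and searches along the child axis only (an assumption made
    to keep the sketch short).
    """
    path = f"{prefix}/{node.name}"
    if node.name == pattern:
        yield path
    for child in node.children:
        yield from scan(child, pattern, path)


doc = ET.fromstring(
    "<library>"
    "<book><title>A</title><author>X</author></book>"
    "<book><title>B</title></book>"
    "<journal><title>C</title></journal>"
    "</library>"
)
model = prime(populate(doc))
for xpath in scan(model, "title"):
    print(xpath)
# prints:
# /library/book/title
# /library/journal/title
```

Because the two book elements are merged during priming, the path /library/book/title is emitted once rather than twice. A fuller treatment would track occurrence indicators and node kinds at each step and would search additional axes such as descendant or attribute.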