Methods and systems are provided for syntactically indexing and searching
data sets to achieve more accurate search results, and for indexing and
searching data sets using entity tags, either alone or in combination
with syntactic indexing.
Example embodiments provide a Syntactic Query Engine ("SQE") that parses,
indexes, and stores a data set, as well as processes natural language
queries subsequently submitted against the data set. The SQE comprises a
Query Preprocessor, a Data Set Preprocessor, a Query Builder, a Data Set
Indexer, an Enhanced Natural Language Parser ("ENLP"), a data set
repository, and, in some embodiments, a user interface. After
preprocessing the data set, the SQE parses it at a variety of parsing
levels and determines, as appropriate, the entity tags and the syntactic
and grammatical roles of each term, generating an enhanced data
representation for each object in the data set. The SQE
indexes and stores these enhanced data representations in the data set
repository. Upon subsequently receiving a query, the SQE likewise parses
the query at a variety of parsing levels and searches the indexed, stored
data set to locate data that contains similar terms used in similar
grammatical roles and/or with similar entity tag types as indicated by
the query. In this manner, the SQE is able to achieve contextually
accurate search results more frequently than traditional search engines.
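
The indexing and matching described above can be illustrated with a
minimal sketch. It is not the SQE itself: the names Term, ToyIndex, add,
and search are hypothetical, the terms are annotated by hand rather than
produced by the ENLP, and exact matching on lemmas stands in for whatever
notion of "similar" terms an embodiment actually uses. The sketch only
shows why storing a grammatical role and an entity tag type with each term
lets a query distinguish objects that share the same keywords.

```python
from collections import defaultdict
from typing import NamedTuple


class Term(NamedTuple):
    """One term of an enhanced data representation (hypothetical shape)."""
    text: str        # lemma of the term; "" in a query means "any term"
    role: str        # grammatical role, e.g. "subject", "verb", "object"
    entity_tag: str  # entity tag type, e.g. "country"; "" if none


class ToyIndex:
    """Toy stand-in for the indexer and data set repository."""

    def __init__(self):
        self._by_term_role = defaultdict(set)  # (text, role) -> object ids
        self._by_tag_role = defaultdict(set)   # (tag, role)  -> object ids

    def add(self, object_id, terms):
        # Index every term by its role, and by its entity tag when present.
        for t in terms:
            self._by_term_role[(t.text, t.role)].add(object_id)
            if t.entity_tag:
                self._by_tag_role[(t.entity_tag, t.role)].add(object_id)

    def search(self, query_terms):
        # An object matches only if it satisfies every query term.
        result = None
        for q in query_terms:
            if q.text:
                hits = self._by_term_role.get((q.text, q.role), set())
            else:
                hits = self._by_tag_role.get((q.entity_tag, q.role), set())
            result = set(hits) if result is None else result & hits
        return result or set()


index = ToyIndex()
# s1: "Argentina exports beef."
index.add("s1", [Term("argentina", "subject", "country"),
                 Term("export", "verb", ""),
                 Term("beef", "object", "")])
# s2: "Japan imports beef from Argentina."
index.add("s2", [Term("japan", "subject", "country"),
                 Term("import", "verb", ""),
                 Term("beef", "object", ""),
                 Term("argentina", "modifier", "country")])

# "What does Argentina export?" (Argentina must be the subject)
by_role = index.search([Term("argentina", "subject", ""),
                        Term("export", "verb", "")])

# "Which country imports beef?" (subject constrained by entity tag only)
by_tag = index.search([Term("", "subject", "country"),
                       Term("import", "verb", ""),
                       Term("beef", "object", "")])
```

Both sample objects contain the keyword "argentina", so a purely
keyword-based search could not separate them. In this sketch the first
query matches only s1, where Argentina is the subject, and the second
matches only s2, where a term tagged as a country is the subject of
"import".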