A system and method for inferring informational goals and the preferred
level of detail in answers to questions posed to computer-based
information retrieval or question-answering systems are provided. The
system includes a query subsystem that can receive an input query and
extrinsic data associated with the query, and that can output an answer
to the query, rephrased queries, and/or sample queries. The query
subsystem accesses an inference model to infer a probability distribution
over a user's goals, age, and preferred level of detail of an answer. One
application of the system includes determining a user's likely
informational goals and then accessing a knowledge data store to retrieve
responsive information. The system includes a natural language processor
that parses queries into observable linguistic features and embedded
semantic components, which can be used to retrieve conditional
probabilities from the inference model. The inference model is built by
employing supervised learning and statistical analysis on a set of
queries suitable to be presented to a question-answering system. Such a
set of queries can be manipulated to produce different inference models
based on demographic and/or localized linguistic data.
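
The inference step described above can be illustrated with a minimal sketch. The abstract does not specify the form of the inference model, so a naive Bayes classifier is swapped in here purely as a stand-in. The feature extractor, the feature names, the goal labels, and the training queries are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def extract_features(query):
    """Toy stand-in for the natural language processor: maps a query to a
    set of observable linguistic features (hypothetical feature names)."""
    words = query.lower().rstrip("?").split()
    features = {"first_word=" + words[0]} if words else set()
    features.add("long_query" if len(words) > 6 else "short_query")
    return features


def train(labeled_queries):
    """Supervised learning: count feature/goal co-occurrences in a labeled set."""
    goal_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocabulary = set()
    for query, goal in labeled_queries:
        goal_counts[goal] += 1
        for feature in extract_features(query):
            feature_counts[goal][feature] += 1
            vocabulary.add(feature)
    return goal_counts, feature_counts, vocabulary


def infer_goal_distribution(model, query):
    """Return a probability distribution over goals given the query's features,
    under a naive Bayes assumption (an assumption of this sketch)."""
    goal_counts, feature_counts, vocabulary = model
    total = sum(goal_counts.values())
    features = extract_features(query)
    scores = {}
    for goal, count in goal_counts.items():
        score = count / total  # prior P(goal)
        for feature in vocabulary:
            # Smoothed conditional probability P(feature present | goal)
            p = (feature_counts[goal][feature] + 1) / (count + 2)
            score *= p if feature in features else (1 - p)
        scores[goal] = score
    norm = sum(scores.values())
    return {goal: score / norm for goal, score in scores.items()}


# Hypothetical labeled queries; a different set (for example, one drawn from
# a particular demographic) would yield a different inference model.
TRAINING = [
    ("Who invented the telephone?", "identify_person"),
    ("Who wrote Hamlet?", "identify_person"),
    ("What is photosynthesis?", "definition"),
    ("What is a neutron star?", "definition"),
    ("How do I change a flat tire on a bicycle?", "procedure"),
    ("How do I bake bread?", "procedure"),
]

model = train(TRAINING)
distribution = infer_goal_distribution(model, "Who discovered penicillin?")
```

In this sketch, other inferred variables such as the preferred level of detail could be handled the same way by training on queries labeled with that variable instead of a goal.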