A method, computer-readable medium, and system are provided for collecting
new words to add to a lexicon for an agglutinative language. In the
method, a log of queries submitted to a search engine is obtained. The
log of queries is sorted to obtain sorted queries. The sorted queries are
then filtered using a plurality of heuristic criteria to obtain a
candidate list of new words. Words from the candidate list are then
added to the lexicon.
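
By way of illustration only, the steps of the method might be sketched as follows in Python. The abstract does not enumerate the heuristic criteria, so the filters shown here (a minimum query frequency, a single-token requirement, an alphabetic-characters requirement, length bounds, and absence from the existing lexicon) are assumptions chosen for the example, as are the function name `collect_new_words` and its parameters.

```python
from itertools import groupby


def collect_new_words(query_log, lexicon, min_count=2, min_len=2, max_len=30):
    """Return a candidate list of new words drawn from a search query log.

    All thresholds and filters below are hypothetical; the source does not
    specify which heuristic criteria are applied.
    """
    # Normalise and sort the log so that identical queries become adjacent.
    sorted_queries = sorted(
        q.strip().lower() for q in query_log if q.strip()
    )

    candidates = []
    for query, group in groupby(sorted_queries):
        count = sum(1 for _ in group)

        # Assumed heuristic 1: the query must recur often enough.
        if count < min_count:
            continue
        # Assumed heuristic 2: the query must be a single token.
        if len(query.split()) != 1:
            continue
        # Assumed heuristic 3: letters only (str.isalpha is Unicode-aware).
        if not query.isalpha():
            continue
        # Assumed heuristic 4: plausible word length.
        if not (min_len <= len(query) <= max_len):
            continue
        # Assumed heuristic 5: not already present in the lexicon.
        if query in lexicon:
            continue

        candidates.append(query)

    return candidates


def add_to_lexicon(lexicon, candidates):
    """Add the candidate words to the lexicon (a set) in place."""
    lexicon.update(candidates)
    return lexicon
```

In this sketch the lexicon is modelled as a plain set of strings. A caller would pass the query log as a list of strings, then pass the returned candidates to `add_to_lexicon`. In practice the candidate list might also be reviewed before any word is added.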