The present invention provides a computer-readable medium and system for selecting a set of n-grams for indexing string data in a DBMS system. Aspects of the invention include providing a set of candidate n-grams, each n-gram comprising a sequence of characters; identifying sample queries having character strings containing the candidate n-grams; and based on the set of candidate n-grams, the sample queries, database records, and an n-gram space constraint, automatically selecting, given the space constraint, a minimal set of an n-grams from the set of candidate n-grams that minimizes the number of false hits for the set of sample queries had the sample queries been executed against the database records.

 
Web www.patentalert.com

< Method for identifying data characteristics for flash memory

> Accessing a data structure

~ 00492