Systems, methodologies, media, and other embodiments associated with
efficiently computing document similarity are described. One exemplary
system embodiment includes logic to produce a gram from a string and
logic to identify candidate documents based on identifying matches
between query grams and document grams stored in an inverted index that
relates grams to documents. The example system may also include logic to
selectively partially reconstruct a candidate document from entries in
the inverted index and logic to compute an edit distance between a string
associated with a query and a string associated with the partially
reconstructed candidate document. The example system may also include
signal logic configured to provide a signal corresponding to the edit
distance.
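
The flow described above can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes that grams are fixed-length character substrings, that each inverted-index posting records a document identifier and a character offset (so that text can be rebuilt from the index alone), and that the edit distance is the Levenshtein distance. All function names, the gram length `n`, the `min_shared` threshold, and the `fill` placeholder are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def grams(text, n=3):
    """Yield (offset, gram) for every length-n substring of text."""
    for i in range(len(text) - n + 1):
        yield i, text[i:i + n]


def build_index(docs, n=3):
    """Build an inverted index mapping gram -> list of (doc_id, offset)."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for offset, gram in grams(text, n):
            index[gram].append((doc_id, offset))
    return index


def candidates(index, query, n=3, min_shared=1):
    """Return ids of documents sharing >= min_shared distinct query grams."""
    shared = defaultdict(set)
    for _, gram in grams(query, n):
        for doc_id, _ in index.get(gram, ()):
            shared[doc_id].add(gram)
    return {d for d, found in shared.items() if len(found) >= min_shared}


def reconstruct(index, doc_id, start, length, fill="?"):
    """Rebuild characters [start, start + length) of a document from
    index postings only. Positions no posting covers keep the fill
    character. Scanning every posting is done for brevity; a practical
    system would presumably restrict the scan."""
    end = start + length
    chars = [fill] * length
    for gram, postings in index.items():
        for posting_doc, offset in postings:
            if posting_doc != doc_id:
                continue
            for k, ch in enumerate(gram):
                if start <= offset + k < end:
                    chars[offset + k - start] = ch
    return "".join(chars)


def edit_distance(a, b):
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


# Hypothetical two-document corpus and a misspelled query string.
docs = {"d1": "the quick brown fox", "d2": "lorem ipsum dolor"}
index = build_index(docs)
query = "quick brwn"

for doc_id in sorted(candidates(index, query)):
    # Rebuild only an 11-character window starting at offset 4.
    window = reconstruct(index, doc_id, start=4, length=11)
    print(doc_id, repr(window), edit_distance(query, window))
```

In this sketch the printed distance plays the role of the signal corresponding to the edit distance. The window offset and length are fixed by hand here; how a real system would pick the portion to reconstruct is not stated in the text above and is left open.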