A scheduler for a search engine crawler includes a history log containing
document identifiers (e.g., URLs) corresponding to documents (e.g., web
pages) on a network (e.g., the Internet). The scheduler is configured to
process each document identifier in a set of the document identifiers by
determining a content change frequency of the document corresponding to
the document identifier, determining a first score for the document
identifier that is a function of the determined content change frequency
of the corresponding document, comparing the first score against a
threshold value, and scheduling the corresponding document for indexing
based on the result of the comparison.
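
The per-identifier processing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the HistoryRecord fields, the ratio used to estimate content change frequency, the identity scoring function, and the rule that a score at or above the threshold leads to scheduling are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HistoryRecord:
    """Hypothetical history-log entry for one document identifier."""
    url: str
    crawl_count: int   # assumed field: number of times the document was fetched
    change_count: int  # assumed field: fetches whose content differed from the prior fetch


def content_change_frequency(rec: HistoryRecord) -> float:
    """Estimate how often the document changes (assumed: changes per crawl)."""
    if rec.crawl_count == 0:
        # Assumption: a document with no crawl history is treated as always changing.
        return 1.0
    return rec.change_count / rec.crawl_count


def first_score(change_frequency: float) -> float:
    """Score as a function of change frequency (assumed: the identity function)."""
    return change_frequency


def schedule(history_log: List[HistoryRecord], threshold: float) -> List[str]:
    """Return the identifiers whose documents are scheduled for indexing."""
    scheduled = []
    for rec in history_log:
        freq = content_change_frequency(rec)
        score = first_score(freq)
        # Assumption: a score at or above the threshold means "schedule".
        if score >= threshold:
            scheduled.append(rec.url)
    return scheduled
```

For instance, with a threshold of 0.5, a record with 8 changes in 10 crawls would be scheduled under these assumptions, while a record with 1 change in 10 crawls would not.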