A document similarity detector may be used to determine a family of
documents based on a similarity analysis between content of a seed
document and content of the family of documents, the content of the seed
document associated with at least one database object having at least one
field. A content extraction system may be used to determine a ranking of
a plurality of terms from within at least one document of the family of
documents, based on a relative frequency with which each of the plurality
of terms appears within the family of documents, and to
extract at least one term from the plurality of terms as being associated
with a value of the at least one field, based on the ranking.
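
The flow described above can be illustrated with a minimal sketch. It is not the claimed implementation. It rests on several assumptions that do not come from the text: Jaccard overlap of word sets stands in for the similarity analysis, the 0.3 threshold is arbitrary, relative frequency is taken as the share of family documents that contain a term, and rarer terms are ranked first on the guess that text shared across the family is template wording rather than a field value. All function names are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard(tokens_a, tokens_b):
    """Jaccard overlap of two term lists (assumed similarity measure)."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def find_family(seed, corpus, threshold=0.3):
    """Return corpus documents similar enough to the seed document."""
    seed_tokens = tokenize(seed)
    return [doc for doc in corpus
            if jaccard(seed_tokens, tokenize(doc)) >= threshold]


def rank_terms(document, family):
    """Rank the terms of one document by relative frequency in the family.

    Relative frequency here is the fraction of family documents containing
    the term. Rarer terms come first (an assumption); ties break
    alphabetically so the ranking is deterministic.
    """
    doc_freq = Counter()
    for doc in family:
        doc_freq.update(set(tokenize(doc)))
    terms = set(tokenize(document))
    return sorted(terms, key=lambda t: (doc_freq[t] / len(family), t))


def extract_field_value(document, family):
    """Pick the top-ranked term as the candidate value for the field."""
    ranking = rank_terms(document, family)
    return ranking[0] if ranking else None


# Hypothetical data: the seed is tied to a database object with a
# "customer" field; the family shares the seed's template wording.
seed = "Order confirmation for customer Acme Corp"
corpus = [
    "Order confirmation for customer Globex",
    "Order confirmation for customer Initech",
    "Quarterly weather report for the coastal region",
]
family = find_family(seed, corpus)
values = [extract_field_value(doc, family) for doc in family]
```

With this sample data the weather report falls below the threshold, and the term that varies between the two remaining documents is selected as the candidate field value for each.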