A harvester is disclosed for harvesting metadata of managed objects (files
and directories) across the file systems of an enterprise environment,
which are generally not interoperable. Harvested metadata may
include 1) file system attributes such as size, owner, recency; 2)
content-specific attributes such as the presence or absence of various
keywords (or combinations of keywords) within documents as well as
concepts composed of natural language entities; 3) synthetic attributes
such as mathematical checksums or hashes of file contents; and 4)
high-level semantic attributes that serve to classify and categorize
files and documents. The classification itself can trigger an action in
compliance with a policy rule. Harvested metadata are stored in a
metadata repository to facilitate the automated or semi-automated
application of policies.
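
The four kinds of harvested metadata and the policy trigger described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the names `HarvestedMetadata`, `harvest`, `classify`, and `apply_policies`, the "sensitive" category, the "restrict-access" action, and the use of a plain dictionary as the metadata repository are all hypothetical, chosen only for illustration. Concept extraction from natural language entities is omitted, and SHA-256 stands in for whichever checksum or hash a real harvester would use.

```python
import hashlib
import os
from dataclasses import dataclass, field


@dataclass
class HarvestedMetadata:
    """Hypothetical record for one managed object (a single file)."""
    path: str
    size: int          # 1) file system attribute
    owner_uid: int     # 1) file system attribute (numeric owner id)
    mtime: float       # 1) recency, as last-modified time
    keywords: dict     # 2) content-specific: keyword -> present or absent
    sha256: str        # 3) synthetic attribute: hash of file contents
    categories: list = field(default_factory=list)  # 4) semantic attributes


def harvest(path, keywords):
    """Collect file system, content-specific, and synthetic attributes."""
    st = os.stat(path)
    with open(path, "rb") as fh:
        data = fh.read()
    # Case-insensitive keyword matching is an assumption of this sketch.
    text = data.decode("utf-8", errors="replace").lower()
    return HarvestedMetadata(
        path=path,
        size=st.st_size,
        owner_uid=st.st_uid,
        mtime=st.st_mtime,
        keywords={k: k.lower() in text for k in keywords},
        sha256=hashlib.sha256(data).hexdigest(),
    )


def classify(meta):
    """Assign a high-level category from the harvested keywords (toy rule)."""
    if meta.keywords.get("confidential"):
        meta.categories.append("sensitive")
    return meta


def apply_policies(meta, repository):
    """Store the record in the repository and return triggered actions."""
    repository[meta.path] = meta
    actions = []
    # Hypothetical policy rule: the classification itself triggers an action.
    if "sensitive" in meta.categories:
        actions.append("restrict-access")
    return actions
```

In this sketch, calling `apply_policies(classify(harvest(path, ["confidential"])), repository)` on a file that contains the word "confidential" stores the record and returns the single action "restrict-access"; a real policy engine would presumably evaluate many rules against the stored metadata.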