In one embodiment, documents accessible via a designated public account
are classified as public. In another embodiment, documents accessible
according to a designated public access control list are classified as
public. In some embodiments, all documents not classified as public are
classified as private. Content in the public documents is linguistically
analyzed, resulting in a set of keys for use in subsequent full and
partial content matching. The keys and associated file names are stored
in a public-content identification repository. Similarly, content in the
private documents is linguistically analyzed, and the results are stored
in a private-content identification repository. Subsequently, full and
partial content matching is performed on monitored content according to
information in the public and private repositories. In a related aspect,
monitored content found to correspond to private content is selectively
flagged during electronic transmission or optionally prevented from
distribution according to a set of defined monitoring policies.
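
The flow described above can be illustrated with a minimal sketch. It is not the claimed implementation: the linguistic analysis is replaced here by a simple stand-in (hashing sliding windows of normalized words), and every name, threshold, and sample document (`make_keys`, `build_repository`, `classify`, `match_content`, `apply_policy`, `WINDOW`, `flag_at`, `block_at`) is a hypothetical chosen for illustration. The sketch also assumes that a key present in the public repository never counts toward a private match, which is one plausible way of using both repositories.

```python
import hashlib
import re
from collections import defaultdict

WINDOW = 4  # words per key; hypothetical value, not taken from the source


def make_keys(text, window=WINDOW):
    """Stand-in for linguistic analysis: hash each sliding window of words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return set()
    if len(words) < window:
        spans = [words]
    else:
        spans = [words[i:i + window] for i in range(len(words) - window + 1)]
    return {hashlib.sha1(" ".join(s).encode("utf-8")).hexdigest() for s in spans}


def build_repository(documents):
    """Map each key to the set of file names whose content produced it."""
    repo = defaultdict(set)
    for name, text in documents.items():
        for key in make_keys(text):
            repo[key].add(name)
    return repo


def classify(documents, public_names):
    """Documents reachable via the public account are public; the rest private."""
    public = {n: t for n, t in documents.items() if n in public_names}
    private = {n: t for n, t in documents.items() if n not in public_names}
    return public, private


def match_content(monitored_text, public_repo, private_repo):
    """Return (fraction of keys matching private-only content, matched files)."""
    keys = make_keys(monitored_text)
    if not keys:
        return 0.0, set()
    # Assumption: keys also found in public content are not treated as private.
    hits = {k for k in keys if k in private_repo and k not in public_repo}
    files = set()
    for k in hits:
        files |= private_repo[k]
    return len(hits) / len(keys), files


def apply_policy(score, flag_at=0.2, block_at=0.8):
    """Hypothetical monitoring policy: allow, flag, or block by match score."""
    if score >= block_at:
        return "block"
    if score >= flag_at:
        return "flag"
    return "allow"


# Hypothetical sample documents and public access list.
documents = {
    "press.txt": "The company announced record quarterly revenue today",
    "plan.txt": "Project Falcon will launch in the third quarter "
                "with a new pricing model",
}
public_docs, private_docs = classify(documents, public_names={"press.txt"})
public_repo = build_repository(public_docs)
private_repo = build_repository(private_docs)

outgoing = "FYI: Project Falcon will launch in the third quarter with a new pricing model"
score, files = match_content(outgoing, public_repo, private_repo)
print(apply_policy(score), sorted(files))
```

In this sketch a near-verbatim copy of a private document yields a high score and is blocked, a message sharing only a short phrase with private content yields a low but nonzero score and is flagged, and text that matches only public content passes. A production system would substitute a real linguistic analysis for `make_keys` and persistent storage for the in-memory dictionaries.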