A document collection apparatus collects documents for a community from a
network. Before collection starts, an initial document group is
given as a starting point of the collection. A reference extraction unit
extracts references from the initial document group. A next prospect
determination unit determines a prospect, that is, a prospective document
to be collected next. A document collection unit collects that prospect
and adds it to the collected documents. The document
collection unit first collects documents evenly from inside the community
in the network; the next prospect determination unit then determines the
prospect to be collected next from both inside and outside the community.
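
The two-phase flow described above can be sketched roughly as follows. This is an illustrative sketch under several assumptions that the text does not state: the network is modeled as an in-memory dictionary mapping each URL to the URLs it references, community membership is decided by host name, "evenly" is read as round-robin over community hosts, and the next prospect in the second phase is the uncollected document referenced by the most collected documents. The names `NETWORK`, `COMMUNITY_HOSTS`, `host_of`, and `collect` are hypothetical.

```python
from collections import Counter, deque
from urllib.parse import urlparse

# Hypothetical toy network: URL -> URLs referenced by that document.
NETWORK = {
    "http://a.example/seed": ["http://a.example/1", "http://b.example/1",
                              "http://x.test/hub"],
    "http://a.example/1": ["http://a.example/2", "http://x.test/hub"],
    "http://a.example/2": [],
    "http://b.example/1": ["http://b.example/2", "http://x.test/hub"],
    "http://b.example/2": ["http://y.test/page"],
    "http://x.test/hub": ["http://a.example/2"],
}
# Assumption: a document is "inside the community" if its host is listed here.
COMMUNITY_HOSTS = {"a.example", "b.example"}


def host_of(url):
    return urlparse(url).netloc


def collect(initial_docs, network, community_hosts, phase1_steps, phase2_steps):
    collected = []             # documents in the order they were collected
    seen = set()               # URLs already collected
    inbound = Counter()        # URL -> number of collected documents citing it
    queues = {}                # community host -> pending URLs on that host
    queued = set(initial_docs)

    def add(url):
        """Collect one document and extract its references."""
        collected.append(url)
        seen.add(url)
        for ref in sorted(set(network.get(url, []))):
            inbound[ref] += 1
            host = host_of(ref)
            if host in community_hosts and ref not in queued:
                queued.add(ref)
                queues.setdefault(host, deque()).append(ref)

    # The initial document group is the starting point of the collection.
    for url in initial_docs:
        add(url)

    # Phase 1: collect evenly inside the community (round-robin over hosts).
    steps = 0
    while steps < phase1_steps and any(queues.values()):
        for host in sorted(queues):
            if steps == phase1_steps:
                break
            if queues[host]:
                add(queues[host].popleft())
                steps += 1

    # Phase 2: pick prospects from inside and outside the community,
    # preferring the uncollected URL with the most references (assumed rule).
    for _ in range(phase2_steps):
        candidates = [u for u in inbound if u not in seen]
        if not candidates:
            break
        add(min(candidates, key=lambda u: (-inbound[u], u)))

    return collected
```

With this toy data, `collect(["http://a.example/seed"], NETWORK, COMMUNITY_HOSTS, 2, 2)` first takes one document from each community host and only then reaches the outside page `http://x.test/hub`, which by that point is referenced by three collected documents. The scoring rule is a stand-in; any other selection criterion could be substituted in the second phase.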