A method and system are provided for identifying groups in large-scale
networks. Each large-scale network includes a collection of nodes and
edges that may represent entities or individuals and the relationships
between them.
The large-scale network is split into a number of fractions satisfying an
edge threshold. The nodes in each fraction are then merged, based on a
specified similarity metric, to generate one or more clusters. The
large-scale network is recursively split and clustered until distinct
groups are identified.
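
The split, merge, and recurse steps summarized above can be illustrated with a minimal Python sketch. It is not the claimed implementation, and several details are assumptions made only for illustration: the edge threshold is read as a cap on the number of edges inside a fraction, the similarity metric is taken to be the Jaccard overlap of neighbor sets, and the recursion is modeled by treating each cluster as a single node of a smaller network and repeating until no further merges occur. The names `split_into_fractions`, `merge_fraction`, and `identify_groups`, and the parameters `edge_threshold` and `min_similarity`, are hypothetical.

```python
from itertools import combinations


def build_adjacency(nodes, edges):
    """Undirected adjacency sets; every edge endpoint must appear in `nodes`."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def split_into_fractions(adj, edge_threshold):
    """Greedily pack nodes into fractions holding at most `edge_threshold`
    internal edges (an assumed reading of "satisfying an edge threshold")."""
    fractions, current, internal = [], [], 0
    for n in sorted(adj):
        added = len(adj[n].intersection(current))
        if current and internal + added > edge_threshold:
            fractions.append(current)
            current, internal, added = [], 0, 0
        current.append(n)
        internal += added
    if current:
        fractions.append(current)
    return fractions


def jaccard(a, b):
    """Assumed similarity metric: overlap of two neighbor sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_fraction(fraction, adj, min_similarity):
    """Merge nodes of one fraction whose similarity reaches `min_similarity`."""
    parent = {n: n for n in fraction}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for u, v in combinations(fraction, 2):
        if jaccard(adj[u], adj[v]) >= min_similarity:
            parent[find(v)] = find(u)
    clusters = {}
    for n in fraction:
        clusters.setdefault(find(n), set()).add(n)
    return list(clusters.values())


def _recurse(members, edges, edge_threshold, min_similarity):
    # `members` maps each current node to the original nodes it stands for.
    adj = build_adjacency(members, edges)
    new_members, owner = {}, {}
    for fraction in split_into_fractions(adj, edge_threshold):
        for cluster in merge_fraction(fraction, adj, min_similarity):
            rep = min(cluster)  # representative node for the merged cluster
            new_members[rep] = set().union(*(members[n] for n in cluster))
            for n in cluster:
                owner[n] = rep
    if len(new_members) == len(members):  # nothing merged: groups are stable
        return sorted(sorted(group) for group in members.values())
    # Each cluster becomes one node of a smaller network; repeat on it.
    new_edges = {(owner[u], owner[v]) for u, v in edges if owner[u] != owner[v]}
    return _recurse(new_members, new_edges, edge_threshold, min_similarity)


def identify_groups(nodes, edges, edge_threshold=10, min_similarity=0.5):
    """Split, merge, and repeat until the set of groups stops changing."""
    members = {n: {n} for n in nodes}
    return _recurse(members, list(edges), edge_threshold, min_similarity)


# Two triangles joined by a single edge (3, 4).
example_edges = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (3, 4)]
groups = identify_groups(range(1, 7), example_edges,
                         edge_threshold=3, min_similarity=0.2)
print(groups)  # [[1, 2, 3], [4, 5, 6]]
```

In this sketch, the greedy split depends on node ordering, so nodes placed in different fractions can only be merged in a later pass over the smaller network. A different partitioning strategy or similarity metric could be substituted without changing the overall split, merge, and repeat structure.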