The present invention relates to a method, system and computer program product
for clustering data points and its application to text summarization, customer
profiling for web personalization and product cataloging.
The method for clustering data points with defined quantified relationships between
them comprises the steps of obtaining a lead value for each data point, either by
deriving it from said quantified relationships or as given input; ranking each data
point in a lead value sequence list in descending order of lead value; assigning
the first data point in said lead value sequence list as the leader of the first
cluster; and considering each subsequent data point in said lead value sequence
list as the leader of a new cluster if its relationship with the leader of each
of the previous clusters is less than a defined threshold value, or otherwise as a
member of each of the one or more clusters where its relationship with the cluster
leader is more than or equal to said threshold value. Said relationships between data points
are symmetric or asymmetric. A corresponding system and computer program product
are also claimed.
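
The claimed steps can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name leader_cluster, the matrix layout, and the sample values are hypothetical, lead values are taken as given input rather than derived, and for asymmetric relationships the entry rel[p][q] is assumed to mean the relationship of point p with leader q.

```python
from typing import Dict, List, Sequence


def leader_cluster(
    lead: Sequence[float],
    rel: Sequence[Sequence[float]],
    threshold: float,
) -> Dict[int, List[int]]:
    """Return {leader index: [member indices, leader first]}."""
    # Lead value sequence list: indices in descending order of lead value.
    # sorted() is stable, so points with equal lead values keep input order.
    order = sorted(range(len(lead)), key=lambda i: lead[i], reverse=True)

    clusters: Dict[int, List[int]] = {}
    for p in order:
        # Existing leaders whose relationship with p meets the threshold.
        # Assumed direction: rel[p][q] is the relationship of p with leader q.
        close = [q for q in clusters if rel[p][q] >= threshold]
        if close:
            # p joins every cluster whose leader is close enough,
            # so a point may belong to more than one cluster.
            for q in close:
                clusters[q].append(p)
        else:
            # Below the threshold for every previous leader (or no leader
            # exists yet): p becomes the leader of a new cluster.
            clusters[p] = [p]
    return clusters


if __name__ == "__main__":
    # Hypothetical data: four points with a symmetric relationship matrix.
    lead = [0.9, 0.4, 0.7, 0.2]
    rel = [
        [1.0, 0.8, 0.1, 0.6],
        [0.8, 1.0, 0.2, 0.3],
        [0.1, 0.2, 1.0, 0.7],
        [0.6, 0.3, 0.7, 1.0],
    ]
    print(leader_cluster(lead, rel, threshold=0.5))
```

With this sample data, points 0 and 2 become leaders, point 1 joins the cluster led by point 0, and point 3 joins both clusters because its relationship with each leader is at least the threshold.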