The invention provides a method and system for comparing data objects. Each
data object is converted into a directed acyclic graph forest, which
comprises one or more directed acyclic graphs. The directed acyclic graph
forests corresponding to the data objects are then compared to calculate a
similarity score, which serves as a measure of the extent of similarity
between the data objects.
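
The flow described above (convert each data object into a directed acyclic
graph forest, then compare the forests to obtain a score) can be illustrated
with a minimal sketch. Everything in it is an assumption made for
illustration and is not taken from the invention itself: data objects are
modelled as nested Python dicts and lists, identical substructures are
merged into one shared node to obtain a DAG, and the score is a Jaccard
ratio over node signatures. The names DagForest, to_dag_forest and
similarity are hypothetical.

```python
from typing import Any, Dict, List, Tuple

# A node is identified by its structural signature: (label, child signatures).
Signature = Tuple[Any, ...]


class DagForest:
    """Hypothetical DAG forest: one root per top-level part of a data object."""

    def __init__(self) -> None:
        # signature -> child signatures; equal substructures map to one node,
        # which is what turns the trees into directed acyclic graphs.
        self.children: Dict[Signature, Tuple[Signature, ...]] = {}
        self.roots: List[Signature] = []

    def _intern(self, label: str, kids: Tuple[Signature, ...]) -> Signature:
        sig = (label, kids)
        self.children.setdefault(sig, kids)
        return sig

    def _build(self, value: Any) -> Signature:
        if isinstance(value, dict):
            # Sort by key text so that key order does not affect the result.
            items = sorted(value.items(), key=lambda kv: str(kv[0]))
            kids = tuple(
                self._intern(f"key:{k}", (self._build(v),)) for k, v in items
            )
            return self._intern("dict", kids)
        if isinstance(value, (list, tuple)):
            return self._intern("list", tuple(self._build(v) for v in value))
        return self._intern(f"leaf:{value!r}", ())

    def add_root(self, value: Any) -> None:
        self.roots.append(self._build(value))


def to_dag_forest(parts: List[Any]) -> DagForest:
    """Convert a data object, given as a list of top-level parts, to a forest."""
    forest = DagForest()
    for part in parts:
        forest.add_root(part)
    return forest


def similarity(a: DagForest, b: DagForest) -> float:
    """Assumed scoring rule: Jaccard ratio of the two forests' node sets."""
    nodes_a, nodes_b = set(a.children), set(b.children)
    if not nodes_a and not nodes_b:
        return 1.0  # two empty objects are treated as identical
    return len(nodes_a & nodes_b) / len(nodes_a | nodes_b)


if __name__ == "__main__":
    first = to_dag_forest([{"x": 1}])
    second = to_dag_forest([{"x": 1}, {"y": 2}])
    print(similarity(first, second))  # 0.5: 3 shared nodes out of 6 distinct
```

Under these assumptions, a score of 1.0 means the two forests contain exactly
the same nodes and 0.0 means they share none. A different scoring rule, for
example one that weights nodes by depth or size, could be substituted
without changing the conversion step.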