The invention relates to a method for comparing and analysing digital
documents. It is based on the principle of searching for unambiguous
roots in both documents. A root is a unit that occurs in both documents
and is unique within each of them. Roots can be individual words, word
groups or other unambiguous textual formatting functions. For each root,
the identical root is then sought in the respective other document
(Root1 from Content1, and Root2 from Content2, with Root1=Root2).
Once a pair has been found, the area around these roots
is compared until the two documents no longer agree. During this area
search, both the words preceding the root and the words following it
are analysed.
The areas which are found in this way, Area1 around Root1 and Area2
around Root2, are stored in lists, List1 and List2, which are allocated
to Doc1 and Doc2. This procedure is repeated until no further roots can
be found. The result is either a remaining area with no overlaps between
the two documents, or complete identity of the documents.
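
The procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: roots are restricted to single whitespace-separated words, uniqueness is judged among the words not yet assigned to an area, and the names find_roots, grow_area and compare are hypothetical.

```python
from collections import Counter


def find_roots(words1, words2, used1, used2):
    """Return (i, j) index pairs of words that occur exactly once among the
    not-yet-matched words of each document (assumption: single-word roots)."""
    free1 = [i for i in range(len(words1)) if not used1[i]]
    free2 = [j for j in range(len(words2)) if not used2[j]]
    count1 = Counter(words1[i] for i in free1)
    count2 = Counter(words2[j] for j in free2)
    # Position of every word that is unique among the free words of document 2.
    pos2 = {words2[j]: j for j in free2 if count2[words2[j]] == 1}
    return [(i, pos2[words1[i]]) for i in free1
            if count1[words1[i]] == 1 and words1[i] in pos2]


def grow_area(words1, words2, used1, used2, i, j):
    """Extend the match around root (i, j) backwards and forwards until the
    words differ, a document boundary is hit, or a word is already matched.
    Areas are returned as half-open (start, end) word index ranges."""
    start1, start2 = i, j
    while (start1 > 0 and start2 > 0
           and not used1[start1 - 1] and not used2[start2 - 1]
           and words1[start1 - 1] == words2[start2 - 1]):
        start1 -= 1
        start2 -= 1
    end1, end2 = i + 1, j + 1
    while (end1 < len(words1) and end2 < len(words2)
           and not used1[end1] and not used2[end2]
           and words1[end1] == words2[end2]):
        end1 += 1
        end2 += 1
    return (start1, end1), (start2, end2)


def compare(text1, text2):
    """Return List1, List2 and the unmatched remainder of each document."""
    words1, words2 = text1.split(), text2.split()
    used1, used2 = [False] * len(words1), [False] * len(words2)
    list1, list2 = [], []
    while True:
        roots = find_roots(words1, words2, used1, used2)
        if not roots:
            break  # no roots can be found any longer
        for i, j in roots:
            if used1[i] or used2[j]:
                continue  # already covered by an area grown from an earlier root
            area1, area2 = grow_area(words1, words2, used1, used2, i, j)
            for k in range(*area1):
                used1[k] = True
            for k in range(*area2):
                used2[k] = True
            list1.append(area1)
            list2.append(area2)
    rest1 = [w for w, u in zip(words1, used1) if not u]
    rest2 = [w for w, u in zip(words2, used2) if not u]
    return list1, list2, rest1, rest2
```

For example, comparing "the quick brown fox jumps over the lazy dog" with "a quick brown fox leaps over the lazy cat" in this sketch yields the areas (1, 4) and (5, 8) in both lists and leaves "the jumps dog" and "a leaps cat" as the remaining, non-overlapping words. Two identical texts made up of distinct words produce a single area spanning the whole text and an empty remainder.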