A method and apparatus for the automatic insertion of hypertext links into
a passage or document of encoded text is disclosed. A program, resident on
a personal computer, for example, receives and parses input text in HTML
format. In a first stage of processing, label strings identifying each
paragraph number are located in the input document and converted
into an unambiguous format. Next, the text is re-read, with the
section/paragraph headers masked off, to locate text strings within the
body of the text that cross-reference section headers, term
definitions, or external links. These are also converted into an unambiguous
format. Finally, the cross-references are matched up as far as possible
with section/paragraph headers and the original HTML text is marked up
automatically with hyperlinks, using the unambiguous section labels and
cross-references as HTML anchors and destinations.
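
The three passes described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the regular expressions, the `sec-` anchor naming scheme, and the function names `normalize` and `insert_links` are all assumptions made for illustration, and only numeric section labels (not term definitions or external links) are handled.

```python
import re

# Assumed header form: a numeric label at the start of a <p> element, e.g. "<p>1.2 Scope".
HEADER_RE = re.compile(r"<p>\s*(\d+(?:\.\d+)*)\.?\s")
# Assumed cross-reference form: "section 1.2", "clause 1.2" or "paragraph 1.2".
XREF_RE = re.compile(r"\b(?:section|clause|paragraph)\s+(\d+(?:\.\d+)*)", re.IGNORECASE)


def normalize(label):
    """Convert a label such as '1.2' into an unambiguous anchor name, 'sec-1-2'."""
    return "sec-" + label.replace(".", "-")


def insert_links(html):
    edits = []       # (start, end, replacement) tuples against the original text
    anchors = set()  # normalized names of all headers found
    masked = list(html)

    # Pass 1: locate header labels, record them, and mask them off with
    # spaces so that character offsets in the masked copy stay valid.
    for m in HEADER_RE.finditer(html):
        start, end = m.span(1)
        name = normalize(m.group(1))
        anchors.add(name)
        edits.append((start, end, '<a id="%s">%s</a>' % (name, m.group(1))))
        masked[start:end] = " " * (end - start)
    masked = "".join(masked)

    # Pass 2: re-read the masked text to find cross-references in the body.
    # Pass 3: keep only those that match a known header (as far as possible).
    for m in XREF_RE.finditer(masked):
        name = normalize(m.group(1))
        if name in anchors:
            edits.append((m.start(), m.end(),
                          '<a href="#%s">%s</a>' % (name, m.group(0))))

    # Apply edits from the end of the text backwards so earlier offsets hold.
    for start, end, replacement in sorted(edits, reverse=True):
        html = html[:start] + replacement + html[end:]
    return html


if __name__ == "__main__":
    sample = ("<p>1 Scope</p>"
              "<p>1.1 Terms are defined in section 2.</p>"
              "<p>2 Definitions, see clause 1.1 and section 9.</p>")
    print(insert_links(sample))
```

In this sketch a cross-reference with no matching header (such as "section 9" in the sample) is simply left unlinked, which is one possible reading of matching "as far as possible".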