A system and method automatically convert raw text into an engaging,
interactive on-line document for a plurality of viewers.
Unstructured articles are read from an information feed. A computation
process extracts and tags proper names of people, products,
organizations, and places and categorizes them. An image database is used
to link these proper names with image files. The image database consists
of a series of attribute-value pairs for active searching of names. A URL
query string is input to the database to extract the location of the
image in the database file system. An Extensible Markup Language (XML)
file is created from the raw text of the article, the list of proper
names in the processed data, and the image file references. The XML file
is stored in a file system. An Extensible Stylesheet Language (XSL) file
provides templates containing computational relationships between the
text and images. The XML file and the XSL style sheet are combined to
generate a Hypertext Markup Language (HTML) file that presents the
unstructured articles as an on-line story in a Java applet, which allows
the system to provide a variety of interactive behaviors in a final
presentation that a viewer can access from a browser.
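
The two data-handling steps described above, querying the image database with a URL query string built from attribute-value pairs and assembling the XML file from the raw text, the tagged names, and the image references, might be sketched as follows. This is only an illustrative sketch, not the claimed implementation: the function names, the XML element and attribute names, the base URL, and the sample data are all hypothetical. The XSL combination step is left out because the Python standard library does not include an XSLT processor.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_image_query(base_url, attributes):
    """Encode attribute-value pairs as a URL query string (hypothetical helper)."""
    return f"{base_url}?{urlencode(attributes)}"


def build_article_xml(raw_text, tagged_names, image_refs):
    """Assemble an XML document from text, tagged names, and image references.

    tagged_names: list of (name, category) tuples from the extraction step.
    image_refs:   dict mapping a name to its image file location.
    Element and attribute names here are assumptions, not taken from the source.
    """
    article = ET.Element("article")
    ET.SubElement(article, "text").text = raw_text
    names_el = ET.SubElement(article, "names")
    for name, category in tagged_names:
        entry = ET.SubElement(names_el, "name", category=category)
        entry.text = name
        # Attach an image reference only when the database returned one.
        if name in image_refs:
            entry.set("image", image_refs[name])
    return ET.tostring(article, encoding="unicode")


# Hypothetical sample data for illustration only.
query = build_image_query(
    "http://images.example/lookup",
    {"name": "Ada Lovelace", "category": "person"},
)
xml_doc = build_article_xml(
    "Ada Lovelace visited London.",
    [("Ada Lovelace", "person"), ("London", "place")],
    {"Ada Lovelace": "/images/people/ada_lovelace.jpg"},
)
```

In this sketch each proper name carries its category and, where available, its image location as attributes, so that a style sheet template could select names by category and pair them with images.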