A method and system for document analysis and retrieval. A remote host in
a first computing system transmits a first portion and at least one
additional portion of a document to a web service host in a second
computing system. The web service host reconstructs the entire document
from the received first portion and the at least one additional portion.
After reconstructing the entire document, the web service host implements
at least one of extracting, generating, and determining steps. The
extracting step extracts text from the entire document and configures the
text in a text format. The generating step generates document keys
associated with the text from analysis of the text in the text format.
The determining step determines, from given categories of a document
taxonomy, a set of closest categories to the document based on comparing
category keys of the given categories with the document keys.
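
The flow described above (reconstruction from portions, text extraction, key generation, and category matching) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the disclosure: the function names, the use of sequence numbers to order portions, the treatment of the document as UTF-8 plain text, word frequency as the key-generation method, and Jaccard overlap as the comparison between document keys and category keys.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would likely use a larger one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on"}


def reconstruct_document(portions):
    """Reassemble the entire document from (sequence_number, bytes) portions.

    Assumption: each portion carries a sequence number so that portions
    arriving out of order can be put back in the original order.
    """
    ordered = sorted(portions, key=lambda portion: portion[0])
    return b"".join(chunk for _, chunk in ordered)


def extract_text(document):
    """Extracting step. Assumption: the document is UTF-8 plain text."""
    return document.decode("utf-8", errors="replace")


def generate_document_keys(text, top_n=5):
    """Generating step. Assumption: keys are the most frequent non-stop words."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    return {word for word, _ in Counter(words).most_common(top_n)}


def closest_categories(document_keys, category_keys, k=2):
    """Determining step. Assumption: closeness is Jaccard overlap of key sets.

    category_keys maps each category name to its set of category keys.
    """
    def jaccard(a, b):
        union = a | b
        return len(a & b) / len(union) if union else 0.0

    ranked = sorted(
        category_keys,
        key=lambda name: jaccard(document_keys, category_keys[name]),
        reverse=True,
    )
    return ranked[:k]


# Usage: two portions received out of order, then analyzed on the host side.
portions = [(1, b"rates and bank loans"), (0, b"The bank raised interest ")]
taxonomy = {
    "finance": {"bank", "interest", "loans"},
    "sports": {"team", "score"},
    "economy": {"rates", "inflation"},
}
document = reconstruct_document(portions)
keys = generate_document_keys(extract_text(document))
result = closest_categories(keys, taxonomy, k=2)
```

In this sketch the set of closest categories is simply the top `k` categories by overlap score; a threshold on the score would be an equally plausible reading of "closest".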