Vision-based document segmentation identifies one or more portions of
semantic content of a document. The one or more portions are determined
by identifying a plurality of visual blocks in the document and
detecting one or more separators between those visual blocks. A content structure for the document is
constructed based at least in part on the plurality of visual blocks and
the one or more separators, and the content structure identifies the one
or more portions of semantic content of the document. The content
structure obtained using the vision-based document segmentation can
optionally be used during document retrieval.
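
The three steps described above (identifying visual blocks, detecting separators, and constructing a content structure) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the method as defined here: blocks are reduced to vertical extents on a rendered page, only horizontal separators are considered, a separator's width stands in for its visual weight, and the names `Block`, `Node`, `detect_separators`, `build_structure`, and `outline` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Block:
    """A visual block, reduced to its vertical extent on the rendered page."""
    name: str
    top: int
    bottom: int


@dataclass
class Node:
    """A node of the content structure; a node without children is a leaf."""
    blocks: list
    children: list = field(default_factory=list)


def detect_separators(blocks):
    """Return (start, end) horizontal gaps that no block crosses."""
    ordered = sorted(blocks, key=lambda b: b.top)
    separators = []
    covered_until = ordered[0].bottom
    for block in ordered[1:]:
        if block.top > covered_until:
            separators.append((covered_until, block.top))
        covered_until = max(covered_until, block.bottom)
    return separators


def build_structure(blocks):
    """Split blocks at the widest separators, then recurse into each group."""
    node = Node(blocks=sorted(blocks, key=lambda b: b.top))
    separators = detect_separators(blocks) if len(blocks) > 1 else []
    if not separators:
        return node
    # Assumption: a wider gap marks a stronger semantic boundary.
    widest = max(end - start for start, end in separators)
    cuts = [start for start, end in separators if end - start == widest]
    groups = [[] for _ in range(len(cuts) + 1)]
    for block in node.blocks:
        # A block's group index is the number of cuts lying above it.
        groups[sum(1 for cut in cuts if block.top > cut)].append(block)
    node.children = [build_structure(group) for group in groups]
    return node


def outline(node):
    """Render the content structure as nested lists of block names."""
    if not node.children:
        return [block.name for block in node.blocks]
    return [outline(child) for child in node.children]


# Hypothetical page: a header, two closely spaced paragraphs, and a footer.
page = [
    Block("header", 0, 50),
    Block("para1", 60, 100),
    Block("para2", 105, 150),
    Block("footer", 180, 200),
]
tree = build_structure(page)
print(outline(tree))
```

In this sketch the widest gap separates the footer from the rest of the page first, and the narrower gaps then split the header from the two paragraphs, so blocks that sit close together end up in the same portion of the content structure. A fuller treatment would also need vertical separators and visual cues other than spacing, which this example leaves out.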