Computer-readable media having computer-executable instructions, and
apparatuses, categorize a document or a corpus of documents. A Tensor Space
Model (TSM), which models text as a higher-order tensor, represents the
document or corpus of documents. Supported by techniques of multilinear
algebra, TSM provides a framework for analyzing multifactor
structures. TSM is further supported by tensor operations and tools,
such as the High-Order Singular Value Decomposition (HOSVD), which
reduces the dimensions of the higher-order tensor. The dimensionally
reduced tensor is compared with tensors that represent possible
categories. Consequently, a category is selected for the document or
corpus of documents. Experimental results on the 20 Newsgroups
dataset suggest that TSM is advantageous over a Vector Space Model (VSM)
for text classification.
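
The pipeline described above (tensor representation, HOSVD-based dimension reduction, comparison against category tensors) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the claimed implementation: the abstract does not say how text is mapped to a tensor or how tensors are compared, so the character-trigram encoding over a 27-symbol alphabet, the mean-of-documents category tensor, the cosine similarity, and every function name below (`text_to_tensor`, `hosvd_factors`, `project`, `train`, `classify`) are hypothetical choices made only for this example.

```python
import numpy as np

# Assumed alphabet: 26 lowercase letters plus space (not specified in the text).
ALPHABET = "abcdefghijklmnopqrstuvwxyz "
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def text_to_tensor(text):
    """Hypothetical encoding: count character trigrams in a 27x27x27 tensor."""
    cleaned = "".join(ch if ch in INDEX else " " for ch in text.lower())
    tensor = np.zeros((27, 27, 27))
    for a, b, c in zip(cleaned, cleaned[1:], cleaned[2:]):
        tensor[INDEX[a], INDEX[b], INDEX[c]] += 1.0
    return tensor


def unfold(tensor, mode):
    """Flatten `tensor` into a matrix whose rows index axis `mode`.

    The column order may differ from textbook conventions; the left
    singular vectors used below do not depend on it.
    """
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` (shape J x I_mode) along axis `mode`."""
    moved = np.moveaxis(tensor, mode, -1)
    return np.moveaxis(moved @ matrix.T, -1, mode)


def hosvd_factors(tensors, ranks):
    """Truncated HOSVD factor matrices for the three character modes.

    The documents are stacked along a leading axis, so the character
    modes are axes 1, 2 and 3 of the stacked tensor.
    """
    stacked = np.stack(tensors)
    factors = []
    for mode, rank in enumerate(ranks, start=1):
        u, _, _ = np.linalg.svd(unfold(stacked, mode), full_matrices=False)
        factors.append(u[:, :rank])
    return factors


def project(tensor, factors):
    """Dimensionally reduced tensor: multiply each mode by its U^T."""
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core


def train(labelled_texts, ranks=(8, 8, 8)):
    """Return shared factors and one reduced tensor per category (assumed: the mean)."""
    tensors = {label: [text_to_tensor(t) for t in texts]
               for label, texts in labelled_texts.items()}
    all_tensors = [t for group in tensors.values() for t in group]
    factors = hosvd_factors(all_tensors, ranks)
    category_cores = {label: np.mean([project(t, factors) for t in group], axis=0)
                      for label, group in tensors.items()}
    return factors, category_cores


def classify(text, factors, category_cores):
    """Select the category whose tensor is most similar (assumed: cosine)."""
    core = project(text_to_tensor(text), factors)

    def cosine(x, y):
        denom = np.linalg.norm(x) * np.linalg.norm(y)
        return float(np.sum(x * y) / denom) if denom else 0.0

    return max(category_cores, key=lambda label: cosine(core, category_cores[label]))
```

A caller would pass `train` a dictionary that maps each category label to a list of example texts, then pass the returned factors and category tensors to `classify` together with a new text. The ranks control how aggressively each mode is reduced; the default of 8 per mode is arbitrary and would need tuning on real data.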