A system and methodology are provided for filtering temporal streams of
information, such as news stories, by statistical measures of information
novelty. Various techniques can be applied to custom-tailor news feeds or
other types of information based on information that a user has already
reviewed. Methods for analyzing information novelty are provided along
with a system that personalizes and filters information for users by
identifying the novelty of stories in the context of stories they have
already reviewed. The system employs novelty-analysis algorithms that
represent each article as a bag of words and named entities. The algorithms
analyze inter- and intra-document dynamics by considering how information
evolves over time from article to article, as well as within individual
articles.
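
As an illustration of the bag-of-words novelty idea described above, the following is a minimal sketch and not the claimed method. It assumes that novelty is scored as a smoothed Kullback-Leibler divergence between a candidate article's word distribution and the pooled distribution of articles the user has already reviewed. The function names, the smoothing constant `alpha`, and the `threshold` value are hypothetical choices made for this example, and named-entity extraction is omitted for brevity.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count whitespace-separated tokens."""
    return Counter(text.lower().split())


def novelty(candidate, seen, alpha=0.5):
    """Smoothed KL divergence of the candidate's word distribution
    from the pooled distribution of already-reviewed articles.

    `alpha` is an assumed additive-smoothing constant (hypothetical).
    """
    cand = bag_of_words(candidate)
    background = Counter()
    for article in seen:
        background.update(bag_of_words(article))

    vocab = set(cand) | set(background)
    cand_total = sum(cand.values()) + alpha * len(vocab)
    back_total = sum(background.values()) + alpha * len(vocab)

    score = 0.0
    for word in vocab:
        p = (cand[word] + alpha) / cand_total
        q = (background[word] + alpha) / back_total
        score += p * math.log(p / q)
    return score


def filter_stream(articles, threshold=0.2):
    """Walk a time-ordered stream and keep only articles that are
    sufficiently novel relative to those already kept (assumed to be
    the ones the user has reviewed)."""
    kept = []
    for article in articles:
        if not kept or novelty(article, kept) >= threshold:
            kept.append(article)
    return kept
```

For example, passing a stream in which the same story appears twice followed by a story with new details would drop the repeat, because its score against the already-kept copy is zero, and would keep the update if its score reaches the chosen threshold.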