A method of calculating trigram path probabilities for an input string of text
containing a multi-word-entry (MWE) or a factoid includes tokenizing the input
string to create a plurality of parse leaf units (PLUs). A PosColumn is constructed
for each word, MWE, factoid, and character in the input string of text that has
a unique first (Ft) and last (Lt) token pair. TrigramColumns are constructed that
define corresponding TrigramNodes, each representing a trigram of three PosColumns.
Forward and backward trigram path probabilities are calculated for each separate
TrigramNode. The sums of all trigram path probabilities through each PLU are then
calculated as a function of the forward and backward trigram path probabilities.
Systems and computer-readable media configured to implement the method are also provided.
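
The forward and backward calculation summarized above can be illustrated with a minimal Python sketch. It is not the claimed implementation. It assumes that each PosColumn is reduced to its (Ft, Lt) token span, that a TrigramNode is a triple of adjacent spans, and that the path is padded with two start sentinels and one end sentinel. The names `plu_path_sums`, `trigram_prob`, `by_first`, and `by_last` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def plu_path_sums(n_tokens, spans, trigram_prob):
    """Sum trigram path probabilities through each span (illustrative sketch).

    spans: unique (ft, lt) inclusive token-index pairs, one per column.
    trigram_prob(a, b, c): assumed probability of column c following a, b.
    Returns ({span: sum over all paths through span}, total over all paths).
    """
    # Hypothetical sentinel columns padding the start and end of every path.
    start2, start1, end = (-2, -2), (-1, -1), (n_tokens, n_tokens)

    by_first = defaultdict(list)  # first token index -> columns starting there
    by_last = defaultdict(list)   # last token index -> columns ending there
    for col in [start1, *spans, end]:
        by_first[col[0]].append(col)
    for col in [start2, start1, *spans]:
        by_last[col[1]].append(col)

    def successors(col):
        return by_first.get(col[1] + 1, [])

    # A trigram node is three adjacent columns. Sorting by the first column
    # yields a topological order, because a predecessor starts strictly earlier.
    nodes = sorted(
        (a, b, c)
        for a in [start2, start1, *spans]
        for b in successors(a)
        for c in successors(b)
    )

    forward = {}
    for a, b, c in nodes:
        p = trigram_prob(a, b, c)
        if a == start2:
            forward[(a, b, c)] = p
        else:
            preds = by_last.get(a[0] - 1, [])
            forward[(a, b, c)] = p * sum(forward.get((z, a, b), 0.0) for z in preds)

    backward = {}
    for a, b, c in reversed(nodes):
        if c == end:
            backward[(a, b, c)] = 1.0
        else:
            backward[(a, b, c)] = sum(
                trigram_prob(b, c, d) * backward[(b, c, d)] for d in successors(c)
            )

    # Each column on a path is the middle of exactly one trigram node there,
    # so summing forward * backward over those nodes counts each path once.
    through = defaultdict(float)
    for node in nodes:
        through[node[1]] += forward[node] * backward[node]

    total = sum(forward[n] for n in nodes if n[2] == end)
    return {span: through[span] for span in spans}, total


# Three tokens, with a hypothetical MWE spanning tokens 0-1.
spans = [(0, 0), (1, 1), (2, 2), (0, 1)]
sums, total = plu_path_sums(3, spans, lambda a, b, c: 0.5)
```

With the constant trigram probability of 0.5 assumed here, the lattice has two paths: the three single-token columns (probability 0.0625) and the MWE followed by the last token (probability 0.125). The sum through the MWE column is therefore 0.125 out of a total of 0.1875, and dividing each sum by the total gives the share of path probability passing through that column.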