Computer-implemented methods and systems for processing transactions to
determine transaction risk convert high categorical information, such as
text data, to low categorical information, such as category or cluster
IDs. The text data may be merchant names or other textual content of the
transactions, or data related to a consumer or any other type of entity
that engages in the transaction. Content mining techniques perform the
conversion from high to low categorical information.
In operation, the resulting low categorical information is input, along
with other data, into a statistical model, which outputs the level of
risk in the transaction. Methods of converting the high categorical
information to low categorical clusters, methods of using such
information, and other aspects of the use of such clusters are
disclosed.
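
The flow described above can be illustrated with a minimal sketch. It is
not the disclosed method: the token-overlap clustering, the 0.5
similarity threshold, the smoothed per-cluster fraud rate standing in
for the statistical model, and all function names and sample merchant
names are assumptions chosen only for illustration. The "other data"
that would accompany the cluster ID in a real model is omitted.

```python
import re
from collections import defaultdict


def tokens(name):
    """Lowercase a merchant name and keep only its alphabetic words."""
    return set(re.findall(r"[a-z]+", name.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def assign_clusters(names, threshold=0.5):
    """Greedy one-pass clustering (an assumed stand-in for content mining).

    Each name joins the first cluster whose seed name is similar enough;
    otherwise it starts a new cluster. Returns {name: cluster_id}.
    """
    seeds = []  # token set of the first member of each cluster
    ids = {}
    for name in names:
        t = tokens(name)
        for cid, seed in enumerate(seeds):
            if jaccard(t, seed) >= threshold:
                ids[name] = cid
                break
        else:
            seeds.append(t)
            ids[name] = len(seeds) - 1
    return ids


def cluster_risk(history, ids, prior=0.5, strength=2.0):
    """Smoothed fraud rate per cluster (a toy stand-in for the model).

    history is a list of (merchant_name, is_fraud) pairs with is_fraud
    in {0, 1}. The prior pulls sparsely observed clusters toward 0.5.
    """
    counts = defaultdict(lambda: [0, 0])  # cluster_id -> [frauds, total]
    for name, is_fraud in history:
        entry = counts[ids[name]]
        entry[0] += is_fraud
        entry[1] += 1
    return {
        cid: (frauds + prior * strength) / (total + strength)
        for cid, (frauds, total) in counts.items()
    }


# Hypothetical labeled transactions: many distinct strings, few clusters.
history = [
    ("ACME GAS #1021", 0),
    ("Acme Gas 77", 0),
    ("ACME GAS STATION", 0),
    ("QuickCash Online", 1),
    ("QUICKCASH ONLINE LLC", 1),
]
ids = assign_clusters([name for name, _ in history])
risk = cluster_risk(history, ids)
```

Here five distinct merchant strings collapse to two cluster IDs, and the
risk estimate is looked up by cluster ID rather than by raw text, which
is the reduction from high to low categorical information that the
abstract describes.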