A method of training a natural language processing unit applies a candidate learning
set to at least one component of the natural language unit. The natural language
unit is then used to generate a first meaning set from a first corpus. A second meaning
set is generated from a second corpus using a second natural language unit, and
the two meaning sets are compared to form a score for the candidate
learning set. This score is used to determine whether to modify the natural language
unit based on the candidate learning set.
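
The following is a minimal sketch of the training loop described above, not the claimed implementation. Several things in it are assumptions made purely for illustration: each natural language unit is reduced to a word-to-concept lexicon, a "meaning" is the set of concepts recognized in one sentence, the score is the Jaccard overlap of the two meaning sets, and the candidate learning set is kept only if it scores higher than the unmodified unit. The names `generate_meaning_set`, `score_meaning_sets`, and `maybe_apply_candidate` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Assumption: a unit's trainable component is a toy lexicon (word -> concept label).
Lexicon = Dict[str, str]
MeaningSet = Set[FrozenSet[str]]


def generate_meaning_set(lexicon: Lexicon, corpus: List[str]) -> MeaningSet:
    """Toy natural language unit: one meaning (a set of concepts) per sentence."""
    meanings: MeaningSet = set()
    for sentence in corpus:
        words = sentence.lower().split()
        meanings.add(frozenset(lexicon[w] for w in words if w in lexicon))
    return meanings


def score_meaning_sets(first: MeaningSet, second: MeaningSet) -> float:
    """Assumed scoring rule: Jaccard overlap between the two meaning sets."""
    union = first | second
    if not union:
        return 1.0
    return len(first & second) / len(union)


def maybe_apply_candidate(
    lexicon: Lexicon,
    candidate: Lexicon,
    first_corpus: List[str],
    second_meanings: MeaningSet,
) -> Tuple[Lexicon, bool]:
    """Apply the candidate learning set, score it, and keep it only if it helps."""
    baseline = score_meaning_sets(
        generate_meaning_set(lexicon, first_corpus), second_meanings
    )
    trial_lexicon = {**lexicon, **candidate}  # candidate applied to the component
    trial = score_meaning_sets(
        generate_meaning_set(trial_lexicon, first_corpus), second_meanings
    )
    # Assumed decision rule: modify the unit only when the score improves.
    if trial > baseline:
        return trial_lexicon, True
    return lexicon, False


if __name__ == "__main__":
    # Hypothetical data: an English unit under training, a Spanish reference unit.
    english = {"dog": "DOG", "runs": "RUN"}
    spanish = {"perro": "DOG", "corre": "RUN", "gato": "CAT", "duerme": "SLEEP"}
    first_corpus = ["the dog runs", "the cat sleeps"]
    second_corpus = ["el perro corre", "el gato duerme"]

    second_meanings = generate_meaning_set(spanish, second_corpus)
    candidate = {"cat": "CAT", "sleeps": "SLEEP"}
    english, accepted = maybe_apply_candidate(
        english, candidate, first_corpus, second_meanings
    )
    print("candidate accepted:", accepted)
```

In this sketch the second unit is left fixed and acts as a reference. A candidate that makes the first meaning set agree more closely with the second is retained, and any other candidate is discarded, leaving the unit unchanged.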