A method of training a natural language unit applies a
candidate learning set to at least one component of the natural language
unit. The natural language unit is then used to generate a meaning set
from a first corpus. A second meaning set is generated from a second
corpus using a second natural language unit, and the two meaning sets are
compared to each other to form a score for the candidate learning set.
This score is used to determine whether to modify the natural language
unit based on the candidate learning set.
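
The method described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the source: the natural language units are modeled as toy lexicon lookups, a "meaning" is a set of concept labels, the two corpora are assumed to be sentence-aligned, the score is the fraction of aligned sentences whose meanings match exactly, and the unit is modified only if the candidate raises that score above a baseline. The names `build_unit`, `meaning_set`, `score`, and `evaluate` are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence

# Hypothetical types: a unit maps one sentence to a set of concept labels.
Meaning = FrozenSet[str]
Unit = Callable[[str], Meaning]


def build_unit(lexicon: Dict[str, str]) -> Unit:
    """Build a toy natural language unit from a word-to-concept lexicon."""
    def unit(sentence: str) -> Meaning:
        return frozenset(lexicon[w] for w in sentence.split() if w in lexicon)
    return unit


def meaning_set(unit: Unit, corpus: Sequence[str]) -> List[Meaning]:
    """Generate one meaning per sentence in the corpus."""
    return [unit(sentence) for sentence in corpus]


def score(first: Sequence[Meaning], second: Sequence[Meaning]) -> float:
    """Assumed metric: fraction of aligned sentences with identical meanings."""
    if not first or not second:
        return 0.0
    matches = sum(a == b for a, b in zip(first, second))
    return matches / max(len(first), len(second))


def evaluate(lexicon: Dict[str, str], first_corpus: Sequence[str],
             second_unit: Unit, second_corpus: Sequence[str]) -> float:
    """Score a lexicon by comparing its meaning set with the second unit's."""
    first = meaning_set(build_unit(lexicon), first_corpus)
    second = meaning_set(second_unit, second_corpus)
    return score(first, second)


# Toy data (illustrative only): two sentence-aligned corpora.
first_corpus = ["the dog runs", "a cat sleeps"]
second_corpus = ["le chien court", "un chat dort"]

second_unit = build_unit(
    {"chien": "DOG", "court": "RUN", "chat": "CAT", "dort": "SLEEP"}
)
base_lexicon = {"dog": "DOG", "runs": "RUN", "cat": "CAT"}
candidate_learning_set = {"sleeps": "SLEEP"}

# Score the unit without and with the candidate learning set applied.
baseline_score = evaluate(base_lexicon, first_corpus, second_unit, second_corpus)
candidate_score = evaluate({**base_lexicon, **candidate_learning_set},
                           first_corpus, second_unit, second_corpus)

# Assumed decision rule: modify the unit only if the candidate improves the score.
adopt = candidate_score > baseline_score
if adopt:
    base_lexicon.update(candidate_learning_set)
```

In this sketch the candidate is applied to a copy of the lexicon first, so the unit itself is changed only after the score supports the change. A different comparison metric or decision threshold could be substituted without altering the overall flow.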