A speech recognition system comprises exactly two automated speech
recognition (ASR) engines connected to receive the same inputs. Each
engine produces a recognition output (a hypothesis). The system implements
one or both of two methods for combining the outputs of the two engines.
In one method, a confusion matrix statistically generated for each speech
recognition engine is converted into an alternatives matrix in which
every column is ordered from highest to lowest probability. A program loop
is set up in which the recognition outputs of the speech recognition
engines are cross-compared with the alternatives matrices. If the output
from the first ASR engine matches an alternative, its output is adopted
as the final output. If the vectors provided by the alternatives matrices
are exhausted without finding a match, the output from the first speech
recognition engine is adopted as the final output. In a second method,
the confusion matrix for each ASR engine is converted into a Bayesian
probability matrix.
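
The two conversions and the cross-comparison loop described above might be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation. The function names (alternatives_matrix, combine_outputs, bayesian_matrix) and the sample counts are hypothetical. The sketch assumes that each confusion matrix holds counts with true labels on the rows and recognized labels on the columns, that zero-count entries are dropped from the alternatives, that the cross-comparison is symmetric (each engine's output is checked against the alternatives listed for the other engine's output, rank by rank), and that the Bayesian priors default to the label frequencies in the confusion data.

```python
import numpy as np

LABELS = ["a", "b", "c"]  # hypothetical vocabulary


def alternatives_matrix(confusion, labels):
    """Map each recognized label to the true labels it may stand for,
    ordered from most to least frequent (zero-count entries are dropped).

    Assumption: confusion[i][j] counts true label i recognized as label j.
    """
    confusion = np.asarray(confusion, dtype=float)
    alts = {}
    for j, recognized in enumerate(labels):
        order = np.argsort(-confusion[:, j], kind="stable")  # descending counts
        alts[recognized] = [labels[i] for i in order if confusion[i, j] > 0]
    return alts


def combine_outputs(out1, out2, alts1, alts2):
    """Cross-compare the two hypotheses against the alternatives, rank by rank.

    Assumed interpretation: an engine's output is adopted when it appears among
    the alternatives for the other engine's output; engine 1 is the default.
    """
    if out1 == out2:  # assumed shortcut: the engines already agree
        return out1
    a1 = alts1.get(out1, [])  # what engine 1's output may really be
    a2 = alts2.get(out2, [])  # what engine 2's output may really be
    for rank in range(max(len(a1), len(a2))):
        if rank < len(a2) and a2[rank] == out1:
            return out1
        if rank < len(a1) and a1[rank] == out2:
            return out2
    return out1  # alternatives exhausted without a match


def bayesian_matrix(confusion, priors=None):
    """Return P(true label i | recognized label j) via Bayes' rule.

    Assumes every label occurs at least once as a row and as a column;
    priors default to the row frequencies of the confusion counts.
    """
    confusion = np.asarray(confusion, dtype=float)
    likelihood = confusion / confusion.sum(axis=1, keepdims=True)  # P(j | i)
    if priors is None:
        priors = confusion.sum(axis=1) / confusion.sum()
    joint = likelihood * np.asarray(priors, dtype=float)[:, None]
    return joint / joint.sum(axis=0, keepdims=True)  # each column sums to 1


# Hypothetical confusion counts for engine 1 and engine 2.
CONF1 = [[8, 1, 1], [0, 7, 1], [2, 3, 7]]
CONF2 = [[9, 0, 1], [0, 9, 1], [4, 0, 6]]
alts1 = alternatives_matrix(CONF1, LABELS)
alts2 = alternatives_matrix(CONF2, LABELS)
```

With these sample counts, combine_outputs("b", "a", alts1, alts2) adopts engine 2's output because "a" is listed among the alternatives for engine 1's "b", whereas combine_outputs("a", "b", alts1, alts2) exhausts both alternative lists and falls back to engine 1's output.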