Context-free grammars generally comprise a large number of rules, where
each rule defines how a sting of symbols is generated from a different
series of symbols. While techniques for creating finite-state automata
from the rules of context-free grammars exist, these techniques require
an input grammar to be strongly regular. Systems and methods that convert
the rules of a context-free grammar into a strongly regular grammar
include transforming each input rule into a set of output rules that
approximate the input rule. The output rules are all right- or
left-linear and are strongly regular. In various exemplary embodiments,
the output rules are output in a specific format that specifies, for each
rule, the left-hand non-terminal symbol, a single right-hand non-terminal
symbol, and zero, one or more terminal symbols. If the input context-free
grammar rule is weighted, the weight of that rule is distributed and
assigned to the output rules.