Words of an input string are morphologically analyzed to identify their
alternative base forms and parts of speech. The analyzed words of the
input string are used to compile the input string into a first
finite-state network. The first finite-state network is matched with a
second finite-state network of multiword expressions to identify all
subpaths of the first finite-state network that match one or more
complete paths in the second finite-state network. Each matching subpath
of the first finite-state network, together with the path of the second
finite-state network that it matches, identifies a multiword expression
in the input string. The
morphological analysis is performed without disambiguating words and
without segmenting the input string into sentences, so that the first
finite-state network is compiled with at least one path that identifies
alternative base forms or parts of speech of a word in the input string.
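
The matching described above can be illustrated with a minimal sketch.
It is not the claimed implementation: the toy lexicon, the single
multiword expression, and the names LEXICON, MWES, analyze,
compile_input, build_trie, and match are all hypothetical, and a plain
word lattice and a trie stand in for the first and second finite-state
networks. Each word keeps all of its analyses (no disambiguation), and
the whole input is compiled at once (no sentence segmentation).

```python
from typing import Dict, List, Set, Tuple

Analysis = Tuple[str, str]  # (base form, part of speech)

# Hypothetical toy morphological lexicon: surface form -> all analyses.
LEXICON: Dict[str, Set[Analysis]] = {
    "he": {("he", "PRON")},
    "kicked": {("kick", "V")},
    "the": {("the", "DET")},
    "bucket": {("bucket", "N"), ("bucket", "V")},
    "saw": {("see", "V"), ("saw", "N"), ("saw", "V")},
}

# Hypothetical multiword expressions, each a path of (base, POS) labels.
MWES: Dict[str, List[Analysis]] = {
    "kick the bucket": [("kick", "V"), ("the", "DET"), ("bucket", "N")],
}


def analyze(word: str) -> Set[Analysis]:
    """Return every analysis of a word; unknown words get a single UNK."""
    key = word.lower()
    return LEXICON.get(key, {(key, "UNK")})


def compile_input(text: str) -> List[Set[Analysis]]:
    """Stand-in for the first network: one set of alternatives per word."""
    return [analyze(w) for w in text.split()]


def build_trie(mwes: Dict[str, List[Analysis]]) -> dict:
    """Stand-in for the second network: a trie over (base, POS) labels."""
    root: dict = {}
    for name, path in mwes.items():
        node = root
        for label in path:
            node = node.setdefault(label, {})
        node[None] = name  # the None key marks a complete path
    return root


def match(lattice: List[Set[Analysis]], trie: dict) -> List[Tuple[int, int, str]]:
    """Find every subpath of the lattice that spells a complete trie path."""
    hits = []
    for start in range(len(lattice)):
        frontier = [trie]
        for end in range(start, len(lattice)):
            # Follow every alternative analysis of the current word.
            frontier = [n[a] for n in frontier for a in lattice[end] if a in n]
            if not frontier:
                break
            for node in frontier:
                if None in node:
                    hits.append((start, end + 1, node[None]))
    return hits


text = "He kicked the bucket . He saw the bucket"
hits = match(compile_input(text), build_trie(MWES))
print(hits)
```

In this sketch the expression is found across the span of words 1 to 3
("kicked the bucket") even though "bucket" remains ambiguous between
noun and verb, while "saw the bucket" does not match because none of
the analyses of "saw" begins a path in the trie.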