A method and an apparatus accurately discriminate between speech and
voice-band data (VBD) in a communication network by calculating
self-similarity ratio (SSR) values, which indicate periodicity characteristics
of an input signal segment, and/or autocorrelation coefficients, which
indicate spectral characteristics of an input signal segment, to generate
a speech/VBD discrimination result. In one implementation, the speech-VBD
discriminating apparatus calculates both short-term delay and long-term
delay SSR values to analyze the repetition rate of an input signal frame,
thereby indicating whether the input signal frame has the periodicity
characteristics of a typical speech signal or a VBD signal. The
speech-VBD discriminating apparatus further calculates a plurality of
short-term autocorrelation coefficients to determine the spectral
envelope of an input frame, thereby facilitating accurate speech/VBD
discrimination. According to one implementation of the present invention,
the speech-VBD discriminating apparatus relies on sequential decision
logic, which improves classification performance by recognizing that
changes from speech to VBD, or vice versa, in a communication medium are
unlikely. To further improve discrimination accuracy, the apparatus also
discounts discrimination results for relatively low-power signal
portions, which are more susceptible to errors.
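
The following is a minimal illustrative sketch of the three ideas summarized above, not the claimed implementation. The exact SSR formula, frame length, lag ranges, and thresholds are not given in this abstract, so the sketch assumes a stand-in periodicity measure (the peak normalized autocorrelation over a lag range). All function names and the parameters `min_lag`, `max_lag`, `power_floor`, and `hold` are hypothetical.

```python
import math


def autocorr_coeffs(frame, max_lag):
    """Normalized short-term autocorrelation r[k] / r[0] for k = 1..max_lag."""
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return [0.0] * max_lag
    return [
        sum(frame[n] * frame[n - k] for n in range(k, len(frame))) / energy
        for k in range(1, max_lag + 1)
    ]


def self_similarity(frame, min_lag, max_lag):
    """ASSUMED stand-in for an SSR value: the peak normalized autocorrelation
    for lags min_lag..max_lag. A short lag range probes short-term delay
    periodicity; a longer range probes long-term delay periodicity."""
    coeffs = autocorr_coeffs(frame, max_lag)
    return max(coeffs[min_lag - 1:])


def smooth_decisions(raw, powers, power_floor, hold):
    """Sequential decision sketch: per-frame labels in `raw` change the
    running class only after `hold` consecutive opposing frames, and frames
    whose power is below `power_floor` are discounted entirely."""
    state, run, out = None, 0, []
    for label, power in zip(raw, powers):
        if power >= power_floor:
            if state is None:
                state = label          # first reliable frame sets the class
            elif label == state:
                run = 0                # agreement resets the switch counter
            else:
                run += 1
                if run >= hold:        # enough evidence: accept the change
                    state, run = label, 0
        out.append(state)
    return out


# Example: a strongly periodic frame (period of 8 samples, 160 samples long)
periodic = [math.sin(2 * math.pi * n / 8) for n in range(160)]
ssr_like = self_similarity(periodic, min_lag=4, max_lag=16)

# Example: an isolated "vbd" frame and a low-power run do not flip the class
labels = smooth_decisions(
    raw=["speech", "vbd", "speech", "vbd", "vbd", "vbd"],
    powers=[1.0, 1.0, 1.0, 0.01, 0.01, 0.01],
    power_floor=0.1,
    hold=2,
)
```

In this sketch the hold counter reflects the assumption that class changes within a call are unlikely, and the power floor reflects the discounting of low-power portions. A practical discriminator would combine the periodicity and spectral features with thresholds tuned on real speech and VBD signals.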