A start of an input speech signal is detected during presentation of an output
audio signal, and an input start time, relative to the output audio signal, is determined.
The input start time is then provided for use in responding to the input speech
signal. In another embodiment, the output audio signal has a corresponding identification.
When the input speech signal is detected during presentation of the output audio
signal, the identification of the output audio signal is provided for use in responding
to the input speech signal. Information signals comprising data and/or control
signals are then provided in response to at least this contextual information,
i.e., the input start time and/or the identification of the output audio signal.
In this manner, the present invention accurately establishes a context of an input
speech signal relative to an output audio signal regardless of the delay characteristics
of the underlying communication system.
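
The context determination described above can be illustrated with a minimal sketch. It assumes the device presenting the output audio signal records, on its own clock, when presentation began and when the start of the input speech signal was detected. The names OutputAudio, SpeechContext, and speech_context are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OutputAudio:
    """An output audio signal being presented (hypothetical structure)."""
    identification: str    # identification corresponding to the output audio signal
    playback_start: float  # local clock time, in seconds, when presentation began
    duration: float        # length of the output audio signal, in seconds


@dataclass
class SpeechContext:
    """Contextual information provided for use in responding to the input speech."""
    identification: str
    input_start_time: float  # seconds, relative to the start of the output audio


def speech_context(output: OutputAudio,
                   speech_detected_at: float) -> Optional[SpeechContext]:
    """Return the context of an input speech signal, or None if the start of
    the speech was not detected during presentation of the output audio.

    Both timestamps are assumed to come from the same local clock, so the
    offset does not depend on any transmission delay to a remote responder.
    """
    offset = speech_detected_at - output.playback_start
    if 0.0 <= offset <= output.duration:
        return SpeechContext(output.identification, offset)
    return None


# Usage: speech starts 2.5 s into a 5 s prompt identified as "prompt-7".
prompt = OutputAudio(identification="prompt-7", playback_start=100.0, duration=5.0)
context = speech_context(prompt, speech_detected_at=102.5)
```

Because the offset is computed where the audio is presented, a responder receiving the context later can still tell which part of the output audio signal the speech relates to, whatever the delay in between.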