A voice-controlled apparatus is provided that minimises the risk
of activating more than one such apparatus at a time where multiple voice-controlled
apparatus exist in close proximity. To start voice control of the apparatus, a
user must be looking at the apparatus when speaking. Preferably, after the
user stops looking at the apparatus, voice control can continue only
whilst the user keeps speaking without breaks longer than a predetermined duration.
Whether the user is looking at the apparatus can be detected in a
number of ways, including by the use of camera systems, by a head-mounted directional
transmitter, and by detecting the location and facing direction of the user.
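
The activation rule described above can be illustrated with a minimal sketch. It is not taken from the apparatus itself: the class name GazeGatedVoiceControl, the update method, the max_pause_s parameter and its 2-second default are all hypothetical, and the two boolean inputs are assumed to come from whatever gaze detector and speech detector the apparatus uses. Speech is accepted when the user is looking at the apparatus, or when it follows previously accepted speech within the permitted pause.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GazeGatedVoiceControl:
    """Hypothetical gate deciding whether speech is treated as a command."""

    # Assumed value for the "predetermined duration"; the source gives none.
    max_pause_s: float = 2.0
    # Time of the most recent accepted speech, or None if control has lapsed.
    _last_accepted: Optional[float] = field(default=None, init=False, repr=False)

    def update(self, now: float, looking: bool, speaking: bool) -> bool:
        """Return True if speech at time `now` should control the apparatus."""
        if not speaking:
            # Silence is never a command; the pause timer keeps running.
            return False
        if looking:
            # Looking while speaking starts (or continues) voice control.
            accepted = True
        else:
            # Not looking: continue only if the break in speech is short enough.
            accepted = (
                self._last_accepted is not None
                and now - self._last_accepted <= self.max_pause_s
            )
        self._last_accepted = now if accepted else None
        return accepted


# Samples are (time in seconds, looking, speaking).
gate = GazeGatedVoiceControl(max_pause_s=2.0)
samples = [
    (0.0, False, True),  # speaking without looking: ignored
    (1.0, True, True),   # looking while speaking: control starts
    (2.0, False, True),  # looked away, 1.0 s since last speech: continues
    (3.5, False, True),  # 1.5 s break: continues
    (6.0, False, True),  # 2.5 s break: control has lapsed
]
results = [gate.update(t, looking, speaking) for t, looking, speaking in samples]
print(results)  # [False, True, True, True, False]
```

Because a nearby apparatus that the user never looked at has no accepted speech on record, the same utterance leaves it inactive, which is the behaviour the abstract is aiming for.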