A client device comprises a plurality of audio input ports, at least one
audio output port, a MIDI input port, a MIDI output port, a video input
port, a video output port, a processor and a network interface. The
processor generates outgoing IP messages that carry at least one audio
input signal received from the audio input ports, a first MIDI signal
received from the MIDI input port and a first video signal received from
the video input port. From incoming IP messages, the processor extracts a
second MIDI signal for the MIDI output port, a second video signal for
the video output port, and at least one audio output signal for the at
least one audio output port. The network interface sends the outgoing IP
messages within a VoIP call via a network and receives the incoming IP
messages within the VoIP call via the network.
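
The message handling described above can be illustrated with a minimal sketch. The source does not specify a wire format, so the framing used here (a one-byte stream tag, a one-byte port index and a two-byte length in front of each signal chunk) is purely an assumption for illustration, as are the names `pack_message` and `unpack_message`. An actual embodiment would more likely carry each signal in an established media transport within the VoIP call.

```python
import struct

# Hypothetical stream tags; the source does not define a wire format.
AUDIO, MIDI, VIDEO = 1, 2, 3
_HEADER = struct.Struct("!BBH")  # tag, port index, payload length (network byte order)


def pack_message(audio_by_port, midi, video):
    """Build one outgoing message payload from the input-port signals."""
    chunks = []
    # One chunk per audio input port that currently has a signal.
    for port, samples in sorted(audio_by_port.items()):
        chunks.append(_HEADER.pack(AUDIO, port, len(samples)) + samples)
    chunks.append(_HEADER.pack(MIDI, 0, len(midi)) + midi)
    chunks.append(_HEADER.pack(VIDEO, 0, len(video)) + video)
    return b"".join(chunks)


def unpack_message(payload):
    """Split an incoming payload into per-port audio, MIDI and video signals."""
    out = {"audio": {}, "midi": b"", "video": b""}
    offset = 0
    while offset < len(payload):
        tag, port, length = _HEADER.unpack_from(payload, offset)
        offset += _HEADER.size
        body = payload[offset:offset + length]
        offset += length
        if tag == AUDIO:
            out["audio"][port] = body
        elif tag == MIDI:
            out["midi"] = body
        elif tag == VIDEO:
            out["video"] = body
    return out
```

In this sketch the same framing serves both directions: the processor would call `pack_message` on signals read from the input ports and `unpack_message` on payloads handed over by the network interface, routing each extracted signal to the matching output port. Sending and receiving within the VoIP call itself is outside the scope of the example.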