A system for initiating and maintaining a real-time audio or video media
session between two clients, at least one of which has a private network
IP address and is behind a NAT firewall, comprises a relay server and a
proxy server serving each client. The proxy server serving the caller
client may receive an invite message from the caller client to initiate a
media session with a callee client. The invite message identifies the IP
address and media port number of the caller client. The proxy server
queries the relay server to obtain a port number of the relay server that
may be used for relaying the media session between the caller client and
the callee client. The proxy server replaces the IP address and port
number of the caller client in the invite message with the IP address and
port number of the relay server before forwarding the message to the
callee client. When the callee client generates a response message that
includes the IP address and media port number of the callee client, the
proxy server likewise replaces them with the IP address and port number
of the relay server before forwarding the response message to the caller
client.
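
The address substitution described above can be illustrated with a minimal sketch. It assumes the media address and port are carried in an SDP-style body ("c=" and "m=" lines), which the text does not state. The names RelayAllocator and rewrite_media_endpoint, the addresses, and the port numbers are hypothetical, and the sketch uses a single relay port per session for simplicity.

```python
import itertools


class RelayAllocator:
    """Hypothetical stand-in for the proxy's query to the relay server."""

    def __init__(self, ip: str, first_port: int = 40000):
        self.ip = ip
        # Hand out even-numbered ports, starting at first_port (an assumption).
        self._ports = itertools.count(first_port, 2)

    def allocate(self) -> int:
        """Return a relay port reserved for one media session."""
        return next(self._ports)


def rewrite_media_endpoint(body: str, relay_ip: str, relay_port: int) -> str:
    """Replace the client's media address and port with the relay's.

    Assumes an SDP-style body with "c=IN IP4 <addr>" and
    "m=<media> <port> <proto> <fmt>" lines.
    """
    out = []
    for line in body.splitlines():
        if line.startswith("c=IN IP4 "):
            line = f"c=IN IP4 {relay_ip}"
        elif line.startswith("m="):
            parts = line.split(" ")
            if len(parts) >= 2:
                parts[1] = str(relay_port)  # second field is the media port
                line = " ".join(parts)
        out.append(line)
    return "\r\n".join(out) + "\r\n"


# Caller behind NAT advertises its private address in the invite.
relay = RelayAllocator("203.0.113.5")
port = relay.allocate()
invite_body = "v=0\r\nc=IN IP4 192.168.1.10\r\nm=audio 49170 RTP/AVP 0\r\n"
forwarded_invite = rewrite_media_endpoint(invite_body, relay.ip, port)

# The callee's response is rewritten the same way before reaching the caller.
response_body = "v=0\r\nc=IN IP4 10.0.0.7\r\nm=audio 5004 RTP/AVP 0\r\n"
forwarded_response = rewrite_media_endpoint(response_body, relay.ip, port)
```

After both rewrites, each client sees only the relay's address and port, so each sends its media to the relay rather than to the other client's private address.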