A system and method efficiently and anonymously retrieve large-scale Web data
through a restricted query interface. A number of proxy servers are used to
permit parallel access to a target Web server, so that multiple queries are processed simultaneously.
Latency in the individual queries is absorbed by the proxy servers. Queries that
would otherwise appear structured to the target server are assigned to the proxy
servers in a random fashion, obscuring the structured nature of the queries. The
anonymous nature of the queries made by the proxy servers furthermore conceals
the identity of the originating server.
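
The random assignment and parallel dispatch described above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The function names `assign_queries` and `run_queries`, the `fetch` callback, and the thread-pool size are all assumptions introduced here for illustration. The sketch shuffles the query order and picks a proxy at random for each query, so that no single proxy presents a sequential pattern to the target server, and then issues the queries concurrently so that the latency of one query overlaps with the others.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def assign_queries(queries, proxies, rng=None):
    """Shuffle the queries and pair each one with a randomly chosen proxy."""
    rng = rng or random.Random()
    shuffled = list(queries)
    rng.shuffle(shuffled)  # break up the structured (e.g. sequential) order
    return [(rng.choice(proxies), query) for query in shuffled]


def run_queries(queries, proxies, fetch, max_workers=8, rng=None):
    """Issue all queries in parallel, each through its assigned proxy.

    `fetch(proxy, query)` is a caller-supplied function (hypothetical here)
    that sends one query to the target server through the given proxy.
    """
    plan = assign_queries(queries, proxies, rng)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything first so slow queries overlap with fast ones.
        futures = [(query, pool.submit(fetch, proxy, query))
                   for proxy, query in plan]
        return {query: future.result() for query, future in futures}
```

In use, `fetch` would wrap whatever HTTP client routes a request through the named proxy; a stub such as `lambda proxy, query: (proxy, query)` is enough to inspect which proxy each query was assigned to.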