A method and system for generating a search request from a multimodal
query that includes a query image and query text are provided. The
multimodal query system identifies images of a collection that are
textually related to the query image based on similarity between words
associated with each image and the query text. The multimodal query
system then selects, from the identified images, those that are
visually related to the query image. The multimodal query system may
formulate a search request based on keywords of web pages that contain
the selected images and submit that search request to a search engine
service.
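
The three stages described above (textual filtering, visual filtering, keyword-based request formulation) can be illustrated with a minimal sketch. It is not the claimed implementation: the use of cosine similarity for both stages, the threshold values, the record fields `words`, `features`, and `page_keywords`, and the function name `build_search_request` are all assumptions made for illustration, and submission to a search engine service is left out.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def word_vector(text):
    """Bag-of-words vector for a whitespace-separated string."""
    return Counter(text.lower().split())


def feature_vector(features):
    """Turn a list of visual feature values into an index-keyed dict."""
    return dict(enumerate(features))


def build_search_request(query_text, query_features, collection,
                         text_threshold=0.1, visual_threshold=0.9,
                         max_keywords=3):
    """Return a keyword string derived from images related to the query.

    Each item in `collection` is a hypothetical record with:
      "words"         - text associated with the image
      "features"      - list of visual feature values
      "page_keywords" - keywords of the web page containing the image
    The thresholds are assumed values, not taken from the source.
    """
    query_words = word_vector(query_text)
    query_visual = feature_vector(query_features)

    # Stage 1: keep images whose associated words resemble the query text.
    textually_related = [
        img for img in collection
        if cosine(query_words, word_vector(img["words"])) >= text_threshold
    ]

    # Stage 2: of those, keep images visually similar to the query image.
    selected = [
        img for img in textually_related
        if cosine(query_visual, feature_vector(img["features"])) >= visual_threshold
    ]

    # Stage 3: form the request from the most frequent page keywords.
    counts = Counter()
    for img in selected:
        counts.update(img["page_keywords"])
    return " ".join(kw for kw, _ in counts.most_common(max_keywords))
```

Running the textual filter first keeps the visual comparison limited to a smaller candidate set, which mirrors the order given in the description. An empty string is returned when no image passes both filters.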