The present invention relates to systems and methods for determining a
content item relevance function. The method comprises collecting user
preference data at a search provider for storage in a user preference
data store and collecting expert-judgment data at the search provider for
storage in an expert sample data store. A modeling module trains a base
model using the expert-judgment data and tunes the base model using the
user preference data to learn a set of one
or more tuned models. A measure (the B measure) is designed to evaluate
the balanced performance of each tuned model over expert judgment and
user preference. The modeling module generates or selects the content
item relevance function from the tuned models, using the B measure as
the selection criterion.
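
The selection step described above can be illustrated with a minimal sketch. The abstract does not give a formula for the B measure, so the sketch assumes one for illustration only: a weighted harmonic mean of a tuned model's score on the expert-judgment data and its score on the user preference data. The names TunedModel, b_measure, select_relevance_function, and the beta weight are hypothetical and do not come from the source.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TunedModel:
    """Hypothetical record for one tuned model and its two evaluation scores."""
    name: str
    expert_score: float      # performance on expert-judgment data, in [0, 1]
    preference_score: float  # performance on user preference data, in [0, 1]


def b_measure(expert_score: float, preference_score: float, beta: float = 1.0) -> float:
    """ASSUMED form of the B measure: a weighted harmonic mean of the two scores.

    The source only says the measure balances both criteria; this formula is
    an illustrative stand-in. beta > 1 gives more weight to user preference.
    """
    if expert_score <= 0.0 or preference_score <= 0.0:
        return 0.0  # a model that fails either criterion is not balanced
    b2 = beta * beta
    return (1.0 + b2) * expert_score * preference_score / (b2 * expert_score + preference_score)


def select_relevance_function(models: Sequence[TunedModel], beta: float = 1.0) -> TunedModel:
    """Return the tuned model with the highest (assumed) B measure."""
    if not models:
        raise ValueError("at least one tuned model is required")
    return max(models, key=lambda m: b_measure(m.expert_score, m.preference_score, beta))
```

Under this assumed form, a tuned model that scores moderately well on both data sets is preferred over one that scores highly on one and poorly on the other, which is the balancing behavior the B measure is described as rewarding.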