The query rewriting service generates a list of paraphrases (i.e., alternate, nearly equivalent wordings) for an Internet search query or a keyword expression. The service uses Sense Analysis to obtain the senses of the query, then uses the semantic relationships in the Language Graph to apply one or more transformations to the senses found.
Several levels of control over the generated rewrites are available:
- rewriting recipe selection
- transformation selection
- part of speech controls
- output filter selection
- weight filtering
This document describes these controls.
The first level of control available is the recipe selection.
The recipe specifies an overall target for the rewrite operation. It defines the transformation rules used for each transformation type. It also influences the weight returned for each paraphrase, because the weight takes the goal of the recipe into account.
The recipe is specified with the API parameter paraphrasingRecipe. The following table shows the possible recipe values and their associated goals.
|paidListings||Generates equivalent keyword paraphrases, including some for a broader context. Uses broad generalization, rewording, and specialization. Appropriate for advertising applications.|
|productSearch||Generates equivalent keyword paraphrases targeted for searching a keyword index of products. Restricted to conservative generalization and rewording on product names and artefacts, and broader simplification and/or generalization on the rest of the query.|
|search||Generates equivalent keyword paraphrases targeted for searching a keyword index. Restricted to conservative generalization, rewording, and specialization.|
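As a minimal sketch of how a client might select a recipe, the snippet below builds a URL-encoded parameter string. Only the parameter name paraphrasingRecipe and the three recipe values come from this document; the "query" parameter name and the helper function are assumptions made for illustration, and no endpoint is implied.

```python
from urllib.parse import urlencode

# Recipe values listed in the table above.
RECIPES = {"paidListings", "productSearch", "search"}


def build_rewrite_params(query: str, recipe: str) -> str:
    """Return a URL-encoded parameter string for a rewrite request.

    "query" is a hypothetical parameter name; "paraphrasingRecipe" is the
    parameter named in this document.
    """
    if recipe not in RECIPES:
        raise ValueError(f"unknown recipe: {recipe!r}")
    return urlencode({"query": query, "paraphrasingRecipe": recipe})


print(build_rewrite_params("cheap flights", "paidListings"))
# query=cheap+flights&paraphrasingRecipe=paidListings
```

Validating the recipe on the client side catches typos before a request is sent.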
The rewrites are generated using several transformation rules grouped into five types, which are explained in the following table. The actual rules enabled for each type depend on the recipe specified. Different recipes will therefore yield different rewrites for the same query even when the same transformations are enabled.
|synonymy||Replaces a sense with a synonym (different word, same sense).||New York -> Big Apple|
|generalization||Produces a broader rewrite by:
|specialization||Replaces a sense with a child sense.||
|association||Replaces a sense with another closely related sense in the Language Graph.||man psychology -> man psychologist|
||beach patrolling -> patrolling on beaches|
By default, all transformation types are used. The API parameter transformations can be used to explicitly enable specific transformations or to turn some off.
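The enable/disable behaviour can be pictured with a small client-side sketch. This document does not specify how the transformations parameter encodes its value, so the function below only models the selection logic; its name and arguments are hypothetical, and it covers only the four type names that appear in the table above.

```python
# Type names that appear in the table above. The text mentions five types;
# only the four that are named are modelled here.
KNOWN_TYPES = {"synonymy", "generalization", "specialization", "association"}


def select_transformations(enabled=None, disabled=()):
    """Return the set of active transformation types.

    enabled=None mirrors the documented default: all types are used.
    """
    active = set(KNOWN_TYPES) if enabled is None else set(enabled)
    unknown = (active | set(disabled)) - KNOWN_TYPES
    if unknown:
        raise ValueError(f"unknown transformation types: {sorted(unknown)}")
    return active - set(disabled)


# Default: everything is on.
print(sorted(select_transformations()))
# Turn one type off while keeping the others.
print(sorted(select_transformations(disabled=["association"])))
```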
Part of Speech Controls
The part of speech controls enable an application to specify how individual parts of speech (nouns, verbs, adjectives, adverbs) should be transformed. The following table describes how each can be controlled.
|remove||Words with that part of speech are always removed from any generated rewrites.|
|freeze||Words with that part of speech are never altered.|
|paraphrase||Words with that part of speech are subject to all transformations. This is the default.|
The controls can be provided using the API parameters adjectives, adverbs, nouns, verbs, and superfluousAdjectives. This last class consists of adjectives with little or no semantic content in search queries (e.g., “best”, “wonderful”, “great”).
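The effect of the three control values can be sketched on a hand-tagged query. The part-of-speech tags below are assigned manually for illustration; how the service actually tags words is not described here, and the function is a hypothetical model of the remove/freeze/paraphrase semantics rather than the service's implementation.

```python
ACTIONS = {"remove", "freeze", "paraphrase"}


def apply_pos_controls(tagged_words, controls):
    """Return (kept_words, paraphrasable_words) for a tagged query.

    tagged_words: list of (word, part_of_speech_class) pairs.
    controls: maps a class name (e.g. "nouns") to one of ACTIONS.
    Classes without an entry fall back to "paraphrase", the documented default.
    """
    kept, paraphrasable = [], []
    for word, pos in tagged_words:
        action = controls.get(pos, "paraphrase")
        if action not in ACTIONS:
            raise ValueError(f"unknown control value: {action!r}")
        if action == "remove":
            continue  # the word never appears in a rewrite
        kept.append(word)
        if action == "paraphrase":
            paraphrasable.append(word)  # frozen words stay in, unaltered
    return kept, paraphrasable


tagged = [
    ("best", "superfluousAdjectives"),
    ("cheap", "adjectives"),
    ("hotels", "nouns"),
]
controls = {"superfluousAdjectives": "remove", "nouns": "freeze"}
print(apply_pos_controls(tagged, controls))
# (['cheap', 'hotels'], ['cheap'])
```

Here “best” is dropped, “hotels” is kept verbatim, and only “cheap” remains open to transformation.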
Generated rewrites are normally subject to a likelihood filter that rejects transformations that are semantically correct but unlikely to be found in a document, e.g., “free dating web sites -> no cost dating web sites”.
This filter can be controlled using the API parameter filters. The filter is normally enabled.
Weight and Number of Rewrites
Each rewrite has a weight attribute. This weight is an empirical value based on the accuracy of the transformation that generated the paraphrase and on the goal of the rewriting recipe. A weight of 0.60 indicates that the rewrite should be productive 60% of the time for the intent of the recipe.
The rewrite logic generates paraphrases starting with the most accurate rules and finishing with the least productive ones. The API parameter minWeight indicates the minimum desired weight for a returned rewrite.
The API parameter maxCount limits the number of rewrites generated. This is useful for keeping the number of rewrites, and therefore the cost, under control. (Each rewrite is a charge unit in the query rewriting accounting. See Pricing.)
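The combined effect of minWeight and maxCount can be sketched as a simple selection over weighted candidates. The rewrites and weights below are made up for illustration, the function is hypothetical, and treating the minimum weight as inclusive is an assumption.

```python
def select_rewrites(candidates, min_weight=0.0, max_count=None):
    """Return rewrites ordered by descending weight, filtered and capped.

    candidates: list of (rewrite_text, weight) pairs.
    min_weight: lowest weight to keep (assumed inclusive).
    max_count: maximum number of rewrites to return, or None for no cap.
    """
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    kept = [pair for pair in ranked if pair[1] >= min_weight]
    return kept if max_count is None else kept[:max_count]


# Illustrative data only; not actual service output.
candidates = [
    ("new york inns", 0.55),
    ("big apple hotels", 0.90),
    ("ny accommodation", 0.40),
    ("new york lodging", 0.75),
]
print(select_rewrites(candidates, min_weight=0.5, max_count=2))
# [('big apple hotels', 0.9), ('new york lodging', 0.75)]
```

Because the strongest rewrites come first, lowering max_count trims the weakest candidates first.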
You can quickly experiment with these settings using the Query Rewriting demo. The output indicates the transformations that combined to yield each rewrite. The API Console is also a useful tool for observing the actual output of the API firsthand.