Using Query Rewriting in Social Media Search
Integrating results from a user’s social network into the results of any search engine is a great idea, but it often fails because the user’s social index is so small. If you query your social index for great places to go for “deep dish pizza”, and there’s a comment from a friend on a “fantastic new pizza parlor”, your query won’t match the comment. But it should, and now it can.
- Idilia’s Social Search API improves recall from any keyword index of social content
- Queries are automatically and instantly rewritten into semantically equivalent queries
- Idilia’s software understands the precise meaning of each keyword in the original query and rewrites the query using synonyms, hypernyms, hyponyms, and transformations
- Users don’t need to spend time reformulating queries to find what they’re looking for
Let’s try searching for that pizza recommendation with the keywords “deep”, “dish”, and “pizza”.
Idilia’s Social Search API determines the exact meaning of the query keywords and rewrites the query into semantically equivalent queries, or topical queries that cover the original query. So, in this case, Idilia’s software understands that you’re looking for a certain type of pizza and generates the appropriate rewrites.
|Weight||Rewritten query|
|1.00||deep dish pizza|
|0.98||deep dish pizza pie|
|0.94||Chicago pizza pie|
|0.75||deep dish pizza pizzeria|
|0.50||deep-dish pizza pie|
The resulting rewrites can be submitted to the search engine as separate queries, or they can be combined using Boolean logic into a single query such as: (“deep dish” or Chicago or “Chicago style”) + (pizza or pizzeria or “pizza parlor” or “pizza joint”) or (pizzeria or “pizza parlor” or “pizza joint”). In this case, the query matches a social index comment recommending a great new pizza parlor.
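As a rough illustration of the Boolean combination described above, the sketch below assembles the example query from groups of alternative terms. It is not Idilia's code: the helper names (`quote`, `or_group`, `and_groups`) are hypothetical, the `AND`/`OR` spelling is an assumption about the target search engine's syntax, and reading the “+” in the example as AND is also an assumption.

```python
def quote(term):
    """Wrap multi-word terms in double quotes so they match as phrases."""
    return f'"{term}"' if " " in term else term


def or_group(terms):
    """Join alternatives with OR, dropping duplicates but keeping order."""
    unique = list(dict.fromkeys(terms))
    return "(" + " OR ".join(quote(t) for t in unique) + ")"


def and_groups(groups):
    """Require one match from every group (assumed AND semantics)."""
    return " AND ".join(or_group(g) for g in groups)


# Alternatives taken from the example query in the text.
style = ["deep dish", "Chicago", "Chicago style"]
dish = ["pizza", "pizzeria", "pizza parlor", "pizza joint"]
venue = ["pizzeria", "pizza parlor", "pizza joint"]

# (style AND dish) OR venue, mirroring the shape of the example query.
full_query = f"{and_groups([style, dish])} OR {or_group(venue)}"
print(full_query)
```

The final `OR` branch is what lets a comment that mentions only a “pizza parlor” match, even though it never says “deep dish”.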
How it Works
- The Social Search API routes a query to Idilia’s Sense Analysis software
- A specially trained recipe determines the precise meaning of each keyword in the query
- Then, the sense-annotated query is routed to Idilia’s Paraphrasing software where the query is rewritten into several semantically equivalent queries
- The number of rewrites is configurable using several parameters including a maximum number, weighting for proximity to the original query, confidence of the sense analysis, and variations on the paraphrasing recipe (selecting whether to rewrite adjectives, verbs, nouns, etc.)
- Finally, the rewritten queries are returned with proximity weighting, and the original query is returned with sense-annotation, including confidence scores
- Depending on your search engine, you can process the rewrites individually or combine the unique keyword rewrites using Boolean logic into a single query
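The steps above mention limiting rewrites by a maximum number and by proximity to the original query. A minimal client-side sketch of that kind of selection follows, using the weights from the “deep dish pizza” example. The `Rewrite` type, the `select_rewrites` function, and its parameter names are hypothetical and are not the actual Idilia API parameters.

```python
from typing import List, NamedTuple


class Rewrite(NamedTuple):
    weight: float  # proximity to the original query (1.00 = identical)
    text: str


def select_rewrites(rewrites: List[Rewrite],
                    max_rewrites: int = 5,
                    min_weight: float = 0.0) -> List[Rewrite]:
    """Keep the closest rewrites: filter by weight, then cap the count."""
    kept = [r for r in rewrites if r.weight >= min_weight]
    kept.sort(key=lambda r: r.weight, reverse=True)
    return kept[:max_rewrites]


# Weighted rewrites copied from the example table in the text.
SAMPLE = [
    Rewrite(1.00, "deep dish pizza"),
    Rewrite(0.98, "deep dish pizza pie"),
    Rewrite(0.94, "Chicago pizza pie"),
    Rewrite(0.75, "deep dish pizza pizzeria"),
    Rewrite(0.50, "deep-dish pizza pie"),
]

for rewrite in select_rewrites(SAMPLE, max_rewrites=3, min_weight=0.7):
    print(f"{rewrite.weight:.2f}  {rewrite.text}")
```

A higher `min_weight` favors precision by keeping only rewrites close to the original query, while a larger `max_rewrites` favors recall.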
Four Ways To Deploy Query Rewriting
- Cached Queries – If you maintain a cache of queries and use auto-complete to suggest queries, then the entire cache can be sense-annotated, each query rewritten, and each set of rewrites turned into a Boolean query, all off-line
- Real-Time – Individual user queries can be routed to the Idilia API in real-time, or rules relating to the number of results returned by the original query can determine whether to route the query for semantic processing
- User Control – The interface can allow the user to specify precise senses for keywords, helping improve precision when the original query returns too many incorrect results
- Sense-Annotate the Index – The entire index of social comments can be sense-annotated by Idilia, allowing queries and rewritten queries to be sense-matched to the index, yielding simultaneous improvements in precision and recall
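The real-time option describes a rule based on how many results the original query returns. One way such a rule might look is sketched below. The `search` and `rewrite` callables are placeholders that you would supply (for example, your own index lookup and a call to the rewriting service), and the function name, the `min_results` threshold, and the toy data are all assumptions made for illustration.

```python
from typing import Callable, List


def search_with_fallback(query: str,
                         search: Callable[[str], List[str]],
                         rewrite: Callable[[str], List[str]],
                         min_results: int = 3) -> List[str]:
    """Run the original query; rewrite only if it returns too few results."""
    results = search(query)
    if len(results) >= min_results:
        return results  # enough hits, skip semantic processing

    seen = set(results)
    merged = list(results)
    for alternative in rewrite(query):
        for hit in search(alternative):
            if hit not in seen:  # avoid duplicate comments
                seen.add(hit)
                merged.append(hit)
    return merged


# Toy stand-ins for a social index and a rewriting service.
COMMENTS = ["fantastic new pizza parlor downtown", "best tacos in town"]


def toy_search(q: str) -> List[str]:
    words = q.lower().split()
    return [c for c in COMMENTS if all(w in c.split() for w in words)]


def toy_rewrite(q: str) -> List[str]:
    return ["pizza parlor", "pizzeria"] if q == "deep dish pizza" else []


print(search_with_fallback("deep dish pizza", toy_search, toy_rewrite, min_results=1))
```

With this rule, queries that already return enough results never incur the cost of a rewriting call.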
Customize Your Rewrites
Some keywords in a query are more important than others. Nouns often matter more than adjectives, especially superfluous adjectives like “nice”.
The Idilia Query Rewrite API allows you to control how queries are rewritten by selecting a part of speech (e.g., a verb) and specifying how that part of speech will be managed by Idilia’s paraphrasing software.
Consider the query made up of the keywords “fantastic”, “deep”, “dish” and “pizza”:
Default Scenario – Rewrite all POS
|Weight||Rewritten query|
|1.00||fantastic deep dish pizza|
|0.94||fantastic Chicago-style pizza|
|0.94||fantastic stuffed pizza|
|0.94||fantastic Chicago pizza|
|0.94||fantastic Chicago pizza pie|
|0.94||fantastic deep-dish pizza pie|
|0.71||great deep dish pizza|
|0.71||wonderful deep dish pizza|
|0.71||tremendous deep dish pizza|
|0.71||terrific deep dish pizza|
In the example above, the customized scenario yields a better set of rewrites that will match many more comments in a social index by virtue of relaxing the requirement to keyword match superfluous adjectives such as “fantastic”, “great”, “wonderful”, “tremendous”, and “terrific”.