Weighting Query Terms Based on Distributional Statistics

Title	Weighting Query Terms Based on Distributional Statistics
Publication Type	Book Chapter
Year of Publication	2006
Authors	Karlgren J, Sahlgren M, Cöster R
Editor	Peters C, Gey FC, Gonzalo J, Müller H, Jones G, Kluck M, Magnini B, de Rijke M
Book Title	Accessing Multilingual Information Repositories
Series Title	Lecture Notes in Computer Science
Volume	4022
Pagination	208-211
Publisher	Springer
City	Berlin / Heidelberg
ISBN	978-3-540-45697-1
Abstract	This year, the SICS team has concentrated on query processing and on the internal topical structure of the query, specifically compound translation. Compound translation is non-trivial due to dependencies between compound elements. This year, we have investigated topical dependencies between query terms: if a query term happens to be non-topical or noise, it should be discarded or given a low weight when ranking retrieved documents; if a query term shows high topicality its weight should be boosted. The two experiments described here are based on the analysis of the distributional character of query terms: one using similarity of occurrence context between query terms globally across the entire collection; the other using the likelihood of individual terms to appear topically in individual texts. Both – complementary – boosting schemes tested delivered improved results.
DOI	10.1007/11878773_24
