Title | Weighting Query Terms Based on Distributional Statistics |
Publication Type | Book Chapter |
Year of Publication | 2006 |
Authors | Karlgren J, Sahlgren M, Cöster R |
Editor | Peters C, Gey FC, Gonzalo J, Müller H, Jones G, Kluck M, Magnini B, de Rijke M |
Book Title | Accessing Multilingual Information Repositories |
Series Title | Lecture Notes in Computer Science |
Volume | 4022 |
Pagination | 208-211 |
Publisher | Springer |
City | Berlin / Heidelberg |
ISBN | 978-3-540-45697-1 |
Abstract | This year, the SICS team concentrated on query processing and on the internal topical structure of the query, specifically compound translation. Compound translation is non-trivial because of dependencies between compound elements. We investigated topical dependencies between query terms: if a query term is non-topical or noise, it should be discarded or given a low weight when ranking retrieved documents; if a query term shows high topicality, its weight should be boosted. The two experiments described here are based on an analysis of the distributional character of query terms: one uses the similarity of occurrence contexts between query terms globally across the entire collection; the other uses the likelihood that an individual term appears topically in individual texts. Both boosting schemes, which are complementary, delivered improved results. |
DOI | 10.1007/11878773_24 |