Vector-based semantic analysis using random indexing and morphological analysis for cross-lingual information retrieval

Title	Vector-based semantic analysis using random indexing and morphological analysis for cross-lingual information retrieval
Publication Type	Book Chapter
Year of Publication	2002
Authors	Karlgren J, Sahlgren M
Book Title	Revised Papers from the Second Workshop of the Cross-Language Evaluation Forum on Evaluation of Cross-Language Information Retrieval Systems, Darmstadt, Germany, September 3 - 4
Series Title	Lecture Notes in Computer Science
Pagination	169-176
Publisher	Springer
ISBN	3-540-44042-9
Keywords	information retrieval
Abstract	Meaning, the main object of study in information access, is most decidedly situation-dependent. While much of meaning appears to achieve consistency across usage situations -- a term will seem to mean much the same thing in many of its contexts -- most everything can be negotiated on the go. Human processing appears to be flexible in this respect, and oriented towards learning from prototypes rather than learning by definition: learning new words, and adding new meanings or shades of meaning to an existing word does not need a formal re-training process. We have built a query expansion and translation tool for information retrieval systems. When used in one single language it will expand the terms of a query using a thesaurus built for that purpose; when used across languages it will provide numerous translations and near translations for the source language terms. The underlying technology we are testing is that of vector-based semantic analysis, an analysis method related to latent semantic indexing based on stochastic pattern computing. This paper will briefly describe how we acquired training data, aligned it, analyzed it using morphological analysis tools, and finally built a thesaurus using the data, but will concentrate on an overview of vector-based semantic analysis and how stochastic pattern computing differs from latent semantic indexing in its current form.
URL	http://www.ercim.eu/publication/ws-proceedings/CLEF2/karlgren.pdf
