Evaluation of Linguistic Features for Word Sense Disambiguation with Self-Organized Document Maps

Title: Evaluation of Linguistic Features for Word Sense Disambiguation with Self-Organized Document Maps
Publication Type: Journal Article
Year of Publication: 2004
Authors: Lindén, K
Journal: Computers and the Humanities
Volume: 38
Pagination: 417-435
ISSN: 0010-4817
Abstract

Word sense disambiguation automatically determines the appropriate senses of a word in context. We have previously shown that self-organized document maps have properties similar to a large-scale semantic structure that is useful for word sense disambiguation. This work evaluates the impact of different linguistic features on self-organized document maps for word sense disambiguation. The features evaluated are qualitative features, e.g. part-of-speech and syntactic labels, and quantitative features, e.g. cut-off levels for word frequency. It is shown that linguistic features help make contextual information explicit. If the training corpus is large, even contextually weak features, such as base forms, will act in concert to produce sense distinctions in a statistically significant way. However, the most important features are syntactic dependency relations and base forms annotated with part-of-speech or syntactic labels. We achieve 62.9% ± 0.73% correct results on the fine-grained lexical task of the English SENSEVAL-2 data. On the 96.7% of the test cases that need no back-off to the most frequent sense, we achieve 65.7% correct results.

URL: http://www.ling.helsinki.fi/~klinden/PhD/linden04a.pdf
DOI: 10.1007/s10579-004-1948-9