A vector space analysis of Swedish patent claims with different linguistic indices

Title	A vector space analysis of Swedish patent claims with different linguistic indices
Publication Type	Conference Paper
Year of Publication	2010
Authors	Andersson L
Conference Name	3rd International Workshop on Patent Information Retrieval
Publisher	ACM
Conference Location	Toronto, ON, Canada
ISBN Number	978-1-4503-0384-2
Abstract	The purpose of this study was twofold: first, to examine whether a general automatic retrieval model, the Vector Space Model (VSM), can be used to discover similarities between Swedish patent claims; and second, to examine whether an additional morphological decompounding module at the pre-processing level improves the results. In the present study, three different topic sets consisting of patent claims were compared against an entire collection of 30,117 claims. The VSM was evaluated with and without additional morphological decompounding modules. The results indicate that decompounding has a positive influence on the performance of the retrieval model. However, the sublanguage of patent claims and the errors made during the Optical Character Recognition (OCR) process harmed the overall performance of both the Natural Language Processing (NLP) applications and the retrieval model.
DOI	10.1145/1871888.1871898
