Title | Evaluating DBMS-based Access Strategies to Very Large Multi-layer Corpora |
Publication Type | Conference Paper |
Year of Publication | 2012 |
Authors | Schneider R |
Conference Name | LREC 2012 Workshop: Challenges in the management of large corpora |
Publisher | European Language Resources Association (ELRA) |
Conference Location | Istanbul, Turkey |
Abstract | Linguistic query systems are special-purpose IR applications. As text sizes, annotation layers, and metadata schemes of language corpora grow rapidly, performing complex searches becomes a computationally very expensive task. We evaluate several storage models and indexing variants in two multi-processor/multi-core environments, focusing on prototypical linguistic querying scenarios. Our aim is to reveal modeling and querying tendencies – rather than absolute benchmark results – when using a relational database management system (RDBMS) and MapReduce for natural language corpus retrieval. Based on these findings, we are going to improve our approach for the efficient exploitation of very large corpora, combining advantages of state-of-the-art |
URL | http://www.ids-mannheim.de/gra/texte/LREC2012_final.pdf |