Title | Building Support Tools for Russian-Language Information Extraction |
Publication Type | Book Chapter |
Year of Publication | 2011 |
Authors | Du M, von Etter P, Kopotev M, Novikov M, Tarbeeva N, Yangarber R |
Editor | Habernal I, Matoušek V |
Book Title | Text, Speech and Dialogue |
Series Title | Lecture Notes in Computer Science |
Volume | 6836 |
Pagination | 380-387 |
Publisher | Springer |
City | Berlin / Heidelberg |
ISBN Number | 978-3-642-23537-5 |
Abstract | There is currently a paucity of publicly available NLP tools to support analysis of Russian-language text. This especially concerns higher-level applications, such as Information Extraction. We present work on tools for information extraction from text in Russian in the domain of on-line news. On the lower level we employ the AOT toolkit for natural language processing, which provides modules for morphological analysis and partial syntactic chunking. Since the outputs of both lower-level modules contain unresolved ambiguity, we synthesize the outputs and pass the result into a pre-existing English-language analysis pipeline. We describe how the information extraction system is adapted for multi-lingual support, including extensions to the ontologies and to the pattern matching mechanism. While this is work in progress, we present an end-to-end pipeline for event extraction from Russian-language news. |
DOI | 10.1007/978-3-642-23538-2_48 |