|Title|Elliphant: Improved Automatic Detection of Zero Subjects and Impersonal Constructions in Spanish
|Year of Publication|2012
|Authors|Rello L, Baeza-Yates R, Mitkov R
|Conference Name|13th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2012)
|Publisher|Association for Computational Linguistics
In pro-drop languages, the detection of explicit subjects, zero subjects and non-referential impersonal constructions is crucial for anaphora and co-reference resolution. While the identification of explicit and zero subjects has attracted the attention of researchers in the past, the automatic identification of impersonal constructions in Spanish has not been addressed yet and this work is the first such study. In this paper we present a corpus to underpin research on the automatic detection of these linguistic phenomena in Spanish and a novel machine learning-based methodology for their computational treatment. This study also provides an analysis of the features, discusses performance across two different genres and offers error analysis.