|Title||Towards a more consistent and comprehensive evaluation of anaphora resolution algorithms and systems|
|Publication Type||Journal Article|
|Year of Publication||2001|
|Journal||Applied Artificial Intelligence|
The article argues that recall and precision are imperfect measures for evaluating robust anaphora resolution algorithms and proposes instead separate success rates for anaphora resolution algorithms and for anaphora resolution systems. It also proposes a package of evaluation measures and tasks for anaphora resolution. These newly added tasks, which have been carried out on Mitkov's (1998) knowledge-poor approach, are believed to provide a better, more comprehensive picture of the performance of anaphora resolution algorithms and systems. Finally, ongoing work on the development of a consistent evaluation environment for anaphora resolution is outlined.