Framework and Resources for Natural Language Parser Evaluation

Publication Type: Thesis
Year of Publication: 2007
Authors: Kakkonen, T
Academic Department: Department of Computer Science and Statistics
Degree: Doctor of Philosophy
Date Published: 12/2007
University: University of Joensuu
ISBN Number: 978-952-219-058-1 (paperback), 978-952-219-059-8 (PDF)

Because of the wide variety of contemporary practices used in the automatic syntactic parsing of natural languages, it has become necessary to analyze and evaluate the strengths and weaknesses of different approaches. This research is all the more necessary because there are currently no genre- and domain-independent parsers that are able to analyze unrestricted text with 100% preciseness (I use this term to refer to the correctness of analyses assigned by a parser). All these factors create a need for methods and resources that can be used to evaluate and compare parsing systems. This thesis presents: (1) a theoretical analysis of current achievements in parsing and parser evaluation; (2) a framework (called FEPa) that can be used to carry out practical parser evaluations and comparisons; (3) a set of new evaluation resources: FiEval, a Finnish treebank under construction, and MGTS and RobSet, parser evaluation resources for English; and (4) the results of experiments in which the developed evaluation framework and the two resources for English were used to evaluate a set of selected parsers.