|Title||Using Dependency-Based Features to Take the 'Para-farce' out of Paraphrase|
|Publication Type||Conference Paper|
|Year of Publication||2006|
|Authors||Wan S, Dras M, Dale R, Paris C|
|Conference Name||Proceedings of the Australasian Language Technology Workshop 2006|
|Conference Location||Sydney, Australia|
As research in text-to-text paraphrase generation progresses, it has the potential to improve the quality of generated text. However, the use of paraphrase generation methods creates a secondary problem: we must ensure that generated novel sentences are not inconsistent with the text from which they were generated. We propose using a machine learning approach to filter out inconsistent novel sentences, or False Paraphrases. To train such a filter, we use the Microsoft Research Paraphrase corpus and investigate whether features based on syntactic dependencies can aid us in this task. Like Finch et al. (2005), we obtain a classification accuracy of 75.6%, the best known performance for this corpus. We also examine the strengths and weaknesses of dependency-based features and conclude that they may be useful in more accurately classifying cases of False Paraphrase.
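To make the idea of a dependency-based feature concrete, the following is a minimal sketch of one plausible feature family: the overlap between the dependency relations of a candidate sentence and those of its source. The abstract does not specify the features actually used, so the `(head, dependent)` pair representation, the function name `dependency_overlap`, and the feature names are all assumptions made for illustration. The dependency pairs are written by hand here; in practice they would come from a parser.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical representation: a dependency relation as a (head, dependent) pair.
Dep = Tuple[str, str]


def dependency_overlap(candidate: Iterable[Dep], source: Iterable[Dep]) -> Dict[str, float]:
    """Return precision, recall and F1 of the dependency pairs shared by two sentences."""
    cand: Set[Dep] = set(candidate)
    src: Set[Dep] = set(source)
    shared = cand & src
    # Guard against empty sentences to avoid division by zero.
    precision = len(shared) / len(cand) if cand else 0.0
    recall = len(shared) / len(src) if src else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"dep_precision": precision, "dep_recall": recall, "dep_f1": f1}


# Hand-written dependency pairs (illustrative only, not taken from the corpus).
source_deps = [("acquire", "company"), ("acquire", "rival"), ("rival", "small")]
candidate_deps = [("acquire", "company"), ("acquire", "rival"), ("rival", "large")]

features = dependency_overlap(candidate_deps, source_deps)
print(features)
```

Scores like these could serve as inputs to any standard classifier trained on sentence pairs labelled as true or false paraphrases. A low overlap, such as a changed modifier in the example above, is the kind of signal that might help flag a False Paraphrase.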