Title | Identification of Similar Documents Using Coherent Chunks |
Publication Type | Book Chapter |
Year of Publication | 2009 |
Authors | Lalitha Devi S, Kuppan S, Venkataswamy K, Rao PRK |
Editor | Lalitha Devi S, Branco A, Mitkov R |
Book Title | Anaphora Processing and Applications |
Series Title | Lecture Notes in Computer Science |
Volume | 5847 |
Pagination | 54-68 |
Publisher | Springer |
City | Berlin / Heidelberg |
ISBN Number | 978-3-642-04974-3 |
Abstract | We focus on automatically finding similar documents using coherent chunks. The similarity between the documents is determined by identifying the coherent chunks present in them. We apply linguistic rules in identifying the coherent chunks and use the Vector Space Model (VSM) in determining the similarity among documents. We have taken patent documents from USPTO for this work. This method of using coherent chunks for identifying similar documents has shown encouraging results. |
DOI | 10.1007/978-3-642-04975-0_5 |