Plural: corpora. A corpus is a collection of text documents. The collection is often annotated with linguistic information, such as part-of-speech tags, to make it better suited to linguistic research.
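
As a rough illustration of what an annotated corpus can look like in practice, the sketch below represents a toy corpus as a mapping from document names to lists of (token, tag) pairs. The document names, the sentences, the tag labels (DET, NOUN, VERB), and the helper `tag_counts` are all hypothetical and chosen only for this example; real corpora use their own formats and tagsets.

```python
from collections import Counter

# Hypothetical toy corpus: two documents, each a list of (token, tag) pairs.
# The tag labels are illustrative and do not follow any specific tagset.
corpus = {
    "doc1": [("The", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    "doc2": [("Dogs", "NOUN"), ("bark", "VERB")],
}


def tag_counts(corpus):
    """Count how often each annotation tag occurs across all documents."""
    return Counter(tag for doc in corpus.values() for _, tag in doc)


counts = tag_counts(corpus)
```

Here `counts` maps each tag to its frequency over the whole collection, which is the kind of query that annotation makes straightforward and that plain, unannotated text would not support directly.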