DeepDict – A Graphical Corpus-based Dictionary of Word Relations

Publication Type: Conference Paper
Year of Publication: 2009
Authors: Bick E
Conference Name: Nordic Conference of Computational Linguistics NODALIDA 2009
Publisher: Northern European Association of Language Technology (NEALT)
Conference Location: Odense, Denmark

In our demonstration, we will present a new type of lexical resource, built from grammatically analysed corpus data. Co-occurrence strength between mother-daughter dependency pairs is used to automatically produce dictionary entries of typical complementation patterns and collocations, in the fashion of an instant monolingual Advanced Learner's dictionary. Entries are supplied to the user in a graphical interface with various thresholds for lexical frequencies as well as absolute and relative co-occurrence frequencies. DeepDict draws its data from Constraint Grammar-analysed corpora, ranging between tens and hundreds of millions of words, covering the major Germanic and Romance languages. Apart from its obvious lexicographical uses, DeepDict also targets teaching environments and translators.
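
As a rough illustration of the idea described above, the following Python sketch ranks the dependents of a head word by co-occurrence strength and filters them with frequency thresholds. It is only a sketch under stated assumptions: the abstract does not name the association measure, so pointwise mutual information (PMI) is used here as a stand-in, and the function name `build_entry`, its parameters, and the toy dependency triples are hypothetical rather than taken from DeepDict itself.

```python
import math
from collections import Counter, defaultdict


def build_entry(pairs, head, min_lemma_freq=1, min_pair_freq=2, min_strength=0.0):
    """Build a toy dictionary entry for `head`.

    pairs: iterable of (head_lemma, relation, dependent_lemma) triples,
           e.g. taken from a dependency-parsed corpus.
    Returns {relation: [(dependent, pair_frequency, strength), ...]},
    with each list sorted by descending strength.

    Assumption: strength is PMI; the actual DeepDict measure may differ.
    """
    pairs = list(pairs)
    total = len(pairs)
    pair_freq = Counter(pairs)
    head_freq = Counter(h for h, _, _ in pairs)
    dep_freq = Counter(d for _, _, d in pairs)

    entry = defaultdict(list)
    for (h, rel, dep), n in pair_freq.items():
        # Absolute co-occurrence threshold.
        if h != head or n < min_pair_freq:
            continue
        # Lexical frequency threshold for both words of the pair.
        if head_freq[h] < min_lemma_freq or dep_freq[dep] < min_lemma_freq:
            continue
        # Relative co-occurrence: observed vs. expected under independence.
        strength = math.log2(n * total / (head_freq[h] * dep_freq[dep]))
        if strength >= min_strength:
            entry[rel].append((dep, n, strength))

    for rel in entry:
        entry[rel].sort(key=lambda item: item[2], reverse=True)
    return dict(entry)


# Invented toy data, for illustration only.
toy_pairs = (
    [("drink", "obj", "tea")] * 3
    + [("drink", "obj", "water")] * 2
    + [("read", "obj", "book")] * 4
    + [("read", "obj", "water")] * 1
)

entry = build_entry(toy_pairs, "drink")
for relation, dependents in entry.items():
    for dependent, freq, strength in dependents:
        print(f"drink --{relation}--> {dependent}  (n={freq}, strength={strength:.2f})")
```

Raising `min_pair_freq` or `min_strength` narrows the entry to the most typical collocates, which mirrors the adjustable thresholds the interface is said to offer.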