Machinese Tokenizer
Machinese Tokenizer is a set of program components that performs basic text analysis tasks quickly. It splits raw text into understandable word units and provides the possible base forms and classes for words.
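As a rough illustration of what "word units" with "possible base forms and classes" can look like, here is a minimal Python sketch. It is not Connexor's API or output format: the `LEXICON` table, the `analyse` function, and the tag names (`DET`, `N`, `V`, `PUNCT`, `UNKNOWN`) are all hypothetical and exist only for this example.

```python
import re

# Hypothetical toy lexicon: lowercased surface form -> possible (base form, class) readings.
# Illustrative only; these entries and tag names are not Connexor data.
LEXICON = {
    "the": [("the", "DET")],
    "dogs": [("dog", "N")],
    "saw": [("see", "V"), ("saw", "N"), ("saw", "V")],
}


def analyse(text):
    """Split text into word and punctuation tokens and list every possible reading."""
    tokens = re.findall(r"\w+|[^\w\s]", text)
    result = []
    for tok in tokens:
        if not re.fullmatch(r"\w+", tok):
            # Anything that is not a word is treated as punctuation.
            readings = [(tok, "PUNCT")]
        else:
            # Unknown words keep their lowercased form as a fallback base form.
            readings = LEXICON.get(tok.lower(), [(tok.lower(), "UNKNOWN")])
        result.append((tok, readings))
    return result


for token, readings in analyse("The dogs saw."):
    print(token, readings)
```

The ambiguous word "saw" keeps all of its readings here, which is what "possible" base forms and classes means. Choosing among them is a separate step, listed as disambiguation in the feature matrix.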
Machinese Phrase Tagger
Machinese Phrase Tagger is a set of program components that performs basic linguistic analysis tasks. It splits raw text into understandable word units and provides the possible base forms and classes for words.
Machinese Syntax
Machinese Syntax provides a full analysis of texts by showing how words and concepts relate to each other in sentences.
Machinese Metadata
Connexor Machinese Metadata is a high-performance text analytics and metadata automation solution, which extracts information in ten different languages.
Feature matrix
The table below shows which output features each Machinese analyser offers. The table excludes Machinese Metadata, whose output is of a completely different kind from that of the other Machinese analysers: it prints only entities and information about their occurrences, rather than every token of the text.
Feature | Machinese Tokenizer | Machinese Phrase Tagger | Machinese Syntax |
---|---|---|---|
Tokenisation | ● | ● | ● |
Morphological analysis | | ● | ● |
Base forms | ● | ● | ● |
Decompounding | ● | ● | ● |
Disambiguation | | ● | ● |
Phrasal analysis | | ● | ● |
Name tagging | | | ● |
Syntactic structure | | | ● |
Glossary
Because the text above includes terms that may not be familiar to all readers, the glossary explains the key terms related to Connexor technologies.