Annotation Information Conversion Tool

Description

MAT 2.0 introduces a new, extremely detailed mechanism for describing the annotation types available to a given task. The documented description language is XML. However, the compiled information is made available to the MAT UI as JSON, and this JSON format is also used in the MAT standalone viewer/annotator and the Java library. This tool converts the information in a known task into the appropriate JSON format.

Usage

Unix:

% $MAT_PKG_HOME/bin/MATAnnotationInfoToJSON

Windows native:

> %MAT_PKG_HOME%\bin\MATAnnotationInfoToJSON.cmd

Usage: MATAnnotationInfoToJSON [options] task outfile

task: the name of a MAT task.
outfile: the output file to write the JSON version of the annotation info to. If the file is '-', the JSON will be written to standard output.

The tool also allows you to control whether redundant info is retained; whether the output is pretty-printed; whether noncontent annotations are included; and whether the output is expanded or simplified JSON.

Options

--dont_remove_redundant_info
By default, the tool removes redundant information to improve readability. If this flag is present, the redundant information will be retained. Both the Java library and the JavaScript standalone viewer are configured to repopulate redundant information.
--compact
By default, the tool pretty-prints its JSON output. If this flag is present, the JSON will not be pretty-printed.
--keep_noncontent_annotations
By default, the tool does not preserve token, zone or admin annotation type descriptors. If this flag is present, all annotation types will be presented.
--simplified
By default, the tool dumps expanded JSON, which covers all the annotation-related task information, including tag order, hierarchy, etc., even if the task does not specify this information. If this flag is present, the tool will dump simplified JSON, which is just a list of the annotation types.
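
To illustrate how these flags combine with the two positional arguments, the following Python sketch assembles a command line that follows the usage pattern above. The helper name build_command and its keyword arguments are hypothetical and not part of MAT; only the flag names and the argument order come from this page. The sketch assumes the Unix layout of the tool under $MAT_PKG_HOME/bin.

```python
import os
import posixpath


def build_command(task, outfile="-", keep_redundant=False, compact=False,
                  keep_noncontent=False, simplified=False, mat_pkg_home=None):
    """Return an argv list for MATAnnotationInfoToJSON (hypothetical helper)."""
    if mat_pkg_home is None:
        # Assumption: MAT_PKG_HOME is set in the environment, as in the usage lines.
        mat_pkg_home = os.environ.get("MAT_PKG_HOME", "")
    # Assumption: Unix layout; the Windows examples use the .cmd wrapper instead.
    cmd = [posixpath.join(mat_pkg_home, "bin", "MATAnnotationInfoToJSON")]
    if keep_redundant:
        cmd.append("--dont_remove_redundant_info")
    if compact:
        cmd.append("--compact")
    if keep_noncontent:
        cmd.append("--keep_noncontent_annotations")
    if simplified:
        cmd.append("--simplified")
    # Options come first, then the task name and the output file.
    cmd.extend([task, outfile])
    return cmd
```

Passing the task name as a single list element means a name containing a space, such as "Named Entity", needs no extra quoting when the list is handed to a process launcher.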

Examples

Example 1

Let's say you want to convert the included "Named Entity" task and save the content annotations to the file ne.json, pretty-printed, without redundant information:

Unix:

% $MAT_PKG_HOME/bin/MATAnnotationInfoToJSON "Named Entity" ne.json

Windows native:

> %MAT_PKG_HOME%\bin\MATAnnotationInfoToJSON.cmd "Named Entity" ne.json

Example 2

Like example 1, but you want the noncontent annotations as well, and you want the output printed to standard output:

Unix:

% $MAT_PKG_HOME/bin/MATAnnotationInfoToJSON --keep_noncontent_annotations "Named Entity" -

Windows native:

> %MAT_PKG_HOME%\bin\MATAnnotationInfoToJSON.cmd --keep_noncontent_annotations "Named Entity" -
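
Because '-' sends the JSON to standard output, another program can consume the result without a temporary file. The Python sketch below shows one way to do that. The function name load_annotation_info is hypothetical, and the sketch assumes that the tool writes nothing but the JSON document to standard output and exits with status 0 on success. It makes no assumption about the structure of the JSON itself.

```python
import json
import subprocess


def load_annotation_info(argv):
    """Run a command that prints JSON to stdout and return the parsed value."""
    # check=True raises CalledProcessError if the command exits with a nonzero status.
    result = subprocess.run(argv, check=True, stdout=subprocess.PIPE)
    return json.loads(result.stdout.decode("utf-8"))


# Possible usage, assuming MAT_PKG_HOME is set and the Unix layout is in use:
#
#   import os
#   tool = os.path.join(os.environ["MAT_PKG_HOME"], "bin", "MATAnnotationInfoToJSON")
#   info = load_annotation_info(
#       [tool, "--keep_noncontent_annotations", "Named Entity", "-"])
```

If the tool turns out to print anything besides the JSON to standard output, json.loads will raise an error, and writing to a named file and reading that file back is the safer route.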