Task XML Reference

This document describes the XML format for task files (see "Creating a New Task"). Use cases are described in a separate document.

Element hierarchy

    <task> (and <tasks>)
       <workflows>
          <workflow>
             <ui_settings>
                <setting>
             <step>
                <create_settings>
                    <setting>
                <run_settings>
                    <setting>
                <ui_settings>
                    <setting>
       <settings>
          <setting>
       <doc_enhancement_class>
       <java_subprocess_parameters>
       <web_customization>
          <js>
          <css>
          <short_name>
          <long_name>
       <model_config>
          <build_settings>
             <setting>
       <default_model>
       <workspace>
          <operation>
             <settings>
                <setting>
       <step_implementations>
          <step>
             <create_settings>
                <setting>
       <annotation_set_descriptors>
          <annotation_set_descriptor>
       <annotation_display>
          <label>
          <attribute>
          <label_group>
       <similarity_profile>
          <stratum>
          <tag_profile>
             <attr_equivalences>
             <dimension>
       <score_profile>
          <aggregation>
          <attr_decomposition>
          <partition_decomposition>
          <label_limitation>

<task>

The toplevel element in the file. For historical reasons, some of the tags are obligatory and some are not. Conceptually, you always need to specify <annotation_set_descriptors>; <model_config>, <workflows> and <step_implementations> are required if you use the engine and experiment infrastructure; and <workspace> is required if you use workspace mode. The other elements are for advanced customizations.

If you want to define multiple tasks in the same task.xml file (if, for instance, you're defining a task and a set of child tasks), you can use <tasks> as your toplevel element. This element has no attributes, and only one repeatable child: <task>.

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| name | a string | yes | The name of the task. This name will appear in menus in the UI, and in help strings in the engine, so make it something mnemonic, distinctive and descriptive. |
| visible | "no" | no | If present, the task is not "visible" in the various lists of tasks the user will see. Typically, this is used if this task is not a leaf in the tree of tasks. You will seldom need this capability. |
| parent | a string | no | The name of the parent task in the hierarchy. If this is not specified, the system root task will be used. You will seldom need this capability. If you do, typically the parent will specify visible="no". |
| class | a string, the name of a Python class | no | If you've found a need to specialize the default task implementation, the value of this attribute should be "<file>.<classname>", where <file> corresponds to a file <taskdir>/python/<file>.py. |
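
As a rough illustration of the attributes above, the following sketch defines a hidden parent task and one child task in a single file. The task names and the workflow name are hypothetical; only the element and attribute names come from this reference.

```xml
<!-- Illustrative sketch only; "Base Entities", "News Entities" and "Demo"
     are hypothetical names. -->
<tasks>
  <!-- Non-leaf parent task, hidden from the task lists shown to the user. -->
  <task name="Base Entities" visible="no">
    <workflows>
      <workflow name="Demo"/>
    </workflows>
  </task>
  <!-- Child task; its parent is "Base Entities" rather than the root task. -->
  <task name="News Entities" parent="Base Entities">
    <workflows inherit_all="yes"/>
  </task>
</tasks>
```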

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <workflows> | yes | no | The workflows that are used in the MAT engine. |
| <settings> | no | no | The task-specific settings which may be viewed by specializations of the root task. |
| <doc_enhancement_class> | no | no | If specified, this element should delimit a string "<file>.<classname>", where <file> corresponds to a file <taskdir>/python/<file>.py. The specified class is a class which contributes to specializations of this documentation for the task in question. This functionality is currently undocumented. This element has no attributes or element children; its value is the text it delimits. |
| <java_subprocess_parameters> | no | no | If present, defaults for various JVM parameters for all Java subprocesses (e.g., Java Carafe training and tagging). |
| <web_customization> | no | no | Customizations of the Web UI. |
| <model_config> | no | yes | Settings for the model building engine. |
| <default_model> | no | no | If present, this element should delimit a pathname where models will be saved if MATModelBuilder is invoked with --save_as_default_model. If the pathname is relative, it will be interpreted as relative to the task directory. This value may be inherited from the parent task. |
| <workspace> | no | no | Implementations of the operations in the workspaces. |
| <step_implementations> | no | no | Implementations of the named steps in the MAT engine workflows. |
| <annotation_set_descriptors> | no | no | The labels and attributes which are used in this task. |
| <annotation_display> | no | no | The display-related properties of the labels and attributes in this task. |
| <similarity_profile> | no | yes | The methods for comparing annotations for scoring and visual comparison. |
| <score_profile> | no | yes | The methods for decomposing and aggregating annotation labels for scoring. |

<workflows> (of <task>)

Workflows are ordered sets of steps, corresponding to a larger-scale activity the user may wish to apply to the documents.

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| inherit | a string, a comma-delimited sequence of workflow names | no | If the task has a non-root parent task, you may use this attribute to inherit workflows from the parent task. The implementations of the step names will also be inherited. You can list multiple workflows, delimited by commas, e.g., "Demo,Hand annotation". |
| inherit_all | "yes" | no | If the task has a non-root parent, you may use this attribute to specify that all workflows should be inherited from the parent. |

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <workflow> | no | yes | An individual workflow. |
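
A minimal sketch of selective inheritance might look like the following. It assumes a non-root parent task that defines workflows named "Demo" and "Hand annotation" (the names used in the example above); the "Review" workflow and its step are hypothetical.

```xml
<!-- Sketch: inherit two workflows from the parent and add one local workflow.
     "Review" and its "tag" step are hypothetical names. -->
<workflows inherit="Demo,Hand annotation">
  <workflow name="Review">
    <step name="tag"/>
  </workflow>
</workflows>
```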

<workflow> (of <workflows>)

Each non-inherited workflow is specified by a <workflow> tag.

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| name | a string | yes | The name of the workflow that the user can specify in the Web UI or the MAT engine. |
| hand_annotation_available_at_end | "yes" | no | If specified, the user will be able to add or correct hand annotations in the document after the last step of this workflow is completed. If one of the steps in the workflow is implemented as a tag step, this attribute will be ignored; similarly, if hand_annotation_available_at_beginning is specified, this attribute will be ignored. |
| hand_annotation_available_at_beginning | "yes" | no | If specified, the user will be able to add or correct hand annotations in the document before the first step of this workflow. |

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <ui_settings> | no | no | These are settings that are intended to be passed unmodified to the UI. This is not currently used. |
| <step> | no | yes | An individual step of a workflow. |

<ui_settings> (of <workflow>)

These are settings that are intended to be passed unmodified to the UI, in order to declaratively configure UI customizations for particular workflows. At the moment, no tasks use this feature. You can configure these settings either with a child <setting> element, or with an attribute on the <ui_settings> element itself; they're interchangeable.

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| <attr> | a string | no | The <ui_settings> tag supports arbitrary attribute-value pairs. |

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <setting> | no | yes | An attribute-value pair. |

<setting> (of <ui_settings>)

An individual UI setting.

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <name> | yes | no | The name of the setting. This element has no attributes or element children; its value is the text it delimits. |
| <value> | yes | no | The value of the setting. This element has no attributes or element children; its value is the text it delimits. |

<step> (of <workflow>)

The steps are the basic elements of workflows.

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| name | a string | yes | The name of the step. These names must be matched in the step implementations. |
| hand_annotation_available | "yes" | no | If specified, hand annotation is available in the Web UI during this step. |
| by_hand | "yes" | no | If specified, this step is performed by the user by hand, not automatically. This step must be defined as a tagging step in the step implementation, and it implies hand_annotation_available="yes". |
| pretty_name | a string | no | The name of this step that the user will see in the UI. |
| proxy_for_steps | a comma-delimited string of step names | no | Steps can be sequences of other steps (i.e., they can be composite). You may want this in a workflow if two steps will always be done as a group, for instance. The names in the value for this attribute must be the values of the "name" attribute of other steps, not the values of the "pretty_name" attribute. |

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <create_settings> | no | no | Settings to pass to the initializer of the step. |
| <run_settings> | no | no | Settings to pass to the execution of a step. |
| <ui_settings> | no | no | Settings to pass to the UI for this step. Not currently used. |
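
The step attributes might be combined as in the sketch below. All step names and pretty names are hypothetical; "zone" and "tokenize" are assumed to be step names that have implementations declared in <step_implementations>.

```xml
<!-- Sketch only: step names and pretty names are hypothetical. -->
<workflow name="Demo">
  <!-- Composite step: stands in for the "zone" and "tokenize" steps,
       referenced by their "name" values, not their pretty names. -->
  <step name="prep" pretty_name="Prepare document"
        proxy_for_steps="zone,tokenize"/>
  <!-- Hand tagging step; by_hand implies hand_annotation_available="yes". -->
  <step name="tag" pretty_name="Hand tag" by_hand="yes"/>
</workflow>
```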

<create_settings> (of <step>)

These are settings that a step might pass to the initialization phase of its step class. These settings override the values in the <create_settings> element of the corresponding <step> in <step_implementations>. You can configure these settings either with a child <setting> element, or with an attribute on the <create_settings> element itself; they're interchangeable.

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| <attr> | a string | no | The <create_settings> tag supports arbitrary attribute-value pairs. |

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <setting> | no | yes | An attribute-value pair. |

<setting> (of <create_settings>)

An individual step creation setting.

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <name> | yes | no | The name of the setting. This element has no attributes or element children; its value is the text it delimits. |
| <value> | yes | no | The value of the setting. This element has no attributes or element children; its value is the text it delimits. |
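
To illustrate the two interchangeable forms, the sketch below writes the same creation setting first as an attribute and then as a child <setting> element. The setting name "my_option" and its value are hypothetical; which settings a step actually accepts depends on its step class.

```xml
<!-- Form 1: the setting as an attribute on <create_settings>.
     "my_option" is a hypothetical setting name. -->
<step name="tag">
  <create_settings my_option="value"/>
</step>

<!-- Form 2: the same setting as a child <setting> element. -->
<step name="tag">
  <create_settings>
    <setting>
      <name>my_option</name>
      <value>value</value>
    </setting>
  </create_settings>
</step>
```

The same pattern applies to the other settings elements in this reference that accept arbitrary attribute-value pairs.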

<run_settings> (of <step>)

These are settings which are passed to the do() or doBatch() method of the step (that's the method that actually performs the step). You can configure the settings either with a child <setting> element, or with an attribute on the <run_settings> element itself; they're interchangeable.

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| <attr> | a string | no | The <run_settings> tag supports arbitrary attribute-value pairs. |

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <setting> | no | yes | An attribute-value pair. |

Most predefined step implementations in MAT do not support any run settings. The two implementations which do are MAT.JavaCarafe.CarafeTokenizationStep and MAT.JavaCarafe.CarafeTagStep.

The MAT.JavaCarafe.CarafeTagStep step implements automatic tagging. Any step which implements automatic tagging can bear the following additional attribute-value pairs:

| Key | Value | Description |
|---|---|---|
| tagger_local | "yes" | By default, the MAT engine will contact the MAT Web server to tag a document, because the Web server has the capability of starting up and monitoring a long-living tagger task. The reason this is beneficial is that the Carafe tagger, like many model-based taggers, has a fairly expensive startup cost. To block the engine from contacting the Web server, and force it to start up and shut down the tagger on its own, specify tagger_local="yes". |
| tagger_model | a string, a filename of a tagging model | If the task does not have a default model, the user must specify the location of the tagger model. |

In addition, the Carafe tagging and tokenization steps support other run settings, which are documented separately.
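
As a sketch, the two tagging run settings above could be supplied on a workflow step as follows. The step name "tag" is assumed to be implemented by MAT.JavaCarafe.CarafeTagStep in <step_implementations>, and the model path is hypothetical.

```xml
<!-- Sketch: run the tagger locally with an explicitly named model.
     The model path "models/my_model" is hypothetical. -->
<step name="tag">
  <run_settings tagger_local="yes" tagger_model="models/my_model"/>
</step>
```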

<setting> (of <run_settings>)

An individual run setting.

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <name> | yes | no | The name of the setting. This element has no attributes or element children; its value is the text it delimits. |
| <value> | yes | no | The value of the setting. This element has no attributes or element children; its value is the text it delimits. |

<ui_settings> (of <step>)

These are settings that are intended to be passed unmodified to the UI, in order to declaratively configure UI customizations for particular steps. At the moment, no tasks use this feature. You can configure these settings either with a child <setting> element, or with an attribute on the <ui_settings> element itself; they're interchangeable.

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| <attr> | a string | no | The <ui_settings> tag supports arbitrary attribute-value pairs. |

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <setting> | no | yes | An attribute-value pair. |

<setting> (of <ui_settings>)

An individual UI setting.

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <name> | yes | no | The name of the setting. This element has no attributes or element children; its value is the text it delimits. |
| <value> | yes | no | The value of the setting. This element has no attributes or element children; its value is the text it delimits. |

<settings> (of <task>)

These are settings that a specialized task might require which the user wishes to be able to configure in XML, rather than by modifying the source code for the specialized task. The chances that a normal user will use this are extremely slim. These settings are not inherited by task children.

You can configure the settings either with a child <setting> element, or with an attribute on the <settings> element itself; they're interchangeable.

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| <attr> | a string | no | The <settings> tag supports arbitrary attribute-value pairs. |

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <setting> | no | yes | An attribute-value pair. |

<setting> (of <settings>)

An individual task-level setting.

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <name> | yes | no | The name of the task-level setting. This element has no attributes or element children; its value is the text it delimits. |
| <value> | yes | no | The value of the task-level setting. This element has no attributes or element children; its value is the text it delimits. |

<java_subprocess_parameters> (of <task>)

MAT has some built-in tools to control Java Carafe and other Java subprocesses. Using this element, you can declare default settings for Java heap and stack sizes. If not set locally, these settings are inherited from parent tasks.

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| heap_size | a string | no | The value here is a value for the heap size for the Java VM. It is passed to the Java VM using the -Xmx argument. Values like 512M or 2G are examples of expected values. This default value can be overridden by declaring the empty string ("") in any configuration context where the heap size can be specified (see the Java Carafe engine for examples). |
| stack_size | a string | no | The value here is a value for the stack size for the Java VM. It is passed to the Java VM using the -Xss argument. Values like 4096k or 512k are examples of expected values. This default value can be overridden by declaring the empty string ("") in any configuration context where the stack size can be specified (see the Java Carafe engine for examples). |
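
A declaration of JVM defaults might look like the sketch below. It assumes that heap_size and stack_size are written as XML attributes, as the table header suggests; the values are taken from the examples in the descriptions.

```xml
<!-- Sketch: default JVM heap (-Xmx) and stack (-Xss) sizes for all Java
     subprocesses of this task. Assumes attribute syntax. -->
<java_subprocess_parameters heap_size="2G" stack_size="512k"/>
```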

<web_customization> (of <task>)

One of the ways a task can be customized is through the Web UI. This process is quite complicated; it's almost entirely code-oriented, and it's not documented at all. This section is here for reference only; users who aren't really, really brave shouldn't go anywhere near most of these customizations.

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| inherit_css | "no" | no | If the parent task has CSS customizations, as specified in the <css> element below, they are inherited by default. Use this setting to block inheritance. |
| inherit_js | "no" | no | If the parent task has Javascript customizations, as specified in the <js> element below, they are inherited by default. Use this setting to block inheritance. |
| display_config | a string | no | Each Web customization set has a name, so that when the user selects a particular task, the UI knows which customization set to use. Can be inherited from parent tasks; a value of "" cancels the inheritance. |
| alphabetize_labels | "no" | no | By default, the MAT UI orders the annotation labels alphabetically in the legend and the tag popup menu. If this attribute is set, the UI will list the annotation labels in the order they are defined in the <tags> element. Can be inherited from parent tasks; a value of "" cancels the inheritance. |
| tokenless_autotag_delimiters | a string | no | By default, if you ask the MAT UI to autotag similar strings when you're annotating without tokens, the only edge conditions that the UI recognizes are whitespace and zone boundaries. If your match abuts a punctuation mark, it will not recognize it as a delimiter. If you want other edge conditions to be recognized, you can list them in the value of this attribute. (Remember, though, that you may have to use the XML entity character codes for those characters which are significant to XML syntax, so that the XML parsing doesn't fail.) This setting can be inherited from parent tasks; a value of "" cancels the inheritance. |
| text_right_to_left | "yes" | no | If specified, documents viewed in this task in the MAT UI will be treated as right-to-left text (e.g., Arabic). Can be inherited from parent tasks; a value of "" cancels the inheritance. |

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <js> | no | yes | The relative pathname of the Javascript customizations. This path is relative to the task directory. By convention, this file should be in the "js" subdirectory. This element has no attributes or element children; its value is the text it delimits. |
| <css> | no | yes | The relative pathname of the CSS customizations. This path is relative to the task directory. By convention, this file should be in the "css" subdirectory. This element has no attributes or element children; its value is the text it delimits. |
| <short_name> | no | no | This is the name that the UI will display in the upper left corner if this customization is the only customization available. This setting will be inherited by child tasks. This element has no attributes or element children; its value is the text it delimits. |
| <long_name> | no | no | This is the name that the UI will use as the title of the Web page if this customization is the only customization available. This setting will be inherited by child tasks. This element has no attributes or element children; its value is the text it delimits. |
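
The sketch below shows how these attributes and children might fit together. The display config name, file names, and task names are all hypothetical; the quoted ampersand illustrates the use of XML entity codes in tokenless_autotag_delimiters.

```xml
<!-- Sketch only: every name and path here is hypothetical. -->
<web_customization display_config="my_task_display"
                   inherit_css="no"
                   tokenless_autotag_delimiters=".,;&amp;">
  <!-- Paths are relative to the task directory. -->
  <js>js/my_task.js</js>
  <css>css/my_task.css</css>
  <short_name>MyTask</short_name>
  <long_name>My Annotation Task</long_name>
</web_customization>
```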

<model_config> (of <task>)

It's also possible to configure various dimensions of the model build process in the task.xml file. The settings for this config are identical to the command-line options available for the MATModelBuilder. There is no default model build engine in a task.xml file; if you want to build models, you must declare a model config.

MAT is delivered with a default Carafe model builder.

You can have multiple <model_config> entries, as long as they differ by the config_name attribute. If a named or default model config isn't found when requested by MATModelBuilder or the experiment engine, MAT will look for it in the parent task.

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| class | the name of a Python class | yes | This attribute names the class which will be used as the model builder. The default Carafe model builder class is MAT.JavaCarafe.CarafeModelBuilder. |
| config_name | a string | no | If present, a config name to specify as the --config_name in MATModelBuilder, or for the config_name attribute in <build_settings> in the experiment engine. If omitted, this entry is the default model config. There can be only one default. |

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <build_settings> | no | no | The settings for this model config. |

<build_settings> (of <model_config>)

The <build_settings> tag supports arbitrary attribute-value pairs which are passed to the model builder. See the documentation for the Carafe model builder to see which attributes should be supplied to that engine. You can configure these settings either with a child <setting> element, or with an attribute on the <build_settings> element itself; they're interchangeable.

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| <attr> | a string | no | The <build_settings> tag supports arbitrary attribute-value pairs. |

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <setting> | no | yes | An attribute-value pair. |

<setting> (of <build_settings>)

An individual build setting.

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <name> | yes | no | The name of the setting. This element has no attributes or element children; its value is the text it delimits. |
| <value> | yes | no | The value of the setting. This element has no attributes or element children; its value is the text it delimits. |
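
A task with one default and one named model config might be sketched as follows. The class name is the default Carafe model builder named above; the config name "alt" and the build setting "my_build_option" are hypothetical placeholders, since the supported build settings are documented with the Carafe model builder.

```xml
<!-- Default model config: no config_name attribute. -->
<model_config class="MAT.JavaCarafe.CarafeModelBuilder"/>

<!-- Named model config, selected with its config_name.
     "alt" and "my_build_option" are hypothetical. -->
<model_config class="MAT.JavaCarafe.CarafeModelBuilder" config_name="alt">
  <build_settings my_build_option="value"/>
</model_config>
```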

<workspace> (of <task>)

If you want to use workspace mode, you must declare how the various workspace operations are implemented. These operations are described in the documentation on workspaces.

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| inherit_operations | "no" | no | By default, workspace operation implementations are inherited from the task parent, if not available locally. Use this attribute to block inheritance. |

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <operation> | yes | yes | An individual operation. |

<operation> (of <workspace>)

Specifies the implementation of a workspace operation. Note that in spite of the fact that operations are associated with folders, these operations are referenced only by name, because the operations should be named uniquely.

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| name | a string | yes | The name of the operation. |

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <settings> | no | no | The operation settings. |

<settings> (of <operation>)

The settings for the operation. What these settings are depends on what sort of operation it is. For instance, for operations which invoke the MAT engine, these settings will be the arguments to the MAT engine. For operations which invoke the MAT model builder, these settings will be the arguments to the MAT model builder. See the documentation on workspaces to find out what the options are for particular operations.

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| <attr> | a string | no | The <settings> tag supports arbitrary attribute-value pairs. |

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <setting> | no | yes | An attribute-value pair. |

<setting> (of <settings>)

An individual operation setting.

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <name> | yes | no | The name of the setting. This element has no attributes or element children; its value is the text it delimits. |
| <value> | yes | no | The value of the setting. This element has no attributes or element children; its value is the text it delimits. |
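
A workspace declaration might take roughly the following shape. The operation name "autotag" and the setting names "workflow" and "steps" are assumptions made for illustration (the latter two are assumed to correspond to MAT engine arguments); consult the documentation on workspaces for the real operation names and options.

```xml
<!-- Sketch only: "autotag", "workflow" and "steps" are assumed names,
     not confirmed by this reference. -->
<workspace>
  <operation name="autotag">
    <settings workflow="Demo" steps="tag"/>
  </operation>
</workspace>
```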

<step_implementations> (of <task>)

Step implementations associate a named step with an implementation for that step (i.e., a Python class), perhaps in the context of particular workflows. The effect of each named step in a task is global; e.g., the "tag" step might add content annotations. However, the way that effect is achieved may differ among step implementations; e.g., one implementation of the tag step may involve hand annotation, or there may be multiple possibilities for adding the tags automatically. By default, step implementations are inherited from the parent.

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <step> | no | yes | An individual step implementation. |

<step> (of <step_implementations>)

Each step implementation specifies, at a minimum, the Python class that implements the step.

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| name | a string | yes | The name of a step as it is used in workflows. These are values of the "name" attribute for the <workflow> <step> element, not the "pretty_name" attribute. |
| class | a string, the name of a Python class | yes | The Python class, including its module name, which implements this step. |
| workflows | a comma-delimited string of workflow names | no | The workflow contexts in which this implementation holds. Different workflows can have different implementations for the same named step. |

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <create_settings> | no | no | Default settings for initializing the step. |

<create_settings> (of <step>)

These are settings that a step might pass to the initialization phase of its step class. These settings can be overridden by the values in the <create_settings> element for <step> in the <workflow> element. You can configure these settings either with a child <setting> element, or with an attribute on the <create_settings> element itself; they're interchangeable.

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| <attr> | a string | no | The <create_settings> tag supports arbitrary attribute-value pairs. |

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <setting> | no | yes | An attribute-value pair. |

<setting> (of <create_settings>)

An individual step creation setting.

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <name> | yes | no | The name of the setting. This element has no attributes or element children; its value is the text it delimits. |
| <value> | yes | no | The value of the setting. This element has no attributes or element children; its value is the text it delimits. |
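
The sketch below shows one way step implementations could be declared, including two implementations of the same named step in different workflow contexts. The two Carafe class names appear earlier in this reference; "MyPlugin.MyHandTagStep", "my_option", and the step and workflow names are hypothetical.

```xml
<step_implementations>
  <step name="tokenize" class="MAT.JavaCarafe.CarafeTokenizationStep"/>
  <!-- Automatic tagging in the "Demo" workflow. -->
  <step name="tag" class="MAT.JavaCarafe.CarafeTagStep" workflows="Demo"/>
  <!-- A different implementation of "tag" for another workflow.
       The class and the creation setting are hypothetical. -->
  <step name="tag" class="MyPlugin.MyHandTagStep" workflows="Hand annotation">
    <create_settings my_option="value"/>
  </step>
</step_implementations>
```

A <create_settings> element on the matching <step> inside a <workflow> would override the default given here.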

<annotation_set_descriptors> (of <task>)

The <annotation_set_descriptors> element allows you to define multiple annotation sets. In the current implementation, you should have only one, which should have its attributes set as shown immediately below. You can inherit annotations from other tasks.

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| all_annotations_known | "yes" | no | By default, the task leaves its annotation sets "open"; i.e., if the task encounters an unknown annotation label, it won't raise an error. If you provide the value "yes" for this attribute, an error will be raised if the task encounters an unknown annotation. |
| inherit | a comma-separated list of labels to inherit | no | You can inherit annotations from other tasks, either by label or by category (see the "category" attribute of <annotation_set_descriptor>). To inherit an annotation by label, simply list it; to inherit a category, list "category:" + the category name. A typical value for this attribute is "category:zone,category:token", which inherits the annotations for the zone and token categories from the parent (usually root) task. |

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <annotation_set_descriptor> | no | yes | An individual annotation set descriptor. |

<annotation_set_descriptor> (of <annotation_set_descriptors>)

The <annotation_set_descriptor> element is described in detail elsewhere (eventually, you'll be able to specify it in its own file, and share these files among tasks). The only elements of this type you should be declaring should have category="content" and name="content".

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| category | a string | no | The category of the annotation set descriptor. Eventually, we intend for these values to be user-definable (aside from a few predetermined values like "zone" and "token"), but for now, the value for this attribute for those descriptors you define should be "content". |
| name | a string | yes | The name of the annotation set descriptor. This attribute is distinguished from the category attribute in that, eventually, we'll treat the category attribute as a functional one, which can specify values which different descriptors can fill in different tasks. Eventually, we intend for the value of the "name" attribute to be user-definable, but for now, the value for this attribute for those descriptors you define should be "content". |
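
Putting the recommended values together, a declaration might look like the sketch below. The attribute values are the ones recommended in this section; the contents of the descriptor are left as a placeholder because its syntax is described elsewhere.

```xml
<annotation_set_descriptors all_annotations_known="yes"
                            inherit="category:zone,category:token">
  <annotation_set_descriptor category="content" name="content">
    <!-- Label and attribute definitions go here; see the descriptor
         documentation for their syntax. -->
  </annotation_set_descriptor>
</annotation_set_descriptors>
```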

<annotation_display> (of <task>)

This element defines all the display-related properties in the MAT UI of the elements defined in the <annotation_set_descriptors> element. Most of what you can do here is define the display-related properties of labels, although you can also define some of the properties of attributes, and also define groups for hierarchical annotation displays.

Children

| Element | Obligatory? | Repeatable? | Description |
|---|---|---|---|
| <label> | no | yes | Defines the display-related properties for a true or effective label. |
| <attribute> | no | yes | Defines the display-related properties for an attribute of a particular label. |
| <label_group> | no | yes | Defines groups for hierarchical annotation displays. |

<label> (of <annotation_display>)

This element defines the display-related properties of true or effective labels.

Attributes

| Attribute | Value | Obligatory? | Description |
|---|---|---|---|
| name | a string | yes | The true or effective label to which this definition applies. |
| accelerator | a single-character string | no | If specified, this accelerator will be available in the annotation selection menu in the Web UI; if the user presses this key, this annotation will be selected for the span, just as if the element in the menu had been chosen. |
| edit_immediately | "yes" | no | If a label has attributes, or if it can be the value of an annotation-valued attribute, it is possible to edit the annotation in the UI, either in a popup dialog or in a detail tab. If the annotation is spanless, this editor will appear automatically when the annotation is created; if it is spanned, it will not. If you provide this attribute-value pair, the editor will appear automatically when the annotation is created, whether or not it's spanned. |
| presented_name | a format string | no | In the UI, there are many places (e.g., in the annotation tables) where the annotation can be described, and by default, the description for spanned annotations is the covered text, while the description for spanless annotations is the annotation ID. Sometimes, this name isn't what you would prefer it to be, and you can use this attribute to define the name you prefer. The syntax for the value of this string is described immediately below. |
| css | a string of legal CSS | no | If the label is a spanned label, the UI will apply this CSS on a token-by-token basis to any span labeled by this annotation. If the label is a spanless label, the CSS will be applied to its icon in the spanless sidebar. Because there's no text in the spanless sidebar, and because annotations can overlap and be displayed in an exploded, stacked representation in spanned hand annotation and in comparison, it's probably best to ensure that the primary aspect of this CSS is background styling. |

The syntax of the presented_name attribute

The presented_name attribute is a simple string containing format directives of the form $(...). The value within the parentheses can be any of the attributes of the label (in which case the format directive will be replaced by the attribute value), or one of the special values listed below. The format directive can also contain key-value pairs, as in $(val:a=b,c=d). The possible key-value pairs are described below as well.

| Special value | Interpretation |
|---|---|
| _start | The start index of the spanned annotation |
| _end | The end index of the spanned annotation |
| _parent | The parent annotation(s) (i.e., the annotation(s) which have this annotation as a value of an annotation-valued attribute) |
| _label | The true or effective label of the annotation |
| _text | The spanned text of the spanned annotation |

| Key | Available for | Possible value | Interpretation |
|---|---|---|---|
| truncate | _text | integer | If specified, the UI will ensure that the spanned text is no longer than the given number of characters. This value cannot be less than 5. The UI will show the beginning and end of the text, and show the truncated medial text with ellipses (...). |
| truncate | _parent | integer | If specified, the UI will limit the number of parent annotations listed, and indicate the remainder of the list with ellipses (...). |
| showLabel | _parent, any annotation-valued attribute | "no" | By default, when the UI displays an annotation attribute value as part of the name of an annotation, it contains the label of the annotation value. If you don't want the label displayed, provide this key-value pair. |
| showIndices | _parent, any annotation-valued attribute | "yes" | By default, when the UI displays an annotation attribute value as part of the name of an annotation, it does not display its start and end indices. If you want these indices displayed, provide this key-value pair. |
| showFormattedName | _parent, any annotation-valued attribute | "yes" | By default, when the UI displays an annotation attribute value as part of the name of an annotation, it does not display the value's formatted name. If you want the formatted name displayed, provide this key-value pair. |
| showFeatures | _parent, any annotation-valued attribute | "yes" | By default, when the UI displays an annotation attribute value as part of the name of an annotation, it does not display the value's attribute-value pairs. If you want these pairs displayed, provide this key-value pair. |

So, for example, if you want the presented name of your annotation to contain the text truncated to 20 characters, your value for presented_name would be "$(_text:truncate=20)".
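
As a fuller illustration, the sketch below combines several directives in one presented_name value. It assumes that presented_name is set on a <label> element identified by a name attribute, that literal text may appear between directives, and that PERSON is a label in your task; these are assumptions for illustration, not details confirmed in this section.

```xml
<annotation_display>
  <!-- Hypothetical PERSON label: shows the label, the spanned text
       (truncated to 20 characters), and the start and end indices. -->
  <label name="PERSON"
         presented_name="$(_label): $(_text:truncate=20) [$(_start)-$(_end)]"/>
</annotation_display>
```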

<attribute> (of <annotation_display>)

This element defines the display-related properties for specific annotation attributes.

Attributes

| Attribute | Value | Obligatory? | Description |
| --- | --- | --- | --- |
| name | a string | yes | The name of a known attribute |
| of_annotation | a string | yes | The name of a known true or effective label |
| editor_style | "long" | no | There are two ways of providing attribute values for non-choice string attributes in the UI: via a short typein window or via a multi-line typein window. By default, a short typein window will be used. If you provide this attribute-value pair, a long typein window will be used. Ignored if the attribute is not a string attribute, or if it's a choice attribute. |
| read_only | "yes" | no | Attributes are typically editable. If for some reason you don't want to be able to edit an attribute directly in the annotation editor (e.g., the attribute value is automatically populated by a custom editor, as described below), use this setting. |
| custom_editor | a string | no | You may have a string attribute which is actually a date, which you want to use a calendar widget to populate; or you might want to look up the annotation text in a database, and use the results to populate the attribute value. If you're willing to do some programming, you can use this attribute to specify an arbitrary JavaScript function for a string, int, or float attribute, to use as its editor. You can define your function in your task directory in your JavaScript customization file. Unfortunately, we don't really have the resources to document the API this function has to conform to; either dig through the source code yourself, or ask us for help. |
| custom_editor_is_multiattribute | "yes" | no | If you've associated a custom_editor with this attribute, this attribute-value pair tells the UI that the editor will fill multiple attributes. |
| custom_editor_button_label | a string | no | If you have a custom editor, but you want the button label to be something other than "Edit" (let's say the value is automatically calculated when you press the button), use this. |
| url_link | a string | no | When the annotation attribute is displayed in the annotation editor or the annotation table, the annotation will be used to construct a URL link. The syntax is identical to that of presented_name above, except that of the special values, only _text is recognized, and no key-value pairs are recognized within each directive. So, e.g., if you want the link on an annotation attribute to search for the spanned text in Google, the value of url_link would be http://www.google.com/search?q=$(_text) |
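
A minimal sketch of how these attributes might be combined follows. The label PERSON and the attribute names comment and source are hypothetical; substitute names declared in your own task.

```xml
<annotation_display>
  <!-- Multi-line editor for a hypothetical free-text "comment" attribute. -->
  <attribute name="comment" of_annotation="PERSON" editor_style="long"/>
  <!-- Hypothetical read-only "source" attribute, linked to a search
       on the spanned text. -->
  <attribute name="source" of_annotation="PERSON" read_only="yes"
             url_link="http://www.google.com/search?q=$(_text)"/>
</annotation_display>
```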

<label_group> (of <annotation_display>)

Under some circumstances, you might want to create cascaded annotation menus in the MAT UI, perhaps in order to group together similar annotations, or provide options for more or less general annotations, or to compress the screen real estate taken up by the annotation popup menu. You can use the <label_group> element to accomplish this.

Each label group has a name. This might correspond to a known true or effective label (in which case it refers to that annotation), or it might be a previously unknown name, in which case it serves merely as a group. The children of each label group can be actual annotation names, or other known label groups. The annotations the label group refers to must be content annotations. If the label group is not otherwise known, it has the option of declaring CSS styling for the menu entry.

Label groups are inherited from an available parent task; this group information is filtered by the locally defined annotations. Local label groups override parent label groups.

Attributes

| Attribute | Value | Obligatory? | Description |
| --- | --- | --- | --- |
| name | a string | yes | The name of the label group, either a new name or the name of a true or effective content annotation |
| children | a string | yes | A comma-delimited sequence of names, either names of existing content annotations or of known label groups |
| css | a string | no | Like the css attribute on the <label> element above. Used to assign styling to otherwise unknown label groups. |
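
The following sketch shows one way a cascaded menu might be declared. The labels PERSON, ORGANIZATION, LOCATION and DATE and the group names NAMES and ENTITIES are hypothetical, and the css value assumes that ordinary CSS declarations are accepted, as for <label>.

```xml
<annotation_display>
  <!-- NAMES is not a known label, so it serves only as a menu group
       and may carry its own styling. -->
  <label_group name="NAMES" children="PERSON,ORGANIZATION,LOCATION"
               css="font-weight: bold"/>
  <!-- A group may list other known label groups among its children. -->
  <label_group name="ENTITIES" children="NAMES,DATE"/>
</annotation_display>
```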

<similarity_profile> (of <task>)

When you run the MATScore engine, or produce a visual comparison of annotations, MAT uses a set of heuristics to determine the best pairing of annotations. You can affect this process using the <similarity_profile> element.

Similarity profiles are not inherited.

Attributes

| Attribute | Value | Obligatory? | Description |
| --- | --- | --- | --- |
| name | a string | no | The name of the profile, for use when creating comparison documents or scoring. If no name is provided, this is the default profile for the task. There can be only one unnamed profile. |

Children

| Element | Obligatory? | Repeatable? | Description |
| --- | --- | --- | --- |
| <stratum> | no | yes | The comparison algorithm is stratified (see the algorithm for more details). You can use this element to define the strata, rather than allowing them to be inferred. |
| <tag_profile> | no | yes | There's a default similarity profile for spanned and spanless annotations. If you want to declare your own profile explicitly, you can do that with this element. |

<stratum> (of <similarity_profile>)

The comparison algorithm is stratified (see the algorithm for more details). You can use this element to define the strata, rather than allowing them to be inferred.

Attributes

| Attribute | Value | Obligatory? | Description |
| --- | --- | --- | --- |
| true_labels | a comma-separated string of labels | yes | The labels in this stratum. Note that these labels must be true labels, not effective labels. |

<tag_profile> (of <similarity_profile>)

There's a default similarity profile for spanned and spanless annotations. If you want to declare your own profile explicitly, you can do that with this element. See the algorithm for details on how to use these.

Attributes

| Attribute | Value | Obligatory? | Description |
| --- | --- | --- | --- |
| true_labels | a comma-separated string of labels | yes | The labels to which this profile applies. Note that these labels must be true labels, not effective labels. |

Children

| Element | Obligatory? | Repeatable? | Description |
| --- | --- | --- | --- |
| <attr_equivalences> | no | yes | Equivalences for attributes among the various labels in the profile. |
| <dimension> | yes | yes | One dimension of the profile. |

<attr_equivalences> (of <tag_profile>)

The true labels in your tag profile may vary in their attribute names, but you may still want these attributes to be comparable. This element allows you to declare your equivalences. See the algorithm for details about the various dimensions.

Attributes

| Attribute | Value | Obligatory? | Description |
| --- | --- | --- | --- |
| name | a string | yes | The name of the equivalence. See the algorithm for a list of legal names and their interpretations. |
| attrs | a comma-separated string | yes | All attributes which stand in this equivalence. Each label in your profile must have at least one of these attributes, and no attribute name can appear more than once among the equivalences in the profile. |

<dimension> (of <tag_profile>)

Each profile consists of a number of dimensions, which define some aspect of the annotation to use in comparison, along with the method to be used for comparison and the relative weight of the dimension. See the algorithm for details about the various dimensions.

Attributes

| Attribute | Value | Obligatory? | Description |
| --- | --- | --- | --- |
| name | a string | yes | The name of the dimension. See the algorithm for a list of legal names and their interpretations. |
| weight | a number | yes | The relative weight of the dimension. The weights of all the dimensions will be normalized. |
| param_digester_method | a Python function name | no | In rare circumstances, the dimension method may accept parameters (see <attr> below) and these parameters may need to be interpreted (e.g., "yes" -> True). The full name of the function, including the module it's in, must be specified. |
| aggregator_method | a Python function name | no | If special handling is required for a dimension which has an aggregation value, this option allows you to declare the handler. The full name of the function, including the module it's in, must be specified. |
| method | a string | no | The method associated with the dimension, if not the default method. See the algorithm for a list of legal names. |
| <attr> | a string | no | The <dimension> element supports arbitrary attribute-value pairs. |
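
Putting the similarity elements together, a profile might look like the sketch below. The profile name strict_names and the labels are hypothetical, and the dimension names _span and _label are assumed for illustration only; consult the algorithm documentation for the legal dimension names and methods before using them.

```xml
<similarity_profile name="strict_names">
  <!-- Compare the three hypothetical name labels in a single stratum. -->
  <stratum true_labels="PERSON,ORGANIZATION,LOCATION"/>
  <tag_profile true_labels="PERSON,ORGANIZATION,LOCATION">
    <!-- Weights are relative: here the first dimension counts three
         times as much as the second. Dimension names are assumed. -->
    <dimension name="_span" weight="3"/>
    <dimension name="_label" weight="1"/>
  </tag_profile>
</similarity_profile>
```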

<score_profile> (of <task>)

When you run the MATScore engine, you can control how the scored elements are aggregated, decomposed, or filtered in the scoring output. See the algorithm for details on how to use this.

Score profiles are not inherited.

Attributes

| Attribute | Value | Obligatory? | Description |
| --- | --- | --- | --- |
| name | a string | no | The name of the profile, for use when scoring. If no name is provided, this is the default profile for the task. There can be only one unnamed profile. |

Children

| Element | Obligatory? | Repeatable? | Description |
| --- | --- | --- | --- |
| <aggregation> | no | yes | A set of labels to aggregate as a separate entry. |
| <attr_decomposition> | no | yes | An attribute-based decomposition of particular labels to report as a separate entry. |
| <partition_decomposition> | no | yes | A function-based decomposition of particular labels to report as a separate entry. |
| <label_limitation> | no | no | A list of labels to restrict the overall reporting to. |

<aggregation> (of <score_profile>)

Under normal circumstances, annotations are aggregated per document and per run by effective label (if available) or true label, or by equivalence classes passed to MATScore, and then all together into a single heap. You can add other aggregations of true labels using this element.

Attributes

| Attribute | Value | Obligatory? | Description |
| --- | --- | --- | --- |
| name | a string | yes | The name of the aggregation as it will appear in the output spreadsheet |
| true_labels | a comma-separated string of labels | yes | The true labels in this aggregation. |

<attr_decomposition> (of <score_profile>)

Under normal circumstances, the only way to decompose true labels in the score output is by effective label. If you want to decompose them by a particular attribute (e.g., you want to see the score for ENAMEXes when type = NOM), you can use this element. Decompositions can overlap with each other.

Attributes

| Attribute | Value | Obligatory? | Description |
| --- | --- | --- | --- |
| true_labels | a comma-separated string of labels | yes | The true labels to which this decomposition applies. |
| attrs | a comma-separated string of attrs | yes | The names of attributes defined for all the listed labels. There will be a separate decomposition for each tuple of values for these attrs. The name of the decomposition in the score output will be <attr1>=<val1> <attr2>=<val2>... |
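
For the ENAMEX case mentioned above, a decomposition might be declared as in this sketch. The profile name by_type is hypothetical, and the sketch assumes that ENAMEX is a true label with a type attribute in your task.

```xml
<score_profile name="by_type">
  <!-- One output entry per value of the "type" attribute,
       e.g. an entry named type=NOM. -->
  <attr_decomposition true_labels="ENAMEX" attrs="type"/>
</score_profile>
```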

<partition_decomposition> (of <score_profile>)

Under normal circumstances, the only way to decompose true labels in the score output is by effective label. If you want to decompose them by a Python function, you can use this element. Decompositions can overlap with each other.

Attributes

| Attribute | Value | Obligatory? | Description |
| --- | --- | --- | --- |
| true_labels | a comma-separated string of labels | yes | The true labels to which this decomposition applies. |
| method | a Python function name | yes | This function must take a single argument, which will be an annotation, and return a value. For instance, if you're evaluating a geotagger, and the tagger provides a country attribute for the location, and you want to decompose location scores by US and non-US, you'd define a function which returns "US" if the country attribute is "US", and "non-US" otherwise. The full name of the function, including the module it's in, must be specified. The name of the decomposition in the score output will be <bare function name>=<val>. |
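
The geotagger case could be wired up roughly as follows. The module path my_task.scoring, the function name us_or_not, the label LOCATION and the profile name by_region are all hypothetical; the function itself would have to be written separately in Python.

```xml
<score_profile name="by_region">
  <!-- my_task.scoring.us_or_not is a hypothetical function that takes an
       annotation and returns "US" or "non-US". The output entries would
       then be named us_or_not=US and us_or_not=non-US. -->
  <partition_decomposition true_labels="LOCATION"
                           method="my_task.scoring.us_or_not"/>
</score_profile>
```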

<label_limitation> (of <score_profile>)

The scorer will pair all annotations which are not specified as being ignored. Sometimes, you might need to pair some annotations as part of the scoring process (let's say they're arguments of relations, for instance), but you don't want them in the final output, even though you can't ignore them. You can use this element to provide that filter.

Attributes

| Attribute | Value | Obligatory? | Description |
| --- | --- | --- | --- |
| true_labels | a comma-separated string of labels | yes | Only these true labels (and the effective labels that are defined on them) will be included in the scoring output. |
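
As a closing sketch, an unnamed (default) score profile might combine an aggregation with a label limitation. The labels and the aggregation name NAMES are hypothetical, and the sketch assumes a further MENTION label exists in the task that must be paired during scoring but should not be reported.

```xml
<score_profile>
  <!-- Report PERSON and ORGANIZATION together under one extra entry. -->
  <aggregation name="NAMES" true_labels="PERSON,ORGANIZATION"/>
  <!-- Only these labels are reported. The hypothetical MENTION label is
       left out of the output but is still paired by the scorer. -->
  <label_limitation true_labels="PERSON,ORGANIZATION"/>
</score_profile>
```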