At various points, you might want to know how your system is
performing, and you might want to conduct an experiment. In a
typical experiment of this sort, you'd have to do a number of
things:
You may also want to select your training and test corpora by
partitioning an single corpus of documents, or iterate through a
variety of settings for the model builder or tagger to compare
their performance (e.g., different bias values for recall vs.
precision).
MAT provides an engine to do all this for you, guided by an extensive, declarative XML specification for your experiment. The experiment XML file consists of three types of information:
The experiment engine runs this experiment in a directory which
is provided to it either via the XML file (the "dir" attribute of
the <experiment> element) or on the command line (the
--exp_dir option). The experiment engine prepares the corpora,
builds the models, and performs the experiment runs in this
directory. The experiment XML file is copied into the experiment
directory, if a file with the same name is not already present.
We document the wide range of ways that this engine can be used here; we describe the rich structure of its output directory here.