Package org.mitre.mat.engineclient

Implements a Java client interface to the MAT processing engine.

Interface Summary
MATEngineClientInterface What will someday be an API for arbitrary MAT clients, independent of protocol.
 

Class Summary
MATCgiClient The class which implements CGI interaction with the MAT engine.
MATCgiClientDemo  
 

Exception Summary
MATEngineClientException  
 

Package org.mitre.mat.engineclient Description

Accessing the MAT engine

If you're running the MAT Web server (see the main MAT documentation), you can use this library to perform automated tagging steps. To use this library, you should understand how the MAT engine works and how to use workflows and steps.

At the moment, the MAT Web server presents only a CGI interface, although there are plans to implement other protocols as well.

Performing a file-level operation

Only forward steps are supported; undo is not supported yet.

The client object accepts an input document, and returns a document of the same Java class.

Let's say your Web server is running on the default port, and you want to tokenize, zone and tag a document using the sample "Named Entity" task.

    String url = "http://localhost:7801";
    MATDocument d = ...;
    MATCgiClient client = new MATCgiClient(url);
    MATDocument resultDoc = (MATDocument) client.doSteps(
        d, "Named Entity", "Demo", "tokenize,zone,tag");

If there are additional attributes you want to pass to the engine, you can do so with an optional hash map argument.
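
The steps argument shown above is a single comma-separated string, and the optional attributes travel in a hash map. Here is a minimal sketch of assembling both with plain Java collections. It does not call the MAT library; the class name StepArguments, the helper joinSteps, and the attribute key "example_attribute" are hypothetical and exist only for illustration. Consult the MAT engine documentation for the attribute names the engine actually accepts.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class StepArguments {
    // Joins step names into the comma-separated form used in the
    // doSteps example, e.g. "tokenize,zone,tag".
    static String joinSteps(List<String> steps) {
        return String.join(",", steps);
    }

    public static void main(String[] args) {
        String steps = joinSteps(Arrays.asList("tokenize", "zone", "tag"));
        System.out.println(steps);

        // Hypothetical attribute, for illustration only; real keys
        // depend on the task and steps you are running.
        Map<String, String> attrs = new HashMap<String, String>();
        attrs.put("example_attribute", "example_value");
        System.out.println(attrs);
    }
}
```

Keeping the steps in a list until the last moment makes it harder to introduce stray spaces or a trailing comma into the string.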

A command-line tool in the toplevel bin directory provides shell access to a demonstration of file-level operations, using the MATCgiClientDemo class.

Checking to see if a workspace exists

Let's say your Web server is running on the default port, and you want to "open" the workspace "/home/me/my_workspace". You'll need the workspace key from the MAT Web server (see the documentation on workspace mode for details).

    String url = "http://localhost:7801";
    MATCgiClient client = new MATCgiClient(url);
    String workspaceKey = ...;
    boolean res =
        client.openWorkspace(
            "/home/me/my_workspace", workspaceKey);

The method returns true if the workspace exists; otherwise, it throws an exception.

Listing the contents of a workspace folder

Let's say your Web server is running on the default port, and you want to know what basenames are in the "in process" folder in the workspace "/home/me/my_workspace". You'll need the workspace key from the MAT Web server (see the documentation on workspace mode for details).

    String url = "http://localhost:7801";
    MATCgiClient client = new MATCgiClient(url);
    String workspaceKey = ...;
    ArrayList basenames =
        client.listWorkspaceFolder(
            "/home/me/my_workspace", workspaceKey, "in process");

Opening a workspace file

Let's say your Web server is running on the default port, and you know that the "in process" folder in the workspace "/home/me/my_workspace" contains the file "myfile.json", and you want to retrieve that file. You'll need the workspace key from the MAT Web server (see the documentation on workspace mode for details).

    String url = "http://localhost:7801";
    MATCgiClient client = new MATCgiClient(url);
    String workspaceKey = ...;
    MATDocument doc =
        client.openWorkspaceFile(
            "/home/me/my_workspace", workspaceKey, "in process", "myfile.json");

If there are additional attributes you want to pass to the operation, you can do so with an optional hash map argument.

Performing a folder-level workspace operation

Let's say your Web server is running on the default port, and you want to perform the 'tagprep' operation on a document in the "raw, unprocessed" folder in your workspace "/home/me/my_workspace". You'll need the workspace key from the MAT Web server (see the documentation on workspace mode for details). You'll reference the document by its basename (that is, its name without the directory path).

    String url = "http://localhost:7801";
    MATCgiClient client = new MATCgiClient(url);
    String workspaceKey = ...;
    String myBasename = ...;
    MATCgiClient.WorkspaceFileResult res =
        client.doWorkspaceOperation(
            "/home/me/my_workspace", workspaceKey,
            "raw, unprocessed", "tagprep", myBasename);

If there are additional attributes you want to pass to the engine, you can do so with an optional hash map argument.

The result object contains the document after it's processed, along with the folder it's in.
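
The basename used above is just the file's name with the directory path removed. If you start from a full path, a sketch like the following derives it with the standard java.nio.file classes. The class name BasenameExample, the helper basenameOf, and the sample path are hypothetical; none of them is part of the MAT library.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class BasenameExample {
    // Returns the last element of the path, i.e. the name without
    // any directory components.
    static String basenameOf(String path) {
        Path fileName = Paths.get(path).getFileName();
        // getFileName() returns null for a path with no elements,
        // such as a bare root.
        return fileName == null ? "" : fileName.toString();
    }

    public static void main(String[] args) {
        // Hypothetical path, for illustration only.
        System.out.println(basenameOf("/home/me/docs/myfile.txt"));
    }
}
```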

Performing a toplevel workspace operation

There's another, general way to access the toplevel operations in your workspace. Let's say your Web server is running on the default port, and you want to list the basenames in the workspace, that is, perform the 'list' toplevel operation with no arguments. You'll need the workspace key from the MAT Web server (see the documentation on workspace mode for details).

    String url = "http://localhost:7801";
    MATCgiClient client = new MATCgiClient(url);
    String workspaceKey = ...;
    MATCgiClient.WorkspaceFileResult res =
        client.doToplevelWorkspaceOperation(
            "/home/me/my_workspace", workspaceKey,
            "list", null);

The result object contains the list of files in the files slot.

If you want to specify arguments, you can do so as follows:

    String url = "http://localhost:7801";
    MATCgiClient client = new MATCgiClient(url);
    String workspaceKey = ...;
    ArrayList<String> args = new ArrayList<String>(Arrays.asList("in process"));
    MATCgiClient.WorkspaceFileResult res =
        client.doToplevelWorkspaceOperation(
            "/home/me/my_workspace", workspaceKey,
            "list", args);

If your toplevel operation has additional attributes you want to pass to the engine, you can do so with an optional hash map argument.

Importing a file into a workspace

Let's say your Web server is running on the default port, and you have a document you've zoned and tokenized in a separate tool, and you want to insert it into your workspace in the "in process" folder (because it's ready to be hand tagged). You'll have to choose a basename for it (a name, without any directory path).

    String url = "http://localhost:7801";
    MATCgiClient client = new MATCgiClient(url);
    String workspaceKey = ...;
    MATDocument d = ...;
    // The basename you have chosen for the imported document.
    String myBasename = ...;
    MATCgiClient.WorkspaceFileResult res =
        client.importFileIntoWorkspace(
            "/home/me/my_workspace", workspaceKey,
            "in process", d, myBasename);

If you want to strip a suffix from the basename, you should add an optional hash map argument whose key is "strip_suffix" and whose value is the suffix you wish to strip. See the details for MATWorkspaceEngine for an analogous example. All other keys and values for this operation are also supported in the hash map.

The result object contains the document after it's processed, along with the folder it's in.
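
The "strip_suffix" option described above can be sketched as follows. This example only builds the hash map and illustrates the presumed effect of stripping locally; it does not call the MAT library. The class name ImportAttributes and both helpers are hypothetical, the String-to-String map type is an assumption, and the server's actual stripping behavior may differ from the local stripSuffix shown here.

```java
import java.util.HashMap;
import java.util.Map;

class ImportAttributes {
    // Builds the optional attribute map with the "strip_suffix" key
    // named in the text above.
    static Map<String, String> stripSuffixAttributes(String suffix) {
        Map<String, String> attrs = new HashMap<String, String>();
        attrs.put("strip_suffix", suffix);
        return attrs;
    }

    // Local illustration of what stripping a suffix presumably does;
    // this is an assumption, not the server's implementation.
    static String stripSuffix(String basename, String suffix) {
        if (!suffix.isEmpty() && basename.endsWith(suffix)) {
            return basename.substring(0, basename.length() - suffix.length());
        }
        return basename;
    }

    public static void main(String[] args) {
        System.out.println(stripSuffixAttributes(".txt"));
        System.out.println(stripSuffix("myfile.txt", ".txt"));
    }
}
```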

Linking against the library

This library is distributed in a directory structure that looks like this:

    java/
        ...
        lib/
            jackson-core-lgpl-1.4.3.jar
            jackson-mapper-lgpl-1.4.3.jar
            commons-codec-1.2.jar
            commons-httpclient-3.1.jar
            commons-logging-1.1.jar
            ...
        java-mat-core/
            ...
            dist/
                java-mat-core.jar
        java-mat-engine-client/
            dist/
                java-mat-engine-client.jar

To use this library, you must include all seven of the libraries mentioned above in your classpath.
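
If you'd rather not type the classpath by hand, a small sketch like the following assembles it from the seven jar names listed above. It assumes that lib/, java-mat-core/ and java-mat-engine-client/ sit directly under the java/ directory; adjust the relative paths if your distribution is laid out differently. The class name ClasspathBuilder and the helper buildClasspath are hypothetical and not part of the MAT distribution.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ClasspathBuilder {
    // The seven jars named in the listing above, relative to java/.
    // The relative locations are an assumption about the layout.
    static final List<String> JARS = Arrays.asList(
        "lib/jackson-core-lgpl-1.4.3.jar",
        "lib/jackson-mapper-lgpl-1.4.3.jar",
        "lib/commons-codec-1.2.jar",
        "lib/commons-httpclient-3.1.jar",
        "lib/commons-logging-1.1.jar",
        "java-mat-core/dist/java-mat-core.jar",
        "java-mat-engine-client/dist/java-mat-engine-client.jar");

    // Resolves each jar against javaDir and joins the results with
    // the platform's classpath separator (":" or ";").
    static String buildClasspath(String javaDir) {
        List<String> entries = new ArrayList<String>();
        for (String jar : JARS) {
            entries.add(new File(javaDir, jar).getPath());
        }
        return String.join(File.pathSeparator, entries);
    }

    public static void main(String[] args) {
        String javaDir = args.length > 0 ? args[0] : "java";
        System.out.println(buildClasspath(javaDir));
    }
}
```

The printed string can be passed to the -cp option of java or javac.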