| Interface Summary | |
|---|---|
| MATEngineClientInterface | What will someday be an API for arbitrary MAT clients, independent of protocol. |
| Class Summary | |
|---|---|
| MATCgiClient | The class which implements CGI interaction with the MAT engine. |
| MATCgiClientDemo | |
| Exception Summary | |
|---|---|
| MATEngineClientException | |
Implements a Java client interface to the MAT processing engine.
If you're running the MAT Web server (see the main MAT documentation), you can use this library to perform automated tagging steps. To use this library, you should understand how the MAT engine works and how to use workflows and steps.
At the moment, the MAT Web server only presents a CGI interface (although there are plans to implement other protocols as well).
Only forward steps are supported; no undo is supported yet.
The client object accepts an input document, and returns a document of the same Java class.
Let's say your Web server is running on the default port, and you want to tokenize, zone and tag a document using the sample "Named Entity" task.
String url = "http://localhost:7801";
MATDocument d = ...;
MATCgiClient client = new MATCgiClient(url);
MATDocument resultDoc = (MATDocument) client.doSteps(
    d, "Named Entity", "Demo", "tokenize,zone,tag");
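The last argument in the call above is a single comma-separated string of step names. As a minimal sketch, assuming you keep your steps in a list, the string could be assembled as shown here. The class `StepListSketch` and the method `joinSteps` are hypothetical helpers for illustration and are not part of the MAT client library.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the MAT client library.
final class StepListSketch {

    // Joins step names into the comma-separated form used in the example
    // call, e.g. "tokenize,zone,tag". Null or blank names are rejected so
    // that a mistake does not silently produce an empty step.
    static String joinSteps(List<String> steps) {
        for (String step : steps) {
            if (step == null || step.trim().isEmpty()) {
                throw new IllegalArgumentException("empty step name");
            }
        }
        return String.join(",", steps);
    }

    public static void main(String[] args) {
        String steps = joinSteps(Arrays.asList("tokenize", "zone", "tag"));
        System.out.println(steps); // prints tokenize,zone,tag
    }
}
```

The resulting string can then be passed as the final argument to `doSteps`.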
If there are additional attributes you want to pass to the engine, you can do so with an optional hash map argument.
There is a command-line tool in the toplevel bin directory which provides shell access to a demonstration of this capability using the MATCgiClientDemo class.
Let's say your Web server is running on the default port, and you want to "open" the workspace "/home/me/my_workspace". You'll need the workspace key from the MAT Web server (see the documentation on workspace mode for details).
String url = "http://localhost:7801";
MATCgiClient client = new MATCgiClient(url);
String workspaceKey = ...;
boolean res =
    client.openWorkspace(
        "/home/me/my_workspace", workspaceKey);
The method returns true if the workspace exists; otherwise it throws an exception.
Let's say your Web server is running on the default port, and you want to know what basenames are in the "in process" folder in the workspace "/home/me/my_workspace". You'll need the workspace key from the MAT Web server (see the documentation on workspace mode for details).
String url = "http://localhost:7801";
MATCgiClient client = new MATCgiClient(url);
String workspaceKey = ...;
ArrayList<String> basenames =
    client.listWorkspaceFolder(
        "/home/me/my_workspace", workspaceKey, "in process");
Let's say your Web server is running on the default port, and you know that the "in process" folder in the workspace "/home/me/my_workspace" contains the file "myfile.json", and you want to retrieve that file. You'll need the workspace key from the MAT Web server (see the documentation on workspace mode for details).
String url = "http://localhost:7801";
MATCgiClient client = new MATCgiClient(url);
String workspaceKey = ...;
MATDocument doc =
    client.openWorkspaceFile(
        "/home/me/my_workspace", workspaceKey, "in process", "myfile.json");
If there are additional attributes you want to pass to the operation, you can do so with an optional hash map argument.
Let's say your Web server is running on the default port, and you want to perform the 'tagprep' operation on a document in the "raw, unprocessed" folder in your workspace "/home/me/my_workspace". You'll need the workspace key from the MAT Web server (see the documentation on workspace mode for details). You'll reference the document by its basename (that is, its name without the directory path).
String url = "http://localhost:7801";
MATCgiClient client = new MATCgiClient(url);
String workspaceKey = ...;
String myBasename = ...;
MATCgiClient.WorkspaceFileResult res =
    client.doWorkspaceOperation(
        "/home/me/my_workspace", workspaceKey,
        "raw, unprocessed", "tagprep", myBasename);
If there are additional attributes you want to pass to the engine, you can do so with an optional hash map argument.
The result object contains the document after it's processed, along with the folder it's in.
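The workspace operations above identify a document by its basename, that is, its name without the directory path. As a rough sketch, assuming you start from a full file path on the local machine, the basename could be derived with the standard `java.nio.file` API. The class `BasenameSketch` and the sample path are hypothetical and are not part of the MAT client library.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the MAT client library.
final class BasenameSketch {

    // Returns the last element of the path, i.e. the file name with the
    // directory portion removed.
    static String basenameOf(String path) {
        Path fileName = Paths.get(path).getFileName();
        if (fileName == null) {
            // Happens for a root path such as "/", which has no file name.
            throw new IllegalArgumentException("path has no file name: " + path);
        }
        return fileName.toString();
    }

    public static void main(String[] args) {
        // The sample path is made up for illustration.
        System.out.println(basenameOf("/home/me/docs/myfile.json")); // prints myfile.json
    }
}
```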
There's another, general way to access the toplevel operations in your workspace. Let's say your Web server is running on the default port, and you want to list the basenames in the workspace, that is, perform the 'list' toplevel operation with no arguments. You'll need the workspace key from the MAT Web server (see the documentation on workspace mode for details).
String url = "http://localhost:7801";
MATCgiClient client = new MATCgiClient(url);
String workspaceKey = ...;
MATCgiClient.WorkspaceFileResult res =
    client.doToplevelWorkspaceOperation(
        "/home/me/my_workspace", workspaceKey,
        "list", null);
The result object contains the list of files in the files slot.
If you want to specify arguments, you can do so as follows:
String url = "http://localhost:7801";
MATCgiClient client = new MATCgiClient(url);
String workspaceKey = ...;
ArrayList<String> args = new ArrayList<String>(Arrays.asList("in process"));
MATCgiClient.WorkspaceFileResult res =
    client.doToplevelWorkspaceOperation(
        "/home/me/my_workspace", workspaceKey,
        "list", args);
If your toplevel operation has additional attributes you want to pass to the engine, you can do so with an optional hash map argument.
Let's say your Web server is running on the default port, and you have a document you've zoned and tokenized in a separate tool, and you want to insert it into your workspace in the "in process" folder (because it's ready to be hand tagged). You'll have to choose a basename for it (a name, without any directory path).
String url = "http://localhost:7801";
MATCgiClient client = new MATCgiClient(url);
String workspaceKey = ...;
MATDocument d = ...;
String myBasename = ...;
MATCgiClient.WorkspaceFileResult res =
    client.importFileIntoWorkspace(
        "/home/me/my_workspace", workspaceKey,
        "in process", d, myBasename);
If you want to strip a suffix from the basename, you should add an optional hash map argument whose key is "strip_suffix" and whose value is the suffix you wish to strip. See the details for MATWorkspaceEngine for an analogous example. All other keys and values for this operation are also supported in the hash map.
The result object contains the document after it's processed, along with the folder it's in.
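As an illustration of the "strip_suffix" option described above, the following sketch builds the optional hash map. The key name comes from the text; the `HashMap<String, String>` type, the class `StripSuffixSketch`, and the sample basename are assumptions. The method `expectedBasename` only illustrates locally what stripping a suffix would be expected to produce; the actual stripping is performed by the engine.

```java
import java.util.HashMap;

// Hypothetical helper, not part of the MAT client library.
final class StripSuffixSketch {

    // Builds the optional attribute map. Only the "strip_suffix" key is
    // taken from the documentation; the String value type is an assumption.
    static HashMap<String, String> stripSuffixAttrs(String suffix) {
        HashMap<String, String> attrs = new HashMap<String, String>();
        attrs.put("strip_suffix", suffix);
        return attrs;
    }

    // Local illustration only: removes the suffix from the end of the
    // basename when present, and leaves the basename unchanged otherwise.
    static String expectedBasename(String basename, String suffix) {
        if (!suffix.isEmpty() && basename.endsWith(suffix)) {
            return basename.substring(0, basename.length() - suffix.length());
        }
        return basename;
    }

    public static void main(String[] args) {
        HashMap<String, String> attrs = stripSuffixAttrs(".txt");
        System.out.println(attrs.get("strip_suffix"));               // prints .txt
        System.out.println(expectedBasename("report.txt", ".txt"));  // prints report
    }
}
```

The map would then be supplied as the optional hash map argument of the import call.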
This library is distributed in a directory structure that looks like this:
java/
    ...
    lib/
        jackson-core-lgpl-1.4.3.jar
        jackson-mapper-lgpl-1.4.3.jar
        commons-codec-1.2.jar
        commons-httpclient-3.1.jar
        commons-logging-1.1.jar
        ...
    java-mat-core/
        ...
        dist/
            java-mat-core.jar
    java-mat-engine-client/
        dist/
            java-mat-engine-client.jar
To use this library, you must include all seven of the libraries mentioned above in your classpath.
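As a sketch of what the classpath might look like, the following assembles the seven jar paths into a single classpath string. It assumes that `lib/`, `java-mat-core/` and `java-mat-engine-client/` all sit directly under the `java/` directory; adjust the relative paths if your distribution is laid out differently. The class `ClasspathSketch` is hypothetical and is not part of the MAT distribution.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the MAT distribution.
final class ClasspathSketch {

    // The seven jars named in the directory listing, relative to the
    // java/ directory (the nesting is an assumption).
    static final List<String> JARS = Arrays.asList(
        "lib/jackson-core-lgpl-1.4.3.jar",
        "lib/jackson-mapper-lgpl-1.4.3.jar",
        "lib/commons-codec-1.2.jar",
        "lib/commons-httpclient-3.1.jar",
        "lib/commons-logging-1.1.jar",
        "java-mat-core/dist/java-mat-core.jar",
        "java-mat-engine-client/dist/java-mat-engine-client.jar");

    // Joins the jar paths with the platform's classpath separator
    // (":" on Unix-like systems, ";" on Windows).
    static String classpath(String javaRoot) {
        StringBuilder sb = new StringBuilder();
        for (String jar : JARS) {
            if (sb.length() > 0) {
                sb.append(File.pathSeparator);
            }
            sb.append(new File(javaRoot, jar).getPath());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Print a value suitable for the -cp option of java or javac.
        System.out.println(classpath("java"));
    }
}
```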