UIMA Annotator

Source code
The source code and the PEAR package for the UIMA annotator plugin are available from the S4 GitHub repository.


Introduction

This component provides access to the S4 text analytics services directly from the UIMA platform. The S4 Annotator for UIMA is implemented as an Annotator component that can be used within the UIMA GUI tools (such as the UIMA Document Analyzer and the CAS Visual Debugger) to process documents. It also acts as a local proxy to the remotely accessible RESTful services of S4, hiding the complexity of the underlying technologies and communication protocols. For portability, the component is packaged as a UIMA PEAR package. The PEAR can be installed and integrated into any UIMA processing pipeline, and it makes no assumptions about the pre-processing or post-processing applied to the text being annotated. It can be used as a standalone component or as part of a Collection Processing Engine (CPE) pipeline.

The following sections describe the steps for downloading, configuring and running the S4 UIMA Annotator component.

Prerequisites

Details on acquiring S4 API keys are available in the S4 Management Console documentation section.

Installation

  • The Annotator's PEAR package can be downloaded from the S4 GitHub repository.

After downloading the PEAR package, follow these steps:

  • Step 1: Go to $UIMA_HOME/bin and run the PEAR installer:

On Windows, run $UIMA_HOME/bin/runPearInstaller.bat. The PEAR Installer window opens.

  • Step 2: Fill in the path to the PEAR file and the installation directory, then click the Install button.

If the PEAR was installed successfully, you should see a message like "Component S4 Document UIMA Annonator installation completed".
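The installer can also be started from Java rather than by double-clicking the script. The following is only a minimal sketch: the class and method names are hypothetical, and it assumes that UIMA_HOME is set and that non-Windows systems use the .sh counterpart of the script that Apache UIMA distributions ship alongside the .bat file.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

public class UimaToolLauncher {

    /** Picks the .bat or .sh variant of a UIMA tool script under <uimaHome>/bin. */
    static Path scriptFor(String uimaHome, String tool, String osName) {
        boolean windows = osName.toLowerCase(Locale.ROOT).startsWith("windows");
        return Paths.get(uimaHome, "bin", tool + (windows ? ".bat" : ".sh"));
    }

    public static void main(String[] args) throws Exception {
        String uimaHome = System.getenv("UIMA_HOME");
        if (uimaHome == null) {
            System.err.println("UIMA_HOME is not set");
            return;
        }
        Path script = scriptFor(uimaHome, "runPearInstaller", System.getProperty("os.name"));
        System.out.println("Launching " + script);
        // Starts the script and waits until the installer window is closed.
        new ProcessBuilder(script.toString()).inheritIO().start().waitFor();
    }
}
```

Passing "cvd" or "documentAnalyzer" as the tool name would resolve the launcher scripts used in the later sections in the same way.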

Configuring the S4 Annotator

After successfully installing the PEAR package for the S4 Annotator component, several additional parameters need to be specified.

  • Step 1: Go to the desc folder in the PEAR installation directory and open the file S4DocumentUimaAnnotator.xml with your favourite text editor (for example vim).
  • Step 2: Insert values for S4_SERVICE_ENDPOINT, API_KEY_ID and API_PASSWORD in that file.

  • Step 3: Save the file and proceed with loading the S4 Annotator through one of the available UIMA GUI tools (CAS Visual Debugger or Document Analyzer).
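The three values can also be set programmatically instead of in a text editor. The sketch below is illustrative only. It assumes the descriptor stores them as string parameters in the standard UIMA configurationParameterSettings / nameValuePair layout; the inline SAMPLE document, the class name, the helper names and the placeholder values are all hypothetical and are not copied from the actual S4 descriptor.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class S4DescriptorConfig {

    // Hypothetical stand-in for the parameter section of the descriptor.
    static final String SAMPLE =
        "<configurationParameterSettings>"
      + "<nameValuePair><name>S4_SERVICE_ENDPOINT</name><value><string></string></value></nameValuePair>"
      + "<nameValuePair><name>API_KEY_ID</name><value><string></string></value></nameValuePair>"
      + "<nameValuePair><name>API_PASSWORD</name><value><string></string></value></nameValuePair>"
      + "</configurationParameterSettings>";

    /** Parses an XML string into a DOM document, wrapping any failure in an unchecked exception. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse descriptor XML", e);
        }
    }

    /** Sets the string value of the named parameter; returns false if no such parameter exists. */
    static boolean setStringParam(Document doc, String name, String value) {
        NodeList pairs = doc.getElementsByTagName("nameValuePair");
        for (int i = 0; i < pairs.getLength(); i++) {
            Element pair = (Element) pairs.item(i);
            Element nameEl = (Element) pair.getElementsByTagName("name").item(0);
            Element stringEl = (Element) pair.getElementsByTagName("string").item(0);
            if (nameEl != null && stringEl != null
                    && name.equals(nameEl.getTextContent().trim())) {
                stringEl.setTextContent(value);
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        // Placeholder values; use the endpoint and API key pair from your S4 account.
        setStringParam(doc, "S4_SERVICE_ENDPOINT", "https://example.invalid/s4-endpoint");
        setStringParam(doc, "API_KEY_ID", "your-key-id");
        setStringParam(doc, "API_PASSWORD", "your-key-password");
        NodeList values = doc.getElementsByTagName("string");
        for (int i = 0; i < values.getLength(); i++) {
            System.out.println(values.item(i).getTextContent());
        }
    }
}
```

To apply this to the real file, the document would be parsed from S4DocumentUimaAnnotator.xml and written back with a javax.xml.transform.Transformer. That part is left out here because the exact layout of the shipped descriptor is not shown on this page.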

Using the S4 Annotator with the CAS Visual Debugger

Follow these steps:

  • Step 1: If you have already installed the PEAR package, you can launch the CAS Visual Debugger in one of two ways:
  1. Click "Run your AE in the CAS Visual Debugger" in the PEAR Installer.
  2. Run the launcher script that ships with the UIMA GUI tools:

On Windows, run $UIMA_HOME/bin/cvd.bat. The CAS Visual Debugger window opens.

  • Step 4: Open a raw text file. Go to File -> Open Text File and choose a file from your filesystem.

The text should appear in the main text area of the GUI.

  • Step 5: Load the S4 Annotator component from the installed PEAR. Click Run -> Load AE.

In the PEAR installation directory, select and open the file S4DocumentUIMAAnnotator_pear.xml.

  • Step 6: Run the component on your opened raw text file.

After the S4 Annotator has been loaded from the PEAR, the option "Run com.ontotext.s4.api.annotator.S4DocumentUimaAnnotator" should be highlighted in the Run menu.

Click that menu item and wait for the results.

  • Step 7: Explore the results.

Using the S4 Annotator with the Document Analyzer

Follow these steps:

  • Step 1: Go to $UIMA_HOME/bin and launch the Document Analyzer GUI:

On Windows, run $UIMA_HOME/bin/documentAnalyzer.bat. The Document Analyzer window opens.

The text boxes to fill in are:

  1. Input directory -> the directory with your raw text files
  2. Output directory -> the directory where the processed XMI files will be placed
  3. Location of Analysis Engine XML descriptor -> for an unzipped PEAR package this is the file ending with "_pear.xml"; otherwise it is a regular component descriptor XML file
  • Step 2: Click the Run button and wait for the progress bar to fill up.

  • Step 3: Click the View button and choose which of the processed files you wish to view.

  • Step 4: Double-click a file and explore the results.
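The descriptor rule from the "Location of Analysis Engine XML descriptor" field can be expressed as a small lookup. This is a rough sketch with hypothetical class and method names. It assumes the "_pear.xml" file sits directly in the directory you pass in, which is where the steps above tell you to look for it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.stream.Stream;

public class PearDescriptorFinder {

    /** True if the file name follows the "_pear.xml" naming rule for installed PEAR descriptors. */
    static boolean isPearDescriptor(String fileName) {
        return fileName.endsWith("_pear.xml");
    }

    /** Returns the first regular file directly under installDir that matches the rule, if any. */
    static Optional<Path> findPearDescriptor(Path installDir) {
        try (Stream<Path> files = Files.list(installDir)) {
            return files
                    .filter(Files::isRegularFile)
                    .filter(p -> isPearDescriptor(p.getFileName().toString()))
                    .sorted()
                    .findFirst();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        Optional<Path> descriptor = findPearDescriptor(dir);
        if (descriptor.isPresent()) {
            System.out.println("Use this descriptor: " + descriptor.get());
        } else {
            System.out.println("No *_pear.xml file found in " + dir);
        }
    }
}
```

The path it prints is what you would enter in the descriptor field of the Document Analyzer, or select in the Load AE dialog of the CAS Visual Debugger.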

Next Steps

The S4 Annotator plugin for UIMA provides an easy way for UIMA developers and language engineers to incorporate the S4 text analytics services within UIMA text analytics applications. If you haven't done so already, register and start using S4 right away!
