Image Analytics


Introduction

The Image Analytics service returns the tags and categories associated with the images on a web page. It is based on the Imagga Image Tagging API and is integrated with the text analytics services on S4, so that when a web page is processed, both its text content and its images are analysed, and tags and categories are recommended.

The image analytics functionality has been integrated with the following text analytics services:

  • Twitter analytics
  • News analytics
  • Bio-medical analytics

Currently, the News Classifier does not offer image analytics capabilities.

The Imagga image analytics recognises 3,000 objects and assigns relevant tags from a set of 20,000 words. It currently does not recognise concrete entities: for example, a photo of Barack Obama will get tags such as "man" and "businessman", but not a reference to the DBpedia entity for Barack Obama, http://dbpedia.org/page/Barack_Obama

Service Endpoints

Service                 Endpoint                                  Methods
Twitter analytics       https://text.s4.ontotext.com/v1/twitie    POST
News analytics          https://text.s4.ontotext.com/v1/news      POST
Bio-medical analytics   https://text.s4.ontotext.com/v1/sbt       POST

HTTP Headers

The HTTP headers for invoking the image analytics functionality are the same as for the Text Analytics services.

POST Request

Parameters

The POST request takes no parameters: all the configuration information is provided in a JSON structure in the request body.

Request Body

The request body is a JSON structure that refers to the input document by a remote URL (documentUrl). The documentType property should specify the format of the input data (HTML page, Twitter message, etc.).

The following table provides the details on the attributes of the JSON request structure:

documentUrl
  • Required: Yes
  • Description: The URL of the document to be processed. Either the documentUrl or document parameter must be specified. Specifying both parameters is an error. The URL must be accessible to the service, i.e. it must be publicly accessible and should not require any authentication or setting cookies for access.
  • Valid values: JSON String representing a publicly accessible URL
  • Default value: n/a

documentType
  • Required: Yes
  • Description: The MIME type of the document to be processed
  • Valid values: text/html for HTML documents, text/x-json-twitter for Twitter JSON format
  • Default value: n/a

imageTagging
  • Required: No
  • Description: Get tags for the images found in the document (web page)
  • Valid values: true, false
  • Default value: false

imageCategorization
  • Required: No
  • Description: Get categories for the images found in the document (web page)
  • Valid values: true, false
  • Default value: false
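
As an illustration of the attributes described above, the request body could be assembled as in the following minimal Python sketch. The helper name build_request_body is hypothetical, and the sketch assumes that imageTagging and imageCategorization are sent as JSON booleans; this page does not confirm the exact encoding.

```python
import json

# MIME types listed as valid values for documentType on this page.
ALLOWED_DOCUMENT_TYPES = {"text/html", "text/x-json-twitter"}


def build_request_body(document_url, document_type="text/html",
                       image_tagging=False, image_categorization=False):
    """Return the JSON request body as a string (hypothetical helper).

    Assumption: the two image flags are JSON booleans. Both default to
    False, matching the default values documented above.
    """
    if document_type not in ALLOWED_DOCUMENT_TYPES:
        raise ValueError("unsupported documentType: %r" % document_type)
    body = {
        "documentUrl": document_url,
        "documentType": document_type,
        "imageTagging": image_tagging,
        "imageCategorization": image_categorization,
    }
    return json.dumps(body)
```

Calling build_request_body with only a URL yields a text-only request, since both image flags default to false.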

Response Format

The response format for the S4 text analytics services (news, bio-medical, Twitter) is application/json. For each annotated document, it consists of a JSON object with the following properties:

  • text - containing the plain text of the original document, stripped of any markup (e.g. HTML or XML tags)
  • entities - containing the annotations for the entities identified in the text
  • images - containing the tags and categories identified in the images
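
A response with these top-level properties could be inspected as in the following minimal Python sketch. The helper name summarise_response is hypothetical. The sketch relies only on the three property names listed above and makes no assumption about the internal structure of entities or images, which this page does not describe.

```python
import json


def summarise_response(raw_json):
    """Report which top-level properties a response contains.

    Hypothetical helper: only the property names "text", "entities" and
    "images" are taken from this page; their contents are not inspected.
    """
    document = json.loads(raw_json)
    return {
        "text_length": len(document.get("text", "")),
        "has_entities": bool(document.get("entities")),
        "has_images": bool(document.get("images")),
    }
```

A response to a request without image tagging or categorization would be expected to report has_images as false.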

Example

We will use the following web page for text and image analytics: "Radio reports say Germany spied on FBI, UN bodies and French foreign minister" from The Guardian.

To just process the text content of the page, execute the request:

Various entities will be found in the text, for example:

To process the combined text and images on the web page, execute the request:

Note the two additional parameters (set to "true"):

  • imageTagging
  • imageCategorization
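
A combined request of this kind could be prepared along the lines of the following Python sketch, which uses only the standard library. It is not the original example from this page. The function name, the page_url argument and the credential names are hypothetical, and the headers are assumptions: the sketch assumes HTTP Basic authentication with an API key and secret and a JSON content type, whereas the actual headers are those of the Text Analytics services.

```python
import base64
import json
import urllib.request

# News analytics endpoint from the Service Endpoints section.
NEWS_ENDPOINT = "https://text.s4.ontotext.com/v1/news"


def build_image_analytics_request(page_url, api_key, key_secret,
                                  endpoint=NEWS_ENDPOINT):
    """Build (but do not send) a POST request with both image flags set."""
    body = {
        "documentUrl": page_url,
        "documentType": "text/html",
        "imageTagging": True,          # assumption: JSON boolean
        "imageCategorization": True,   # assumption: JSON boolean
    }
    # Assumption: credentials are passed via HTTP Basic authentication.
    token = base64.b64encode(
        ("%s:%s" % (api_key, key_secret)).encode("utf-8")
    ).decode("ascii")
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
        "Authorization": "Basic " + token,
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )

# Sending the request (requires valid credentials and network access):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Building the request separately from sending it keeps the body and headers easy to check before any call is made to the service.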

The JSON result of this request will also include an "images" section with details on the images, their tags and categories, as explained in the "Response Format" section.

For the particular webpage from The Guardian, the following tags and categories will be proposed:
