News Classifier


Introduction

The News Classifier service assigns each input document a category from the 17 top-level categories of the IPTC Subject Reference System. In addition, the service returns the two next-best category candidates, ranked 2nd and 3rd.

Description of Categories

(see also IPTC diagram)

Name Description
Arts_Culture_Entertainment Matters pertaining to the advancement and refinement of the human mind, of interests, skills, tastes and emotions (media, movies and TV, literature and journalism, music, celebrities, entertainment products, internet culture, youth culture).
Crime_Law_Justice Establishment and/or statement of the rules of behaviour in society, the enforcement of these rules, breaches of the rules and the punishment of offenders. Organisations and bodies involved in these activities.
Disaster_Accident Man-made and natural events resulting in loss of life or injury to living creatures and/or damage to inanimate objects or property.
Economy_Business_Finance All matters concerning the planning, production and exchange of wealth.
Education All aspects of furthering knowledge of human individuals from birth to death.
Environment All aspects of protection, damage, and condition of the ecosystem of the planet earth and its surroundings.
Health All aspects pertaining to the physical and mental welfare of human beings.
Human Interest Items about individuals, groups, animals, plants or other objects with a focus on emotional facets.
Labor Social aspects, organisations, rules and conditions affecting the employment of human effort for the generation of wealth or provision of services and the economic support of the unemployed.
Lifestyle_Leisure Activities undertaken for pleasure, relaxation or recreation outside paid employment, including eating and travel.
Politics Local, regional, national and international exercise of power, or struggle for power, and the relationships between governing bodies and states.
Religion_Belief All aspects of human existence involving theology, philosophy, ethics and spirituality.
Science_Technology All aspects pertaining to human understanding of nature and the physical world and the development and application of this knowledge.
Society Aspects of the life of humans affecting its relationships.
Sports Competitive exercise involving physical effort. Organizations and bodies involved in these activities.
Weather Meteorological phenomena.
Conflicts_War_Peace Acts of socially or politically motivated protest and/or violence and actions to end them.

REST API

The details on the REST API for the News Classifier service are available on the Text Analytics page.

Example

RESTful Request (Plain Text Content)

We can now send a RESTful request to the S4 News Classifier service using a command-line tool such as curl and a sample from a news article:

A TransAsia Airways flight in Taiwan carrying 58 passengers and crew careened past buildings, clipped a highway and crashed into a shallow stream, killing at least 23 people. TransAsia GE 235, a domestic flight from Taipei to Kinmen – a small archipelago near mainland China – crashed at 10.56am local time, according to Taiwan’s aviation council, about three minutes after it took off.
Astonishing dash-cam videos posted online showed the turboprop ATR 72-600 aircraft in its final airborne moments, turning vertical over a highway and clipping a taxi cab and a bridge with its left wing.

Let's go step by step through the sample request:

  1. we specify the API key and secret - all S4 requests need a valid API key and secret pair, which can be generated from the S4 Management Console
  2. we specify the S4 RESTful service to be used - in this case the "News Classifier" text analytics service. Note that the API key and secret are provided as part of the endpoint URL
  3. we choose a simple snippet of text to analyse (from a news article)
  4. we construct the JSON request document, consisting of the content and "text/plain" as its content type
  5. we make a RESTful request to the S4 service via curl, providing the JSON request document (from step 4) and the S4 service endpoint (from step 2), and we specify in the HTTP header that the content type of the request is "application/json"
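
The five steps above can be sketched in Python roughly as follows. This is a minimal sketch, not the service's official client: the endpoint URL is a placeholder, and the JSON field names "document" and "documentType" are assumptions made for illustration, so check both against the Text Analytics page. Where curl takes the key and secret from the endpoint URL and sends them as HTTP Basic credentials, the sketch sets the equivalent Authorization header explicitly.

```python
import base64
import json
import urllib.request

# Assumptions for illustration only: the endpoint is a placeholder, and the
# JSON field names "document" and "documentType" are hypothetical.
ENDPOINT = "https://s4.example.invalid/news-classifier"


def build_request(api_key, api_secret, text, endpoint=ENDPOINT):
    """Build (but do not send) a POST request carrying a plain-text document."""
    # Step 4: the JSON request document = content + "text/plain" content type.
    body = json.dumps({"document": text, "documentType": "text/plain"})
    # Steps 1-2: the key/secret pair is sent as HTTP Basic credentials,
    # which is what curl derives from "key:secret@" in the endpoint URL.
    credentials = f"{api_key}:{api_secret}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        endpoint,
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def classify(api_key, api_secret, text, endpoint=ENDPOINT):
    """Step 5: send the request and return the decoded JSON result."""
    request = build_request(api_key, api_secret, text, endpoint)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling classify() with a real key, secret and endpoint performs the network request; build_request() alone can be used to inspect the headers and body without sending anything.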

JSON Result

The result is a JSON object providing document classification information as well as a ranked list of the top 3 category candidates.
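
As a rough illustration of consuming such a result, the Python sketch below reads the assigned category and the ranked candidates from a JSON string. The field names "category", "allScores", "label" and "score" are assumptions, and the sample document with its scores is hand-written for this example rather than real service output; the actual response format is described on the Text Analytics page.

```python
import json

# Hand-written stand-in for a response. Field names and scores are
# hypothetical and were NOT produced by the service.
SAMPLE_RESULT = """
{
  "category": "Disaster_Accident",
  "allScores": [
    {"label": "Crime_Law_Justice", "score": 0.05},
    {"label": "Disaster_Accident", "score": 0.90},
    {"label": "Politics", "score": 0.02}
  ]
}
"""


def top_candidates(result_json, n=3):
    """Return the assigned category and the n best-scoring candidates."""
    result = json.loads(result_json)
    # Sort defensively by score, highest first, in case the list is unordered.
    ranked = sorted(result["allScores"], key=lambda c: c["score"], reverse=True)
    return result["category"], [(c["label"], c["score"]) for c in ranked[:n]]


category, candidates = top_candidates(SAMPLE_RESULT)
print(category)
for label, score in candidates:
    print(f"{label}: {score}")
```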
