Language Technology Programming Competition 2018
2018 Shared Task Description: Classifying Patent Applications
Basic Task Description
When a patent application is submitted, it is classified by patent office examiners or other people. Patent classifications make it feasible to search quickly for documents describing earlier disclosures similar or related to the invention for which a patent is sought, and to track technological trends in patent applications.
The International Patent Classification (IPC) is agreed internationally. A patent can have several classification symbols, but one of them is the primary one, called the primary IPC mark.
An IPC classification symbol is of the form A01B 1/00, where each component has a special meaning: A is the section, 01 the class, B the subclass, 1 the main group, and 00 the subgroup.
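As an illustration of the symbol structure, the following minimal sketch splits a symbol such as A01B 1/00 into its standard IPC components (section, class, subclass, main group, subgroup). The `IPCSymbol` type, the `parse_ipc` function and the regular expression are hypothetical names and assumptions made for this example; they are not part of the competition materials, and the marks in the data files may be formatted differently.

```python
import re
from typing import NamedTuple


class IPCSymbol(NamedTuple):
    """Components of an IPC symbol (hypothetical helper type)."""
    section: str     # one letter, A to H
    ipc_class: str   # two digits
    subclass: str    # one letter
    main_group: str  # digits before the slash
    subgroup: str    # digits after the slash


# Assumed layout: section, class, subclass, optional space, main group "/" subgroup.
_IPC_RE = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def parse_ipc(symbol: str) -> IPCSymbol:
    """Split a symbol such as 'A01B 1/00' into its components."""
    match = _IPC_RE.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a recognised IPC symbol: {symbol!r}")
    return IPCSymbol(*match.groups())


if __name__ == "__main__":
    print(parse_ipc("A01B 1/00"))
```

Raising an error on unrecognised input, rather than guessing, makes formatting surprises in the data visible early.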
For example, an application for a patent about lasers might have the following two IPC marks, where the first one is the primary IPC mark:
Some patent applications may refer to several sections of the IPC and this would be indicated by the presence of several IPC marks. For example, the following application would have a primary IPC mark about fittings for hats and an additional IPC mark for pyjamas:
The goal of this task is to automatically classify Australian patents into one of the principal International Patent Classification sections. The section is the first character (A to H) of the primary IPC mark. We have downloaded nearly 5,000 documents, of which we have reserved 1,000 for the evaluation of the systems.
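To make the target label concrete, here is a small sketch of how the section could be derived from a document's IPC marks. It assumes the marks are available as a list of strings with the primary mark first; the function name `primary_section` and that list layout are assumptions made for illustration, not the format of the competition data.

```python
IPC_SECTIONS = "ABCDEFGH"  # the eight principal IPC sections


def primary_section(ipc_marks: list[str]) -> str:
    """Return the section letter of the primary IPC mark.

    Assumes the primary mark is the first element of ipc_marks.
    """
    if not ipc_marks:
        raise ValueError("document has no IPC marks")
    section = ipc_marks[0].strip()[:1].upper()
    # Check the length explicitly: an empty string would pass a bare "in" test.
    if len(section) != 1 or section not in IPC_SECTIONS:
        raise ValueError(f"unexpected primary IPC mark: {ipc_marks[0]!r}")
    return section


if __name__ == "__main__":
    # Illustrative marks only; the second one is not taken from the data set.
    print(primary_section(["A01B 1/00", "H01S 3/00"]))
```

Only the first mark determines the label, so any additional marks on a document are ignored here.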
We will use Kaggle in Class to evaluate the systems.
Data Files and Submission
We will use Kaggle in Class for this year's competition (look for the ALTA 2018 Challenge). The data files and submission instructions will be provided on the competition website.
In order to access the Kaggle in Class pages, you need to register with this shared task.