Language Technology Programming Competition 2022
2022 Shared Task Description
Basic Task Description
This task revisits the ALTA 2012 Shared Task. At that time, the best-performing system [Lui, 2012; Amini et al., 2012] used feature stacking and logistic regression. We want to know whether more recent developments in text processing can do better.
The goal of this task is to build automatic sentence classifiers that map the content of biomedical abstracts into a set of pre-defined categories used in Evidence-Based Medicine (EBM). EBM practitioners rely on specific criteria when judging whether a scientific article is relevant to a given question. They generally follow the PICO criterion: Population (P) (i.e., participants in a study); Intervention (I); Comparison (C) (if appropriate); and Outcome (O) (of an Intervention). Variations and extensions of this classification have been proposed, and for this task we will extend PICO by adding the classes Background (B) and Study Design (S), and by including a class for sentences that have no relevant content: Other (O). Therefore, the goal will be to classify the provided sentences according to the PIBOSO schema. Such information could be leveraged in various ways: e.g., to improve search performance; to enable structured querying with specific categories; and to help users make judgements against specified PIBOSO criteria more quickly.
This is a multi-label classification problem, since each sentence can have more than one label. The tagset consists of the six PIBOSO labels: Population, Intervention, Background, Outcome, Study Design, and Other.
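To illustrate what multi-label means in practice, the following minimal sketch encodes the set of labels assigned to one sentence as a 0/1 indicator vector with one position per PIBOSO class. The label strings, their order, and the helper names `encode_labels` and `decode_labels` are assumptions made for illustration only; the actual data format is the one specified on the competition website.

```python
# Hypothetical label names and order, chosen for illustration;
# the real column names are defined by the competition data files.
PIBOSO_LABELS = [
    "population",
    "intervention",
    "background",
    "outcome",
    "study design",
    "other",
]


def encode_labels(labels):
    """Return a 0/1 indicator list with one position per PIBOSO label."""
    labels = set(labels)
    unknown = labels - set(PIBOSO_LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in labels else 0 for name in PIBOSO_LABELS]


def decode_labels(vector):
    """Inverse of encode_labels: return the label names whose flag is set."""
    if len(vector) != len(PIBOSO_LABELS):
        raise ValueError("vector length must match the number of labels")
    return [name for name, flag in zip(PIBOSO_LABELS, vector) if flag]


# A sentence tagged with two labels at once.
print(encode_labels(["population", "study design"]))  # [1, 0, 0, 0, 1, 0]
```

With this representation, a classifier predicts six independent yes/no decisions per sentence rather than a single class, which is what distinguishes the multi-label setting from ordinary multi-class classification.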
More information about this problem, the construction of the dataset, and a benchmark can be found in Kim et al. (2011). The original data was provided by the former NICTA (now CSIRO Data61) and curated for the earlier (2012) shared task by Iman Amini and David Martinez. The data for this (2022) shared task has been adapted by Diego Molla (firstname.lastname@example.org).
Data Files and Submission
We will use CodaLab for this year's competition (https://codalab.lisn.upsaclay.fr/competitions/6935). Details about the data formats and the submission procedure will be provided on the competition website.