Language Technology Programming Competition 2024
2024 Shared Task Description

Basic Task Description

Background

Recent advancements in large language models (LLMs) have led to widespread use of AI-generated text, giving rise to human-AI collaborative writing. While this collaboration offers exciting possibilities, it also presents challenges in distinguishing between human-authored and AI-generated content within a single piece of text. Accurately detecting AI-generated text has become increasingly important across various domains, including journalism, content creation, and professional writing, to ensure transparency and maintain the integrity of written communication.

Previous efforts in AI text detection often focused on document-level analysis, assuming entire documents were either human-written or AI-generated. However, as human-AI collaborative writing becomes more prevalent, there is a growing need to identify AI-generated content at a finer level. This sentence-level detection is crucial for understanding and analyzing hybrid texts composed of both AI and human-authored sentences, which are becoming increasingly common in various fields.

Goal

The goal of this shared task is to develop automatic detection systems capable of identifying AI-generated sentences within hybrid articles containing both human-written and AI-generated content. Participants are challenged to create models that can accurately distinguish between human-authored and GPT-3.5-turbo-generated sentences in collaborative writing scenarios.

Participants should develop a system that takes a list of sentences composing a hybrid article as input and outputs a list of predictions on whether each sentence in the article is human-written or generated by GPT-3.5-turbo. This task focuses on sentence-level detection, addressing the challenge of identifying GPT-3.5-turbo-generated content within mixed human-AI texts.
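The input/output contract described above (a list of sentences in, one prediction per sentence out) can be sketched as follows. This is only an illustration under stated assumptions: the label strings `human` and `machine`, the function names, and the length-based placeholder rule are all hypothetical, since the official data format is defined on the competition website. The scoring helper assumes that the Kappa score named in the evaluation description is Cohen's kappa, which the task description does not state explicitly.

```python
from collections import Counter
from typing import List

# Hypothetical label names; the real format is given on the competition website.
HUMAN = "human"
MACHINE = "machine"


def predict_article(sentences: List[str]) -> List[str]:
    """Toy placeholder detector returning one label per input sentence.

    The word-count threshold is an arbitrary stand-in, not a real detection method.
    """
    return [MACHINE if len(s.split()) > 20 else HUMAN for s in sentences]


def cohen_kappa(gold: List[str], pred: List[str]) -> float:
    """Cohen's kappa, (p_o - p_e) / (1 - p_e), for two equal-length label lists."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    n = len(gold)
    # Observed agreement: fraction of sentences where the labels match.
    p_o = sum(g == p for g, p in zip(gold, pred)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    gold_counts = Counter(gold)
    pred_counts = Counter(pred)
    p_e = sum(gold_counts[label] * pred_counts[label] for label in gold_counts) / (n * n)
    if p_e == 1.0:
        # Degenerate case (both lists use a single identical label); kappa is
        # undefined here, and returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


if __name__ == "__main__":
    gold = [HUMAN, HUMAN, MACHINE, MACHINE]
    pred = [HUMAN, MACHINE, MACHINE, MACHINE]
    print(cohen_kappa(gold, pred))
```

Unlike plain accuracy, kappa corrects for agreement expected by chance, so a system that labels every sentence the same way scores close to zero even when one class dominates an article.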
The performance of the detection systems will be evaluated using the Kappa score on a test set of hybrid articles, where each article contains a mix of human-written and GPT-3.5-turbo-generated sentences. The Kappa score will measure the agreement between the system's predictions and the true labels for each sentence. Participants' systems will be ranked based on their Kappa scores, with higher scores indicating better performance in distinguishing between human-written and GPT-3.5-turbo-generated sentences within hybrid texts.

This task aims to contribute to the development of more sophisticated methods for identifying GPT-3.5-turbo-generated content in collaborative writing scenarios, which is valuable for maintaining integrity in written communication and developing responsible practices in AI content creation.

Data Files and Submission

We will use CodaLab for this year's competition (ALTA Shared Task 2024). Details about the data formats and the submission process will be provided on the competition website.

Important Dates