To Aggregate or Not to Aggregate. That is the Question: A Case Study on Annotation Subjectivity in Span Prediction

Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA), 2024

5 August 2024

Main: 5 Pages

Bibliography: 2 Pages

3 Tables

Abstract

This paper explores the automatic prediction of text spans in a legal problem description that support a legal area label. We use a corpus of problem descriptions written in English by laypeople and annotated by practising lawyers. The task is inherently subjective: legal area categorisation is complex, and lawyers often take different views of a problem, especially when the description of the issues is legally imprecise. Experiments show that training on majority-voted spans outperforms training on disaggregated ones.
