Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations

12 October 2021

Papers citing "Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations"

50 / 59 papers shown

Hateful Person or Hateful Model? Investigating the Role of Personas in Hate Speech Detection by Large Language Models Shuzhou Yuan Ercong Nie Mario Tawfelis Helmut Schmid Hinrich Schütze Michael Färber 18 0 0 10 Jun 2025
HESEIA: A community-based dataset for evaluating social biases in large language models, co-designed in real school settings in Latin America Guido Ivetta Marcos J. Gomez Sofía Martinelli Pietro Palombini M. Emilia Echeveste Nair Carolina Mazzeo Beatriz Busaniche Luciana Benotti VLM 22 0 0 30 May 2025
Large Language Models Do Multi-Label Classification Differently Marcus Ma Georgios Chochlakis Niyantha Maruthu Pandiyan Jesse Thomason Shrikanth Narayanan 98 1 0 23 May 2025
Humans Hallucinate Too: Language Models Identify and Correct Subjective Annotation Errors With Label-in-a-Haystack Prompts Georgios Chochlakis Peter Wu Arjun Bedi Marcus Ma Kristina Lerman Shrikanth Narayanan 182 0 0 22 May 2025
Evaluating how LLM annotations represent diverse views on contentious topics Megan A. Brown Shubham Atreja Libby Hemphill Patrick Y. Wu 425 0 0 29 Mar 2025
CULEMO: Cultural Lenses on Emotion -- Benchmarking LLMs for Cross-Cultural Emotion Understanding Tadesse Destaw Belay Ahmed Haj Ahmed Alvin Grissom II Iqra Ameer Grigori Sidorov Olga Kolesnikova Seid Muhie Yimam 151 2 0 12 Mar 2025
RideKE: Leveraging Low-Resource, User-Generated Twitter Content for Sentiment and Emotion Detection in Kenyan Code-Switched Dataset Naome A. Etori Maria Gini 169 3 0 10 Feb 2025
AI Alignment at Your Discretion Maarten Buyl Hadi Khalaf C. M. Verdun Lucas Monteiro Paes Caio Vieira Machado Flavio du Pin Calmon 114 1 0 10 Feb 2025
FuocChuVIP123 at CoMeDi Shared Task: Disagreement Ranking with XLM-Roberta Sentence Embeddings and Deep Neural Regression Phuoc Duong Huy Chu 61 1 0 21 Jan 2025
Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors Georgios Chochlakis Alexandros Potamianos Kristina Lerman Shrikanth Narayanan 151 2 0 17 Oct 2024
Keeping Humans in the Loop: Human-Centered Automated Annotation with Generative AI Nicholas Pangakis Samuel Wolken 69 4 0 14 Sep 2024
A Theory-Based Explainable Deep Learning Architecture for Music Emotion H. Fong Vineet Kumar K. Sudhir FAtt 19 2 0 13 Aug 2024
Extrinsic Evaluation of Cultural Competence in Large Language Models Shaily Bhatt Fernando Diaz ELM EGVM 110 9 0 17 Jun 2024
Bayesian WeakS-to-Strong from Text Classification to Generation Ziyun Cui Ziyang Zhang Wen Wu Wen Wu Chao Zhang 127 3 0 24 May 2024
Exploring Subjectivity for more Human-Centric Assessment of Social Biases in Large Language Models Paula Akemi Aoyagui Sharon Ferguson Anastasia Kuzminykh 81 0 0 17 May 2024
If there's a Trigger Warning, then where's the Trigger? Investigating Trigger Warnings at the Passage Level Matti Wiegmann Jennifer Rakete Magdalena Wolska Benno Stein Martin Potthast LLMSV 74 0 0 15 Apr 2024
Can Humans Identify Domains? Maria Barrett Max Müller-Eberstein Elisa Bassignana Amalie Brogaard Pauli Mike Zhang Rob van der Goot 104 1 0 02 Apr 2024
Surveying the Dead Minds: Historical-Psychological Text Analysis with Contextualized Construct Representation (CCR) for Classical Chinese Yuqi Chen Sixuan Li Ying Li Mohammad Atari 107 5 0 01 Mar 2024
Cost-Efficient Subjective Task Annotation and Modeling through Few-Shot Annotator Adaptation Preni Golazizian Ali Omrani Alireza S. Ziabari Morteza Dehghani 47 1 0 21 Feb 2024
Don't Label Twice: Quantity Beats Quality when Comparing Binary Classifiers on a Budget Florian E. Dorner Moritz Hardt 74 4 0 03 Feb 2024
An Empirical Analysis of Diversity in Argument Summarization Michiel van der Meer Piek T. J. M. Vossen Catholijn M. Jonker P. Murukannaiah 52 8 0 02 Feb 2024
GRASP: A Disagreement Analysis Framework to Assess Group Associations in Perspectives Vinodkumar Prabhakaran Christopher Homan Lora Aroyo Aida Mostafazadeh Davani Alicia Parrish Alex S. Taylor Mark Díaz Ding Wang Greg Serapio-García 97 9 0 09 Nov 2023
The Impact of Preference Agreement in Reinforcement Learning from Human Feedback: A Case Study in Summarization Sian Gooding Hassan Mansoor 42 2 0 02 Nov 2023
MoCa: Measuring Human-Language Model Alignment on Causal and Moral Judgment Tasks Allen Nie Yuhui Zhang Atharva Amdekar Chris Piech Tatsunori Hashimoto Tobias Gerstenberg 80 40 0 30 Oct 2023
The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values Hannah Rose Kirk Andrew M. Bean Bertie Vidgen Paul Röttger Scott A. Hale ALM 113 50 0 11 Oct 2023
PopBERT. Detecting populism and its host ideologies in the German Bundestag Lukas Erhard Sara Hanke Uwe Remer A. Falenska R. Heiberger 61 2 0 22 Sep 2023
BatchPrompt: Accomplish more with less Jianzhe Lin Maurice Diesendruck Liang Du Robin Abraham LRM 96 10 0 01 Sep 2023
Designing Closed-Loop Models for Task Allocation Vijay Keswani L. E. Celis K. Kenthapadi Matthew Lease 59 0 0 31 May 2023
Consensus and Subjectivity of Skin Tone Annotation for ML Fairness Candice Schumann Gbolahan O. Olanubi Auriel Wright Ellis P. Monk Courtney Heldreth Susanna Ricco 107 24 0 16 May 2023
When the Majority is Wrong: Modeling Annotator Disagreement for Subjective Tasks Eve Fleisig Rediet Abebe Dan Klein 128 49 0 11 May 2023
iLab at SemEval-2023 Task 11 Le-Wi-Di: Modelling Disagreement or Modelling Perspectives? Nikolas Vitsakis Amit Parekh Tanvi Dinkar Gavin Abercrombie Ioannis Konstas Verena Rieser 98 10 0 10 May 2023
SemEval-2023 Task 10: Explainable Detection of Online Sexism Hannah Rose Kirk Wenjie Yin Bertie Vidgen Paul Röttger 81 122 0 07 Mar 2023
Everyone's Voice Matters: Quantifying Annotation Disagreement Using Demographic Information Ruyuan Wan Jaehyung Kim Dongyeop Kang 55 38 0 12 Jan 2023
AnnoBERT: Effectively Representing Multiple Annotators' Label Choices to Improve Hate Speech Detection Wenjie Yin Vibhor Agarwal Aiqi Jiang A. Zubiaga Nishanth R. Sastry 91 15 0 20 Dec 2022
Human-Guided Fair Classification for Natural Language Processing Florian E. Dorner Momchil Peychev Nikola Konstantinov Naman Goel Elliott Ash Martin Vechev FaML 71 4 0 20 Dec 2022
SeedBERT: Recovering Annotator Rating Distributions from an Aggregated Label A. Sampath Victoria Lin Louis-Philippe Morency 29 2 0 23 Nov 2022
"It's Not Just Hate": A Multi-Dimensional Perspective on Detecting Harmful Speech Online Federico Bianchi S. A. Hills Patrícia G. C. Rossini Dirk Hovy Rebekah Tromble N. Tintarev 95 15 0 28 Oct 2022
Unifying Data Perspectivism and Personalization: An Application to Social Norms Joan Plepi Béla Neuendorf Lucie Flek Charles F Welch 117 21 0 26 Oct 2022
Trustworthy Human Computation: A Survey H. Kashima S. Oyama Hiromi Arai Junichiro Mori 80 1 0 22 Oct 2022
Noise Audits Improve Moral Foundation Classification Negar Mokhberian F. R. Hopp Bahareh Harandizadeh Fred Morstatter Kristina Lerman NoLa 82 7 0 13 Oct 2022
Explainable Abuse Detection as Intent Classification and Slot Filling Agostina Calabrese Bjorn Ross Mirella Lapata 93 11 0 06 Oct 2022
Investigating Reasons for Disagreement in Natural Language Inference Nan-Jiang Jiang M. Marneffe 72 27 0 07 Sep 2022
A Holistic Approach to Undesired Content Detection in the Real World Todor Markov Chong Zhang Sandhini Agarwal Tyna Eloundou Teddy Lee Steven Adler Angela Jiang L. Weng 125 237 0 05 Aug 2022
More Data Can Lead Us Astray: Active Data Acquisition in the Presence of Label Bias Yunyi Li Maria De-Arteaga M. Saar-Tsechansky FaML 83 3 0 15 Jul 2022
Is one annotation enough? A data-centric image classification benchmark for noisy and ambiguous label estimation Lars Schmarje Vasco Grossmann Claudius Zelenka S. Dippel R. Kiko ... M. Pastell J. Stracke A. Valros N. Volkmann Reinhard Koch 115 37 0 13 Jul 2022
Empathic Conversations: A Multi-level Dataset of Contextualized Conversations Damilola Omitaomu Shabnam Tafreshi Tingting Liu Sven Buechel Chris Callison-Burch J. Eichstaedt Lyle Ungar João Sedoc 101 50 0 25 May 2022
Evaluation Gaps in Machine Learning Practice Ben Hutchinson Negar Rostamzadeh Christina Greer Katherine A. Heller Vinodkumar Prabhakaran ELM 98 63 0 11 May 2022
Justice in Misinformation Detection Systems: An Analysis of Algorithms, Stakeholders, and Potential Harms Terrence Neumann Maria De-Arteaga S. Fazelpour 94 26 0 28 Apr 2022
Challenges and Strategies in Cross-Cultural NLP Daniel Hershcovich Stella Frank Heather Lent Miryam de Lhoneux Mostafa Abdou ... Ruixiang Cui Constanza Fierro Katerina Margatina Phillip Rust Anders Søgaard 130 182 0 18 Mar 2022
ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection Thomas Hartvigsen Saadia Gabriel Hamid Palangi Maarten Sap Dipankar Ray Ece Kamar 92 391 0 17 Mar 2022