Humans Hallucinate Too: Language Models Identify and Correct Subjective Annotation Errors With Label-in-a-Haystack Prompts

22 May 2025

Papers citing "Humans Hallucinate Too: Language Models Identify and Correct Subjective Annotation Errors With Label-in-a-Haystack Prompts"

Title | Authors | Tags | Counts | Date
Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors | Georgios Chochlakis, Alexandros Potamianos, Kristina Lerman, Shrikanth Narayanan | - | 151 2 0 | 17 Oct 2024
How Does Quantization Affect Multilingual LLMs? | Kelly Marchisio, Saurabh Dash, Hongyu Chen, Dennis Aumiller, Ahmet Üstün, Sara Hooker, Sebastian Ruder | MQ | 123 15 0 | 03 Jul 2024
Harmful Speech Detection by Language Models Exhibits Gender-Queer Dialect Bias | Rebecca Dorn, Lee Kezar, Fred Morstatter, Kristina Lerman | - | 89 11 0 | 23 May 2024
The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition | Georgios Chochlakis, Alexandros Potamianos, Kristina Lerman, Shrikanth Narayanan | - | 90 7 0 | 25 Mar 2024
Capturing Perspectives of Crowdsourced Annotators in Subjective Learning Tasks | Negar Mokhberian, Myrl G. Marmarelis, F. R. Hopp, Valerio Basile, Fred Morstatter, Kristina Lerman | - | 104 13 0 | 16 Nov 2023
Modeling subjectivity (by Mimicking Annotator Annotation) in toxic comment identification across diverse communities | Senjuti Dutta, Sid Mittal, Sherol Chen, Deepak Ramachandran, Ravi Rajakumar, Ian D Kivlichan, Sunny Mak, Alena Butryna, Praveen Paritosh; University of Tennessee | - | 106 7 0 | 01 Nov 2023
Foundation Model Assisted Automatic Speech Emotion Recognition: Transcribing, Annotating, and Augmenting | Tiantian Feng, Shrikanth Narayanan | - | 89 21 0 | 15 Sep 2023
Trusting Your Evidence: Hallucinate Less with Context-aware Decoding | Weijia Shi, Xiaochuang Han, M. Lewis, Yulia Tsvetkov, Luke Zettlemoyer, Scott Yih | HILM | 78 215 0 | 24 May 2023
Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting | Miles Turpin, Julian Michael, Ethan Perez, Sam Bowman | ReLM, LRM | 118 444 0 | 07 May 2023
We're Afraid Language Models Aren't Modeling Ambiguity | Alisa Liu, Zhaofeng Wu, Julian Michael, Alane Suhr, Peter West, Alexander Koller, Swabha Swayamdipta, Noah A. Smith, Yejin Choi | - | 137 105 0 | 27 Apr 2023
The political ideology of conversational AI: Converging evidence on ChatGPT's pro-environmental, left-libertarian orientation | Jochen Hartmann, Jasper Schwenzow, Maximilian Witte | - | 87 224 0 | 05 Jan 2023
Leveraging Label Correlations in a Multi-label Setting: A Case Study in Emotion | Georgios Chochlakis, Gireesh Mahajan, Sabyasachee Baruah, Keith Burghardt, Kristina Lerman, Shrikanth Narayanan | - | 86 24 0 | 28 Oct 2022
Noise Audits Improve Moral Foundation Classification | Negar Mokhberian, F. R. Hopp, Bahareh Harandizadeh, Fred Morstatter, Kristina Lerman | NoLa | 82 7 0 | 13 Oct 2022
Training language models to follow instructions with human feedback | Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe | OSLM, ALM | 924 13,266 0 | 04 Mar 2022
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? | Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, M. Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer | LLMAG, LRM | 193 1,504 0 | 25 Feb 2022
Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection | Maarten Sap, Swabha Swayamdipta, Laura Vianna, Xuhui Zhou, Yejin Choi, Noah A. Smith | - | 98 285 0 | 15 Nov 2021
Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations | Aida Mostafazadeh Davani, Mark Díaz, Vinodkumar Prabhakaran | - | 88 315 0 | 12 Oct 2021
Does Knowledge Distillation Really Work? | Samuel Stanton, Pavel Izmailov, Polina Kirichenko, Alexander A. Alemi, A. Wilson | FedML | 71 224 0 | 10 Jun 2021
Survey Equivalence: A Procedure for Measuring Classifier Accuracy Against Human Labels | Paul Resnick, Yuqing Kong, Grant Schoenebeck, Tim Weninger | - | 45 13 0 | 02 Jun 2021
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics | Swabha Swayamdipta, Roy Schwartz, Nicholas Lourie, Yizhong Wang, Hannaneh Hajishirzi, Noah A. Smith, Yejin Choi | - | 147 452 0 | 22 Sep 2020
GoEmotions: A Dataset of Fine-Grained Emotions | Dorottya Demszky, Dana Movshovitz-Attias, Jeongwoo Ko, Alan S. Cowen, Gaurav Nemade, Sujith Ravi | AI4MH | 97 724 0 | 01 May 2020
Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness? | Alon Jacovi, Yoav Goldberg | XAI | 138 601 0 | 07 Apr 2020
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova | VLM, SSL, SSeg | 1.9K 95,531 0 | 11 Oct 2018
Attention Is All You Need | Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin | 3DV | 918 133,201 0 | 12 Jun 2017