Humans Hallucinate Too: Language Models Identify and Correct Subjective Annotation Errors With Label-in-a-Haystack Prompts

22 May 2025 · arXiv:2505.17222 (abs / PDF / HTML)
Georgios Chochlakis, Peter Wu, Arjun Bedi, Marcus Ma, Kristina Lerman, Shrikanth Narayanan

Papers citing "Humans Hallucinate Too: Language Models Identify and Correct Subjective Annotation Errors With Label-in-a-Haystack Prompts" (24 papers)

Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors
Georgios Chochlakis, Alexandros Potamianos, Kristina Lerman, Shrikanth Narayanan
17 Oct 2024

How Does Quantization Affect Multilingual LLMs?
Kelly Marchisio, Saurabh Dash, Hongyu Chen, Dennis Aumiller, Ahmet Üstün, Sara Hooker, Sebastian Ruder
03 Jul 2024

Harmful Speech Detection by Language Models Exhibits Gender-Queer Dialect Bias
Rebecca Dorn, Lee Kezar, Fred Morstatter, Kristina Lerman
23 May 2024

The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition
Georgios Chochlakis, Alexandros Potamianos, Kristina Lerman, Shrikanth Narayanan
25 Mar 2024

Capturing Perspectives of Crowdsourced Annotators in Subjective Learning Tasks
Negar Mokhberian, Myrl G. Marmarelis, F. R. Hopp, Valerio Basile, Fred Morstatter, Kristina Lerman
16 Nov 2023

Modeling subjectivity (by Mimicking Annotator Annotation) in toxic comment identification across diverse communities
Senjuti Dutta, Sid Mittal, Sherol Chen, Deepak Ramachandran, Ravi Rajakumar, Ian D Kivlichan, Sunny Mak, Alena Butryna, Praveen Paritosh
01 Nov 2023

Foundation Model Assisted Automatic Speech Emotion Recognition: Transcribing, Annotating, and Augmenting
Tiantian Feng, Shrikanth Narayanan
15 Sep 2023

Trusting Your Evidence: Hallucinate Less with Context-aware Decoding
Weijia Shi, Xiaochuang Han, M. Lewis, Yulia Tsvetkov, Luke Zettlemoyer, Scott Yih
24 May 2023

Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting
Miles Turpin, Julian Michael, Ethan Perez, Sam Bowman
07 May 2023

We're Afraid Language Models Aren't Modeling Ambiguity
Alisa Liu, Zhaofeng Wu, Julian Michael, Alane Suhr, Peter West, Alexander Koller, Swabha Swayamdipta, Noah A. Smith, Yejin Choi
27 Apr 2023

The political ideology of conversational AI: Converging evidence on ChatGPT's pro-environmental, left-libertarian orientation
Jochen Hartmann, Jasper Schwenzow, Maximilian Witte
05 Jan 2023

Leveraging Label Correlations in a Multi-label Setting: A Case Study in Emotion
Georgios Chochlakis, Gireesh Mahajan, Sabyasachee Baruah, Keith Burghardt, Kristina Lerman, Shrikanth Narayanan
28 Oct 2022

Noise Audits Improve Moral Foundation Classification
Negar Mokhberian, F. R. Hopp, Bahareh Harandizadeh, Fred Morstatter, Kristina Lerman
13 Oct 2022

Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
04 Mar 2022

Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, M. Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer
25 Feb 2022

Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection
Maarten Sap, Swabha Swayamdipta, Laura Vianna, Xuhui Zhou, Yejin Choi, Noah A. Smith
15 Nov 2021

Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations
Aida Mostafazadeh Davani, Mark Díaz, Vinodkumar Prabhakaran
12 Oct 2021

Does Knowledge Distillation Really Work?
Samuel Stanton, Pavel Izmailov, Polina Kirichenko, Alexander A. Alemi, A. Wilson
10 Jun 2021

Survey Equivalence: A Procedure for Measuring Classifier Accuracy Against Human Labels
Paul Resnick, Yuqing Kong, Grant Schoenebeck, Tim Weninger
02 Jun 2021

Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics
Swabha Swayamdipta, Roy Schwartz, Nicholas Lourie, Yizhong Wang, Hannaneh Hajishirzi, Noah A. Smith, Yejin Choi
22 Sep 2020

GoEmotions: A Dataset of Fine-Grained Emotions
Dorottya Demszky, Dana Movshovitz-Attias, Jeongwoo Ko, Alan S. Cowen, Gaurav Nemade, Sujith Ravi
01 May 2020

Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?
Alon Jacovi, Yoav Goldberg
07 Apr 2020

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
11 Oct 2018

Attention Is All You Need
Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin
12 Jun 2017