2505.17558
Cited By
Teaching with Lies: Curriculum DPO on Synthetic Negatives for Hallucination Detection
23 May 2025
Shrey Pandit
Ashwin Vinod
Liu Leqi
Ying Ding
HILM
Papers citing
"Teaching with Lies: Curriculum DPO on Synthetic Negatives for Hallucination Detection"
17 / 17 papers shown
FactCheckmate: Preemptively Detecting and Mitigating Hallucinations in LMs
Deema Alnuhait
Neeraja Kirtane
Muhammad Khalifa
Hao Peng
LRM
HILM
107
4
0
03 Oct 2024
Lynx: An Open Source Hallucination Evaluation Model
Selvan Sunitha Ravi
B. Mielczarek
Anand Kannappan
Douwe Kiela
Rebecca Qian
VLM
RALM
HILM
124
20
0
11 Jul 2024
Using LLMs in Software Requirements Specifications: An Empirical Evaluation
Madhava Krishna
Bhagesh Gaur
Arsh Verma
Pankaj Jalote
70
22
0
27 Apr 2024
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents
Liyan Tang
Philippe Laban
Greg Durrett
HILM
SyDa
88
103
0
16 Apr 2024
Curry-DPO: Enhancing Alignment using Curriculum Learning & Ranked Preferences
Pulkit Pattnaik
Rishabh Maheshwary
Kelechi Ogueji
Vikas Yadav
Sathwik Tejaswi Madhusudhan
75
22
0
12 Mar 2024
Large Language Models in Law: A Survey
Jinqi Lai
Wensheng Gan
Jiayang Wu
Zhenlian Qi
Philip S. Yu
ELM
AILaw
121
91
0
26 Nov 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
405
4,190
0
29 May 2023
Large Language Models Encode Clinical Knowledge
K. Singhal
Shekoofeh Azizi
T. Tu
S. S. Mahdavi
Jason W. Wei
...
A. Rajkomar
Joelle Barral
Christopher Semturs
Alan Karthikesalingam
Vivek Natarajan
LM&MA
ELM
AI4MH
355
2,421
0
26 Dec 2022
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation
Tianyu Liu
Yizhe Zhang
Chris Brockett
Yi Mao
Zhifang Sui
Weizhu Chen
W. Dolan
HILM
297
149
0
18 Apr 2021
FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization
Esin Durmus
He He
Mona T. Diab
HILM
117
398
0
07 May 2020
BLEURT: Learning Robust Metrics for Text Generation
Thibault Sellam
Dipanjan Das
Ankur P. Parikh
154
1,511
0
09 Apr 2020
PubMedQA: A Dataset for Biomedical Research Question Answering
Qiao Jin
Bhuwan Dhingra
Zhengping Liu
William W. Cohen
Xinghua Lu
436
918
0
13 Sep 2019
Constrained Decoding for Neural NLG from Compositional Representations in Task-Oriented Dialogue
Anusha Balakrishnan
J. Rao
Kartikeya Upasani
Michael White
R. Subba
158
83
0
17 Jun 2019
Simple and Effective Curriculum Pointer-Generator Networks for Reading Comprehension over Long Narratives
Yi Tay
Shuohang Wang
Anh Tuan Luu
Jie Fu
Minh C. Phan
Xingdi Yuan
J. Rao
S. Hui
Aston Zhang
107
110
0
26 May 2019
Curriculum Learning for Domain Adaptation in Neural Machine Translation
Xuan Zhang
Pamela Shapiro
Manish Kumar
Paul McNamee
Marine Carpuat
Kevin Duh
81
124
0
14 May 2019
BERTScore: Evaluating Text Generation with BERT
Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi
684
5,897
0
21 Apr 2019
DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs
Dheeru Dua
Yizhong Wang
Pradeep Dasigi
Gabriel Stanovsky
Sameer Singh
Matt Gardner
AIMat
158
967
0
01 Mar 2019