2206.05802
Cited By
v1
v2 (latest)
Self-critiquing models for assisting human evaluators
12 June 2022
William Saunders
Catherine Yeh
Jeff Wu
Steven Bills
Long Ouyang
Jonathan Ward
Jan Leike
ALM
ELM
Papers citing "Self-critiquing models for assisting human evaluators"
50 / 260 papers shown
Learning to Refine with Fine-Grained Natural Language Feedback
Manya Wadhwa
Xinyu Zhao
Junyi Jessy Li
Greg Durrett
444
25
0
02 Jul 2024
Large Language Models for Behavioral Economics: Internal Validity and Elicitation of Mental Models
Brian Jabarian
73
0
0
30 Jun 2024
Is It Really Long Context if All You Need Is Retrieval? Towards Genuinely Difficult Long Context NLP
Omer Goldman
Alon Jacovi
Aviv Slobodkin
Aviya Maimon
Ido Dagan
Reut Tsarfaty
350
17
0
29 Jun 2024
LLM Critics Help Catch LLM Bugs
Nat McAleese
Rai Michael Pokorny
Juan Felipe Cerón Uribe
Evgenia Nitishinskaya
Maja Trebacz
Jan Leike
ALM
LRM
197
119
0
28 Jun 2024
Human-AI Collaborative Taxonomy Construction: A Case Study in Profession-Specific Writing Assistants
Linghe Wang
Min Namgung
Vivek A. Khetan
Dongyeop Kang
306
5
0
26 Jun 2024
On the Transformations across Reward Model, Parameter Update, and In-Context Prompt
Deng Cai
Huayang Li
Tingchen Fu
Siheng Li
Weiwen Xu
...
Leyang Cui
Yan Wang
Lemao Liu
Taro Watanabe
Shuming Shi
KELM
180
2
0
24 Jun 2024
FastMem: Fast Memorization of Prompt Improves Context Awareness of Large Language Models
Junyi Zhu
Shuochen Liu
Yu Yu
Bo Tang
Yibo Yan
Zhiyu Li
Feiyu Xiong
Tong Xu
Matthew B. Blaschko
180
6
0
23 Jun 2024
Beyond Under-Alignment: Atomic Preference Enhanced Factuality Tuning for Large Language Models
Hongbang Yuan
Yubo Chen
Pengfei Cao
Zhuoran Jin
Kang Liu
Jun Zhao
151
0
0
18 Jun 2024
InternalInspector $I^2$: Robust Confidence Estimation in LLMs through Internal States
Mohammad Beigi
Ying Shen
Runing Yang
Zihao Lin
Qifan Wang
Ankith Mohan
Jianfeng He
Ming Jin
Chang-Tien Lu
Lifu Huang
HILM
200
19
0
17 Jun 2024
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Seungone Kim
Juyoung Suk
Ji Yong Cho
Shayne Longpre
Chaeeun Kim
...
Sean Welleck
Graham Neubig
Moontae Lee
Kyungjae Lee
Minjoon Seo
ELM
ALM
LM&MA
375
68
0
09 Jun 2024
Learning Task Decomposition to Assist Humans in Competitive Programming
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Jiaxin Wen
Ruiqi Zhong
Pei Ke
Zhihong Shao
Hongning Wang
Shiyu Huang
ReLM
286
12
0
07 Jun 2024
Open-Endedness is Essential for Artificial Superhuman Intelligence
International Conference on Machine Learning (ICML), 2024
Edward Hughes
Michael Dennis
Jack Parker-Holder
Feryal M. P. Behbahani
Aditi Mavalankar
Yuge Shi
Tom Schaul
Tim Rocktäschel
LRM
284
53
0
06 Jun 2024
When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs
Ryo Kamoi
Yusen Zhang
Nan Zhang
Jiawei Han
Rui Zhang
LRM
336
144
0
03 Jun 2024
Improving Reward Models with Synthetic Critiques
Zihuiwen Ye
Fraser Greenlee-Scott
Max Bartolo
Phil Blunsom
Jon Ander Campos
Matthias Gallé
ALM
SyDa
LRM
219
34
0
31 May 2024
Stress-Testing Capability Elicitation With Password-Locked Models
Ryan Greenblatt
Fabien Roger
Dmitrii Krasheninnikov
David M. Krueger
278
25
0
29 May 2024
Offline Regularised Reinforcement Learning for Large Language Models Alignment
Pierre Harvey Richemond
Yunhao Tang
Daniel Guo
Daniele Calandriello
M. G. Azar
...
Gil Shamir
Rishabh Joshi
Tianqi Liu
Rémi Munos
Bilal Piot
OffRL
214
40
0
29 May 2024
Efficient Model-agnostic Alignment via Bayesian Persuasion
Fengshuo Bai
Mingzhi Wang
Zhaowei Zhang
Boyuan Chen
Yinda Xu
Ying Wen
Yaodong Yang
198
9
0
29 May 2024
LIRE: listwise reward enhancement for preference alignment
Mingye Zhu
Yi Liu
Lei Zhang
Junbo Guo
Zhendong Mao
106
8
0
22 May 2024
Adversarial DPO: Harnessing Harmful Data for Reducing Toxicity with Minimal Impact on Coherence and Evasiveness in Dialogue Agents
San Kim
Gary Geunbae Lee
AAML
269
6
0
21 May 2024
OLAPH: Improving Factuality in Biomedical Long-form Question Answering
Minbyul Jeong
Hyeon Hwang
Chanwoong Yoon
Taewhoo Lee
Jaewoo Kang
MedIm
HILM
LM&MA
371
19
0
21 May 2024
Fennec: Fine-grained Language Model Evaluation and Correction Extended through Branching and Bridging
Xiaobo Liang
Haoke Zhang
Helan Hu
Juntao Li
Jun Xu
Min Zhang
ALM
153
4
0
20 May 2024
Can Language Models Explain Their Own Classification Behavior?
Dane Sherburn
Bilal Chughtai
Owain Evans
177
2
0
13 May 2024
LLMs can Find Mathematical Reasoning Mistakes by Pedagogical Chain-of-Thought
International Joint Conference on Artificial Intelligence (IJCAI), 2024
Zhuoxuan Jiang
Haoyuan Peng
Shanshan Feng
Fan Li
Dongsheng Li
KELM
LRM
319
27
0
09 May 2024
Conversational Topic Recommendation in Counseling and Psychotherapy with Decision Transformer and Large Language Models
Clinical Natural Language Processing Workshop (ClinicalNLP), 2024
Aylin Gunal
Baihan Lin
Djallel Bouneffouf
OffRL
AI4MH
LM&MA
163
1
0
08 May 2024
General Purpose Verification for Chain of Thought Prompting
Robert Vacareanu
Anurag Pratik
Evangelia Spiliopoulou
Zheng Qi
Giovanni Paolini
Neha Ann John
Jie Ma
Yassine Benajiba
Miguel Ballesteros
LRM
139
15
0
30 Apr 2024
Small Language Models Need Strong Verifiers to Self-Correct Reasoning
Yunxiang Zhang
Muhammad Khalifa
Lajanugen Logeswaran
Jaekyeom Kim
Moontae Lee
Honglak Lee
Lu Wang
LRM
KELM
ReLM
271
70
0
26 Apr 2024
Aligning LLM Agents by Learning Latent Preference from User Edits
Ge Gao
Alexey Taymanov
Eduardo Salinas
Paul Mineiro
Dipendra Kumar Misra
LLMAG
259
47
0
23 Apr 2024
A Survey on Self-Evolution of Large Language Models
Zhengwei Tao
Ting-En Lin
Xiancai Chen
Hangyu Li
Yuchuan Wu
Yongbin Li
Zhi Jin
Fei Huang
Dacheng Tao
Jingren Zhou
LRM
LM&Ro
272
45
0
22 Apr 2024
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing
Ye Tian
Baolin Peng
Linfeng Song
Lifeng Jin
Dian Yu
Haitao Mi
Dong Yu
LRM
ReLM
195
123
0
18 Apr 2024
LLM Evaluators Recognize and Favor Their Own Generations
Arjun Panickssery
Samuel R. Bowman
Shi Feng
309
333
0
15 Apr 2024
PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics
Derui Zhu
Dingfan Chen
Qing Li
Zongxiong Chen
Lei Ma
Jens Grossklags
Mario Fritz
HILM
166
18
0
06 Apr 2024
IterAlign: Iterative Constitutional Alignment of Large Language Models
Xiusi Chen
Hongzhi Wen
Jiapeng Liu
Chen Luo
Qingyu Yin
Ruirui Li
Zheng Li
Wei Wang
AILaw
89
7
0
27 Mar 2024
Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with Autoformalization
Jin Peng Zhou
Charles Staats
Wenda Li
Christian Szegedy
Kilian Q. Weinberger
Yuhuai Wu
LRM
164
58
0
26 Mar 2024
STRUM-LLM: Attributed and Structured Contrastive Summarization
Beliz Gunel
James Bradley Wendt
Jing Xie
Yichao Zhou
Nguyen Vo
Zachary Fisher
Sandeep Tata
94
6
0
25 Mar 2024
VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding
Ahmad A Mahmood
Ashmal Vayani
Muzammal Naseer
Salman Khan
Fahad Shahbaz Khan
LRM
342
11
0
21 Mar 2024
Facilitating Pornographic Text Detection for Open-Domain Dialogue Systems via Knowledge Distillation of Large Language Models
Huachuan Qiu
Shuai Zhang
Hongliang He
Anqi Li
Zhenzhong Lan
202
2
0
20 Mar 2024
Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation
European Conference on Computer Vision (ECCV), 2024
Yunhao Gou
Kai Chen
Zhili Liu
Lanqing Hong
Hang Xu
Zhenguo Li
Dit-Yan Yeung
James T. Kwok
Yu Zhang
MLLM
267
93
0
14 Mar 2024
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
Neural Information Processing Systems (NeurIPS), 2024
Zhiqing Sun
Longhui Yu
Yikang Shen
Weiyang Liu
Yiming Yang
Sean Welleck
Chuang Gan
177
91
0
14 Mar 2024
Self-Refinement of Language Models from External Proxy Metrics Feedback
Keshav Ramji
Young-Suk Lee
Ramón Fernandez Astudillo
M. Sultan
Tahira Naseem
Asim Munawar
Radu Florian
Salim Roukos
HILM
141
8
0
27 Feb 2024
TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space
Shaolei Zhang
Tian Yu
Yang Feng
HILM
KELM
243
78
0
27 Feb 2024
Navigating Complexity: Orchestrated Problem Solving with Multi-Agent LLMs
Sumedh Rasal
E. Hauer
181
0
0
26 Feb 2024
TEaR: Improving LLM-based Machine Translation with Systematic Self-Refinement
Zhaopeng Feng
Yan Zhang
Hao Li
Bei Wu
Jiayu Liao
Wenqiang Liu
Jun Lang
Yang Feng
Jian Wu
Zuozhu Liu
LRM
384
28
0
26 Feb 2024
Look Before You Leap: Problem Elaboration Prompting Improves Mathematical Reasoning in Large Language Models
Haoran Liao
Jidong Tian
Shaohua Hu
Hao He
Yaohui Jin
ReLM
LRM
150
1
0
24 Feb 2024
CriticBench: Benchmarking LLMs for Critique-Correct Reasoning
Zicheng Lin
Zhibin Gou
Tian Liang
Ruilin Luo
Haowei Liu
Yujiu Yang
LRM
326
78
0
22 Feb 2024
Q-Probe: A Lightweight Approach to Reward Maximization for Language Models
Kenneth Li
Samy Jelassi
Hugh Zhang
Sham Kakade
Martin Wattenberg
David Brandfonbrener
227
15
0
22 Feb 2024
CriticBench: Evaluating Large Language Models as Critic
Tian Lan
Wenwei Zhang
Chen Xu
Heyan Huang
Dahua Lin
Kai-xiang Chen
Xian-Ling Mao
ELM
AI4MH
LRM
150
2
0
21 Feb 2024
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
Liyan Tang
Igor Shalyminov
Amy Wing-mei Wong
Jon Burnsky
Jake W. Vincent
...
Hang Su
Lijia Sun
Yi Zhang
Saab Mansour
Kathleen McKeown
HILM
165
71
0
20 Feb 2024
A Survey on Knowledge Distillation of Large Language Models
Xiaohan Xu
Ming Li
Chongyang Tao
Tao Shen
Reynold Cheng
Jinyang Li
Can Xu
Dacheng Tao
Wanrong Zhu
KELM
VLM
406
219
0
20 Feb 2024
Learning to Check: Unleashing Potentials for Self-Correction in Large Language Models
Che Zhang
Zhenyang Xiao
Chengcheng Han
Yixin Lian
Yuejian Fang
LRM
180
0
0
20 Feb 2024
Confidence Matters: Revisiting Intrinsic Self-Correction Capabilities of Large Language Models
Loka Li
Zhenhao Chen
Guan-Hong Chen
Yixuan Zhang
Yusheng Su
Eric P. Xing
Kun Zhang
LRM
269
33
0
19 Feb 2024