ResearchTrend.AI
  • Communities
  • Connect sessions
  • AI calendar
  • Organizations
  • Join Slack
  • Contact Sales
Papers
Communities
Social Events
Terms and Conditions
Pricing
Contact Sales
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2206.05802
  4. Cited By
Self-critiquing models for assisting human evaluators
v1v2 (latest)

Self-critiquing models for assisting human evaluators

12 June 2022
William Saunders
Catherine Yeh
Jeff Wu
Steven Bills
Ouyang Long
Jonathan Ward
Jan Leike
    ALMELM
ArXiv (abs)PDFHTML

Papers citing "Self-critiquing models for assisting human evaluators"

50 / 260 papers shown
Title
S$^2$R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning
S2^22R: Teaching LLMs to Self-verify and Self-correct via Reinforcement LearningAnnual Meeting of the Association for Computational Linguistics (ACL), 2025
Ruotian Ma
Peisong Wang
Cheng Liu
Xingyan Liu
Jiaqi Chen
Bang Zhang
Xin Zhou
Nan Du
Jia Li
LRM
362
8
0
18 Feb 2025
Improve Decoding Factuality by Token-wise Cross Layer Entropy of Large Language Models
Improve Decoding Factuality by Token-wise Cross Layer Entropy of Large Language ModelsNorth American Chapter of the Association for Computational Linguistics (NAACL), 2025
Jialiang Wu
Yi Shen
Sijia Liu
Yi Tang
Sen Song
Xiaoyi Wang
Longjun Cai
185
2
0
05 Feb 2025
Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search
Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search
Maohao Shen
Guangtao Zeng
Zhenting Qi
Zhang-Wei Hong
Zhenfang Chen
Wei Lu
G. Wornell
Subhro Das
David D. Cox
Chuang Gan
LRMLLMAG
1.1K
34
0
04 Feb 2025
Iterative Label Refinement Matters More than Preference Optimization under Weak Supervision
Iterative Label Refinement Matters More than Preference Optimization under Weak SupervisionInternational Conference on Learning Representations (ICLR), 2025
Yaowen Ye
Cassidy Laidlaw
Jacob Steinhardt
ALM
170
2
0
14 Jan 2025
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
OpenCodeInterpreter: Integrating Code Generation with Execution and RefinementAnnual Meeting of the Association for Computational Linguistics (ACL), 2024
Tianyu Zheng
Ge Zhang
Shangda Wu
Xueling Liu
Bill Yuchen Lin
Jie Fu
Lei Ma
Xiang Yue
SyDa
376
192
0
08 Jan 2025
Exploring and Controlling Diversity in LLM-Agent Conversation
Exploring and Controlling Diversity in LLM-Agent Conversation
Kuanchao Chu
Yi-Pei Chen
Hideki Nakayama
LLMAG
406
8
0
30 Dec 2024
The Superalignment of Superhuman Intelligence with Large Language Models
The Superalignment of Superhuman Intelligence with Large Language ModelsScience China Information Sciences (Sci. China Inf. Sci.), 2024
Shiyu Huang
Yingkang Wang
Shiyao Cui
Pei Ke
J. Tang
362
1
0
15 Dec 2024
ProcessBench: Identifying Process Errors in Mathematical Reasoning
ProcessBench: Identifying Process Errors in Mathematical ReasoningAnnual Meeting of the Association for Computational Linguistics (ACL), 2024
Chujie Zheng
Zizhuo Zhang
Beichen Zhang
Runji Lin
Keming Lu
Bowen Yu
Dayiheng Liu
Jingren Zhou
Junyang Lin
LRM
520
150
0
09 Dec 2024
Adaptive Deployment of Untrusted LLMs Reduces Distributed Threats
Adaptive Deployment of Untrusted LLMs Reduces Distributed ThreatsInternational Conference on Learning Representations (ICLR), 2024
Jiaxin Wen
Vivek Hebbar
Caleb Larson
Aryan Bhatt
Ansh Radhakrishnan
...
Shi Feng
He He
Ethan Perez
Buck Shlegeris
Akbir Khan
AAML
220
14
0
26 Nov 2024
Self-Generated Critiques Boost Reward Modeling for Language Models
Self-Generated Critiques Boost Reward Modeling for Language ModelsNorth American Chapter of the Association for Computational Linguistics (NAACL), 2024
Yue Yu
Zhengxing Chen
Aston Zhang
L Tan
Chenguang Zhu
...
Suchin Gururangan
Chao-Yue Zhang
Melanie Kambadur
Dhruv Mahajan
Rui Hou
LRMALM
450
49
0
25 Nov 2024
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
Xinyan Guan
Yanjiang Liu
Xinyu Lu
Boxi Cao
Xianpei Han
...
Le Sun
Jie Lou
Bowen Yu
Yaojie Lu
Hongyu Lin
ALM
478
8
0
18 Nov 2024
Beyond the Safety Bundle: Auditing the Helpful and Harmless Dataset
Beyond the Safety Bundle: Auditing the Helpful and Harmless DatasetNorth American Chapter of the Association for Computational Linguistics (NAACL), 2024
Khaoula Chehbouni
Jonathan Colaço-Carr
Yash More
Jackie CK Cheung
G. Farnadi
464
7
0
12 Nov 2024
SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models
SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language ModelsNeural Information Processing Systems (NeurIPS), 2024
Jianyi Zhang
Da-Cheng Juan
Cyrus Rashtchian
Chun-Sung Ferng
Heinrich Jiang
Yiran Chen
315
12
0
01 Nov 2024
An Actor-Critic Approach to Boosting Text-to-SQL Large Language Model
An Actor-Critic Approach to Boosting Text-to-SQL Large Language Model
Ziyang Zheng
Haipeng Jing
Canyu Rui
A. Hamdulla
D. Wang
LRM
146
1
0
28 Oct 2024
Improving Model Factuality with Fine-grained Critique-based Evaluator
Improving Model Factuality with Fine-grained Critique-based EvaluatorAnnual Meeting of the Association for Computational Linguistics (ACL), 2024
Yiqing Xie
Wenxuan Zhou
Pradyot Prakash
Di Jin
Yuning Mao
...
Sinong Wang
Han Fang
Carolyn Rose
Daniel Fried
Hejia Zhang
HILM
383
12
0
24 Oct 2024
CorrectionLM: Self-Corrections with SLM for Dialogue State Tracking
CorrectionLM: Self-Corrections with SLM for Dialogue State Tracking
Chia-Hsuan Lee
Hao Cheng
Mari Ostendorf
LRM
122
0
0
23 Oct 2024
LoGU: Long-form Generation with Uncertainty Expressions
LoGU: Long-form Generation with Uncertainty ExpressionsAnnual Meeting of the Association for Computational Linguistics (ACL), 2024
Ruihan Yang
Caiqi Zhang
Zhisong Zhang
Xinting Huang
Sen Yang
Nigel Collier
Dong Yu
Deqing Yang
HILM
499
15
0
18 Oct 2024
Balancing Label Quantity and Quality for Scalable Elicitation
Balancing Label Quantity and Quality for Scalable Elicitation
Alex Troy Mallen
Nora Belrose
125
3
0
17 Oct 2024
Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via
  Lightweight Value Optimization
Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value OptimizationACM Multimedia (MM), 2024
Xingqi Wang
Xiaoyuan Yi
Xing Xie
Jia Jia
173
4
0
16 Oct 2024
JudgeBench: A Benchmark for Evaluating LLM-based Judges
JudgeBench: A Benchmark for Evaluating LLM-based JudgesInternational Conference on Learning Representations (ICLR), 2024
Sijun Tan
Siyuan Zhuang
Kyle Montgomery
William Y. Tang
Alejandro Cuadron
Chenguang Wang
Raluca A. Popa
Ion Stoica
ELMALM
526
130
0
16 Oct 2024
Divide-Verify-Refine: Can LLMs Self-Align with Complex Instructions?
Divide-Verify-Refine: Can LLMs Self-Align with Complex Instructions?Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Xianren Zhang
Xianfeng Tang
Hui Liu
Zongyu Wu
Qi He
Dongwon Lee
Suhang Wang
ALM
223
2
0
16 Oct 2024
FB-Bench: A Fine-Grained Multi-Task Benchmark for Evaluating LLMs' Responsiveness to Human Feedback
FB-Bench: A Fine-Grained Multi-Task Benchmark for Evaluating LLMs' Responsiveness to Human Feedback
Yongbin Li
Miao Zheng
Fan Yang
Bin Cui
Tengjiao Wang
Xin Wu
Guosheng Dong
Wentao Zhang
ALM
253
10
0
12 Oct 2024
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization
Guanlin Liu
Kaixuan Ji
Ning Dai
Zheng Wu
Chen Dun
Q. Gu
Lin Yan
Quanquan Gu
Lin Yan
OffRLLRM
271
18
0
11 Oct 2024
SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction
SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-CorrectionInternational Conference on Learning Representations (ICLR), 2024
L. Yang
Zhaochen Yu
Tianze Zhang
Minkai Xu
Alfons Kemper
Tengjiao Wang
Shuicheng Yan
ELMReLMLRM
238
0
0
11 Oct 2024
LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for
  Enhanced Following of Instructions with Multiple Constraints
LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple ConstraintsConference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Thomas Palmeira Ferraz
Kartik Mehta
Yu-Hsiang Lin
Haw-Shiuan Chang
Shereen Oraby
Sijia Liu
Vivek Subramanian
Tagyoung Chung
Mohit Bansal
Nanyun Peng
223
23
0
09 Oct 2024
Rationale-Aware Answer Verification by Pairwise Self-Evaluation
Rationale-Aware Answer Verification by Pairwise Self-EvaluationConference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Akira Kawabata
Saku Sugawara
LRM
245
6
0
07 Oct 2024
TICKing All the Boxes: Generated Checklists Improve LLM Evaluation and
  Generation
TICKing All the Boxes: Generated Checklists Improve LLM Evaluation and Generation
Jonathan Cook
Tim Rocktaschel
Jakob Foerster
Dennis Aumiller
Alex Wang
ALM
191
28
0
04 Oct 2024
CodePMP: Scalable Preference Model Pretraining for Large Language Model Reasoning
CodePMP: Scalable Preference Model Pretraining for Large Language Model Reasoning
Huimu Yu
Xing Wu
Weidong Yin
Debing Zhang
Songlin Hu
LRM
251
7
0
03 Oct 2024
Truth or Deceit? A Bayesian Decoding Game Enhances Consistency and
  Reliability
Truth or Deceit? A Bayesian Decoding Game Enhances Consistency and Reliability
Weitong Zhang
Chengqi Zang
Bernhard Kainz
178
1
0
01 Oct 2024
A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor?
A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor?
Yunfei Xie
Juncheng Wu
Haoqin Tu
Siwei Yang
Bingchen Zhao
Yongshuo Zong
Qiao Jin
Cihang Xie
Yuyin Zhou
LM&MAELMLRM
266
38
0
23 Sep 2024
Backtracking Improves Generation Safety
Backtracking Improves Generation Safety
Yiming Zhang
Jianfeng Chi
Hailey Nguyen
Kartikeya Upasani
Daniel M. Bikel
Jason Weston
Eric Michael Smith
SILM
263
24
0
22 Sep 2024
Language Models Learn to Mislead Humans via RLHF
Language Models Learn to Mislead Humans via RLHFInternational Conference on Learning Representations (ICLR), 2024
Jiaxin Wen
Ruiqi Zhong
Akbir Khan
Ethan Perez
Jacob Steinhardt
Minlie Huang
Samuel R. Bowman
He He
Shi Feng
270
69
0
19 Sep 2024
Model-in-the-Loop (MILO): Accelerating Multimodal AI Data Annotation
  with LLMs
Model-in-the-Loop (MILO): Accelerating Multimodal AI Data Annotation with LLMs
Yifan Wang
David Stevens
Pranay Shah
Wenwen Jiang
Miao Liu
...
Boying Gong
Daniel Lee
Jiabo Hu
Ning Zhang
Bob Kamma
190
4
0
16 Sep 2024
Pairing Analogy-Augmented Generation with Procedural Memory for
  Procedural Q&A
Pairing Analogy-Augmented Generation with Procedural Memory for Procedural Q&APacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2024
K Roth
Rushil Gupta
Simon Halle
Bang Liu
RALM
165
0
0
02 Sep 2024
Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts Critic
Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts CriticAnnual Meeting of the Association for Computational Linguistics (ACL), 2024
Xin Zheng
Jie Lou
Boxi Cao
Xueru Wen
Yuqiu Ji
Hongyu Lin
Yaojie Lu
Xianpei Han
Debing Zhang
Le Sun
OffRLLRMLLMAGReLMKELM
413
22
1
29 Aug 2024
Critique-out-Loud Reward Models
Critique-out-Loud Reward Models
Zachary Ankner
Mansheej Paul
Brandon Cui
Jonathan D. Chang
Prithviraj Ammanabrolu
ALMLRM
231
67
0
21 Aug 2024
How Susceptible are LLMs to Influence in Prompts?
How Susceptible are LLMs to Influence in Prompts?
Sotiris Anagnostidis
Jannis Bulian
LRM
191
35
0
17 Aug 2024
Meta-Rewarding Language Models: Self-Improving Alignment with
  LLM-as-a-Meta-Judge
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
Tianhao Wu
Weizhe Yuan
O. Yu. Golovneva
Jing Xu
Yuandong Tian
Jiantao Jiao
Jason Weston
Sainbayar Sukhbaatar
ALMKELMLRM
262
147
0
28 Jul 2024
Collaborative Evolving Strategy for Automatic Data-Centric Development
Collaborative Evolving Strategy for Automatic Data-Centric Development
Xu Yang
Haotian Chen
Wenjun Feng
Haoxue Wang
Zeqi Ye
Xinjie Shen
Xiao Yang
Shizhao Sun
Yuante Li
Jiang Bian
250
3
0
26 Jul 2024
SAFETY-J: Evaluating Safety with Critique
SAFETY-J: Evaluating Safety with Critique
Yixiu Liu
Yuxiang Zheng
Shijie Xia
Jiajun Li
Yi Tu
Chaoling Song
Pengfei Liu
ELM
167
2
0
24 Jul 2024
Internal Consistency and Self-Feedback in Large Language Models: A
  Survey
Internal Consistency and Self-Feedback in Large Language Models: A Survey
Xun Liang
Shichao Song
Zifan Zheng
Hanyu Wang
Qingchen Yu
...
Rong-Hua Li
Peng Cheng
Zhonghao Wang
Feiyu Xiong
Zhiyu Li
HILMLRM
423
47
0
19 Jul 2024
Prover-Verifier Games improve legibility of LLM outputs
Prover-Verifier Games improve legibility of LLM outputs
Jan Hendrik Kirchner
Yining Chen
Harri Edwards
Jan Leike
Nat McAleese
Yuri Burda
LRMAAML
214
50
0
18 Jul 2024
Halu-J: Critique-Based Hallucination Judge
Halu-J: Critique-Based Hallucination Judge
Binjie Wang
Steffi Chern
Ethan Chern
Pengfei Liu
HILM
195
13
0
17 Jul 2024
What's Wrong? Refining Meeting Summaries with LLM Feedback
What's Wrong? Refining Meeting Summaries with LLM Feedback
Frederic Kirstein
Terry Ruas
Bela Gipp
235
7
0
16 Jul 2024
Cohesive Conversations: Enhancing Authenticity in Multi-Agent Simulated
  Dialogues
Cohesive Conversations: Enhancing Authenticity in Multi-Agent Simulated Dialogues
Kuanchao Chu
Yi-Pei Chen
Hideki Nakayama
LLMAG
243
8
0
13 Jul 2024
Optimal Decision Making Through Scenario Simulations Using Large
  Language Models
Optimal Decision Making Through Scenario Simulations Using Large Language Models
Sumedh Rasal
E. Hauer
253
3
0
09 Jul 2024
Prompting Techniques for Secure Code Generation: A Systematic Investigation
Prompting Techniques for Secure Code Generation: A Systematic Investigation
Catherine Tony
Nicolás E. Díaz Ferreyra
Markus Mutas
Salem Dhiff
Riccardo Scandariato
SILM
334
32
0
09 Jul 2024
What's Wrong with Your Code Generated by Large Language Models? An Extensive Study
What's Wrong with Your Code Generated by Large Language Models? An Extensive Study
Jiajun Sun
Haoxiang Jia
Shenxi Wu
Huiyuan Zheng
Muling Wu
...
Ming-bo Wen
Yuhao Zhou
Y. Wu
Rui Zheng
Ming-bo Wen
237
66
0
08 Jul 2024
On scalable oversight with weak LLMs judging strong LLMs
On scalable oversight with weak LLMs judging strong LLMs
Zachary Kenton
Noah Y. Siegel
János Kramár
Jonah Brown-Cohen
Samuel Albanie
...
Rishabh Agarwal
David Lindner
Yunhao Tang
Noah D. Goodman
Rohin Shah
ELM
231
60
0
05 Jul 2024
Spontaneous Reward Hacking in Iterative Self-Refinement
Spontaneous Reward Hacking in Iterative Self-Refinement
Jane Pan
He He
Samuel R. Bowman
Shi Feng
213
16
0
05 Jul 2024
Previous
123456
Next