Ask Again, Then Fail: Large Language Models' Vacillations in Judgment


3 October 2023
Qiming Xie
Zengzhi Wang
Yi Feng
Rui Xia
    AAML
    HILM

Papers citing "Ask Again, Then Fail: Large Language Models' Vacillations in Judgment"

14 / 14 papers shown
Firm or Fickle? Evaluating Large Language Models Consistency in Sequential Interactions
Yubo Li
Yidi Miao
Xueying Ding
Ramayya Krishnan
R. Padman
37
0
0
28 Mar 2025
ReflectDiffu: Reflect between Emotion-intent Contagion and Mimicry for Empathetic Response Generation via a RL-Diffusion Framework
Jiahao Yuan
Zixiang Di
Zhiqing Cui
Guisong Yang
Usman Naseem
43
0
0
16 Sep 2024
StructEval: Deepen and Broaden Large Language Model Assessment via Structured Evaluation
Boxi Cao
Mengjie Ren
Hongyu Lin
Xianpei Han
Feng Zhang
Junfeng Zhan
Le Sun
ELM
29
3
0
06 Aug 2024
Why does in-context learning fail sometimes? Evaluating in-context learning on open and closed questions
Xiang Li
Haoran Tang
Siyu Chen
Ziwei Wang
Ryan Chen
Marcin Abram
LRM
29
1
0
02 Jul 2024
BeHonest: Benchmarking Honesty in Large Language Models
Steffi Chern
Zhulin Hu
Yuqing Yang
Ethan Chern
Yuan Guo
Jiahe Jin
Binjie Wang
Pengfei Liu
HILM
ALM
84
3
0
19 Jun 2024
Towards Understanding Sycophancy in Language Models
Mrinank Sharma
Meg Tong
Tomasz Korbak
D. Duvenaud
Amanda Askell
...
Oliver Rausch
Nicholas Schiefer
Da Yan
Miranda Zhang
Ethan Perez
209
178
0
20 Oct 2023
A Survey on Evaluation of Large Language Models
Yu-Chu Chang
Xu Wang
Jindong Wang
Yuanyi Wu
Linyi Yang
...
Yue Zhang
Yi-Ju Chang
Philip S. Yu
Qian Yang
Xingxu Xie
ELM
LM&MA
ALM
58
1,496
0
06 Jul 2023
Learning by Distilling Context
Charles Burton Snell
Dan Klein
Ruiqi Zhong
ReLM
LRM
161
44
0
30 Sep 2022
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Deep Ganguli
Liane Lovitt
John Kernion
Amanda Askell
Yuntao Bai
...
Nicholas Joseph
Sam McCandlish
C. Olah
Jared Kaplan
Jack Clark
218
441
0
23 Aug 2022
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
291
4,048
0
24 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,881
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,402
0
28 Jan 2022
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
Yao Lu
Max Bartolo
Alastair Moore
Sebastian Riedel
Pontus Stenetorp
AILaw
LRM
277
1,114
0
18 Apr 2021
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies
Mor Geva
Daniel Khashabi
Elad Segal
Tushar Khot
Dan Roth
Jonathan Berant
RALM
245
671
0
06 Jan 2021