ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2409.01658
  4. Cited By
From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning

From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning

3 September 2024
Wei Chen
Zhen Huang
Liang Xie
Binbin Lin
Houqiang Li
Le Lu
Xinmei Tian
Deng Cai
Yonggang Zhang
Wenxiao Wang
Xu Shen
Jieping Ye
ArXivPDFHTML

Papers citing "From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning"

5 / 5 papers shown
Title
Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs
Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs
Rui Dai
Sile Hu
Xu Shen
Yonggang Zhang
Xinmei Tian
Jieping Ye
MoMe
34
2
0
15 Apr 2025
How to Mitigate Overfitting in Weak-to-strong Generalization?
Junhao Shi
Qinyuan Cheng
Zhaoye Fei
Y. Zheng
Qipeng Guo
Xipeng Qiu
65
0
0
06 Mar 2025
Have the VLMs Lost Confidence? A Study of Sycophancy in VLMs
Have the VLMs Lost Confidence? A Study of Sycophancy in VLMs
Shuo Li
Tao Ji
Xiaoran Fan
Linsheng Lu
L. Yang
...
Y. Wang
Xiaohui Zhao
Tao Gui
Qi Zhang
Xuanjing Huang
33
0
0
15 Oct 2024
Interpretability in the Wild: a Circuit for Indirect Object
  Identification in GPT-2 small
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Kevin Wang
Alexandre Variengien
Arthur Conmy
Buck Shlegeris
Jacob Steinhardt
205
486
0
01 Nov 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
1