arXiv:2405.20612
UniBias: Unveiling and Mitigating LLM Bias through Internal Attention and FFN Manipulation
31 May 2024
Hanzhang Zhou
Zijian Feng
Zixiao Zhu
Junlang Qian
Kezhi Mao
Papers citing "UniBias: Unveiling and Mitigating LLM Bias through Internal Attention and FFN Manipulation"
11 / 11 papers shown
Exploiting Contextual Knowledge in LLMs through V-usable Information based Layer Enhancement
Xiaowei Yuan
Zhao Yang
Ziyang Huang
Y. Wang
Siqi Fan
Yiming Ju
Jun Zhao
Kang-Jun Liu
27
0
0
22 Apr 2025
Accelerating Particle-based Energetic Variational Inference
Xuelian Bao
Lulu Kang
Chun Liu
Yiwei Wang
BDL
59
0
0
04 Apr 2025
Grounded Chain-of-Thought for Multimodal Large Language Models
Qiong Wu
Xiangcong Yang
Yiyi Zhou
Chenxin Fang
Baiyang Song
Xiaoshuai Sun
Rongrong Ji
LRM
76
1
0
17 Mar 2025
Shortcut Learning in In-Context Learning: A Survey
Rui Song
Yingji Li
Fausto Giunchiglia
Hao Xu
38
1
0
04 Nov 2024
ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability
ZhongXiang Sun
Xiaoxue Zang
Kai Zheng
Yang Song
Jun Xu
Xiao Zhang
Weijie Yu
Han Li
55
7
0
15 Oct 2024
NoVo: Norm Voting off Hallucinations with Attention Heads in Large Language Models
Zheng Yi Ho
Siyuan Liang
Sen Zhang
Yibing Zhan
Dacheng Tao
26
2
0
11 Oct 2024
Characterizing Mechanisms for Factual Recall in Language Models
Qinan Yu
Jack Merullo
Ellie Pavlick
KELM
42
23
0
24 Oct 2023
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model
Michael Hanna
Ollie Liu
Alexandre Variengien
LRM
186
116
0
30 Apr 2023
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Kevin Wang
Alexandre Variengien
Arthur Conmy
Buck Shlegeris
Jacob Steinhardt
210
491
0
01 Nov 2022
Prototypical Calibration for Few-shot Learning of Language Models
Zhixiong Han
Y. Hao
Li Dong
Yutao Sun
Furu Wei
168
52
0
20 May 2022
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
Yao Lu
Max Bartolo
Alastair Moore
Sebastian Riedel
Pontus Stenetorp
AILaw
LRM
277
1,114
0
18 Apr 2021