Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2406.01563
Cited By
LoFiT: Localized Fine-tuning on LLM Representations
3 June 2024
Fangcong Yin
Xi Ye
Greg Durrett
Re-assign community
ArXiv
PDF
HTML
Papers citing
"LoFiT: Localized Fine-tuning on LLM Representations"
11 / 11 papers shown
Title
ShadowCoT: Cognitive Hijacking for Stealthy Reasoning Backdoors in LLMs
Gejian Zhao
Hanzhou Wu
Xinpeng Zhang
Athanasios V. Vasilakos
LRM
36
1
0
08 Apr 2025
How to Steer LLM Latents for Hallucination Detection?
Seongheon Park
Xuefeng Du
Min-Hsuan Yeh
Haobo Wang
Yixuan Li
LLMSV
44
1
0
01 Mar 2025
Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes
Bryan R Christ
Zack Gottesman
Jonathan Kropko
Thomas Hartvigsen
LRM
49
2
0
20 Feb 2025
SEER: Self-Explainability Enhancement of Large Language Models' Representations
Guanxu Chen
Dongrui Liu
Tao Luo
Jing Shao
LRM
MILM
59
1
0
07 Feb 2025
Risk-Aware Distributional Intervention Policies for Language Models
Bao Nguyen
Binh Nguyen
Duy Nguyen
V. Nguyen
28
1
0
28 Jan 2025
Understanding Synthetic Context Extension via Retrieval Heads
Xinyu Zhao
Fangcong Yin
Greg Durrett
36
0
0
31 Dec 2024
Programming Refusal with Conditional Activation Steering
Bruce W. Lee
Inkit Padhi
K. Ramamurthy
Erik Miehling
Pierre L. Dognin
Manish Nagireddy
Amit Dhurandhar
LLMSV
89
13
0
06 Sep 2024
Attention Heads of Large Language Models: A Survey
Zifan Zheng
Yezhaohui Wang
Yuxin Huang
Shichao Song
Mingchuan Yang
Bo Tang
Feiyu Xiong
Zhiyu Li
LRM
48
21
0
05 Sep 2024
PERP: Rethinking the Prune-Retrain Paradigm in the Era of LLMs
Max Zimmer
Megi Andoni
Christoph Spiegel
S. Pokutta
VLM
48
10
0
23 Dec 2023
Dissecting Recall of Factual Associations in Auto-Regressive Language Models
Mor Geva
Jasmijn Bastings
Katja Filippova
Amir Globerson
KELM
189
260
0
28 Apr 2023
In-context Learning and Induction Heads
Catherine Olsson
Nelson Elhage
Neel Nanda
Nicholas Joseph
Nova Dassarma
...
Tom B. Brown
Jack Clark
Jared Kaplan
Sam McCandlish
C. Olah
240
453
0
24 Sep 2022
1