Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2201.12191
Cited By
Kernelized Concept Erasure
28 January 2022
Shauli Ravfogel
Francisco Vargas
Yoav Goldberg
Ryan Cotterell
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Kernelized Concept Erasure"
9 / 9 papers shown
Title
Focus On This, Not That! Steering LLMs With Adaptive Feature Specification
Tom A. Lamb
Adam Davies
Alasdair Paren
Philip H. S. Torr
Francesco Pinto
45
0
0
30 Oct 2024
Machine Unlearning Fails to Remove Data Poisoning Attacks
Martin Pawelczyk
Jimmy Z. Di
Yiwei Lu
Gautam Kamath
Ayush Sekhari
Seth Neel
AAML
MU
52
8
0
25 Jun 2024
Exploring Safety-Utility Trade-Offs in Personalized Language Models
Anvesh Rao Vijjini
Somnath Basu Roy Chowdhury
Snigdha Chaturvedi
33
6
0
17 Jun 2024
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
Samuel Marks
Can Rager
Eric J. Michaud
Yonatan Belinkov
David Bau
Aaron Mueller
44
110
0
28 Mar 2024
Robust Concept Erasure via Kernelized Rate-Distortion Maximization
Somnath Basu Roy Chowdhury
Nicholas Monath
Kumar Avinava Dubey
Amr Ahmed
Snigdha Chaturvedi
14
4
0
30 Nov 2023
LEACE: Perfect linear concept erasure in closed form
Nora Belrose
David Schneider-Joseph
Shauli Ravfogel
Ryan Cotterell
Edward Raff
Stella Biderman
KELM
MU
41
102
0
06 Jun 2023
Linear Adversarial Concept Erasure
Shauli Ravfogel
Michael Twiton
Yoav Goldberg
Ryan Cotterell
KELM
71
56
0
28 Jan 2022
On the Global Optima of Kernelized Adversarial Representation Learning
Bashir Sadeghi
Runyi Yu
Vishnu Naresh Boddeti
AAML
59
31
0
16 Oct 2019
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
228
31,150
0
16 Jan 2013
1