2207.04153
Cited By
Probing Classifiers are Unreliable for Concept Removal and Detection
8 July 2022
Abhinav Kumar
Chenhao Tan
Amit Sharma
AAML
Papers citing
"Probing Classifiers are Unreliable for Concept Removal and Detection"
8 / 8 papers shown
Title
A Geometric Notion of Causal Probing
Clément Guerner
Anej Svete
Tianyu Liu
Alex Warstadt
Ryan Cotterell
LLMSV
34
12
0
27 Jul 2023
LEACE: Perfect linear concept erasure in closed form
Nora Belrose
David Schneider-Joseph
Shauli Ravfogel
Ryan Cotterell
Edward Raff
Stella Biderman
KELM
MU
41
102
0
06 Jun 2023
Towards Procedural Fairness: Uncovering Biases in How a Toxic Language Classifier Uses Sentiment Information
I. Nejadgholi
Esma Balkir
Kathleen C. Fraser
S. Kiritchenko
23
3
0
19 Oct 2022
Linear Adversarial Concept Erasure
Shauli Ravfogel
Michael Twiton
Yoav Goldberg
Ryan Cotterell
KELM
71
57
0
28 Jan 2022
Probing Classifiers: Promises, Shortcomings, and Advances
Yonatan Belinkov
224
404
0
24 Feb 2021
An Investigation of Why Overparameterization Exacerbates Spurious Correlations
Shiori Sagawa
Aditi Raghunathan
Pang Wei Koh
Percy Liang
144
369
0
09 May 2020
A Survey on Bias and Fairness in Machine Learning
Ninareh Mehrabi
Fred Morstatter
N. Saxena
Kristina Lerman
Aram Galstyan
SyDa
FaML
294
4,187
0
23 Aug 2019
What you can cram into a single vector: Probing sentence embeddings for linguistic properties
Alexis Conneau
Germán Kruszewski
Guillaume Lample
Loïc Barrault
Marco Baroni
199
882
0
03 May 2018