Concept-Based Explanations to Test for False Causal Relationships Learned by Abusive Language Classifiers

4 July 2023

Papers citing "Concept-Based Explanations to Test for False Causal Relationships Learned by Abusive Language Classifiers"

Title | Authors | Tag | Counts | Date
Are All Spurious Features in Natural Language Alike? An Analysis through a Causal Lens | Nitish Joshi, X. Pan, Hengxing He | CML | 44 28 0 | 25 Oct 2022
Probing Classifiers: Promises, Shortcomings, and Advances | Yonatan Belinkov | | 224 402 0 | 24 Feb 2021
Towards generalisable hate speech detection: a review on obstacles and solutions | Wenjie Yin, A. Zubiaga | | 117 164 0 | 17 Feb 2021
On Completeness-aware Concept-Based Explanations in Deep Neural Networks | Chih-Kuan Yeh, Been Kim, Sercan Ö. Arik, Chun-Liang Li, Tomas Pfister, Pradeep Ravikumar | FAtt | 120 297 0 | 17 Oct 2019
What you can cram into a single vector: Probing sentence embeddings for linguistic properties | Alexis Conneau, Germán Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni | | 199 879 0 | 03 May 2018
Efficient Estimation of Word Representations in Vector Space | Tomáš Mikolov, Kai Chen, G. Corrado, J. Dean | 3DV | 228 31,150 0 | 16 Jan 2013