Interpretation of Black Box NLP Models: A Survey
Shivani Choudhary, N. Chatterjee, S. K. Saha
arXiv:2203.17081, 31 March 2022
Papers citing "Interpretation of Black Box NLP Models: A Survey" (13 papers)

Representation Engineering for Large-Language Models: Survey and Research Challenges
Lukasz Bartoszcze, Sarthak Munshi, Bryan Sukidi, Jennifer Yen, Zejia Yang, David Williams-King, Linh Le, Kosi Asuzu, Carsten Maple. 24 Feb 2025.

Local Explanations and Self-Explanations for Assessing Faithfulness in black-box LLMs
Christos Fragkathoulas, Odysseas S. Chlapanis. 18 Sep 2024.

On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs
Nitay Calderon, Roi Reichart. 27 Jul 2024.

The Role of Syntactic Span Preferences in Post-Hoc Explanation Disagreement
Jonathan Kamp, Lisa Beinborn, Antske Fokkens. 28 Mar 2024.

Post Hoc Explanations of Language Models Can Improve Language Models
Satyapriya Krishna, Jiaqi Ma, Dylan Slack, Asma Ghandeharioun, Sameer Singh, Himabindu Lakkaraju. 19 May 2023.

LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models
Mor Geva, Avi Caciularu, Guy Dar, Paul Roit, Shoval Sadde, Micah Shlain, Bar Tamir, Yoav Goldberg. 26 Apr 2022.

Tailor: Generating and Perturbing Text with Semantic Controls
Alexis Ross, Tongshuang Wu, Hao Peng, Matthew E. Peters, Matt Gardner. 15 Jul 2021.

Probing Classifiers: Promises, Shortcomings, and Advances
Yonatan Belinkov. 24 Feb 2021.

A Survey on Neural Network Interpretability
Yu Zhang, Peter Tiño, A. Leonardis, K. Tang. 28 Dec 2020.

e-SNLI: Natural Language Inference with Natural Language Explanations
Oana-Maria Camburu, Tim Rocktäschel, Thomas Lukasiewicz, Phil Blunsom. 04 Dec 2018.

What you can cram into a single vector: Probing sentence embeddings for linguistic properties
Alexis Conneau, Germán Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni. 03 May 2018.

Towards A Rigorous Science of Interpretable Machine Learning
Finale Doshi-Velez, Been Kim. 28 Feb 2017.

Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov, Kai Chen, G. Corrado, J. Dean. 16 Jan 2013.