Interpretation of Black Box NLP Models: A Survey

31 March 2022

Shivani Choudhary

Papers citing "Interpretation of Black Box NLP Models: A Survey"

Title | Authors | Tags | Counts | Date
Representation Engineering for Large-Language Models: Survey and Research Challenges | Lukasz Bartoszcze, Sarthak Munshi, Bryan Sukidi, Jennifer Yen, Zejia Yang, David Williams-King, Linh Le, Kosi Asuzu, Carsten Maple | - | 100 0 0 | 24 Feb 2025
Local Explanations and Self-Explanations for Assessing Faithfulness in black-box LLMs | Christos Fragkathoulas, Odysseas S. Chlapanis | LRM | 20 0 0 | 18 Sep 2024
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs | Nitay Calderon, Roi Reichart | - | 29 10 0 | 27 Jul 2024
The Role of Syntactic Span Preferences in Post-Hoc Explanation Disagreement | Jonathan Kamp, Lisa Beinborn, Antske Fokkens | - | 20 1 0 | 28 Mar 2024
Post Hoc Explanations of Language Models Can Improve Language Models | Satyapriya Krishna, Jiaqi Ma, Dylan Slack, Asma Ghandeharioun, Sameer Singh, Himabindu Lakkaraju | ReLM, LRM | 11 53 0 | 19 May 2023
LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models | Mor Geva, Avi Caciularu, Guy Dar, Paul Roit, Shoval Sadde, Micah Shlain, Bar Tamir, Yoav Goldberg | KELM | 17 27 0 | 26 Apr 2022
Tailor: Generating and Perturbing Text with Semantic Controls | Alexis Ross, Tongshuang Wu, Hao Peng, Matthew E. Peters, Matt Gardner | - | 130 77 0 | 15 Jul 2021
Probing Classifiers: Promises, Shortcomings, and Advances | Yonatan Belinkov | - | 221 402 0 | 24 Feb 2021
A Survey on Neural Network Interpretability | Yu Zhang, Peter Tiño, A. Leonardis, K. Tang | FaML, XAI | 134 654 0 | 28 Dec 2020
e-SNLI: Natural Language Inference with Natural Language Explanations | Oana-Maria Camburu, Tim Rocktäschel, Thomas Lukasiewicz, Phil Blunsom | LRM | 252 618 0 | 04 Dec 2018
What you can cram into a single vector: Probing sentence embeddings for linguistic properties | Alexis Conneau, Germán Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni | - | 199 876 0 | 03 May 2018
Towards A Rigorous Science of Interpretable Machine Learning | Finale Doshi-Velez, Been Kim | XAI, FaML | 225 3,658 0 | 28 Feb 2017
Efficient Estimation of Word Representations in Vector Space | Tomáš Mikolov, Kai Chen, G. Corrado, J. Dean | 3DV | 228 31,150 0 | 16 Jan 2013