Interpreting Language Models with Contrastive Explanations

21 February 2022
Kayo Yin
Graham Neubig
MILM

Papers citing "Interpreting Language Models with Contrastive Explanations"

18 / 18 papers shown
Contextures: Representations from Contexts
Runtian Zhai
Kai Yang
Che-Ping Tsai
Burak Varici
Zico Kolter
Pradeep Ravikumar
110
0
0
02 May 2025
On the Consistency of Multilingual Context Utilization in Retrieval-Augmented Generation
Jirui Qi
Raquel Fernández
Arianna Bisazza
RALM
58
0
0
01 Apr 2025
Making Them a Malicious Database: Exploiting Query Code to Jailbreak Aligned Large Language Models
Qingsong Zou
Jingyu Xiao
Qing Li
Zhi Yan
Y. Wang
Li Xu
Wenxuan Wang
Kuofeng Gao
Ruoyu Li
Yong-jia Jiang
AAML
165
0
0
21 Feb 2025
Counterfactuals As a Means for Evaluating Faithfulness of Attribution Methods in Autoregressive Language Models
Sepehr Kamahi
Yadollah Yaghoobzadeh
39
0
0
21 Aug 2024
CELL your Model: Contrastive Explanations for Large Language Models
Ronny Luss
Erik Miehling
Amit Dhurandhar
45
0
0
17 Jun 2024
MambaLRP: Explaining Selective State Space Sequence Models
F. Jafari
G. Montavon
Klaus-Robert Müller
Oliver Eberle
Mamba
56
9
0
11 Jun 2024
Establishing degrees of closeness between audio recordings along different dimensions using large-scale cross-lingual models
Maxime Fily
Guillaume Wisniewski
Severine Guillaume
Gilles Adda
Alexis Michaud
22
1
0
08 Feb 2024
LLM-based NLG Evaluation: Current Status and Challenges
Mingqi Gao
Xinyu Hu
Jie Ruan
Xiao Pu
Xiaojun Wan
ELM
LM&MA
55
29
0
02 Feb 2024
ALMANACS: A Simulatability Benchmark for Language Model Explainability
Edmund Mills
Shiye Su
Stuart J. Russell
Scott Emmons
46
7
0
20 Dec 2023
Interpreting Pretrained Language Models via Concept Bottlenecks
Zhen Tan
Lu Cheng
Song Wang
Yuan Bo
Jundong Li
Huan Liu
LRM
29
20
0
08 Nov 2023
Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations
Yanda Chen
Ruiqi Zhong
Narutatsu Ri
Chen Zhao
He He
Jacob Steinhardt
Zhou Yu
Kathleen McKeown
LRM
26
47
0
17 Jul 2023
Causal interventions expose implicit situation models for commonsense language understanding
Takateru Yamakoshi
James L. McClelland
A. Goldberg
Robert D. Hawkins
17
6
0
06 Jun 2023
Explaining How Transformers Use Context to Build Predictions
Javier Ferrando
Gerard I. Gállego
Ioannis Tsiamas
Marta R. Costa-jussá
27
31
0
21 May 2023
Surfacing Biases in Large Language Models using Contrastive Input Decoding
G. Yona
Or Honovich
Itay Laish
Roee Aharoni
27
11
0
12 May 2023
Faithful Chain-of-Thought Reasoning
Qing Lyu
Shreya Havaldar
Adam Stein
Li Zhang
D. Rao
Eric Wong
Marianna Apidianaki
Chris Callison-Burch
ReLM
LRM
24
207
0
31 Jan 2023
Mediators: Conversational Agents Explaining NLP Model Behavior
Nils Feldhus
A. Ravichandran
Sebastian Möller
27
16
0
13 Jun 2022
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
253
1,986
0
31 Dec 2020
Towards A Rigorous Science of Interpretable Machine Learning
Finale Doshi-Velez
Been Kim
XAI
FaML
242
3,681
0
28 Feb 2017