ResearchTrend.AI
Discretized Integrated Gradients for Explaining Language Models
arXiv:2108.13654 · 31 August 2021
Soumya Sanyal, Xiang Ren
Topics: FAtt

Papers citing "Discretized Integrated Gradients for Explaining Language Models" (32 papers)
  1. Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations
     Yiyou Sun, Y. Gai, Lijie Chen, Abhilasha Ravichander, Yejin Choi, D. Song · HILM · 17 Apr 2025
  2. Reasoning-Grounded Natural Language Explanations for Language Models
     Vojtech Cahlik, Rodrigo Alves, Pavel Kordík · LRM · 14 Mar 2025
  3. Can Input Attributions Interpret the Inductive Reasoning Process Elicited in In-Context Learning?
     Mengyu Ye, Tatsuki Kuribayashi, Goro Kobayashi, Jun Suzuki · LRM · 20 Dec 2024
  4. Uniform Discretized Integrated Gradients: An effective attribution based method for explaining large language models
     Swarnava Sinha Roy, Ayan Kundu · FAtt · 05 Dec 2024
  5. One Mind, Many Tongues: A Deep Dive into Language-Agnostic Knowledge Neurons in Large Language Models
     Pengfei Cao, Yuheng Chen, Zhuoran Jin, Yubo Chen, Kang-Jun Liu, Jun Zhao · KELM · 26 Nov 2024
  6. Counterfactuals As a Means for Evaluating Faithfulness of Attribution Methods in Autoregressive Language Models
     Sepehr Kamahi, Yadollah Yaghoobzadeh · 21 Aug 2024
  7. Hard to Explain: On the Computational Hardness of In-Distribution Model Interpretation
     Guy Amir, Shahaf Bassan, Guy Katz · 07 Aug 2024
  8. "Sorry, Come Again?" Prompting -- Enhancing Comprehension and Diminishing Hallucination with [PAUSE]-injected Optimal Paraphrasing
     Vipula Rawte, Islam Tonmoy, M. M. Zaman, Prachi Priya, Marcin Kardas, Alan Schelten, Ruan Silva · LRM · 27 Mar 2024
  9. PE: A Poincare Explanation Method for Fast Text Hierarchy Generation
     Qian Chen, Dongyang Li, Xiaofeng He, Hongzhao Li, Hongyu Yi · 25 Mar 2024
  10. Backward Lens: Projecting Language Model Gradients into the Vocabulary Space
      Shahar Katz, Yonatan Belinkov, Mor Geva, Lior Wolf · 20 Feb 2024
  11. Identification of Knowledge Neurons in Protein Language Models
      Divya Nori, Shivali Singireddy, M. T. Have · MILM · 17 Dec 2023
  12. CIDR: A Cooperative Integrated Dynamic Refining Method for Minimal Feature Removal Problem
      Qian Chen, Tao Zhang, Dongyang Li, Xiaofeng He · 13 Dec 2023
  13. An Attribution Method for Siamese Encoders
      Lucas Moller, Dmitry Nikolaev, Sebastian Padó · 09 Oct 2023
  14. Explaining Speech Classification Models via Word-Level Audio Segments and Paralinguistic Features
      Eliana Pastor, Alkis Koudounas, Giuseppe Attanasio, Dirk Hovy, Elena Baralis · 14 Sep 2023
  15. Explainability for Large Language Models: A Survey
      Haiyan Zhao, Hanjie Chen, Fan Yang, Ninghao Liu, Huiqi Deng, Hengyi Cai, Shuaiqiang Wang, Dawei Yin, Mengnan Du · LRM · 02 Sep 2023
  16. Journey to the Center of the Knowledge Neurons: Discoveries of Language-Independent Knowledge Neurons and Degenerate Knowledge Neurons
      Yuheng Chen, Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao · KELM · 25 Aug 2023
  17. Time Interpret: a Unified Model Interpretability Library for Time Series
      Joseph Enguehard · FAtt, AI4TS · 05 Jun 2023
  18. Sequential Integrated Gradients: a simple but effective method for explaining language models
      Joseph Enguehard · 25 May 2023
  19. Token-wise Decomposition of Autoregressive Language Model Hidden States for Analyzing Model Predictions
      Byung-Doh Oh, William Schuler · 17 May 2023
  20. Inseq: An Interpretability Toolkit for Sequence Generation Models
      Gabriele Sarti, Nils Feldhus, Ludwig Sickert, Oskar van der Wal, Malvina Nissim, Arianna Bisazza · 27 Feb 2023
  21. Comparing Baseline Shapley and Integrated Gradients for Local Explanation: Some Additional Insights
      Tianshu Feng, Zhipu Zhou, Tarun Joshi, V. Nair · FAtt · 12 Aug 2022
  22. Generalizability Analysis of Graph-based Trajectory Predictor with Vectorized Representation
      Juanwu Lu, Wei Zhan, M. Tomizuka, Yeping Hu · 06 Aug 2022
  23. ferret: a Framework for Benchmarking Explainers on Transformers
      Giuseppe Attanasio, Eliana Pastor, C. Bonaventura, Debora Nozza · 02 Aug 2022
  24. FRAME: Evaluating Rationale-Label Consistency Metrics for Free-Text Rationales
      Aaron Chan, Shaoliang Nie, Liang Tan, Xiaochang Peng, Hamed Firooz, Maziar Sanjabi, Xiang Ren · 02 Jul 2022
  25. SBERT studies Meaning Representations: Decomposing Sentence Embeddings into Explainable Semantic Features
      Juri Opitz, Anette Frank · 14 Jun 2022
  26. ER-Test: Evaluating Explanation Regularization Methods for Language Models
      Brihi Joshi, Aaron Chan, Ziyi Liu, Shaoliang Nie, Maziar Sanjabi, Hamed Firooz, Xiang Ren · AAML · 25 May 2022
  27. FaiRR: Faithful and Robust Deductive Reasoning over Natural Language
      Soumya Sanyal, Harman Singh, Xiang Ren · ReLM, LRM · 19 Mar 2022
  28. UNIREX: A Unified Learning Framework for Language Model Rationale Extraction
      Aaron Chan, Maziar Sanjabi, Lambert Mathias, L Tan, Shaoliang Nie, Xiaochang Peng, Xiang Ren, Hamed Firooz · 16 Dec 2021
  29. The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations
      Peter Hase, Harry Xie, Mohit Bansal · OODD, LRM, FAtt · 01 Jun 2021
  30. Connecting Attributions and QA Model Behavior on Realistic Counterfactuals
      Xi Ye, Rohan Nair, Greg Durrett · 09 Apr 2021
  31. Investigating Saturation Effects in Integrated Gradients
      Vivek Miglani, Narine Kokhlikyan, B. Alsallakh, Miguel Martin, Orion Reblitz-Richardson · FAtt · 23 Oct 2020
  32. Towards A Rigorous Science of Interpretable Machine Learning
      Finale Doshi-Velez, Been Kim · XAI, FaML · 28 Feb 2017