Explainability for Large Language Models: A Survey

ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2023
2 September 2023
Haiyan Zhao
Hanjie Chen
Fan Yang
Ninghao Liu
Huiqi Deng
Hengyi Cai
Shuaiqiang Wang
D. Yin
Jundong Li
    LRM

Papers citing "Explainability for Large Language Models: A Survey"

50 / 287 papers shown
Meta-Models: An Architecture for Decoding LLM Behaviors Through Interpreted Embeddings and Natural Language
Anthony Costarelli
Mat Allen
Severin Field
274
5
0
03 Oct 2024
F-Fidelity: A Robust Framework for Faithfulness Evaluation of Explainable AI
International Conference on Learning Representations (ICLR), 2024
Xu Zheng
Farhad Shirani
Zhuomin Chen
Chaohao Lin
Wei Cheng
Wenbo Guo
Dongsheng Luo
AAML
394
14
0
03 Oct 2024
Enhancing Training Data Attribution for Large Language Models with Fitting Error Consideration
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Kangxi Wu
Liang Pang
Huawei Shen
Xueqi Cheng
TDI
260
5
0
02 Oct 2024
Beyond Single Concept Vector: Modeling Concept Subspace in LLMs with Gaussian Distribution
International Conference on Learning Representations (ICLR), 2024
Haiyan Zhao
Heng Zhao
Bo Shen
Ali Payani
Fan Yang
Mengnan Du
425
16
0
30 Sep 2024
Concept-Based Explanations in Computer Vision: Where Are We and Where Could We Go?
Jae Hee Lee
Georgii Mikriukov
Gesina Schwalbe
Stefan Wermter
D. Wolter
295
11
0
20 Sep 2024
Local Explanations and Self-Explanations for Assessing Faithfulness in black-box LLMs
Hellenic Conference on Artificial Intelligence (HAI), 2024
Christos Fragkathoulas
Odysseas S. Chlapanis
LRM
158
2
0
18 Sep 2024
SimSUM: Simulated Benchmark with Structured and Unstructured Medical Records
Paloma Rabaey
Stefan Heytens
284
2
0
13 Sep 2024
Cross-Refine: Improving Natural Language Explanation Generation by Learning in Tandem
International Conference on Computational Linguistics (COLING), 2024
Qianli Wang
Tatiana Anikina
Nils Feldhus
Simon Ostermann
Sebastian Möller
Vera Schmitt
LRM
228
11
0
11 Sep 2024
From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning
International Conference on Machine Learning (ICML), 2024
Wei Chen
Zhen Huang
Liang Xie
Binbin Lin
Houqiang Li
...
Deng Cai
Yonggang Zhang
Wenxiao Wang
Xu Shen
Jieping Ye
340
32
0
03 Sep 2024
A Survey of Large Language Models for European Languages
Wazir Ali
S. Pyysalo
379
6
0
27 Aug 2024
Relation Also Knows: Rethinking the Recall and Editing of Factual Associations in Auto-Regressive Transformer Language Models
AAAI Conference on Artificial Intelligence (AAAI), 2024
Xiyu Liu
Zhengxiao Liu
Naibin Gu
Zheng Lin
Wanli Ma
Ji Xiang
Weiping Wang
KELM
424
3
0
27 Aug 2024
Defending against Jailbreak through Early Exit Generation of Large Language Models
Chongwen Zhao
Zhihao Dou
Kaizhu Huang
AAML
238
3
0
21 Aug 2024
Visual Agents as Fast and Slow Thinkers
International Conference on Learning Representations (ICLR), 2024
Guangyan Sun
Haoyang Ling
Zhenting Wang
Cheng-Long Wang
Siqi Ma
Qifan Wang
Ying Nian Wu
Dongfang Liu
LLMAG
LRM
529
43
0
16 Aug 2024
Prompt Recursive Search: A Living Framework with Adaptive Growth in LLM Auto-Prompting
Xiangyu Zhao
Chengqian Ma
175
2
0
02 Aug 2024
Automated Software Vulnerability Static Code Analysis Using Generative Pre-Trained Transformer Models
IACR Cryptology ePrint Archive (IACR ePrint), 2024
Elijah Pelofske
Vincent Urias
L. Liebrock
145
2
0
31 Jul 2024
LLMs for Enhanced Agricultural Meteorological Recommendations
Ji-jun Park
Soo-joon Choi
236
3
0
30 Jul 2024
Interpretable Pre-Trained Transformers for Heart Time-Series Data
H. Davies
James Monsen
Danilo P. Mandic
AI4TS
MedIm
131
5
0
30 Jul 2024
Monetizing Currency Pair Sentiments through LLM Explainability
Lior Limonad
Fabiana Fournier
Juan Manuel Vera
Inna Skarbovsky
Shlomit Gur
Raquel Lazcano
125
2
0
29 Jul 2024
AgentPeerTalk: Empowering Students through Agentic-AI-Driven Discernment of Bullying and Joking in Peer Interactions in Schools
Aditya Paul
Chi Lok Yu
Eva Adelina Susanto
Nicholas Wai Long Lau
Gwenyth Isobel Meadows
LLMAG
261
7
0
27 Jul 2024
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs
Nitay Calderon
Roi Reichart
358
24
0
27 Jul 2024
Fairness Definitions in Language Models Explained
Thang Viet Doan
Zhibo Chu
Sribala Vidyadhari Chinta
Wenbin Zhang
ALM
351
20
0
26 Jul 2024
Knowledge Mechanisms in Large Language Models: A Survey and Perspective
Meng Wang
Yunzhi Yao
Ziwen Xu
Shuofei Qiao
Shumin Deng
...
Yong Jiang
Pengjun Xie
Fei Huang
Huajun Chen
Ningyu Zhang
332
60
0
22 Jul 2024
MAVEN-Fact: A Large-scale Event Factuality Detection Dataset
Chunyang Li
Hao Peng
Xiaozhi Wang
Yunjia Qi
Lei Hou
Bin Xu
Juanzi Li
HILM
263
5
0
22 Jul 2024
XAI meets LLMs: A Survey of the Relation between Explainable AI and Large Language Models
Xiaoshi Zhong
Lorenzo Malandri
Fabio Mercorio
Navid Nobani
Andrea Seveso
330
37
0
21 Jul 2024
Prover-Verifier Games improve legibility of LLM outputs
Jan Hendrik Kirchner
Yining Chen
Harri Edwards
Jan Leike
Nat McAleese
Yuri Burda
LRM
AAML
278
51
0
18 Jul 2024
A Survey on Symbolic Knowledge Distillation of Large Language Models
Kamal Acharya
Alvaro Velasquez
Haoze Song
SyDa
278
23
0
12 Jul 2024
DeepCodeProbe: Towards Understanding What Models Trained on Code Learn
Vahid Majdinasab
Amin Nikanjam
Foutse Khomh
240
1
0
11 Jul 2024
Towards Explainable Evolution Strategies with Large Language Models
Jill Baumann
Oliver Kramer
154
0
0
11 Jul 2024
Grounding and Evaluation for Large Language Models: Practical Challenges and Lessons Learned (Survey)
K. Kenthapadi
M. Sameki
Ankur Taly
HILM
ELM
AILaw
219
33
0
10 Jul 2024
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models
Jinliang Lu
Ziliang Pang
Min Xiao
Yaochen Zhu
Rui Xia
Jiajun Zhang
MoMe
384
48
0
08 Jul 2024
Cognitive Modeling with Scaffolded LLMs: A Case Study of Referential Expression Generation
Polina Tsvilodub
Michael Franke
Fausto Carcassi
175
2
0
04 Jul 2024
A Survey on Trustworthiness in Foundation Models for Medical Image Analysis
Congzhen Shi
Ryan Rezai
Jiaxi Yang
Qi Dou
Xiaoxiao Li
MedIm
223
17
0
03 Jul 2024
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models
Daking Rai
Yilun Zhou
Shi Feng
Abulhair Saparov
Ziyu Yao
628
81
0
02 Jul 2024
When Search Engine Services meet Large Language Models: Visions and Challenges
Haoyi Xiong
Jiang Bian
Yuchen Li
Xuhong Li
Jundong Li
Shuaiqiang Wang
D. Yin
Sumi Helal
352
79
0
28 Jun 2024
Enabling Regional Explainability by Automatic and Model-agnostic Rule Extraction
Yu Chen
Tianyu Cui
Alexander Capstick
Nan Fletcher-Loyd
Payam Barnaghi
251
1
0
25 Jun 2024
RankAdaptor: Hierarchical Dynamic Low-Rank Adaptation for Structural Pruned LLMs
Changhai Zhou
Shijie Han
Shiyang Zhang
Shichao Weng
Zekai Liu
Cheng Jin
189
1
0
22 Jun 2024
Retrieval-Augmented Generation for Generative Artificial Intelligence in Medicine
Rui Yang
Yilin Ning
Emilia Keppo
Mingxuan Liu
Chuan Hong
Danielle S Bitterman
J. Ong
Daniel Ting
Nan Liu
3DV
MedIm
RALM
173
11
0
18 Jun 2024
D2O: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models
Zhongwei Wan
Xinjian Wu
Yu Zhang
Yi Xin
Chaofan Tao
...
Xin Wang
Siqi Luo
Jing Xiong
Mi Zhang
392
5
0
18 Jun 2024
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts
Junmo Kang
Leonid Karlinsky
Hongyin Luo
Zhen Wang
Jacob A. Hansen
James Glass
David D. Cox
Yikang Shen
Rogerio Feris
Alan Ritter
MoMe
MoE
227
19
0
17 Jun 2024
Applications of Generative AI in Healthcare: algorithmic, ethical, legal and societal considerations
Onyekachukwu R. Okonji
Kamol Yunusov
Bonnie Gordon
MedIm
206
17
0
15 Jun 2024
How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Zhenhong Zhou
Haiyang Yu
Xinghua Zhang
Rongwu Xu
Fei Huang
Yongbin Li
375
76
0
09 Jun 2024
Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas
Chengyuan Deng
Yiqun Duan
Xin Jin
Heng Chang
Yijun Tian
...
Kuofeng Gao
Sihong He
Jun Zhuang
Lu Cheng
Haohan Wang
AILaw
264
28
0
08 Jun 2024
POEM: Interactive Prompt Optimization for Enhancing Multimodal Reasoning of Large Language Models
Jianben He
Xingbo Wang
Shiyi Liu
Guande Wu
Claudio Silva
Huamin Qu
LRM
242
6
0
06 Jun 2024
A Survey of Language-Based Communication in Robotics
William Hunt
Sarvapali D. Ramchurn
Mohammad D. Soorati
LM&Ro
691
17
0
06 Jun 2024
I've got the "Answer"! Interpretation of LLMs Hidden States in Question Answering
Valeriya Goloviznina
Evgeny Kotelnikov
92
4
0
04 Jun 2024
CheckEmbed: Effective Verification of LLM Solutions to Open-Ended Tasks
Maciej Besta
Lorenzo Paleari
Marcin Copik
Robert Gerstenberger
Aleš Kubíček
...
Eric Schreiber
Tomasz Lehmann
H. Niewiadomski
Torsten Hoefler
672
11
0
04 Jun 2024
Position: Cracking the Code of Cascading Disparity Towards Marginalized Communities
G. Farnadi
Mohammad Havaei
Negar Rostamzadeh
324
3
0
03 Jun 2024
Understanding Token Probability Encoding in Output Embeddings
Hakaze Cho
Yoshihiro Sakai
Kenshiro Tanaka
Mariko Kato
Naoya Inoue
294
3
0
03 Jun 2024
Towards Practical Single-shot Motion Synthesis
Konstantinos Roditakis
Spyridon Thermos
N. Zioulis
VGen
366
1
0
03 Jun 2024
Empirical influence functions to understand the logic of fine-tuning
Jordan K Matelsky
Lyle Ungar
Konrad Paul Kording
208
0
0
01 Jun 2024