2309.01029
Explainability for Large Language Models: A Survey
ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2023
2 September 2023
Haiyan Zhao
Hanjie Chen
Fan Yang
Ninghao Liu
Huiqi Deng
Hengyi Cai
Shuaiqiang Wang
D. Yin
Jundong Li
LRM
Papers citing
"Explainability for Large Language Models: A Survey"
50 / 287 papers shown
Title
A case for data valuation transparency via DValCards
Keziah Naggita
Julienne LaChance
TDI
352
0
0
29 Jun 2025
Beyond Autocomplete: Designing CopilotLens Towards Transparent and Explainable AI Coding Agents
Runlong Ye
Zeling Zhang
Boushra Almazroua
Michael Liut
190
0
0
24 Jun 2025
Understanding Software Engineering Agents: A Study of Thought-Action-Result Trajectories
Islem Bouzenia
Michael Pradel
LLMAG
164
6
0
23 Jun 2025
Human-Aligned Faithfulness in Toxicity Explanations of LLMs
Ramaravind Kommiya Mothilal
Joanna Roy
Syed Ishtiaque Ahmed
Shion Guha
128
0
0
23 Jun 2025
Intrinsic and Extrinsic Organized Attention: Softmax Invariance and Network Sparsity
Oluwadamilola Fasina
Ruben V.C. Pohle
Pei-Chun Su
Ronald R. Coifman
125
0
0
18 Jun 2025
Cohort Discovery: A Survey on LLM-Assisted Clinical Trial Recruitment
Shrestha Ghosh
Moritz Schneider
Carina Reinicke
Carsten Eickhoff
203
0
0
18 Jun 2025
SoK: The Privacy Paradox of Large Language Models: Advancements, Privacy Risks, and Mitigation
ACM Asia Conference on Computer and Communications Security (AsiaCCS), 2025
Yashothara Shanmugarasa
Ming Ding
M. Chamikara
Thierry Rakotoarivelo
PILM
AILaw
370
7
0
15 Jun 2025
Predicting Early-Onset Colorectal Cancer with Large Language Models
Wilson Lau
Youngwon Kim
Sravanthi Parasa
Md Enamul Haque
Anand Oka
Jay Nanduri
65
0
0
13 Jun 2025
Provoking Multi-modal Few-Shot LVLM via Exploration-Exploitation In-Context Learning
Computer Vision and Pattern Recognition (CVPR), 2025
C. L. Philip Chen
Yunpeng Zhai
Yifan Zhao
Jinyang Gao
Bolin Ding
Jia Li
192
1
0
11 Jun 2025
Towards Large Language Models with Self-Consistent Natural Language Explanations
Sahar Admoni
Ofra Amir
Assaf Hallak
Yftah Ziser
LRM
148
2
0
09 Jun 2025
United Minds or Isolated Agents? Exploring Coordination of LLMs under Cognitive Load Theory
HaoYang Shang
Xuan Liu
Zi Liang
J. Zhang
Haibo Hu
Song Guo
LLMAG
198
5
0
07 Jun 2025
Interpretation Meets Safety: A Survey on Interpretation Methods and Tools for Improving LLM Safety
Seongmin Lee
Aeree Cho
Grace C. Kim
ShengYun Peng
Mansi Phute
Duen Horng Chau
LM&MA
AI4CE
261
3
0
05 Jun 2025
Trustworthy Medical Question Answering: An Evaluation-Centric Survey
Yinuo Wang
Robert E. Mercer
Frank Rudzicz
Sudipta Singha Roy
Pengjie Ren
Zhumin Chen
Xindi Wang
ELM
208
2
0
04 Jun 2025
TracLLM: A Generic Framework for Attributing Long Context LLMs
Yanting Wang
Wei Zou
Runpeng Geng
Jinyuan Jia
LLMAG
459
3
0
04 Jun 2025
Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Kejian Zhu
Shangqing Tu
Zhuoran Jin
Lei Hou
Juanzi Li
Jun Zhao
KELM
207
0
0
04 Jun 2025
RACE-Align: Retrieval-Augmented and Chain-of-Thought Enhanced Preference Alignment for Large Language Models
Qihang Yan
Xinyu Zhang
Luming Guo
Tao Gui
Feifan Liu
AI4TS
LRM
154
0
0
03 Jun 2025
MemOS: An Operating System for Memory-Augmented Generation (MAG) in Large Language Models
Zhiyu Li
Shichao Song
Hanyu Wang
Simin Niu
Ding Chen
...
Qingchen Yu
Bo Tang
Hongkang Yang
Zhi-hai Xu
Feiyu Xiong
RALM
287
12
0
28 May 2025
Multi-Domain Explainability of Preferences
Nitay Calderon
Liat Ein-Dor
Roi Reichart
LRM
302
0
0
26 May 2025
Response Uncertainty and Probe Modeling: Two Sides of the Same Coin in LLM Interpretability?
Yongjie Wang
Yibo Wang
Xin Zhou
Zhiqi Shen
189
1
0
24 May 2025
ALPS: Attention Localization and Pruning Strategy for Efficient Alignment of Large Language Models
Hao Chen
Haoze Li
Zhiqing Xiao
Lirong Gao
Qi Zhang
Xiaomeng Hu
Ningtao Wang
Xing Fu
Junbo Zhao
548
0
0
24 May 2025
SAE-SSV: Supervised Steering in Sparse Representation Spaces for Reliable Control of Language Models
Zirui He
Haoyang Ling
Bo Shen
Ali Payani
Zelong Li
Mengnan Du
LLMSV
412
7
0
22 May 2025
Relative Bias: A Comparative Framework for Quantifying Bias in LLMs
Alireza Arbabi
Florian Kerschbaum
313
0
0
22 May 2025
Pierce the Mists, Greet the Sky: Decipher Knowledge Overshadowing via Knowledge Circuit Analysis
Haoming Huang
Yibo Yan
Jiahao Huo
Xin Zou
Xinfeng Li
Kun Wang
Xuming Hu
541
1
0
20 May 2025
Integration of TinyML and LargeML: A Survey of 6G and Beyond
Thai-Hoc Vu
Ngo Hoang Tu
Thien Huynh-The
Kyungchun Lee
Sunghwan Kim
Miroslav Voznak
Quoc-Viet Pham
246
1
0
20 May 2025
Through a Compressed Lens: Investigating the Impact of Quantization on LLM Explainability and Interpretability
Qianli Wang
Mingyang Wang
Nils Feldhus
Simon Ostermann
Yuan Cao
Hinrich Schütze
Sebastian Möller
Vera Schmitt
MQ
212
2
0
20 May 2025
Towards Budget-Friendly Model-Agnostic Explanation Generation for Large Language Models
Junhao Liu
Haonan Yu
Xin Zhang
LRM
352
1
0
18 May 2025
Fixed Point Explainability
Emanuele La Malfa
Jon Vadillo
Marco Molinari
Michael Wooldridge
392
0
0
18 May 2025
Can an Easy-to-Hard Curriculum Make Reasoning Emerge in Small Language Models? Evidence from a Four-Stage Curriculum on GPT-2
Xiang Fu
ReLM
LRM
210
1
0
16 May 2025
A Comprehensive Analysis of Large Language Model Outputs: Similarity, Diversity, and Bias
Brandon Smith
Mohamed Reda Bouadjenek
Tahsin Alamgir Kheya
Phillip Dawson
S. Aryal
ALM
ELM
337
2
0
14 May 2025
CrashSage: A Large Language Model-Centered Framework for Contextual and Interpretable Traffic Crash Analysis
Hao Zhen
Jidong J. Yang
184
3
0
08 May 2025
Retrieval Augmented Generation Evaluation for Health Documents
Mario Ceresa
Lorenzo Bertolini
Valentin Comte
Nicholas Spadaro
Barbara Raffael
...
Sergio Consoli
Amalia Muñoz Piñeiro
Alex Patak
Maddalena Querci
Tobias Wiesenthal
RALM
3DV
283
2
1
07 May 2025
Privacy Risks and Preservation Methods in Explainable Artificial Intelligence: A Scoping Review
Sonal Allana
Mohan Kankanhalli
Rozita Dara
335
3
0
05 May 2025
Gender Bias in Explainability: Investigating Performance Disparity in Post-hoc Methods
Conference on Fairness, Accountability and Transparency (FAccT), 2025
Mahdi Dhaini
Ege Erdogan
Nils Feldhus
Gjergji Kasneci
297
1
0
02 May 2025
LLM Security: Vulnerabilities, Attacks, Defenses, and Countermeasures
Francisco Aguilera-Martínez
Fernando Berzal
PILM
370
8
0
02 May 2025
XBreaking: Understanding how LLMs security alignment can be broken
Marco Arazzi
Vignesh Kumar Kembu
Antonino Nocera
V. P.
451
0
0
30 Apr 2025
Bi-directional Model Cascading with Proxy Confidence
David Warren
Mark Dras
248
1
0
27 Apr 2025
Beyond Public Access in LLM Pre-Training Data
Sruly Rosenblat
Tim O'Reilly
Ilan Strauss
MLAU
516
3
0
24 Apr 2025
Interpretable Locomotion Prediction in Construction Using a Memory-Driven LLM Agent With Chain-of-Thought Reasoning
Ehsan Ahmadi
Chao Wang
107
1
0
21 Apr 2025
CLIP-Powered Domain Generalization and Domain Adaptation: A Comprehensive Survey
Jindong Li
Yongqian Li
Yali Fu
Jiahong Liu
Yixin Liu
Menglin Yang
Irwin King
VLM
304
2
0
19 Apr 2025
Probing then Editing Response Personality of Large Language Models
Tianjie Ju
Zhenyu Shao
Binghai Wang
Yulin Chen
Zhuosheng Zhang
Hao Fei
Yang Deng
Wynne Hsu
Sufeng Duan
Gongshen Liu
KELM
353
3
0
14 Apr 2025
Linguistic Interpretability of Transformer-based Language Models: a systematic review
Miguel López-Otal
Jorge Gracia
Jordi Bernad
Carlos Bobed
Lucía Pitarch-Ballesteros
Emma Anglés-Herrero
VLM
324
6
0
09 Apr 2025
LExT: Towards Evaluating Trustworthiness of Natural Language Explanations
Conference on Fairness, Accountability and Transparency (FAccT), 2025
Krithi Shailya
Shreya Rajpal
Gokul S Krishnan
Balaraman Ravindran
ELM
321
6
0
08 Apr 2025
An overview of model uncertainty and variability in LLM-based sentiment analysis. Challenges, mitigation strategies and the role of explainability
Frontiers in Artificial Intelligence (Front. Artif. Intell.), 2025
David Herrera-Poyatos
Carlos Peláez-González
Cristina Zuheros
Andrés Herrera-Poyatos
Virilo Tejedor
F. Herrera
Rosana Montes
253
15
0
06 Apr 2025
Verification of Autonomous Neural Car Control with KeYmaera X
International Conference on Abstract State Machines, Alloy, B, TLA, VDM, and Z (ABZ), 2025
Enguerrand Prebet
Samuel Teuber
André Platzer
251
1
0
04 Apr 2025
Digital Forensics in the Age of Large Language Models
Zhipeng Yin
Sribala Vidyadhari Chinta
Weifeng Xu
Jun Zhuang
Pallab Mozumder
Antoinette Smith
Wenbin Zhang
AILaw
234
15
0
03 Apr 2025
An evaluation of LLMs and Google Translate for translation of selected Indian languages via sentiment and semantic analyses
IEEE Access, 2025
Rohitash Chandra
Aryan Chaudhary
Yeshwanth Rayavarapu
353
4
0
27 Mar 2025
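Towards the Interpretability of Early Detection of Depression Risks Using Large Language Models (original title: Hacia la interpretabilidad de la detección anticipada de riesgos de depresión utilizando grandes modelos de lenguaje)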
Horacio Thompson
Maximiliano Sapino
Edgardo Ferretti
Marcelo Errecalde
108
1
0
26 Mar 2025
Leveraging Large Language Models for Explainable Activity Recognition in Smart Homes: A Critical Evaluation
ACM Transactions on Internet of Things (ACM TIOT), 2025
Michele Fiori
Gabriele Civitarese
Priyankar Choudhary
Claudio Bettini
203
3
0
20 Mar 2025
Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning
Zhaowei Liu
X. Guo
Fangqi Lou
Lingfeng Zeng
Jinyi Niu
...
Xueqian Zhao
Chao Li
Sheng Xu
Dezhi Chen
Yun Chen
ReLM
AIFin
OffRL
AI4TS
LRM
288
48
0
20 Mar 2025
Using LLMs for Automated Privacy Policy Analysis: Prompt Engineering, Fine-Tuning and Explainability
Yuxin Chen
Peng Tang
Weidong Qiu
Shujun Li
157
1
0
16 Mar 2025