Explainability for Large Language Models: A Survey

ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2023

2 September 2023

Haiyan Zhao

Hanjie Chen

Fan Yang

Ninghao Liu

Papers citing "Explainability for Large Language Models: A Survey"

50 / 287 papers shown

Meta-Models: An Architecture for Decoding LLM Behaviors Through Interpreted Embeddings and Natural Language

Anthony Costarelli

Mat Allen

Severin Field

274

03 Oct 2024

F-Fidelity: A Robust Framework for Faithfulness Evaluation of Explainable AI

International Conference on Learning Representations (ICLR), 2024

Farhad Shirani

394

03 Oct 2024

Enhancing Training Data Attribution for Large Language Models with Fitting Error Consideration

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

260

02 Oct 2024

Beyond Single Concept Vector: Modeling Concept Subspace in LLMs with Gaussian Distribution

International Conference on Learning Representations (ICLR), 2024

425

30 Sep 2024

Concept-Based Explanations in Computer Vision: Where Are We and Where Could We Go?

Stefan Wermter

295

20 Sep 2024

Local Explanations and Self-Explanations for Assessing Faithfulness in black-box LLMs

Hellenic Conference on Artificial Intelligence (HAI), 2024

Christos Fragkathoulas

Odysseas S. Chlapanis

LRM

158

18 Sep 2024

SimSUM: Simulated Benchmark with Structured and Unstructured Medical Records

Paloma Rabaey

Stefan Heytens

284

13 Sep 2024

Cross-Refine: Improving Natural Language Explanation Generation by Learning in Tandem

International Conference on Computational Linguistics (COLING), 2024

228

11 Sep 2024

From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning

International Conference on Machine Learning (ICML), 2024

Wei Chen

Zhen Huang

Liang Xie

Binbin Lin

Houqiang Li

...

Deng Cai

Yonggang Zhang

Wenxiao Wang

Xu Shen

Jieping Ye

340

03 Sep 2024

A Survey of Large Language Models for European Languages

Wazir Ali

S. Pyysalo

379

27 Aug 2024

Relation Also Knows: Rethinking the Recall and Editing of Factual Associations in Auto-Regressive Transformer Language Models

AAAI Conference on Artificial Intelligence (AAAI), 2024

Weiping Wang

424

27 Aug 2024

Defending against Jailbreak through Early Exit Generation of Large Language Models

238

21 Aug 2024

Visual Agents as Fast and Slow Thinkers

International Conference on Learning Representations (ICLR), 2024

Zhenting Wang

529

16 Aug 2024

Prompt Recursive Search: A Living Framework with Adaptive Growth in LLM Auto-Prompting

Xiangyu Zhao

Chengqian Ma

175

02 Aug 2024

Automated Software Vulnerability Static Code Analysis Using Generative Pre-Trained Transformer Models

IACR Cryptology ePrint Archive (IACR ePrint), 2024

Elijah Pelofske

Vincent Urias

L. Liebrock

145

31 Jul 2024

LLMs for Enhanced Agricultural Meteorological Recommendations

Ji-jun Park

Soo-joon Choi

236

30 Jul 2024

Interpretable Pre-Trained Transformers for Heart Time-Series Data

131

30 Jul 2024

Monetizing Currency Pair Sentiments through LLM Explainability

125

29 Jul 2024

AgentPeerTalk: Empowering Students through Agentic-AI-Driven Discernment of Bullying and Joking in Peer Interactions in Schools

Aditya Paul

Chi Lok Yu

Eva Adelina Susanto

Nicholas Wai Long Lau

Gwenyth Isobel Meadows

LLMAG

261

27 Jul 2024

On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs

Nitay Calderon

Roi Reichart

358

27 Jul 2024

Fairness Definitions in Language Models Explained

Thang Viet Doan

Zhibo Chu

Sribala Vidyadhari Chinta

Wenbin Zhang

ALM

351

26 Jul 2024

Knowledge Mechanisms in Large Language Models: A Survey and Perspective

Shumin Deng

...

Yong Jiang

Pengjun Xie

Fei Huang

Huajun Chen

Ningyu Zhang

332

22 Jul 2024

MAVEN-Fact: A Large-scale Event Factuality Detection Dataset

Chunyang Li

Hao Peng

Xiaozhi Wang

Yunjia Qi

Lei Hou

Bin Xu

Juanzi Li

HILM

263

22 Jul 2024

XAI meets LLMs: A Survey of the Relation between Explainable AI and Large Language Models

330

21 Jul 2024

Prover-Verifier Games improve legibility of LLM outputs

278

18 Jul 2024

A Survey on Symbolic Knowledge Distillation of Large Language Models

278

12 Jul 2024

DeepCodeProbe: Towards Understanding What Models Trained on Code Learn

Vahid Majdinasab

Amin Nikanjam

Foutse Khomh

240

11 Jul 2024

Towards Explainable Evolution Strategies with Large Language Models

Jill Baumann

Oliver Kramer

154

11 Jul 2024

Grounding and Evaluation for Large Language Models: Practical Challenges and Lessons Learned (Survey)

219

10 Jul 2024

Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models

Jiajun Zhang

384

08 Jul 2024

Cognitive Modeling with Scaffolded LLMs: A Case Study of Referential Expression Generation

Polina Tsvilodub

Michael Franke

Fausto Carcassi

175

04 Jul 2024

A Survey on Trustworthiness in Foundation Models for Medical Image Analysis

Congzhen Shi

Ryan Rezai

Jiaxi Yang

Qi Dou

Xiaoxiao Li

MedIm

223

03 Jul 2024

A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models

628

02 Jul 2024

When Search Engine Services meet Large Language Models: Visions and Challenges

352

28 Jun 2024

Enabling Regional Explainability by Automatic and Model-agnostic Rule Extraction

Nan Fletcher-Loyd

251

25 Jun 2024

RankAdaptor: Hierarchical Dynamic Low-Rank Adaptation for Structural Pruned LLMs

189

22 Jun 2024

Retrieval-Augmented Generation for Generative Artificial Intelligence in Medicine

173

18 Jun 2024

D2O: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models

Zhongwei Wan

Xinjian Wu

Yu Zhang

Yi Xin

Chaofan Tao

...

392

18 Jun 2024

Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts

Hongyin Luo

James Glass

Alan Ritter

227

17 Jun 2024

Applications of Generative AI in Healthcare: algorithmic, ethical, legal and societal considerations

Onyekachukwu R. Okonji

Kamol Yunusov

Bonnie Gordon

MedIm

206

15 Jun 2024

How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Zhenhong Zhou

Haiyang Yu

Xinghua Zhang

Rongwu Xu

Fei Huang

Yongbin Li

375

09 Jun 2024

Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas

...

Lu Cheng

264

08 Jun 2024

POEM: Interactive Prompt Optimization for Enhancing Multimodal Reasoning of Large Language Models

242

06 Jun 2024

A Survey of Language-Based Communication in Robotics

William Hunt

Sarvapali D. Ramchurn

Mohammad D. Soorati

LM&Ro

691

06 Jun 2024

I've got the "Answer"! Interpretation of LLMs Hidden States in Question Answering

Valeriya Goloviznina

Evgeny Kotelnikov

04 Jun 2024

CheckEmbed: Effective Verification of LLM Solutions to Open-Ended Tasks

...

672

04 Jun 2024

Position: Cracking the Code of Cascading Disparity Towards Marginalized Communities

G. Farnadi

Mohammad Havaei

Negar Rostamzadeh

324

03 Jun 2024

Understanding Token Probability Encoding in Output Embeddings

294

03 Jun 2024

Towards Practical Single-shot Motion Synthesis

Konstantinos Roditakis

Spyridon Thermos

N. Zioulis

VGen

366

03 Jun 2024

Empirical influence functions to understand the logic of fine-tuning

Jordan K Matelsky

Lyle Ungar

Konrad Paul Kording

208

01 Jun 2024