Large Language Models have demonstrated remarkable capabilities in natural language processing, yet their decision-making processes often lack transparency. This opacity raises significant concerns regarding trust, bias, and model performance. To address these issues, understanding and evaluating the interpretability of LLMs is crucial. This paper introduces BELL (Benchmarking the Explainability of Large Language Models), a standardised benchmarking technique designed to evaluate the explainability of large language models.
@article{ahmed2025_2504.18572,
  title   = {BELL: Benchmarking the Explainability of Large Language Models},
  author  = {Syed Quiser Ahmed and Bharathi Vokkaliga Ganesh and Jagadish Babu P and Karthick Selvaraj and ReddySiva Naga Parvathi Devi and Sravya Kappala},
  journal = {arXiv preprint arXiv:2504.18572},
  year    = {2025}
}