Evaluating Neuron Interpretation Methods of NLP Models

Neural Information Processing Systems (NeurIPS), 2023

30 January 2023

Papers citing "Evaluating Neuron Interpretation Methods of NLP Models"

Polysemy of Synthetic Neurons Towards a New Type of Explanatory Categorical Vector Spaces

Michael Veillet-Guillem

MILM

359

30 Apr 2025

Intra-neuronal attention within language models Relationships between activation and semantics

Corbet Alois Georgeon

Michael Veillet-Guillem

MILM

349

17 Mar 2025

Discovering Influential Neuron Path in Vision Transformers

International Conference on Learning Representations (ICLR), 2025

652

12 Mar 2025

How Do Artificial Intelligences Think? The Three Mathematico-Cognitive Factors of Categorical Segmentation Operated by Synthetic Neurons

Michael Veillet-Guillem

312

26 Dec 2024

Understanding Internal Representations of Recommendation Models with Sparse Autoencoders

448

09 Nov 2024

Neuropsychology of AI: Relationship Between Activation Proximity and Categorical Proximity Within Neural Categories of Synthetic Cognition

Michael Veillet-Guillem

305

08 Oct 2024

NeuroX Library for Neuron Analysis of Deep NLP Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Fahim Dalvi

Hassan Sajjad

Nadir Durrani

312

26 May 2023