arXiv:2402.15390
Explorations of Self-Repair in Language Models

23 February 2024
Cody Rushing
Neel Nanda
KELM MILM LRM

Papers citing "Explorations of Self-Repair in Language Models"

15 / 15 papers shown
Validation of Various Normalization Methods for Brain Tumor Segmentation: Can Federated Learning Overcome This Heterogeneity?
Jan Fiszer
Dominika Ciupek
Maciej Malawski
FedML
212
1
0
08 Oct 2025
Interpreting Attention Heads for Image-to-Text Information Flow in Large Vision-Language Models
Jinyeong Kim
Seil Kang
Jiwoo Park
Junhyeok Kim
Seong Jae Hwang
134
1
0
22 Sep 2025
Bridging Neural ODE and ResNet: A Formal Error Bound for Safety Verification
Abdelrahman Sayed Sayed
Pierre-Jean Meyer
Mohamed Ghazel
199
0
0
03 Jun 2025
GIM: Improved Interpretability for Large Language Models
Joakim Edin
Róbert Csordás
Tuukka Ruotsalo
Zhengxuan Wu
Maria Maistro
Casper L. Christensen
Jing-ling Huang
Lars Maaløe
386
0
0
23 May 2025
Understanding Gated Neurons in Transformers from Their Input-Output Functionality
Sebastian Gerstner
Hinrich Schütze
MILM FAtt
381
0
0
23 May 2025
Decoding Vision Transformers: the Diffusion Steering Lens
Ryota Takatsuki
Sonia Joseph
Ippei Fujisawa
Ryota Kanai
DiffM
375
0
0
18 Apr 2025
Model Hemorrhage and the Robustness Limits of Large Language Models
Ziyang Ma
Hui Yuan
Guang Dai
Gui-Song Xia
Bo Du
Liangpei Zhang
Dacheng Tao
317
1
0
31 Mar 2025
Neuroplasticity and Corruption in Model Mechanisms: A Case Study Of Indirect Object Identification
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Vishnu Kabir Chhabra
Ding Zhu
Mohammad Mahdi Khalili
325
5
0
27 Feb 2025
Looking Beyond The Top-1: Transformers Determine Top Tokens In Order
Daria Lioubashevski
Tomer Schlank
Gabriel Stanovsky
Ariel Goldstein
306
5
0
26 Oct 2024
The Geometry of Concepts: Sparse Autoencoder Feature Structure
Yuxiao Li
Eric J. Michaud
David D. Baek
Joshua Engels
Xiaoqing Sun
Max Tegmark
322
34
0
10 Oct 2024
Sparse Attention Decomposition Applied to Circuit Tracing
Gabriel Franco
Mark Crovella
382
1
0
01 Oct 2024
Optimal ablation for interpretability
Neural Information Processing Systems (NeurIPS), 2024
Maximilian Li
Lucas Janson
FAtt
343
11
0
16 Sep 2024
The Remarkable Robustness of LLMs: Stages of Inference?
Vedang Lad
Wes Gurnee
Max Tegmark
515
86
0
27 Jun 2024
From Text to Life: On the Reciprocal Relationship between Artificial Life and Large Language Models
Eleni Nisioti
Claire Glanois
Elias Najarro
Andrew Dai
Elliot Meyerson
J. Pedersen
Laetitia Teodorescu
Conor F. Hayes
Shyam Sudhakaran
Sebastian Risi
AI4CE LM&Ro
338
7
0
14 Jun 2024
Monotonic Representation of Numeric Properties in Language Models
Benjamin Heinzerling
Kentaro Inui
KELM MILM
224
13
0
15 Mar 2024