ResearchTrend.AI

From Insights to Actions: The Impact of Interpretability and Analysis Research on NLP

18 June 2024
Marius Mosbach
Vagrant Gautam
Tomás Vergara-Browne
Dietrich Klakow
Mor Geva
    AI4CE
arXiv: 2406.12618

Papers citing "From Insights to Actions: The Impact of Interpretability and Analysis Research on NLP"

4 / 4 papers shown
Title
Aligned Probing: Relating Toxic Behavior and Model Internals
Andreas Waldis
Vagrant Gautam
Anne Lauscher
Dietrich Klakow
Iryna Gurevych
45
0
0
17 Mar 2025
Investigating Neurons and Heads in Transformer-based LLMs for Typographical Errors
Kohei Tsuji
Tatsuya Hiraoka
Yuchang Cheng
Eiji Aramaki
Tomoya Iwakura
74
0
0
27 Feb 2025
What Do Speech Foundation Models Not Learn About Speech?
Abdul Waheed
Hanin Atwany
Bhiksha Raj
Rita Singh
SSL
35
1
0
16 Oct 2024
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Kevin Wang
Alexandre Variengien
Arthur Conmy
Buck Shlegeris
Jacob Steinhardt
212
494
0
01 Nov 2022