

Multilevel Interpretability Of Artificial Neural Networks: Leveraging Framework And Methods From Neuroscience

22 August 2024
Zhonghao He, Jascha Achterberg, Katie Collins, Kevin K. Nejad, Danyal Akarca, Yinzhu Yang, Wes Gurnee, Ilia Sucholutsky, Yuhan Tang, Rebeca Ianov, George Ogden, Chole Li, Kai J. Sandbrink, Stephen Casper, Anna Ivanova, Grace W. Lindsay
Tags: AI4CE

Papers citing "Multilevel Interpretability Of Artificial Neural Networks: Leveraging Framework And Methods From Neuroscience"

9 papers shown

Towards Uncovering How Large Language Model Works: An Explainability Perspective
Haiyan Zhao, Fan Yang, Bo Shen, Himabindu Lakkaraju, Mengnan Du
16 Feb 2024

Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Kevin Wang, Alexandre Variengien, Arthur Conmy, Buck Shlegeris, Jacob Steinhardt
01 Nov 2022

Omnigrok: Grokking Beyond Algorithmic Data
Ziming Liu, Eric J. Michaud, Max Tegmark
03 Oct 2022

In-context Learning and Induction Heads
Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova Dassarma, ..., Tom B. Brown, Jack Clark, Jared Kaplan, Sam McCandlish, C. Olah
24 Sep 2022

Toy Models of Superposition
Nelson Elhage, Tristan Hume, Catherine Olsson, Nicholas Schiefer, T. Henighan, ..., Sam McCandlish, Jared Kaplan, Dario Amodei, Martin Wattenberg, C. Olah
Tags: AAML, MILM
21 Sep 2022

Probing Classifiers: Promises, Shortcomings, and Advances
Yonatan Belinkov
24 Feb 2021

Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei
23 Jan 2020

Towards A Rigorous Science of Interpretable Machine Learning
Finale Doshi-Velez, Been Kim
Tags: XAI, FaML
28 Feb 2017

Demixed principal component analysis of population activity in higher cortical areas reveals independent representation of task parameters
D. Kobak, Wieland Brendel, C. Constantinidis, C. Feierstein, Adam Kepecs, Z. Mainen, R. Romo, Xue-Lian Qi, N. Uchida, C. Machens
22 Oct 2014