2408.12664
Cited By
v1
v2 (latest)
Multilevel Interpretability Of Artificial Neural Networks: Leveraging Framework And Methods From Neuroscience
22 August 2024
Zhonghao He
Jascha Achterberg
Katie Collins
Kevin K. Nejad
Danyal Akarca
Yinzhu Yang
Wes Gurnee
Ilia Sucholutsky
Yuhan Tang
Rebeca Ianov
George Ogden
Chole Li
Kai J. Sandbrink
Stephen Casper
Anna Ivanova
Grace W. Lindsay
AI4CE
Papers citing
"Multilevel Interpretability Of Artificial Neural Networks: Leveraging Framework And Methods From Neuroscience"
3 / 3 papers shown
Title
Representation biases: will we achieve complete understanding by analyzing representations?
Andrew Kyle Lampinen
Stephanie Chan
Yuxuan Li
Katherine Hermann
FaML
274
3
0
29 Jul 2025
Representation Engineering for Large-Language Models: Survey and Research Challenges
Lukasz Bartoszcze
Sarthak Munshi
Bryan Sukidi
Jennifer Yen
Zejia Yang
David Williams-King
Linh Le
Kosi Asuzu
Carsten Maple
340
4
0
24 Feb 2025
Towards Understanding Sycophancy in Language Models
Mrinank Sharma
Meg Tong
Tomasz Korbak
David Duvenaud
Amanda Askell
...
Oliver Rausch
Nicholas Schiefer
Da Yan
Miranda Zhang
Ethan Perez
925
440
0
20 Oct 2023